Refactor NGS Pulsar role: remove obsolete files and add new task and handler definitions

NiceDevil
2025-08-24 10:05:24 +02:00
parent c17fe406f2
commit 5d3aed6890
7 changed files with 302 additions and 12 deletions

@@ -1,12 +0,0 @@
---
# roles/install_ngs-pulsar/defaults/main.yml
pulsar_api_url: "https://NGS_API_URL"
pulsar_binary_name: "ngs-pulsar"
pulsar_architecture: "amd64"
pulsar_binary_url: "{{ pulsar_api_url }}/v1/_/pulsar/latest/{{ pulsar_binary_name }}-{{ pulsar_architecture }}"
pulsar_install_path: "/opt/enginsight/pulsar"
pulsar_license: "server"
pulsar_validate_certs: true
pulsar_accept_eula: true
pulsar_cleanup_script: true
pulsar_show_output: true

siem/backup-rsyslog.md Normal file

@@ -0,0 +1,302 @@
# SIEM Integration Guide: USB Backup System
## System Overview
The USB Backup System is an automated backup solution that monitors USB disks and performs incremental backups of VeeamBackup files. This guide provides technical details for SIEM integration and log parsing.
---
## Syslog Configuration
### Connection Details
```yaml
Protocol: UDP/TCP/TCP-TLS (configurable)
Default Port: 514
Facility: local0 (16) - configurable via SYSLOG_FACILITY
Source Hostname: veeam-repo (configurable via SYSLOG_HOSTNAME)
Application Name: usb-backup
```
### Message Format
```
<PRIORITY>TIMESTAMP HOSTNAME APPLICATION: MESSAGE
```
**Example:**
```
<134>2025-08-23T17:05:30 veeam-repo usb-backup: Backup operation started
```
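As a rough illustration of the format above, the following Python sketch builds a line in that layout and can send it over UDP for collector testing. The function names `format_syslog_line` and `send_udp` and the collector address supplied by the caller are assumptions made for this example; they are not part of the USB Backup System.

```python
import socket
from datetime import datetime


def format_syslog_line(priority: int, timestamp: datetime, hostname: str,
                       application: str, message: str) -> str:
    """Build '<PRIORITY>TIMESTAMP HOSTNAME APPLICATION: MESSAGE'."""
    stamp = timestamp.strftime("%Y-%m-%dT%H:%M:%S")
    return f"<{priority}>{stamp} {hostname} {application}: {message}"


def send_udp(line: str, host: str, port: int = 514) -> None:
    """Send one line as a single UDP datagram (host is a hypothetical test collector)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))


line = format_syslog_line(134, datetime(2025, 8, 23, 17, 5, 30),
                          "veeam-repo", "usb-backup", "Backup operation started")
print(line)
```

Calling `send_udp(line, "<collector address>")` against a test collector is one way to check that the SIEM input accepts the layout before the real system is connected.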
### Priority Calculation
- **Facility:** 16 (local0) × 8 = 128
- **Severity Levels:**
  - ERROR: 3 → Priority 131 (128+3)
  - WARNING: 4 → Priority 132 (128+4)
  - INFO: 6 → Priority 134 (128+6)
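The arithmetic above can be reversed on the SIEM side to recover facility and severity from the priority value. This is a minimal Python sketch; `decode_priority` and `SEVERITY_NAMES` are illustrative names, and the mapping covers only the three severities listed in this guide.

```python
# Only the severity levels documented in this guide are named here.
SEVERITY_NAMES = {3: "ERROR", 4: "WARNING", 6: "INFO"}


def encode_priority(facility: int, severity: int) -> int:
    """Priority = facility * 8 + severity."""
    return facility * 8 + severity


def decode_priority(priority: int) -> tuple:
    """Split a priority value into (facility, severity name)."""
    facility, severity = divmod(priority, 8)
    return facility, SEVERITY_NAMES.get(severity, f"UNKNOWN({severity})")


print(encode_priority(16, 6))   # 134
print(decode_priority(131))     # (16, 'ERROR')
```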
---
## Event Categories & Message Patterns
### 1. Backup Lifecycle Events
#### Backup Operations
| Event | Severity | Message Pattern | Description |
|-------|----------|-----------------|-------------|
| Start | `INFO` | `Backup operation started` | Backup process initiated |
| Success | `INFO` | `Backup operation completed successfully` | Backup finished successfully |
| Skip | `INFO` | `Backup skipped - no changes detected` | No files needed copying |
| Auto-trigger | `INFO` | `Auto-backup triggered - new files detected` | Automatic backup started |
#### File Operations
| Event | Severity | Message Pattern | Description |
|-------|----------|-----------------|-------------|
| File Copy | `INFO` | `Copied backup file: {filename}` | Individual file copied |
| Config Sync | `INFO` | `Copied VeeamConfigBackup: {count} files` | Config backup completed |
| File Cleanup | `INFO` | `Removed for overwrite: {filename}` | Old file removed before copy |
| Copy Failure | `ERROR` | `Failed to copy backup file: {filename}` | File copy failed |
#### Progress Tracking
| Event | Severity | Message Pattern | Description |
|-------|----------|-----------------|-------------|
| Daily Start | `INFO` | `Starting backup for date: {YYYY-MM-DD}` | Processing specific date |
| Daily Complete | `INFO` | `Completed backup for date {date}: {copied} copied, {skipped} skipped` | Daily summary |
### 2. System Status Events
#### Mount/Unmount Operations
| Event | Severity | Message Pattern | Description |
|-------|----------|-----------------|-------------|
| Mount Success | `INFO` | `USB-Monitor: Successfully mounted {disk_label}` | USB disk connected |
| Unmount | `INFO` | `USB-Monitor: Disk unmounted successfully` | USB disk removed |
| Stale Mount | `WARNING` | `USB-Monitor: Stale mount detected` | Hardware disconnected unexpectedly |
| Mount Error | `ERROR` | `USB-Monitor: Failed to mount backup disk` | Mount operation failed |
#### Process Management
| Event | Severity | Message Pattern | Description |
|-------|----------|-----------------|-------------|
| Lock Acquired | `INFO` | `Lock acquired (PID: {process_id})` | Process started |
| Lock Released | `INFO` | `Lock released` | Process completed |
| Collision Avoided | `INFO` | `Auto-backup skipped - backup already running` | Prevented parallel execution |
### 3. Warnings & Errors
#### Administrative Warnings
| Event | Severity | Message Pattern | Description |
|-------|----------|-----------------|-------------|
| Friday Alert | `WARNING` | `Friday warning: No USB backup disk mounted before weekend` | Weekend preparation reminder |
| Target Missing | `WARNING` | `Auto-backup skipped - target not mounted` | USB disk not connected |
| Disk Full | `WARNING` | `Disk usage would exceed 95% - stopping for safety` | Storage capacity warning |
#### Critical Errors
| Event | Severity | Message Pattern | Description |
|-------|----------|-----------------|-------------|
| System Failure | `ERROR` | `Backup failed during scan and analyze` | Core system error |
| Configuration Error | `ERROR` | `Failed to load configuration` | Config file issues |
| Hardware Error | `ERROR` | `All unmount attempts failed` | Hardware malfunction |
---
## Parsing Recommendations
### Primary Regex Pattern
```regex
^<(\d+)>(\S+)\s+(\S+)\s+([^:]+):\s+(.+)$
```
**Capture Groups:**
1. Priority (131, 132, or 134 with the default local0 facility)
2. Timestamp (ISO 8601)
3. Hostname
4. Application (usb-backup)
5. Message content
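To show how the capture groups map onto a real line, here is a small Python sketch that applies the primary pattern. The function name `parse_line` and the dictionary keys are assumptions chosen for this example.

```python
import re
from typing import Optional

# Primary pattern from this guide, unchanged.
LINE_RE = re.compile(r"^<(\d+)>(\S+)\s+(\S+)\s+([^:]+):\s+(.+)$")


def parse_line(line: str) -> Optional[dict]:
    """Return the five capture groups as a dict, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    priority, timestamp, hostname, application, message = match.groups()
    return {
        "priority": int(priority),
        "timestamp": timestamp,
        "hostname": hostname,
        "application": application,
        "message": message,
    }


sample = "<134>2025-08-23T17:05:30 veeam-repo usb-backup: USB-Monitor: Successfully mounted HDD-01"
print(parse_line(sample))
```

Because group 4 stops at the first colon after the hostname, a `USB-Monitor:` prefix stays inside the message content rather than being mistaken for the application name.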
### Event-Specific Patterns
#### Backup Operations
```regex
# Backup lifecycle
Backup (operation started|operation completed successfully|failed)
# File operations
(Copied|Failed to copy) backup file: (.+)
Copied VeeamConfigBackup: (\d+) files
# Statistics extraction
(\d+) copied, (\d+) skipped
```
#### Mount Events
```regex
# Mount operations
# \b keeps "unmounted" from being captured as "mounted"
USB-Monitor: .*\b(mounted|unmounted|detected)\b
Successfully mounted (.+)
# Error conditions
Failed to (copy|mount|sync) (.+)
```
#### Metrics Extraction
```regex
# File counts
(\d+) files (copied|skipped|found)
# Data sizes (TB, GB, MB, KB, B)
(\d+(?:\.\d+)?)(TB|GB|MB|KB|B)
# Time durations
(\d+)s$
```
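The metric patterns return strings, so a normalization step is usually needed before values are stored. The Python sketch below converts a size token to bytes and pulls the copied and skipped counts out of a daily summary. It assumes binary multiples (1 KB = 1024 B), which matches the `data_size_bytes` example in this guide (4 TB = 4398046511104) but is not confirmed by the log source; the function names are illustrative.

```python
import re
from typing import Optional, Tuple

SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)(TB|GB|MB|KB|B)")
STATS_RE = re.compile(r"(\d+) copied, (\d+) skipped")

# Assumption: binary multiples; adjust if the source reports decimal units.
UNIT_FACTORS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def size_to_bytes(text: str) -> Optional[int]:
    """Convert the first size token in text (for example '4TB') to bytes."""
    match = SIZE_RE.search(text)
    if match is None:
        return None
    value, unit = match.groups()
    return int(float(value) * UNIT_FACTORS[unit])


def copy_stats(message: str) -> Optional[Tuple[int, int]]:
    """Return (copied, skipped) from a daily summary message."""
    match = STATS_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(size_to_bytes("4TB"))
print(copy_stats("Completed backup for date 2025-08-23: 42 copied, 89 skipped"))
```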
---
## Structured Data Fields
### Recommended JSON Schema
```json
{
  "timestamp": "2025-08-23T17:05:30Z",
  "hostname": "veeam-repo",
  "application": "usb-backup",
  "severity": "INFO|WARNING|ERROR",
  "priority": 134,
  "category": "backup|mount|system|error",
  "action": "started|completed|failed|copied|mounted|unmounted",
  "object": "filename.vbk|disk_label|YYYY-MM-DD",
  "metrics": {
    "files_copied": 42,
    "files_skipped": 128,
    "data_size_bytes": 4398046511104,
    "duration_seconds": 3600,
    "disk_utilization_percent": 85
  },
  "details": {
    "disk_name": "HDD-01",
    "backup_date": "2025-08-23",
    "target_path": "/mnt/backup-usb",
    "process_id": 12345
  }
}
```
```
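One possible way to populate the `severity`, `category`, `action`, and `object` fields is a small rule table keyed on the message patterns from this guide. The Python sketch below is illustrative only: the `RULES` table, the `to_record` function, and the fallback category `system` are assumptions, and only a handful of the documented messages are covered.

```python
import json
import re

SEVERITY_NAMES = {3: "ERROR", 4: "WARNING", 6: "INFO"}

# (pattern, category, action); a named group "object" is optional.
RULES = [
    (re.compile(r"^Backup operation started"), "backup", "started"),
    (re.compile(r"^Backup operation completed successfully"), "backup", "completed"),
    (re.compile(r"^Copied backup file: (?P<object>.+)$"), "backup", "copied"),
    (re.compile(r"^Failed to copy backup file: (?P<object>.+)$"), "error", "failed"),
    (re.compile(r"^USB-Monitor: Successfully mounted (?P<object>.+)$"), "mount", "mounted"),
    (re.compile(r"^USB-Monitor: Disk unmounted successfully"), "mount", "unmounted"),
]


def to_record(priority, timestamp, hostname, application, message):
    """Map parsed syslog fields onto the top-level keys of the schema."""
    category, action, obj = "system", None, None  # fallback for unmatched messages
    for pattern, rule_category, rule_action in RULES:
        match = pattern.search(message)
        if match:
            category, action = rule_category, rule_action
            obj = match.groupdict().get("object")
            break
    return {
        "timestamp": timestamp,
        "hostname": hostname,
        "application": application,
        "severity": SEVERITY_NAMES.get(priority % 8, "UNKNOWN"),
        "priority": priority,
        "category": category,
        "action": action,
        "object": obj,
    }


record = to_record(131, "2025-08-23T14:22:11", "veeam-repo", "usb-backup",
                   "Failed to copy backup file: srv-02.vm-20250823T142211.vbk")
print(json.dumps(record, indent=2))
```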
---
## SIEM Rules & Alerting
### Critical Alert Conditions
#### Immediate Response Required
```yaml
- condition: severity = "ERROR" AND message CONTAINS "Failed to copy backup file"
  alert_level: HIGH
  description: "Backup file copy failure"
- condition: severity = "ERROR" AND message CONTAINS "Backup failed"
  alert_level: HIGH
  description: "Complete backup system failure"
- condition: severity = "ERROR" AND message CONTAINS "Failed to mount"
  alert_level: MEDIUM
  description: "USB disk mount failure"
```
#### Business Process Monitoring
```yaml
- condition: severity = "WARNING" AND message CONTAINS "Friday warning" AND time BETWEEN "11:00" AND "12:00" AND day = "FRIDAY"
  alert_level: LOW
  description: "Weekend backup preparation reminder"
- condition: NOT EXISTS "Backup operation completed successfully" IN last 25 hours
  alert_level: MEDIUM
  description: "Missing daily backup completion"
- condition: message CONTAINS "Disk usage would exceed 95%"
  alert_level: MEDIUM
  description: "Backup disk approaching capacity"
```
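The rule syntax above is pseudocode, so each SIEM will express it differently. As a sketch of the intended logic for two of the rules, the Python below checks for a missing completion message in the last 25 hours and tests whether a timestamp falls in the Friday 11:00 to 12:00 window. The function names and the `(timestamp, message)` event shape are assumptions, and the window is treated as excluding 12:00 itself.

```python
from datetime import datetime, timedelta

COMPLETION_MESSAGE = "Backup operation completed successfully"


def missing_daily_completion(events, now, window=timedelta(hours=25)):
    """True if no completion message was seen between now - window and now."""
    cutoff = now - window
    return not any(
        cutoff <= ts <= now and COMPLETION_MESSAGE in message
        for ts, message in events
    )


def in_friday_window(ts):
    """True for Friday timestamps from 11:00 up to, but not including, 12:00."""
    # datetime.weekday(): Monday is 0, so Friday is 4.
    return ts.weekday() == 4 and 11 <= ts.hour < 12


events = [
    (datetime.fromisoformat("2025-08-23T09:30:15"), "Backup operation started"),
    (datetime.fromisoformat("2025-08-23T10:45:35"), "Backup operation completed successfully"),
]
print(missing_daily_completion(events, datetime(2025, 8, 24, 9, 0)))  # False
print(missing_daily_completion(events, datetime(2025, 8, 25, 9, 0)))  # True
```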
### Performance Metrics Dashboard
#### Key Performance Indicators
| Metric | Query Pattern | Frequency |
|--------|---------------|-----------|
| **Backup Success Rate** | `"completed successfully" / "operation started" * 100` | Daily |
| **Data Throughput** | Extract size from `"Copied.*files \((\d+(?:\.\d+)?)(TB\|GB)"` | Daily |
| **Average Duration** | Time between `"started"` and `"completed"` | Daily |
| **Disk Utilization** | Extract from `"Space Utilization: (\d+)%"` | Per backup |
| **Error Rate** | `COUNT(severity="ERROR") / COUNT(*)` | Daily |
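
As a worked example of the success-rate and error-rate formulas in the table, the following Python sketch computes both from a list of `(severity, message)` pairs. The function name `backup_kpis` and the input shape are assumptions; a real dashboard would run the equivalent aggregation in the SIEM's own query language.

```python
def backup_kpis(events):
    """Compute success rate (percent) and error rate from (severity, message) pairs."""
    started = sum("Backup operation started" in msg for _, msg in events)
    completed = sum("Backup operation completed successfully" in msg for _, msg in events)
    errors = sum(severity == "ERROR" for severity, _ in events)
    total = len(events)
    return {
        # None signals "not computable" instead of dividing by zero.
        "success_rate_percent": round(completed / started * 100, 2) if started else None,
        "error_rate": errors / total if total else None,
    }


day = [
    ("INFO", "Backup operation started"),
    ("INFO", "Backup operation completed successfully"),
    ("INFO", "Backup operation started"),
    ("ERROR", "Backup failed during scan and analyze"),
]
print(backup_kpis(day))  # {'success_rate_percent': 50.0, 'error_rate': 0.25}
```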
#### Capacity Planning
| Metric | Pattern | Purpose |
|--------|---------|---------|
| **Files Processed** | `"(\d+) copied, (\d+) skipped"` | Growth trending |
| **Disk Rotation** | `"mounted (HDD-\d+)"` frequency | Hardware utilization |
| **Weekend Accumulation** | Files between Friday-Monday | Capacity planning |
---
## Business Rules & SLAs
### Service Level Expectations
- **Backup Frequency:** Daily (within 24-hour window)
- **Maximum Failure Rate:** <5% monthly
- **Recovery Time:** <30 minutes for manual intervention
- **Disk Capacity Warning:** 90% utilization threshold
- **Weekend Preparation:** Friday 11 AM reminder system
### Operational Procedures
1. **Daily Health Check:** Verify backup completion message
2. **Weekly Capacity Review:** Monitor disk utilization trends
3. **Monthly Hardware Rotation:** Track disk usage patterns
4. **Quarterly Failure Analysis:** Review error patterns and system improvements
---
## Testing & Validation
### Sample Log Entries for Parser Testing
```
<134>2025-08-23T09:30:15 veeam-repo usb-backup: Backup operation started
<134>2025-08-23T09:30:16 veeam-repo usb-backup: Found 131 files that need copying
<134>2025-08-23T09:35:22 veeam-repo usb-backup: Copied backup file: srv-01.vm-20250823T093522.vbk
<134>2025-08-23T10:45:33 veeam-repo usb-backup: Completed backup for date 2025-08-23: 42 copied, 89 skipped
<134>2025-08-23T10:45:35 veeam-repo usb-backup: Backup operation completed successfully
<132>2025-08-22T11:15:00 veeam-repo usb-backup: Friday warning: No USB backup disk mounted before weekend
<131>2025-08-23T14:22:11 veeam-repo usb-backup: Failed to copy backup file: srv-02.vm-20250823T142211.vbk
```
### Validation Queries
```sql
-- Daily backup completion rate (PostgreSQL syntax)
SELECT DATE(timestamp) AS backup_date,
       COUNT(*) FILTER (WHERE message LIKE '%completed successfully%') AS completed,
       COUNT(*) FILTER (WHERE message LIKE '%operation started%') AS started,
       -- Output aliases cannot be reused in the same SELECT list, so the counts are repeated.
       -- NULLIF avoids division by zero; ROUND(x, 2) needs a numeric argument.
       ROUND(COUNT(*) FILTER (WHERE message LIKE '%completed successfully%')::numeric
             / NULLIF(COUNT(*) FILTER (WHERE message LIKE '%operation started%'), 0) * 100, 2) AS success_rate
FROM syslog_events
WHERE application = 'usb-backup'
GROUP BY DATE(timestamp)
ORDER BY backup_date DESC;

-- Error trend analysis
SELECT DATE(timestamp) AS date,
       severity,
       COUNT(*) AS event_count,
       array_agg(DISTINCT message) AS error_types
FROM syslog_events
WHERE application = 'usb-backup'
  AND severity IN ('ERROR', 'WARNING')
GROUP BY DATE(timestamp), severity
ORDER BY date DESC, severity;
```
---
## Contact Information
For technical questions about this integration:
- **System Administrator:** [Your Contact]
- **Log Format Changes:** Version-controlled, advance notification provided
- **Emergency Escalation:** Monitor for ERROR severity events
**Document Version:** 1.0
**Last Updated:** 2025-08-23
**Next Review:** 2025-11-23