Operational management aspect
The operational management aspect describes how a system stays healthy over time: monitoring, incidents, continuity, and controlled change.
What this aspect answers
- How do we know the system is healthy (and how do we diagnose quickly)?
- How do we handle incidents, changes, and recurring problems?
- How do we recover data and restore service?
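The first question above can be made concrete with a small health-check runner. The following is only a minimal sketch: the names `CheckResult`, `run_checks`, and `overall_healthy` are hypothetical and not part of any tooling described here, and each check is assumed to be a plain callable that returns a truthy value when its component is healthy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CheckResult:
    name: str      # which component was checked
    healthy: bool  # True if the check passed
    detail: str    # short diagnostic text to speed up triage


def run_checks(checks: Dict[str, Callable[[], bool]]) -> List[CheckResult]:
    """Run every check and record a result, even if a check raises."""
    results = []
    for name, check in checks.items():
        try:
            ok = bool(check())
            detail = "ok" if ok else "check returned false"
        except Exception as exc:
            # A crashing check is treated as unhealthy, and the error is
            # kept so that whoever diagnoses the problem has a starting point.
            ok = False
            detail = f"{type(exc).__name__}: {exc}"
        results.append(CheckResult(name, ok, detail))
    return results


def overall_healthy(results: List[CheckResult]) -> bool:
    """The system counts as healthy only if every check passed."""
    return all(r.healthy for r in results)
```

Keeping the per-check detail next to the pass/fail flag is a deliberate choice in this sketch: the same output serves both "is it healthy?" and "where do we start diagnosing?".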
Organization
- Ownership for reliability and support responsibility.
- Clear escalation paths for incidents.
Process
- Incident/change/problem handling (with verification).
- Backup and recovery expectations.
Technique
- Monitoring, logging, and alerting.
- Backup/restore mechanisms and verification.
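One way to read "verification" for backup and restore is to restore into a scratch location and compare the result with the source. The sketch below assumes file-based backups and uses SHA-256 checksums; the function names `sha256_of` and `verify_restore` are hypothetical, and other kinds of data (a database dump, for example) would need a different comparison.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> List[str]:
    """Return a list of problems; an empty list means the restore matches."""
    problems = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            problems.append(f"missing: {rel.as_posix()}")
        elif sha256_of(src) != sha256_of(restored):
            problems.append(f"mismatch: {rel.as_posix()}")
    return problems
```

A check like this only confirms that the restored files match the source at the moment of comparison. It does not show that the restored service actually starts, which would need a separate test.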
Operationalization (where to go next)
- Procedures and troubleshooting: Run & Support
- Asset facts (servers/domains/environments): Reference
- Observability expectations (concept): Architecture principles