Self-hosted docs
Incident Templates
Pre-defined templates for common AI operations incidents.
Provider outage template
Step-by-step response for handling provider service disruptions.
- Detection: Automated alerts for service degradation or outage
- Assessment: Impact analysis on affected routes and workloads
- Response: Failover to approved backup endpoints
- Communication: Stakeholder notification and status updates
- Recovery: Monitoring and rollback procedures
Policy violation template
Containment workflow for guardrail and compliance violations.
- Detection: Real-time policy evaluation and alerting
- Containment: Route isolation and traffic redirection
- Investigation: Evidence collection and root cause analysis
- Remediation: Policy adjustment and validation
- Review: Post-incident analysis and prevention planning