Failover Playbooks
Detailed procedures for executing safe failover operations.
Latency-based failover
Automated failover that triggers when a route's latency exceeds its configured threshold.
- Threshold: Configurable latency limits per route
- Detection: Continuous monitoring with rolling window analysis
- Validation: Pre-failover health checks on backup endpoints
- Execution: Gradual traffic shift to the backup endpoint, monitored at each step
- Rollback: Automatic return when primary endpoint recovers
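The detection, validation, and execution steps above can be sketched as follows. This is a minimal illustration, not the product's implementation: the names `LatencyMonitor`, `shift_steps`, and `plan_failover`, the mean-over-window rule, and the 25% step size are all assumptions made for the example.

```python
from collections import deque
from statistics import mean
from typing import List


class LatencyMonitor:
    """Rolling-window latency check for one route (hypothetical helper)."""

    def __init__(self, threshold_ms: float, window_size: int = 5):
        self.threshold_ms = threshold_ms
        # A bounded deque drops the oldest sample automatically.
        self.samples = deque(maxlen=window_size)

    def record(self, latency_ms: float) -> None:
        self.samples.append(latency_ms)

    def breached(self) -> bool:
        # Decide only once the window is full, so one spike cannot trigger failover.
        if len(self.samples) < self.samples.maxlen:
            return False
        return mean(self.samples) > self.threshold_ms


def shift_steps(step_percent: int = 25) -> List[int]:
    """Backup traffic share (percent) at each stage of a gradual shift."""
    return list(range(step_percent, 100, step_percent)) + [100]


def plan_failover(monitor: LatencyMonitor, backup_healthy: bool) -> List[int]:
    """Return the shift schedule, or an empty list if failover should not start."""
    # Assumed rule: fail over only when the threshold is breached AND the
    # backup passed its pre-failover health check.
    if monitor.breached() and backup_healthy:
        return shift_steps()
    return []


monitor = LatencyMonitor(threshold_ms=200, window_size=3)
for sample in (100, 300, 400):
    monitor.record(sample)
schedule = plan_failover(monitor, backup_healthy=True)
```

In this sketch a real deployment would pause between steps to check the backup's own latency, and abort the shift if it degrades. Rollback can reuse the same monitor against the primary endpoint.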
Availability-based failover
Failover when primary endpoints become unavailable.
- Detection: Error rate and availability monitoring
- Threshold: Configurable failure rates and time windows
- Fallback: Priority-based endpoint selection
- Notification: Automated alerts to operations teams
- Recovery: Continued health monitoring of the primary endpoint, with automatic rollback once it recovers
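As a rough illustration of how the threshold, fallback, and recovery steps fit together, the sketch below computes a failure rate over a window of request outcomes and picks the highest-priority healthy endpoint. The `Endpoint` class, the function names, and the convention that a lower `priority` number is preferred are assumptions for the example, not part of any documented API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    name: str
    priority: int          # assumed convention: lower number = preferred
    healthy: bool = True


def failure_rate(results: List[bool]) -> float:
    """Fraction of failed requests in a window; each entry is True on success."""
    if not results:
        return 0.0
    return results.count(False) / len(results)


def select_endpoint(endpoints: List[Endpoint]) -> Optional[Endpoint]:
    """Pick the healthy endpoint with the best priority, or None if none is healthy."""
    healthy = [e for e in endpoints if e.healthy]
    return min(healthy, key=lambda e: e.priority, default=None)


endpoints = [
    Endpoint("primary", priority=0),
    Endpoint("backup-a", priority=1),
    Endpoint("backup-b", priority=2),
]

# Mark the primary unhealthy when its windowed failure rate exceeds the limit.
window = [True, False, False, False]
if failure_rate(window) > 0.5:
    endpoints[0].healthy = False
active = select_endpoint(endpoints)
```

Because selection is driven by priority, marking the primary healthy again makes `select_endpoint` return it on the next call, which is one simple way to get automatic rollback. Alerting the operations team would hook in at the point where an endpoint's health flag changes.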