- Reset consecutiveFailures on cooldown expiry (half-open state)
  so a single retry failure doesn't immediately re-open the circuit
- Exclude AI Engine app-level 500s from the circuit breaker count
  (count only network/infra errors: timeout, 502, 503, 504, 429)
- Return null gracefully instead of throwing 503 when no cache exists
- Add DB fallback for non-cooldown AI Engine failures
- Remove blocking wait-and-retry that held requests for up to 20s
- Add 4-level fallback for when the AI circuit breaker is in cooldown:
  1) In-memory cache (10 min TTL)
  2) DB stored prediction (no TTL filter)
  3) DB cached prediction (with model version check)
  4) Wait out the cooldown, then retry once (max 20s wait)
- Raise circuit breaker threshold from 3 to 5 consecutive failures
- Reduce cooldown duration from 30s to 15s for faster recovery
- Add extractCooldownMs helper to parse remaining ms from error detail
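
For reviewers, the breaker behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: only `consecutiveFailures` and `extractCooldownMs` are names taken from this change. The state shape, the other function names, and the error-detail format parsed by `extractCooldownMs` are assumptions made for the example.

```typescript
// Hypothetical sketch; apart from consecutiveFailures and extractCooldownMs,
// all names and the error-detail format are assumptions.
const FAILURE_THRESHOLD = 5; // raised from 3
const COOLDOWN_MS = 15000;   // reduced from 30s

interface BreakerState {
  consecutiveFailures: number;
  openedAt: number | null; // timestamp (ms) when the circuit opened, or null if closed
}

interface Failure {
  timeout?: boolean;
  status?: number;
}

function createBreaker(): BreakerState {
  return { consecutiveFailures: 0, openedAt: null };
}

// Only network/infra failures count; app-level 500s are ignored.
function countsTowardBreaker(f: Failure): boolean {
  if (f.timeout) return true;
  return f.status === 429 || f.status === 502 || f.status === 503 || f.status === 504;
}

function recordFailure(s: BreakerState, f: Failure, now: number): void {
  if (!countsTowardBreaker(f)) return;
  s.consecutiveFailures += 1;
  if (s.consecutiveFailures >= FAILURE_THRESHOLD) s.openedAt = now;
}

// Returns the remaining cooldown in ms, or 0 if requests may pass.
function remainingCooldownMs(s: BreakerState, now: number): number {
  if (s.openedAt === null) return 0;
  const remaining = s.openedAt + COOLDOWN_MS - now;
  if (remaining > 0) return remaining;
  // Cooldown expired (half-open): reset the counter so a single
  // retry failure does not immediately re-open the circuit.
  s.openedAt = null;
  s.consecutiveFailures = 0;
  return 0;
}

// Assumed detail format, e.g. "circuit open, retry in 1200ms".
function extractCooldownMs(detail: string): number | null {
  const match = /(\d+)\s*ms/.exec(detail);
  return match ? Number(match[1]) : null;
}
```

Passing `now` explicitly keeps the sketch deterministic. Real code would more likely read the clock internally.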