
V20 "Beast" Ensemble AI Model & Feeder Evolution (Feb 2026)

Author: AI Agent (Antigravity)
Status: Operational / Stable
Focus: High-Precision Sport Predictions & Feeder Resilience


🚀 1. V20 Ensemble "Beast" Architecture

V20 is a significant leap from V17, moving from a single XGBoost model to a multi-engine ensemble approach. It synthesizes four specialized sub-engines:

| Engine | Responsibility | Data Source |
|---|---|---|
| TeamPredictor | Historical form, H2H, and ELO ratings. | matches, leagues |
| PlayerPredictor | Individual player ratings (V3) and tactical impact. | players, match_player_participation |
| OddsPredictor | Market sentiment and value discovery. | odd_categories, odd_selections |
| RefereePredictor | Disciplinarian bias (cards/fouls mapping). | match_officials, official_roles |
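
The report does not say how the four sub-engine outputs are merged. As a minimal sketch, assuming each engine emits a probability distribution over the same market and that a weighted average is used, the combination step might look like this. The weights, the `combine` function, and the outcome keys are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical weights; the report does not state how V20 weighs its sub-engines.
ENGINE_WEIGHTS: Dict[str, float] = {
    "team": 0.4,
    "player": 0.2,
    "odds": 0.3,
    "referee": 0.1,
}

OUTCOMES = ("home", "draw", "away")


def combine(predictions: Dict[str, Optional[Dict[str, float]]]) -> Dict[str, float]:
    """Weighted average of per-engine outcome probabilities.

    Engines that returned None are skipped and the remaining weights are
    renormalised, so the result still sums to 1.
    """
    available = {name: p for name, p in predictions.items() if p is not None}
    if not available:
        # No engine produced output: fall back to a uniform distribution.
        return {o: 1.0 / len(OUTCOMES) for o in OUTCOMES}
    total_weight = sum(ENGINE_WEIGHTS[name] for name in available)
    return {
        o: sum(ENGINE_WEIGHTS[name] * p[o] for name, p in available.items()) / total_weight
        for o in OUTCOMES
    }
```

Renormalising over the engines that actually responded keeps the ensemble usable when one data source (for example, referee assignments) is not yet available for a match.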

🧠 Core Innovation: Upset Detection

V20 includes a dedicated UpsetEngine (Surprise Discovery).

  • It identifies "trap" matches where a strong favorite might fail due to motivation gaps, derby tension, or relegation battles.
  • It flags matches with RISK_LEVEL: HIGH or EXTREME when surprise markers are detected.
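
To make the flagging idea concrete, here is a minimal sketch that counts surprise markers and maps the count to a risk label. The `MatchContext` fields, the 0.6 favorite threshold, the marker counts, and the `NORMAL` label are assumptions for illustration; only the HIGH and EXTREME labels come from the report.

```python
from dataclasses import dataclass


@dataclass
class MatchContext:
    # All fields are hypothetical surprise markers chosen for illustration.
    favorite_win_prob: float               # model or market probability of the favorite
    is_derby: bool                         # derby tension
    underdog_in_relegation_fight: bool     # relegation battle
    favorite_has_nothing_to_play_for: bool # motivation gap


def risk_level(ctx: MatchContext) -> str:
    """Map the number of active surprise markers to a risk label."""
    markers = sum([
        ctx.is_derby,
        ctx.underdog_in_relegation_fight,
        ctx.favorite_has_nothing_to_play_for,
    ])
    # Only a clear favorite can be "upset"; the 0.6 threshold is an assumption.
    if ctx.favorite_win_prob < 0.6 or markers == 0:
        return "NORMAL"
    return "EXTREME" if markers >= 2 else "HIGH"
```

For example, a 75% favorite playing a derby against a side fighting relegation would trip two markers and be labeled `EXTREME` under these assumed rules.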

🛠️ 2. Recent Stability & Fixes (Feb 8, 2026)

During recent live testing, critical stability patches were applied to the Python AI Engine to address the crashes and timeouts described below.

🛡️ Null-Safety (The "NoneType" Correction)

  • Problem: The model crashed when standings data (league positions) was missing for new or minor league matches.
  • Fix: Implemented exhaustive null-checks in ContextEngine, UpsetEngine, and V20EnsemblePredictor. The model now gracefully handles None values and provides baseline predictions instead of failing.
  • Affected Files: ai-engine/features/context_engine.py, ai-engine/features/upset_engine.py, ai-engine/models/v20_ensemble.py.
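
The null-check pattern can be sketched as follows. This is not the code from the affected files; `position_gap`, its parameters, and the neutral baseline value are hypothetical, chosen only to show a feature function returning a baseline instead of raising on `None`.

```python
from typing import Optional

# Hypothetical neutral value used when standings are unavailable.
BASELINE_POSITION_GAP = 0.0


def position_gap(
    home_position: Optional[int],
    away_position: Optional[int],
    league_size: Optional[int],
) -> float:
    """Normalised league-position gap in [-1, 1]; positive favours the home side.

    Returns the neutral baseline instead of raising when any input is missing.
    """
    if home_position is None or away_position is None:
        return BASELINE_POSITION_GAP
    if not league_size or league_size < 2:
        # Covers None, 0, and degenerate one-team tables.
        return BASELINE_POSITION_GAP
    return (away_position - home_position) / (league_size - 1)
```

Returning a neutral value keeps the downstream ensemble running with reduced information, which matches the "baseline predictions instead of failing" behavior described above.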

Infrastructure: Local IP Cleanup

  • Problem: Several sub-engines had the production IP (13.49.226.80) hardcoded, causing timeouts in local development.
  • Fix: Mass replacement of production IPs with localhost across the entire ai-engine directory.
  • Tool used: Automated patch-ips.js script to ensure parity across all files.
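
The contents of patch-ips.js are not shown in this report. Purely to illustrate the substitution it is described as performing, a Python sketch of the same idea on a single source string might look like this; the `patch_ips` function and its boundary rules are assumptions.

```python
import re

PRODUCTION_IP = "13.49.226.80"

# Match the address only when it is not embedded in a longer dotted number,
# e.g. leave "113.49.226.80" and "13.49.226.801" untouched.
_PATTERN = re.compile(r"(?<![\d.])" + re.escape(PRODUCTION_IP) + r"(?!\d)")


def patch_ips(source: str, replacement: str = "localhost") -> str:
    """Return `source` with every occurrence of the production IP replaced."""
    return _PATTERN.sub(replacement, source)
```

Escaping the address matters because an unescaped `.` in a regular expression matches any character, which could rewrite unrelated numbers.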

📡 3. Feeder & Data-Fetcher Optimization

The live data flow was re-engineered for speed and accuracy.

🎯 Top League Filtering (top_leagues.json)

  • Optimization: Instead of processing 1200+ matches from Mackolik, the feeder now filters based on a curated list of IDs in top_leagues.json.
  • Result: Processing list reduced to ~160 matches. Feeder speed increased by ~7.5x.
  • Logic: DataFetcherTask now prioritizes high-value matches to save resources and API hits.
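
The filtering step can be sketched as below. This assumes top_leagues.json is a flat JSON array of league IDs and that each match record carries a `league_id` field; neither detail is confirmed by the report, and the function names are hypothetical.

```python
import json
from typing import Iterable, List, Set


def load_top_league_ids(raw_json: str) -> Set[int]:
    """Parse the curated ID list; assumes a flat JSON array of IDs."""
    return set(json.loads(raw_json))


def filter_matches(matches: Iterable[dict], top_league_ids: Set[int]) -> List[dict]:
    """Keep only matches whose league is on the curated list.

    "league_id" is a hypothetical field name; matches without it are dropped.
    """
    return [m for m in matches if m.get("league_id") in top_league_ids]
```

Using a set makes each membership check constant time, so the filter itself is cheap. Cutting the list from roughly 1200 to roughly 160 matches is a 7.5x reduction, which lines up with the reported speedup if the per-match cost is roughly constant.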

🕒 Lineup & Referee Coverage

  • Window Expansion:
    • Start: Fetches kadrolar (lineups) 4 hours before kickoff.
    • Persist: Continues updating up to 3 hours after the game to ensure scorers and officials are captured.
  • Accuracy: Confirmed that 11-man starting lineups (XI) are captured for top leagues such as the Premier League.
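
The fetch window can be expressed as a simple time check. The report does not say whether the 3-hour tail is measured from kickoff or from the final whistle; this sketch assumes kickoff, and the constant and function names are hypothetical.

```python
from datetime import datetime, timedelta

PRE_KICKOFF = timedelta(hours=4)   # start fetching lineups this early
POST_KICKOFF = timedelta(hours=3)  # keep updating this long (assumed: from kickoff)


def in_fetch_window(kickoff: datetime, now: datetime) -> bool:
    """True while lineup and referee data should still be fetched for a match."""
    return kickoff - PRE_KICKOFF <= now <= kickoff + POST_KICKOFF
```

Both arguments should be either timezone-aware or naive; Python raises a `TypeError` when comparing the two kinds of `datetime`.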

📊 4. Model Capabilities

  • Markets: MS (1X2), O/U (1.5, 2.5, 3.5), BTTS (KG), HT/FT, Corners, and Cards.
  • Output: Predicted xG (Expected Goals), Top 5 likely scores, and Smart Value recommendations.
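
The report does not describe how the top-5 scorelines are derived from xG. One common approach, shown here only as an illustrative assumption and not as V20's method, treats home and away goals as independent Poisson variables with the predicted xG as their means. The function names and the `max_goals` cutoff are hypothetical.

```python
from math import exp, factorial
from typing import List, Tuple


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals given an expected-goals mean lam."""
    return exp(-lam) * lam ** k / factorial(k)


def top_scores(
    home_xg: float, away_xg: float, n: int = 5, max_goals: int = 8
) -> List[Tuple[str, float]]:
    """Return the n most likely scorelines under independent Poisson goals."""
    grid = [
        (f"{h}-{a}", poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg))
        for h in range(max_goals + 1)
        for a in range(max_goals + 1)
    ]
    # Sort by probability, highest first, and keep the top n.
    return sorted(grid, key=lambda item: item[1], reverse=True)[:n]
```

The independence assumption is a simplification; it ignores correlation between the two teams' scoring, so it is best read as a baseline rather than a description of the ensemble's output stage.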

This report serves as the technical baseline for the V20 implementation phase.