# V17 "Galacticos" Migration & Training Guide
**Date:** February 6, 2026
**Status:** Architecture Implemented / Training Ready
---
## 1. Executive Summary
This document details the transition of the Suggest-Bet AI Engine from a monolithic, script-based system to a scalable, service-oriented architecture (SOA) powered by FastAPI and Docker. It also documents the rigorous data analysis that led to the "Strict Filtering" training strategy for the V17 Model.
**Core Achievement:** The system now treats players as first-class citizens (Embeddings) and evaluates match context (Odds, Form) dynamically, served via a high-performance HTTP API.
---
## 2. Architecture Overhaul
### Before (Legacy)
* **Execution:** Backend spawned new Python processes (`child_process.spawn`) for EVERY prediction request.
* **Performance:** ~3-5 seconds latency per request (loading PyTorch models from disk repeatedly).
* **Maintenance:** Spaghetti code mixing API logic, feature engineering, and training scripts.
* **Integration:** Brittle stdout parsing (Backend read text output from Python).
### After (V17 Beast Mode)
* **Execution:** Persistent Dockerized Service (`ai-engine`).
* **Performance:** <100ms latency (Models kept in RAM).
* **Structure:** Clean separation of concerns:
    * `ai-engine/app`: FastAPI Routes (HTTP Layer).
    * `ai-engine/core`: Pure Business Logic (Model, Features).
    * `ai-engine/data`: Database Abstraction.
* **Integration:** Type-safe HTTP requests via `axios` from NestJS.
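The latency gain comes from loading the model once and keeping it in memory. Below is a minimal, framework-free sketch of that idea. `DummyModel`, `get_model`, `handle_request`, and the returned fields are hypothetical stand-ins, not the actual `ai-engine` code.

```python
import time
from functools import lru_cache


class DummyModel:
    """Stand-in for the real V17 network (hypothetical, for illustration only)."""

    def predict(self, match_id: int) -> dict:
        # Placeholder payload; the real response schema is not shown in this guide.
        return {"match_id": match_id, "home_win": 0.5}


LOAD_COUNT = 0  # counts how often the expensive load actually runs


@lru_cache(maxsize=1)
def get_model() -> DummyModel:
    """Load the model once; later calls return the cached instance."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    time.sleep(0.05)  # simulates the slow disk read the legacy flow paid per request
    return DummyModel()


def handle_request(match_id: int) -> dict:
    # Every request reuses the in-memory model instead of spawning a new process.
    return get_model().predict(match_id)


for match_id in (101, 102, 103):
    handle_request(match_id)
print(LOAD_COUNT)  # 1: the load ran once for three requests
```

In a long-running service the same effect is usually achieved by loading the model at startup; the cache here only makes the "load once, serve many" pattern visible.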
---
## 3. Data Analysis & Strategy Shift
We analyzed the database (~167k matches) and discovered critical data quality issues that were poisoning previous models.
### The "Garbage Data" Problem
* **Total Matches:** ~167,000
* **Matches with Lineups:** Only ~73,000 (44%)
* **Matches with Odds:** ~94,000 (56%)
* **Intersection (Quality Data):** < 50,000 matches.
**Conclusion:** Training on the full dataset forces the model to learn from "blind" matches (missing lineups) or "contextless" matches (missing odds), which produces unreliable predictions.
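The coverage figures above can be sanity-checked with a few lines of arithmetic. This sketch uses the rounded counts quoted in this section, so the results are approximate.

```python
# Rounded counts taken from the list above.
TOTAL_MATCHES = 167_000
WITH_LINEUPS = 73_000
WITH_ODDS = 94_000


def coverage_pct(part: int, whole: int) -> float:
    """Share of `whole` covered by `part`, in percent, rounded to one decimal."""
    return round(100 * part / whole, 1)


lineup_pct = coverage_pct(WITH_LINEUPS, TOTAL_MATCHES)
odds_pct = coverage_pct(WITH_ODDS, TOTAL_MATCHES)

# Inclusion-exclusion bounds on how many matches can have BOTH lineups and odds.
min_overlap = max(0, WITH_LINEUPS + WITH_ODDS - TOTAL_MATCHES)
max_overlap = min(WITH_LINEUPS, WITH_ODDS)

print(lineup_pct, odds_pct)      # 43.7 56.3
print(min_overlap, max_overlap)  # 0 73000
```

The reported intersection of under 50,000 matches sits inside these bounds, so the numbers are mutually consistent.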
### The "Top Leagues" Solution
We analyzed 20 Top Leagues (Premier League, LaLiga, etc.) and found markedly better lineup coverage:
* **Premier League:** 77% Lineup Coverage.
* **Championship:** 88% Lineup Coverage.
**Decision:** The V17 training pipeline now **strictly filters** for:
1. Top 20 Leagues (`top_leagues.json`).
2. Full Lineup Availability (11+ players per team).
3. Odds Availability.
This reduces the dataset size but drastically increases **Signal-to-Noise Ratio**.
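The three rules can be expressed as a single predicate. This is a sketch only: `MatchRow` and its fields are hypothetical, and the league IDs are passed in as a plain set because the layout of `top_leagues.json` is not shown here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class MatchRow:
    # Hypothetical row shape; the real database schema is not shown in this guide.
    match_id: int
    league_id: str
    home_lineup_size: int
    away_lineup_size: int
    has_odds: bool


def passes_strict_filter(row: MatchRow, top_league_ids: Set[str], min_players: int = 11) -> bool:
    """True only when all three V17 rules hold for the match."""
    return (
        row.league_id in top_league_ids              # 1. top league
        and row.home_lineup_size >= min_players      # 2. full lineup, home side
        and row.away_lineup_size >= min_players      # 2. full lineup, away side
        and row.has_odds                             # 3. odds available
    )


def strict_filter(rows: Iterable[MatchRow], top_league_ids: Set[str]) -> List[MatchRow]:
    return [row for row in rows if passes_strict_filter(row, top_league_ids)]


# Made-up rows, for illustration only.
sample = [
    MatchRow(1, "premier-league", 11, 11, True),   # kept
    MatchRow(2, "premier-league", 11, 9, True),    # dropped: incomplete lineup
    MatchRow(3, "amateur-cup", 11, 11, True),      # dropped: not a top league
    MatchRow(4, "premier-league", 11, 11, False),  # dropped: no odds
]
kept = strict_filter(sample, {"premier-league"})
print([row.match_id for row in kept])  # [1]
```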
---
## 4. The V17 Model "Galacticos"
### Philosophy
Instead of rating "Team A vs Team B", V17 rates "These 11 Players vs Those 11 Players" in the context of current odds and form.
### Input Vector (The Brain)
1. **Player Embeddings:** Each of the ~17,000 elite players has a learnable 32-dimensional vector.
2. **Context Vector (32-dim):**
    * **Odds (9):** 1X2, Over/Under, BTTS (Normalized).
    * **Form (12):** Goals Scored/Conceded, Win Rate (Home/Away).
    * **H2H (3):** Historical dominance.
    * **Advanced (8):** Rest Days (Fatigue), League Standing Diff.
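The dimensions above add up as 9 + 12 + 3 + 8 = 32. The sketch below assembles the context vector with that layout and shows one possible way to combine it with player embeddings. The mean-pooling step and all function names are assumptions; this guide does not say how the 11 player vectors per side are merged.

```python
from typing import Dict, List, Sequence

EMBEDDING_DIM = 32
CONTEXT_LAYOUT: Dict[str, int] = {"odds": 9, "form": 12, "h2h": 3, "advanced": 8}
CONTEXT_DIM = sum(CONTEXT_LAYOUT.values())  # 9 + 12 + 3 + 8 = 32


def build_context_vector(groups: Dict[str, Sequence[float]]) -> List[float]:
    """Concatenate the four feature groups in a fixed order, checking each size."""
    vector: List[float] = []
    for name, expected in CONTEXT_LAYOUT.items():
        values = groups[name]
        if len(values) != expected:
            raise ValueError(f"{name}: expected {expected} values, got {len(values)}")
        vector.extend(float(v) for v in values)
    return vector


def mean_pool(player_vectors: Sequence[Sequence[float]]) -> List[float]:
    # ASSUMPTION: averaging is one simple way to collapse a lineup into one vector.
    count = len(player_vectors)
    return [sum(column) / count for column in zip(*player_vectors)]


def build_model_input(
    home_players: Sequence[Sequence[float]],
    away_players: Sequence[Sequence[float]],
    context: Sequence[float],
) -> List[float]:
    # ASSUMPTION: home pool + away pool + context, concatenated (32 + 32 + 32).
    return mean_pool(home_players) + mean_pool(away_players) + list(context)


context = build_context_vector(
    {"odds": [0.0] * 9, "form": [0.0] * 12, "h2h": [0.0] * 3, "advanced": [0.0] * 8}
)
lineup = [[1.0] * EMBEDDING_DIM for _ in range(11)]  # placeholder embeddings
print(len(context), len(build_model_input(lineup, lineup, context)))  # 32 96
```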
### Output Heads (Multi-Task Learning)
The model is trained to predict **everything at once**: all heads share one representation, so each target acts as an extra training signal for the others:
* **Match Result (1X2):** CrossEntropy Loss.
* **Goals (Home/Away):** MSE Loss.
* **BTTS & Over 2.5:** Binary CrossEntropy Loss.
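To make the multi-task objective concrete, here is a dependency-free sketch of how the head losses could be combined for a single match. The equal default weights and the function names are assumptions; the actual weighting in `trainer_v17.py` is not documented here.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], target_index: int) -> float:
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target_index]


def mse(predicted: Sequence[float], target: Sequence[float]) -> float:
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def binary_cross_entropy(prob: float, target: int, eps: float = 1e-7) -> float:
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


def multitask_loss(
    result_logits: Sequence[float], result_target: int,   # 1X2 head
    goals_pred: Sequence[float], goals_target: Sequence[float],  # home/away goals
    btts_prob: float, btts_target: int,
    over25_prob: float, over25_target: int,
    weights: Sequence[float] = (1.0, 1.0, 1.0, 1.0),  # ASSUMPTION: equal weights
) -> float:
    parts = (
        cross_entropy(result_logits, result_target),
        mse(goals_pred, goals_target),
        binary_cross_entropy(btts_prob, btts_target),
        binary_cross_entropy(over25_prob, over25_target),
    )
    return sum(w * part for w, part in zip(weights, parts))


# Uninformed 1X2 logits, perfect goal prediction, coin-flip binary heads.
loss = multitask_loss([0.0, 0.0, 0.0], 0, [1.0, 2.0], [1.0, 2.0], 0.5, 1, 0.5, 0)
print(round(loss, 4))  # 2.4849, i.e. ln(3) + 2 * ln(2)
```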
---
## 5. How to Train (The Beast Trainer)
The training logic is consolidated into a single robust script: `ai-engine/core/training/trainer_v17.py`.
### Prerequisites
* Docker container running (`ai-engine`).
* Database populated with historical data.
### Execution Command
Run this from your host terminal (project root):
```bash
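# 1. Install dependencies (only needed when running locally, outside Docker)
pip3 install torch numpy pandas tqdm psycopg2-binary scikit-learn python-dotenv
# 2. Point the trainer at the database and start training
# Note: "?schema=public" is a Prisma-style parameter; if the trainer passes this URL
# straight to psycopg2, the suffix may need to be removed.
export DATABASE_URL="postgresql://suggestbet:SuGGesT2026SecuRe@127.0.0.1:15432/boilerplate_db?schema=public"
python3 ai-engine/core/training/trainer_v17.py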
```
### Key Metrics to Watch
The trainer reports **"High Confidence Accuracy"** (accuracy on the predictions the model is most confident about) alongside overall accuracy.
* *Bad:* Overall Acc 50%, High Conf Acc 55%.
* *Good:* Overall Acc 55%, High Conf Acc **>85%**. (This is our goal).
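The two metrics can diverge because high-confidence accuracy is computed on a subset of predictions. A minimal sketch follows, assuming "high confidence" means the top class probability is at or above a threshold. The `0.6` default and the function name are assumptions; the trainer's actual cutoff is not stated in this guide.

```python
from typing import List, Optional, Sequence, Tuple


def accuracy_report(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    threshold: float = 0.6,  # ASSUMPTION: illustrative cutoff only
) -> Tuple[float, Optional[float], int]:
    """Return (overall accuracy, high-confidence accuracy, high-confidence count)."""
    correct = 0
    hc_total = 0
    hc_correct = 0
    for row, label in zip(probs, labels):
        predicted = max(range(len(row)), key=row.__getitem__)  # argmax
        hit = predicted == label
        correct += hit
        if row[predicted] >= threshold:
            hc_total += 1
            hc_correct += hit
    overall = correct / len(labels)
    high_conf = hc_correct / hc_total if hc_total else None
    return overall, high_conf, hc_total


# Made-up 1X2 probabilities for four matches.
demo_probs: List[List[float]] = [
    [0.70, 0.20, 0.10],  # confident, correct
    [0.40, 0.35, 0.25],  # unsure, wrong
    [0.10, 0.10, 0.80],  # confident, correct
    [0.34, 0.33, 0.33],  # unsure, wrong
]
demo_labels = [0, 1, 2, 2]
print(accuracy_report(demo_probs, demo_labels))  # (0.5, 1.0, 2)
```

The third value matters as much as the second: a high-confidence accuracy computed over very few matches says little about the model.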
---
## 6. Deployment & Usage
### 1. Docker
The AI Engine is now part of `docker-compose.yml`.
```bash
docker-compose up -d --build
```
### 2. Backend Usage
NestJS services (`SmartCouponService`, `PredictionsProcessor`) now use `axios` to call the AI Engine:
```typescript
// Example call
const response = await axios.post(`http://ai-engine:8000/predict/v17/${matchId}`);
```
### 3. API Endpoints
* `POST /predict/v17/{match_id}`: Single match prediction.
* `GET /health`: Health check.
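For a quick smoke test outside NestJS, the prediction endpoint can also be called from Python with the standard library. This is a sketch: the host and port are copied from the axios example above, the helper names are hypothetical, and the JSON response body is an assumption since the response schema is not documented here.

```python
import json
import urllib.request
from urllib.parse import quote

AI_ENGINE_URL = "http://ai-engine:8000"  # host and port as used in the axios example


def build_predict_request(match_id: str, base_url: str = AI_ENGINE_URL) -> urllib.request.Request:
    """Build the POST request for a single-match prediction."""
    url = f"{base_url}/predict/v17/{quote(str(match_id), safe='')}"
    return urllib.request.Request(url, data=b"", method="POST")


def predict(match_id: str, timeout: float = 5.0) -> dict:
    """Send the request and decode the body (ASSUMPTION: the service returns JSON)."""
    with urllib.request.urlopen(build_predict_request(match_id), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_predict_request("abc123")
print(request.get_method(), request.full_url)
# POST http://ai-engine:8000/predict/v17/abc123
```

Note that `ai-engine` resolves only inside the Docker network; from the host, pass whatever address the compose file publishes as `base_url`.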
---
## 7. Future Roadmap (TODOs)
1. **Curriculum Learning V2:** Sort training data by "Difficulty" (Goal Difference) to teach easy matches first. (Partially implemented).
2. **Live Dashboard:** A simple frontend page to visualize `v17_comprehensive.pth` training metrics in real-time.
3. **Auto-Retraining:** A Cron job (BullMQ) to retrain the model every Monday with the weekend's results.
4. **Feedback Loop:** Integrate `ai_predictions_log` table to feed "Wrong Predictions" back into training with higher weight.
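As an illustration of the curriculum idea in item 1, the sketch below orders matches so that large goal differences come first. Treating a larger goal difference as "easier" is an assumption about the intended definition of difficulty, and `TrainingMatch` is a hypothetical record type.

```python
from typing import List, NamedTuple, Sequence


class TrainingMatch(NamedTuple):
    # Hypothetical record; only the fields needed for ordering are included.
    match_id: int
    home_goals: int
    away_goals: int


def curriculum_order(matches: Sequence[TrainingMatch]) -> List[TrainingMatch]:
    """Sort easy-first: the bigger the absolute goal difference, the earlier the match."""
    return sorted(
        matches,
        key=lambda m: abs(m.home_goals - m.away_goals),
        reverse=True,  # the sort stays stable, so ties keep their original order
    )


# Made-up results, for illustration only.
demo_matches = [
    TrainingMatch(1, 1, 1),  # draw: hardest
    TrainingMatch(2, 4, 0),  # four-goal margin: easiest
    TrainingMatch(3, 2, 1),  # one-goal margin
]
print([m.match_id for m in curriculum_order(demo_matches)])  # [2, 3, 1]
```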
---
*Generated by Gemini CLI Agent - Senior ML Engineer*