iddaai-be/mds/archive/V17_MIGRATION_AND_TRAINING_GUIDE.md

# V17 "Galacticos" Migration & Training Guide
**Date:** February 6, 2026
**Status:** Architecture Implemented / Training Ready

---

## 1. Executive Summary
This document details the transition of the Suggest-Bet AI Engine from a monolithic, script-based system to a scalable, service-oriented architecture (SOA) powered by FastAPI and Docker. It also documents the rigorous data analysis that led to the "Strict Filtering" training strategy for the V17 Model.

**Core Achievement:** The system now represents each player with a learned embedding, evaluates match context (odds, form) per request, and serves predictions through a persistent, low-latency HTTP API.

---

## 2. Architecture Overhaul

### Before (Legacy)
*   **Execution:** The backend spawned a new Python process (`child_process.spawn`) for every prediction request.
*   **Performance:** ~3-5 seconds of latency per request, because the PyTorch models were reloaded from disk each time.
*   **Maintenance:** Spaghetti code mixing API logic, feature engineering, and training scripts.
*   **Integration:** Brittle stdout parsing (Backend read text output from Python).

### After (V17 Beast Mode)
*   **Execution:** Persistent Dockerized Service (`ai-engine`).
*   **Performance:** <100ms latency (Models kept in RAM).
*   **Structure:** Clean separation of concerns:
    *   `ai-engine/app`: FastAPI Routes (HTTP Layer).
    *   `ai-engine/core`: Pure Business Logic (Model, Features).
    *   `ai-engine/data`: Database Abstraction.
*   **Integration:** Type-safe HTTP requests via `axios` from NestJS.
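
The latency difference comes down to loading the model once and reusing it across requests. The sketch below illustrates that pattern with the standard library only; `ModelRegistry`, `slow_fake_loader`, and `handle_request` are hypothetical names, and the fake loader stands in for reading PyTorch weights from disk. It is not the actual `ai-engine` code.

```python
import time


class ModelRegistry:
    """Holds loaded models in memory so each request reuses them."""

    def __init__(self, loader):
        self._loader = loader   # callable that performs the expensive load
        self._models = {}       # model name -> loaded model object
        self.load_count = 0     # how many times the loader actually ran

    def get(self, name):
        # Load on first use only; later calls hit the in-memory cache.
        if name not in self._models:
            self._models[name] = self._loader(name)
            self.load_count += 1
        return self._models[name]


def slow_fake_loader(name):
    # Hypothetical stand-in for reading model weights from disk.
    time.sleep(0.05)
    return lambda match_id: {"model": name, "match_id": match_id}


registry = ModelRegistry(slow_fake_loader)


def handle_request(match_id):
    # In a real service this would be the body of an HTTP route handler.
    model = registry.get("v17")
    return model(match_id)


results = [handle_request(match_id) for match_id in (101, 102, 103)]
print(registry.load_count)
```

Only the first request pays the loading cost. The legacy design paid it on every request because each spawned process started with an empty memory.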

---

## 3. Data Analysis & Strategy Shift

We analyzed the database (~167k matches) and discovered critical data quality issues that were poisoning previous models.

### The "Garbage Data" Problem
*   **Total Matches:** ~167,000
*   **Matches with Lineups:** Only ~73,000 (44%)
*   **Matches with Odds:** ~94,000 (56%)
*   **Intersection (Quality Data):** < 50,000 matches.

**Conclusion:** Training on the full dataset forces the model to learn from "blind" matches (missing lineups) or "contextless" matches (missing odds), which produces unreliable, poorly grounded predictions.

### The "Top Leagues" Solution
We analyzed 20 Top Leagues (Premier League, LaLiga, etc.) and found much higher lineup coverage than in the database as a whole (44%). For example:
*   **Premier League:** 77% Lineup Coverage.
*   **Championship:** 88% Lineup Coverage.

**Decision:** The V17 training pipeline now **strictly filters** for:
1.  Top 20 Leagues (`top_leagues.json`).
2.  Full Lineup Availability (11+ players per team).
3.  Odds Availability.

This reduces the dataset size but substantially increases the **signal-to-noise ratio**.
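
The three filters above can be sketched as a single predicate. This is a minimal illustration, not the trainer's actual code: the field names (`league_id`, `home_lineup`, `away_lineup`, `odds`) and the league IDs are assumptions made for the example, and the real pipeline presumably reads its league list from `top_leagues.json`.

```python
def is_training_quality(match, top_league_ids, min_players=11):
    """Return True if a match passes all three strict filters."""
    in_top_league = match["league_id"] in top_league_ids
    full_lineups = (
        len(match.get("home_lineup", [])) >= min_players
        and len(match.get("away_lineup", [])) >= min_players
    )
    has_odds = match.get("odds") is not None
    return in_top_league and full_lineups and has_odds


# Hypothetical league IDs and match records, for illustration only.
top_league_ids = {39, 140}
players = list(range(11))

matches = [
    {"id": 1, "league_id": 39, "home_lineup": players, "away_lineup": players,
     "odds": {"home": 1.8, "draw": 3.5, "away": 4.2}},
    # Fails filter 1: league not in the top list.
    {"id": 2, "league_id": 999, "home_lineup": players, "away_lineup": players,
     "odds": {"home": 2.1, "draw": 3.2, "away": 3.4}},
    # Fails filter 2: away lineup is incomplete.
    {"id": 3, "league_id": 140, "home_lineup": players, "away_lineup": players[:7],
     "odds": {"home": 1.5, "draw": 4.0, "away": 6.0}},
    # Fails filter 3: no odds recorded.
    {"id": 4, "league_id": 140, "home_lineup": players, "away_lineup": players,
     "odds": None},
]

kept_ids = [m["id"] for m in matches if is_training_quality(m, top_league_ids)]
print(kept_ids)
```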

---

## 4. The V17 Model "Galacticos"

### Philosophy
Instead of rating "Team A vs Team B", V17 rates "These 11 Players vs Those 11 Players" in the context of current odds and form.

### Input Vector (The Brain)
1.  **Player Embeddings:** Each of the ~17,000 elite players has a learnable 32-dimensional vector.
2.  **Context Vector (32-dim):**
    *   **Odds (9):** 1X2, Over/Under, BTTS (Normalized).
    *   **Form (12):** Goals Scored/Conceded, Win Rate (Home/Away).
    *   **H2H (3):** Historical dominance.
    *   **Advanced (8):** Rest Days (Fatigue), League Standing Diff.
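
The four context groups add up to the stated 32 dimensions (9 + 12 + 3 + 8). The sketch below assembles them in a fixed order and checks each group's size. The function name, the group keys, and the ordering are assumptions for illustration; the zero-filled inputs are placeholders, not real features.

```python
# Group sizes as listed above; the order within the vector is an assumption.
CONTEXT_LAYOUT = (("odds", 9), ("form", 12), ("h2h", 3), ("advanced", 8))
CONTEXT_DIM = sum(size for _, size in CONTEXT_LAYOUT)

# Size of the player embedding table: ~17,000 players x 32 dimensions.
EMBEDDING_PARAMS = 17_000 * 32


def build_context_vector(groups):
    """Concatenate feature groups into one flat list, validating sizes."""
    vector = []
    for name, size in CONTEXT_LAYOUT:
        values = groups[name]
        if len(values) != size:
            raise ValueError(f"{name}: expected {size} values, got {len(values)}")
        vector.extend(float(v) for v in values)
    return vector


# Placeholder inputs (all zeros) just to exercise the shape check.
example = build_context_vector(
    {"odds": [0] * 9, "form": [0] * 12, "h2h": [0] * 3, "advanced": [0] * 8}
)
print(len(example), CONTEXT_DIM, EMBEDDING_PARAMS)
```

Validating group sizes at assembly time catches a missing feature early, before it silently shifts every later position in the vector.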

### Output Heads (Multi-Task Learning)
The model is trained on **all targets at once**, so its shared representation has to account for the result, the scoreline, and the goal markets together:
*   **Match Result (1X2):** CrossEntropy Loss.
*   **Goals (Home/Away):** MSE Loss.
*   **BTTS & Over 2.5:** Binary CrossEntropy Loss.
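
The sketch below shows one way the per-head losses could combine into a single training objective for one match. The equal weights, and the assumption that the heads emit raw logits, are illustrative choices rather than anything confirmed for `trainer_v17.py`. The real trainer would use PyTorch loss modules instead of these standard-library versions.

```python
import math


def cross_entropy(logits, target_index):
    # Negative log-softmax of the target class, computed stably.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[target_index]


def mse(predicted, actual):
    # Mean squared error over the (home, away) goal predictions.
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


def binary_cross_entropy(logit, label):
    # BCE on a raw logit: softplus(logit) - label * logit.
    softplus = max(logit, 0.0) + math.log1p(math.exp(-abs(logit)))
    return softplus - label * logit


def multi_task_loss(outputs, targets, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four head losses (the weights are an assumption)."""
    w_result, w_goals, w_btts, w_over = weights
    return (
        w_result * cross_entropy(outputs["result_logits"], targets["result"])
        + w_goals * mse(outputs["goals"], targets["goals"])
        + w_btts * binary_cross_entropy(outputs["btts_logit"], targets["btts"])
        + w_over * binary_cross_entropy(outputs["over25_logit"], targets["over25"])
    )


# An uninformed model: uniform 1X2 logits, zero logits on both binary heads,
# and (for simplicity) a perfect goals prediction.
outputs = {"result_logits": [0.0, 0.0, 0.0], "goals": [1.0, 2.0],
           "btts_logit": 0.0, "over25_logit": 0.0}
targets = {"result": 0, "goals": [1.0, 2.0], "btts": 1, "over25": 0}

loss = multi_task_loss(outputs, targets)
print(round(loss, 4))
```

With these inputs the loss is ln 3 + 2 ln 2: ln 3 from the uniform 1X2 head and ln 2 from each binary head at logit 0. That is the baseline a trained model should beat.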

---

## 5. How to Train (The Beast Trainer)

The training logic is consolidated into a single robust script: `ai-engine/core/training/trainer_v17.py`.

### Prerequisites
*   Docker container running (`ai-engine`).
*   Database populated with historical data.

### Execution Command
Run this from your host terminal (project root):

```bash
# 1. Install dependencies (if running locally outside docker)
pip3 install torch numpy pandas tqdm psycopg2-binary scikit-learn python-dotenv

# 2. Run Training
export DATABASE_URL="postgresql://suggestbet:SuGGesT2026SecuRe@127.0.0.1:15432/boilerplate_db?schema=public"
python3 ai-engine/core/training/trainer_v17.py
```

### Key Metrics to Watch
The trainer reports **"High Confidence Accuracy"**.
*   *Bad:* Overall Acc 50%, High Conf Acc 55%.
*   *Good:* Overall Acc 55%, High Conf Acc **>85%**. (This is our goal).
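
This guide does not define how the trainer computes "High Confidence Accuracy". A common reading is accuracy over only those predictions whose top probability clears a threshold. The sketch below follows that reading; the 0.6 threshold, the function name, and the sample predictions are assumptions for illustration.

```python
def accuracy_report(predictions, threshold=0.6):
    """predictions: list of (class_probabilities, true_class_index)."""
    total_correct = 0
    confident_total = 0
    confident_correct = 0
    for probs, truth in predictions:
        predicted = max(range(len(probs)), key=probs.__getitem__)
        correct = predicted == truth
        total_correct += correct
        # Only predictions above the threshold count as "high confidence".
        if probs[predicted] >= threshold:
            confident_total += 1
            confident_correct += correct
    return {
        "overall_acc": total_correct / len(predictions),
        "high_conf_acc": (confident_correct / confident_total
                          if confident_total else None),
        "high_conf_share": confident_total / len(predictions),
    }


# Made-up 1X2 predictions: two confident and correct, two unsure and wrong.
sample = [
    ([0.70, 0.20, 0.10], 0),
    ([0.65, 0.25, 0.10], 0),
    ([0.40, 0.35, 0.25], 1),
    ([0.45, 0.30, 0.25], 2),
]

report = accuracy_report(sample)
print(report)
```

Reporting the share of matches that clear the threshold matters as well: a very high confident accuracy on a tiny fraction of matches is less useful than it looks.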

---

## 6. Deployment & Usage

### 1. Docker
The AI Engine is now part of `docker-compose.yml`.
```bash
docker-compose up -d --build
```

### 2. Backend Usage
NestJS services (`SmartCouponService`, `PredictionsProcessor`) now use `axios` to call the AI Engine:

```typescript
// Example call
const response = await axios.post(`http://ai-engine:8000/predict/v17/${matchId}`);
```

### 3. API Endpoints
*   `POST /predict/v17/{match_id}`: Single match prediction.
*   `GET /health`: Health check.
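
For a quick check from Python, for example from inside another container on the same Docker network, the two endpoints can be called with the standard library alone. This is a minimal sketch: the helper names are hypothetical, and it assumes both endpoints return JSON, which this guide does not state.

```python
import json
import urllib.request

# Host and port taken from the NestJS example above.
BASE_URL = "http://ai-engine:8000"


def predict_url(match_id, base_url=BASE_URL):
    # Build the single-match prediction URL.
    return f"{base_url}/predict/v17/{match_id}"


def call_json(url, method="GET", timeout=5.0):
    # Send a body-less request and parse the response as JSON (assumed format).
    request = urllib.request.Request(url, method=method)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def smoke_test(match_id, base_url=BASE_URL):
    # Check the health endpoint first, then request one prediction.
    health = call_json(f"{base_url}/health")
    prediction = call_json(predict_url(match_id, base_url), method="POST")
    return health, prediction
```

Calling `smoke_test(12345)` with a valid match ID runs both requests. From the host machine, pass a `base_url` that points at whatever port `docker-compose.yml` publishes for the service.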

---

## 7. Future Roadmap (TODOs)

1.  **Curriculum Learning V2:** Sort training data by "Difficulty" (Goal Difference) to teach easy matches first. (Partially implemented).
2.  **Live Dashboard:** A simple frontend page to visualize `v17_comprehensive.pth` training metrics in real time.
3.  **Auto-Retraining:** A Cron job (BullMQ) to retrain the model every Monday with the weekend's results.
4.  **Feedback Loop:** Integrate `ai_predictions_log` table to feed "Wrong Predictions" back into training with higher weight.
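
As an illustration of roadmap item 1, the sketch below orders matches so that lopsided results come first. It assumes that a larger absolute goal difference means an "easier" match, which is one plausible reading of the item. The field names and sample scores are made up.

```python
def curriculum_order(matches):
    """Sort matches from 'easy' (large goal difference) to 'hard' (draws)."""
    return sorted(
        matches,
        key=lambda m: abs(m["home_goals"] - m["away_goals"]),
        reverse=True,
    )


# Made-up results for illustration.
matches = [
    {"id": 1, "home_goals": 2, "away_goals": 1},
    {"id": 2, "home_goals": 5, "away_goals": 0},
    {"id": 3, "home_goals": 0, "away_goals": 0},
    {"id": 4, "home_goals": 0, "away_goals": 3},
]

ordered_ids = [m["id"] for m in curriculum_order(matches)]
print(ordered_ids)
```

A full curriculum would typically also control when harder matches are introduced during training. This sketch covers only the ordering step.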

---
*Generated by Gemini CLI Agent - Senior ML Engineer*