# Suggest-Bet-BE: AI Agent Context

> **Last Updated:** 2026-04-06
> **Purpose:** Comprehensive project reference for AI agents working on this codebase.

---

## 1. Project Overview

**Suggest-Bet-BE** is the backend of an **AI-powered sports betting prediction platform**. It provides:

- AI-driven predictions for football and basketball matches
- Smart coupon generation (SAFE, BALANCED, AGGRESSIVE, VALUE, MIRACLE strategies)
- Live score tracking and odds monitoring
- Web scraping from Mackolik.com for historical and live match data
- Google Gemini AI for natural-language match commentary
- User coupon tracking (ROI and win-rate analytics)

### Technology Stack

| Layer       | Technology                                   |
| ----------- | -------------------------------------------- |
| Backend API | NestJS 11 (TypeScript)                       |
| AI Engine   | Python FastAPI (v20+)                        |
| Database    | PostgreSQL 16 + Prisma ORM                   |
| Queue       | BullMQ + Redis (optional)                    |
| Cache       | Redis or in-memory fallback                  |
| Auth        | JWT + Passport (Access 15min + Refresh 7day) |
| Scraping    | Axios + Cheerio (Mackolik HTML parsing)      |
| Logging     | Pino (structured logging)                    |
| i18n        | nestjs-i18n (TR, EN)                         |
| API Docs    | Swagger                                      |
| Deploy      | Docker Compose                               |

---

## 2. Architecture

```
┌──────────────────────────────────────────────────────────────────┐
│                       CLIENTS (Web/Mobile)                       │
└───────────────────────────────┬──────────────────────────────────┘
                                │ HTTP/REST
┌───────────────────────────────▼──────────────────────────────────┐
│                    NestJS Backend (Port 3005)                    │
│  ┌─────────┬──────────┬──────────┬──────────┬─────────────────┐  │
│  │ Auth    │ Admin    │ Matches  │ Leagues  │ Predictions     │  │
│  │ Module  │ Module   │ Module   │ Module   │ Module          │  │
│  ├─────────┼──────────┼──────────┼──────────┼─────────────────┤  │
│  │ Coupons │ Analysis │ Gemini   │ Social-  │ Health          │  │
│  │ Module  │ Module   │ Module   │ Poster   │ Module          │  │
│  │SporToto │ Feeder   │ Users    │          │                 │  │
│  └─────────┴──────────┴──────────┴──────────┴─────────────────┘  │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ Services: AiService | MatchAnalysis | Scraper                │ │
│ ├──────────────────────────────────────────────────────────────┤ │
│ │ Tasks: DataFetcher (Cron) | LiveUpdater | LimitResetter      │ │
│ └──────────────────────────────────────────────────────────────┘ │
└────┬─────────────────┬────────────────────┬──────────────────────┘
     │                 │                    │
     ▼                 ▼                    ▼
┌────────────┐  ┌──────────────┐   ┌──────────────────┐
│ PostgreSQL │  │ Redis/BullMQ │   │ AI Engine (py)   │
│  (3.6GB)   │  │ (Optional)   │   │ FastAPI:8000     │
└────────────┘  └──────────────┘   └──────────────────┘
        │
┌───────▼───────┐
│ Mackolik API  │
│ (Data Source) │
└───────────────┘
```

### Database Statistics (approximate)

- `matches`: 237K permanent match records
- `live_matches`: ~82 active/upcoming matches (daily cycle)
- `match_player_participation`: 3.3M
- `odd_selections`: 8.5M
- `teams`: 19,595 | `players`: 217K | `leagues`: 1,505

---

## 3. Directory Structure

```
src/
├── app.module.ts                  # Root module (Redis, Config, i18n, guards)
├── main.ts                        # Entry point, Swagger, Helmet, ValidationPipe
├── common/                        # Shared layer
│   ├── base/                      # Generic BaseService & BaseController
│   ├── types/                     # ApiResponse, pagination DTOs
│   ├── filters/                   # GlobalExceptionFilter (HTTP 200 wrapper)
│   ├── interceptors/              # ResponseInterceptor, SanitizeInterceptor
│   ├── decorators/                # @Public(), @Roles(), @CurrentUser()
│   └── queues/                    # BullMQ queue module
├── config/                        # Env validation (Zod), config factories
├── database/                      # PrismaService
├── i18n/                          # TR/EN translations (common, errors, validation, auth)
├── modules/                       # 13 feature modules
│   ├── admin/                     # Superadmin panel (user mgmt, settings, analytics)
│   ├── analysis/                  # Multi-match analysis orchestration
│   ├── auth/                      # JWT auth, refresh tokens, guards
│   ├── coupons/                   # SmartCouponService (5 strategies), UserCouponService
│   ├── feeder/                    # Historical data scraping (Mackolik)
│   ├── gemini/                    # Google Gemini AI integration
│   ├── health/                    # Liveness, readiness, AI Engine health
│   ├── leagues/                   # Country/league/team discovery, H2H
│   ├── matches/                   # Match listing, details, active leagues
│   ├── predictions/               # AI predictions with BullMQ queue & 6h cache
│   ├── social-poster/             # Twitter API v2, Canvas image generation
│   ├── spor-toto/                 # Spor Toto integration
│   └── users/                     # User CRUD (BaseController pattern)
├── scripts/                       # Feeder runners, cleanup scripts
├── services/                      # Shared services
│   ├── ai.service.ts              # Python AI Engine bridge
│   ├── match-analysis.service.ts  # 7-phase analysis orchestrator
│   └── scraper.service.ts         # Mackolik HTML scraping
└── tasks/                         # Cron jobs (15min, 30min, daily)
    ├── data-fetcher.task.ts       # Live matches, odds fetching
    ├── live-updater.task.ts       # Score updates, match finalization
    └── limit-resetter.task.ts     # Usage limits, subscription expiry

ai-engine/                         # Python FastAPI ML engine
├── main.py                        # FastAPI app, routes
├── services/                      # single_match_orchestrator.py
├── core/                          # Core algorithms
├── features/                      # Feature engineering
├── models/                        # ML models
├── training/                      # Model training scripts
├── config/                        # Configuration
├── utils/                         # Utility functions
└── tests/                         # Test files
```

---

## 4. Key Modules

### Auth Module

- Register, Login, Refresh, Logout endpoints
- bcrypt (12 rounds), JWT Access (15min) + Refresh Token (7 days, DB-stored)
- Global guards: `JwtAuthGuard`, `RolesGuard`, `PermissionsGuard`

### Predictions Module

- Requires Redis (`REDIS_ENABLED=true`); the module is loaded conditionally
- BullMQ queue with worker processor
- 6-hour TTL cache on prediction results
- AI Engine call: `POST /v20plus/analyze/{match_id}`

### Coupons Module

- `SmartCouponService`: 5 strategies (SAFE ≥78% confidence/2 matches, BALANCED, AGGRESSIVE, VALUE EV+, MIRACLE)
- `UserCouponService`: Coupon creation and bet settlement for MS 1/X/2 (match result), Alt/Üst (under/over), and KG Var/Yok (both teams to score, yes/no)

### Feeder Module

- Historical scraping from 2023-06-01 to present (reverse chronological)
- Concurrency=20, 300ms delay, max 50 retries, exponential backoff on HTTP 502
- Resume support with state management

### Analysis Module

- Usage limits: Free (10 analyses/3 coupons per day) vs Premium (50 analyses/10 coupons per day)
- 7-phase flow: URL Parse → Scrape → Python Engine → Strategy → Similar Matches → Final Prediction → DB Save

### Social Poster Module

- Twitter API v2 integration
- Canvas-based prediction card image generation
- Gemini-powered Turkish caption generation

---

## 5. Scheduled Tasks (Cron)

| Task                        | Schedule       | Description                                                              |
| --------------------------- | -------------- | ------------------------------------------------------------------------ |
| `fetchLiveMatches()`        | `*/15 * * * *` | Fetch football matches from Mackolik API                                 |
| `fetchOddsForPreMatches()`  | `*/15 * * * *` | Fetch odds for upcoming matches (football + basketball)                  |
| `fetchBasketballMatches()`  | Manual         | Basketball data via `basketball_top_leagues.json` filter                 |
| `updateLiveScores()`        | `*/15 * * * *` | Update live match scores                                                 |
| `finalizeFinishedMatches()` | `*/30 * * * *` | Migrate finished matches: `live_matches` → `matches` table               |
| `resetUsageLimits()`        | `0 3 * * *`    | Reset daily usage limits (03:00 Istanbul time)                           |
| `cleanupOldData()`          | `0 4 * * *`    | Delete AI logs older than 30 days and finished `live_matches` older than 1 day |
| `checkSubscriptions()`      | `0 0 * * *`    | Mark expired subscriptions                                               |

---

## 6. AI Engine (Python FastAPI)

Independent microservice on port 8000.

### Endpoints

| Method | Path                               | Description                     |
| ------ | ---------------------------------- | ------------------------------- |
| POST   | `/v20plus/analyze/{match_id}`      | Single match analysis (main)    |
| GET    | `/v20plus/analyze-htms/{match_id}` | First half - Full time analysis |
| GET    | `/v20plus/analyze-htft/{match_id}` | HT/FT probabilities             |
| POST   | `/v20plus/coupon`                  | Smart coupon generation         |
| GET    | `/v20plus/daily-banker`            | Daily banker picks              |
| GET    | `/v20plus/reversal-watchlist`      | Score reversal watchlist        |
| GET    | `/health`                          | Health check                    |

### Output Structure (`SingleMatchPredictionPackage`)

```typescript
{
  model_version: "v20plus.X",
  match_info: { match_id, match_name, home_team, away_team, league, match_date_ms },
  data_quality: { label: "HIGH"|"MEDIUM"|"LOW", score, flags, lineup_counts },
  risk: { level: "LOW"|"MEDIUM"|"HIGH"|"EXTREME", score, is_surprise_risk, warnings },
  main_pick: { market, pick, probability, confidence, odds, bet_grade, edge },
  value_pick: { ... },
  bet_advice: { playable, suggested_stake_units, reason },
  bet_summary: [{ market, pick, raw_confidence, calibrated_confidence, bet_grade }],
  supporting_picks: [...],
  aggressive_pick: { market, pick, probability, confidence, odds },
  scenario_top5: [{ score, prob }],
  score_prediction: { ft, ht, xg_home, xg_away, xg_total },
  market_board: { ... },
  reasoning_factors: string[],
  ai_commentary: string // Turkish commentary from Gemini
}
```

---

## 7. API Response Format

All responses follow this standard structure:

```json
{
  "success": true,
  "status": 200,
  "message": "İşlem başarıyla tamamlandı", // i18n translated
  "data": { ... },
  "errors": []
}
```

**Critical Rule:** Controllers must NEVER return raw Prisma entities. Always use Response DTOs with `@Exclude()` and `@Expose()` from `class-transformer`.

---

## 8. Configuration

### Environment Variables

```env
NODE_ENV=development
PORT=3005
DATABASE_URL=postgresql://user:password@localhost:15432/boilerplate_db
JWT_SECRET=your-secret-key
JWT_ACCESS_EXPIRATION=15m
JWT_REFRESH_EXPIRATION=7d
REDIS_ENABLED=false
REDIS_HOST=localhost
REDIS_PORT=6379
AI_ENGINE_URL=http://127.0.0.1:8000
ENABLE_GEMINI=false
GOOGLE_API_KEY=your-api-key
```

### Config Files

- `top_leagues.json`: Football top league IDs (live match filter)
- `basketball_top_leagues.json`: Basketball top league IDs
- `bet-type.json`: Bet type definitions

---

## 9. Build & Run Commands

```bash
# Development
npm run start:dev                  # Watch mode (port 3005)

# Production
npm run build && npm run start:prod

# Feeder (Data Collection)
npm run feeder:historical          # Historical scraping (2023-06→present)
npm run feeder:fill-gaps           # Fill missing data
npm run feeder:basketball          # Basketball data
npm run feeder:live                # Live data

# Database
npx prisma generate                # Regenerate Prisma client
npx prisma migrate dev             # Run migrations
npx prisma db seed                 # Seed database

# Testing
npm run test                       # Unit tests
npm run test:e2e                   # E2E tests
npx jest src/path/to/file.spec.ts  # Single test file

# Lint/Format
npm run lint                       # ESLint with Prettier
npm run format                     # Prettier write

# Docker
docker-compose up -d postgres redis  # Infrastructure
docker-compose up -d                 # All services

# AI Engine (Python)
cd ai-engine && uvicorn main:app --host 0.0.0.0 --port 8000 --reload

# Utility
npm run swagger:summary            # Export endpoint summary
npm run cleanup:live               # Cleanup live matches
```

---

## 10. Code Style Guidelines

### Imports Order

```typescript
// 1. NestJS/common imports
import { Controller, Get, Post, Body } from '@nestjs/common';

// 2. External packages
import * as bcrypt from 'bcrypt';

// 3. Local imports (relative)
import { UsersService } from './users.service';
```

### Naming Conventions

- Classes/Interfaces: `PascalCase`
- Variables/Functions: `camelCase`
- Constants: `UPPER_SNAKE_CASE`
- Files: `kebab-case`
- DTOs: `Entity + Dto` suffix (CreateUserDto, UpdateUserDto)

### Types

- `strictNullChecks: true`: null/undefined checks are required
- `noImplicitAny: false`: `any` is allowed (Prisma dynamic access)
- Specify function return types: `async findOne(id: string): Promise`

### Error Handling

```typescript
// Use NestJS HTTP Exceptions with i18n keys
throw new NotFoundException('USER_NOT_FOUND');
throw new ConflictException('EMAIL_ALREADY_EXISTS');
// Reference src/i18n/{lang}/errors.json for available keys
```

---

## 11. Known Issues & Gotchas

1. **Predictions module** requires Redis. It is disabled when `REDIS_ENABLED=false`.
2. **Gemini AI** is optional. Commentary is returned as `null` when it is disabled.
3. **Global Exception Filter** wraps all errors as HTTP 200 (the real status is in the response body).
4. **Lineup scraping** is disabled; only Team Stats are used (V20 optimization).
5. **Feeder V17 AI feature calculation** is disabled; the V20 model runs in Python.
6. **BigInt serialization**: `BigInt.prototype.toJSON = function() { return this.toString(); }` polyfill in main.ts.
7. **i18n assets** are copied via the `nest-cli.json` `"assets": ["i18n/**/*"]` config.

---

## 12. Reference Files for AI Agents

When working on this project, consult:

- `project_summary.md`: Comprehensive project documentation (Turkish)
- `README.md`: Architecture decisions, quick start guide
- `prompt.md`: AI assistant reference guide with agent roles
- `AGENTS.md`: Coding guidelines, DTO patterns, test structure
- `.agent/`: Skills and agent role definitions
- `top_leagues.json` / `basketball_top_leagues.json`: League filters

---

## 13. Team Logos

Team logo URL template: `https://file.mackolikfeeds.com/teams/{teamId}`

---

## 14. 🆕 VQWEN Model Integration (Since 2026-04-06)

A new high-performance prediction engine, **VQWEN v3**, has been integrated.

### VQWEN Model Features

- **Accuracy:** +244.4 units profit in the time-series backtest (75.1% win rate on BTTS/Over markets).
- **Features Used:**
  - `ELO Ratings` (real-time team strength).
  - `Contextual Goals` (home/away-specific performance).
  - `Rest Days` (fatigue factor for teams with fewer than 3 days of rest).
  - `H2H Win Rate` (historical dominance).
  - `Form Points` (streak over the last 5 games).
  - `Squad Strength` (based on starting XI participation).
- **Files:**
  - `ai-engine/scripts/train_vqwen_v3.py`: Training script.
  - `ai-engine/services/single_match_orchestrator.py`: Integration point.
  - `ai-engine/models/vqwen/`: Pickle models (`vqwen_ms.pkl`, etc.).
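
To make the feature list above more concrete, here is a minimal Python sketch of how three of the listed features (ELO ratings, rest days, form points) could be computed. It is an illustration only: the function names, the K-factor of 20, the absence of a home-advantage term, and the 3/1/0 point mapping are all assumptions, not the actual contents of `train_vqwen_v3.py`.

```python
from datetime import date

# Hypothetical helpers for illustration; the real VQWEN feature code may differ.


def elo_expected(home_elo: float, away_elo: float) -> float:
    """Standard Elo expected score for the home side (no home-advantage term assumed)."""
    return 1.0 / (1.0 + 10 ** ((away_elo - home_elo) / 400.0))


def elo_update(rating: float, expected: float, actual: float, k: float = 20.0) -> float:
    """Move a rating toward the observed result (1=win, 0.5=draw, 0=loss); k=20 is assumed."""
    return rating + k * (actual - expected)


def is_fatigued(last_match: date, next_match: date, threshold_days: int = 3) -> bool:
    """Flag a team whose rest before the next match is shorter than threshold_days."""
    return (next_match - last_match).days < threshold_days


def form_points(results: list[str]) -> int:
    """Sum points over the five most recent results (oldest first); W=3, D=1, L=0 assumed."""
    points = {"W": 3, "D": 1, "L": 0}
    return sum(points[r] for r in results[-5:])


if __name__ == "__main__":
    expected = elo_expected(1600.0, 1400.0)
    print(f"home expected score: {expected:.3f}")
    print(f"home rating after a win: {elo_update(1600.0, expected, 1.0):.1f}")
    print("fatigued:", is_fatigued(date(2026, 4, 4), date(2026, 4, 6)))
    print("form points:", form_points(["L", "W", "W", "D", "L", "W"]))
```

In a sketch like this, each helper yields one numeric or boolean column per team per match; how the real pipeline combines them with `Contextual Goals`, `H2H Win Rate`, and `Squad Strength` is not specified here.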

### New Live Lineup/Sidelined Fetcher

- **Problem:** `lineups` and `sidelined` columns in `live_matches` were empty.
- **Fix:** Added `updateLineupsAndSidelined()` method to `src/tasks/data-fetcher.task.ts`.
- **Mechanism:** Calls `FeederScraperService.fetchStartingFormation` directly from a cron job (`*/15 * * * *`).
- **Status:** Active.

### Database Schema Updates

- **`substate` Column:** Added to the `matches` table to track specific match states (e.g., "penalties", "overtime", "postponed").
- **Sport Partition:** Tables are now partitioned by sport (`football_team_stats` vs `basketball_team_stats`).

---

## 16. 🔍 HT/FT Reversal Analysis (Since 2026-04-07)

### HT/FT Reversal (1/2 & 2/1) Pattern Detection

Reversal matches (İY/MS, i.e. HT/FT, = 1/2 or 2/1) are statistically rare events that can indicate match-fixing or unusual patterns.

#### Key Findings (147,248 matches analyzed)

| Metric | Value |
|--------|-------|
| **Total Reversal Matches** | 13,112 (8.90%) |
| **1/2 (Home leads HT, Away wins FT)** | 5,992 (4.07%) |
| **2/1 (Away leads HT, Home wins FT)** | 7,120 (4.84%) |

#### 🚨 Basketball Leagues Have Suspiciously High Reversal Rates

| League | Reversals | Total | Rate |
|--------|-----------|-------|------|
| Eurobasket U20 | 36 | 120 | **30.00%** 🔴 |
| EuroLeague 🏀 | 183 | 639 | **28.64%** 🔴 |
| PBA Commissioners 🏀 | 54 | 189 | **28.57%** 🔴 |
| Ulusal Süper Lig 🏀 | 148 | 547 | **27.06%** 🔴 |
| NBA 🏀 | 656 | 2,696 | **24.33%** 🔴 |

**All top 15 leagues by reversal rate are BASKETBALL.** Football leagues show normal rates (5-8%).

#### Suspicious Patterns

1. **Comeback Magnitude:**
   - 1 goal/point: 36.1% (normal)
   - 2 goals/points: 13.1% (suspicious)
   - **3+ goals/points: 50.8%** 🔴 **EXTREMELY HIGH**
2. **Extreme Comebacks (Basketball):**
   - Mineros vs Irapuato: HT 39-45 → FT 102-61 (home trailed by 6 at HT, won by 41)
   - Utah vs Memphis: HT 65-64 → FT 103-140 (home led by 1 at HT, lost by 37)
   - These are statistically near-impossible without manipulation
3. **Favorite Loss Rate:**
   - In 42.7% of reversals the pre-match favorite lost (expected: roughly 25-30%)

#### Impact on Model

- HT/FT model accuracy: **20.3%** (low due to reversal noise)
- Basketball reversal data creates **training noise**
- **Recommendation:** Either exclude basketball from HT/FT training or train a separate basketball-specific model

#### HT/FT Model Files

- **Training script:** `ai-engine/scripts/train_htft_vqwen.py`
- **Model output:** `ai-engine/models/xgboost/xgb_ht_ft.json` + `.pkl`
- **Features:** 27 (Odds + HT/FT Tendencies + League stats)
- **Status:** Working, outputs 9-class probabilities in `market_board.HTFT.probs`

---

## 17. 🐛 Lineup Parsing Fix (Since 2026-04-07)

### Problem

The AI Engine reported `"lineup_unavailable"` and `"lineup_incomplete"` flags even when `live_matches.lineups` contained full 11/11 lineup data from Mackolik.

### Root Cause

Mackolik stores lineups under a `"stats"` key:

```json
{
  "stats": {
    "home": [{ "personId": "...", "position": "...", ... }, ...],
    "away": [{ "personId": "...", "position": "...", ... }, ...]
  }
}
```

But the parser expected `"xi"`, `"starting"`, or `"lineup"` keys at root level.

### Fix

Updated `_parse_lineups_json()` in `ai-engine/services/single_match_orchestrator.py`:

- Added a fallback that checks `lineups_json.get("stats")` for the home/away arrays
- The parser now handles Mackolik's nested format correctly
- Result: `home_lineup_count: 11`, `away_lineup_count: 11`, `lineup_source: "confirmed_live"`

---

## 18. Docker Deployment

```yaml
# docker-compose.yml services:
services:
  app:        # NestJS (port 3000→3000)
  postgres:   # PostgreSQL 17 Alpine (port 15432:5432)
  redis:      # Redis 7 Alpine (port 6379)
  adminer:    # Database UI (dev profile, port 8080)
  ai-engine:  # Python FastAPI (port 8002:8000)
```

---

_This file is maintained for AI agent context. Update when architecture or conventions change._