iddaai-be/QWEN.md
fahricansecer 7814e0bc6b
first (part 1: root files)
2026-04-16 15:09:10 +03:00


Suggest-Bet-BE — AI Agent Context

Last Updated: 2026-04-06
Purpose: Comprehensive project reference for AI agents working on this codebase.


1. Project Overview

Suggest-Bet-BE is an AI-powered sports betting prediction platform backend. It provides:

  • AI-driven predictions for football & basketball matches
  • Smart coupon generation (SAFE, BALANCED, AGGRESSIVE, VALUE, MIRACLE strategies)
  • Live score tracking & odds monitoring
  • Web scraping from Mackolik.com for historical & live match data
  • Google Gemini AI for natural language match commentary
  • User coupon tracking (ROI, Win Rate analytics)

Technology Stack

| Layer | Technology |
| --- | --- |
| Backend API | NestJS 11 (TypeScript) |
| AI Engine | Python FastAPI (v20+) |
| Database | PostgreSQL 16 + Prisma ORM |
| Queue | BullMQ + Redis (optional) |
| Cache | Redis or in-memory fallback |
| Auth | JWT + Passport (Access 15min + Refresh 7day) |
| Scraping | Axios + Cheerio (Mackolik HTML parsing) |
| Logging | Pino (structured logging) |
| i18n | nestjs-i18n (TR, EN) |
| API Docs | Swagger |
| Deploy | Docker Compose |

2. Architecture

┌──────────────────────────────────────────────────────────────────┐
│                        CLIENTS (Web/Mobile)                       │
└───────────────────────────────┬──────────────────────────────────┘
                                │ HTTP/REST
┌───────────────────────────────▼──────────────────────────────────┐
│                    NestJS Backend (Port 3005)                      │
│  ┌─────────┬──────────┬──────────┬──────────┬─────────────────┐  │
│  │  Auth   │  Admin   │ Matches  │ Leagues  │  Predictions    │  │
│  │ Module  │ Module   │  Module  │  Module  │   Module        │  │
│  ├─────────┼──────────┼──────────┼──────────┼─────────────────┤  │
│  │ Coupons │ Analysis │  Gemini  │ Social-  │    Health       │  │
│  │ Module  │  Module  │  Module  │ Poster   │   Module        │  │
│  ├─────────┼──────────┼──────────┼──────────┼─────────────────┤  │
│  │SporToto │  Feeder  │  Users   │          │                 │  │
│  └─────────┴──────────┴──────────┴──────────┴─────────────────┘  │
│  ┌──────────────────────────────────────────────────────────────┐ │
│  │  Services: AiService | MatchAnalysis | Scraper              │ │
│  ├──────────────────────────────────────────────────────────────┤ │
│  │  Tasks: DataFetcher (Cron) | LiveUpdater | LimitResetter    │ │
│  └──────────────────────────────────────────────────────────────┘ │
└────┬─────────────────┬────────────────────┬──────────────────────┘
     │                 │                    │
     ▼                 ▼                    ▼
┌─────────┐    ┌──────────────┐    ┌──────────────────┐
│PostgreSQL│    │ Redis/BullMQ │    │ AI Engine (py)   │
│  (3.6GB) │    │  (Optional)  │    │ FastAPI:8000     │
└─────────┘    └──────────────┘    └──────────────────┘
                                           │
                                   ┌───────▼───────┐
                                   │ Mackolik API   │
                                   │ (Data Source)  │
                                   └───────────────┘

Database Statistics (approximate)

  • matches: 237K permanent match records
  • live_matches: ~82 active/upcoming matches (daily cycle)
  • match_player_participation: 3.3M
  • odd_selections: 8.5M
  • teams: 19,595 | players: 217K | leagues: 1,505

3. Directory Structure

src/
├── app.module.ts           # Root module (Redis, Config, i18n, guards)
├── main.ts                 # Entry point, Swagger, Helmet, ValidationPipe
├── common/                 # Shared layer
│   ├── base/               # Generic BaseService<T> & BaseController<T>
│   ├── types/              # ApiResponse<T>, pagination DTOs
│   ├── filters/            # GlobalExceptionFilter (HTTP 200 wrapper)
│   ├── interceptors/       # ResponseInterceptor, SanitizeInterceptor
│   ├── decorators/         # @Public(), @Roles(), @CurrentUser()
│   └── queues/             # BullMQ queue module
├── config/                 # Env validation (Zod), config factories
├── database/               # PrismaService
├── i18n/                   # TR/EN translations (common, errors, validation, auth)
├── modules/                # 13 feature modules
│   ├── admin/              # Superadmin panel (user mgmt, settings, analytics)
│   ├── analysis/           # Multi-match analysis orchestration
│   ├── auth/               # JWT auth, refresh tokens, guards
│   ├── coupons/            # SmartCouponService (5 strategies), UserCouponService
│   ├── feeder/             # Historical data scraping (Mackolik)
│   ├── gemini/             # Google Gemini AI integration
│   ├── health/             # Liveness, readiness, AI Engine health
│   ├── leagues/            # Country/league/team discovery, H2H
│   ├── matches/            # Match listing, details, active leagues
│   ├── predictions/        # AI predictions with BullMQ queue & 6h cache
│   ├── social-poster/      # Twitter API v2, Canvas image generation
│   ├── spor-toto/          # Spor Toto integration
│   └── users/              # User CRUD (BaseController pattern)
├── scripts/                # Feeder runners, cleanup scripts
├── services/               # Shared services
│   ├── ai.service.ts       # Python AI Engine bridge
│   ├── match-analysis.service.ts  # 7-phase analysis orchestrator
│   └── scraper.service.ts  # Mackolik HTML scraping
└── tasks/                  # Cron jobs (15min, 30min, daily)
    ├── data-fetcher.task.ts      # Live matches, odds fetching
    ├── live-updater.task.ts      # Score updates, match finalization
    └── limit-resetter.task.ts    # Usage limits, subscription expiry

ai-engine/                # Python FastAPI ML engine
├── main.py               # FastAPI app, routes
├── services/             # single_match_orchestrator.py
├── core/                 # Core algorithms
├── features/             # Feature engineering
├── models/               # ML models
├── training/             # Model training scripts
├── config/               # Configuration
├── utils/                # Utility functions
└── tests/                # Test files

4. Key Modules

Auth Module

  • Register, Login, Refresh, Logout endpoints
  • bcrypt (12 rounds), JWT Access (15min) + Refresh Token (7 days, DB-stored)
  • Global guards: JwtAuthGuard, RolesGuard, PermissionsGuard

Predictions Module

  • Requires Redis (REDIS_ENABLED=true), conditionally loaded
  • BullMQ queue with worker processor
  • 6-hour TTL cache on prediction results
  • AI Engine call: POST /v20plus/analyze/{matchId}
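The 6-hour TTL behaviour can be illustrated with a minimal in-memory sketch. This is not the module's actual code: `PredictionCache` and `PREDICTION_TTL_MS` are hypothetical names, and the real module may well store results in Redis rather than in process memory.

```typescript
// Hypothetical in-memory cache; the real module may keep results in Redis instead.
const PREDICTION_TTL_MS = 6 * 60 * 60 * 1000; // 6 hours, as stated above

interface PredictionCacheEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

class PredictionCache<T> {
  private readonly entries = new Map<string, PredictionCacheEntry<T>>();
  private readonly ttlMs: number;

  constructor(ttlMs: number = PREDICTION_TTL_MS) {
    this.ttlMs = ttlMs;
  }

  // Stores a prediction and stamps it with its expiry time.
  set(matchId: string, value: T, now: number = Date.now()): void {
    this.entries.set(matchId, { value, expiresAt: now + this.ttlMs });
  }

  // Returns undefined on a miss or when the entry has expired.
  get(matchId: string, now: number = Date.now()): T | undefined {
    const entry = this.entries.get(matchId);
    if (entry === undefined) return undefined;
    if (now >= entry.expiresAt) {
      this.entries.delete(matchId);
      return undefined;
    }
    return entry.value;
  }
}
```

The `now` parameter exists only to keep the sketch deterministic in tests; callers would normally omit it.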

Coupons Module

  • SmartCouponService: 5 strategies (SAFE ≥78% confidence/2 matches, BALANCED, AGGRESSIVE, VALUE EV+, MIRACLE)
  • UserCouponService: Coupon creation, bet settlement (MS 1/X/2, Alt/Üst, KG Var/Yok)
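As a rough illustration of what settling the three listed markets involves, the sketch below derives each result from a final score. MS is the match result, Alt/Üst is under/over, and KG Var/Yok is both teams to score (yes/no). The function names and the 2.5 goal line are assumptions for this example, not taken from `UserCouponService`.

```typescript
type MatchResultPick = "1" | "X" | "2";

interface FinalScore {
  home: number;
  away: number;
}

// MS (match result): "1" home win, "X" draw, "2" away win.
function settleMatchResult(score: FinalScore): MatchResultPick {
  if (score.home > score.away) return "1";
  if (score.home < score.away) return "2";
  return "X";
}

// Alt/Üst (under/over). The 2.5 default line is an assumption for this sketch.
function settleOverUnder(score: FinalScore, line: number = 2.5): "Alt" | "Üst" {
  return score.home + score.away > line ? "Üst" : "Alt";
}

// KG Var/Yok (both teams to score: yes/no).
function settleBothTeamsScore(score: FinalScore): "Var" | "Yok" {
  return score.home > 0 && score.away > 0 ? "Var" : "Yok";
}
```

With a whole-number line a total equal to the line would need a push rule, which this sketch deliberately leaves out.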

Feeder Module

  • Historical scraping from 2023-06-01 to present (reverse chronological)
  • Concurrency of 20, 300 ms delay, up to 50 retries, exponential backoff on HTTP 502
  • Resume support with state management
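One way the retry parameters above could combine is sketched below. Only the 300 ms delay, the 50-retry ceiling, and the special handling of 502 come from this document; the doubling factor, the 30-second cap, the fixed delay for other errors, and the name `retryDelayMs` are assumptions.

```typescript
const BASE_DELAY_MS = 300;      // delay listed above
const MAX_RETRIES = 50;         // retry ceiling listed above
const MAX_BACKOFF_MS = 30_000;  // assumed cap; not stated in this document

// Delay before retry number `attempt` (1-based).
// Returns null once the retry budget is exhausted.
function retryDelayMs(attempt: number, httpStatus: number): number | null {
  if (attempt > MAX_RETRIES) return null;
  // Assumption: only HTTP 502 triggers exponential growth; other errors wait the base delay.
  if (httpStatus !== 502) return BASE_DELAY_MS;
  const backoff = BASE_DELAY_MS * 2 ** (attempt - 1); // 300, 600, 1200, ...
  return Math.min(backoff, MAX_BACKOFF_MS);
}
```

Capping the backoff keeps a long run of 502 responses from stalling a worker for minutes at a time.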

Analysis Module

  • Usage limits: Free (10 analyses/3 coupons/day) vs Premium (50 analyses/10 coupons)
  • 7-phase flow: URL Parse → Scrape → Python Engine → Strategy → Similar Matches → Final Prediction → DB Save
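The daily limits above amount to a small lookup table plus a comparison. The sketch below shows that shape; `PlanTier`, `DAILY_LIMITS`, and `hasQuotaLeft` are hypothetical names and do not come from the Analysis module.

```typescript
type PlanTier = "free" | "premium";
type UsageKind = "analyses" | "coupons";

// Daily quotas as listed above.
const DAILY_LIMITS: Record<PlanTier, Record<UsageKind, number>> = {
  free: { analyses: 10, coupons: 3 },
  premium: { analyses: 50, coupons: 10 },
};

// True while the user has not yet used up today's quota for the given kind.
function hasQuotaLeft(tier: PlanTier, kind: UsageKind, usedToday: number): boolean {
  return usedToday < DAILY_LIMITS[tier][kind];
}
```

The counters themselves would be reset by the daily `resetUsageLimits()` task described in the Scheduled Tasks section.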

Social Poster Module

  • Twitter API v2 integration
  • Canvas-based prediction card image generation
  • Gemini-powered Turkish caption generation

5. Scheduled Tasks (Cron)

| Task | Schedule | Description |
| --- | --- | --- |
| fetchLiveMatches() | */15 * * * * | Fetch football matches from Mackolik API |
| fetchOddsForPreMatches() | */15 * * * * | Fetch odds for upcoming matches (football + basketball) |
| fetchBasketballMatches() | Manual | Basketball data via basketball_top_leagues.json filter |
| updateLiveScores() | */15 * * * * | Update live match scores |
| finalizeFinishedMatches() | */30 * * * * | Migrate finished matches: live_matches → matches table |
| resetUsageLimits() | 0 3 * * * | Reset daily usage limits (03:00 Istanbul time) |
| cleanupOldData() | 0 4 * * * | Delete AI logs older than 30 days and finished live_matches older than 1 day |
| checkSubscriptions() | 0 0 * * * | Mark expired subscriptions |

6. AI Engine (Python FastAPI)

Independent microservice on port 8000.

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /v20plus/analyze/{match_id} | Single match analysis (main) |
| GET | /v20plus/analyze-htms/{match_id} | First half - Full time analysis |
| GET | /v20plus/analyze-htft/{match_id} | HT/FT probabilities |
| POST | /v20plus/coupon | Smart coupon generation |
| GET | /v20plus/daily-banker | Daily banker picks |
| GET | /v20plus/reversal-watchlist | Score reversal watchlist |
| GET | /health | Health check |

Output Structure (SingleMatchPredictionPackage)

{
  model_version: "v20plus.X",
  match_info: { match_id, match_name, home_team, away_team, league, match_date_ms },
  data_quality: { label: "HIGH"|"MEDIUM"|"LOW", score, flags, lineup_counts },
  risk: { level: "LOW"|"MEDIUM"|"HIGH"|"EXTREME", score, is_surprise_risk, warnings },
  main_pick: { market, pick, probability, confidence, odds, bet_grade, edge },
  value_pick: { ... },
  bet_advice: { playable, suggested_stake_units, reason },
  bet_summary: [{ market, pick, raw_confidence, calibrated_confidence, bet_grade }],
  supporting_picks: [...],
  aggressive_pick: { market, pick, probability, confidence, odds },
  scenario_top5: [{ score, prob }],
  score_prediction: { ft, ht, xg_home, xg_away, xg_total },
  market_board: { ... },
  reasoning_factors: string[],
  ai_commentary: string  // Turkish commentary from Gemini
}

7. API Response Format

All responses follow this standard structure:

{
  "success": true,
  "status": 200,
  "message": "İşlem başarıyla tamamlandı",  // i18n translated
  "data": { ... },
  "errors": []
}
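A typed version of this envelope might look like the sketch below. The field names mirror the JSON above, but the `errors` element type and the helpers `okResponse` and `errorResponse` are assumptions; the project's own `ApiResponse<T>` in `src/common/types/` is the authoritative definition.

```typescript
// Assumed shape mirroring the JSON above; check src/common/types/ for the real one.
interface ApiResponse<T> {
  success: boolean;
  status: number;
  message: string;
  data: T | null;
  errors: string[]; // element type is an assumption
}

// Wraps a successful payload.
function okResponse<T>(data: T, message: string): ApiResponse<T> {
  return { success: true, status: 200, message, data, errors: [] };
}

// Wraps a failure; the status travels in the body.
function errorResponse(status: number, message: string, errors: string[]): ApiResponse<never> {
  return { success: false, status, message, data: null, errors };
}
```

Because the Global Exception Filter returns HTTP 200 for errors (see Known Issues), clients should branch on `success` and `status` in the body rather than on the transport status code.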

Critical Rule: Controllers must NEVER return raw Prisma entities. Always use Response DTOs with @Exclude() and @Expose() from class-transformer.


8. Configuration

Environment Variables

NODE_ENV=development
PORT=3005
DATABASE_URL=postgresql://user:password@localhost:15432/boilerplate_db
JWT_SECRET=your-secret-key
JWT_ACCESS_EXPIRATION=15m
JWT_REFRESH_EXPIRATION=7d
REDIS_ENABLED=false
REDIS_HOST=localhost
REDIS_PORT=6379
AI_ENGINE_URL=http://127.0.0.1:8000
ENABLE_GEMINI=false
GOOGLE_API_KEY=your-api-key

Config Files

  • top_leagues.json — Football top league IDs (live match filter)
  • basketball_top_leagues.json — Basketball top league IDs
  • bet-type.json — Bet type definitions

9. Build & Run Commands

# Development
npm run start:dev              # Watch mode (port 3005)

# Production
npm run build && npm run start:prod

# Feeder (Data Collection)
npm run feeder:historical      # Historical scraping (2023-06→present)
npm run feeder:fill-gaps       # Fill missing data
npm run feeder:basketball      # Basketball data
npm run feeder:live            # Live data

# Database
npx prisma generate            # Regenerate Prisma client
npx prisma migrate dev         # Run migrations
npx prisma db seed             # Seed database

# Testing
npm run test                   # Unit tests
npm run test:e2e               # E2E tests
npx jest src/path/to/file.spec.ts  # Single test file

# Lint/Format
npm run lint                   # ESLint with Prettier
npm run format                 # Prettier write

# Docker
docker-compose up -d postgres redis   # Infrastructure
docker-compose up -d                 # All services

# AI Engine (Python)
cd ai-engine && uvicorn main:app --host 0.0.0.0 --port 8000 --reload

# Utility
npm run swagger:summary        # Export endpoint summary
npm run cleanup:live           # Cleanup live matches

10. Code Style Guidelines

Imports Order

// 1. NestJS/common imports
import { Controller, Get, Post, Body } from '@nestjs/common';

// 2. External packages
import * as bcrypt from 'bcrypt';

// 3. Local imports (relative)
import { UsersService } from './users.service';

Naming Conventions

  • Classes/Interfaces: PascalCase
  • Variables/Functions: camelCase
  • Constants: UPPER_SNAKE_CASE
  • Files: kebab-case
  • DTOs: Entity + Dto suffix (CreateUserDto, UpdateUserDto)

Types

  • strictNullChecks: true — null/undefined checks required
  • noImplicitAny: false (any is allowed, for Prisma dynamic access)
  • Specify function return types: async findOne(id: string): Promise<User>

Error Handling

// Use NestJS HTTP Exceptions with i18n keys
throw new NotFoundException('USER_NOT_FOUND');
throw new ConflictException('EMAIL_ALREADY_EXISTS');

// Reference src/i18n/{lang}/errors.json for available keys

11. Known Issues & Gotchas

  1. Predictions module requires Redis. Disabled when REDIS_ENABLED=false.
  2. Gemini AI is optional. Returns null commentary when disabled.
  3. Global Exception Filter wraps all errors as HTTP 200 (status in body).
  4. Lineup scraping is disabled — only Team Stats are used (V20 optimization).
  5. Feeder V17 AI feature calculation is disabled — V20 model runs in Python.
  6. BigInt serialization: BigInt.prototype.toJSON = function() { return this.toString(); } polyfill in main.ts.
  7. i18n assets copied via nest-cli.json "assets": ["i18n/**/*"] config.
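The BigInt polyfill in item 6 is worth seeing in isolation, because `JSON.stringify` throws a `TypeError` on BigInt values unless `toJSON` is defined. The sketch below is the same one-liner with type annotations added so it compiles under strict settings; the sample `matchId` value is made up.

```typescript
// Same polyfill as in item 6, with a typed `this` so strict mode accepts it.
(BigInt.prototype as unknown as { toJSON: (this: bigint) => string }).toJSON = function (
  this: bigint,
): string {
  return this.toString();
};

// BigInt fields (e.g. Prisma BigInt columns) now serialize as strings instead of throwing.
const serializedMatch: string = JSON.stringify({ matchId: BigInt("9007199254740993") });
```

Serializing as a string avoids the precision loss that would occur if values above `Number.MAX_SAFE_INTEGER` were converted to numbers.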

12. Reference Files for AI Agents

When working on this project, consult:

  • project_summary.md — Comprehensive project documentation (Turkish)
  • README.md — Architecture decisions, quick start guide
  • prompt.md — AI assistant reference guide with agent roles
  • AGENTS.md — Coding guidelines, DTO patterns, test structure
  • .agent/ — Skills and agent role definitions
  • top_leagues.json / basketball_top_leagues.json — League filters

13. Team Logos

Team logo URL template: https://file.mackolikfeeds.com/teams/{teamId}


14. 🆕 VQWEN Model Integration (Since 2026-04-06)

We have integrated a new high-performance prediction engine called VQWEN v3.

VQWEN Model Features

  • Accuracy: +244.4 Units profit in Time-Series Backtest (75.1% Win Rate on BTTS/Over markets).
  • Features Used:
    • ELO Ratings (Real-time team strength).
    • Contextual Goals (Home/Away specific performance).
    • Rest Days (Fatigue factor for teams with fewer than 3 days of rest).
    • H2H Win Rate (Historical dominance).
    • Form Points (Last 5 games streak).
    • Squad Strength (Based on starting XI participation).
  • Files:
    • ai-engine/scripts/train_vqwen_v3.py — Training script.
    • ai-engine/services/single_match_orchestrator.py — Integration point.
    • ai-engine/models/vqwen/ — Pickle models (vqwen_ms.pkl, etc.).
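Two of the listed features are simple enough to sketch. The real feature engineering lives in the Python training script, so the TypeScript below is only an illustration of the idea: the 3/1/0 points scheme, the interpretation of "rest days" as elapsed time between kickoffs, and the names `formPoints` and `isFatigued` are all assumptions.

```typescript
type Outcome = "W" | "D" | "L";

const DAY_MS = 24 * 60 * 60 * 1000;

// Form points over the last 5 results; the 3/1/0 scoring is an assumption.
function formPoints(results: Outcome[]): number {
  const points: Record<Outcome, number> = { W: 3, D: 1, L: 0 };
  return results.slice(-5).reduce((sum, r) => sum + points[r], 0);
}

// Fatigue flag: true when fewer than 3 days separate the two kickoffs.
function isFatigued(previousKickoffMs: number, nextKickoffMs: number): boolean {
  return (nextKickoffMs - previousKickoffMs) / DAY_MS < 3;
}
```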

New Live Lineup/Sidelined Fetcher

  • Problem: lineups and sidelined columns in live_matches were empty.
  • Fix: Added updateLineupsAndSidelined() method to src/tasks/data-fetcher.task.ts.
  • Mechanism: Uses FeederScraperService.fetchStartingFormation directly via Cron (*/15 * * * *).
  • Status: Active.

Database Schema Updates

  • substate Column: Added to matches table to track specific match states (e.g., "penalties", "overtime", "postponed").
  • Sport Partition: Tables are now partitioned by sport (football_team_stats vs basketball_team_stats).

16. 🔍 HT/FT Reversal Analysis (Since 2026-04-07)

HT/FT Reversal (1/2 & 2/1) Pattern Detection

Reversal matches (İY/MS = 1/2 or 2/1) are statistically rare events that can indicate match-fixing or unusual patterns.

Key Findings (147,248 matches analyzed)

| Metric | Value |
| --- | --- |
| Total Reversal Matches | 13,112 (8.90%) |
| 1/2 (Home leads HT, Away wins FT) | 5,992 (4.07%) |
| 2/1 (Away leads HT, Home wins FT) | 7,120 (4.84%) |
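For reference, the 1/2 and 2/1 labels can be derived from half-time and full-time scores as sketched below. The function names are hypothetical and this is not the analysis script itself; it only makes the definition of a reversal concrete.

```typescript
type Side = "1" | "X" | "2";

// "1" when home is ahead, "2" when away is ahead, "X" when level.
function leadingSide(home: number, away: number): Side {
  if (home > away) return "1";
  if (home < away) return "2";
  return "X";
}

// Builds one of the 9 HT/FT labels, e.g. "1/2" or "X/1".
function htftLabel(htHome: number, htAway: number, ftHome: number, ftAway: number): string {
  return `${leadingSide(htHome, htAway)}/${leadingSide(ftHome, ftAway)}`;
}

// A reversal is a half-time leader losing the match.
function isReversal(label: string): boolean {
  return label === "1/2" || label === "2/1";
}
```

For example, a match that stands 39-45 at half time and ends 102-61 is labelled `2/1`, and one that stands 65-64 and ends 103-140 is labelled `1/2`.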

🚨 Basketball Leagues Have Suspiciously High Reversal Rates

| League | Reversals | Total | Rate |
| --- | --- | --- | --- |
| Eurobasket U20 | 36 | 120 | 30.00% 🔴 |
| EuroLeague 🏀 | 183 | 639 | 28.64% 🔴 |
| PBA Commissioners 🏀 | 54 | 189 | 28.57% 🔴 |
| Ulusal Süper Lig 🏀 | 148 | 547 | 27.06% 🔴 |
| NBA 🏀 | 656 | 2,696 | 24.33% 🔴 |

All top 15 leagues by reversal rate are BASKETBALL. Football leagues show normal rates (5-8%).

Suspicious Patterns

  1. Comeback Magnitude:

    • 1 goal/point: 36.1% (normal)
    • 2 goals/points: 13.1% (suspicious)
    • 3+ goals/points: 50.8% 🔴 EXTREMELY HIGH
  2. Extreme Comebacks (Basketball):

    • Mineros vs Irapuato: HT 39-45 → FT 102-61 (trailed by 6 at HT, won by 41!)
    • Utah vs Memphis: HT 65-64 → FT 103-140 (led by 1 at HT, lost by 37!)
    • These are statistically near-impossible without manipulation
  3. Favorite Loss Rate:

    • 42.7% of reversals had the pre-match favorite lose (should be ~25-30%)

Impact on Model

  • HT/FT model accuracy: 20.3% (low due to reversal noise)
  • Basketball reversal data creates training noise
  • Recommendation: Either exclude basketball from HT/FT training or train separate basketball-specific model

HT/FT Model Files

  • Training script: ai-engine/scripts/train_htft_vqwen.py
  • Model output: ai-engine/models/xgboost/xgb_ht_ft.json + .pkl
  • Features: 27 (Odds + HT/FT Tendencies + League stats)
  • Status: Working, outputs 9-class probabilities in market_board.HTFT.probs

17. 🐛 Lineup Parsing Fix (Since 2026-04-07)

Problem

AI Engine reported "lineup_unavailable" and "lineup_incomplete" flags even when live_matches.lineups contained full 11/11 lineup data from Mackolik.

Root Cause

Mackolik stores lineups in "stats" key format:

{
  "stats": {
    "home": [{ "personId": "...", "position": "...", ... }, ...],
    "away": [{ "personId": "...", "position": "...", ... }, ...]
  }
}

But the parser expected "xi", "starting", or "lineup" keys at root level.

Fix

Updated _parse_lineups_json() in ai-engine/services/single_match_orchestrator.py:

  • Added fallback to check lineups_json.get("stats") for home/away arrays
  • Now correctly parses Mackolik's nested format
  • Result: home_lineup_count: 11, away_lineup_count: 11, lineup_source: "confirmed_live"
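The fallback logic can be summarised as follows. The actual fix is in the Python function `_parse_lineups_json()`; the TypeScript below only illustrates the lookup order, and it assumes that the root-level keys hold the same `{ home, away }` shape as `stats`, which this document does not confirm. `parseLineups` and `asSides` are hypothetical names.

```typescript
interface LineupPlayer {
  personId: string;
  position?: string;
}

interface ParsedLineups {
  home: LineupPlayer[];
  away: LineupPlayer[];
}

// Keys the parser originally looked for at root level.
const ROOT_LINEUP_KEYS = ["xi", "starting", "lineup"] as const;

// Accepts a node only if it carries both a home and an away array.
function asSides(node: unknown): ParsedLineups | null {
  if (typeof node !== "object" || node === null) return null;
  const { home, away } = node as { home?: unknown; away?: unknown };
  if (!Array.isArray(home) || !Array.isArray(away)) return null;
  return { home: home as LineupPlayer[], away: away as LineupPlayer[] };
}

function parseLineups(lineupsJson: Record<string, unknown>): ParsedLineups | null {
  for (const key of ROOT_LINEUP_KEYS) {
    const parsed = asSides(lineupsJson[key]);
    if (parsed !== null) return parsed;
  }
  // Fallback described in the fix: Mackolik nests the arrays under "stats".
  return asSides(lineupsJson["stats"]);
}
```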

18. Docker Deployment

# docker-compose.yml services:
services:
  app: # NestJS (port 3000→3000)
  postgres: # PostgreSQL 17 Alpine (port 15432:5432)
  redis: # Redis 7 Alpine (port 6379)
  adminer: # Database UI (dev profile, port 8080)
  ai-engine: # Python FastAPI (port 8002:8000)

This file is maintained for AI agent context. Update when architecture or conventions change.