Suggest-Bet-BE — AI Agent Context
Last Updated: 2026-04-06
Purpose: Comprehensive project reference for AI agents working on this codebase.
1. Project Overview
Suggest-Bet-BE is an AI-powered sports betting prediction platform backend. It provides:
- AI-driven predictions for football & basketball matches
- Smart coupon generation (SAFE, BALANCED, AGGRESSIVE, VALUE, MIRACLE strategies)
- Live score tracking & odds monitoring
- Web scraping from Mackolik.com for historical & live match data
- Google Gemini AI for natural language match commentary
- User coupon tracking (ROI, Win Rate analytics)
Technology Stack
| Layer | Technology |
|---|---|
| Backend API | NestJS 11 (TypeScript) |
| AI Engine | Python FastAPI (v20+) |
| Database | PostgreSQL 16 + Prisma ORM |
| Queue | BullMQ + Redis (optional) |
| Cache | Redis or in-memory fallback |
| Auth | JWT + Passport (Access 15min + Refresh 7day) |
| Scraping | Axios + Cheerio (Mackolik HTML parsing) |
| Logging | Pino (structured logging) |
| i18n | nestjs-i18n (TR, EN) |
| API Docs | Swagger |
| Deploy | Docker Compose |
2. Architecture
┌──────────────────────────────────────────────────────────────────┐
│ CLIENTS (Web/Mobile) │
└───────────────────────────────┬──────────────────────────────────┘
│ HTTP/REST
┌───────────────────────────────▼──────────────────────────────────┐
│ NestJS Backend (Port 3005) │
│ ┌─────────┬──────────┬──────────┬──────────┬─────────────────┐ │
│ │ Auth │ Admin │ Matches │ Leagues │ Predictions │ │
│ │ Module │ Module │ Module │ Module │ Module │ │
│ ├─────────┼──────────┼──────────┼──────────┼─────────────────┤ │
│ │ Coupons │ Analysis │ Gemini │ Social- │ Health │ │
│ │ Module │ Module │ Module │ Poster │ Module │ │
│ │SporToto │ Feeder │ Users │ │ │ │
│ └─────────┴──────────┴──────────┴──────────┴─────────────────┘ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ Services: AiService | MatchAnalysis | Scraper │ │
│ ├──────────────────────────────────────────────────────────────┤ │
│ │ Tasks: DataFetcher (Cron) | LiveUpdater | LimitResetter │ │
│ └──────────────────────────────────────────────────────────────┘ │
└────┬─────────────────┬────────────────────┬──────────────────────┘
│ │ │
▼ ▼ ▼
┌─────────┐ ┌──────────────┐ ┌──────────────────┐
│PostgreSQL│ │ Redis/BullMQ │ │ AI Engine (py) │
│ (3.6GB) │ │ (Optional) │ │ FastAPI:8000 │
└─────────┘ └──────────────┘ └──────────────────┘
│
┌───────▼───────┐
│ Mackolik API │
│ (Data Source) │
└───────────────┘
Database Statistics (approximate)
- matches: 237K permanent match records
- live_matches: ~82 active/upcoming matches (daily cycle)
- match_player_participation: 3.3M
- odd_selections: 8.5M
- teams: 19,595 | players: 217K | leagues: 1,505
3. Directory Structure
src/
├── app.module.ts # Root module (Redis, Config, i18n, guards)
├── main.ts # Entry point, Swagger, Helmet, ValidationPipe
├── common/ # Shared layer
│ ├── base/ # Generic BaseService<T> & BaseController<T>
│ ├── types/ # ApiResponse<T>, pagination DTOs
│ ├── filters/ # GlobalExceptionFilter (HTTP 200 wrapper)
│ ├── interceptors/ # ResponseInterceptor, SanitizeInterceptor
│ ├── decorators/ # @Public(), @Roles(), @CurrentUser()
│ └── queues/ # BullMQ queue module
├── config/ # Env validation (Zod), config factories
├── database/ # PrismaService
├── i18n/ # TR/EN translations (common, errors, validation, auth)
├── modules/ # 13 feature modules
│ ├── admin/ # Superadmin panel (user mgmt, settings, analytics)
│ ├── analysis/ # Multi-match analysis orchestration
│ ├── auth/ # JWT auth, refresh tokens, guards
│ ├── coupons/ # SmartCouponService (5 strategies), UserCouponService
│ ├── feeder/ # Historical data scraping (Mackolik)
│ ├── gemini/ # Google Gemini AI integration
│ ├── health/ # Liveness, readiness, AI Engine health
│ ├── leagues/ # Country/league/team discovery, H2H
│ ├── matches/ # Match listing, details, active leagues
│ ├── predictions/ # AI predictions with BullMQ queue & 6h cache
│ ├── social-poster/ # Twitter API v2, Canvas image generation
│ ├── spor-toto/ # Spor Toto integration
│ └── users/ # User CRUD (BaseController pattern)
├── scripts/ # Feeder runners, cleanup scripts
├── services/ # Shared services
│ ├── ai.service.ts # Python AI Engine bridge
│ ├── match-analysis.service.ts # 7-phase analysis orchestrator
│ └── scraper.service.ts # Mackolik HTML scraping
└── tasks/ # Cron jobs (15min, 30min, daily)
├── data-fetcher.task.ts # Live matches, odds fetching
├── live-updater.task.ts # Score updates, match finalization
└── limit-resetter.task.ts # Usage limits, subscription expiry
ai-engine/ # Python FastAPI ML engine
├── main.py # FastAPI app, routes
├── services/ # single_match_orchestrator.py
├── core/ # Core algorithms
├── features/ # Feature engineering
├── models/ # ML models
├── training/ # Model training scripts
├── config/ # Configuration
├── utils/ # Utility functions
└── tests/ # Test files
4. Key Modules
Auth Module
- Register, Login, Refresh, Logout endpoints
- bcrypt (12 rounds), JWT Access (15min) + Refresh Token (7 days, DB-stored)
- Global guards: JwtAuthGuard, RolesGuard, PermissionsGuard
Predictions Module
- Requires Redis (REDIS_ENABLED=true), conditionally loaded
- BullMQ queue with worker processor
- 6-hour TTL cache on prediction results
- AI Engine call: POST /v20plus/analyze/{matchId}
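The 6-hour TTL behaviour can be illustrated with a small in-memory cache. This is only a sketch: the class name `TtlCache` and the injectable clock are hypothetical, and the real module may store results in Redis rather than in process memory.

```typescript
// Minimal TTL cache sketch; TtlCache is a hypothetical name, not project code.
const SIX_HOURS_MS = 6 * 60 * 60 * 1000;

interface CacheEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

class TtlCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be exercised without waiting six hours.
  constructor(ttlMs: number = SIX_HOURS_MS, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // drop stale entries lazily on read
      return undefined;
    }
    return entry.value;
  }
}
```

A prediction stored under a key such as `prediction:{matchId}` would then be served from the cache until six hours have passed, after which the AI Engine is called again.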
Coupons Module
- SmartCouponService: 5 strategies (SAFE ≥78% confidence/2 matches, BALANCED, AGGRESSIVE, VALUE EV+, MIRACLE)
- UserCouponService: Coupon creation, bet settlement for MS 1/X/2 (match result), Alt/Üst (under/over), KG Var/Yok (both teams to score: yes/no)
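Bet settlement for the three market families can be sketched as a pure function of the final score. The market codes (`MS`, `ALT_UST`, `KG`), the `settleBet` name, and the default 2.5 goal line are assumptions made for illustration; the actual UserCouponService may model markets differently.

```typescript
// Hypothetical market codes; the real service may use different identifiers.
type Market = "MS" | "ALT_UST" | "KG";
type Outcome = "WON" | "LOST";

interface Bet {
  market: Market;
  pick: string; // "1" | "X" | "2", "Alt" | "Üst", or "Var" | "Yok"
}

// Returns the pick that actually came in for the given market.
function actualPick(market: Market, homeGoals: number, awayGoals: number, goalLine: number): string {
  if (market === "MS") {
    if (homeGoals > awayGoals) return "1";
    return homeGoals < awayGoals ? "2" : "X";
  }
  if (market === "ALT_UST") {
    return homeGoals + awayGoals > goalLine ? "Üst" : "Alt";
  }
  // "KG": both teams to score
  return homeGoals > 0 && awayGoals > 0 ? "Var" : "Yok";
}

// goalLine defaults to 2.5, an assumed line that the source does not specify.
function settleBet(bet: Bet, homeGoals: number, awayGoals: number, goalLine: number = 2.5): Outcome {
  return bet.pick === actualPick(bet.market, homeGoals, awayGoals, goalLine) ? "WON" : "LOST";
}
```

For a 2-1 final score, `settleBet({ market: "MS", pick: "1" }, 2, 1)` and `settleBet({ market: "KG", pick: "Var" }, 2, 1)` both settle as won.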
Feeder Module
- Historical scraping from 2023-06-01 to present (reverse chronological)
- Concurrency=20, 300ms delay between requests, up to 50 retries, exponential backoff on HTTP 502 responses
- Resume support with state management
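One way to picture the retry policy is as a delay schedule. The sketch below reuses the 300ms delay and the 50-retry limit from the list above; the doubling factor, the 30-second cap, and the function names are assumptions, since the source does not describe the exact backoff curve.

```typescript
const BASE_DELAY_MS = 300; // from the feeder settings above
const MAX_RETRIES = 50; // from the feeder settings above
const MAX_DELAY_MS = 30000; // hypothetical cap, not from the source

// Only HTTP 502 is treated as retryable in this sketch.
function isRetryable(status: number): boolean {
  return status === 502;
}

// Returns the wait before retry number `attempt` (0-based), or null when retries are exhausted.
function backoffDelayMs(attempt: number): number | null {
  if (attempt >= MAX_RETRIES) return null;
  return Math.min(MAX_DELAY_MS, BASE_DELAY_MS * Math.pow(2, attempt));
}
```

Under these assumptions the first retries wait 300ms, 600ms, and 1200ms, and the delay stops growing once it reaches the cap.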
Analysis Module
- Usage limits: Free (10 analyses/3 coupons/day) vs Premium (50 analyses/10 coupons)
- 7-phase flow: URL Parse → Scrape → Python Engine → Strategy → Similar Matches → Final Prediction → DB Save
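The usage limits lend themselves to a lookup table. The sketch below uses the figures from the list above and assumes the Premium limits are also per day; the type and function names are hypothetical.

```typescript
type Plan = "FREE" | "PREMIUM";
type UsageKind = "analyses" | "coupons";

// Limits as listed above; Premium is assumed to be a daily limit as well.
const DAILY_LIMITS: Record<Plan, Record<UsageKind, number>> = {
  FREE: { analyses: 10, coupons: 3 },
  PREMIUM: { analyses: 50, coupons: 10 },
};

// How many more requests of this kind the user may make today.
function remainingToday(plan: Plan, kind: UsageKind, usedToday: number): number {
  return Math.max(0, DAILY_LIMITS[plan][kind] - usedToday);
}

function canConsume(plan: Plan, kind: UsageKind, usedToday: number): boolean {
  return remainingToday(plan, kind, usedToday) > 0;
}
```

A free user who has already generated three coupons today would be rejected by `canConsume("FREE", "coupons", 3)`, while the counter reset at 03:00 makes the same call succeed the next day.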
Social Poster Module
- Twitter API v2 integration
- Canvas-based prediction card image generation
- Gemini-powered Turkish caption generation
5. Scheduled Tasks (Cron)
| Task | Schedule | Description |
|---|---|---|
| fetchLiveMatches() | */15 * * * * | Fetch football matches from Mackolik API |
| fetchOddsForPreMatches() | */15 * * * * | Fetch odds for upcoming matches (football + basketball) |
| fetchBasketballMatches() | Manual | Basketball data via basketball_top_leagues.json filter |
| updateLiveScores() | */15 * * * * | Update live match scores |
| finalizeFinishedMatches() | */30 * * * * | Migrate finished: live_matches → matches table |
| resetUsageLimits() | 0 3 * * * | Reset daily usage limits (03:00 Istanbul time) |
| cleanupOldData() | 0 4 * * * | Delete 30-day old AI logs, 1-day finished live_matches |
| checkSubscriptions() | 0 0 * * * | Mark expired subscriptions |
6. AI Engine (Python FastAPI)
Independent microservice on port 8000.
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v20plus/analyze/{match_id} | Single match analysis (main) |
| GET | /v20plus/analyze-htms/{match_id} | First half - Full time analysis |
| GET | /v20plus/analyze-htft/{match_id} | HT/FT probabilities |
| POST | /v20plus/coupon | Smart coupon generation |
| GET | /v20plus/daily-banker | Daily banker picks |
| GET | /v20plus/reversal-watchlist | Score reversal watchlist |
| GET | /health | Health check |
Output Structure (SingleMatchPredictionPackage)
{
model_version: "v20plus.X",
match_info: { match_id, match_name, home_team, away_team, league, match_date_ms },
data_quality: { label: "HIGH"|"MEDIUM"|"LOW", score, flags, lineup_counts },
risk: { level: "LOW"|"MEDIUM"|"HIGH"|"EXTREME", score, is_surprise_risk, warnings },
main_pick: { market, pick, probability, confidence, odds, bet_grade, edge },
value_pick: { ... },
bet_advice: { playable, suggested_stake_units, reason },
bet_summary: [{ market, pick, raw_confidence, calibrated_confidence, bet_grade }],
supporting_picks: [...],
aggressive_pick: { market, pick, probability, confidence, odds },
scenario_top5: [{ score, prob }],
score_prediction: { ft, ht, xg_home, xg_away, xg_total },
market_board: { ... },
reasoning_factors: string[],
ai_commentary: string // Turkish commentary from Gemini
}
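A consumer of this package typically needs only a few fields. The sketch below types a subset of the structure and formats the main pick; the field types are assumptions inferred from the field names, and `describeMainPick` is a hypothetical helper.

```typescript
// Subset of SingleMatchPredictionPackage; field types are assumed, not confirmed by the source.
interface MainPick {
  market: string;
  pick: string;
  probability: number;
  confidence: number;
  odds: number;
  bet_grade: string;
  edge: number;
}

interface BetAdvice {
  playable: boolean;
  suggested_stake_units: number;
  reason: string;
}

interface PredictionPackageSubset {
  match_info: { match_name: string };
  main_pick: MainPick;
  bet_advice: BetAdvice;
}

// Formats the main pick, or explains why the match should be skipped.
function describeMainPick(pkg: PredictionPackageSubset): string {
  const { match_info, main_pick, bet_advice } = pkg;
  if (!bet_advice.playable) {
    return `${match_info.match_name}: skip (${bet_advice.reason})`;
  }
  const odds = main_pick.odds.toFixed(2);
  return `${match_info.match_name}: ${main_pick.market} ${main_pick.pick} @ ${odds}, stake ${bet_advice.suggested_stake_units}u`;
}
```

Checking `bet_advice.playable` before reading `main_pick` keeps low-quality predictions from being presented as recommendations.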
7. API Response Format
All responses follow this standard structure:
{
"success": true,
"status": 200,
"message": "İşlem başarıyla tamamlandı", // i18n translated
"data": { ... },
"errors": []
}
Critical Rule: Controllers must NEVER return raw Prisma entities. Always use Response DTOs with @Exclude() and @Expose() from class-transformer.
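The envelope can be expressed as a generic type with small helpers. This is a sketch based on the sample response above: the element type of `errors` is an assumption, and `ok`, `fail`, and `unwrap` are hypothetical helper names rather than the project's own ApiResponse utilities.

```typescript
// Shape taken from the sample response; the `errors` element type is assumed.
interface ApiResponse<T> {
  success: boolean;
  status: number;
  message: string;
  data: T | null;
  errors: unknown[];
}

function ok<T>(data: T, message: string): ApiResponse<T> {
  return { success: true, status: 200, message, data, errors: [] };
}

function fail(status: number, message: string, errors: unknown[] = []): ApiResponse<never> {
  return { success: false, status, message, data: null, errors };
}

// Errors are reported in the body, so callers inspect `success` rather than the HTTP status line.
function unwrap<T>(body: ApiResponse<T>): T {
  if (!body.success || body.data === null) {
    throw new Error(`API error ${body.status}: ${body.message}`);
  }
  return body.data;
}
```

A client would call `unwrap(response)` on every parsed body, which turns a failed envelope such as `fail(404, "USER_NOT_FOUND")` into a thrown error.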
8. Configuration
Environment Variables
NODE_ENV=development
PORT=3005
DATABASE_URL=postgresql://user:password@localhost:15432/boilerplate_db
JWT_SECRET=your-secret-key
JWT_ACCESS_EXPIRATION=15m
JWT_REFRESH_EXPIRATION=7d
REDIS_ENABLED=false
REDIS_HOST=localhost
REDIS_PORT=6379
AI_ENGINE_URL=http://127.0.0.1:8000
ENABLE_GEMINI=false
GOOGLE_API_KEY=your-api-key
Config Files
- top_leagues.json: Football top league IDs (live match filter)
- basketball_top_leagues.json: Basketball top league IDs
- bet-type.json: Bet type definitions
9. Build & Run Commands
# Development
npm run start:dev # Watch mode (port 3005)
# Production
npm run build && npm run start:prod
# Feeder (Data Collection)
npm run feeder:historical # Historical scraping (2023-06→present)
npm run feeder:fill-gaps # Fill missing data
npm run feeder:basketball # Basketball data
npm run feeder:live # Live data
# Database
npx prisma generate # Regenerate Prisma client
npx prisma migrate dev # Run migrations
npx prisma db seed # Seed database
# Testing
npm run test # Unit tests
npm run test:e2e # E2E tests
npx jest src/path/to/file.spec.ts # Single test file
# Lint/Format
npm run lint # ESLint with Prettier
npm run format # Prettier write
# Docker
docker-compose up -d postgres redis # Infrastructure
docker-compose up -d # All services
# AI Engine (Python)
cd ai-engine && uvicorn main:app --host 0.0.0.0 --port 8000 --reload
# Utility
npm run swagger:summary # Export endpoint summary
npm run cleanup:live # Cleanup live matches
10. Code Style Guidelines
Imports Order
// 1. NestJS/common imports
import { Controller, Get, Post, Body } from '@nestjs/common';
// 2. External packages
import * as bcrypt from 'bcrypt';
// 3. Local imports (relative)
import { UsersService } from './users.service';
Naming Conventions
- Classes/Interfaces: PascalCase
- Variables/Functions: camelCase
- Constants: UPPER_SNAKE_CASE
- Files: kebab-case
- DTOs: Entity + Dto suffix (CreateUserDto, UpdateUserDto)
Types
- strictNullChecks: true (null/undefined checks required)
- noImplicitAny: false (any allowed, for Prisma dynamic access)
- Specify function return types: async findOne(id: string): Promise<User>
Error Handling
// Use NestJS HTTP Exceptions with i18n keys
throw new NotFoundException('USER_NOT_FOUND');
throw new ConflictException('EMAIL_ALREADY_EXISTS');
// Reference src/i18n/{lang}/errors.json for available keys
11. Known Issues & Gotchas
- Predictions module requires Redis. Disabled when REDIS_ENABLED=false.
- Gemini AI is optional. Returns null commentary when disabled.
- Global Exception Filter wraps all errors as HTTP 200 (status in body).
- Lineup scraping is disabled; only Team Stats are used (V20 optimization).
- Feeder V17 AI feature calculation is disabled; the V20 model runs in Python.
- BigInt serialization: BigInt.prototype.toJSON = function() { return this.toString(); } polyfill in main.ts.
- i18n assets copied via nest-cli.json "assets": ["i18n/**/*"] config.
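The BigInt polyfill matters because `JSON.stringify` throws a TypeError on bigint values unless a `toJSON` hook exists. The sketch below shows a typed variant of the same polyfill; the cast and the sample `row` object are illustrative and not copied from main.ts.

```typescript
// Typed form of the polyfill: the cast is needed because BigInt has no toJSON in the standard typings.
(BigInt.prototype as unknown as { toJSON: (this: bigint) => string }).toJSON = function (
  this: bigint,
): string {
  return this.toString();
};

// Hypothetical row with a BigInt column value beyond Number.MAX_SAFE_INTEGER.
const row = { id: BigInt("9007199254740993"), name: "example" };
const json = JSON.stringify(row);
console.log(json); // {"id":"9007199254740993","name":"example"}
```

Serializing to a string rather than a number avoids precision loss for identifiers larger than 2^53, at the cost of clients having to parse the value themselves.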
12. Reference Files for AI Agents
When working on this project, consult:
- project_summary.md: Comprehensive project documentation (Turkish)
- README.md: Architecture decisions, quick start guide
- prompt.md: AI assistant reference guide with agent roles
- AGENTS.md: Coding guidelines, DTO patterns, test structure
- .agent/: Skills and agent role definitions
- top_leagues.json / basketball_top_leagues.json: League filters
13. Team Logos
Team logo URL template: https://file.mackolikfeeds.com/teams/{teamId}
14. 🆕 VQWEN Model Integration (Since 2026-04-06)
We have integrated a new high-performance prediction engine called VQWEN v3.
VQWEN Model Features
- Accuracy: +244.4 Units profit in Time-Series Backtest (75.1% Win Rate on BTTS/Over markets).
- Features Used:
  - ELO Ratings (Real-time team strength).
  - Contextual Goals (Home/Away specific performance).
  - Rest Days (Fatigue factor for teams playing < 3 days).
  - H2H Win Rate (Historical dominance).
  - Form Points (Last 5 games streak).
  - Squad Strength (Based on starting XI participation).
- Files:
  - ai-engine/scripts/train_vqwen_v3.py: Training script.
  - ai-engine/services/single_match_orchestrator.py: Integration point.
  - ai-engine/models/vqwen/: Pickle models (vqwen_ms.pkl, etc.).
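Two of the listed features, Rest Days and Form Points, are simple enough to sketch. The definitions below are assumptions for illustration (a 3/1/0 points scheme and a strict "fewer than 3 days" fatigue threshold); the Python training script may compute them differently, and this TypeScript version is not part of the AI Engine.

```typescript
type MatchResult = "W" | "D" | "L";

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Days between the previous kickoff and the next one (timestamps in epoch milliseconds).
function restDays(previousKickoffMs: number, nextKickoffMs: number): number {
  return (nextKickoffMs - previousKickoffMs) / MS_PER_DAY;
}

// A team counts as fatigued when it has had fewer than `thresholdDays` of rest.
function isFatigued(days: number, thresholdDays: number = 3): boolean {
  return days < thresholdDays;
}

// Sums points over the most recent `window` results (ordered oldest to newest).
// The 3/1/0 scheme is an assumed convention.
function formPoints(results: MatchResult[], window: number = 5): number {
  return results
    .slice(-window)
    .reduce((sum, r) => sum + (r === "W" ? 3 : r === "D" ? 1 : 0), 0);
}
```

With these definitions, a team whose last six results are L, W, W, D, L, W scores 10 form points, because only the last five results count.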
New Live Lineup/Sidelined Fetcher
- Problem: lineups and sidelined columns in live_matches were empty.
- Fix: Added updateLineupsAndSidelined() method to src/tasks/data-fetcher.task.ts.
- Mechanism: Uses FeederScraperService.fetchStartingFormation directly via Cron (*/15 * * * *).
- Status: Active.
Database Schema Updates
- substate Column: Added to matches table to track specific match states (e.g., "penalties", "overtime", "postponed").
- Sport Partition: Tables are now partitioned by sport (football_team_stats vs basketball_team_stats).
16. 🔍 HT/FT Reversal Analysis (Since 2026-04-07)
HT/FT Reversal (1/2 & 2/1) Pattern Detection
Reversal matches (İY/MS, i.e. HT/FT, of 1/2 or 2/1) are statistically rare events that can indicate match-fixing or unusual patterns.
Key Findings (147,248 matches analyzed)
| Metric | Value |
|---|---|
| Total Reversal Matches | 13,112 (8.90%) |
| 1/2 (Home leads HT, Away wins FT) | 5,992 (4.07%) |
| 2/1 (Away leads HT, Home wins FT) | 7,120 (4.84%) |
🚨 Basketball Leagues Have Suspiciously High Reversal Rates
| League | Reversals | Total | Rate |
|---|---|---|---|
| Eurobasket U20 | 36 | 120 | 30.00% 🔴 |
| EuroLeague 🏀 | 183 | 639 | 28.64% 🔴 |
| PBA Commissioners 🏀 | 54 | 189 | 28.57% 🔴 |
| Ulusal Süper Lig 🏀 | 148 | 547 | 27.06% 🔴 |
| NBA 🏀 | 656 | 2,696 | 24.33% 🔴 |
All top 15 leagues by reversal rate are BASKETBALL. Football leagues show normal rates (5-8%).
Suspicious Patterns
- Comeback Magnitude:
  - 1 goal/point: 36.1% (normal)
  - 2 goals/points: 13.1% (suspicious)
  - 3+ goals/points: 50.8% 🔴 EXTREMELY HIGH
- Extreme Comebacks (Basketball):
  - Mineros vs Irapuato: HT 39-45 → FT 102-61 (41-point final margin after trailing at HT)
  - Utah vs Memphis: HT 65-64 → FT 103-140 (37-point final margin after leading at HT)
  - These are statistically near-impossible without manipulation
- Favorite Loss Rate:
  - 42.7% of reversals had the pre-match favorite lose (should be ~25-30%)
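The reversal labels used in this section can be derived directly from half-time and full-time scores. The following sketch shows one way to do it; the function names are hypothetical and the home team is assumed to be listed first.

```typescript
type Side = "1" | "X" | "2";

// "1" = home ahead, "2" = away ahead, "X" = level.
function leader(home: number, away: number): Side {
  if (home > away) return "1";
  return home < away ? "2" : "X";
}

// Builds the HT/FT label, e.g. "2/1" when the away side leads at HT and the home side wins.
function htftLabel(htHome: number, htAway: number, ftHome: number, ftAway: number): string {
  return `${leader(htHome, htAway)}/${leader(ftHome, ftAway)}`;
}

function isReversal(label: string): boolean {
  return label === "1/2" || label === "2/1";
}

// Reversal share as a percentage, rounded to two decimals.
function reversalRatePct(reversals: number, total: number): number {
  return Math.round((reversals / total) * 10000) / 100;
}
```

Applied to the examples above, Mineros vs Irapuato (HT 39-45, FT 102-61) is labelled 2/1 and Utah vs Memphis (HT 65-64, FT 103-140) is labelled 1/2, while `reversalRatePct(13112, 147248)` reproduces the 8.9% overall rate.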
Impact on Model
- HT/FT model accuracy: 20.3% (low due to reversal noise)
- Basketball reversal data creates training noise
- Recommendation: Either exclude basketball from HT/FT training or train separate basketball-specific model
HT/FT Model Files
- Training script: ai-engine/scripts/train_htft_vqwen.py
- Model output: ai-engine/models/xgboost/xgb_ht_ft.json + .pkl
- Features: 27 (Odds + HT/FT Tendencies + League stats)
- Status: Working, outputs 9-class probabilities in market_board.HTFT.probs
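A consumer of the 9-class output might pick the most likely class and sum the two reversal classes. The sketch below assumes the probabilities are keyed by labels such as "1/1" and "X/2"; the actual key format of market_board.HTFT.probs is not documented here, so treat the key names and helper functions as hypothetical.

```typescript
// Assumed class labels: half-time result / full-time result.
const HTFT_CLASSES = ["1/1", "1/X", "1/2", "X/1", "X/X", "X/2", "2/1", "2/X", "2/2"] as const;
type HtftClass = (typeof HTFT_CLASSES)[number];
type HtftProbs = Record<HtftClass, number>;

// Returns the class with the highest probability (first one wins on ties).
function topClass(probs: HtftProbs): HtftClass {
  let best: HtftClass = HTFT_CLASSES[0];
  for (const cls of HTFT_CLASSES) {
    if (probs[cls] > probs[best]) best = cls;
  }
  return best;
}

// Combined probability of a 1/2 or 2/1 reversal.
function reversalProbability(probs: HtftProbs): number {
  return probs["1/2"] + probs["2/1"];
}
```

The combined reversal probability is the kind of value a reversal watchlist could rank matches by, although how the /v20plus/reversal-watchlist endpoint actually scores matches is not specified in this document.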
17. 🐛 Lineup Parsing Fix (Since 2026-04-07)
Problem
AI Engine reported "lineup_unavailable" and "lineup_incomplete" flags even when live_matches.lineups contained full 11/11 lineup data from Mackolik.
Root Cause
Mackolik stores lineups in "stats" key format:
{
"stats": {
"home": [{ "personId": "...", "position": "...", ... }, ...],
"away": [{ "personId": "...", "position": "...", ... }, ...]
}
}
But the parser expected "xi", "starting", or "lineup" keys at root level.
Fix
Updated _parse_lineups_json() in ai-engine/services/single_match_orchestrator.py:
- Added fallback to check lineups_json.get("stats") for home/away arrays
- Now correctly parses Mackolik's nested format
- Result: home_lineup_count: 11, away_lineup_count: 11, lineup_source: "confirmed_live"
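The fallback logic can be illustrated outside the AI Engine. The sketch below is a TypeScript rendering of the idea only, since the actual fix lives in the Python function `_parse_lineups_json()`; the helper names are hypothetical and only the "stats" branch described above is modelled.

```typescript
interface LineupCounts {
  home_lineup_count: number;
  away_lineup_count: number;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Counts players found under the nested "stats" key; returns zeros when the shape does not match.
function countStatsLineups(lineupsJson: unknown): LineupCounts {
  const empty: LineupCounts = { home_lineup_count: 0, away_lineup_count: 0 };
  if (!isRecord(lineupsJson)) return empty;
  const stats = lineupsJson["stats"];
  if (!isRecord(stats)) return empty;
  const home = stats["home"];
  const away = stats["away"];
  if (!Array.isArray(home) || !Array.isArray(away)) return empty;
  return { home_lineup_count: home.length, away_lineup_count: away.length };
}
```

Treating the input as `unknown` and validating each level keeps the parser from failing on payloads that use a different layout, which is the situation that produced the spurious "lineup_unavailable" flags.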
18. Docker Deployment
# docker-compose.yml services:
services:
app: # NestJS (port 3000→3000)
postgres: # PostgreSQL 17 Alpine (port 15432:5432)
redis: # Redis 7 Alpine (port 6379)
adminer: # Database UI (dev profile, port 8080)
ai-engine: # Python FastAPI (port 8002:8000)
This file is maintained for AI agent context. Update when architecture or conventions change.