# Suggest-Bet-BE — AI Agent Context

> **Last Updated:** 2026-04-06
> **Purpose:** Comprehensive project reference for AI agents working on this codebase.

---

## 1. Project Overview

**Suggest-Bet-BE** is an **AI-powered sports betting prediction platform** backend. It provides:

- AI-driven predictions for football & basketball matches
- Smart coupon generation (SAFE, BALANCED, AGGRESSIVE, VALUE, MIRACLE strategies)
- Live score tracking & odds monitoring
- Web scraping from Mackolik.com for historical & live match data
- Google Gemini AI for natural language match commentary
- User coupon tracking (ROI, Win Rate analytics)

### Technology Stack

| Layer       | Technology                                                  |
| ----------- | ----------------------------------------------------------- |
| Backend API | NestJS 11 (TypeScript)                                      |
| AI Engine   | Python FastAPI (prediction model v20+)                      |
| Database    | PostgreSQL 16 + Prisma ORM                                  |
| Queue       | BullMQ + Redis (optional)                                   |
| Cache       | Redis or in-memory fallback                                 |
| Auth        | JWT + Passport (15-minute access token, 7-day refresh token) |
| Scraping    | Axios + Cheerio (Mackolik HTML parsing)                     |
| Logging     | Pino (structured logging)                                   |
| i18n        | nestjs-i18n (TR, EN)                                        |
| API Docs    | Swagger                                                     |
| Deploy      | Docker Compose                                              |

---

## 2. Architecture

```
┌──────────────────────────────────────────────────────────────────┐
│                       CLIENTS (Web/Mobile)                       │
└───────────────────────────────┬──────────────────────────────────┘
                                │ HTTP/REST
┌───────────────────────────────▼──────────────────────────────────┐
│                    NestJS Backend (Port 3005)                    │
│  ┌─────────┬──────────┬──────────┬──────────┬─────────────────┐  │
│  │ Auth    │ Admin    │ Matches  │ Leagues  │ Predictions     │  │
│  │ Module  │ Module   │ Module   │ Module   │ Module          │  │
│  ├─────────┼──────────┼──────────┼──────────┼─────────────────┤  │
│  │ Coupons │ Analysis │ Gemini   │ Social-  │ Health          │  │
│  │ Module  │ Module   │ Module   │ Poster   │ Module          │  │
│  │SporToto │ Feeder   │ Users    │          │                 │  │
│  └─────────┴──────────┴──────────┴──────────┴─────────────────┘  │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ Services: AiService | MatchAnalysis | Scraper                │ │
│ ├──────────────────────────────────────────────────────────────┤ │
│ │ Tasks: DataFetcher (Cron) | LiveUpdater | LimitResetter      │ │
│ └──────────────────────────────────────────────────────────────┘ │
└────┬─────────────────┬────────────────────┬──────────────────────┘
     │                 │                    │
     ▼                 ▼                    ▼
┌──────────┐    ┌──────────────┐   ┌──────────────────┐
│PostgreSQL│    │ Redis/BullMQ │   │ AI Engine (py)   │
│ (3.6GB)  │    │ (Optional)   │   │ FastAPI:8000     │
└──────────┘    └──────────────┘   └──────────────────┘
                                            │
                                    ┌───────▼───────┐
                                    │ Mackolik API  │
                                    │ (Data Source) │
                                    └───────────────┘
```

### Database Statistics (approximate)

- `matches`: 237K permanent match records
- `live_matches`: ~82 active/upcoming matches (daily cycle)
- `match_player_participation`: 3.3M rows
- `odd_selections`: 8.5M rows
- `teams`: 19,595 | `players`: 217K | `leagues`: 1,505

---

## 3. Directory Structure

```
src/
├── app.module.ts                   # Root module (Redis, Config, i18n, guards)
├── main.ts                         # Entry point, Swagger, Helmet, ValidationPipe
├── common/                         # Shared layer
│   ├── base/                       # Generic BaseService<T> & BaseController<T>
│   ├── types/                      # ApiResponse<T>, pagination DTOs
│   ├── filters/                    # GlobalExceptionFilter (HTTP 200 wrapper)
│   ├── interceptors/               # ResponseInterceptor, SanitizeInterceptor
│   ├── decorators/                 # @Public(), @Roles(), @CurrentUser()
│   └── queues/                     # BullMQ queue module
├── config/                         # Env validation (Zod), config factories
├── database/                       # PrismaService
├── i18n/                           # TR/EN translations (common, errors, validation, auth)
├── modules/                        # 13 feature modules
│   ├── admin/                      # Superadmin panel (user mgmt, settings, analytics)
│   ├── analysis/                   # Multi-match analysis orchestration
│   ├── auth/                       # JWT auth, refresh tokens, guards
│   ├── coupons/                    # SmartCouponService (5 strategies), UserCouponService
│   ├── feeder/                     # Historical data scraping (Mackolik)
│   ├── gemini/                     # Google Gemini AI integration
│   ├── health/                     # Liveness, readiness, AI Engine health
│   ├── leagues/                    # Country/league/team discovery, H2H
│   ├── matches/                    # Match listing, details, active leagues
│   ├── predictions/                # AI predictions with BullMQ queue & 6h cache
│   ├── social-poster/              # Twitter API v2, Canvas image generation
│   ├── spor-toto/                  # Spor Toto integration
│   └── users/                      # User CRUD (BaseController pattern)
├── scripts/                        # Feeder runners, cleanup scripts
├── services/                       # Shared services
│   ├── ai.service.ts               # Python AI Engine bridge
│   ├── match-analysis.service.ts   # 7-phase analysis orchestrator
│   └── scraper.service.ts          # Mackolik HTML scraping
└── tasks/                          # Cron jobs (15min, 30min, daily)
    ├── data-fetcher.task.ts        # Live matches, odds fetching
    ├── live-updater.task.ts        # Score updates, match finalization
    └── limit-resetter.task.ts      # Usage limits, subscription expiry

ai-engine/                          # Python FastAPI ML engine
├── main.py                         # FastAPI app, routes
├── services/                       # single_match_orchestrator.py
├── core/                           # Core algorithms
├── features/                       # Feature engineering
├── models/                         # ML models
├── training/                       # Model training scripts
├── config/                         # Configuration
├── utils/                          # Utility functions
└── tests/                          # Test files
```

---

## 4. Key Modules

### Auth Module

- Register, Login, Refresh, Logout endpoints
- bcrypt (12 rounds), JWT Access (15min) + Refresh Token (7 days, DB-stored)
- Global guards: `JwtAuthGuard`, `RolesGuard`, `PermissionsGuard`

### Predictions Module

- Requires Redis (`REDIS_ENABLED=true`), conditionally loaded
- BullMQ queue with worker processor
- 6-hour TTL cache on prediction results
- AI Engine call: `POST /v20plus/analyze/{matchId}`

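
The 6-hour TTL on prediction results can be pictured with a small in-memory cache. This is only a sketch of the expiry idea: the real module caches through Redis, and `TtlCache`, `predictionCacheKey`, and the key format are hypothetical names introduced here for illustration.

```typescript
// Minimal TTL cache sketch. The clock is injectable so expiry can be tested.
const SIX_HOURS_MS = 6 * 60 * 60 * 1000;

class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // stale entries are dropped lazily on read
      return undefined;
    }
    return entry.value;
  }
}

// Hypothetical key format; the project's actual cache key is not documented here.
function predictionCacheKey(matchId: string): string {
  return `prediction:${matchId}`;
}
```

A cache miss (or an expired entry) would be the point at which a new job is queued for the AI Engine call.
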
### Coupons Module

- `SmartCouponService`: 5 strategies (SAFE ≥78% confidence/2 matches, BALANCED, AGGRESSIVE, VALUE EV+, MIRACLE)
- `UserCouponService`: Coupon creation, bet settlement (MS 1/X/2, Alt/Üst, KG Var/Yok)

### Feeder Module

- Historical scraping from 2023-06-01 to present (reverse chronological)
- Concurrency of 20, 300ms delay, up to 50 retries, exponential backoff on HTTP 502
- Resume support with state management

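
The retry behaviour described above (retry on HTTP 502 with exponential backoff, up to 50 attempts) could look roughly like the sketch below. The function names, the 30-second cap, and the use of 300ms as the backoff base are assumptions made for illustration, not the feeder's actual implementation.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
// The 30s cap and the 300ms base are assumptions for this sketch.
function backoffDelayMs(attempt: number, baseMs: number = 300, capMs: number = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

type FetchResult<T> =
  | { kind: "ok"; data: T }
  | { kind: "error"; httpStatus: number };

// `fetchOnce` and `sleep` are injected so the sketch stays transport-agnostic.
async function fetchWithRetry<T>(
  fetchOnce: () => Promise<FetchResult<T>>,
  sleep: (ms: number) => Promise<void>,
  maxRetries: number = 50,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const result = await fetchOnce();
    if (result.kind === "ok") return result.data;
    // Only HTTP 502 is treated as retryable here; everything else fails fast.
    if (result.httpStatus !== 502 || attempt >= maxRetries) {
      throw new Error(`request failed with HTTP ${result.httpStatus}`);
    }
    await sleep(backoffDelayMs(attempt));
  }
}
```

Capping the delay keeps a long run of 502 responses from stalling a worker for minutes at a time.
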
### Analysis Module

- Usage limits: Free (10 analyses/3 coupons/day) vs Premium (50 analyses/10 coupons)
- 7-phase flow: URL Parse → Scrape → Python Engine → Strategy → Similar Matches → Final Prediction → DB Save

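
As a rough illustration of the usage limits above, a per-tier lookup table plus a remaining-quota check might look like this. The type and function names are hypothetical, and the sketch assumes the Premium figures are also per day.

```typescript
type Tier = "FREE" | "PREMIUM";
type UsageKind = "analyses" | "coupons";

// Limits taken from the bullet above; Premium is assumed to be a daily quota too.
const DAILY_LIMITS: Record<Tier, Record<UsageKind, number>> = {
  FREE: { analyses: 10, coupons: 3 },
  PREMIUM: { analyses: 50, coupons: 10 },
};

// How many more requests of this kind the user may make today (never negative).
function remainingToday(tier: Tier, kind: UsageKind, usedToday: number): number {
  return Math.max(0, DAILY_LIMITS[tier][kind] - usedToday);
}

function canConsume(tier: Tier, kind: UsageKind, usedToday: number): boolean {
  return remainingToday(tier, kind, usedToday) > 0;
}
```

The daily `resetUsageLimits()` task would then only need to set the stored `usedToday` counters back to zero.
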
### Social Poster Module

- Twitter API v2 integration
- Canvas-based prediction card image generation
- Gemini-powered Turkish caption generation

---

## 5. Scheduled Tasks (Cron)

| Task                        | Schedule       | Description                                                                  |
| --------------------------- | -------------- | ---------------------------------------------------------------------------- |
| `fetchLiveMatches()`        | `*/15 * * * *` | Fetch football matches from Mackolik API                                     |
| `fetchOddsForPreMatches()`  | `*/15 * * * *` | Fetch odds for upcoming matches (football + basketball)                      |
| `fetchBasketballMatches()`  | Manual         | Basketball data via `basketball_top_leagues.json` filter                     |
| `updateLiveScores()`        | `*/15 * * * *` | Update live match scores                                                     |
| `finalizeFinishedMatches()` | `*/30 * * * *` | Migrate finished matches: `live_matches` → `matches` table                   |
| `resetUsageLimits()`        | `0 3 * * *`    | Reset daily usage limits (03:00 Istanbul time)                               |
| `cleanupOldData()`          | `0 4 * * *`    | Delete AI logs older than 30 days and finished `live_matches` older than 1 day |
| `checkSubscriptions()`      | `0 0 * * *`    | Mark expired subscriptions                                                   |

---

## 6. AI Engine (Python FastAPI)

Independent microservice on port 8000.

### Endpoints

| Method | Path                               | Description                     |
| ------ | ---------------------------------- | ------------------------------- |
| POST   | `/v20plus/analyze/{match_id}`      | Single match analysis (main)    |
| GET    | `/v20plus/analyze-htms/{match_id}` | First half / full time analysis |
| GET    | `/v20plus/analyze-htft/{match_id}` | HT/FT probabilities             |
| POST   | `/v20plus/coupon`                  | Smart coupon generation         |
| GET    | `/v20plus/daily-banker`            | Daily banker picks              |
| GET    | `/v20plus/reversal-watchlist`      | Score reversal watchlist        |
| GET    | `/health`                          | Health check                    |

### Output Structure (`SingleMatchPredictionPackage`)

```typescript
{
  model_version: "v20plus.X",
  match_info: { match_id, match_name, home_team, away_team, league, match_date_ms },
  data_quality: { label: "HIGH"|"MEDIUM"|"LOW", score, flags, lineup_counts },
  risk: { level: "LOW"|"MEDIUM"|"HIGH"|"EXTREME", score, is_surprise_risk, warnings },
  main_pick: { market, pick, probability, confidence, odds, bet_grade, edge },
  value_pick: { ... },
  bet_advice: { playable, suggested_stake_units, reason },
  bet_summary: [{ market, pick, raw_confidence, calibrated_confidence, bet_grade }],
  supporting_picks: [...],
  aggressive_pick: { market, pick, probability, confidence, odds },
  scenario_top5: [{ score, prob }],
  score_prediction: { ft, ht, xg_home, xg_away, xg_total },
  market_board: { ... },
  reasoning_factors: string[],
  ai_commentary: string // Turkish commentary from Gemini
}
```

---

## 7. API Response Format

All responses follow this standard structure:

```json
{
  "success": true,
  "status": 200,
  "message": "İşlem başarıyla tamamlandı", // i18n translated (TR for "Operation completed successfully")
  "data": { ... },
  "errors": []
}
```

**Critical Rule:** Controllers must NEVER return raw Prisma entities. Always use Response DTOs with `@Exclude()` and `@Expose()` from `class-transformer`.

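
To make the envelope and the "no raw entities" rule concrete, here is a dependency-free sketch. In the real codebase the mapping is done with `class-transformer` decorators; the `ApiEnvelope`, `UserEntity`, and `UserResponseDto` shapes below are hypothetical, and the sketch assumes `errors` is a list of strings and `data` is `null` on failure.

```typescript
// Assumed envelope shape, mirroring the JSON structure shown above.
interface ApiEnvelope<T> {
  success: boolean;
  status: number;
  message: string;
  data: T | null;
  errors: string[];
}

function okEnvelope<T>(data: T, message: string): ApiEnvelope<T> {
  return { success: true, status: 200, message, data, errors: [] };
}

function errorEnvelope(statusCode: number, message: string, errors: string[] = []): ApiEnvelope<never> {
  return { success: false, status: statusCode, message, data: null, errors };
}

// Hypothetical entity and DTO, used only to illustrate the allow-list idea.
interface UserEntity { id: string; email: string; passwordHash: string; }
interface UserResponseDto { id: string; email: string; }

function toUserResponse(user: UserEntity): UserResponseDto {
  // Copy only the fields that are safe to expose; everything else is dropped.
  return { id: user.id, email: user.email };
}
```

An explicit allow-list fails safe: a column added to the entity later stays private until someone deliberately exposes it.
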

---

## 8. Configuration

### Environment Variables

```env
NODE_ENV=development
PORT=3005
DATABASE_URL=postgresql://user:password@localhost:15432/boilerplate_db
JWT_SECRET=your-secret-key
JWT_ACCESS_EXPIRATION=15m
JWT_REFRESH_EXPIRATION=7d
REDIS_ENABLED=false
REDIS_HOST=localhost
REDIS_PORT=6379
AI_ENGINE_URL=http://127.0.0.1:8000
ENABLE_GEMINI=false
GOOGLE_API_KEY=your-api-key
```

### Config Files

- `top_leagues.json`: Football top league IDs (live match filter)
- `basketball_top_leagues.json`: Basketball top league IDs
- `bet-type.json`: Bet type definitions

---

## 9. Build & Run Commands

```bash
# Development
npm run start:dev                   # Watch mode (port 3005)

# Production
npm run build && npm run start:prod

# Feeder (Data Collection)
npm run feeder:historical           # Historical scraping (2023-06→present)
npm run feeder:fill-gaps            # Fill missing data
npm run feeder:basketball           # Basketball data
npm run feeder:live                 # Live data

# Database
npx prisma generate                 # Regenerate Prisma client
npx prisma migrate dev              # Run migrations
npx prisma db seed                  # Seed database

# Testing
npm run test                        # Unit tests
npm run test:e2e                    # E2E tests
npx jest src/path/to/file.spec.ts   # Single test file

# Lint/Format
npm run lint                        # ESLint with Prettier
npm run format                      # Prettier write

# Docker
docker-compose up -d postgres redis # Infrastructure
docker-compose up -d                # All services

# AI Engine (Python)
cd ai-engine && uvicorn main:app --host 0.0.0.0 --port 8000 --reload

# Utility
npm run swagger:summary             # Export endpoint summary
npm run cleanup:live                # Cleanup live matches
```

---

## 10. Code Style Guidelines

### Imports Order

```typescript
// 1. NestJS/common imports
import { Controller, Get, Post, Body } from '@nestjs/common';

// 2. External packages
import * as bcrypt from 'bcrypt';

// 3. Local imports (relative)
import { UsersService } from './users.service';
```

### Naming Conventions

- Classes/Interfaces: `PascalCase`
- Variables/Functions: `camelCase`
- Constants: `UPPER_SNAKE_CASE`
- Files: `kebab-case`
- DTOs: `Entity + Dto` suffix (CreateUserDto, UpdateUserDto)

### Types

- `strictNullChecks: true`: null/undefined checks required
- `noImplicitAny: false`: `any` allowed (Prisma dynamic access)
- Specify function return types: `async findOne(id: string): Promise<User>`

### Error Handling

```typescript
// Use NestJS HTTP Exceptions with i18n keys
throw new NotFoundException('USER_NOT_FOUND');
throw new ConflictException('EMAIL_ALREADY_EXISTS');

// Reference src/i18n/{lang}/errors.json for available keys
```

---

## 11. Known Issues & Gotchas

1. **Predictions module** requires Redis. It is disabled when `REDIS_ENABLED=false`.
2. **Gemini AI** is optional. Commentary is returned as `null` when it is disabled.
3. **Global Exception Filter** wraps all errors as HTTP 200; the real status code is carried in the response body.
4. **Lineup scraping** is disabled. Only Team Stats are used (V20 optimization).
5. **Feeder V17 AI feature calculation** is disabled. The V20 model runs in Python.
6. **BigInt serialization**: `main.ts` installs the polyfill `BigInt.prototype.toJSON = function() { return this.toString(); }`.
7. **i18n assets** are copied via the `"assets": ["i18n/**/*"]` setting in `nest-cli.json`.

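
The BigInt item matters because `JSON.stringify` throws a `TypeError` on `bigint` values unless a `toJSON` method is present. The snippet below is a minimal standalone demonstration of the polyfill's effect; the sample object and variable name are made up for illustration.

```typescript
// Install the polyfill: serialize bigint values as decimal strings.
(BigInt.prototype as any).toJSON = function (this: bigint): string {
  return this.toString();
};

// A value beyond Number.MAX_SAFE_INTEGER keeps full precision as a string.
const serializedMatch = JSON.stringify({ id: BigInt("9007199254740993") });
console.log(serializedMatch); // {"id":"9007199254740993"}
```

Clients therefore receive such IDs as strings and should not parse them back into JavaScript numbers.
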

---

## 12. Reference Files for AI Agents

When working on this project, consult:

- `project_summary.md`: Comprehensive project documentation (Turkish)
- `README.md`: Architecture decisions, quick start guide
- `prompt.md`: AI assistant reference guide with agent roles
- `AGENTS.md`: Coding guidelines, DTO patterns, test structure
- `.agent/`: Skills and agent role definitions
- `top_leagues.json` / `basketball_top_leagues.json`: League filters

---

## 13. Team Logos

Team logo URL template: `https://file.mackolikfeeds.com/teams/{teamId}`

---

## 14. 🆕 VQWEN Model Integration (Since 2026-04-06)

We have integrated a new high-performance prediction engine called **VQWEN v3**.

### VQWEN Model Features

- **Accuracy:** +244.4 units of profit in a time-series backtest (75.1% win rate on BTTS/Over markets).
- **Features Used:**
  - `ELO Ratings` (real-time team strength).
  - `Contextual Goals` (home/away-specific performance).
  - `Rest Days` (fatigue factor for teams with fewer than 3 days of rest).
  - `H2H Win Rate` (historical dominance).
  - `Form Points` (streak over the last 5 games).
  - `Squad Strength` (based on starting XI participation).
- **Files:**
  - `ai-engine/scripts/train_vqwen_v3.py`: Training script.
  - `ai-engine/services/single_match_orchestrator.py`: Integration point.
  - `ai-engine/models/vqwen/`: Pickle models (`vqwen_ms.pkl`, etc.).

### New Live Lineup/Sidelined Fetcher

- **Problem:** The `lineups` and `sidelined` columns in `live_matches` were empty.
- **Fix:** Added an `updateLineupsAndSidelined()` method to `src/tasks/data-fetcher.task.ts`.
- **Mechanism:** Calls `FeederScraperService.fetchStartingFormation` directly from a cron job (`*/15 * * * *`).
- **Status:** Active.

### Database Schema Updates

- **`substate` Column:** Added to the `matches` table to track specific match states (e.g., "penalties", "overtime", "postponed").
- **Sport Partition:** Stats tables are now split by sport into separate tables (`football_team_stats` vs `basketball_team_stats`).

---

## 16. 🔍 HT/FT Reversal Analysis (Since 2026-04-07)

### HT/FT Reversal (1/2 & 2/1) Pattern Detection

Reversal matches (İY/MS, i.e. HT/FT, of 1/2 or 2/1) are statistically rare events that can indicate match-fixing or unusual patterns.

#### Key Findings (147,248 matches analyzed)

| Metric | Value |
|--------|-------|
| **Total Reversal Matches** | 13,112 (8.90%) |
| **1/2 (Home leads HT, Away wins FT)** | 5,992 (4.07%) |
| **2/1 (Away leads HT, Home wins FT)** | 7,120 (4.84%) |

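
The 1/2 and 2/1 labels used in the table can be derived mechanically from the half-time and full-time scores. The sketch below shows one way to do that; the type and function names are hypothetical and are not taken from the analysis scripts.

```typescript
type Outcome = "1" | "X" | "2";

// "1" = home ahead, "2" = away ahead, "X" = level.
function outcomeOf(home: number, away: number): Outcome {
  if (home > away) return "1";
  if (home < away) return "2";
  return "X";
}

interface HalfFullScore {
  htHome: number;
  htAway: number;
  ftHome: number;
  ftAway: number;
}

// Produces one of the nine HT/FT classes, e.g. "1/2" or "X/1".
function htFtLabel(score: HalfFullScore): string {
  return `${outcomeOf(score.htHome, score.htAway)}/${outcomeOf(score.ftHome, score.ftAway)}`;
}

// A reversal is a match where the half-time leader loses at full time.
function isReversal(score: HalfFullScore): boolean {
  const label = htFtLabel(score);
  return label === "1/2" || label === "2/1";
}
```

For example, a match that stands at 65-64 at half time and ends 103-140 is labelled `1/2` and counts as a reversal, while any result involving a draw at either stage does not.
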
#### 🚨 Basketball Leagues Have Suspiciously High Reversal Rates

| League | Reversals | Total | Rate |
|--------|-----------|-------|------|
| Eurobasket U20 | 36 | 120 | **30.00%** 🔴 |
| EuroLeague 🏀 | 183 | 639 | **28.64%** 🔴 |
| PBA Commissioners 🏀 | 54 | 189 | **28.57%** 🔴 |
| Ulusal Süper Lig 🏀 | 148 | 547 | **27.06%** 🔴 |
| NBA 🏀 | 656 | 2,696 | **24.33%** 🔴 |

**All top 15 leagues by reversal rate are BASKETBALL.** Football leagues show normal rates (5-8%).

#### Suspicious Patterns

1. **Comeback Magnitude:**
   - 1 goal/point: 36.1% (normal)
   - 2 goals/points: 13.1% (suspicious)
   - **3+ goals/points: 50.8%** 🔴 **EXTREMELY HIGH**

2. **Extreme Comebacks (Basketball):**
   - Mineros vs Irapuato: HT 39-45 → FT 102-61 (home won by 41 after trailing by 6, a 47-point swing)
   - Utah vs Memphis: HT 65-64 → FT 103-140 (home lost by 37 after leading by 1, a 38-point swing)
   - These are statistically near-impossible without manipulation

3. **Favorite Loss Rate:**
   - The pre-match favorite lost in 42.7% of reversals (expected: ~25-30%)

#### Impact on Model

- HT/FT model accuracy: **20.3%** (low due to reversal noise)
- Basketball reversal data creates **training noise**
- **Recommendation:** Either exclude basketball from HT/FT training or train a separate basketball-specific model

#### HT/FT Model Files

- **Training script:** `ai-engine/scripts/train_htft_vqwen.py`
- **Model output:** `ai-engine/models/xgboost/xgb_ht_ft.json` + `.pkl`
- **Features:** 27 (Odds + HT/FT Tendencies + League stats)
- **Status:** Working, outputs 9-class probabilities in `market_board.HTFT.probs`

---

## 17. 🐛 Lineup Parsing Fix (Since 2026-04-07)

### Problem

AI Engine reported `"lineup_unavailable"` and `"lineup_incomplete"` flags even when `live_matches.lineups` contained full 11/11 lineup data from Mackolik.

### Root Cause

Mackolik stores lineups in `"stats"` key format:

```json
{
  "stats": {
    "home": [{ "personId": "...", "position": "...", ... }, ...],
    "away": [{ "personId": "...", "position": "...", ... }, ...]
  }
}
```

But the parser expected `"xi"`, `"starting"`, or `"lineup"` keys at root level.

### Fix

Updated `_parse_lineups_json()` in `ai-engine/services/single_match_orchestrator.py`:

- Added fallback to check `lineups_json.get("stats")` for home/away arrays
- Now correctly parses Mackolik's nested format
- Result: `home_lineup_count: 11`, `away_lineup_count: 11`, `lineup_source: "confirmed_live"`

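
The fallback logic can be sketched as follows. The real `_parse_lineups_json()` is Python; this is a TypeScript rendering of the described behaviour, kept in the same language as the other examples in this file. It assumes that each recognised key holds an object with `home` and `away` arrays, which is confirmed above only for `"stats"`; all names are hypothetical.

```typescript
type PlayerEntry = Record<string, unknown>;

interface ParsedLineups {
  home: PlayerEntry[];
  away: PlayerEntry[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function asPlayerArray(value: unknown): PlayerEntry[] {
  return Array.isArray(value) ? (value as PlayerEntry[]) : [];
}

function parseLineups(lineupsJson: unknown): ParsedLineups {
  if (!isRecord(lineupsJson)) return { home: [], away: [] };
  // Previously supported root keys first, then the Mackolik "stats" fallback.
  for (const key of ["xi", "starting", "lineup", "stats"]) {
    const section = lineupsJson[key];
    if (isRecord(section)) {
      const home = asPlayerArray(section["home"]);
      const away = asPlayerArray(section["away"]);
      if (home.length > 0 || away.length > 0) return { home, away };
    }
  }
  return { home: [], away: [] };
}
```

With this ordering, existing payload formats keep their precedence and the `"stats"` branch is consulted only when none of them yields players.
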

---

## 18. Docker Deployment

```yaml
# docker-compose.yml services:
services:
  app:        # NestJS (port 3000→3000)
  postgres:   # PostgreSQL 17 Alpine (port 15432:5432)
  redis:      # Redis 7 Alpine (port 6379)
  adminer:    # Database UI (dev profile, port 8080)
  ai-engine:  # Python FastAPI (port 8002:8000)
```

---

_This file is maintained for AI agent context. Update when architecture or conventions change._