first (part 1: root files)
Deploy Iddaai Backend / build-and-deploy (push) Failing after 4s

2026-04-16 15:09:10 +03:00
parent b4173c10bb
commit 7814e0bc6b
38 changed files with 18494 additions and 0 deletions
@@ -0,0 +1,517 @@
# Suggest-Bet-BE — AI Agent Context
> **Last Updated:** 2026-04-06
> **Purpose:** Comprehensive project reference for AI agents working on this codebase.
---
## 1. Project Overview
**Suggest-Bet-BE** is an **AI-powered sports betting prediction platform** backend. It provides:
- AI-driven predictions for football & basketball matches
- Smart coupon generation (SAFE, BALANCED, AGGRESSIVE, VALUE, MIRACLE strategies)
- Live score tracking & odds monitoring
- Web scraping from Mackolik.com for historical & live match data
- Google Gemini AI for natural language match commentary
- User coupon tracking (ROI, Win Rate analytics)
### Technology Stack
| Layer | Technology |
| ----------- | -------------------------------------------- |
| Backend API | NestJS 11 (TypeScript) |
| AI Engine   | Python FastAPI (model v20+)                  |
| Database | PostgreSQL 16 + Prisma ORM |
| Queue | BullMQ + Redis (optional) |
| Cache | Redis or in-memory fallback |
| Auth | JWT + Passport (Access 15min + Refresh 7day) |
| Scraping | Axios + Cheerio (Mackolik HTML parsing) |
| Logging | Pino (structured logging) |
| i18n | nestjs-i18n (TR, EN) |
| API Docs | Swagger |
| Deploy | Docker Compose |
---
## 2. Architecture
```
┌──────────────────────────────────────────────────────────────────┐
│ CLIENTS (Web/Mobile) │
└───────────────────────────────┬──────────────────────────────────┘
│ HTTP/REST
┌───────────────────────────────▼──────────────────────────────────┐
│ NestJS Backend (Port 3005) │
│ ┌─────────┬──────────┬──────────┬──────────┬─────────────────┐ │
│ │ Auth │ Admin │ Matches │ Leagues │ Predictions │ │
│ │ Module │ Module │ Module │ Module │ Module │ │
│ ├─────────┼──────────┼──────────┼──────────┼─────────────────┤ │
│ │ Coupons │ Analysis │ Gemini │ Social- │ Health │ │
│ │ Module │ Module │ Module │ Poster │ Module │ │
│ │SporToto │ Feeder │ Users │ │ │ │
│ └─────────┴──────────┴──────────┴──────────┴─────────────────┘ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ Services: AiService | MatchAnalysis | Scraper │ │
│ ├──────────────────────────────────────────────────────────────┤ │
│ │ Tasks: DataFetcher (Cron) | LiveUpdater | LimitResetter │ │
│ └──────────────────────────────────────────────────────────────┘ │
└────┬─────────────────┬────────────────────┬──────────────────────┘
│ │ │
▼ ▼ ▼
┌─────────┐ ┌──────────────┐ ┌──────────────────┐
│PostgreSQL│ │ Redis/BullMQ │ │ AI Engine (py) │
│ (3.6GB) │ │ (Optional) │ │ FastAPI:8000 │
└─────────┘ └──────────────┘ └──────────────────┘
┌───────▼───────┐
│ Mackolik API │
│ (Data Source) │
└───────────────┘
```
### Database Statistics (approximate)
- `matches`: 237K permanent match records
- `live_matches`: ~82 active/upcoming matches (daily cycle)
- `match_player_participation`: 3.3M
- `odd_selections`: 8.5M
- `teams`: 19,595 | `players`: 217K | `leagues`: 1,505
---
## 3. Directory Structure
```
src/
├── app.module.ts # Root module (Redis, Config, i18n, guards)
├── main.ts # Entry point, Swagger, Helmet, ValidationPipe
├── common/ # Shared layer
│ ├── base/ # Generic BaseService<T> & BaseController<T>
│ ├── types/ # ApiResponse<T>, pagination DTOs
│ ├── filters/ # GlobalExceptionFilter (HTTP 200 wrapper)
│ ├── interceptors/ # ResponseInterceptor, SanitizeInterceptor
│ ├── decorators/ # @Public(), @Roles(), @CurrentUser()
│ └── queues/ # BullMQ queue module
├── config/ # Env validation (Zod), config factories
├── database/ # PrismaService
├── i18n/ # TR/EN translations (common, errors, validation, auth)
├── modules/ # 13 feature modules
│ ├── admin/ # Superadmin panel (user mgmt, settings, analytics)
│ ├── analysis/ # Multi-match analysis orchestration
│ ├── auth/ # JWT auth, refresh tokens, guards
│ ├── coupons/ # SmartCouponService (5 strategies), UserCouponService
│ ├── feeder/ # Historical data scraping (Mackolik)
│ ├── gemini/ # Google Gemini AI integration
│ ├── health/ # Liveness, readiness, AI Engine health
│ ├── leagues/ # Country/league/team discovery, H2H
│ ├── matches/ # Match listing, details, active leagues
│ ├── predictions/ # AI predictions with BullMQ queue & 6h cache
│ ├── social-poster/ # Twitter API v2, Canvas image generation
│ ├── spor-toto/ # Spor Toto integration
│ └── users/ # User CRUD (BaseController pattern)
├── scripts/ # Feeder runners, cleanup scripts
├── services/ # Shared services
│ ├── ai.service.ts # Python AI Engine bridge
│ ├── match-analysis.service.ts # 7-phase analysis orchestrator
│ └── scraper.service.ts # Mackolik HTML scraping
└── tasks/ # Cron jobs (15min, 30min, daily)
├── data-fetcher.task.ts # Live matches, odds fetching
├── live-updater.task.ts # Score updates, match finalization
└── limit-resetter.task.ts # Usage limits, subscription expiry
ai-engine/ # Python FastAPI ML engine
├── main.py # FastAPI app, routes
├── services/ # single_match_orchestrator.py
├── core/ # Core algorithms
├── features/ # Feature engineering
├── models/ # ML models
├── training/ # Model training scripts
├── config/ # Configuration
├── utils/ # Utility functions
└── tests/ # Test files
```
---
## 4. Key Modules
### Auth Module
- Register, Login, Refresh, Logout endpoints
- bcrypt (12 rounds), JWT Access (15min) + Refresh Token (7 days, DB-stored)
- Global guards: `JwtAuthGuard`, `RolesGuard`, `PermissionsGuard`
### Predictions Module
- Requires Redis (`REDIS_ENABLED=true`), conditionally loaded
- BullMQ queue with worker processor
- 6-hour TTL cache on prediction results
- AI Engine call: `POST /v20plus/analyze/{matchId}`
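
The 6-hour TTL behaviour can be illustrated with a minimal in-memory sketch. `TtlCache` and its method names are hypothetical and not taken from the codebase; the real module caches through Redis.

```typescript
// Hypothetical in-memory TTL cache; the real Predictions module uses Redis.
const SIX_HOURS_MS = 6 * 60 * 60 * 1000;

class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;

  constructor(ttlMs: number = SIX_HOURS_MS) {
    this.ttlMs = ttlMs;
  }

  // `now` is injectable so expiry can be exercised without waiting.
  set(key: string, value: V, now: number = Date.now()): void {
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (now >= entry.expiresAt) {
      this.entries.delete(key); // lazily evict expired entries
      return undefined;
    }
    return entry.value;
  }
}
```

A cache miss (or an expired entry) is the point at which a new job would be queued for the AI Engine.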
### Coupons Module
- `SmartCouponService`: 5 strategies (SAFE ≥78% confidence/2 matches, BALANCED, AGGRESSIVE, VALUE EV+, MIRACLE)
- `UserCouponService`: Coupon creation, bet settlement for MS 1/X/2 (match result), Alt/Üst (under/over), and KG Var/Yok (both teams to score: yes/no)
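
Settlement for the three markets can be sketched as below. The market and pick identifiers (`MS`, `ALT_UST`, `KG`, `UST`, `VAR`, ...) and the default 2.5 goal line are assumptions for illustration; the actual `UserCouponService` may encode them differently.

```typescript
// Hypothetical types; identifiers are illustrative, not the codebase's.
type Market = "MS" | "ALT_UST" | "KG";
interface Score { home: number; away: number }

// Returns true when the pick wins. The 2.5 goal line is an assumption.
function settleBet(market: Market, pick: string, ft: Score, line: number = 2.5): boolean {
  switch (market) {
    case "MS": {
      // Match result: "1" home win, "X" draw, "2" away win.
      const result = ft.home > ft.away ? "1" : ft.home < ft.away ? "2" : "X";
      return pick === result;
    }
    case "ALT_UST": {
      // Under/over on total goals.
      const total = ft.home + ft.away;
      return pick === (total > line ? "UST" : "ALT");
    }
    case "KG": {
      // Both teams to score: "VAR" (yes) or "YOK" (no).
      const bothScored = ft.home > 0 && ft.away > 0;
      return pick === (bothScored ? "VAR" : "YOK");
    }
  }
}
```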
### Feeder Module
- Historical scraping from 2023-06-01 to present (reverse chronological)
- Concurrency of 20, 300 ms delay, up to 50 retries, exponential backoff on HTTP 502 responses
- Resume support with state management
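
A rough sketch of the retry policy follows. Using the 300 ms delay as the backoff base, the doubling factor, and the 30 s ceiling are all assumptions; only the 50-retry cap and the 502 trigger come from the list above. `withRetry`, `isBadGateway`, and `sleep` are hypothetical names.

```typescript
// Assumed parameters: base, factor, and ceiling are illustrative.
const BASE_DELAY_MS = 300;
const MAX_RETRIES = 50;
const MAX_DELAY_MS = 30000;

// Delay before retry number `attempt` (1-based): 300, 600, 1200, ... capped.
function backoffDelayMs(attempt: number): number {
  return Math.min(BASE_DELAY_MS * Math.pow(2, attempt - 1), MAX_DELAY_MS);
}

// Retries `task` while it fails with HTTP 502, sleeping between attempts.
// `sleep` is injected so callers (and tests) control the clock.
async function withRetry<T>(
  task: () => Promise<T>,
  isBadGateway: (err: unknown) => boolean,
  sleep: (ms: number) => Promise<void>,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      // Give up on non-502 errors or once 50 retries have been spent.
      if (!isBadGateway(err) || attempt > MAX_RETRIES) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```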
### Analysis Module
- Daily usage limits: Free (10 analyses, 3 coupons) vs Premium (50 analyses, 10 coupons)
- 7-phase flow: URL Parse → Scrape → Python Engine → Strategy → Similar Matches → Final Prediction → DB Save
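
The quota check could look roughly like this. The type and function names are hypothetical, and treating the Premium numbers as per-day quotas is an assumption.

```typescript
type Plan = "FREE" | "PREMIUM";
type Action = "analysis" | "coupon";

// Quotas from the limits listed above (Premium assumed to be daily as well).
const DAILY_LIMITS: Record<Plan, Record<Action, number>> = {
  FREE: { analysis: 10, coupon: 3 },
  PREMIUM: { analysis: 50, coupon: 10 },
};

// `usedToday` is the count already consumed since the last daily reset.
function canPerform(plan: Plan, action: Action, usedToday: number): boolean {
  return usedToday < DAILY_LIMITS[plan][action];
}
```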
### Social Poster Module
- Twitter API v2 integration
- Canvas-based prediction card image generation
- Gemini-powered Turkish caption generation
---
## 5. Scheduled Tasks (Cron)
| Task | Schedule | Description |
| --------------------------- | -------------- | -------------------------------------------------------- |
| `fetchLiveMatches()` | `*/15 * * * *` | Fetch football matches from Mackolik API |
| `fetchOddsForPreMatches()` | `*/15 * * * *` | Fetch odds for upcoming matches (football + basketball) |
| `fetchBasketballMatches()` | Manual | Basketball data via `basketball_top_leagues.json` filter |
| `updateLiveScores()` | `*/15 * * * *` | Update live match scores |
| `finalizeFinishedMatches()` | `*/30 * * * *` | Migrate finished: live_matches → matches table |
| `resetUsageLimits()` | `0 3 * * *` | Reset daily usage limits (03:00 Istanbul time) |
| `cleanupOldData()`          | `0 4 * * *`    | Delete AI logs older than 30 days and finished `live_matches` older than 1 day |
| `checkSubscriptions()` | `0 0 * * *` | Mark expired subscriptions |
---
## 6. AI Engine (Python FastAPI)
Independent microservice on port 8000.
### Endpoints
| Method | Path | Description |
| ------ | ---------------------------------- | ------------------------------- |
| POST | `/v20plus/analyze/{match_id}` | Single match analysis (main) |
| GET    | `/v20plus/analyze-htms/{match_id}` | Half-time/full-time analysis    |
| GET | `/v20plus/analyze-htft/{match_id}` | HT/FT probabilities |
| POST | `/v20plus/coupon` | Smart coupon generation |
| GET | `/v20plus/daily-banker` | Daily banker picks |
| GET | `/v20plus/reversal-watchlist` | Score reversal watchlist |
| GET | `/health` | Health check |
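
A small helper for building these URLs from `AI_ENGINE_URL` might look like the following. `aiEngineUrl` is a hypothetical name, not a function from `ai.service.ts`.

```typescript
// Hypothetical helper; endpoint paths are taken from the table above.
function aiEngineUrl(baseUrl: string, path: string, matchId?: string): string {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const resolved =
    matchId === undefined
      ? path
      : path.replace("{match_id}", encodeURIComponent(matchId));
  return `${base}${resolved}`;
}
```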
### Output Structure (`SingleMatchPredictionPackage`)
```typescript
{
model_version: "v20plus.X",
match_info: { match_id, match_name, home_team, away_team, league, match_date_ms },
data_quality: { label: "HIGH"|"MEDIUM"|"LOW", score, flags, lineup_counts },
risk: { level: "LOW"|"MEDIUM"|"HIGH"|"EXTREME", score, is_surprise_risk, warnings },
main_pick: { market, pick, probability, confidence, odds, bet_grade, edge },
value_pick: { ... },
bet_advice: { playable, suggested_stake_units, reason },
bet_summary: [{ market, pick, raw_confidence, calibrated_confidence, bet_grade }],
supporting_picks: [...],
aggressive_pick: { market, pick, probability, confidence, odds },
scenario_top5: [{ score, prob }],
score_prediction: { ft, ht, xg_home, xg_away, xg_total },
market_board: { ... },
reasoning_factors: string[],
ai_commentary: string // Turkish commentary from Gemini
}
```
---
## 7. API Response Format
All responses follow this standard structure:
```json
{
"success": true,
"status": 200,
"message": "İşlem başarıyla tamamlandı", // i18n translated
"data": { ... },
"errors": []
}
```
**Critical Rule:** Controllers must NEVER return raw Prisma entities. Always use Response DTOs with `@Exclude()` and `@Expose()` from `class-transformer`.
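
A minimal sketch of the envelope as a type, with two hypothetical builder helpers (`ok`, `fail`). The `errors: string[]` element type and `data: null` on failure are assumptions; only the five field names come from the example above.

```typescript
// Shape mirrors the JSON example above; helper names are hypothetical.
interface ApiResponse<T> {
  success: boolean;
  status: number;
  message: string;
  data: T | null;
  errors: string[];
}

function ok<T>(data: T, message: string): ApiResponse<T> {
  return { success: true, status: 200, message, data, errors: [] };
}

// The error status travels in the body; clients should check `success`
// and `status` rather than relying on the HTTP status code alone.
function fail(status: number, message: string, errors: string[] = []): ApiResponse<never> {
  return { success: false, status, message, data: null, errors };
}
```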
---
## 8. Configuration
### Environment Variables
```env
NODE_ENV=development
PORT=3005
DATABASE_URL=postgresql://user:password@localhost:15432/boilerplate_db
JWT_SECRET=your-secret-key
JWT_ACCESS_EXPIRATION=15m
JWT_REFRESH_EXPIRATION=7d
REDIS_ENABLED=false
REDIS_HOST=localhost
REDIS_PORT=6379
AI_ENGINE_URL=http://127.0.0.1:8000
ENABLE_GEMINI=false
GOOGLE_API_KEY=your-api-key
```
### Config Files
- `top_leagues.json` — Football top league IDs (live match filter)
- `basketball_top_leagues.json` — Basketball top league IDs
- `bet-type.json` — Bet type definitions
---
## 9. Build & Run Commands
```bash
# Development
npm run start:dev # Watch mode (port 3005)
# Production
npm run build && npm run start:prod
# Feeder (Data Collection)
npm run feeder:historical # Historical scraping (2023-06→present)
npm run feeder:fill-gaps # Fill missing data
npm run feeder:basketball # Basketball data
npm run feeder:live # Live data
# Database
npx prisma generate # Regenerate Prisma client
npx prisma migrate dev # Run migrations
npx prisma db seed # Seed database
# Testing
npm run test # Unit tests
npm run test:e2e # E2E tests
npx jest src/path/to/file.spec.ts # Single test file
# Lint/Format
npm run lint # ESLint with Prettier
npm run format # Prettier write
# Docker
docker-compose up -d postgres redis # Infrastructure
docker-compose up -d # All services
# AI Engine (Python)
cd ai-engine && uvicorn main:app --host 0.0.0.0 --port 8000 --reload
# Utility
npm run swagger:summary # Export endpoint summary
npm run cleanup:live # Cleanup live matches
```
---
## 10. Code Style Guidelines
### Imports Order
```typescript
// 1. NestJS/common imports
import { Controller, Get, Post, Body } from '@nestjs/common';
// 2. External packages
import * as bcrypt from 'bcrypt';
// 3. Local imports (relative)
import { UsersService } from './users.service';
```
### Naming Conventions
- Classes/Interfaces: `PascalCase`
- Variables/Functions: `camelCase`
- Constants: `UPPER_SNAKE_CASE`
- Files: `kebab-case`
- DTOs: `Entity + Dto` suffix (CreateUserDto, UpdateUserDto)
### Types
- `strictNullChecks: true` — null/undefined checks required
- `noImplicitAny: false` — `any` allowed (Prisma dynamic access)
- Specify function return types: `async findOne(id: string): Promise<User>`
### Error Handling
```typescript
// Use NestJS HTTP Exceptions with i18n keys
throw new NotFoundException('USER_NOT_FOUND');
throw new ConflictException('EMAIL_ALREADY_EXISTS');
// Reference src/i18n/{lang}/errors.json for available keys
```
---
## 11. Known Issues & Gotchas
1. **Predictions module** requires Redis. Disabled when `REDIS_ENABLED=false`.
2. **Gemini AI** is optional. Returns `null` commentary when disabled.
3. **Global Exception Filter** wraps all errors as HTTP 200 (status in body).
4. **Lineup scraping** is disabled — only Team Stats are used (V20 optimization).
5. **Feeder V17 AI feature calculation** is disabled — V20 model runs in Python.
6. **BigInt serialization**: `BigInt.prototype.toJSON = function() { return this.toString(); }` polyfill in main.ts.
7. **i18n assets** copied via `nest-cli.json` `"assets": ["i18n/**/*"]` config.
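
To illustrate item 6: without the polyfill, `JSON.stringify` throws a `TypeError` on any `bigint` value. The snippet below is a sketch of the effect; the `as any` cast is only there because `toJSON` is not declared on `BigInt` in the standard typings.

```typescript
// Same polyfill as quoted in item 6.
(BigInt.prototype as any).toJSON = function (this: bigint): string {
  return this.toString();
};

// BigInt() keeps full precision beyond Number.MAX_SAFE_INTEGER,
// and the value is serialized as a string.
const serialized = JSON.stringify({ id: BigInt("9007199254740993") });
```

Clients therefore receive such IDs as JSON strings, not numbers.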
---
## 12. Reference Files for AI Agents
When working on this project, consult:
- `project_summary.md` — Comprehensive project documentation (Turkish)
- `README.md` — Architecture decisions, quick start guide
- `prompt.md` — AI assistant reference guide with agent roles
- `AGENTS.md` — Coding guidelines, DTO patterns, test structure
- `.agent/` — Skills and agent role definitions
- `top_leagues.json` / `basketball_top_leagues.json` — League filters
---
## 13. Team Logos
Team logo URL template: `https://file.mackolikfeeds.com/teams/{teamId}`
---
## 14. 🆕 VQWEN Model Integration (Since 2026-04-06)
We have integrated a new high-performance prediction engine called **VQWEN v3**.
### VQWEN Model Features
- **Accuracy:** +244.4 Units profit in Time-Series Backtest (75.1% Win Rate on BTTS/Over markets).
- **Features Used:**
- `ELO Ratings` (Real-time team strength).
- `Contextual Goals` (Home/Away specific performance).
- `Rest Days` (Fatigue factor for teams with fewer than 3 days of rest).
- `H2H Win Rate` (Historical dominance).
- `Form Points` (Last 5 games streak).
- `Squad Strength` (Based on starting XI participation).
- **Files:**
- `ai-engine/scripts/train_vqwen_v3.py` — Training script.
- `ai-engine/services/single_match_orchestrator.py` — Integration point.
- `ai-engine/models/vqwen/` — Pickle models (`vqwen_ms.pkl`, etc.).
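
Two of the listed features, sketched under stated assumptions: form points use the conventional 3 per win and 1 per draw over the last 5 results, and rest days are whole days between kickoffs. The real feature code lives in the Python engine and may differ; all names here are hypothetical.

```typescript
type Outcome = "W" | "D" | "L";

// Form points over the last 5 results, assuming 3 per win and 1 per draw.
function formPoints(results: Outcome[]): number {
  return results
    .slice(-5)
    .reduce((sum, r) => sum + (r === "W" ? 3 : r === "D" ? 1 : 0), 0);
}

// Whole days between the previous kickoff and the next one (epoch milliseconds).
function restDays(previousKickoffMs: number, nextKickoffMs: number): number {
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.floor((nextKickoffMs - previousKickoffMs) / msPerDay);
}

// Fatigue flag: fewer than 3 days of rest.
function isFatigued(days: number): boolean {
  return days < 3;
}
```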
### New Live Lineup/Sidelined Fetcher
- **Problem:** `lineups` and `sidelined` columns in `live_matches` were empty.
- **Fix:** Added `updateLineupsAndSidelined()` method to `src/tasks/data-fetcher.task.ts`.
- **Mechanism:** Uses `FeederScraperService.fetchStartingFormation` directly via Cron (`*/15 * * * *`).
- **Status:** Active.
### Database Schema Updates
- **`substate` Column:** Added to `matches` table to track specific match states (e.g., "penalties", "overtime", "postponed").
- **Sport Partition:** Tables are now partitioned by sport (`football_team_stats` vs `basketball_team_stats`).
---
## 16. 🔍 HT/FT Reversal Analysis (Since 2026-04-07)
### HT/FT Reversal (1/2 & 2/1) Pattern Detection
Reversal matches (HT/FT, in Turkish İY/MS, of 1/2 or 2/1) are statistically rare events that can indicate match-fixing or unusual patterns.
#### Key Findings (147,248 matches analyzed)
| Metric | Value |
|--------|-------|
| **Total Reversal Matches** | 13,112 (8.90%) |
| **1/2 (Home leads HT, Away wins FT)** | 5,992 (4.07%) |
| **2/1 (Away leads HT, Home wins FT)** | 7,120 (4.84%) |
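
The classification behind these counts can be sketched as follows. The function and type names are hypothetical; the 1/2 and 2/1 definitions are the ones given in the table.

```typescript
interface ScoreLine { home: number; away: number }
type Reversal = "1/2" | "2/1" | null;

// "1/2": home leads at HT, away wins at FT. "2/1": the mirror case.
function classifyReversal(ht: ScoreLine, ft: ScoreLine): Reversal {
  if (ht.home > ht.away && ft.home < ft.away) return "1/2";
  if (ht.home < ht.away && ft.home > ft.away) return "2/1";
  return null; // draws at either stage are not reversals
}

// Percentage rounded to two decimals, as in the table above.
function ratePercent(count: number, total: number): number {
  return Math.round((count / total) * 10000) / 100;
}
```

Applied to the table, `ratePercent(13112, 147248)` reproduces the 8.90% total, and the two sub-counts sum to 13,112.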
#### 🚨 Basketball Leagues Have Suspiciously High Reversal Rates
| League | Reversals | Total | Rate |
|--------|-----------|-------|------|
| Eurobasket U20 | 36 | 120 | **30.00%** 🔴 |
| EuroLeague 🏀 | 183 | 639 | **28.64%** 🔴 |
| PBA Commissioners 🏀 | 54 | 189 | **28.57%** 🔴 |
| Ulusal Süper Lig 🏀 | 148 | 547 | **27.06%** 🔴 |
| NBA 🏀 | 656 | 2,696 | **24.33%** 🔴 |
**All top 15 leagues by reversal rate are BASKETBALL.** Football leagues show normal rates (5-8%).
#### Suspicious Patterns
1. **Comeback Magnitude:**
- 1 goal/point: 36.1% (normal)
- 2 goals/points: 13.1% (suspicious)
- **3+ goals/points: 50.8%** 🔴 **EXTREMELY HIGH**
2. **Extreme Comebacks (Basketball):**
- Mineros vs Irapuato: HT 39-45 → FT 102-61 (47-point swing, from 6 behind to 41 ahead!)
- Utah vs Memphis: HT 65-64 → FT 103-140 (38-point swing, from 1 ahead to 37 behind!)
- These are statistically near-impossible without manipulation
3. **Favorite Loss Rate:**
- 42.7% of reversals had the pre-match favorite lose (should be ~25-30%)
#### Impact on Model
- HT/FT model accuracy: **20.3%** (low due to reversal noise)
- Basketball reversal data creates **training noise**
- **Recommendation:** Either exclude basketball from HT/FT training or train a separate basketball-specific model
#### HT/FT Model Files
- **Training script:** `ai-engine/scripts/train_htft_vqwen.py`
- **Model output:** `ai-engine/models/xgboost/xgb_ht_ft.json` + `.pkl`
- **Features:** 27 (Odds + HT/FT Tendencies + League stats)
- **Status:** Working, outputs 9-class probabilities in `market_board.HTFT.probs`
---
## 17. 🐛 Lineup Parsing Fix (Since 2026-04-07)
### Problem
AI Engine reported `"lineup_unavailable"` and `"lineup_incomplete"` flags even when `live_matches.lineups` contained full 11/11 lineup data from Mackolik.
### Root Cause
Mackolik stores lineups in `"stats"` key format:
```json
{
"stats": {
"home": [{ "personId": "...", "position": "...", ... }, ...],
"away": [{ "personId": "...", "position": "...", ... }, ...]
}
}
```
But the parser expected `"xi"`, `"starting"`, or `"lineup"` keys at root level.
### Fix
Updated `_parse_lineups_json()` in `ai-engine/services/single_match_orchestrator.py`:
- Added fallback to check `lineups_json.get("stats")` for home/away arrays
- Now correctly parses Mackolik's nested format
- Result: `home_lineup_count: 11`, `away_lineup_count: 11`, `lineup_source: "confirmed_live"`
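
The actual fix is in the Python `_parse_lineups_json()`; the sketch below only restates the fallback logic in TypeScript. The names are hypothetical, and it assumes each candidate key holds an object with `home` and `away` arrays, which the source confirms only for `"stats"`.

```typescript
interface PlayerEntry { personId: string; position: string }
interface SideLineups { home: PlayerEntry[]; away: PlayerEntry[] }

// Type guard: an object carrying both `home` and `away` arrays.
function isSideLineups(value: unknown): value is SideLineups {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return Array.isArray(v.home) && Array.isArray(v.away);
}

// Tries the previously supported root keys first, then falls back to
// Mackolik's nested "stats" key.
function parseLineups(lineupsJson: Record<string, unknown>): SideLineups | null {
  for (const key of ["xi", "starting", "lineup", "stats"]) {
    const candidate = lineupsJson[key];
    if (isSideLineups(candidate)) return candidate;
  }
  return null;
}
```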
---
## 18. Docker Deployment
```yaml
# docker-compose.yml services:
services:
app: # NestJS (port 3000→3000)
postgres: # PostgreSQL 17 Alpine (port 15432:5432)
redis: # Redis 7 Alpine (port 6379)
adminer: # Database UI (dev profile, port 8080)
ai-engine: # Python FastAPI (port 8002:8000)
```
---
_This file is maintained for AI agent context. Update when architecture or conventions change._