
# Suggest-Bet-BE — AI Agent Context
> **Last Updated:** 2026-04-06
> **Purpose:** Comprehensive project reference for AI agents working on this codebase.
---
## 1. Project Overview
**Suggest-Bet-BE** is an **AI-powered sports betting prediction platform** backend. It provides:
- AI-driven predictions for football & basketball matches
- Smart coupon generation (SAFE, BALANCED, AGGRESSIVE, VALUE, MIRACLE strategies)
- Live score tracking & odds monitoring
- Web scraping from Mackolik.com for historical & live match data
- Google Gemini AI for natural language match commentary
- User coupon tracking (ROI, Win Rate analytics)
### Technology Stack
| Layer | Technology |
| ----------- | -------------------------------------------- |
| Backend API | NestJS 11 (TypeScript) |
| AI Engine | Python FastAPI (v20+ prediction model) |
| Database | PostgreSQL 16 + Prisma ORM |
| Queue | BullMQ + Redis (optional) |
| Cache | Redis or in-memory fallback |
| Auth | JWT + Passport (Access 15min + Refresh 7day) |
| Scraping | Axios + Cheerio (Mackolik HTML parsing) |
| Logging | Pino (structured logging) |
| i18n | nestjs-i18n (TR, EN) |
| API Docs | Swagger |
| Deploy | Docker Compose |
---
## 2. Architecture
```
┌────────────────────────────────────────────────────────────────┐
│                      CLIENTS (Web/Mobile)                      │
└───────────────────────────────┬────────────────────────────────┘
                                │ HTTP/REST
┌───────────────────────────────▼────────────────────────────────┐
│                   NestJS Backend (Port 3005)                   │
│ ┌─────────┬──────────┬──────────┬──────────┬─────────────────┐ │
│ │ Auth    │ Admin    │ Matches  │ Leagues  │ Predictions     │ │
│ │ Module  │ Module   │ Module   │ Module   │ Module          │ │
│ ├─────────┼──────────┼──────────┼──────────┼─────────────────┤ │
│ │ Coupons │ Analysis │ Gemini   │ Social-  │ Health          │ │
│ │ Module  │ Module   │ Module   │ Poster   │ Module          │ │
│ │SporToto │ Feeder   │ Users    │          │                 │ │
│ └─────────┴──────────┴──────────┴──────────┴─────────────────┘ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ Services: AiService | MatchAnalysis | Scraper              │ │
│ ├────────────────────────────────────────────────────────────┤ │
│ │ Tasks: DataFetcher (Cron) | LiveUpdater | LimitResetter    │ │
│ └────────────────────────────────────────────────────────────┘ │
└────┬─────────────────┬────────────────────┬────────────────────┘
     │                 │                    │
     ▼                 ▼                    ▼
┌────────────┐  ┌──────────────┐  ┌──────────────────┐
│ PostgreSQL │  │ Redis/BullMQ │  │  AI Engine (py)  │
│  (3.6GB)   │  │  (Optional)  │  │   FastAPI:8000   │
└────────────┘  └──────────────┘  └──────────────────┘
                        ┌───────▼───────┐
                        │ Mackolik API  │
                        │ (Data Source) │
                        └───────────────┘
```
### Database Statistics (approximate)
- `matches`: 237K permanent match records
- `live_matches`: ~82 active/upcoming matches (daily cycle)
- `match_player_participation`: 3.3M
- `odd_selections`: 8.5M
- `teams`: 19,595 | `players`: 217K | `leagues`: 1,505
---
## 3. Directory Structure
```
src/
├── app.module.ts                   # Root module (Redis, Config, i18n, guards)
├── main.ts                         # Entry point, Swagger, Helmet, ValidationPipe
├── common/                         # Shared layer
│   ├── base/                       # Generic BaseService<T> & BaseController<T>
│   ├── types/                      # ApiResponse<T>, pagination DTOs
│   ├── filters/                    # GlobalExceptionFilter (HTTP 200 wrapper)
│   ├── interceptors/               # ResponseInterceptor, SanitizeInterceptor
│   ├── decorators/                 # @Public(), @Roles(), @CurrentUser()
│   └── queues/                     # BullMQ queue module
├── config/                         # Env validation (Zod), config factories
├── database/                       # PrismaService
├── i18n/                           # TR/EN translations (common, errors, validation, auth)
├── modules/                        # 13 feature modules
│   ├── admin/                      # Superadmin panel (user mgmt, settings, analytics)
│   ├── analysis/                   # Multi-match analysis orchestration
│   ├── auth/                       # JWT auth, refresh tokens, guards
│   ├── coupons/                    # SmartCouponService (5 strategies), UserCouponService
│   ├── feeder/                     # Historical data scraping (Mackolik)
│   ├── gemini/                     # Google Gemini AI integration
│   ├── health/                     # Liveness, readiness, AI Engine health
│   ├── leagues/                    # Country/league/team discovery, H2H
│   ├── matches/                    # Match listing, details, active leagues
│   ├── predictions/                # AI predictions with BullMQ queue & 6h cache
│   ├── social-poster/              # Twitter API v2, Canvas image generation
│   ├── spor-toto/                  # Spor Toto integration
│   └── users/                      # User CRUD (BaseController pattern)
├── scripts/                        # Feeder runners, cleanup scripts
├── services/                       # Shared services
│   ├── ai.service.ts               # Python AI Engine bridge
│   ├── match-analysis.service.ts   # 7-phase analysis orchestrator
│   └── scraper.service.ts          # Mackolik HTML scraping
└── tasks/                          # Cron jobs (15min, 30min, daily)
    ├── data-fetcher.task.ts        # Live matches, odds fetching
    ├── live-updater.task.ts        # Score updates, match finalization
    └── limit-resetter.task.ts      # Usage limits, subscription expiry
ai-engine/                          # Python FastAPI ML engine
├── main.py                         # FastAPI app, routes
├── services/                       # single_match_orchestrator.py
├── core/                           # Core algorithms
├── features/                       # Feature engineering
├── models/                         # ML models
├── training/                       # Model training scripts
├── config/                         # Configuration
├── utils/                          # Utility functions
└── tests/                          # Test files
```
---
## 4. Key Modules
### Auth Module
- Register, Login, Refresh, Logout endpoints
- bcrypt (12 rounds), JWT Access (15min) + Refresh Token (7 days, DB-stored)
- Global guards: `JwtAuthGuard`, `RolesGuard`, `PermissionsGuard`
### Predictions Module
- Requires Redis (`REDIS_ENABLED=true`), conditionally loaded
- BullMQ queue with worker processor
- 6-hour TTL cache on prediction results
- AI Engine call: `POST /v20plus/analyze/{matchId}`
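The 6-hour TTL on prediction results can be pictured with a small in-memory sketch. The class name `TtlCache`, the injectable clock, and the key format in the usage note are illustrative assumptions; this is not the module's actual cache code, and the real backend may differ.

```typescript
// Minimal TTL cache sketch (hypothetical names, not the project's implementation).
const SIX_HOURS_MS = 6 * 60 * 60 * 1000;

class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be exercised without waiting six hours.
  constructor(ttlMs: number = SIX_HOURS_MS, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // lazily evict expired entries on read
      return undefined;
    }
    return entry.value;
  }
}
```

A caller would check `cache.get("prediction:" + matchId)` first and only enqueue a new analysis job on a miss; the key format here is made up for illustration.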
### Coupons Module
- `SmartCouponService`: 5 strategies (SAFE ≥78% confidence/2 matches, BALANCED, AGGRESSIVE, VALUE EV+, MIRACLE)
- `UserCouponService`: Coupon creation and bet settlement for match result (MS 1/X/2), over/under (Alt/Üst), and both teams to score (KG Var/Yok)
### Feeder Module
- Historical scraping from 2023-06-01 to present (reverse chronological)
- Concurrency of 20, 300 ms delay, up to 50 retries, exponential backoff on HTTP 502 responses
- Resume support with state management
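As a rough illustration of the retry behaviour listed above, the sketch below retries a request only when it returns HTTP 502 and doubles the wait each time. The function names, the `baseDelayMs` and `maxDelayMs` parameters, and the doubling formula are assumptions; only the limit of 50 retries comes from this document.

```typescript
// Sketch of retry-with-exponential-backoff for HTTP 502 responses.
// Names and the delay formula are assumptions, not the feeder's actual code.
interface RetryOptions {
  maxRetries: number; // this document states 50
  baseDelayMs: number; // assumed starting delay
  maxDelayMs: number; // assumed cap so waits stay bounded
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

function backoffDelay(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  // attempt 0 -> base, 1 -> 2x base, 2 -> 4x base, ... capped at maxDelayMs
  return Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
}

async function fetchWithBackoff<T>(
  request: () => Promise<{ status: number; body: T }>,
  opts: RetryOptions,
): Promise<T> {
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; attempt <= opts.maxRetries; attempt++) {
    const res = await request();
    // Only 502 triggers a retry; any other status is handed back unchanged.
    if (res.status !== 502) return res.body;
    if (attempt < opts.maxRetries) {
      await sleep(backoffDelay(attempt, opts.baseDelayMs, opts.maxDelayMs));
    }
  }
  throw new Error(`Still receiving 502 after ${opts.maxRetries} retries`);
}
```

The cap matters with a retry budget this large: without `maxDelayMs`, the doubled delay would grow far beyond any practical wait long before the 50th attempt.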
### Analysis Module
- Usage limits: Free (10 analyses/3 coupons/day) vs Premium (50 analyses/10 coupons)
- 7-phase flow: URL Parse → Scrape → Python Engine → Strategy → Similar Matches → Final Prediction → DB Save
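The usage limits above amount to a small lookup table plus a comparison. The sketch below is one way to express that; the type names and the `canPerform` helper are hypothetical, and it assumes the Premium figures are also per-day quotas.

```typescript
// Sketch of the daily quota check (hypothetical names).
type Plan = "FREE" | "PREMIUM";
type Action = "analysis" | "coupon";

// Quotas as listed above; Premium values are assumed to be per day as well.
const DAILY_LIMITS: Record<Plan, Record<Action, number>> = {
  FREE: { analysis: 10, coupon: 3 },
  PREMIUM: { analysis: 50, coupon: 10 },
};

interface DailyUsage {
  analysis: number;
  coupon: number;
}

// True while the user is still strictly below the quota for this action.
function canPerform(plan: Plan, action: Action, usedToday: DailyUsage): boolean {
  return usedToday[action] < DAILY_LIMITS[plan][action];
}
```

With this shape, the daily reset task only needs to zero the `DailyUsage` counters; the limits themselves never change at runtime.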
### Social Poster Module
- Twitter API v2 integration
- Canvas-based prediction card image generation
- Gemini-powered Turkish caption generation
---
## 5. Scheduled Tasks (Cron)
| Task | Schedule | Description |
| --------------------------- | -------------- | -------------------------------------------------------- |
| `fetchLiveMatches()` | `*/15 * * * *` | Fetch football matches from Mackolik API |
| `fetchOddsForPreMatches()` | `*/15 * * * *` | Fetch odds for upcoming matches (football + basketball) |
| `fetchBasketballMatches()` | Manual | Basketball data via `basketball_top_leagues.json` filter |
| `updateLiveScores()` | `*/15 * * * *` | Update live match scores |
| `finalizeFinishedMatches()` | `*/30 * * * *` | Migrate finished: live_matches → matches table |
| `resetUsageLimits()` | `0 3 * * *` | Reset daily usage limits (03:00 Istanbul time) |
| `cleanupOldData()` | `0 4 * * *` | Delete AI logs older than 30 days and finished live_matches older than 1 day |
| `checkSubscriptions()` | `0 0 * * *` | Mark expired subscriptions |
---
## 6. AI Engine (Python FastAPI)
Independent microservice on port 8000.
### Endpoints
| Method | Path | Description |
| ------ | ---------------------------------- | ------------------------------- |
| POST | `/v20plus/analyze/{match_id}` | Single match analysis (main) |
| GET | `/v20plus/analyze-htms/{match_id}` | First half - Full time analysis |
| GET | `/v20plus/analyze-htft/{match_id}` | HT/FT probabilities |
| POST | `/v20plus/coupon` | Smart coupon generation |
| GET | `/v20plus/daily-banker` | Daily banker picks |
| GET | `/v20plus/reversal-watchlist` | Score reversal watchlist |
| GET | `/health` | Health check |
### Output Structure (`SingleMatchPredictionPackage`)
```typescript
{
  model_version: "v20plus.X",
  match_info: { match_id, match_name, home_team, away_team, league, match_date_ms },
  data_quality: { label: "HIGH"|"MEDIUM"|"LOW", score, flags, lineup_counts },
  risk: { level: "LOW"|"MEDIUM"|"HIGH"|"EXTREME", score, is_surprise_risk, warnings },
  main_pick: { market, pick, probability, confidence, odds, bet_grade, edge },
  value_pick: { ... },
  bet_advice: { playable, suggested_stake_units, reason },
  bet_summary: [{ market, pick, raw_confidence, calibrated_confidence, bet_grade }],
  supporting_picks: [...],
  aggressive_pick: { market, pick, probability, confidence, odds },
  scenario_top5: [{ score, prob }],
  score_prediction: { ft, ht, xg_home, xg_away, xg_total },
  market_board: { ... },
  reasoning_factors: string[],
  ai_commentary: string // Turkish commentary from Gemini
}
```
---
## 7. API Response Format
All responses follow this standard structure:
```json
{
  "success": true,
  "status": 200,
  "message": "İşlem başarıyla tamamlandı", // i18n translated ("Operation completed successfully")
  "data": { ... },
  "errors": []
}
```
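For orientation, the envelope above could be typed and built roughly as follows. This is a sketch: the field types (in particular `errors` as `string[]`) and the helper names `ok` and `fail` are assumptions, not the definitions in `common/types`.

```typescript
// Sketch of the standard response envelope (assumed field types).
interface ApiResponse<T> {
  success: boolean;
  status: number;
  message: string; // already translated by the i18n layer
  data: T | null;
  errors: string[];
}

// Hypothetical helper for successful responses.
function ok<T>(data: T, message: string, status = 200): ApiResponse<T> {
  return { success: true, status, message, data, errors: [] };
}

// Hypothetical helper for failures; the real status travels in the body.
function fail(status: number, message: string, errors: string[] = []): ApiResponse<never> {
  return { success: false, status, message, data: null, errors };
}
```

Because the status lives in the body, clients should branch on `success` and `status` from the payload rather than relying on the HTTP status line alone.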
**Critical Rule:** Controllers must NEVER return raw Prisma entities. Always use Response DTOs with `@Exclude()` and `@Expose()` from `class-transformer`.
---
## 8. Configuration
### Environment Variables
```env
NODE_ENV=development
PORT=3005
DATABASE_URL=postgresql://user:password@localhost:15432/boilerplate_db
JWT_SECRET=your-secret-key
JWT_ACCESS_EXPIRATION=15m
JWT_REFRESH_EXPIRATION=7d
REDIS_ENABLED=false
REDIS_HOST=localhost
REDIS_PORT=6379
AI_ENGINE_URL=http://127.0.0.1:8000
ENABLE_GEMINI=false
GOOGLE_API_KEY=your-api-key
```
### Config Files
- `top_leagues.json` — Football top league IDs (live match filter)
- `basketball_top_leagues.json` — Basketball top league IDs
- `bet-type.json` — Bet type definitions
---
## 9. Build & Run Commands
```bash
# Development
npm run start:dev # Watch mode (port 3005)
# Production
npm run build && npm run start:prod
# Feeder (Data Collection)
npm run feeder:historical # Historical scraping (2023-06→present)
npm run feeder:fill-gaps # Fill missing data
npm run feeder:basketball # Basketball data
npm run feeder:live # Live data
# Database
npx prisma generate # Regenerate Prisma client
npx prisma migrate dev # Run migrations
npx prisma db seed # Seed database
# Testing
npm run test # Unit tests
npm run test:e2e # E2E tests
npx jest src/path/to/file.spec.ts # Single test file
# Lint/Format
npm run lint # ESLint with Prettier
npm run format # Prettier write
# Docker
docker-compose up -d postgres redis # Infrastructure
docker-compose up -d # All services
# AI Engine (Python)
cd ai-engine && uvicorn main:app --host 0.0.0.0 --port 8000 --reload
# Utility
npm run swagger:summary # Export endpoint summary
npm run cleanup:live # Cleanup live matches
```
---
## 10. Code Style Guidelines
### Imports Order
```typescript
// 1. NestJS/common imports
import { Controller, Get, Post, Body } from '@nestjs/common';
// 2. External packages
import * as bcrypt from 'bcrypt';
// 3. Local imports (relative)
import { UsersService } from './users.service';
```
### Naming Conventions
- Classes/Interfaces: `PascalCase`
- Variables/Functions: `camelCase`
- Constants: `UPPER_SNAKE_CASE`
- Files: `kebab-case`
- DTOs: `Entity + Dto` suffix (CreateUserDto, UpdateUserDto)
### Types
- `strictNullChecks: true` — null/undefined checks required
- `noImplicitAny: false`, so `any` is allowed (Prisma dynamic access)
- Specify function return types: `async findOne(id: string): Promise<User>`
### Error Handling
```typescript
// Use NestJS HTTP Exceptions with i18n keys
throw new NotFoundException('USER_NOT_FOUND');
throw new ConflictException('EMAIL_ALREADY_EXISTS');
// Reference src/i18n/{lang}/errors.json for available keys
```
---
## 11. Known Issues & Gotchas
1. **Predictions module** requires Redis. Disabled when `REDIS_ENABLED=false`.
2. **Gemini AI** is optional. Returns `null` commentary when disabled.
3. **Global Exception Filter** wraps all errors as HTTP 200 (status in body).
4. **Lineup scraping** is disabled — only Team Stats are used (V20 optimization).
5. **Feeder V17 AI feature calculation** is disabled — V20 model runs in Python.
6. **BigInt serialization**: `BigInt.prototype.toJSON = function() { return this.toString(); }` polyfill in main.ts.
7. **i18n assets** copied via `nest-cli.json` `"assets": ["i18n/**/*"]` config.
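The BigInt polyfill in item 6 exists because `JSON.stringify` throws a `TypeError` on `bigint` values unless a `toJSON` method is present. A minimal, typed version of the same idea is sketched below; the `row` object is a made-up example, and the cast is only there because the standard typings do not declare `BigInt.prototype.toJSON`.

```typescript
// Teach JSON.stringify to emit bigint values as decimal strings.
(BigInt.prototype as unknown as { toJSON: (this: bigint) => string }).toJSON = function (
  this: bigint,
): string {
  return this.toString();
};

// Hypothetical row with an ID too large for a JavaScript number.
const row = { id: BigInt("9007199254740993"), name: "example" };
const json = JSON.stringify(row);
// json is '{"id":"9007199254740993","name":"example"}'
```

Serializing as a string keeps every digit; converting to `Number` instead would silently lose precision above 2^53.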
---
## 12. Reference Files for AI Agents
When working on this project, consult:
- `project_summary.md` — Comprehensive project documentation (Turkish)
- `README.md` — Architecture decisions, quick start guide
- `prompt.md` — AI assistant reference guide with agent roles
- `AGENTS.md` — Coding guidelines, DTO patterns, test structure
- `.agent/` — Skills and agent role definitions
- `top_leagues.json` / `basketball_top_leagues.json` — League filters
---
## 13. Team Logos
Team logo URL template: `https://file.mackolikfeeds.com/teams/{teamId}`
---
## 14. 🆕 VQWEN Model Integration (Since 2026-04-06)
We have integrated a new high-performance prediction engine called **VQWEN v3**.
### VQWEN Model Features
- **Accuracy:** +244.4 Units profit in Time-Series Backtest (75.1% Win Rate on BTTS/Over markets).
- **Features Used:**
  - `ELO Ratings` (Real-time team strength).
  - `Contextual Goals` (Home/Away specific performance).
  - `Rest Days` (Fatigue factor for teams with fewer than 3 days between matches).
  - `H2H Win Rate` (Historical dominance).
  - `Form Points` (Last 5 games streak).
  - `Squad Strength` (Based on starting XI participation).
- **Files:**
  - `ai-engine/scripts/train_vqwen_v3.py` — Training script.
  - `ai-engine/services/single_match_orchestrator.py` — Integration point.
  - `ai-engine/models/vqwen/` — Pickle models (`vqwen_ms.pkl`, etc.).
### New Live Lineup/Sidelined Fetcher
- **Problem:** `lineups` and `sidelined` columns in `live_matches` were empty.
- **Fix:** Added `updateLineupsAndSidelined()` method to `src/tasks/data-fetcher.task.ts`.
- **Mechanism:** Uses `FeederScraperService.fetchStartingFormation` directly via Cron (`*/15 * * * *`).
- **Status:** Active.
### Database Schema Updates
- **`substate` Column:** Added to `matches` table to track specific match states (e.g., "penalties", "overtime", "postponed").
- **Sport Partition:** Tables are now partitioned by sport (`football_team_stats` vs `basketball_team_stats`).
---
## 16. 🔍 HT/FT Reversal Analysis (Since 2026-04-07)
### HT/FT Reversal (1/2 & 2/1) Pattern Detection
Reversal matches (İY/MS = 1/2 or 2/1, where İY/MS is the Turkish abbreviation for half-time/full-time) are statistically rare events that can indicate match-fixing or unusual patterns.
#### Key Findings (147,248 matches analyzed)
| Metric | Value |
|--------|-------|
| **Total Reversal Matches** | 13,112 (8.90%) |
| **1/2 (Home leads HT, Away wins FT)** | 5,992 (4.07%) |
| **2/1 (Away leads HT, Home wins FT)** | 7,120 (4.84%) |
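To make the labels in this table concrete, the sketch below derives an HT/FT label from the half-time and full-time scores and recomputes the percentages. The function names are illustrative and not taken from the training scripts.

```typescript
// Sketch: derive HT/FT labels and reversal rates (hypothetical helper names).
type Outcome = "1" | "X" | "2";

// "1" = home ahead, "2" = away ahead, "X" = level.
function outcome(home: number, away: number): Outcome {
  if (home > away) return "1";
  if (home < away) return "2";
  return "X";
}

// Combined label such as "1/2": home ahead at half-time, away wins at full-time.
function htftLabel(htHome: number, htAway: number, ftHome: number, ftAway: number): string {
  return `${outcome(htHome, htAway)}/${outcome(ftHome, ftAway)}`;
}

function isReversal(label: string): boolean {
  return label === "1/2" || label === "2/1";
}

// Percentage rounded to two decimal places.
function ratePercent(count: number, total: number): number {
  return Math.round((count / total) * 10000) / 100;
}
```

Three half-time outcomes times three full-time outcomes give nine possible labels, of which only `1/2` and `2/1` count as reversals. `ratePercent(13112, 147248)` reproduces the 8.90% in the first row.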
#### 🚨 Basketball Leagues Have Suspiciously High Reversal Rates
| League | Reversals | Total | Rate |
|--------|-----------|-------|------|
| Eurobasket U20 | 36 | 120 | **30.00%** 🔴 |
| EuroLeague 🏀 | 183 | 639 | **28.64%** 🔴 |
| PBA Commissioners 🏀 | 54 | 189 | **28.57%** 🔴 |
| Ulusal Süper Lig 🏀 | 148 | 547 | **27.06%** 🔴 |
| NBA 🏀 | 656 | 2,696 | **24.33%** 🔴 |
**All top 15 leagues by reversal rate are BASKETBALL.** Football leagues show normal rates (5-8%).
#### Suspicious Patterns
1. **Comeback Magnitude:**
   - 1 goal/point: 36.1% (normal)
   - 2 goals/points: 13.1% (suspicious)
   - **3+ goals/points: 50.8%** 🔴 **EXTREMELY HIGH**
2. **Extreme Comebacks (Basketball):**
   - Mineros vs Irapuato: HT 39-45 → FT 102-61 (trailed by 6 at half-time, won by 41)
   - Utah vs Memphis: HT 65-64 → FT 103-140 (led by 1 at half-time, lost by 37)
   - These are statistically near-impossible without manipulation
3. **Favorite Loss Rate:**
   - 42.7% of reversals had the pre-match favorite lose (should be ~25-30%)
#### Impact on Model
- HT/FT model accuracy: **20.3%** (low due to reversal noise)
- Basketball reversal data creates **training noise**
- **Recommendation:** Either exclude basketball from HT/FT training or train a separate basketball-specific model
#### HT/FT Model Files
- **Training script:** `ai-engine/scripts/train_htft_vqwen.py`
- **Model output:** `ai-engine/models/xgboost/xgb_ht_ft.json` + `.pkl`
- **Features:** 27 (Odds + HT/FT Tendencies + League stats)
- **Status:** Working, outputs 9-class probabilities in `market_board.HTFT.probs`
---
## 17. 🐛 Lineup Parsing Fix (Since 2026-04-07)
### Problem
AI Engine reported `"lineup_unavailable"` and `"lineup_incomplete"` flags even when `live_matches.lineups` contained full 11/11 lineup data from Mackolik.
### Root Cause
Mackolik nests the lineups under a `"stats"` key:
```json
{
  "stats": {
    "home": [{ "personId": "...", "position": "...", ... }, ...],
    "away": [{ "personId": "...", "position": "...", ... }, ...]
  }
}
```
But the parser expected `"xi"`, `"starting"`, or `"lineup"` keys at root level.
### Fix
Updated `_parse_lineups_json()` in `ai-engine/services/single_match_orchestrator.py`:
- Added fallback to check `lineups_json.get("stats")` for home/away arrays
- Now correctly parses Mackolik's nested format
- Result: `home_lineup_count: 11`, `away_lineup_count: 11`, `lineup_source: "confirmed_live"`
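The fallback order described above can be illustrated as follows. The real function is the Python `_parse_lineups_json()`; this TypeScript sketch only mirrors the described logic, and it assumes that each root-level key, when present, holds an object with `home` and `away` arrays. The helper names `asSides` and `parseLineups` are hypothetical.

```typescript
// Sketch of the lineup lookup order (mirrors the description, not the Python source).
type Player = Record<string, unknown>;

interface ParsedLineups {
  home: Player[];
  away: Player[];
}

// Keys the parser originally expected at the root level.
const ROOT_KEYS = ["xi", "starting", "lineup"];

// Returns the two sides if `node` looks like { home: [...], away: [...] }.
function asSides(node: unknown): ParsedLineups | null {
  if (typeof node !== "object" || node === null) return null;
  const { home, away } = node as { home?: unknown; away?: unknown };
  if (Array.isArray(home) && Array.isArray(away)) {
    return { home: home as Player[], away: away as Player[] };
  }
  return null;
}

function parseLineups(lineupsJson: Record<string, unknown>): ParsedLineups | null {
  for (const key of ROOT_KEYS) {
    const found = asSides(lineupsJson[key]);
    if (found) return found;
  }
  // Fallback added by the fix: Mackolik nests home/away arrays under "stats".
  return asSides(lineupsJson["stats"]);
}
```

Trying the original keys first keeps any previously working payloads on the old path, so the `stats` fallback only changes behaviour for inputs that used to be reported as unavailable.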
---
## 18. Docker Deployment
```yaml
# docker-compose.yml services:
services:
  app:        # NestJS (port 3000:3000)
  postgres:   # PostgreSQL 17 Alpine (port 15432:5432)
  redis:      # Redis 7 Alpine (port 6379)
  adminer:    # Database UI (dev profile, port 8080)
  ai-engine:  # Python FastAPI (port 8002:8000)
```
---
_This file is maintained for AI agent context. Update when architecture or conventions change._