85 Commits

Author SHA1 Message Date
fahricansecer b62a4f2161 gg
Deploy Iddaai Backend / build-and-deploy (push) Successful in 1m4s
2026-06-10 03:01:33 +03:00
fahricansecer c3e44ee697 gg65
Deploy Iddaai Backend / build-and-deploy (push) Successful in 1m5s
2026-06-07 22:50:33 +03:00
fahricansecer 42b6c7ce43 Update data-fetcher.task.ts
Deploy Iddaai Backend / build-and-deploy (push) Successful in 58s
2026-06-07 21:28:52 +03:00
fahricansecer 7b17aa1fee gg2
Deploy Iddaai Backend / build-and-deploy (push) Successful in 33s
2026-06-07 15:59:41 +03:00
fahricansecer c338aba1c0 gg4
Deploy Iddaai Backend / build-and-deploy (push) Successful in 1m5s
2026-06-07 15:17:08 +03:00
fahricansecer 1c03fa5e1c gg
Deploy Iddaai Backend / build-and-deploy (push) Successful in 33s
2026-06-06 14:08:30 +03:00
fahricansecer 9e41407cb5 gg3
Deploy Iddaai Backend / build-and-deploy (push) Successful in 35s
2026-06-05 00:36:24 +03:00
fahricansecer b9700f9fda national
Deploy Iddaai Backend / build-and-deploy (push) Successful in 58s
2026-06-02 13:20:45 +03:00
fahricansecer 033a29c79c Update qualified_leagues.json
Deploy Iddaai Backend / build-and-deploy (push) Successful in 54s
2026-06-02 12:07:13 +03:00
fahricansecer 4e563e996e vv
Deploy Iddaai Backend / build-and-deploy (push) Successful in 1m7s
2026-06-02 03:37:00 +03:00
fahricansecer 671979b07d gg2
Deploy Iddaai Backend / build-and-deploy (push) Successful in 35s
2026-05-29 13:35:17 +03:00
fahricansecer b5cb412236 gg
Deploy Iddaai Backend / build-and-deploy (push) Successful in 1m6s
2026-05-29 11:59:51 +03:00
fahricansecer 659110c806 Update handoff doc + add backtest checkpoint/resume
Deploy Iddaai Backend / build-and-deploy (push) Successful in 4m32s
2026-05-25 22:29:05 +03:00
fahricansecer 988ee2f50d Add backtest pipeline, betting_brain filters, score coherence + social v3
betting_brain.py:
- HARD_MIN_SAMPLES=50 floor for calibrator bypass
- ev_edge < 0 + >= 0.20 hard vetoes
- BTTS muted (grid search found no profitable config)
- Per-market optimal envelopes (MS, OU25)
- Score coherence filter: main_pick must agree with score prediction
- HTFT reversal cross-check for MS picks

feature_builder.py / data_loader.py:
- Real home/away_position from data (was hardcoded 10)
- Cup detection wired into UpsetEngine
- _estimate_league_position with 300-day season filter

New scripts:
- diagnostic_backtest.py: per-bet diagnostic backtest with loss patterns
- optimize_filters.py: grid search per-market optimal thresholds
- analyze_backtest_csv.py: root-cause hypothesis testing on CSV
- compare_backtests.py: side-by-side validation with verdict
- test_score_coherence.py: smoke test for coherence filter (20/20 pass)

Reports:
- diagnostic_backtest_20260525_024437 (50-match smoke)
- diagnostic_backtest_20260525_035649 (1000-match in-sample)
- filter_optimization_patch.json (grid search winners per market)

Social poster v3:
- satori + resvg HTML/CSS rendering pipeline
- Twemoji football/basketball + flag SVGs
- caption SEO: 12 curated hashtags per post
- image SEO: descriptive filenames + .json metadata sidecar
- /health, /preview-png, /run-now endpoints

Docs:
- mds/SESSION_HANDOFF.md: full session state for cross-machine continuity
- mds/SOCIAL_POSTER_SETUP.md: API keys + test commands

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 20:43:28 +03:00
fahricansecer b619c2454a gg3
Deploy Iddaai Backend / build-and-deploy (push) Successful in 32s
2026-05-25 02:19:12 +03:00
fahricansecer fa48f87f53 gg2
Deploy Iddaai Backend / build-and-deploy (push) Successful in 4m53s
2026-05-24 17:29:31 +03:00
fahricansecer 920ae7ce38 gg
Deploy Iddaai Backend / build-and-deploy (push) Successful in 59s
2026-05-24 02:58:53 +03:00
fahricansecer 02f9aea333 gg
Deploy Iddaai Backend / build-and-deploy (push) Successful in 28s
2026-05-24 02:49:08 +03:00
fahricansecer 15c6313246 Merge branch 'main' of https://gitea.bilgich.com/fahricansecer/iddaai-be
Deploy Iddaai Backend / build-and-deploy (push) Successful in 54s
2026-05-24 02:44:52 +03:00
fahricansecer 1b420a425e Update .gitignore 2026-05-24 02:43:10 +03:00
fahricansecer 55e62d8fe5 Update .gitea/workflows/deploy.yml
Deploy Iddaai Backend / build-and-deploy (push) Successful in 4m56s
2026-05-24 02:30:14 +03:00
fahricansecer 21e05148c8 feat: league tier system + retrained V25 models (48 quality leagues)
Deploy Iddaai Backend / build-and-deploy (push) Failing after 3m56s
- Add LeagueTier DB model and Prisma schema
- Add league-tiers service (CRUD, sync, retrain trigger)
- Add league-tiers controller with admin API endpoints
- Add /v1/admin/retrain endpoint in AI engine (extract→train→reload pipeline)
- Retrain V25 Pro with 48 quality leagues (MS accuracy: 26.9%→51.4%)
- Update qualified_leagues.json (443→48 leagues)
- Include V25 model files in repo for Docker deployment

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-20 21:57:15 +03:00
fahricansecer e001ce9ab5 fix: guarantee iddaai-ai-engine network alias on every deploy
Deploy Iddaai Backend / build-and-deploy (push) Successful in 29s
2026-05-20 10:40:00 +03:00
fahricansecer 9481ad7094 changes
Deploy Iddaai Backend / build-and-deploy (push) Successful in 42s
2026-05-20 10:10:28 +03:00
fahricansecer 1d4aa36602 gg
Deploy Iddaai Backend / build-and-deploy (push) Successful in 31s
2026-05-18 00:08:50 +03:00
fahricansecer 5574a3c59d feat: separate commentary endpoint - non-blocking Ollama
Deploy Iddaai Backend / build-and-deploy (push) Successful in 30s
2026-05-17 16:47:05 +03:00
fahricansecer 94c7a4481a main
Deploy Iddaai Backend / build-and-deploy (push) Successful in 37s
2026-05-17 02:17:22 +03:00
fahricansecer 17ace9bd12 feat: Ollama AI expert commentary integration
Deploy Iddaai Backend / build-and-deploy (push) Successful in 37s
- OllamaClient utility for llama3.2:3b API calls (timeout 30s, non-fatal)
- OllamaCommentary service builds structured Turkish prompt from prediction data
- PredictionsService enriches response with ai_expert_commentary field
- Frontend prediction-card displays AI commentary section above match_commentary
2026-05-17 02:09:04 +03:00
fahricansecer 2b87669f41 gg
Deploy Iddaai Backend / build-and-deploy (push) Successful in 31s
2026-05-13 16:56:14 +03:00
fahricansecer 2507678bc0 gg
Deploy Iddaai Backend / build-and-deploy (push) Successful in 32s
2026-05-12 17:41:49 +03:00
fahricansecer 2b8dce665f gg
Deploy Iddaai Backend / build-and-deploy (push) Successful in 1m8s
2026-05-12 03:06:54 +03:00
fahricansecer b6d64b59bf main
Deploy Iddaai Backend / build-and-deploy (push) Failing after 2m6s
2026-05-12 02:43:02 +03:00
fahricansecer f8599bdb9a gg
Deploy Iddaai Backend / build-and-deploy (push) Failing after 2m1s
2026-05-11 23:11:41 +03:00
fahricansecer 4dcc4ced50 gg
Deploy Iddaai Backend / build-and-deploy (push) Failing after 2m15s
2026-05-11 20:50:31 +03:00
fahricansecer 70fdc066c7 Merge branch 'v28'
Deploy Iddaai Backend / build-and-deploy (push) Successful in 6s
2026-05-10 22:52:21 +03:00
fahricansecer f3362f266c gg 2026-05-10 22:52:05 +03:00
fahricansecer 8ce8fa5b94 Merge pull request 'gg' (#6) from v28 into main
Deploy Iddaai Backend / build-and-deploy (push) Successful in 39s
Reviewed-on: #6
2026-05-10 10:39:32 +03:00
fahricansecer c525b12dfd gg 2026-05-10 10:37:45 +03:00
fahricansecer 497b5d8d3b Merge pull request 'feat(ai-engine): value sniper thresholds and logic relaxed' (#5) from v28 into main
Deploy Iddaai Backend / build-and-deploy (push) Successful in 30s
Reviewed-on: #5
2026-05-06 17:56:24 +03:00
fahricansecer 4f7090e2d9 feat(ai-engine): value sniper thresholds and logic relaxed 2026-05-06 17:44:45 +03:00
fahricansecer 5b5f83c8cf fix(ai-engine): remove target leakage from training data extraction
Deploy Iddaai Backend / build-and-deploy (push) Successful in 6s
- goals_form now uses avg of last 5 historical matches instead of current match goals
- squad_quality removes current match goals/assists, uses only pre-match known data
- adds temporal filtering via match_id -> mst_utc mapping
2026-05-05 22:35:04 +03:00
fahricansecer bfddcaca7d gg
Deploy Iddaai Backend / build-and-deploy (push) Successful in 6s
2026-05-05 21:27:06 +03:00
fahricansecer 56d560af08 Update single_match_orchestrator.py
Deploy Iddaai Backend / build-and-deploy (push) Successful in 8s
2026-05-05 20:59:59 +03:00
fahricansecer 4bc51cfa99 fix(ai-engine): hoist ms_edge before score prediction branch to prevent UnboundLocalError
Deploy Iddaai Backend / build-and-deploy (push) Successful in 5s
2026-05-05 20:34:14 +03:00
fahricansecer fdb8a5d0f0 fix(ai-engine): sync FEATURE_COLS with trained models (82→102 features)
Deploy Iddaai Backend / build-and-deploy (push) Successful in 6s
- Load feature columns dynamically from feature_cols.json
- Add 20 missing odds_*_present boolean flags to fallback list
- Fixes LightGBM 'features in data (82) != training data (102)' crash
2026-05-05 20:29:55 +03:00
fahricansecer 22596e69f2 fix(predictions): circuit breaker resilience + graceful degradation
Deploy Iddaai Backend / build-and-deploy (push) Successful in 27s
- Reset consecutiveFailures on cooldown expiry (half-open state)
  so a single retry failure doesn't immediately re-open the circuit
- Exclude AI Engine app-level 500s from circuit breaker count
  (only network/infra errors: timeout, 502, 503, 504, 429)
- Return null gracefully instead of throwing 503 when no cache exists
- Add DB fallback for non-cooldown AI Engine failures
- Remove blocking wait-and-retry that held requests for up to 20s
2026-05-05 20:19:25 +03:00
fahricansecer f32badbd8f fix(predictions): cooldown fallback cascade + circuit breaker tuning
Deploy Iddaai Backend / build-and-deploy (push) Successful in 27s
- Add 4-level fallback when AI circuit breaker fires cooldown:
  1) In-memory cache (10min TTL)
  2) DB stored prediction (no TTL filter)
  3) DB cached prediction (with model version check)
  4) Wait out cooldown + retry once (max 20s wait)
- Raise circuit breaker threshold from 3 to 5 consecutive failures
- Reduce cooldown duration from 30s to 15s for faster recovery
- Add extractCooldownMs helper to parse remaining ms from error detail
2026-05-05 20:11:19 +03:00
fahricansecer 5645b38f20 main
Deploy Iddaai Backend / build-and-deploy (push) Successful in 32s
2026-05-05 17:09:11 +03:00
fahricansecer 244d8f5366 feat(ai): expand training to 68K+ matches, add score model, backfill implied odds
Deploy Iddaai Backend / build-and-deploy (push) Successful in 6s
- extract_training_data.py: switch from top_leagues.json (23) to qualified_leagues.json (265)
- update_implied_odds.py: new script to backfill implied odds from real market data
- train_score_model.py: rewrite with v25 102-feature set + temporal split
- single_match_orchestrator.py: integrate ML score model with heuristic fallback
2026-05-05 16:04:00 +03:00
fahricansecer 9bb8f39bca gg
Deploy Iddaai Backend / build-and-deploy (push) Successful in 2m45s
2026-05-05 14:06:20 +03:00
fahricansecer 7a1cf14e2f Update matches.service.ts
Deploy Iddaai Backend / build-and-deploy (push) Successful in 28s
2026-05-05 10:47:00 +03:00
fahricansecer 62c797d299 Update matches.service.ts
Deploy Iddaai Backend / build-and-deploy (push) Successful in 29s
2026-05-05 10:13:23 +03:00
fahricansecer 34cc4a6cbb Update matches.service.ts
Deploy Iddaai Backend / build-and-deploy (push) Successful in 30s
2026-05-05 01:04:56 +03:00
fahricansecer 27e96da31d main
Deploy Iddaai Backend / build-and-deploy (push) Successful in 29s
2026-05-04 18:00:40 +03:00
fahricansecer 145a8b336b fix(feeder): preserve pre-match odds when match goes live
Deploy Iddaai Backend / build-and-deploy (push) Successful in 29s
Live odds have missing selections (e.g. '1' key removed from Maç Sonucu
after kickoff), causing the AI model to produce wildly incorrect predictions
(e.g. 3.5% home win for Bristol City). Two guards added:

1. fetchOddsForMatches: Exclude live/finished matches from odds fetch query
2. processMatchOdds: Skip odds/lineups/sidelined overwrite if match already
   has pre-match odds and is live/finished
2026-05-02 16:32:42 +03:00
fahricansecer 7a8960edb8 chore: remove debug checkpoint logs and temp SQL files
Deploy Iddaai Backend / build-and-deploy (push) Successful in 37s
2026-04-26 17:09:22 +03:00
fahricansecer 691c52f610 perf: replace Prisma relation queries with raw SQL for getExistingMatchIds and getMissingScopes - fixes Pi hang
Deploy Iddaai Backend / build-and-deploy (push) Successful in 39s
2026-04-26 17:07:19 +03:00
fahricansecer bc461429f6 debug: add checkpoint timestamps to processDate for hang diagnosis
Deploy Iddaai Backend / build-and-deploy (push) Successful in 46s
2026-04-26 17:04:46 +03:00
fahricansecer a338d02244 main
Deploy Iddaai Backend / build-and-deploy (push) Successful in 2m42s
2026-04-26 03:07:18 +03:00
fahricansecer 1623432039 fix: watchdog force-kill with SIGKILL fallback when process.exit is blocked 2026-04-26 02:27:51 +03:00
fahricansecer 4c7930e9d2 feat: add watchdog timer to detect and recover from hung API requests
Deploy Iddaai Backend / build-and-deploy (push) Successful in 27s
2026-04-25 11:20:30 +03:00
fahricansecer ec463cb927 fix: make canvas import optional for ARM64 compatibility 2026-04-25 02:41:53 +03:00
fahricansecer eab95c4e5c Update feeder.service.ts
Deploy Iddaai Backend / build-and-deploy (push) Successful in 30s
2026-04-25 02:23:38 +03:00
fahricansecer 9027cc9900 v28
Deploy Iddaai Backend / build-and-deploy (push) Successful in 3m21s
2026-04-24 23:46:28 +03:00
fahricansecer 3875f2a512 Create v28-pro-max-architecture.md
Deploy Iddaai Backend / build-and-deploy (push) Successful in 27s
2026-04-24 02:30:26 +03:00
fahricansecer 300dceeb4b Merge branch 'main' of https://gitea.bilgich.com/fahricansecer/iddaai-be
Deploy Iddaai Backend / build-and-deploy (push) Successful in 27s
2026-04-24 02:10:48 +03:00
fahricansecer ad01976fb9 fix: lineup data normalization + tomorrow match sync + player field mapping 2026-04-24 02:09:58 +03:00
fahricansecer 6880eb92f5 Merge pull request 'v26-shadow' (#4) from v26-shadow into main
Deploy Iddaai Backend / build-and-deploy (push) Successful in 27s
Reviewed-on: #4
2026-04-24 01:15:54 +03:00
fahricansecer 9e2edd590c Merge branch 'main' into v26-shadow 2026-04-24 01:15:18 +03:00
fahricansecer b5c2edf346 gg 2026-04-24 01:15:05 +03:00
fahricansecer bf7473c1e7 Merge pull request 'fix: update version tags to v28 and temporarily disable cache for predictions' (#3) from v26-shadow into main
Deploy Iddaai Backend / build-and-deploy (push) Successful in 31s
Reviewed-on: #3
2026-04-24 00:30:55 +03:00
fahricansecer 1f26a5bf2f fix: update version tags to v28 and temporarily disable cache for predictions 2026-04-24 00:11:00 +03:00
fahricansecer fb53fdf1df Merge pull request 'v26-shadow' (#2) from v26-shadow into main
Deploy Iddaai Backend / build-and-deploy (push) Successful in 2m51s
Reviewed-on: #2
2026-04-23 22:29:23 +03:00
fahricansecer 634204acf0 v28 2026-04-23 22:22:59 +03:00
fahricansecer df428ed1e8 gg 2026-04-22 02:17:02 +03:00
fahricansecer 2ccd6831eb gg 2026-04-21 16:53:56 +03:00
fahricansecer 1346924387 gg 2026-04-19 13:23:00 +03:00
fahricansecer e4c74025e5 Merge pull request 'cron' (#1) from cron into main
Deploy Iddaai Backend / build-and-deploy (push) Successful in 2m48s
Reviewed-on: #1
2026-04-16 17:22:36 +03:00
fahricansecer c8e7e4e927 cr 2026-04-16 17:21:48 +03:00
Gitea Actions c8fa4c442d chore: add docker info 2026-04-16 12:20:56 +00:00
fahricansecer 0f917695dd chore: add workflow
Check Docker Pi / check-docker (push) Successful in 6s
2026-04-16 15:20:42 +03:00
fahricansecer 249c57346e first (part 4: residual files)
Deploy Iddaai Backend / build-and-deploy (push) Successful in 26s
2026-04-16 15:13:24 +03:00
fahricansecer 182f4aae16 first (part 3: src directory)
Deploy Iddaai Backend / build-and-deploy (push) Successful in 33s
2026-04-16 15:12:27 +03:00
fahricansecer 2f0b85a0c7 first (part 2: other directories)
Deploy Iddaai Backend / build-and-deploy (push) Failing after 18s
2026-04-16 15:11:25 +03:00
fahricansecer 7814e0bc6b first (part 1: root files)
Deploy Iddaai Backend / build-and-deploy (push) Failing after 4s
2026-04-16 15:09:10 +03:00
582 changed files with 272770 additions and 12 deletions
+27
@@ -0,0 +1,27 @@
node_modules
dist
.git
.env
.env.*
*.backup
*.dump
ai-engine/
venv/
__pycache__/
*.pyc
# IDE files
.vscode/
.idea/
# Ignore test coverage and log files
coverage/
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
# Uploads
uploads/
public/uploads/
+33 -9
@@ -11,13 +11,27 @@ jobs:
- name: Kodu Cek
uses: actions/checkout@v4
- name: Docker Build
- name: Docker Build (Backend)
run: docker build -t iddaai-be:latest .
- name: Eski Konteyneri Sil
run: docker rm -f iddaai-be || true
- name: Docker Build (AI Engine)
run: docker build -t iddaai-ai-engine:latest ./ai-engine
- name: Yeni Versiyonu Baslat
- name: Eski Konteynerleri Sil
run: |
docker rm -f iddaai-be || true
docker rm -f iddaai-ai-engine || true
- name: AI Engine'i Baslat
run: |
docker run -d \
--name iddaai-ai-engine \
--restart unless-stopped \
--network iddaai_iddaai-network \
-e DATABASE_URL='${{ secrets.DATABASE_URL }}' \
iddaai-ai-engine:latest
- name: Backend'i Baslat
run: |
docker run -d \
--name iddaai-be \
@@ -25,11 +39,21 @@ jobs:
--network iddaai_iddaai-network \
-p 127.0.0.1:1810:3005 \
-e NODE_ENV=production \
-e DATABASE_URL='postgresql://iddaai_user:IddaA1_S4crET!@iddaai-postgres:5432/iddaai_db?schema=public' \
-e REDIS_HOST='iddaai-redis' \
-e REDIS_PORT='6379' \
-e REDIS_PASSWORD='IddaA1_Redis_Pass!' \
-e DATABASE_URL='${{ secrets.DATABASE_URL }}' \
-e REDIS_HOST='${{ secrets.REDIS_HOST }}' \
-e REDIS_PORT='${{ secrets.REDIS_PORT }}' \
-e REDIS_PASSWORD='${{ secrets.REDIS_PASSWORD }}' \
-e AI_ENGINE_URL='http://iddaai-ai-engine:8000' \
-e JWT_SECRET='b7V8jM2wP1L5mQxs2RdfFkAsLpI2oG!w' \
-e JWT_SECRET='${{ secrets.JWT_SECRET }}' \
-e JWT_ACCESS_EXPIRATION='1d' \
iddaai-be:latest /bin/sh -c "npx prisma migrate deploy && node dist/src/main.js"
- name: Saglik Kontrolu
run: |
sleep 10
echo "=== AI Engine logs ==="
docker logs --tail 30 iddaai-ai-engine || true
echo "=== Backend logs ==="
docker logs --tail 30 iddaai-be || true
echo "=== AI Engine health ==="
docker exec iddaai-ai-engine python -c "import urllib.request; print(urllib.request.urlopen('http://127.0.0.1:8000/health').read().decode())" || echo "AI engine health check failed"
+57
@@ -0,0 +1,57 @@
# Node
node_modules/
dist/
dist-*/
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
# Environment
.env
.env.*
!.env.example
# Python
__pycache__/
*.py[cod]
*$py.class
venv/
.venv/
env/
# Database / Docker Volumes
/data/
ai-engine/data/**/*.csv
ai-engine/data/v26_shadow/
ai-engine/data/__pycache__/
postgres-data/
redis-data/
# OS / Editor
.DS_Store
.idea/
.vscode/
# Tests / Coverage
coverage/
# Logs
logs/
*.log
# Uploads
uploads/
public/uploads/
# Large Datasets and ML Models
ai-engine/models/*
!ai-engine/models/*.py
!ai-engine/models/v25/
!ai-engine/models/v27/
!ai-engine/models/basketball_v25/
!ai-engine/models/calibration/
models/*
!models/*.py
colab_export/
+322
@@ -0,0 +1,322 @@
# AGENTS.md - Coding Agent Guidelines
This file is a guide for AI coding agents working in this repository.
---
## 1. Build / Lint / Test Commands
```bash
# Development
npm run start:dev # Dev server with watch mode
npm run build # Production build (nest build)
# Linting & Formatting
npm run lint # ESLint with Prettier
npm run format # Prettier write
# Testing
npm run test # Run all unit tests
npm run test:watch # Watch mode
npm run test:e2e # End-to-end tests
npx jest src/path/to/file.spec.ts # Run single test file
npx jest --testNamePattern="test name" # Run specific test
# Database
npx prisma generate # Generate Prisma client (required after install)
npx prisma migrate dev # Run migrations
npx prisma db seed # Seed database
# Feeder Scripts
npm run feeder:historical # Historical data fetch
npm run feeder:live # Live match data fetch
npm run feeder:basketball # Basketball data fetch
```
---
## 2. Code Style Guidelines
### Imports (Ordering)
```typescript
// 1. NestJS/common imports
import { Controller, Get, Post, Body } from '@nestjs/common';
import { ApiTags, ApiOperation } from '@nestjs/swagger';
// 2. External packages
import { plainToInstance } from 'class-transformer';
import * as bcrypt from 'bcrypt';
// 3. Local imports (relative)
import { UsersService } from './users.service';
import { CreateUserDto } from './dto/user.dto';
import { ApiResponse, createSuccessResponse } from '../../common/types';
```
### Formatting
- **Single quotes** for strings
- **Trailing commas** always
- Formatting with Prettier is mandatory
- End every file with a trailing newline
### Types & Type Safety
- `strictNullChecks: true` - null/undefined checks are mandatory
- `noImplicitAny: false` - use of `any` is allowed (for Prisma dynamic access)
- Declare function return types: `async findOne(id: string): Promise<User>`
- Prefer interfaces over type aliases (for objects)
### Naming Conventions
```typescript
// Classes & Interfaces: PascalCase
class UsersService {}
interface ApiResponse<T> {}
// Variables & Functions: camelCase
const userService = new UsersService();
async function findUserById() {}
// Constants: UPPER_SNAKE_CASE
const JWT_SECRET = 'secret';
const IS_PUBLIC_KEY = 'isPublic';
// Files: kebab-case
user.dto.ts;
users.service.ts;
predictions.processor.spec.ts;
// DTOs: Entity + Dto suffix
(CreateUserDto, UpdateUserDto, UserResponseDto);
```
---
## 3. DTO Pattern
### Request DTOs
```typescript
export class CreateUserDto {
@ApiProperty({ example: 'user@example.com' })
@IsEmail()
email: string;
@IsString()
@MinLength(8)
password: string;
@IsOptional()
@IsString()
firstName?: string;
}
```
### Response DTOs (Security Critical)
```typescript
@Exclude()
export class UserResponseDto {
@Expose()
id: string;
@Expose()
email: string;
// passwordHash intentionally NOT exposed
}
```
### Controller Usage
```typescript
@Get('me')
async getMe(@CurrentUser() user: User): Promise<ApiResponse<UserResponseDto>> {
const fullUser = await this.usersService.findOneWithDetails(user.id);
return createSuccessResponse(
plainToInstance(UserResponseDto, fullUser),
);
}
```
**CRITICAL:** Never return a raw Prisma entity. Always use a Response DTO.
---
## 4. Architecture Patterns
### Service Layer
```typescript
@Injectable()
export class UsersService extends BaseService<
User,
CreateUserDto,
UpdateUserDto
> {
constructor(prisma: PrismaService) {
super(prisma, 'User');
}
// Custom methods...
}
```
### Controller Layer
```typescript
@ApiTags('Users')
@ApiBearerAuth()
@Controller('users')
export class UsersController extends BaseController<
User,
CreateUserDto,
UpdateUserDto
> {
constructor(private readonly usersService: UsersService) {
super(usersService, 'User');
}
}
```
### API Response Format
```typescript
// All responses use this structure
{
"success": true,
"status": 200,
"message": "Success",
"data": { ... },
"errors": []
}
// Helper functions
createSuccessResponse(data, 'Message')
createErrorResponse('Message', 400, ['error1'])
createPaginatedResponse(items, total, page, limit)
```
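For orientation, the envelope above could be produced by helpers along these lines. This is a minimal sketch, not the repository's actual implementation: the `ApiResponse` field types, the default `message` and `status` values, and `data: null` on errors are assumptions inferred from the JSON shape and call signatures shown above. `createPaginatedResponse` is left out because its metadata shape is not shown.

```typescript
// Assumed envelope shape, inferred from the JSON example above.
interface ApiResponse<T> {
  success: boolean;
  status: number;
  message: string;
  data: T | null;
  errors: string[];
}

// Hypothetical helper: wraps a payload in a successful envelope.
function createSuccessResponse<T>(
  data: T,
  message = 'Success',
  status = 200,
): ApiResponse<T> {
  return { success: true, status, message, data, errors: [] };
}

// Hypothetical helper: builds a failed envelope with no payload.
function createErrorResponse(
  message: string,
  status = 400,
  errors: string[] = [],
): ApiResponse<null> {
  return { success: false, status, message, data: null, errors };
}

const ok = createSuccessResponse({ id: 'u1' }, 'Message');
const failed = createErrorResponse('Message', 400, ['error1']);
console.log(ok.success, failed.success);
```

Keeping `errors` as an always-present array (empty on success) means clients can read it without a null check; whether the real helpers do the same is an assumption.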
---
## 5. Error Handling
### Throw NestJS HTTP Exceptions
```typescript
// Correct
throw new NotFoundException('User not found');
throw new ConflictException('EMAIL_ALREADY_EXISTS');
throw new UnauthorizedException('INVALID_CREDENTIALS');
// Wrong
throw new Error('User not found'); // Don't use generic Error
```
### i18n Error Keys
```typescript
// Use translatable keys (check src/i18n/{lang}/errors.json)
throw new ConflictException('EMAIL_ALREADY_EXISTS');
// Translates to: "Email already exists" (en) / "Email zaten kayıtlı" (tr)
```
### Global Exception Filter
- All errors are returned with HTTP 200 (the real status is carried in the response body)
- If `NODE_ENV=development`, a stack trace is included
- Validation errors are formatted automatically
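The behaviour in the list above can be pictured with a framework-free sketch. It assumes, hypothetically, that HTTP exceptions carry a numeric `status` and a string `message`, that anything else maps to 500, and that the body reuses the envelope fields from section 4. `toErrorBody`, `ErrorBody`, and `hasStatus` are illustrative names, not the repository's filter.

```typescript
// Assumed error envelope; `stack` is only present in development.
interface ErrorBody {
  success: false;
  status: number;
  message: string;
  data: null;
  errors: string[];
  stack?: string;
}

// Type guard for objects that look like an HTTP exception (assumed shape).
function hasStatus(err: unknown): err is { status: number; message: string } {
  if (typeof err !== 'object' || err === null) return false;
  const candidate = err as { status?: unknown; message?: unknown };
  return typeof candidate.status === 'number' && typeof candidate.message === 'string';
}

// Maps any thrown value to an HTTP 200 response whose body carries the real status.
function toErrorBody(
  err: unknown,
  nodeEnv: string | undefined,
): { httpStatus: number; body: ErrorBody } {
  const status = hasStatus(err) ? err.status : 500;
  const message =
    hasStatus(err) || err instanceof Error ? err.message : 'Internal server error';
  const body: ErrorBody = { success: false, status, message, data: null, errors: [] };
  // Stack traces are attached in development only.
  if (nodeEnv === 'development' && err instanceof Error && err.stack) {
    body.stack = err.stack;
  }
  return { httpStatus: 200, body };
}

const notFound = toErrorBody({ status: 404, message: 'User not found' }, 'production');
console.log(notFound.httpStatus, notFound.body.status);
```

Because the transport status is always 200, clients have to branch on `body.success` or `body.status` rather than on the HTTP status code.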
---
## 6. Testing
### Unit Test Structure
```typescript
import { Test, TestingModule } from '@nestjs/testing';
describe('UsersService', () => {
let service: UsersService;
let prisma: PrismaService;
beforeEach(async () => {
const module: TestingModule = await Test.createTestingModule({
providers: [
UsersService,
{ provide: PrismaService, useValue: mockPrisma },
],
}).compile();
service = module.get<UsersService>(UsersService);
});
it('should find user by id', async () => {
// Arrange
mockPrisma.user.findUnique.mockResolvedValue(mockUser);
// Act
const result = await service.findOne('id');
// Assert
expect(result).toEqual(mockUser);
});
});
```
### Mocking External Dependencies
```typescript
jest.mock('axios');
const mockedAxios = axios as jest.Mocked<typeof axios>;
beforeEach(() => {
jest.clearAllMocks();
mockedAxios.post.mockResolvedValue({ data: { ok: true } });
});
```
---
## 7. Module Registration
For Redis-enabled modules, in `app.module.ts`:
```typescript
const redisEnabled = process.env.REDIS_ENABLED === 'true';
@Module({
imports: [
...(redisEnabled ? [QueueModule, PredictionsModule] : []),
// ...
],
})
```
---
## 8. Environment Variables
Required (`.env`):
```env
NODE_ENV=development
PORT=3005
DATABASE_URL=postgresql://postgres:password@localhost:15432/boilerplate_db
JWT_SECRET=your-secret-key
JWT_ACCESS_EXPIRATION=15m
REDIS_ENABLED=false
AI_ENGINE_URL=http://127.0.0.1:8000
```
---
## 9. Pre-commit Checklist
1. `npm run lint` - Lint errors fixed
2. `npm run build` - Build succeeds
3. `npm run test` - All tests pass
4. Response DTOs used for all API responses
5. No secrets/credentials in code
+273
@@ -0,0 +1,273 @@
# 🚀 Suggest-Bet-BE — Deployment Guide
> **Date:** 2026-04-03
> **Version:** Sport Partition Release (Football/Basketball Split)
> **Purpose:** Installation steps for a desktop machine or a server
---
## 🔑 Passwords and Connection Details
| Service | User | Password | Host | Port |
|--------|-----------|-------|------|------|
| **PostgreSQL** | `suggestbet` | `SuGGesT2026SecuRe` | `localhost` | `15432` |
| **Redis** | — | `RedisSecure2026` | `localhost` | `6379` |
| **JWT Secret** | — | `9bfa42fbdc6031da6d7c0bd30e9f5b6378a071613d0c02acf95eb576249c3a25` | — | — |
**Database URL:**
```
postgresql://suggestbet:SuGGesT2026SecuRe@localhost:15432/boilerplate_db?schema=public
```
---
## 📋 Requirements
- **Node.js:** v20.19+
- **Docker + Docker Compose:** for PostgreSQL + Redis
- **npm:** package manager
---
## 🔧 Step-by-Step Installation
### Step 1: Pull the Code
```bash
cd ~/Documents/Suggest-Bet-BE
git pull origin main
```
### Step 2: Create the .env File
```bash
# /Users/piton/Documents/Suggest-Bet-BE/.env
NODE_ENV=development
PORT=3005
DATABASE_URL="postgresql://suggestbet:SuGGesT2026SecuRe@localhost:15432/boilerplate_db?schema=public"
JWT_SECRET=9bfa42fbdc6031da6d7c0bd30e9f5b6378a071613d0c02acf95eb576249c3a25
JWT_ACCESS_EXPIRATION=7d
JWT_REFRESH_EXPIRATION=7d
REDIS_ENABLED=true
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD=RedisSecure2026
DEFAULT_LANGUAGE=en
FALLBACK_LANGUAGE=en
ENABLE_MAIL=false
ENABLE_S3=false
ENABLE_WEBSOCKET=false
ENABLE_MULTI_TENANCY=false
THROTTLE_TTL=60000
THROTTLE_LIMIT=100
ENABLE_GEMINI=true
GOOGLE_API_KEY=your-google-api-key
GEMINI_MODEL=gemini-2.5-flash
AI_ENGINE_URL=http://127.0.0.1:8000
```
### Step 3: Start the Docker Infrastructure
```bash
cd ~/Documents/Suggest-Bet-BE
docker compose up -d postgres redis
```
Check that PostgreSQL is ready:
```bash
docker exec -i suggestbet-postgres pg_isready -U suggestbet
# Output: /var/run/postgresql:5432 - accepting connections
```
### Step 4: Restore the Dump
```bash
# Copy the dump file into the container
docker cp /path/to/dump-boilerplate_db-202604020914-v5 suggestbet-postgres:/tmp/dump_file
# Restore it
export PGPASSWORD="SuGGesT2026SecuRe"
docker exec -e PGPASSWORD="$PGPASSWORD" suggestbet-postgres pg_restore \
-U suggestbet -d boilerplate_db --clean --if-exists /tmp/dump_file
```
### Step 5: Run the Sport Partition Migration
**Run these in order; each one is a separate transaction:**
```bash
export PGPASSWORD="SuGGesT2026SecuRe"
DB="suggestbet-postgres"
MIGRATION_DIR="prisma/migrations/20260403161000_sport_partition"
# 1. Create the new team stats tables
docker exec -e PGPASSWORD="$PGPASSWORD" -i $DB psql -U suggestbet -d boilerplate_db < $MIGRATION_DIR/01_create_team_stats.sql
# 2. Copy the team stats data
docker exec -e PGPASSWORD="$PGPASSWORD" -i $DB psql -U suggestbet -d boilerplate_db < $MIGRATION_DIR/02_copy_team_stats.sql
# 3. Create the new AI features tables
docker exec -e PGPASSWORD="$PGPASSWORD" -i $DB psql -U suggestbet -d boilerplate_db < $MIGRATION_DIR/03_create_ai_features.sql
# 4. Copy the AI features data
docker exec -e PGPASSWORD="$PGPASSWORD" -i $DB psql -U suggestbet -d boilerplate_db < $MIGRATION_DIR/04_copy_ai_features.sql
# 5. Rename match_player_stats → basketball_player_stats
docker exec -e PGPASSWORD="$PGPASSWORD" -i $DB psql -U suggestbet -d boilerplate_db < $MIGRATION_DIR/05_rename_player_stats.sql
# 6. Add the sport column to odd_categories + odd_selections
docker exec -e PGPASSWORD="$PGPASSWORD" -i $DB psql -U suggestbet -d boilerplate_db < $MIGRATION_DIR/06_add_sport_to_odds.sql
```
**Batch update for odd_selections (14.8M rows; each run updates 1M):**
```bash
# Run this repeatedly until "remaining = 0"
export PGPASSWORD="SuGGesT2026SecuRe"
docker exec -e PGPASSWORD="$PGPASSWORD" -i suggestbet-postgres psql -U suggestbet -d boilerplate_db -c "
WITH t AS (
SELECT os.db_id, oc.sport
FROM odd_selections os
JOIN odd_categories oc ON os.odd_category_db_id = oc.db_id
WHERE os.sport IS NULL
LIMIT 1000000
)
UPDATE odd_selections SET sport = t.sport FROM t WHERE odd_selections.db_id = t.db_id;
SELECT COUNT(*) as remaining FROM odd_selections WHERE sport IS NULL;
"
```
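As a rough check on how many times the statement has to run: with the 14,810,396 `odd_selections` rows listed in the table status section and 1,000,000 rows per run, the count works out as below. `runsNeeded` is an illustrative helper, not part of the repository, and it assumes every row starts with `sport IS NULL` and every run updates a full batch.

```typescript
// Illustrative helper (not from the repo): number of batch runs needed to
// clear `totalRows` when each run updates at most `batchSize` rows.
function runsNeeded(totalRows: number, batchSize: number): number {
  if (batchSize <= 0) {
    throw new Error('batchSize must be positive');
  }
  return Math.ceil(totalRows / batchSize);
}

// 14,810,396 rows at 1,000,000 per run: 14 full batches plus one partial batch.
const runs = runsNeeded(14810396, 1000000);
console.log(runs); // 15
```

Under those assumptions the 15th run is the one that should report `remaining = 0`.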
**Once no rows remain, create the index:**
```bash
export PGPASSWORD="SuGGesT2026SecuRe"
docker exec -e PGPASSWORD="$PGPASSWORD" -i suggestbet-postgres psql -U suggestbet -d boilerplate_db -c "
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_odd_selections_sport ON odd_selections(sport) WHERE sport IS NOT NULL;
"
```
### Step 6: Install Dependencies + Prisma Generate
```bash
cd ~/Documents/Suggest-Bet-BE
# Install dependencies
npm ci
# Generate the Prisma client
npx prisma generate
```
### Step 7: Build + Start
```bash
# Build
npm run build
# Start
npm run start:prod
```
### Step 8: Verification
```bash
# Health check
curl http://localhost:3005/api/health
# Swagger UI
open http://localhost:3005/api/docs
# Check the new tables
export PGPASSWORD="SuGGesT2026SecuRe"
docker exec -e PGPASSWORD="$PGPASSWORD" -i suggestbet-postgres psql -U suggestbet -d boilerplate_db -c "
SELECT 'football_team_stats' as tbl, COUNT(*) FROM football_team_stats
UNION ALL SELECT 'basketball_team_stats', COUNT(*) FROM basketball_team_stats
UNION ALL SELECT 'basketball_player_stats', COUNT(*) FROM basketball_player_stats
UNION ALL SELECT 'odd_categories (sport)', COUNT(*) FROM odd_categories WHERE sport IS NOT NULL
UNION ALL SELECT 'odd_selections (sport)', COUNT(*) FROM odd_selections WHERE sport IS NOT NULL;
"
```
---
## 🤖 AI Engine (Optional)
```bash
cd ~/Documents/Suggest-Bet-BE/ai-engine
# Dependencies
pip install -r requirements.txt
# Start
uvicorn main:app --host 0.0.0.0 --port 8000
```
---
## ✅ Table Status (After Migration)
| Table | Rows (~) | Status |
|-------|-----------|-------|
| `football_team_stats` | 217,956 | ✅ New |
| `basketball_team_stats` | 48,824 | ✅ New |
| `basketball_player_stats` | 273,140 | ✅ Renamed |
| `football_ai_features` | 0 | ⚠️ Feeder will populate |
| `basketball_ai_features` | 0 | ⚠️ Feeder will populate |
| `odd_categories (sport)` | 2,695,511 | ✅ Updated |
| `odd_selections (sport)` | 14,810,396 | ✅ Updated |
| `match_team_stats` (OLD) | 266,780 | 🗑️ Can be dropped (keep it as a backup) |
| `match_ai_features` (OLD) | 0 | 🗑️ Can be dropped |
---
## 🗑️ Dropping the Old Tables (Optional)
**ONLY after verifying that everything works:**
```bash
export PGPASSWORD="SuGGesT2026SecuRe"
docker exec -e PGPASSWORD="$PGPASSWORD" -i suggestbet-postgres psql -U suggestbet -d boilerplate_db -c "
DROP TABLE IF EXISTS match_team_stats CASCADE;
DROP TABLE IF EXISTS match_ai_features CASCADE;
"
```
---
## 🔧 Troubleshooting
### PostgreSQL does not start (postmaster.pid error)
```bash
docker compose stop postgres
docker compose rm -f postgres
docker volume rm suggest-bet-be_pgml_data
docker compose up -d postgres
# Removing the volume deletes the database, so restore the dump and re-run the migration afterwards
```
### Docker Desktop does not start (disk full)
```bash
# Clean up large files
rm -rf ~/Library/Caches/Homebrew/*
rm -rf ~/.npm/_cacache
docker system prune -af
df -h / # At least 3-4 GB should be free
```
### Feeder is not running
```bash
# Check the logs
tail -f logs/app.log # or: docker logs suggestbet-app
# Run the feeder manually
npm run feeder:live
```
---
## 📝 Notes
- **No data is lost**: the old tables are not dropped after the migration and remain as a backup
- **Feeder** automatically writes to the new tables (`footballTeamStats`, `basketballTeamStats`, etc.)
- **Redis** is optional: you can set `REDIS_ENABLED=false` (in-memory fallback)
- **Swagger** is only active in development mode
+3 -3
@@ -16,7 +16,7 @@ RUN npm ci
COPY . .
# Generate Prisma client
-RUN npx prisma generate
+RUN DATABASE_URL="postgresql://dummy:dummy@localhost/dummy" npx prisma generate
# Build the application
RUN npm run build
@@ -38,7 +38,7 @@ RUN apk add --no-cache --virtual .build-deps python3 make g++ cairo-dev pango-de
# Copy Prisma schema and generate client
COPY prisma ./prisma
-RUN npx prisma generate
+RUN DATABASE_URL="postgresql://dummy:dummy@localhost/dummy" npx prisma generate
# Copy built application
COPY --from=builder /app/dist ./dist
@@ -47,7 +47,7 @@ COPY --from=builder /app/dist ./dist
COPY --from=builder /app/src/i18n ./dist/i18n
# Copy league filter config files (critical: without these, feeder stores ALL matches)
-COPY top_leagues.json basketball_top_leagues.json ./
+COPY qualified_leagues.json top_leagues.json basketball_top_leagues.json ./
# Set environment
ENV NODE_ENV=production
+517
@@ -0,0 +1,517 @@
# Suggest-Bet-BE — AI Agent Context
> **Last Updated:** 2026-04-06
> **Purpose:** Comprehensive project reference for AI agents working on this codebase.
---
## 1. Project Overview
**Suggest-Bet-BE** is an **AI-powered sports betting prediction platform** backend. It provides:
- AI-driven predictions for football & basketball matches
- Smart coupon generation (SAFE, BALANCED, AGGRESSIVE, VALUE, MIRACLE strategies)
- Live score tracking & odds monitoring
- Web scraping from Mackolik.com for historical & live match data
- Google Gemini AI for natural language match commentary
- User coupon tracking (ROI, Win Rate analytics)
### Technology Stack
| Layer | Technology |
| ----------- | -------------------------------------------- |
| Backend API | NestJS 11 (TypeScript) |
| AI Engine | Python FastAPI (v20+) |
| Database | PostgreSQL 16 + Prisma ORM |
| Queue | BullMQ + Redis (optional) |
| Cache | Redis or in-memory fallback |
| Auth | JWT + Passport (Access 15min + Refresh 7day) |
| Scraping | Axios + Cheerio (Mackolik HTML parsing) |
| Logging | Pino (structured logging) |
| i18n | nestjs-i18n (TR, EN) |
| API Docs | Swagger |
| Deploy | Docker Compose |
---
## 2. Architecture
```
┌──────────────────────────────────────────────────────────────────┐
│                       CLIENTS (Web/Mobile)                       │
└───────────────────────────────┬──────────────────────────────────┘
                                │ HTTP/REST
┌───────────────────────────────▼──────────────────────────────────┐
│                    NestJS Backend (Port 3005)                    │
│  ┌─────────┬──────────┬──────────┬──────────┬─────────────────┐  │
│  │ Auth    │ Admin    │ Matches  │ Leagues  │ Predictions     │  │
│  │ Module  │ Module   │ Module   │ Module   │ Module          │  │
│  ├─────────┼──────────┼──────────┼──────────┼─────────────────┤  │
│  │ Coupons │ Analysis │ Gemini   │ Social-  │ Health          │  │
│  │ Module  │ Module   │ Module   │ Poster   │ Module          │  │
│  │SporToto │ Feeder   │ Users    │          │                 │  │
│  └─────────┴──────────┴──────────┴──────────┴─────────────────┘  │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ Services: AiService | MatchAnalysis | Scraper                │ │
│ ├──────────────────────────────────────────────────────────────┤ │
│ │ Tasks: DataFetcher (Cron) | LiveUpdater | LimitResetter      │ │
│ └──────────────────────────────────────────────────────────────┘ │
└────┬─────────────────┬────────────────────┬──────────────────────┘
     │                 │                    │
     ▼                 ▼                    ▼
┌──────────┐    ┌──────────────┐   ┌──────────────────┐
│PostgreSQL│    │ Redis/BullMQ │   │ AI Engine (py)   │
│ (3.6GB)  │    │ (Optional)   │   │ FastAPI:8000     │
└──────────┘    └──────────────┘   └──────────────────┘
┌───────▼───────┐
│ Mackolik API  │
│ (Data Source) │
└───────────────┘
```
### Database Statistics (~)
- `matches`: 237K permanent match records
- `live_matches`: ~82 active/upcoming matches (daily cycle)
- `match_player_participation`: 3.3M
- `odd_selections`: 8.5M
- `teams`: 19,595 | `players`: 217K | `leagues`: 1,505
---
## 3. Directory Structure
```
src/
├── app.module.ts  # Root module (Redis, Config, i18n, guards)
├── main.ts  # Entry point, Swagger, Helmet, ValidationPipe
├── common/  # Shared layer
│   ├── base/  # Generic BaseService<T> & BaseController<T>
│   ├── types/  # ApiResponse<T>, pagination DTOs
│   ├── filters/  # GlobalExceptionFilter (HTTP 200 wrapper)
│   ├── interceptors/  # ResponseInterceptor, SanitizeInterceptor
│   ├── decorators/  # @Public(), @Roles(), @CurrentUser()
│   └── queues/  # BullMQ queue module
├── config/  # Env validation (Zod), config factories
├── database/  # PrismaService
├── i18n/  # TR/EN translations (common, errors, validation, auth)
├── modules/  # 13 feature modules
│   ├── admin/  # Superadmin panel (user mgmt, settings, analytics)
│   ├── analysis/  # Multi-match analysis orchestration
│   ├── auth/  # JWT auth, refresh tokens, guards
│   ├── coupons/  # SmartCouponService (5 strategies), UserCouponService
│   ├── feeder/  # Historical data scraping (Mackolik)
│   ├── gemini/  # Google Gemini AI integration
│   ├── health/  # Liveness, readiness, AI Engine health
│   ├── leagues/  # Country/league/team discovery, H2H
│   ├── matches/  # Match listing, details, active leagues
│   ├── predictions/  # AI predictions with BullMQ queue & 6h cache
│   ├── social-poster/  # Twitter API v2, Canvas image generation
│   ├── spor-toto/  # Spor Toto integration
│   └── users/  # User CRUD (BaseController pattern)
├── scripts/  # Feeder runners, cleanup scripts
├── services/  # Shared services
│   ├── ai.service.ts  # Python AI Engine bridge
│   ├── match-analysis.service.ts  # 7-phase analysis orchestrator
│   └── scraper.service.ts  # Mackolik HTML scraping
└── tasks/  # Cron jobs (15min, 30min, daily)
    ├── data-fetcher.task.ts  # Live matches, odds fetching
    ├── live-updater.task.ts  # Score updates, match finalization
    └── limit-resetter.task.ts  # Usage limits, subscription expiry
ai-engine/ # Python FastAPI ML engine
├── main.py # FastAPI app, routes
├── services/ # single_match_orchestrator.py
├── core/ # Core algorithms
├── features/ # Feature engineering
├── models/ # ML models
├── training/ # Model training scripts
├── config/ # Configuration
├── utils/ # Utility functions
└── tests/ # Test files
```
---
## 4. Key Modules
### Auth Module
- Register, Login, Refresh, Logout endpoints
- bcrypt (12 rounds), JWT Access (15min) + Refresh Token (7 days, DB-stored)
- Global guards: `JwtAuthGuard`, `RolesGuard`, `PermissionsGuard`
### Predictions Module
- Requires Redis (`REDIS_ENABLED=true`), conditionally loaded
- BullMQ queue with worker processor
- 6-hour TTL cache on prediction results
- AI Engine call: `POST /v20plus/analyze/{matchId}`
### Coupons Module
- `SmartCouponService`: 5 strategies (SAFE ≥78% confidence/2 matches, BALANCED, AGGRESSIVE, VALUE EV+, MIRACLE)
- `UserCouponService`: Coupon creation, bet settlement (MS 1/X/2, Alt/Üst, KG Var/Yok)
### Feeder Module
- Historical scraping from 2023-06-01 to present (reverse chronological)
- Concurrency of 20, 300 ms delay, up to 50 retries, exponential backoff on HTTP 502 responses
- Resume support with state management
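The retry policy in the list above can be illustrated with a short sketch. This is Python for illustration only, not the feeder's actual TypeScript code: the function name `backoff_delays`, the doubling factor, the 30-second cap, and the use of the 300 ms delay as the backoff base are all assumptions. Only the 300 ms and 50-retry figures come from the list.

```python
from typing import Iterator

BASE_DELAY_S = 0.3   # 300 ms delay from the feeder settings (assumed to be the backoff base)
MAX_RETRIES = 50     # retry limit from the feeder settings
MAX_DELAY_S = 30.0   # assumed cap; the real feeder may use a different one


def backoff_delays(max_retries: int = MAX_RETRIES) -> Iterator[float]:
    """Yield the wait time before each retry, doubling until the cap is hit."""
    for attempt in range(max_retries):
        yield min(BASE_DELAY_S * (2 ** attempt), MAX_DELAY_S)


delays = list(backoff_delays())
print(delays[:5], delays[-1])
```

Capping the delay keeps a long run of 502 responses from stalling a worker for minutes at a time, at the cost of retrying a struggling upstream more often.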
### Analysis Module
- Usage limits: Free (10 analyses/3 coupons/day) vs Premium (50 analyses/10 coupons)
- 7-phase flow: URL Parse → Scrape → Python Engine → Strategy → Similar Matches → Final Prediction → DB Save
### Social Poster Module
- Twitter API v2 integration
- Canvas-based prediction card image generation
- Gemini-powered Turkish caption generation
---
## 5. Scheduled Tasks (Cron)
| Task | Schedule | Description |
| --------------------------- | -------------- | -------------------------------------------------------- |
| `fetchLiveMatches()` | `*/15 * * * *` | Fetch football matches from Mackolik API |
| `fetchOddsForPreMatches()` | `*/15 * * * *` | Fetch odds for upcoming matches (football + basketball) |
| `fetchBasketballMatches()` | Manual | Basketball data via `basketball_top_leagues.json` filter |
| `updateLiveScores()` | `*/15 * * * *` | Update live match scores |
| `finalizeFinishedMatches()` | `*/30 * * * *` | Migrate finished: live_matches → matches table |
| `resetUsageLimits()` | `0 3 * * *` | Reset daily usage limits (03:00 Istanbul time) |
| `cleanupOldData()` | `0 4 * * *` | Delete AI logs older than 30 days and finished live_matches older than 1 day |
| `checkSubscriptions()` | `0 0 * * *` | Mark expired subscriptions |
---
## 6. AI Engine (Python FastAPI)
Independent microservice on port 8000.
### Endpoints
| Method | Path | Description |
| ------ | ---------------------------------- | ------------------------------- |
| POST | `/v20plus/analyze/{match_id}` | Single match analysis (main) |
| GET | `/v20plus/analyze-htms/{match_id}` | First half - Full time analysis |
| GET | `/v20plus/analyze-htft/{match_id}` | HT/FT probabilities |
| POST | `/v20plus/coupon` | Smart coupon generation |
| GET | `/v20plus/daily-banker` | Daily banker picks |
| GET | `/v20plus/reversal-watchlist` | Score reversal watchlist |
| GET | `/health` | Health check |
### Output Structure (`SingleMatchPredictionPackage`)
```typescript
{
model_version: "v20plus.X",
match_info: { match_id, match_name, home_team, away_team, league, match_date_ms },
data_quality: { label: "HIGH"|"MEDIUM"|"LOW", score, flags, lineup_counts },
risk: { level: "LOW"|"MEDIUM"|"HIGH"|"EXTREME", score, is_surprise_risk, warnings },
main_pick: { market, pick, probability, confidence, odds, bet_grade, edge },
value_pick: { ... },
bet_advice: { playable, suggested_stake_units, reason },
bet_summary: [{ market, pick, raw_confidence, calibrated_confidence, bet_grade }],
supporting_picks: [...],
aggressive_pick: { market, pick, probability, confidence, odds },
scenario_top5: [{ score, prob }],
score_prediction: { ft, ht, xg_home, xg_away, xg_total },
market_board: { ... },
reasoning_factors: string[],
ai_commentary: string // Turkish commentary from Gemini
}
```
---
## 7. API Response Format
All responses follow this standard structure:
```json
{
"success": true,
"status": 200,
"message": "İşlem başarıyla tamamlandı", // i18n translated
"data": { ... },
"errors": []
}
```
**Critical Rule:** Controllers must NEVER return raw Prisma entities. Always use Response DTOs with `@Exclude()` and `@Expose()` from `class-transformer`.
---
## 8. Configuration
### Environment Variables
```env
NODE_ENV=development
PORT=3005
DATABASE_URL=postgresql://user:password@localhost:15432/boilerplate_db
JWT_SECRET=your-secret-key
JWT_ACCESS_EXPIRATION=15m
JWT_REFRESH_EXPIRATION=7d
REDIS_ENABLED=false
REDIS_HOST=localhost
REDIS_PORT=6379
AI_ENGINE_URL=http://127.0.0.1:8000
ENABLE_GEMINI=false
GOOGLE_API_KEY=your-api-key
```
### Config Files
- `top_leagues.json` — Football top league IDs (live match filter)
- `basketball_top_leagues.json` — Basketball top league IDs
- `bet-type.json` — Bet type definitions
---
## 9. Build & Run Commands
```bash
# Development
npm run start:dev # Watch mode (port 3005)
# Production
npm run build && npm run start:prod
# Feeder (Data Collection)
npm run feeder:historical # Historical scraping (2023-06→present)
npm run feeder:fill-gaps # Fill missing data
npm run feeder:basketball # Basketball data
npm run feeder:live # Live data
# Database
npx prisma generate # Regenerate Prisma client
npx prisma migrate dev # Run migrations
npx prisma db seed # Seed database
# Testing
npm run test # Unit tests
npm run test:e2e # E2E tests
npx jest src/path/to/file.spec.ts # Single test file
# Lint/Format
npm run lint # ESLint with Prettier
npm run format # Prettier write
# Docker
docker-compose up -d postgres redis # Infrastructure
docker-compose up -d # All services
# AI Engine (Python)
cd ai-engine && uvicorn main:app --host 0.0.0.0 --port 8000 --reload
# Utility
npm run swagger:summary # Export endpoint summary
npm run cleanup:live # Cleanup live matches
```
---
## 10. Code Style Guidelines
### Imports Order
```typescript
// 1. NestJS/common imports
import { Controller, Get, Post, Body } from '@nestjs/common';
// 2. External packages
import * as bcrypt from 'bcrypt';
// 3. Local imports (relative)
import { UsersService } from './users.service';
```
### Naming Conventions
- Classes/Interfaces: `PascalCase`
- Variables/Functions: `camelCase`
- Constants: `UPPER_SNAKE_CASE`
- Files: `kebab-case`
- DTOs: `Entity + Dto` suffix (CreateUserDto, UpdateUserDto)
### Types
- `strictNullChecks: true` — null/undefined checks required
- `noImplicitAny: false`: `any` allowed (Prisma dynamic access)
- Specify function return types: `async findOne(id: string): Promise<User>`
### Error Handling
```typescript
// Use NestJS HTTP Exceptions with i18n keys
throw new NotFoundException('USER_NOT_FOUND');
throw new ConflictException('EMAIL_ALREADY_EXISTS');
// Reference src/i18n/{lang}/errors.json for available keys
```
---
## 11. Known Issues & Gotchas
1. **Predictions module** requires Redis. Disabled when `REDIS_ENABLED=false`.
2. **Gemini AI** is optional. Returns `null` commentary when disabled.
3. **Global Exception Filter** wraps all errors as HTTP 200 (status in body).
4. **Lineup scraping** is disabled — only Team Stats are used (V20 optimization).
5. **Feeder V17 AI feature calculation** is disabled — V20 model runs in Python.
6. **BigInt serialization**: `BigInt.prototype.toJSON = function() { return this.toString(); }` polyfill in main.ts.
7. **i18n assets** copied via `nest-cli.json` `"assets": ["i18n/**/*"]` config.
---
## 12. Reference Files for AI Agents
When working on this project, consult:
- `project_summary.md` — Comprehensive project documentation (Turkish)
- `README.md` — Architecture decisions, quick start guide
- `prompt.md` — AI assistant reference guide with agent roles
- `AGENTS.md` — Coding guidelines, DTO patterns, test structure
- `.agent/` — Skills and agent role definitions
- `top_leagues.json` / `basketball_top_leagues.json` — League filters
---
## 13. Team Logos
Team logo URL template: `https://file.mackolikfeeds.com/teams/{teamId}`
---
## 14. 🆕 VQWEN Model Integration (Since 2026-04-06)
We have integrated a new high-performance prediction engine called **VQWEN v3**.
### VQWEN Model Features
- **Accuracy:** +244.4 Units profit in Time-Series Backtest (75.1% Win Rate on BTTS/Over markets).
- **Features Used:**
- `ELO Ratings` (Real-time team strength).
- `Contextual Goals` (Home/Away specific performance).
- `Rest Days` (Fatigue factor for teams with fewer than 3 days of rest).
- `H2H Win Rate` (Historical dominance).
- `Form Points` (Last 5 games streak).
- `Squad Strength` (Based on starting XI participation).
- **Files:**
- `ai-engine/scripts/train_vqwen_v3.py` — Training script.
- `ai-engine/services/single_match_orchestrator.py` — Integration point.
- `ai-engine/models/vqwen/` — Pickle models (`vqwen_ms.pkl`, etc.).
### New Live Lineup/Sidelined Fetcher
- **Problem:** `lineups` and `sidelined` columns in `live_matches` were empty.
- **Fix:** Added `updateLineupsAndSidelined()` method to `src/tasks/data-fetcher.task.ts`.
- **Mechanism:** Uses `FeederScraperService.fetchStartingFormation` directly via Cron (`*/15 * * * *`).
- **Status:** Active.
### Database Schema Updates
- **`substate` Column:** Added to `matches` table to track specific match states (e.g., "penalties", "overtime", "postponed").
- **Sport Partition:** Tables are now partitioned by sport (`football_team_stats` vs `basketball_team_stats`).
---
## 16. 🔍 HT/FT Reversal Analysis (Since 2026-04-07)
### HT/FT Reversal (1/2 & 2/1) Pattern Detection
Reversal matches (İY/MS = 1/2 or 2/1) are statistically rare events that can indicate match-fixing or unusual patterns.
#### Key Findings (147,248 matches analyzed)
| Metric | Value |
|--------|-------|
| **Total Reversal Matches** | 13,112 (8.90%) |
| **1/2 (Home leads HT, Away wins FT)** | 5,992 (4.07%) |
| **2/1 (Away leads HT, Home wins FT)** | 7,120 (4.84%) |
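A minimal sketch of how a match could be labelled as a reversal from its half-time and full-time scores. The helper names `outcome` and `reversal_type` are hypothetical and are not taken from the codebase. The last lines recompute the overall share from the counts in the table above.

```python
from typing import Optional


def outcome(home: int, away: int) -> str:
    """Return '1' (home ahead), 'X' (level) or '2' (away ahead)."""
    if home > away:
        return "1"
    if home < away:
        return "2"
    return "X"


def reversal_type(ht_home: int, ht_away: int, ft_home: int, ft_away: int) -> Optional[str]:
    """Return '1/2' or '2/1' for a reversal, otherwise None."""
    code = f"{outcome(ht_home, ht_away)}/{outcome(ft_home, ft_away)}"
    return code if code in ("1/2", "2/1") else None


# Overall share recomputed from the raw counts in the table above.
total, home_to_away, away_to_home = 147_248, 5_992, 7_120
print(f"{(home_to_away + away_to_home) / total:.2%}")  # 8.90%
```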
#### 🚨 Basketball Leagues Have Suspiciously High Reversal Rates
| League | Reversals | Total | Rate |
|--------|-----------|-------|------|
| Eurobasket U20 | 36 | 120 | **30.00%** 🔴 |
| EuroLeague 🏀 | 183 | 639 | **28.64%** 🔴 |
| PBA Commissioners 🏀 | 54 | 189 | **28.57%** 🔴 |
| Ulusal Süper Lig 🏀 | 148 | 547 | **27.06%** 🔴 |
| NBA 🏀 | 656 | 2,696 | **24.33%** 🔴 |
**All top 15 leagues by reversal rate are BASKETBALL.** Football leagues show normal rates (5-8%).
#### Suspicious Patterns
1. **Comeback Magnitude:**
- 1 goal/point: 36.1% (normal)
- 2 goals/points: 13.1% (suspicious)
- **3+ goals/points: 50.8%** 🔴 **EXTREMELY HIGH**
2. **Extreme Comebacks (Basketball):**
- Mineros vs Irapuato: HT 39-45 → FT 102-61 (41 point swing!)
- Utah vs Memphis: HT 65-64 → FT 103-140 (37 point swing!)
- These are statistically near-impossible without manipulation
3. **Favorite Loss Rate:**
- 42.7% of reversals had the pre-match favorite lose (should be ~25-30%)
#### Impact on Model
- HT/FT model accuracy: **20.3%** (low due to reversal noise)
- Basketball reversal data creates **training noise**
- **Recommendation:** Either exclude basketball from HT/FT training or train separate basketball-specific model
#### HT/FT Model Files
- **Training script:** `ai-engine/scripts/train_htft_vqwen.py`
- **Model output:** `ai-engine/models/xgboost/xgb_ht_ft.json` + `.pkl`
- **Features:** 27 (Odds + HT/FT Tendencies + League stats)
- **Status:** Working, outputs 9-class probabilities in `market_board.HTFT.probs`
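As a rough illustration of consuming the 9-class output, the sketch below sums the two reversal classes. The label format (`"1/1"`, `"1/X"`, ...) and the helper name `reversal_probability` are assumptions; the actual keys under `market_board.HTFT.probs` may differ.

```python
from typing import Dict

# 9 HT/FT labels: half-time result x full-time result (assumed key format).
HTFT_CLASSES = [f"{ht}/{ft}" for ht in "1X2" for ft in "1X2"]


def reversal_probability(probs: Dict[str, float]) -> float:
    """Sum the probabilities of the two reversal classes, 1/2 and 2/1."""
    return probs.get("1/2", 0.0) + probs.get("2/1", 0.0)


# Placeholder probabilities for the demo, not real model output.
uniform = {label: 1 / 9 for label in HTFT_CLASSES}
print(len(HTFT_CLASSES), round(reversal_probability(uniform), 4))
```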
---
## 17. 🐛 Lineup Parsing Fix (Since 2026-04-07)
### Problem
AI Engine reported `"lineup_unavailable"` and `"lineup_incomplete"` flags even when `live_matches.lineups` contained full 11/11 lineup data from Mackolik.
### Root Cause
Mackolik stores lineups in `"stats"` key format:
```json
{
"stats": {
"home": [{ "personId": "...", "position": "...", ... }, ...],
"away": [{ "personId": "...", "position": "...", ... }, ...]
}
}
```
But the parser expected `"xi"`, `"starting"`, or `"lineup"` keys at root level.
### Fix
Updated `_parse_lineups_json()` in `ai-engine/services/single_match_orchestrator.py`:
- Added fallback to check `lineups_json.get("stats")` for home/away arrays
- Now correctly parses Mackolik's nested format
- Result: `home_lineup_count: 11`, `away_lineup_count: 11`, `lineup_source: "confirmed_live"`
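The fallback described above might look roughly like the following. This is a sketch, not the real `_parse_lineups_json()`: the function name `parse_lineups`, the return shape, and the assumption that the root-level keys also hold `home`/`away` arrays are all hypothetical.

```python
import json
from typing import Any, List, Tuple

# Root-level keys the parser already handled, per the description above.
ROOT_KEYS = ("xi", "starting", "lineup")


def parse_lineups(raw: Any) -> Tuple[List[dict], List[dict]]:
    """Return (home_players, away_players) from a lineups payload."""
    data = json.loads(raw) if isinstance(raw, str) else raw
    if not isinstance(data, dict):
        return [], []
    # Try the original keys first, then fall back to Mackolik's "stats" key.
    for key in ROOT_KEYS + ("stats",):
        block = data.get(key)
        if isinstance(block, dict):
            home, away = block.get("home"), block.get("away")
            if isinstance(home, list) and isinstance(away, list):
                return home, away
    return [], []


payload = {
    "stats": {
        "home": [{"personId": str(i)} for i in range(11)],
        "away": [{"personId": str(i)} for i in range(11)],
    }
}
home, away = parse_lineups(payload)
print(len(home), len(away))  # 11 11
```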
---
## 18. Docker Deployment
```yaml
# docker-compose.yml services:
services:
  app: # NestJS (port 3000→3000)
  postgres: # PostgreSQL 17 Alpine (port 15432:5432)
  redis: # Redis 7 Alpine (port 6379)
  adminer: # Database UI (dev profile, port 8080)
  ai-engine: # Python FastAPI (port 8002:8000)
```
---
_This file is maintained for AI agent context. Update when architecture or conventions change._
+337
@@ -0,0 +1,337 @@
# 🚀 Enterprise NestJS Boilerplate (Antigravity Edition)
[![NestJS](https://img.shields.io/badge/NestJS-E0234E?style=for-the-badge&logo=nestjs&logoColor=white)](https://nestjs.com/)
[![TypeScript](https://img.shields.io/badge/TypeScript-3178C6?style=for-the-badge&logo=typescript&logoColor=white)](https://www.typescriptlang.org/)
[![Prisma](https://img.shields.io/badge/Prisma-2D3748?style=for-the-badge&logo=prisma&logoColor=white)](https://www.prisma.io/)
[![PostgreSQL](https://img.shields.io/badge/PostgreSQL-4169E1?style=for-the-badge&logo=postgresql&logoColor=white)](https://www.postgresql.org/)
[![Docker](https://img.shields.io/badge/Docker-2496ED?style=for-the-badge&logo=docker&logoColor=white)](https://www.docker.com/)
> **FOR AI AGENTS & DEVELOPERS:** This documentation is structured to provide deep context, architectural decisions, and operational details to ensure seamless handover to any AI coding assistant (like Antigravity) or human developer.
---
## 🧠 Project Context & Architecture (Read Me First)
This is an **opinionated, production-ready** backend boilerplate built with NestJS. It is designed to be scalable, type-safe, and fully localized.
### 🏗️ Core Philosophy
- **Type Safety First:** Strict TypeScript configuration. `any` is forbidden. DTOs are the source of truth.
- **Generic Abstraction:** `BaseService` and `BaseController` handle 80% of CRUD operations, allowing developers to focus on business logic.
- **i18n Native:** Localization is not an afterthought. It is baked into the exception filters, response interceptors, and guards.
- **Security by Default:** JWT Auth, RBAC (Role-Based Access Control), Throttling, and Helmet are pre-configured.
### 📐 Architectural Decision Records (ADR)
_To understand WHY things are the way they are:_
1. **Handling i18n Assets:**
- **Problem:** Translation JSON files are not TypeScript code, so `tsc` ignores them during build.
- **Solution:** We configured `nest-cli.json` with `"assets": ["i18n/**/*"]`. This ensures `src/i18n` is copied to `dist/i18n` automatically.
- **Note:** When running with `node`, ensure `dist/main.js` can find these files.
2. **Global Response Wrapping:**
- **Mechanism:** `ResponseInterceptor` wraps all successful responses.
- **Feature:** It automatically translates the "Operation successful" message based on the `Accept-Language` header using `I18nService`.
- **Output Format:**
```json
{
"success": true,
"status": 200,
"message": "İşlem başarıyla tamamlandı", // Translated
"data": { ... }
}
```
3. **Centralized Error Handling:**
- **Mechanism:** `GlobalExceptionFilter` catches all `HttpException` and unknown `Error` types.
- **Feature:** It accepts error keys (e.g., `AUTH_REQUIRED`) and translates them using `i18n`. If a translation is found in `errors.json`, it is returned; otherwise, the original message is shown.
4. **UUID Generation:**
- **Decision:** We use Node.js native `crypto.randomUUID()` instead of the external `uuid` package to avoid CommonJS/ESM compatibility issues.
---
## 🚀 Quick Start for AI & Humans
### 1. Prerequisites
- **Node.js:** v20.19+ (LTS)
- **Docker:** For running PostgreSQL and Redis effortlessly.
- **Package Manager:** `npm` (Lockfile: `package-lock.json`)
### 2. Environment Setup
```bash
cp .env.example .env
# ⚠️ CRITICAL: Ensure DATABASE_URL includes the username!
# Example: postgresql://postgres:password@localhost:15432/boilerplate_db
# Required for v20 prediction flow:
# AI_ENGINE_URL=http://127.0.0.1:8000
```
### 3. Installation & Database
```bash
# Install dependencies
npm ci
# Start Infrastructure (Postgres + Redis)
docker-compose up -d postgres redis
# Generate Prisma Client (REQUIRED after install)
npx prisma generate
# Run Migrations
npx prisma migrate dev
# Seed Database (Optional - Creates Admin & Roles)
npx prisma db seed
```
### 4. Running the App
```bash
# Debug Mode (Watch) - Best for Development
npm run start:dev
# Production Build & Run
npm run build
npm run start:prod
```
---
## 🛡️ Response Standardization & Type Safety Protocol
This boilerplate enforces a strict **"No-Leak"** policy for API responses to ensure both Security and Developer Experience.
### 1. The `unknown` Type is Forbidden
- **Rule:** Controllers must NEVER return `ApiResponse<unknown>` or raw Prisma entities.
- **Why:** Returning raw entities risks exposing sensitive fields like `password` hashes or internal metadata. It also breaks contract visibility for frontend developers.
### 2. DTO Pattern & Serialization
- **Tool:** We use `class-transformer` for all response serialization.
- **Implementation:**
- All Response DTOs must use `@Exclude()` class-level decorator.
- Only fields explicitly marked with `@Expose()` are returned to the client.
- Controllers use `plainToInstance(UserResponseDto, data)` before returning data.
**Example:**
```typescript
// ✅ Good: Secure & Typed
@Get('me')
async getMe(@CurrentUser() user: User): Promise<ApiResponse<UserResponseDto>> {
  return createSuccessResponse(plainToInstance(UserResponseDto, user));
}
// ❌ Bad: Leaks password hash & Weak Types
@Get('me')
async getMe(@CurrentUser() user: User) {
  return createSuccessResponse(user);
}
```
---
## ⚡ High-Performance Caching (Redis Strategy)
To ensure enterprise-grade performance, we utilize **Redis** for caching frequently accessed data (e.g., Roles, Permissions).
- **Library:** `@nestjs/cache-manager` with `cache-manager-redis-yet` (Supports Redis v6+ / v7).
- **Configuration:** Global Cache Module in `AppModule`.
- **Strategy:** Read-heavy endpoints use `@UseInterceptors(CacheInterceptor)`.
- **Invalidation:** Write operations (Create/Update/Delete) manually invalidate relevant cache keys.
**Usage:**
```typescript
// 1. Automatic Caching
@Get('roles')
@UseInterceptors(CacheInterceptor)
@CacheKey('roles_list') // Unique Key
@CacheTTL(60000) // TTL in milliseconds (60 seconds)
async getAllRoles() { ... }
// 2. Manual Invalidation (Inject CACHE_MANAGER)
async createRole(...) {
  // ... create role logic
  await this.cacheManager.del('roles_list'); // Clear cache
}
```
---
## 🤖 Gemini AI Integration (Optional)
This boilerplate includes an **optional** AI module powered by Google's Gemini API. It's disabled by default and can be enabled during CLI setup or manually.
### Configuration
Add these to your `.env` file:
```env
# Enable Gemini AI features
ENABLE_GEMINI=true
# Your Google API Key (get from https://aistudio.google.com/apikey)
GOOGLE_API_KEY=your-api-key-here
# Model to use (optional, defaults to gemini-2.5-flash)
GEMINI_MODEL=gemini-2.5-flash
```
### Usage
The `GeminiService` is globally available when enabled:
```typescript
import { Injectable } from '@nestjs/common';
import { GeminiService } from './modules/gemini';

@Injectable()
export class MyService {
  constructor(private readonly gemini: GeminiService) {}

  async generateContent() {
    // Check if Gemini is available
    if (!this.gemini.isAvailable()) {
      throw new Error('AI features are not enabled');
    }

    // 1. Simple Text Generation
    const { text, usage } = await this.gemini.generateText(
      'Write a product description for a coffee mug',
    );

    // 2. With System Prompt & Options
    // (renamed on destructuring because `text` is already declared in this scope)
    const { text: translation } = await this.gemini.generateText(
      'Translate: Hello World',
      {
        systemPrompt: 'You are a professional Turkish translator',
        temperature: 0.3,
        maxTokens: 500,
      },
    );

    // 3. Multi-turn Chat
    const { text: reply } = await this.gemini.chat([
      { role: 'user', content: 'What is TypeScript?' },
      {
        role: 'model',
        content: 'TypeScript is a typed superset of JavaScript...',
      },
      { role: 'user', content: 'Give me an example' },
    ]);

    // 4. Structured JSON Output
    interface ProductData {
      name: string;
      price: number;
      features: string[];
    }
    const { data } = await this.gemini.generateJSON<ProductData>(
      'Generate a product entry for a wireless mouse',
      '{ name: string, price: number, features: string[] }',
    );
    console.log(data.name, data.price); // Fully typed!
  }
}
```
### Available Methods
| Method | Description |
| ------------------------------------------- | ------------------------------------------------ |
| `isAvailable()` | Check if Gemini is properly configured and ready |
| `generateText(prompt, options?)` | Generate text from a single prompt |
| `chat(messages, options?)` | Multi-turn conversation |
| `generateJSON<T>(prompt, schema, options?)` | Generate and parse structured JSON |
### Options
```typescript
interface GeminiGenerateOptions {
model?: string; // Override default model
systemPrompt?: string; // System instructions
temperature?: number; // Creativity (0-1)
maxTokens?: number; // Max response length
}
```
## 🌍 Internationalization (i18n) Guide
Unique to this project is the deep integration of `nestjs-i18n`.
- **Location:** `src/i18n/{lang}/`
- **Files:**
- `common.json`: Generic messages (success, welcome)
- `errors.json`: Error codes (AUTH_REQUIRED, USER_NOT_FOUND)
- `validation.json`: Validation messages (IS_EMAIL)
- `auth.json`: Auth specific success messages (LOGIN_SUCCESS)
**How to Translate a New Error:**
1. Throw an exception with a key: `throw new ConflictException('EMAIL_EXISTS');`
2. Add `"EMAIL_EXISTS": "Email already taken"` to `src/i18n/en/errors.json`.
3. Add Turkish translation to `src/i18n/tr/errors.json`.
4. Start server; the `GlobalExceptionFilter` handles the rest.
---
## 🧪 Testing & CI/CD
- **GitHub Actions:** `.github/workflows/ci.yml` handles build and linting checks on push.
- **Local Testing:**
```bash
npm run test # Unit tests
npm run test:e2e # End-to-End tests
```
---
## 📂 System Map (Directory Structure)
```
src/
├── app.module.ts  # Root module (Redis, Config, i18n setup)
├── main.ts  # Entry point
├── common/  # Shared resources
│   ├── base/  # Abstract BaseService & BaseController (CRUD)
│   ├── types/  # Interfaces (ApiResponse, PaginatedData)
│   ├── filters/  # Global Exception Filter
│   └── interceptors/  # Response Interceptor
├── config/  # Application configuration
├── database/  # Prisma Service
├── i18n/  # Localization assets
└── modules/  # Feature modules
    ├── admin/  # Admin capabilities (Roles, Permissions + Caching)
    │   ├── admin.controller.ts
    │   └── dto/  # Admin Response DTOs
    ├── auth/  # Authentication layer
    ├── gemini/  # 🤖 Optional AI module (Google Gemini)
    ├── health/  # Health checks
    └── users/  # User management
```
---
## 🛠️ Troubleshooting (Known Issues)
**1. `EADDRINUSE: address already in use`**
- **Fix:** `lsof -ti:3000 | xargs kill -9`
**2. `PrismaClientInitializationError` / Database Connection Hangs**
- **Fix:** Check `.env` `DATABASE_URL`. Ensure `docker-compose up` is running.
**3. Cache Manager Deprecation Warnings**
- **Context:** `cache-manager-redis-yet` may show deprecation warnings regarding `Keyv`. This is expected as we wait for the ecosystem to stabilize on `cache-manager` v6/v7. The current implementation is fully functional.
---
## 📃 License
This project is proprietary and confidential.
+43
@@ -0,0 +1,43 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.egg-info/
*.egg
dist/
build/
.eggs/
# Virtual environment
venv/
.venv/
env/
# IDE
.idea/
.vscode/
*.swp
*.swo
# OS
.DS_Store
Thumbs.db
# Environment
.env
.env.*
# Test & Coverage
.pytest_cache/
htmlcov/
.coverage
*.cover
# Logs
*.log
# Training data (large CSVs)
data/training_data*.csv
# Reports (generated at runtime)
reports/
+313
@@ -0,0 +1,313 @@
# IDDAAI: Betting Engine Operations Workflow (V31d)
> This document is the operations guide for **how to run, validate, monitor,
> and re-tune** the AI betting prediction engine.
> Goal: **both volume and profit**. The realistic expectation is **+30% ROI in the premium tier**
> and +5% to +15% across the wider net.
>
> Last updated: 2026-05-29 · Judge version: `judge-v31d-evidence-tiers`
>
> **What V31d changed (fixing the volume crisis):** V31c produced only **28 playable
> bets per 10k matches** because two vetoes (`calibrated_confidence_too_low`,
> `play_score_too_low`) rejected EVERY underdog. Both are FAVORITE-picking rules
> that demand more than 45% model confidence. But a profitable underdog at odds of 6.5
> only wins about 20% of the time anyway; the profit comes from the odds premium.
> V31d lifts these two vetoes for picks that match an **MS (match result) value tier**
> and derives the score from tier quality. Result (60-day validation):
> **28 → 602 playable bets (22x), 1.6u → +39.4u, ROI 28% → +32.7%.**
> The full analysis output (market_board, v25/v27, triple_value, probabilities)
> is **preserved as-is**; only the `playable` flag changes.
---
## 0. TL;DR: The 5 Most Important Rules
1. **PLAY SINGLE BETS ONLY. NO COMBOS.** Proven mathematically:
1-leg `+3.4%` → 2-leg `-32%` → 3-leg `-67%` → 4-leg `-83%`. Multiplying marginal +EV legs
wipes out the gain.
2. **The real profit is in the MS (1X2) underdog zone.** Odds ≥ 6.0 + model_gap ≥ 0 = highest ROI.
3. **No market is muted.** The tier system does the filtering; the real ROIs stay visible
(`MUTED_MARKETS = set()`).
4. **Calibration ≠ betting signal.** MS tiers use the raw model probability
(`model_gap`, `ev_edge`). The isotonic calibrators only affect the on-screen `calibrated_confidence`
(inflated for BTTS/OU25, so be careful).
5. **Do not trust the backtest blindly.** Know the model's training cutoff date and always keep
in-sample and out-of-sample results separate (see Section 6).
---
## 1. System Architecture (Pipeline)
```
Match data (DB: matches, odds, elo, form, h2h…)
[V25 Ensemble] XGBoost + LightGBM + CatBoost → raw probability for each market
[V27 Dual-Engine] second opinion / consensus (AGREE / DISAGREE)
[Isotonic Calibration] raw probability → calibrated_confidence (for display)
 └─ light damping (×0.92) on markets WITHOUT a calibrator
[BettingBrain V31d: Deterministic Judge]
 ├─ ev_edge = calibrated_probability × odds − 1 (raw-prob + market blend)
 ├─ model_gap = raw_model_probability − implied_prob
 ├─ trap_market = has the market priced this above its historical band?
 ├─ odds_reliability = from the league-level historical Brier score
 └─ MARKET_ODDS_TIERS → value_tier (premium/strong/standard) → bet_grade (A/B/C)
[Output] bet_summary[] → playable, value_tier, stake_units, bet_grade
 → BE (smart-coupon) → FE / Mobile
```
**Key files:**
- `services/betting_brain.py`: deterministic judge, tier definitions (`MARKET_ODDS_TIERS`)
- `services/orchestrator/market_board.py`: ev_edge/model_gap/calibration calculations
- `scripts/diagnostic_backtest_multi.py`: multi-pick backtest (ALL markets per match)
- `models/v25/`, `models/calibration/`: model and calibrator files
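As a rough illustration of the two edge metrics named in the pipeline, the sketch below computes `ev_edge` and `model_gap` for a single pick. The function names and the sample probability are assumptions made for illustration; the real calculation lives in `services/orchestrator/market_board.py` and also involves a raw-prob + market blend that is not reproduced here.

```python
def implied_probability(odds: float) -> float:
    """Bookmaker-implied probability of a decimal price (margin not removed)."""
    return 1.0 / odds


def ev_edge(probability: float, odds: float) -> float:
    """Expected profit per unit staked: p * odds - 1."""
    return probability * odds - 1.0


def model_gap(raw_probability: float, odds: float) -> float:
    """How far the model's raw probability sits above the implied probability."""
    return raw_probability - implied_probability(odds)


# Hypothetical underdog: the model gives 19% at decimal odds 6.5.
sample_probability, sample_odds = 0.19, 6.5
sample_ev = ev_edge(sample_probability, sample_odds)
sample_gap = model_gap(sample_probability, sample_odds)
print(f"ev_edge={sample_ev:.3f} model_gap={sample_gap:.3f}")
```

The point of the example is that a pick can be positive on both metrics while winning less than one time in five, which is why a confidence floor tuned for favorites rejects it.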
---
## 2. V31d: Evidence-Based Tiered Value System
The user chooses according to risk appetite. Each tier produces its own signal per match.
**Only premium is staked automatically (BET); strong/standard appear as WATCH**
(the full analysis is shown, but the bet is not placed) because the 60-day data says
those bands are roughly break-even.
| Tier | Grade | Odds band | Filter | 60d ROI* | Action | Character |
|------|:----:|-----------|--------|:----:|:----:|----------|
| **premium** | A | **6.00–7.50** | model_gap ≥ 0, rel ≥ 0.30 | **+32.7%** | **BET** | Validated edge; ~20% hit rate, high variance |
| **strong** | B | 5.00–6.00 | model_gap ≥ 0, rel ≥ 0.30 | ~1% (break-even) | WATCH | Visible, not played (insufficient evidence) |
| **standard** | C | 3.00–5.00 | model_gap ≥ 0, rel ≥ 0.30 | +0.5% (break-even) | WATCH | Volume zone, no margin |
| info (-) | - | market-specific | ultrastrict (min_edge ≥ 0.02, rel ≥ 0.45-0.55, no trap) | ~0 | REJECT/info | Informational, rarely passes |
\* From the 60-day validation (72,582 settled rows, 7,793 matches, 2026-04-17..05-28;
`ms_envelope.py` + `new_gate_sim.py`). premium: 602 bets, +32.7% ROI, +39.4u,
20.6% hit rate, **all 6 of 6 weeks positive**, OOS (>05-24) +47.4%.
**WHY 6.0–7.5 (not V31c's 6.0–50.0):** the edge is concentrated in a narrow band.
`6.0–7.0 +35%` · `7.0–8.0 ~break-even` · **`8.0+ NEGATIVE`** (−10% to −26%, the longshot graveyard).
The old wide premium tier let losing longshots in. Above 7.5 the model's
edge evaporates.
**Design rationale:** premium = the ROI **and** volume engine (~14 bets/day over the 60 days, plenty of volume).
The bettor:
- wants **low risk / high quality** → play only **premium (A)** (the default).
- wants **more volume** → widen the premium band to 6.0–8.0 (ROI +32.7% → +19%,
still solid, +44% volume) by changing the premium `max_odds` in `MARKET_ODDS_TIERS["MS"]`.
**Non-MS markets (DC, OU25, OU35, BTTS, HT, OU15, HTFT, OE, HT_OU05, HT_OU15, CARDS):**
all are informational under a single `ultrastrict` tier. Because they lost systematically
in historical data, producing a BET from them was made harder (NO mute, just strict thresholds).
**Veto logic (critical in V31d):** for value-tier matches the `calibrated_confidence_too_low`
and `play_score_too_low` vetoes are LIFTED (they are favorite-picking rules). The real
protective vetoes stay ACTIVE: `extreme_negative_ev` (ev < −0.20), `ev_edge_too_high_trap`
(ev ≥ 0.30), `htft_reversal_risk_high`, `v25_v27_hard_disagreement`, `low_reliability_hard`.
Over the 60 days ~71% of premium tier matches were playable; the remaining ~29% were
correctly rejected by these protective vetoes.
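The tier table above can be sketched as a small lookup. This is a minimal sketch under stated assumptions, not the code in `services/betting_brain.py`: the `MS_TIERS` list, its `min_odds` key, the half-open band boundaries, and the helper names are hypothetical (the document only confirms that `MARKET_ODDS_TIERS["MS"]` entries carry `value_tier` and that premium has a `max_odds`).

```python
from typing import Optional

# Hypothetical mirror of the documented MS bands (not the real MARKET_ODDS_TIERS).
MS_TIERS = [
    {"value_tier": "premium", "bet_grade": "A", "min_odds": 6.0, "max_odds": 7.5},
    {"value_tier": "strong", "bet_grade": "B", "min_odds": 5.0, "max_odds": 6.0},
    {"value_tier": "standard", "bet_grade": "C", "min_odds": 3.0, "max_odds": 5.0},
]


def match_value_tier(odds: float, model_gap: float, reliability: float,
                     min_gap: float = 0.0, min_rel: float = 0.30) -> Optional[dict]:
    """Return the first tier whose band contains the odds, or None if filtered out."""
    if model_gap < min_gap or reliability < min_rel:
        return None
    for tier in MS_TIERS:
        # Assumption: bands are half-open, [min_odds, max_odds).
        if tier["min_odds"] <= odds < tier["max_odds"]:
            return tier
    return None


def action_for(tier: Optional[dict]) -> str:
    """Only premium is staked; the other tiers are shown but not played."""
    if tier is None:
        return "REJECT"
    return "BET" if tier["value_tier"] == "premium" else "WATCH"


print(action_for(match_value_tier(odds=6.5, model_gap=0.03, reliability=0.40)))
```

The protective vetoes (`extreme_negative_ev`, `ev_edge_too_high_trap`, and the rest) would run after this lookup and are deliberately left out of the sketch.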
---
## 3. BEST BET VALUES: Definitive Ranking
> The answer to "which is the best of all bet values in multi-pick betting".
> **All are played as SINGLES.** (The ROIs below come from a flat 0.2u stake simulation.)
### MS (1X2) underdog: fine-grained odds-band map (60d, gap ≥ 0)
> The definitive answer to "which bet hits at which odds". Source: `ms_envelope.py`.
> drop-3/5 = ROI after removing the 3/5 largest wins (a concentration/robustness test).
| Odds band | Bets | Hit% | ROI | drop-3 ROI | Verdict |
|-----------|------:|-----:|----:|-----:|:-----:|
| **6.0–6.5** | 469 | 22.0% | **+37.7%** | +34.4% | ✅ elite |
| **6.0–7.0** | 492 | 21.5% | **+35.2%** | +29.9% | ✅ elite, robust |
| **6.0–7.5** (premium) | 645 | 20.0% | **+29.3%** | +24.4% | ✅ RECOMMENDED |
| 6.0–8.0 | 928 | 17.7% | +19.1% | +15.5% | ✅ volume option |
| 7.5–8.0 | 283 | 12.4% | −4.0% | - | ❌ |
| 8.0–9.0 | 78 | 9.0% | −25.7% | - | ❌ longshot |
| 9.0+ | ~266 | <10% | negative | - | ❌ graveyard |
| 5.0–6.0 (strong) | ~1000 | 18% | ~1% | - | ⚠️ break-even → WATCH |
| 3.0–5.0 (standard) | ~5745 | 27% | +0.5% | - | ⚠️ break-even → WATCH |
**Guarded premium (htft/disagreement vetoes applied) = the staked set:**
602 bets · 20.6% hit rate · **+32.7% ROI** · +39.4u · 6/6 weeks positive · OOS +47.4%.
**Reading:** The edge sits entirely in the **6.0–7.5** band. Longshots above 8.0 lose
(which is why the old 6.0–50.0 premium tier was diluted). Below 5.0 is break-even.
Premium alone is ~14 bets/day = both volume and +32.7% ROI.
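The drop-3/5 robustness column can be approximated as follows. This is a sketch of one plausible reading, not the logic of `ms_envelope.py`: it assumes "removing the largest wins" means dropping those bets entirely (profit and stake), and the function names and sample bets are made up for illustration.

```python
from typing import List, Tuple

Bet = Tuple[float, bool]  # (decimal odds, won?)


def roi(bets: List[Bet], stake: float = 0.2) -> float:
    """Flat-stake ROI: total profit divided by total amount staked."""
    if not bets:
        return 0.0
    profit = sum(stake * (odds - 1.0) if won else -stake for odds, won in bets)
    return profit / (stake * len(bets))


def drop_k_roi(bets: List[Bet], k: int, stake: float = 0.2) -> float:
    """ROI after removing the k winning bets with the highest payout."""
    biggest_wins = sorted((b for b in bets if b[1]), key=lambda b: b[0], reverse=True)[:k]
    remaining = list(bets)
    for bet in biggest_wins:
        remaining.remove(bet)  # removes one matching entry per dropped win
    return roi(remaining, stake)


# Made-up sample: two wins, three losses.
sample_bets = [(6.5, True), (7.0, True), (6.2, False), (6.8, False), (6.1, False)]
print(f"ROI={roi(sample_bets):+.1%}  drop-1 ROI={drop_k_roi(sample_bets, 1):+.1%}")
```

If ROI stays clearly positive after dropping the top wins, the edge is not carried by a handful of lucky payouts, which is the property the table's drop-3 column is checking.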
### ❌ Configurations that do NOT work
- **Parlays (combos):** every extra leg collapses ROI (see the TL;DR above).
- **MS 8.0+ longshots:** −10% to −26% ROI, no model edge.
- **MS 5.0–6.0 / 3.0–5.0:** break-even; show as WATCH, do not stake.
- **OU25 in every configuration:** systematic loss (over the 60 days OU25 −22.8%, OU35 −17.2%).
- **BTTS:** marginal, and only at very high reliability.
---
## 4. CRITICAL RULE: Singles Only, No Parlays
| Ticket type | Hit% | ROI | Result |
|-----------|-----:|----:|:-----:|
| 1-leg (single) | ~24% | **+3.4%** | ✅ |
| 2-leg | low | −32.4% | ❌ |
| 3-leg | very low | −66.6% | ❌ |
| 4-leg | minimal | −83.0% | ❌ |
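The shrinking hit rate in the table can be illustrated with a few lines of arithmetic. The sketch assumes independent legs that each hit at the single-bet rate of ~24%; that is a simplification, and the ROI figures in the table are backtest results that this calculation does not reproduce.

```python
def parlay_hit_probability(leg_hit_rate: float, legs: int) -> float:
    """Probability that every leg wins, assuming independent legs."""
    return leg_hit_rate ** legs


for legs in range(1, 5):
    hit = parlay_hit_probability(0.24, legs)
    print(f"{legs}-leg: hit {hit:.2%}, lose {1.0 - hit:.2%}")
```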
**Why:** Individual legs are only marginally +EV. A parlay multiplies the win
probabilities (each <1), so the chance of losing the whole ticket grows exponentially
with the number of legs. Flat single stakes are mathematically superior. **The product
should not nudge users toward parlays;** it should present "today's premium single values" instead.
---
## 5. Recommended Stake Policy
- **Flat stake** (fixed unit), not Kelly. With a marginal edge, Kelly blows up variance.
- **premium (A): fixed 0.5u** (`VALUE_TIER_STAKE_UNITS`). It is kept SMALL because of the ~20% hit rate and long losing streaks
(the longest in the 60 days was 35 consecutive losses); the profit comes from **frequency**,
not from size per bet. It can be raised if bankroll and risk appetite allow.
- strong/standard WATCH = NO stake (visible but not played).
- One signal per day/match; betting the same match from multiple tiers is correlation risk,
so pick the highest value_tier.
- **Drawdown warning:** at 0.5u the worst historical drawdown is ≈ 34u; 35 consecutive losses are possible.
This is a marathon strategy: do not change stakes based on short-term results.
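To track the two risk figures quoted above (worst drawdown and longest losing streak) on real results, something like the following could be used. It is a minimal sketch: the function name and input format are assumptions, and it models flat staking only.

```python
from typing import Iterable, Tuple


def flat_stake_risk(results: Iterable[bool], odds: Iterable[float],
                    stake: float = 0.5) -> Tuple[float, float, int]:
    """Return (final profit, max drawdown, longest losing streak); profit and drawdown are in units."""
    equity = peak = max_drawdown = 0.0
    streak = longest_streak = 0
    for won, price in zip(results, odds):
        if won:
            equity += stake * (price - 1.0)
            streak = 0
        else:
            equity -= stake
            streak += 1
            longest_streak = max(longest_streak, streak)
        peak = max(peak, equity)
        max_drawdown = max(max_drawdown, peak - equity)
    return equity, max_drawdown, longest_streak


# Made-up sequence: lose, lose, win at 6.5, lose.
profit, drawdown, streak = flat_stake_risk([False, False, True, False], [6.5] * 4)
print(profit, drawdown, streak)
```

Drawdown is measured from the running equity peak, so it captures slow bleeds as well as unbroken losing runs.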
---
## 6. Backtest Methodology & Leakage Discipline ⚠️
**The most critical section. If the backtest numbers are misread, the system is assumed profitable and then loses money.**
### 6.1 Command
```bash
# Inside the container:
python scripts/diagnostic_backtest_multi.py --days 60 --max-matches 10000 \
--progress-interval 100 --checkpoint-every 200
# Output: reports/multi_backtest_YYYYMMDD.{csv,json,txt}
# Checkpointed → if interrupted, it resumes where it left off.
```
### 6.2 Lookahead / leakage check (MANDATORY)
- **Feature lookahead:** ✅ clean. Features are computed from data BEFORE match_date.
- **Model training-set membership:** ALWAYS check this. Calibrators are fit on the
last ~5000 matches as of the `last_trained` date in `models/calibration/*_metrics.json`.
If the backtest window overlaps that date, **calibrated_confidence
is in-sample (inflated)**.
- **Practical test (cheap):** Split the backtest result in two at the training cutoff date and
compare in-sample vs out-of-sample hit%. If the all-market hit% is **nearly identical**
(e.g. 49.7% vs 49.4%) → NO meaningful leakage in the base models, the edge is real.
If hit% **jumps suddenly** on older data → that period is in the training set, so disregard its ROI.
- Ready-made script: `/tmp/leakage_split.py <csv>` (splits by training date).
- **How far back can you go?** The models exclude the most recent holdout window (≈ last
10k matches ≈ 60-70 days) from training. So **a backtest going ~60 days back
is mostly clean holdout.** Going further back (90+ days) can reach into the training set and
make ROI look artificially good → avoid.
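The cheap in-sample vs out-of-sample split described above could look roughly like this. It is a sketch, not the contents of `/tmp/leakage_split.py`: the column names `match_date` and `won`, the function name, and the sample rows are all assumptions.

```python
import csv
import io
from datetime import date
from typing import Dict, Optional


def split_hit_rates(csv_text: str, cutoff: date) -> Dict[str, Optional[float]]:
    """Hit rate on or before the training cutoff vs strictly after it."""
    buckets = {"in_sample": [0, 0], "out_of_sample": [0, 0]}  # [wins, total]
    for row in csv.DictReader(io.StringIO(csv_text)):
        match_day = date.fromisoformat(row["match_date"])
        key = "in_sample" if match_day <= cutoff else "out_of_sample"
        buckets[key][0] += int(row["won"])
        buckets[key][1] += 1
    return {k: (wins / total if total else None) for k, (wins, total) in buckets.items()}


# Made-up rows; a real run would read reports/multi_backtest_YYYYMMDD.csv instead.
SAMPLE_CSV = """match_date,won
2026-05-20,1
2026-05-22,0
2026-05-26,1
2026-05-27,0
2026-05-28,0
"""
rates = split_hit_rates(SAMPLE_CSV, cutoff=date(2026, 5, 24))
print(rates)
```

A large gap between the two rates, with the in-sample side higher, is the warning sign described in the practical test.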
### 6.3 Validation scripts
- `/tmp/v31c_validation.py <csv>`: V31c tier breakdown (premium/strong/standard ROI).
- `/tmp/best_bet_values.py <csv>`: grid-search leaderboard + portfolio + parlay test.
- `/tmp/leakage_split.py <csv>`: in/out-of-sample leakage probe.
### 6.4 Validation threshold (before a tier counts as "profitable")
- n ≥ 50 bets (preferably ≥ 200), out-of-sample.
- ROI > 0 both in-sample and out-of-sample, or at least not collapsed OOS.
- Cumulative profit curve trends upward (not dependent on one lucky day).
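The three criteria can be folded into a single gate. The sketch below is one strict interpretation and every name in it is hypothetical: it requires positive ROI on both sides (rather than "not collapsed OOS") and treats "not dependent on one lucky day" as "still profitable after removing the best day".

```python
from typing import Sequence


def tier_passes_validation(n_oos: int, roi_in_sample: float, roi_oos: float,
                           daily_profit_oos: Sequence[float],
                           min_bets: int = 50) -> bool:
    """Strict reading of the 6.4 checklist for a single tier."""
    enough_bets = n_oos >= min_bets
    positive_roi = roi_in_sample > 0 and roi_oos > 0
    # Robustness: total OOS profit must stay positive without the single best day.
    robust = bool(daily_profit_oos) and sum(daily_profit_oos) - max(daily_profit_oos) > 0
    return enough_bets and positive_roi and robust


print(tier_passes_validation(210, 0.30, 0.47, [1.2, -0.4, 0.9, 0.3]))
```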
---
## 7. Operational Cadence
### Daily
- Engine health check (is the football pipeline running; the basketball `readiness_summary`
error is known and harmless).
- Generate the day's signals; highlight the **premium (A) single** values.
- Log yesterday's settled bets (tracking real hit rate/ROI).
### Weekly
- Compare the real results of the last 7-14 days with the backtest prediction (calibration drift).
- Monitor realized ROI per tier; if standard (C) stays negative, tighten the threshold.
### Monthly
- Retrain the models (Colab: `extract_training_data_v27.py` → training → `fetch_xgb_models.sh`).
- **After retraining, ALWAYS** revalidate with the 60-day backtest + leakage_split.
- Update the tier thresholds (Section 8).
- Note the `last_trained` date in `models/calibration/*_metrics.json` (to know the OOS
window of the next backtest).
---
## 8. Tier / Threshold Update Protocol
1. Take the new backtest CSV → run `v31c_validation.py` + `leakage_split.py`.
2. Look at the OOS ROI for each tier:
   - ROI solidly positive + sufficient n → keep.
   - ROI marginal/negative → narrow the odds band or raise min_reliability/min_model_gap.
   - premium 6.0+ threshold: is it still the best ROI out-of-sample? If not, shift the band (e.g. 6.5+).
3. Edit `MARKET_ODDS_TIERS` in `betting_brain.py` and **bump the version string**
(`judge-v31c-…` → `judge-v31d-…`).
4. Local syntax check → deploy to the server (Section 9) → revalidate.
5. Only AFTER the tiers are settled, propagate `value_tier` to the UI (BE smart-coupon → FE badge → mobile).
---
## 9. Deploy Procedure (AI Engine)
```bash
# 1. Local syntax check
python3 -c "import ast; ast.parse(open('services/betting_brain.py').read())"
# 2. Copy to the server (SSH: port 2222, user haruncan)
scp -P 2222 services/betting_brain.py haruncan@<host>:/tmp/betting_brain.py
# 3. Put it into the container + import test
docker cp /tmp/betting_brain.py iddaai-ai-engine:/app/services/betting_brain.py
docker exec iddaai-ai-engine python -c "from services.betting_brain import BettingBrain; print('OK')"
# 4. Restart + verify
docker restart iddaai-ai-engine
docker exec iddaai-ai-engine python -c "from services.betting_brain import BettingBrain as B; \
print([t['value_tier'] for t in B().MARKET_ODDS_TIERS['MS']])"
```
> Note: Port 8000 is NOT exposed on the host's localhost; run the health test from inside the container or
> over the Docker network. The basketball `readiness_summary` error is known and non-blocking.
---
## 10. Known Limitations & Warnings
- **Calibration inflation:** The BTTS / OU25 isotonic calibrators overstate the probability
by 10-15%. Do not fully trust the on-screen `calibrated_confidence` in these markets;
the bet decision is already made from raw-prob `model_gap`/`ev_edge`.
- **Small out-of-sample sample:** The clean window after the training cutoff can be narrow
(~200 MS bets). For statistical confidence, accumulate real forward results
(paper-trade) or use the 60-day holdout backtest.
- **standard (C) tier is fragile:** in-sample +0.4%, and it can turn negative on a small OOS sample.
It exists for volume; it is not an ROI guarantee.
- **Single-window overfit risk:** Do not tune to a single season/period window;
look for variety across leagues and seasons.
- **Basketball:** `BasketballV25Predictor.readiness_summary` is missing. It does not affect football
and will be fixed separately.
---
## 11. Quick Command Reference
```bash
# 60-day backtest (inside the container)
python scripts/diagnostic_backtest_multi.py --days 60 --max-matches 10000
# Validation (after pulling the CSV to the local machine)
python3 /tmp/v31c_validation.py reports/multi_backtest_YYYYMMDD.csv
python3 /tmp/best_bet_values.py reports/multi_backtest_YYYYMMDD.csv
python3 /tmp/leakage_split.py reports/multi_backtest_YYYYMMDD.csv
# Calibrator training dates
grep -o '"last_trained":[^,]*' models/calibration/*.json
```
@@ -0,0 +1,39 @@
# --- AI Engine Dockerfile ---
# Python 3.11 with v20+ prediction stack (XGBoost + LightGBM)
FROM python:3.11-slim
WORKDIR /app
# System dependencies
RUN apt-get update && apt-get install -y \
gcc \
libpq-dev \
curl \
libgomp1 \
procps \
&& rm -rf /var/lib/apt/lists/*
# Python dependencies
# Install PyTorch CPU version separately to save space
RUN pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu
# Copy requirements (without torch)
COPY requirements-docker.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY . .
# Create models directory
RUN mkdir -p /app/models
# Expose port
EXPOSE 8000
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=30s --retries=3 \
CMD python -c "import urllib.request; urllib.request.urlopen('http://127.0.0.1:8000/health')" || exit 1
# Start FastAPI with uvicorn
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
@@ -0,0 +1,115 @@
import os
import json
import yaml
from typing import Dict, Any, Optional


class EnsembleConfig:
    _instance: Optional['EnsembleConfig'] = None
    _config: Dict[str, Any] = {}

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super(EnsembleConfig, cls).__new__(cls)
            cls._instance._load_config()
        return cls._instance

    def _load_config(self):
        """Load configuration from YAML file."""
        config_path = os.path.join(os.path.dirname(__file__), 'ensemble_config.yaml')
        try:
            with open(config_path, 'r', encoding='utf-8') as f:
                self._config = yaml.safe_load(f)
            # print(f"✅ Loaded ensemble config from {config_path}")
        except Exception as e:
            print(f"❌ Failed to load ensemble config: {e}")
            self._config = {}

    def get(self, key: str, default: Any = None) -> Any:
        """Get configuration value by key (supports dot notation for nested keys)."""
        keys = key.split('.')
        value = self._config
        try:
            for k in keys:
                value = value[k]
            return value
        except (KeyError, TypeError):
            return default


# Singleton accessor
def get_config() -> EnsembleConfig:
    return EnsembleConfig()


# ── Market Thresholds Loader ────────────────────────────────────────────
_market_thresholds_cache: Optional[Dict[str, Any]] = None


def load_market_thresholds() -> Dict[str, Any]:
    """
    Load market thresholds from JSON config file.
    Returns the full config dict with 'markets' and 'defaults' keys.
    Caches after first load for performance.
    """
    global _market_thresholds_cache
    if _market_thresholds_cache is not None:
        return _market_thresholds_cache
    config_path = os.path.join(os.path.dirname(__file__), 'market_thresholds.json')
    try:
        with open(config_path, 'r', encoding='utf-8') as f:
            data = json.load(f)
        _market_thresholds_cache = data
        print(f"✅ Market thresholds loaded: {len(data.get('markets', {}))} markets (v={data.get('_meta', {}).get('version', '?')})")
        return data
    except Exception as e:
        print(f"❌ Failed to load market thresholds: {e} - using built-in defaults")
        _market_thresholds_cache = {"markets": {}, "defaults": {
            "calibration": 0.55,
            "min_conf": 55.0,
            "min_play_score": 68.0,
            "min_edge": 0.02,
            "odds_band_min_sample": 0.0,
            "odds_band_min_edge": 0.0,
        }}
        return _market_thresholds_cache


def build_threshold_dict(field: str) -> Dict[str, float]:
    """
    Build a flat {market: value} dict for a specific threshold field.
    Usage:
        calibration_map = build_threshold_dict("calibration")
        # → {"MS": 0.62, "DC": 0.82, ...}
    """
    data = load_market_thresholds()
    markets = data.get("markets", {})
    result: Dict[str, float] = {}
    for market, cfg in markets.items():
        if field in cfg:
            result[market] = float(cfg[field])
    return result


def get_threshold_default(field: str) -> float:
    """Get the default fallback value for a threshold field."""
    data = load_market_thresholds()
    defaults = data.get("defaults", {})
    return float(defaults.get(field, 0.0))


if __name__ == "__main__":
    # Test
    cfg = get_config()
    print(f"Weights: {cfg.get('engine_weights')}")
    print(f"Team Weight: {cfg.get('engine_weights.team')}")
    print()
    print("--- Market Thresholds ---")
    for field in ["calibration", "min_conf", "min_play_score", "min_edge"]:
        d = build_threshold_dict(field)
        print(f"{field}: {d}")
    print(f"Default calibration: {get_threshold_default('calibration')}")
@@ -0,0 +1,197 @@
model_ensemble:
xgb_weight: 0.50
lgb_weight: 0.50
temperature: 1.5
default_ms_odds:
home: 2.65
draw: 3.20
away: 2.65
elo_staleness_days: 14
odds_staleness_hours: 48
engine_weights:
team: 0.30
player: 0.25
odds: 0.30
referee: 0.15
min_weight: 0.05
weight_redistribution:
player_missing_to_team: 0.5
player_missing_to_odds: 0.5
referee_missing_to_team: 0.4
referee_missing_to_odds: 0.6
referee_min_matches: 5
match_result:
min_draw_prob: 0.15
over_under:
prob_min: 0.02
prob_max: 0.98
ou15_threshold: 0.55
ou25_threshold: 0.52
ou35_threshold: 0.48
btts_threshold: 0.58
poisson_blend_weight: 0.25
poisson_grid_max: 6
half_time:
ft_to_ht_ratio: 0.42
poisson_grid_max: 5
ht_over_05_min: 0.20
ht_over_05_max: 0.95
ht_ou_threshold: 0.55
ht_draw_floor: 0.28
low_xg_threshold: 2.0
low_xg_ratio_adjust: 0.85
confidence:
agreement_boost: 1.3
disagreement_penalty: 0.7
handicap:
xg_diff_threshold: 1.2
corners:
xg_multiplier: 3.0
baseline: 3.0
home_dominant_bonus: 1.5
away_dominant_bonus: 1.0
dominance_threshold: 0.6
line: 9.5
cards:
derby_heat_factor: 1.3
line: 4.5
score:
poisson_grid_max: 7
ms_confidence_threshold: 15.0
risk:
# Lowered thresholds for better surprise detection (was 0.20+)
# Model typically outputs 4-8% for reversals, so we need lower thresholds
surprise_threshold: 0.05
surprise_threshold_top: 0.05
surprise_threshold_non_top: 0.06
surprise_threshold_favorite_reversal: 0.06
surprise_threshold_favorite_reversal_top: 0.06
surprise_threshold_favorite_reversal_non_top: 0.08
surprise_threshold_underdog_reversal: 0.05
surprise_threshold_underdog_reversal_top: 0.05
surprise_threshold_underdog_reversal_non_top: 0.06
surprise_threshold_basketball: 0.08
surprise_threshold_basketball_top: 0.08
surprise_threshold_basketball_non_top: 0.10
surprise_min_top_gap: 0.01
surprise_min_top_gap_top: 0.01
surprise_min_top_gap_non_top: 0.015
# New: Upset alert threshold for potential upsets (lower than main threshold)
upset_alert_threshold: 0.05 # 5% - alert when reversal prob > 5%
htft_temperature: 1.25
htft_temperature_top: 1.25
htft_temperature_non_top: 1.35
htft_temperature_basketball: 1.08
htft_temperature_basketball_top: 1.08
htft_temperature_basketball_non_top: 1.15
htft_reversal_multiplier: 0.60
htft_reversal_multiplier_top: 0.60
htft_reversal_multiplier_non_top: 0.45
htft_reversal_multiplier_favorite: 0.72
htft_reversal_multiplier_favorite_top: 0.72
htft_reversal_multiplier_favorite_non_top: 0.55
htft_reversal_multiplier_underdog: 0.45
htft_reversal_multiplier_underdog_top: 0.45
htft_reversal_multiplier_underdog_non_top: 0.30
htft_reversal_multiplier_basketball: 0.90
htft_reversal_multiplier_basketball_top: 0.90
htft_reversal_multiplier_basketball_non_top: 0.75
htft_reversal_gap_medium: 0.50
htft_reversal_gap_strong: 1.00
htft_prior_min_matches: 300
htft_prior_blend_league: 0.65
htft_prior_blend_top: 0.50
htft_prior_blend_non_top: 0.58
htft_prior_odds_blend_top: 0.35
htft_prior_odds_blend_top_with_league: 0.22
htft_favorite_balance_gap: 0.20
htft_reversal_cap_factor: 2.30
extreme_upset: 0.7
high_upset: 0.5
medium_upset: 0.3
extreme_warnings: 3
high_warnings: 2
balanced_match_gap: 0.1
referee_min_data: 10
recommendations:
  confidence_threshold: 45
  value_confidence_min: 10
  value_confidence_max: 30
  value_edge_margin: 0.02
  value_upgrade_edge: 5.0
  # URGENT FIX: list of reliable markets expanded
  safe_markets: ['ÇŞ', '1.5 Üst/Alt', '2.5 Üst/Alt']
  # URGENT FIX: per-market minimum confidence thresholds (now probability percentages!)
  market_min_confidence:
    MS: 50.0 # Match result is hardest; 50%+ true probability is actually strong
    ÇŞ: 65.0 # Double chance naturally has high probability (2 sides of 3)
    1.5 Üst/Alt: 70.0 # 1.5 Goals needs to be highly probable to be worth playing
    2.5 Üst/Alt: 55.0 # Standard threshold for 50/50 lines
    3.5 Üst/Alt: 60.0 # Needs higher certainty than 2.5
    BTTS: 60.0 # Both Teams To Score - raised for accuracy (was 47.7%)
  risk_safe_boost: 1.2
  risk_ms_penalty_high: 0.5
  risk_ms_penalty_medium: 0.8
  risk_other_penalty: 0.7
  # URGENT FIX: market weights tuned toward the reliable markets
  market_weights:
    MS: 0.5 # ⬇️ Lowered (weak performance)
    ÇŞ: 1.5 # ⬆️ Raised (strong performance)
    1.5 Üst/Alt: 1.6 # ⬆️ Highest (most reliable)
    2.5 Üst/Alt: 1.2 # ⬆️ Raised
    3.5 Üst/Alt: 0.9 # ⬇️ Lowered
    BTTS: 0.4 # ⬇️ Lowered (weak performance)
  # Confidence Calibration (backtest-derived accuracy)
  baseline_accuracy: 65.0
  market_accuracy:
    MS: 52.1 # ❌ Weak
    ÇŞ: 77.9 # ✅ Good
    1.5 Üst/Alt: 82.1 # ✅ Excellent
    2.5 Üst/Alt: 61.4 # ⚠️ Medium
    3.5 Üst/Alt: 60.7 # ⚠️ Medium
    BTTS: 50.7 # ❌ Weak
calibration_buckets:
ms_home:
heavy_fav: 1.40 # home odds <= 1.40
fav: 1.80 # home odds > 1.40 and <= 1.80
balanced: 2.50 # home odds > 1.80 and <= 2.50
underdog: 99.0 # home odds > 2.50
team_xg:
home_base: 1.35
away_base: 1.10
home_conversion_mult: 3.0
away_conversion_mult: 2.5
sidelined:
position_weights:
K: 0.35
D: 0.20
O: 0.25
F: 0.30
max_rating: 10
adaptation_threshold: 10
adaptation_discount: 0.5
goalkeeper_penalty: 0.15
confidence_boost: 10
max_impact: 0.85
key_player_threshold: 3
recent_matches_lookback: 15
@@ -0,0 +1,115 @@
{
"_meta": {
"version": "v34",
"description": "Market-specific thresholds for the betting engine pipeline — V34 odds-aware gate fix",
"rule": "max_reachable (100 × calibration) MUST be > min_conf + 8",
"updated_at": "2026-05-10",
"changelog": "V34: Reduced min_edge to realistic levels for odds-aware V25 model. Model output ≈ market-implied, so large EV edges are mathematically impossible."
},
"markets": {
"MS": {
"calibration": 0.62,
"min_conf": 20.0,
"min_play_score": 28.0,
"min_edge": 0.005,
"odds_band_min_sample": 8.0,
"odds_band_min_edge": 0.005
},
"DC": {
"calibration": 0.82,
"min_conf": 40.0,
"min_play_score": 50.0,
"min_edge": 0.003,
"odds_band_min_sample": 8.0,
"odds_band_min_edge": 0.005
},
"OU15": {
"calibration": 0.84,
"min_conf": 45.0,
"min_play_score": 50.0,
"min_edge": 0.003,
"odds_band_min_sample": 8.0,
"odds_band_min_edge": 0.005
},
"OU25": {
"calibration": 0.68,
"min_conf": 30.0,
"min_play_score": 40.0,
"min_edge": 0.005,
"odds_band_min_sample": 8.0,
"odds_band_min_edge": 0.005
},
"OU35": {
"calibration": 0.60,
"min_conf": 20.0,
"min_play_score": 30.0,
"min_edge": 0.008,
"odds_band_min_sample": 8.0,
"odds_band_min_edge": 0.008
},
"BTTS": {
"calibration": 0.65,
"min_conf": 30.0,
"min_play_score": 40.0,
"min_edge": 0.005,
"odds_band_min_sample": 8.0,
"odds_band_min_edge": 0.005
},
"HT": {
"calibration": 0.58,
"min_conf": 20.0,
"min_play_score": 28.0,
"min_edge": 0.01,
"odds_band_min_sample": 8.0,
"odds_band_min_edge": 0.008
},
"HT_OU05": {
"calibration": 0.68,
"min_conf": 35.0,
"min_play_score": 42.0,
"min_edge": 0.005,
"odds_band_min_sample": 8.0,
"odds_band_min_edge": 0.005
},
"HT_OU15": {
"calibration": 0.60,
"min_conf": 25.0,
"min_play_score": 32.0,
"min_edge": 0.008,
"odds_band_min_sample": 8.0,
"odds_band_min_edge": 0.008
},
"OE": {
"calibration": 0.62,
"min_conf": 35.0,
"min_play_score": 32.0,
"min_edge": 0.005
},
"CARDS": {
"calibration": 0.58,
"min_conf": 30.0,
"min_play_score": 35.0,
"min_edge": 0.008
},
"HCAP": {
"calibration": 0.56,
"min_conf": 25.0,
"min_play_score": 30.0,
"min_edge": 0.015
},
"HTFT": {
"calibration": 0.45,
"min_conf": 10.0,
"min_play_score": 18.0,
"min_edge": 0.02
}
},
"defaults": {
"calibration": 0.55,
"min_conf": 55.0,
"min_play_score": 60.0,
"min_edge": 0.008,
"odds_band_min_sample": 0.0,
"odds_band_min_edge": 0.0
}
}
@@ -0,0 +1,8 @@
from .base_calculator import BaseCalculator, CalculationContext
from .match_result_calculator import MatchResultCalculator
from .over_under_calculator import OverUnderCalculator
from .half_time_calculator import HalfTimeCalculator
from .score_calculator import ScoreCalculator
from .other_markets_calculator import OtherMarketsCalculator
from .risk_assessor import RiskAssessor
from .bet_recommender import BetRecommender, MarketPredictionDTO
@@ -0,0 +1,53 @@
"""
Base classes and context dataclass for all calculators.
"""
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class CalculationContext:
    """Context object holding all inputs for calculators."""
    team_pred: Any
    player_pred: Any
    odds_pred: Any
    referee_pred: Any
    upset_factors: Any
    weights: dict[str, float]
    player_mods: dict[str, float]
    referee_mods: dict[str, float]
    match_id: str
    home_team_name: str
    away_team_name: str
    odds_data: dict[str, float]
    home_xg: float
    away_xg: float
    total_xg: float
    league_id: str | None = None
    sport: str = "football"
    is_top_league: bool = False
    # Risk info (populated later)
    risk_level: str = "MEDIUM"
    is_surprise: bool = False
    # XGBoost Predictions (New)
    xgboost_preds: dict[str, Any] = field(default_factory=dict)


class BaseCalculator:
    """Base class for all market calculators."""

    def __init__(self, config: dict[str, Any]) -> None:
        self.config = config

    def calculate(self, ctx: CalculationContext) -> dict[str, Any]:
        raise NotImplementedError("Subclasses must implement calculate()")
@@ -0,0 +1,210 @@
from dataclasses import dataclass, field
from typing import List, Optional, Any
from .base_calculator import BaseCalculator, CalculationContext
from .match_result_calculator import MatchResultPrediction
from .over_under_calculator import OverUnderPrediction
from .risk_assessor import RiskAnalysis
@dataclass
class MarketPredictionDTO:
market_type: str
pick: str
probability: float
confidence: float
odds: float = 0.0
is_recommended: bool = False
is_value_bet: bool = False
edge: float = 0.0
is_skip: bool = False # NEW: If model is unsure, mark as skip
@dataclass
class RecommendationResult:
best_bet: Optional[MarketPredictionDTO]
recommended_bets: List[MarketPredictionDTO]
alternative_bet: Optional[MarketPredictionDTO]
value_bets: List[MarketPredictionDTO]
skipped_bets: List[MarketPredictionDTO] # NEW: Track what we decided NOT to predict
class BetRecommender(BaseCalculator):
def calculate(self, # type: ignore[override]
ctx: CalculationContext,
ms_res: MatchResultPrediction,
ou_res: OverUnderPrediction,
risk: RiskAnalysis) -> RecommendationResult:
odds_data = ctx.odds_data
# Market-Specific Minimum Confidence Thresholds (Hard Gates)
# Below these, we say "I don't know" (SKIP)
min_conf_thresholds = {
"MS": 45.0, # 3-way is hard, need at least 45%
"ÇŞ": 40.0, # Double chance is safer, but still need 40%
"1.5 Üst/Alt": 50.0,
"2.5 Üst/Alt": 45.0,
"3.5 Üst/Alt": 45.0,
"BTTS": 45.0,
"HT": 40.0,
}
# Prepare candidates
markets = [
MarketPredictionDTO("MS", ms_res.ms_pick,
ms_res.ms_home_prob if ms_res.ms_pick == "1" else (ms_res.ms_away_prob if ms_res.ms_pick == "2" else ms_res.ms_draw_prob),
ms_res.ms_confidence,
odds_data.get(f"ms_{ms_res.ms_pick.lower()}", 0)),
MarketPredictionDTO("ÇŞ", ms_res.dc_pick,
ms_res.dc_1x_prob if ms_res.dc_pick == "1X" else (ms_res.dc_x2_prob if ms_res.dc_pick == "X2" else ms_res.dc_12_prob),
ms_res.dc_confidence,
odds_data.get(f"dc_{ms_res.dc_pick.lower()}", 0)),
MarketPredictionDTO("1.5 Üst/Alt", ou_res.ou15_pick,
ou_res.over_15_prob if "Üst" in ou_res.ou15_pick else ou_res.under_15_prob,
ou_res.ou15_confidence, 0),
MarketPredictionDTO("2.5 Üst/Alt", ou_res.ou25_pick,
ou_res.over_25_prob if "Üst" in ou_res.ou25_pick else ou_res.under_25_prob,
ou_res.ou25_confidence,
odds_data.get("ou25_o" if "Üst" in ou_res.ou25_pick else "ou25_u", 0)),
MarketPredictionDTO("3.5 Üst/Alt", ou_res.ou35_pick,
ou_res.over_35_prob if "Üst" in ou_res.ou35_pick else ou_res.under_35_prob,
ou_res.ou35_confidence, 0),
MarketPredictionDTO("BTTS", ou_res.btts_pick,
ou_res.btts_yes_prob if "Var" in ou_res.btts_pick else ou_res.btts_no_prob,
ou_res.btts_confidence,
odds_data.get("btts_y" if "Var" in ou_res.btts_pick else "btts_n", 0)),
]
# Market weights from config (historical accuracy weighting)
market_weights = self.config.get("recommendations.market_weights", {})
default_weight = 1.0
safe_markets = set(self.config.get("recommendations.safe_markets", ["ÇŞ", "1.5 Üst/Alt"]))
risk_level = risk.risk_level
# Confidence calibration (backtest-derived accuracy scaling)
market_accuracy = self.config.get("recommendations.market_accuracy", {})
baseline_accuracy = self.config.get("recommendations.baseline_accuracy", 65.0)
def _calibrated_confidence(m):
"""Scale raw confidence by market's historical accuracy ratio."""
accuracy = market_accuracy.get(m.market_type, baseline_accuracy) if isinstance(market_accuracy, dict) else baseline_accuracy
ratio = accuracy / baseline_accuracy
return m.confidence * ratio
def _score(m):
mw = market_weights.get(m.market_type, default_weight) if isinstance(market_weights, dict) else default_weight
# 1. Base Score: calibrated confidence * market weight
cal_conf = _calibrated_confidence(m)
score = cal_conf * mw
# 2. Value/Edge Bonus
odds_val = m.odds if m.odds is not None else 0.0
if odds_val > 0:
implied = 1.0 / odds_val
edge = (m.probability - implied) * 100
if edge > 0:
score += edge * 4.0
# 3. Risk adjustment
if risk_level in ("HIGH", "EXTREME"):
if m.market_type in safe_markets:
score *= self.config.get("recommendations.risk_safe_boost", 1.2)
elif m.market_type == "MS":
score *= self.config.get("recommendations.risk_ms_penalty_high", 0.5)
else:
score *= self.config.get("recommendations.risk_other_penalty", 0.7)
elif risk_level == "MEDIUM":
if m.market_type == "MS":
score *= self.config.get("recommendations.risk_ms_penalty_medium", 0.8)
# 4. Extreme Confidence Bonus
if cal_conf > 80:
score *= 1.15
return score
        recommended = []
        value_bets = []
        skipped_bets = []
        conf_thr = self.config.get("recommendations.confidence_threshold", 60)
        val_min = self.config.get("recommendations.value_confidence_min", 45)  # Increased from 30
        val_max = self.config.get("recommendations.value_confidence_max", 60)
        val_margin = self.config.get("recommendations.value_edge_margin", 0.03)  # Increased from 0.02
        val_upgrade = self.config.get("recommendations.value_upgrade_edge", 5.0)
        for m in markets:
            # --- SKIP LOGIC (Hard Gate) ---
            # 1. Confidence is below the per-market threshold
            min_conf = min_conf_thresholds.get(m.market_type, 45.0)
            if m.confidence < min_conf:
                m.is_skip = True
                skipped_bets.append(m)
                continue
            # 2. Negative value edge (odds are too low for our probability)
            if m.odds > 0:
                implied = 1.0 / m.odds
                edge = m.probability - implied
                # Skip when our probability is more than 0.03 below the implied probability
                if edge < -0.03:
                    m.is_skip = True
                    skipped_bets.append(m)
                    continue
            # --- PROCESS BET ---
            # 1. Regular recommended
            if m.confidence >= conf_thr:
                m.is_recommended = True
                recommended.append(m)
            # 2. Value bet logic
            if m.confidence is not None and val_min <= m.confidence <= val_max and m.odds > 0:
                implied = 1.0 / m.odds
                if m.probability > (implied + val_margin):
                    m.is_value_bet = True
                    m.edge = (m.probability - implied) * 100
                    if m.edge > val_upgrade:
                        m.is_recommended = True
                        # Guard against a duplicate entry: when confidence equals both
                        # conf_thr and val_max, step 1 has already appended this market.
                        if m not in recommended:
                            recommended.append(m)
                    else:
                        value_bets.append(m)
        # Best bet (from recommended only)
        best_bet = None
        if recommended:
            # Re-sort only recommended markets to find the best one
            valid_markets = [m for m in markets if not m.is_skip and m.is_recommended]
            if valid_markets:
                valid_markets.sort(key=_score, reverse=True)
                best_bet = valid_markets[0]
                best_bet.is_recommended = True
        # Alternative bet
        alternative = None
        if risk.is_surprise_risk and ms_res.ms_pick in ["1", "2"]:
            # Offer the alternative only if it clears the skip threshold
            alt_candidate = MarketPredictionDTO(
                "2.5 Üst/Alt", ou_res.ou25_pick,
                ou_res.over_25_prob if "Üst" in ou_res.ou25_pick else ou_res.under_25_prob,
                ou_res.ou25_confidence,
                odds_data.get("ou25_o" if "Üst" in ou_res.ou25_pick else "ou25_u", 0)
            )
            if alt_candidate.confidence >= min_conf_thresholds.get("2.5 Üst/Alt", 45.0):
                alternative = alt_candidate
        return RecommendationResult(
            best_bet=best_bet,
            recommended_bets=recommended,
            alternative_bet=alternative,
            value_bets=value_bets,
            skipped_bets=skipped_bets
        )
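The gating above can be illustrated in isolation. The following is a minimal sketch, not the project's implementation: `value_edge` and `classify` are hypothetical helper names, the thresholds are copied from the defaults shown above, and the value-to-recommended upgrade step is left out for brevity.

```python
def value_edge(probability: float, odds: float) -> float:
    """Model probability minus the bookmaker's implied probability (1 / odds)."""
    if odds <= 0:
        raise ValueError("odds must be positive")
    return probability - 1.0 / odds


def classify(probability: float, confidence: float, odds: float,
             min_conf: float = 45.0, conf_thr: float = 60.0,
             val_min: float = 45.0, val_max: float = 60.0,
             val_margin: float = 0.03, skip_edge: float = -0.03) -> str:
    """Simplified version of the skip / recommended / value decision (hypothetical helper)."""
    # Hard gate 1: confidence below the market threshold
    if confidence < min_conf:
        return "skip"
    edge = value_edge(probability, odds) if odds > 0 else None
    # Hard gate 2: clearly negative edge
    if edge is not None and edge < skip_edge:
        return "skip"
    if confidence >= conf_thr:
        return "recommended"
    # Value bet: mid-range confidence with an edge above the margin
    if edge is not None and val_min <= confidence <= val_max and edge > val_margin:
        return "value"
    return "none"


print(classify(0.55, 55.0, 2.0))  # edge of about 0.05 at mid confidence
print(classify(0.40, 50.0, 2.0))  # edge of about -0.10
```

With odds of 2.0 the implied probability is 0.50, so a model probability of 0.55 clears the 0.03 margin, while 0.40 falls below the -0.03 skip gate.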
@@ -0,0 +1,32 @@
def calc_confidence_3way(top_prob: float) -> float:
    """Returns the true win probability percentage (e.g. 0.45 -> 45.0)."""
    return max(0, min(99.0, top_prob * 100))


def calc_confidence_2way(prob: float) -> float:
    """Returns the true win probability percentage for the favored side."""
    # Find the probability of the >0.5 side
    win_prob = prob if prob >= 0.5 else (1.0 - prob)
    return max(0, min(99.0, win_prob * 100))


def calc_confidence_dc(top_prob: float) -> float:
    """Returns the true win probability percentage for double chance."""
    return max(0, min(99.0, top_prob * 100))


def calc_confidence_3way_with_agreement(top_prob: float, agreement_ratio: float,
                                        boost: float = 1.05, penalty: float = 0.95) -> float:
    """
    Returns the true win probability percentage, slightly adjusted by engine consensus.

    Args:
        top_prob: highest probability among options
        agreement_ratio: 0.0 to 1.0 — how many engines agree on the pick
    """
    base = calc_confidence_3way(top_prob)
    # Slight nudge rather than massive swing, to keep it feeling like a true probability
    if agreement_ratio >= 0.75:
        return min(99.0, base * boost)
    elif agreement_ratio <= 0.25:
        return max(0.0, base * penalty)
    return base
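A short sketch of how the agreement adjustment behaves. It re-implements the logic above under the hypothetical name `confidence_with_agreement` so that it runs standalone. Note that the match-result calculator in this change passes config defaults of 1.3 and 0.7 rather than 1.05 and 0.95, which makes the cap at 99.0 much easier to hit.

```python
def confidence_with_agreement(top_prob: float, agreement_ratio: float,
                              boost: float = 1.05, penalty: float = 0.95) -> float:
    """Standalone copy of the agreement-adjusted confidence (hypothetical name)."""
    base = max(0.0, min(99.0, top_prob * 100))
    if agreement_ratio >= 0.75:
        return min(99.0, base * boost)    # most engines agree: nudge up
    if agreement_ratio <= 0.25:
        return max(0.0, base * penalty)   # most engines disagree: nudge down
    return base                           # mixed signal: leave unchanged


for ratio in (1.0, 0.5, 0.0):
    print(ratio, confidence_with_agreement(0.50, ratio))

# With a 1.3 boost, an 80% pick is pushed past 100 and clamped to the 99.0 cap.
print(confidence_with_agreement(0.80, 1.0, boost=1.3, penalty=0.7))
```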
@@ -0,0 +1,131 @@
"""
Expert Recommendation Engine (Senior Level)
============================================
Evaluates ALL markets, classifies by risk, and ensures NO "empty" recommendations.
Prioritizes user safety by clearly labeling risk levels.
"""
from dataclasses import dataclass, field
from typing import List, Optional, Any, Dict

from .base_calculator import BaseCalculator, CalculationContext
from .match_result_calculator import MatchResultPrediction
from .over_under_calculator import OverUnderPrediction
from .risk_assessor import RiskAnalysis


@dataclass
class ExpertPick:
    market_type: str
    pick: str
    probability: float
    confidence: float
    odds: float
    edge: float  # Model probability minus implied probability, in percentage points
    # Risk Classification
    risk_level: str  # SAFE, VALUE, SURPRISE, or MEDIUM
    reasoning: str  # Why this pick? (e.g., "Market Confirmed", "Derived (No market odd)")


@dataclass
class ExpertResult:
    main_pick: ExpertPick
    safe_alternative: Optional[ExpertPick]
    value_picks: List[ExpertPick]
    surprise_picks: List[ExpertPick]
    market_summary: Dict[str, float]  # {market: probability}


class ExpertRecommender(BaseCalculator):
    def calculate(self,  # type: ignore[override]
                  ctx: CalculationContext,
                  ms_res: MatchResultPrediction,
                  ou_res: OverUnderPrediction,
                  risk: RiskAnalysis) -> ExpertResult:
        odds_data = ctx.odds_data
        all_picks: List[ExpertPick] = []

        # ─── 1. Helper to Evaluate Pick ───
        def evaluate(market: str, pick: str, prob: float, odd_key: str):
            odd_val = float(odds_data.get(odd_key, 0))
            # If the odd is missing/too low, derive one from the probability
            # with a 0.05 safety margin (conservative estimate)
            if odd_val <= 1.01:
                odd_val = round(1.0 / (prob + 0.05), 2)
                reasoning = "Derived (No market odd)"
            else:
                reasoning = "Market Confirmed"
            implied = 1.0 / odd_val
            edge = (prob - implied) * 100
            # ─── Risk Classification ───
            if prob >= 0.75 and odd_val <= 1.45:
                level = "SAFE"
            elif edge > 5.0:
                level = "VALUE"
            elif odd_val >= 2.50 and prob >= 0.35:
                level = "SURPRISE"
            else:
                level = "MEDIUM"
            all_picks.append(ExpertPick(
                market_type=market, pick=pick, probability=prob,
                confidence=prob * 100, odds=odd_val, edge=edge,
                risk_level=level, reasoning=reasoning
            ))

        # ─── 2. Evaluate All Major Markets ───
        # MS
        # NOTE: other modules in this change read 1X2 odds as "ms_h"/"ms_d"/"ms_a".
        # Assumption: odds_data uses the same keys here; otherwise fall back to the
        # original "ms_<pick>" key.
        ms_odd_keys = {"1": "ms_h", "X": "ms_d", "2": "ms_a"}
        evaluate("MS", ms_res.ms_pick,
                 ms_res.ms_home_prob if ms_res.ms_pick == "1" else (ms_res.ms_away_prob if ms_res.ms_pick == "2" else ms_res.ms_draw_prob),
                 ms_odd_keys.get(ms_res.ms_pick, f"ms_{ms_res.ms_pick.lower()}"))
        # Double Chance
        evaluate("DC", ms_res.dc_pick,
                 ms_res.dc_1x_prob if ms_res.dc_pick == "1X" else (ms_res.dc_x2_prob if ms_res.dc_pick == "X2" else ms_res.dc_12_prob),
                 f"dc_{ms_res.dc_pick.lower()}")
        # OU25
        evaluate("OU25", ou_res.ou25_pick,
                 ou_res.over_25_prob if "Üst" in ou_res.ou25_pick else ou_res.under_25_prob,
                 "ou25_o" if "Üst" in ou_res.ou25_pick else "ou25_u")
        # BTTS
        evaluate("BTTS", ou_res.btts_pick,
                 ou_res.btts_yes_prob if "Var" in ou_res.btts_pick else ou_res.btts_no_prob,
                 "btts_y" if "Var" in ou_res.btts_pick else "btts_n")
        # OU15
        evaluate("OU15", ou_res.ou15_pick,
                 ou_res.over_15_prob if "Üst" in ou_res.ou15_pick else ou_res.under_15_prob,
                 "ou15_o" if "Üst" in ou_res.ou15_pick else "ou15_u")

        # ─── 3. Sort and Select ───
        # Sort by a 60/40 mix of probability and (non-negative) edge
        all_picks.sort(key=lambda p: (p.probability * 0.6) + (max(0, p.edge / 100) * 0.4), reverse=True)
        main = all_picks[0]
        # Find Safe Alternative (if main isn't Safe)
        safe_alt = next((p for p in all_picks if p.risk_level == "SAFE"), None)
        if safe_alt == main:
            safe_alt = None
        value_picks = [p for p in all_picks if p.risk_level == "VALUE" and p != main]
        surprise_picks = [p for p in all_picks if p.risk_level == "SURPRISE"]
        # Market Summary for UI
        market_summary = {
            "MS_Home": ms_res.ms_home_prob,
            "MS_Draw": ms_res.ms_draw_prob,
            "MS_Away": ms_res.ms_away_prob,
            "OU25_Over": ou_res.over_25_prob,
            "BTTS_Yes": ou_res.btts_yes_prob
        }
        return ExpertResult(
            main_pick=main,
            safe_alternative=safe_alt,
            value_picks=value_picks,
            surprise_picks=surprise_picks,
            market_summary=market_summary
        )
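One consequence of the derived-odds fallback is worth seeing in numbers. The sketch below is an illustration under stated assumptions: `derived_odds`, `edge_pct` and `classify` are hypothetical names that mirror the `evaluate` helper above. Because a derived odd is roughly `1 / (prob + 0.05)`, its implied probability is roughly `prob + 0.05`, so the edge of a derived pick lands near -5 points and such a pick should not be expected to reach the VALUE tier.

```python
def derived_odds(prob: float, margin: float = 0.05) -> float:
    """Fallback odd estimated from the model probability plus a safety margin."""
    return round(1.0 / (prob + margin), 2)


def edge_pct(prob: float, odds: float) -> float:
    """Model probability minus implied probability, in percentage points."""
    return (prob - 1.0 / odds) * 100


def classify(prob: float, odds: float) -> str:
    """Risk tier, using the same ordering of checks as the evaluate helper."""
    edge = edge_pct(prob, odds)
    if prob >= 0.75 and odds <= 1.45:
        return "SAFE"
    if edge > 5.0:
        return "VALUE"
    if odds >= 2.50 and prob >= 0.35:
        return "SURPRISE"
    return "MEDIUM"


odd = derived_odds(0.45)
print(odd, edge_pct(0.45, odd), classify(0.45, odd))
print(classify(0.80, 1.30), classify(0.50, 2.40), classify(0.36, 2.70))
```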
@@ -0,0 +1,179 @@
import math
from dataclasses import dataclass

from .base_calculator import BaseCalculator, CalculationContext
from .confidence import calc_confidence_3way, calc_confidence_2way


@dataclass
class HalfTimePrediction:
    ht_home_prob: float
    ht_draw_prob: float
    ht_away_prob: float
    ht_pick: str
    ht_confidence: float
    ht_over_05_prob: float
    ht_under_05_prob: float
    ht_over_15_prob: float
    ht_under_15_prob: float
    ht_ou_pick: str
    ht_ou15_pick: str
    ht_home_xg: float
    ht_away_xg: float


class HalfTimeCalculator(BaseCalculator):
    def _poisson_pmf(self, k, lam):
        """Poisson probability mass function."""
        if lam <= 0:
            return 1.0 if k == 0 else 0.0
        return (lam ** k) * math.exp(-lam) / math.factorial(k)

    def calculate(self, ctx: CalculationContext) -> HalfTimePrediction:  # type: ignore[override]
        team_pred = ctx.team_pred
        odds_pred = ctx.odds_pred
        # Config
        ft_to_ht_ratio = self.config.get("half_time.ft_to_ht_ratio", 0.42)
        grid_max = self.config.get("half_time.poisson_grid_max", 5)
        draw_floor = self.config.get("half_time.ht_draw_floor", 0.35)
        low_xg_thr = self.config.get("half_time.low_xg_threshold", 2.0)
        low_xg_adj = self.config.get("half_time.low_xg_ratio_adjust", 0.85)
        # FT xG (blended team + odds)
        ft_home_xg = (team_pred.home_xg + odds_pred.poisson_home_xg) / 2
        ft_away_xg = (team_pred.away_xg + odds_pred.poisson_away_xg) / 2
        total_ft_xg = ft_home_xg + ft_away_xg
        # Dynamic HT ratio: shrink the ratio for low-xG matches, because in
        # low-scoring matches a first-half goal is even less likely
        effective_ratio = ft_to_ht_ratio
        if total_ft_xg < low_xg_thr:
            effective_ratio *= low_xg_adj
        # HT xG
        ht_home_xg = ft_home_xg * effective_ratio
        ht_away_xg = ft_away_xg * effective_ratio
        ht_total_xg = ht_home_xg + ht_away_xg
        # Compute HT 1X2 via a grid of independent home/away Poisson scorelines
        ht_home = 0.0
        ht_away = 0.0
        ht_draw = 0.0
        # Also compute O/U while iterating
        total_goals_prob = {}
        for i in range(grid_max):
            for j in range(grid_max):
                p = self._poisson_pmf(i, ht_home_xg) * self._poisson_pmf(j, ht_away_xg)
                if i > j:
                    ht_home += p
                elif i < j:
                    ht_away += p
                else:
                    ht_draw += p
                total = i + j
                total_goals_prob[total] = total_goals_prob.get(total, 0.0) + p
        # Draw floor: raise the draw probability to the configured minimum
        if ht_draw < draw_floor:
            deficit = draw_floor - ht_draw
            ht_draw = draw_floor
            # Subtract the deficit from home and away proportionally
            total_ha = ht_home + ht_away
            if total_ha > 0:
                ht_home -= deficit * (ht_home / total_ha)
                ht_away -= deficit * (ht_away / total_ha)
        # Normalize
        total_prob = ht_home + ht_draw + ht_away
        if total_prob > 0:
            ht_home /= total_prob
            ht_draw /= total_prob
            ht_away /= total_prob
        # XGBoost Integration (HT 1X2 and HT/FT Models)
        w_xgb = self.config.get("xgboost.weight_ht", 0.60)
        xgb_ht_home, xgb_ht_draw, xgb_ht_away = None, None, None
        if "ht_result" in ctx.xgboost_preds:
            probs = ctx.xgboost_preds["ht_result"]
            xgb_ht_home, xgb_ht_draw, xgb_ht_away = probs["home"], probs["draw"], probs["away"]
        elif "ht_ft" in ctx.xgboost_preds:
            # Fallback to HT/FT marginals
            htft_payload = ctx.xgboost_preds.get("ht_ft", {})
            probs = None
            if isinstance(htft_payload, dict):
                labels = ("1/1", "1/X", "1/2", "X/1", "X/X", "X/2", "2/1", "2/X", "2/2")
                if all(label in htft_payload for label in labels):
                    probs = [float(htft_payload[label]) for label in labels]
            if probs is None:
                probs = ctx.xgboost_preds.get("ht_ft_raw")
            if probs is not None and len(probs) == 9:
                # Sum over the full-time outcome to get the half-time marginals
                xgb_ht_home = sum(probs[0:3])
                xgb_ht_draw = sum(probs[3:6])
                xgb_ht_away = sum(probs[6:9])
        if xgb_ht_home is not None:
            ht_home = ht_home * (1 - w_xgb) + xgb_ht_home * w_xgb
            ht_draw = ht_draw * (1 - w_xgb) + xgb_ht_draw * w_xgb
            ht_away = ht_away * (1 - w_xgb) + xgb_ht_away * w_xgb
            # Re-normalize
            total = ht_home + ht_draw + ht_away
            ht_home /= total
            ht_draw /= total
            ht_away /= total
        # HT O/U 0.5
        ht_over_05 = 1.0 - math.exp(-ht_total_xg)
        if "ht_ou05" in ctx.xgboost_preds:
            w_xgb = self.config.get("xgboost.weight_ou", 0.60)
            xgb_ht_over_05 = float(ctx.xgboost_preds["ht_ou05"])
            ht_over_05 = ht_over_05 * (1 - w_xgb) + xgb_ht_over_05 * w_xgb
        ht_over_05_min = self.config.get("half_time.ht_over_05_min", 0.20)
        ht_over_05_max = self.config.get("half_time.ht_over_05_max", 0.95)
        ht_over_05 = max(ht_over_05_min, min(ht_over_05_max, ht_over_05))
        # HT O/U 1.5
        # P(total >= 2), summed over the (truncated) scoreline grid
        ht_over_15 = sum(p for g, p in total_goals_prob.items() if g >= 2)
        if "ht_ou15" in ctx.xgboost_preds:
            w_xgb = self.config.get("xgboost.weight_ou", 0.60)
            xgb_ht_over_15 = float(ctx.xgboost_preds["ht_ou15"])
            ht_over_15 = ht_over_15 * (1 - w_xgb) + xgb_ht_over_15 * w_xgb
        ht_over_15 = max(0.02, min(0.95, ht_over_15))
        # Picks
        ht_probs = [(ht_home, "İY 1"), (ht_draw, "İY X"), (ht_away, "İY 2")]
        ht_sorted = sorted(ht_probs, key=lambda x: x[0], reverse=True)
        ht_pick = ht_sorted[0][1]
        ht_confidence = calc_confidence_3way(ht_sorted[0][0])
        # HT O/U picks
        ht_ou_thr = self.config.get("half_time.ht_ou_threshold", 0.55)
        ht_ou_pick = "İY 0.5 Üst" if ht_over_05 > ht_ou_thr else "İY 0.5 Alt"
        ht_ou15_pick = "İY 1.5 Üst" if ht_over_15 > 0.45 else "İY 1.5 Alt"
        return HalfTimePrediction(
            ht_home_prob=ht_home,
            ht_draw_prob=ht_draw,
            ht_away_prob=ht_away,
            ht_pick=ht_pick,
            ht_confidence=ht_confidence,
            ht_over_05_prob=ht_over_05,
            ht_under_05_prob=1.0 - ht_over_05,
            ht_over_15_prob=ht_over_15,
            ht_under_15_prob=1.0 - ht_over_15,
            ht_ou_pick=ht_ou_pick,
            ht_ou15_pick=ht_ou15_pick,
            ht_home_xg=ht_home_xg,
            ht_away_xg=ht_away_xg
        )
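The core of the half-time 1X2 step can be sketched on its own. This is a simplified illustration, not the calculator itself: `ht_1x2` is a hypothetical function, it treats home and away goals as independent Poisson variables (as the grid above does), and it leaves out the low-xG ratio adjustment, the draw floor and the XGBoost blend.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k goals given an expected count lam."""
    if lam <= 0:
        return 1.0 if k == 0 else 0.0
    return (lam ** k) * math.exp(-lam) / math.factorial(k)


def ht_1x2(ft_home_xg: float, ft_away_xg: float,
           ratio: float = 0.42, grid_max: int = 5):
    """Half-time home/draw/away probabilities from full-time xG (hypothetical helper)."""
    lam_home = ft_home_xg * ratio   # scale full-time xG down to the first half
    lam_away = ft_away_xg * ratio
    home = draw = away = 0.0
    for i in range(grid_max):       # home goals 0 .. grid_max - 1
        for j in range(grid_max):   # away goals 0 .. grid_max - 1
            p = poisson_pmf(i, lam_home) * poisson_pmf(j, lam_away)
            if i > j:
                home += p
            elif i < j:
                away += p
            else:
                draw += p
    total = home + draw + away      # below 1.0 because the grid is truncated
    return home / total, draw / total, away / total


print(ht_1x2(1.6, 1.6))
print(ht_1x2(2.0, 1.0))
```

The final division by `total` matters because a 5x5 grid drops scorelines with five or more goals for either side, so the raw sums fall slightly short of 1.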
@@ -0,0 +1,142 @@
from dataclasses import dataclass
from typing import Dict, Any, List

from .base_calculator import BaseCalculator, CalculationContext
from .confidence import calc_confidence_3way_with_agreement, calc_confidence_dc


@dataclass
class MatchResultPrediction:
    ms_home_prob: float
    ms_draw_prob: float
    ms_away_prob: float
    ms_pick: str
    ms_confidence: float
    dc_1x_prob: float
    dc_x2_prob: float
    dc_12_prob: float
    dc_pick: str
    dc_confidence: float


class MatchResultCalculator(BaseCalculator):
    def _get_engine_winner(self, home_prob: float, draw_prob: float, away_prob: float) -> str:
        """Determine which outcome an engine favors."""
        probs = {"1": home_prob, "X": draw_prob, "2": away_prob}
        return max(probs, key=probs.__getitem__)

    def calculate(self, ctx: CalculationContext) -> MatchResultPrediction:  # type: ignore[override]
        # Weights
        w_team = ctx.weights["team"]
        w_player = ctx.weights["player"]
        w_odds = ctx.weights["odds"]
        w_referee = ctx.weights["referee"]
        # Engine predictions
        team_pred = ctx.team_pred
        odds_pred = ctx.odds_pred
        player_mods = ctx.player_mods
        referee_mods = ctx.referee_mods
        # Weighted ensemble for 1X2
        ms_home = (
            team_pred.home_win_prob * w_team +
            odds_pred.market_home_prob * w_odds +
            team_pred.home_win_prob * player_mods["home_modifier"] * w_player +
            odds_pred.market_home_prob * referee_mods["home_modifier"] * w_referee
        )
        ms_away = (
            team_pred.away_win_prob * w_team +
            odds_pred.market_away_prob * w_odds +
            team_pred.away_win_prob * player_mods["away_modifier"] * w_player +
            odds_pred.market_away_prob / referee_mods["home_modifier"] * w_referee
        )
        ms_draw = 1.0 - ms_home - ms_away
        # XGBoost Integration
        if "ms" in ctx.xgboost_preds:
            xgb_probs = ctx.xgboost_preds["ms"]
            w_xgb = self.config.get("xgboost.weight_ms", 0.70)
            w_heuristic = 1.0 - w_xgb
            ms_home = ms_home * w_heuristic + xgb_probs["home"] * w_xgb
            ms_draw = ms_draw * w_heuristic + xgb_probs["draw"] * w_xgb
            ms_away = ms_away * w_heuristic + xgb_probs["away"] * w_xgb
            # Re-normalize
            total = ms_home + ms_draw + ms_away
            ms_home /= total
            ms_draw /= total
            ms_away /= total
        # Min draw probability clamping
        min_draw = self.config.get("match_result.min_draw_prob", 0.15)
        if ms_draw < min_draw:
            ms_draw = min_draw
            total = ms_home + ms_away + ms_draw
            ms_home /= total
            ms_away /= total
            ms_draw /= total
        # Double Chance
        dc_1x = ms_home + ms_draw
        dc_x2 = ms_draw + ms_away
        dc_12 = ms_home + ms_away
        # MS pick
        ms_probs = [(ms_home, "1"), (ms_draw, "X"), (ms_away, "2")]
        ms_sorted = sorted(ms_probs, key=lambda x: x[0], reverse=True)
        ms_pick = ms_sorted[0][1]
        # === ENGINE AGREEMENT ===
        # Determine each engine's winner and calculate agreement ratio
        team_winner = self._get_engine_winner(
            team_pred.home_win_prob, team_pred.draw_prob, team_pred.away_win_prob
        )
        odds_winner = self._get_engine_winner(
            odds_pred.market_home_prob, odds_pred.market_draw_prob, odds_pred.market_away_prob
        )
        # Player-modified: team probs * player modifiers
        player_adj_home = team_pred.home_win_prob * player_mods["home_modifier"]
        player_adj_away = team_pred.away_win_prob * player_mods["away_modifier"]
        player_adj_draw = max(0.01, 1.0 - player_adj_home - player_adj_away)
        player_winner = self._get_engine_winner(player_adj_home, player_adj_draw, player_adj_away)
        # Referee-modified: odds probs * referee modifiers
        ref_adj_home = odds_pred.market_home_prob * referee_mods["home_modifier"]
        ref_adj_away = odds_pred.market_away_prob / referee_mods["home_modifier"]
        ref_adj_draw = max(0.01, 1.0 - ref_adj_home - ref_adj_away)
        referee_winner = self._get_engine_winner(ref_adj_home, ref_adj_draw, ref_adj_away)
        # Count how many engines agree with final pick
        engines = [team_winner, odds_winner, player_winner, referee_winner]
        agreement_count = sum(1 for e in engines if e == ms_pick)
        agreement_ratio = agreement_count / len(engines)
        # Confidence with agreement
        boost = self.config.get("confidence.agreement_boost", 1.3)
        penalty = self.config.get("confidence.disagreement_penalty", 0.7)
        ms_confidence = calc_confidence_3way_with_agreement(
            ms_sorted[0][0], agreement_ratio, boost, penalty
        )
        # DC pick
        dc_probs = [(dc_1x, "1X"), (dc_x2, "X2"), (dc_12, "12")]
        dc_sorted = sorted(dc_probs, key=lambda x: x[0], reverse=True)
        dc_pick = dc_sorted[0][1]
        dc_confidence = calc_confidence_dc(dc_sorted[0][0])
        return MatchResultPrediction(
            ms_home_prob=ms_home,
            ms_draw_prob=ms_draw,
            ms_away_prob=ms_away,
            ms_pick=ms_pick,
            ms_confidence=ms_confidence,
            dc_1x_prob=dc_1x,
            dc_x2_prob=dc_x2,
            dc_12_prob=dc_12,
            dc_pick=dc_pick,
            dc_confidence=dc_confidence
        )
@@ -0,0 +1,56 @@
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class AnomalyResult:
    is_anomaly: bool
    side: str = ""
    severity: float = 0.0
    reason: str = ""


class OddsAnomalyDetector:
    """
    Detects mismatches between bookmaker odds and underlying team metrics.
    A 'Bookmaker Trap' is when a team has very low odds (heavy favorite)
    but their xG/defense metrics are surprisingly poor.
    """

    def __init__(self, config: Dict):
        self.config = config
        # Thresholds
        self.fav_odds_threshold = self.config.get("anomaly.fav_odds_threshold", 1.75)
        self.min_xg_for_fav = self.config.get("anomaly.min_xg_for_fav", 1.25)
        self.max_conceded_for_fav = self.config.get("anomaly.max_conceded_for_fav", 1.30)
        self.opp_min_xg_threat = self.config.get("anomaly.opp_min_xg_threat", 1.10)

    def detect_trap(self,
                    odds_data: Dict[str, float],
                    home_xg: float,
                    away_xg: float,
                    home_conceded_avg: float,
                    away_conceded_avg: float) -> Tuple[bool, AnomalyResult]:
        """
        Check if the match is a potential odds trap.
        Returns: (has_trap, AnomalyResult)
        """
        ms_h = odds_data.get("ms_h", 0.0)
        ms_a = odds_data.get("ms_a", 0.0)
        # Check Home Favorite Trap
        if 1.0 < ms_h <= self.fav_odds_threshold:
            # Home is favored. Check metrics.
            if home_xg < self.min_xg_for_fav and (away_xg > self.opp_min_xg_threat or home_conceded_avg > self.max_conceded_for_fav):
                severity = (self.fav_odds_threshold - ms_h) + (self.min_xg_for_fav - home_xg)
                reason = f"🚨 ODDS ANOMALY (TRAP): Home odds ({ms_h}) suspiciously low despite poor metrics (xG: {round(home_xg, 2)}, Conceded: {round(home_conceded_avg, 2)})"
                return True, AnomalyResult(True, "H", min(10.0, severity * 2), reason)
        # Check Away Favorite Trap
        if 1.0 < ms_a <= self.fav_odds_threshold:
            # Away is favored. Check metrics.
            if away_xg < self.min_xg_for_fav and (home_xg > self.opp_min_xg_threat or away_conceded_avg > self.max_conceded_for_fav):
                severity = (self.fav_odds_threshold - ms_a) + (self.min_xg_for_fav - away_xg)
                reason = f"🚨 ODDS ANOMALY (TRAP): Away odds ({ms_a}) suspiciously low despite poor metrics (xG: {round(away_xg, 2)}, Conceded: {round(away_conceded_avg, 2)})"
                return True, AnomalyResult(True, "A", min(10.0, severity * 2), reason)
        return False, AnomalyResult(False)
@@ -0,0 +1,115 @@
from dataclasses import dataclass
import math

from .base_calculator import BaseCalculator, CalculationContext
from .match_result_calculator import MatchResultPrediction


@dataclass
class OtherMarketsPrediction:
    total_corners_pred: float
    corner_pick: str | None
    total_cards_pred: float
    card_pick: str
    cards_over_prob: float
    cards_under_prob: float
    cards_confidence: float
    handicap_pick: str
    handicap_home_prob: float
    handicap_draw_prob: float
    handicap_away_prob: float
    handicap_confidence: float
    odd_even_pick: str
    odd_prob: float
    even_prob: float


class OtherMarketsCalculator(BaseCalculator):
    def calculate(  # type: ignore[override]
        self,
        ctx: CalculationContext,
        ms_result: MatchResultPrediction,
    ) -> OtherMarketsPrediction:
        if "handicap_ms" in ctx.xgboost_preds:
            handicap_payload = ctx.xgboost_preds["handicap_ms"]
            handicap_home_prob = float(handicap_payload.get("h1", 0.33))
            handicap_draw_prob = float(handicap_payload.get("hx", 0.34))
            handicap_away_prob = float(handicap_payload.get("h2", 0.33))
        else:
            xg_diff = ctx.home_xg - ctx.away_xg
            threshold = float(self.config.get("handicap.xg_diff_threshold", 1.2))
            if xg_diff > threshold:
                handicap_home_prob, handicap_draw_prob, handicap_away_prob = 0.58, 0.24, 0.18
            elif xg_diff < -threshold:
                handicap_home_prob, handicap_draw_prob, handicap_away_prob = 0.18, 0.24, 0.58
            else:
                handicap_home_prob, handicap_draw_prob, handicap_away_prob = 0.28, 0.44, 0.28
        handicap_confidence = max(
            handicap_home_prob,
            handicap_draw_prob,
            handicap_away_prob,
        ) * 100.0
        if handicap_home_prob >= handicap_draw_prob and handicap_home_prob >= handicap_away_prob:
            handicap_pick = "H 1 (Ev -1)"
        elif handicap_away_prob >= handicap_home_prob and handicap_away_prob >= handicap_draw_prob:
            handicap_pick = "H 2 (Dep -1)"
        else:
            handicap_pick = "H 0 (Beraberlik)"
        total_corners = 0.0
        corner_pick = None
        card_line = float(self.config.get("cards.line", 4.5))
        if "cards_ou45" in ctx.xgboost_preds:
            cards_over_prob = float(ctx.xgboost_preds["cards_ou45"])
            total_cards = 5.0 if cards_over_prob > 0.50 else 3.5
        else:
            referee_average = float(ctx.referee_pred.avg_yellow_cards)
            match_heat = 1.0
            is_derby = bool(
                ctx.upset_factors.reasoning
                and "DERBY" in str(ctx.upset_factors.reasoning[0]),
            )
            if is_derby:
                match_heat = float(self.config.get("cards.derby_heat_factor", 1.3))
            total_cards = referee_average * match_heat
            # Logistic curve over the distance between expected cards and the line
            delta = total_cards - card_line
            cards_over_prob = 1.0 / (1.0 + math.exp(-delta * 0.9))
        cards_over_prob = max(0.02, min(0.98, cards_over_prob))
        cards_under_prob = 1.0 - cards_over_prob
        cards_confidence = max(cards_over_prob, cards_under_prob) * 100.0
        card_pick = f"{card_line} Ust" if cards_over_prob > 0.50 else f"{card_line} Alt"
        # P(even total) for a Poisson total with mean lambda_total: exp(-λ) * cosh(λ)
        lambda_total = ctx.total_xg
        even_prob = math.exp(-lambda_total) * math.cosh(lambda_total)
        if "odd_even" in ctx.xgboost_preds:
            xgb_weight = float(self.config.get("xgboost.weight_ou", 0.60))
            xgb_even_prob = float(ctx.xgboost_preds["odd_even"])
            even_prob = even_prob * (1 - xgb_weight) + xgb_even_prob * xgb_weight
        even_prob = max(0.02, min(0.98, even_prob))
        odd_prob = 1.0 - even_prob
        odd_even_pick = "Cift" if even_prob > 0.5 else "Tek"
        return OtherMarketsPrediction(
            total_corners_pred=total_corners,
            corner_pick=corner_pick,
            total_cards_pred=total_cards,
            card_pick=card_pick,
            cards_over_prob=cards_over_prob,
            cards_under_prob=cards_under_prob,
            cards_confidence=cards_confidence,
            handicap_pick=handicap_pick,
            handicap_home_prob=handicap_home_prob,
            handicap_draw_prob=handicap_draw_prob,
            handicap_away_prob=handicap_away_prob,
            handicap_confidence=handicap_confidence,
            odd_even_pick=odd_even_pick,
            odd_prob=odd_prob,
            even_prob=even_prob,
        )
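The odd/even step relies on a closed form: for a Poisson total with mean λ, the probability of an even number of goals is `exp(-λ) * cosh(λ)`, which equals `(1 + exp(-2λ)) / 2`. The sketch below checks that identity against a direct sum of the even Poisson terms. Both function names are hypothetical and exist only for this check.

```python
import math


def even_goal_prob(lam: float) -> float:
    """Closed form for P(total goals is even) under a Poisson(lam) total."""
    return math.exp(-lam) * math.cosh(lam)


def even_goal_prob_by_sum(lam: float, max_goals: int = 60) -> float:
    """Same quantity by summing the Poisson pmf over k = 0, 2, 4, ..."""
    return sum(
        (lam ** k) * math.exp(-lam) / math.factorial(k)
        for k in range(0, max_goals, 2)
    )


for lam in (0.5, 2.7, 4.0):
    print(lam, even_goal_prob(lam), even_goal_prob_by_sum(lam))
```

Since `(1 + exp(-2λ)) / 2` stays above 0.5 for any finite λ, the unblended Poisson estimate always leans towards "Cift" (even). Only the XGBoost blend can push the pick to "Tek" (odd).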
@@ -0,0 +1,174 @@
import math
from dataclasses import dataclass

from .base_calculator import BaseCalculator, CalculationContext
from .confidence import calc_confidence_2way


@dataclass
class OverUnderPrediction:
    over_15_prob: float
    under_15_prob: float
    ou15_pick: str
    ou15_confidence: float
    over_25_prob: float
    under_25_prob: float
    ou25_pick: str
    ou25_confidence: float
    over_35_prob: float
    under_35_prob: float
    ou35_pick: str
    ou35_confidence: float
    btts_yes_prob: float
    btts_no_prob: float
    btts_pick: str
    btts_confidence: float


class OverUnderCalculator(BaseCalculator):
    def _poisson_pmf(self, k: int, lam: float) -> float:
        if lam <= 0:
            return 1.0 if k == 0 else 0.0
        return (lam ** k) * math.exp(-lam) / math.factorial(k)

    def _poisson_ou_probs(self, home_xg: float, away_xg: float, grid_max: int = 6):
        """Independent home/away Poisson grid → O/U probabilities."""
        total_goals_prob = {}  # total goals → probability of that total
        for i in range(grid_max):
            for j in range(grid_max):
                p = self._poisson_pmf(i, home_xg) * self._poisson_pmf(j, away_xg)
                total = i + j
                total_goals_prob[total] = total_goals_prob.get(total, 0.0) + p
        # Cumulative
        over_15 = sum(p for g, p in total_goals_prob.items() if g >= 2)
        over_25 = sum(p for g, p in total_goals_prob.items() if g >= 3)
        over_35 = sum(p for g, p in total_goals_prob.items() if g >= 4)
        # BTTS: P(home >= 1) * P(away >= 1)
        p_home_0 = self._poisson_pmf(0, home_xg)
        p_away_0 = self._poisson_pmf(0, away_xg)
        btts_yes = (1 - p_home_0) * (1 - p_away_0)
        return over_15, over_25, over_35, btts_yes

    def calculate(self, ctx: CalculationContext) -> OverUnderPrediction:  # type: ignore[override]
        odds_pred = ctx.odds_pred
        referee_mods = ctx.referee_mods
        # Config
        prob_min = self.config.get("over_under.prob_min", 0.02)
        prob_max = self.config.get("over_under.prob_max", 0.98)
        blend_w = self.config.get("over_under.poisson_blend_weight", 0.4)
        grid_max = self.config.get("over_under.poisson_grid_max", 6)
        ou15_thr = self.config.get("over_under.ou15_threshold", 0.55)
        ou25_thr = self.config.get("over_under.ou25_threshold", 0.52)
        ou35_thr = self.config.get("over_under.ou35_threshold", 0.48)
        btts_thr = self.config.get("over_under.btts_threshold", 0.58)
        # 1. Poisson-based O/U from context xG (team + odds average)
        p_over_15, p_over_25, p_over_35, p_btts = self._poisson_ou_probs(
            ctx.home_xg, ctx.away_xg, int(grid_max)
        )
        # 2. Odds-based O/U (from odds engine Poisson)
        o_over_15 = odds_pred.over_15_prob
        o_over_25 = odds_pred.over_25_prob
        o_over_35 = odds_pred.over_35_prob
        o_btts = odds_pred.btts_yes_prob
        # 3. Blend: poisson xG + odds Poisson
        # Odds engine already uses Poisson internally, so keep blend weight low
        # to avoid double-counting. Use majority odds weight for established markets.
        over_15 = p_over_15 * blend_w + o_over_15 * (1 - blend_w)
        over_25 = p_over_25 * blend_w + o_over_25 * (1 - blend_w)
        over_35 = p_over_35 * blend_w + o_over_35 * (1 - blend_w)
        # BTTS: keep primarily from odds engine (it was 63.6% accurate before)
        # Only a small Poisson contribution to cross-validate
        btts_blend = min(blend_w, 0.2)
        btts_yes = p_btts * btts_blend + o_btts * (1 - btts_blend)
        # XGBoost Integration (High Weight)
        w_xgb = self.config.get("xgboost.weight_ou", 0.70)
        if "ou25" in ctx.xgboost_preds:
            over_25 = over_25 * (1 - w_xgb) + ctx.xgboost_preds["ou25"] * w_xgb
        if "ou15" in ctx.xgboost_preds:
            over_15 = over_15 * (1 - w_xgb) + ctx.xgboost_preds["ou15"] * w_xgb
        if "ou35" in ctx.xgboost_preds:
            over_35 = over_35 * (1 - w_xgb) + ctx.xgboost_preds["ou35"] * w_xgb
        # BTTS: lower XGBoost weight (was 0.70) — Poisson/odds fundamentals matter more
        w_xgb_btts = self.config.get("xgboost.weight_btts", 0.45)
        if "btts" in ctx.xgboost_preds:
            btts_yes = btts_yes * (1 - w_xgb_btts) + ctx.xgboost_preds["btts"] * w_xgb_btts
        # 4. Referee modifier (only applied to goal totals, not BTTS)
        ou_mod = referee_mods.get("over_25_modifier", 1.0)
        over_15 *= ou_mod
        over_25 *= ou_mod
        over_35 *= ou_mod
        # 5. Clamp
        over_15 = max(prob_min, min(prob_max, over_15))
        over_25 = max(prob_min, min(prob_max, over_25))
        over_35 = max(prob_min, min(prob_max, over_35))
        btts_yes = max(prob_min, min(prob_max, btts_yes))
        # Picks & Confidence
        ou15_pick = "Üst 1.5" if over_15 > ou15_thr else "Alt 1.5"
        ou15_conf = calc_confidence_2way(over_15)
        ou25_pick = "Üst 2.5" if over_25 > ou25_thr else "Alt 2.5"
        ou25_conf = calc_confidence_2way(over_25)
        ou35_pick = "Üst 3.5" if over_35 > ou35_thr else "Alt 3.5"
        ou35_conf = calc_confidence_2way(over_35)
        btts_pick = "KG Var" if btts_yes > btts_thr else "KG Yok"
        btts_conf = calc_confidence_2way(btts_yes)
        # --- SAFE BTTS PENALTY (v2 — tighter thresholds) ---
        # Penalize BTTS confidence when fundamentals don't strongly support the pick.
        try:
            home_conceded = ctx.team_pred.raw_features.get("home_conceded_avg", 1.0)
            away_conceded = ctx.team_pred.raw_features.get("away_conceded_avg", 1.0)
            if btts_pick == "KG Var":
                # "Var" needs BOTH teams to score → requires strong attack OR leaky defense
                # Penalty if either xG is low AND either defense is solid
                weak_attack = ctx.home_xg < 1.30 or ctx.away_xg < 1.15
                solid_defense = home_conceded < 1.15 or away_conceded < 1.15
                if weak_attack and solid_defense:
                    btts_conf *= 0.3
            else:  # KG Yok
                # "Yok" needs at least one team to fail scoring
                # Penalty if both have good xG AND both defenses are leaky
                if ctx.home_xg >= 1.30 and ctx.away_xg >= 1.15 and home_conceded >= 1.20 and away_conceded >= 1.20:
                    btts_conf *= 0.3
        except Exception as e:
            print(f"⚠️ Safe BTTS Check Error: {e}")
        return OverUnderPrediction(
            over_15_prob=over_15, under_15_prob=1-over_15,
            ou15_pick=ou15_pick, ou15_confidence=ou15_conf,
            over_25_prob=over_25, under_25_prob=1-over_25,
            ou25_pick=ou25_pick, ou25_confidence=ou25_conf,
            over_35_prob=over_35, under_35_prob=1-over_35,
            ou35_pick=ou35_pick, ou35_confidence=ou35_conf,
            btts_yes_prob=btts_yes, btts_no_prob=1-btts_yes,
            btts_pick=btts_pick, btts_confidence=btts_conf
        )
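The BTTS estimate and its capped blend can be reproduced with a few lines. This is a sketch under assumptions: `poisson_btts` and `blend_btts` are hypothetical helpers, the goal counts are treated as independent Poisson variables, and the XGBoost, clamping and penalty steps are omitted.

```python
import math


def poisson_btts(home_xg: float, away_xg: float) -> float:
    """P(both teams score) = P(home >= 1) * P(away >= 1) for independent Poissons."""
    p_home_scores = 1.0 - math.exp(-max(0.0, home_xg))
    p_away_scores = 1.0 - math.exp(-max(0.0, away_xg))
    return p_home_scores * p_away_scores


def blend_btts(home_xg: float, away_xg: float, odds_btts: float,
               blend_w: float = 0.4, cap: float = 0.2) -> float:
    """Blend the Poisson estimate with the odds-engine value, capping the Poisson weight."""
    w = min(blend_w, cap)  # BTTS leans on the odds engine, so the Poisson share is capped
    return poisson_btts(home_xg, away_xg) * w + odds_btts * (1.0 - w)


print(poisson_btts(1.5, 1.2))
print(blend_btts(1.5, 1.2, odds_btts=0.55))
```

With the default `blend_w` of 0.4, the cap means the Poisson term contributes 20% to BTTS, compared with 40% for the goal-total markets.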
@@ -0,0 +1,289 @@
from dataclasses import dataclass, field
from typing import Dict, Any, List, Tuple
from .base_calculator import BaseCalculator, CalculationContext
from .odds_anomaly_detector import OddsAnomalyDetector
@dataclass
class RiskAnalysis:
risk_score: float
risk_level: str
is_surprise_risk: bool
reasons: List[str] = field(default_factory=list)
surprise_type: str = ""
risk_warnings: List[str] = field(default_factory=list)
class RiskAssessor(BaseCalculator):
"""
Assesses risk level of the match based on context and predictions.
"""
def __init__(self, config: Dict):
super().__init__(config)
self.anomaly_detector = OddsAnomalyDetector(config)
@staticmethod
def _safe_odd(value: Any) -> float:
try:
odd = float(value)
return odd if odd > 1.01 else 0.0
except (TypeError, ValueError):
return 0.0
def _favorite_profile_from_odds(self, odds_data: Dict[str, float]) -> Tuple[str, float]:
"""
Returns (favorite_side, gap_to_second_favorite).
favorite_side: H, A, D, or U (unknown)
"""
ms_h = self._safe_odd((odds_data or {}).get("ms_h"))
ms_d = self._safe_odd((odds_data or {}).get("ms_d"))
ms_a = self._safe_odd((odds_data or {}).get("ms_a"))
candidates = [(side, odd) for side, odd in (("H", ms_h), ("D", ms_d), ("A", ms_a)) if odd > 0.0]
if len(candidates) < 2:
return "U", 0.0
candidates.sort(key=lambda item: item[1])
favorite_side, favorite_odd = candidates[0]
second_odd = candidates[1][1]
return favorite_side, max(0.0, second_odd - favorite_odd)
def _dynamic_reversal_threshold(
self,
ctx: CalculationContext,
top_label: str,
) -> float:
"""
Dynamic threshold for reversal surprise flags.
Lower threshold => easier to trigger surprise.
"""
base_threshold = float(self.config.get("risk.surprise_threshold", 0.20))
sport_key = (ctx.sport or "football").lower().strip()
is_top_league = bool(getattr(ctx, "is_top_league", False))
if not is_top_league:
base_threshold = float(
self.config.get("risk.surprise_threshold_non_top", base_threshold + 0.04),
)
if sport_key == "basketball":
if is_top_league:
top_val = self.config.get("risk.surprise_threshold_basketball_top")
if top_val is not None:
return float(top_val)
base_val = self.config.get("risk.surprise_threshold_basketball")
return float(base_val) if base_val is not None else 0.30
non_top_val = self.config.get("risk.surprise_threshold_basketball_non_top")
return float(non_top_val) if non_top_val is not None else 0.34
if top_label not in ("1/2", "2/1"):
return base_threshold
winner_side = "A" if top_label == "1/2" else "H"
favorite_side, gap = self._favorite_profile_from_odds(ctx.odds_data)
if is_top_league:
top_fav = self.config.get("risk.surprise_threshold_favorite_reversal_top")
if top_fav is not None:
favorite_winner_threshold = float(top_fav)
else:
base_fav = self.config.get("risk.surprise_threshold_favorite_reversal")
favorite_winner_threshold = float(base_fav) if base_fav is not None else 0.26
top_ud = self.config.get("risk.surprise_threshold_underdog_reversal_top")
if top_ud is not None:
underdog_winner_threshold = float(top_ud)
else:
base_ud = self.config.get("risk.surprise_threshold_underdog_reversal")
underdog_winner_threshold = float(base_ud) if base_ud is not None else 0.20
else:
nt_fav = self.config.get("risk.surprise_threshold_favorite_reversal_non_top")
favorite_winner_threshold = float(nt_fav) if nt_fav is not None else 0.30
nt_ud = self.config.get("risk.surprise_threshold_underdog_reversal_non_top")
underdog_winner_threshold = float(nt_ud) if nt_ud is not None else 0.24
gm = self.config.get("risk.htft_reversal_gap_medium")
gap_medium = float(gm) if gm is not None else 0.50
gs = self.config.get("risk.htft_reversal_gap_strong")
gap_strong = float(gs) if gs is not None else 1.00
if favorite_side in ("H", "A"):
threshold = (
favorite_winner_threshold
if winner_side == favorite_side
else underdog_winner_threshold
)
if winner_side != favorite_side and gap >= gap_strong:
threshold += 0.03
elif winner_side != favorite_side and gap >= gap_medium:
threshold += 0.015
return threshold
return base_threshold
def calculate(self, ctx: CalculationContext, ms_result: Any = None) -> RiskAnalysis: # type: ignore[override]
"""
Wrapper for assess_risk to match BaseCalculator interface but with extra arg.
"""
return self.assess_risk(ctx)
def assess_risk(self, ctx: CalculationContext) -> RiskAnalysis:
"""
Calculate risk score and level.
Returns RiskAnalysis object.
"""
score = 5.0
reasons = []
is_surprise = ctx.is_surprise
surprise_type = ""
# 1. League deviation (from UpsetEngine)
if ctx.is_surprise:
score += 2.0
reasons.append("High Upset Potential detected by UpsetEngine")
# 1.5 Odds Anomaly Detection
try:
home_conceded = ctx.team_pred.raw_features.get("home_conceded_avg", 1.0)
away_conceded = ctx.team_pred.raw_features.get("away_conceded_avg", 1.0)
has_anomaly, anomaly_res = self.anomaly_detector.detect_trap(
ctx.odds_data,
ctx.home_xg,
ctx.away_xg,
home_conceded,
away_conceded
)
if has_anomaly:
is_surprise = True
score += anomaly_res.severity + 2.0
surprise_type = "Bookmaker Trap"
reasons.append(anomaly_res.reason)
except Exception as e:
# Anomaly detection is best-effort: report the failure and continue scoring.
print(f"⚠️ Odds Anomaly Detection Error: {e}")
# 2. HT/FT Surprise Hunter (XGBoost)
# We look for 1/2 (idx 2) and 2/1 (idx 6) from the V20 HT/FT model
if "ht_ft" in ctx.xgboost_preds:
ht_ft = ctx.xgboost_preds["ht_ft"]
valid_items = [(k, float(v)) for k, v in ht_ft.items() if isinstance(v, (int, float))]
if valid_items:
ranked = sorted(valid_items, key=lambda item: item[1], reverse=True)
top_label, top_prob = ranked[0]
second_prob = ranked[1][1] if len(ranked) > 1 else 0.0
top_gap = top_prob - second_prob
threshold = self._dynamic_reversal_threshold(ctx, top_label)
if getattr(ctx, "is_top_league", False):
top_gap_val = self.config.get("risk.surprise_min_top_gap_top")
if top_gap_val is not None:
min_gap = float(top_gap_val)
else:
base_gap_val = self.config.get("risk.surprise_min_top_gap")
min_gap = float(base_gap_val) if base_gap_val is not None else 0.02
else:
non_top_gap_val = self.config.get("risk.surprise_min_top_gap_non_top")
min_gap = float(non_top_gap_val) if non_top_gap_val is not None else 0.03
# Trigger surprise only when reversal class is:
# - top HT/FT outcome
# - above dynamic threshold
# - separated from second class with a minimum gap
if top_label in ("1/2", "2/1") and top_prob > threshold and top_gap > min_gap:
is_surprise = True
score += 3.0
surprise_type = f"{top_label} Reversal"
reasons.append(
f"🔥 Surprise Hunter: {top_label} potential ({round(top_prob*100, 1)}%, gap {round(top_gap*100, 1)}pp)"
)
# NEW: Potential Upset Alert - even if reversal is not the top prediction
# This catches cases like Bayern vs Augsburg where 1/2 was only 2% but it happened
favorite_side, gap = self._favorite_profile_from_odds(ctx.odds_data)
# Get reversal probabilities
prob_12 = float(ht_ft.get("1/2", 0))
prob_21 = float(ht_ft.get("2/1", 0))
# DYNAMIC threshold based on odds - stronger favorite = lower threshold
# When home odds are 1.30, even 1% reversal probability is significant
base_threshold = float(self.config.get("risk.upset_alert_threshold", 0.05))
# Calculate dynamic threshold based on favorite strength
if favorite_side == "H":
home_odds = float(ctx.odds_data.get("ms_h", 2.0))
# Stronger favorite (lower odds) = lower threshold
# 1.20 odds -> 0.01 threshold, 1.50 odds -> 0.02 threshold, 1.80 odds -> 0.03 threshold, 2.0+ odds -> base threshold
if home_odds <= 1.25:
dynamic_threshold = 0.01 # 1% - extremely strong favorite
elif home_odds <= 1.40:
dynamic_threshold = 0.015 # 1.5% - very strong favorite
elif home_odds <= 1.60:
dynamic_threshold = 0.02 # 2% - strong favorite
elif home_odds < 2.00:
dynamic_threshold = 0.03 # 3% - moderate favorite
else:
dynamic_threshold = base_threshold
elif favorite_side == "A":
away_odds = float(ctx.odds_data.get("ms_a", 2.0))
if away_odds <= 1.25:
dynamic_threshold = 0.01
elif away_odds <= 1.40:
dynamic_threshold = 0.015
elif away_odds <= 1.60:
dynamic_threshold = 0.02
elif away_odds < 2.00:
dynamic_threshold = 0.03
else:
dynamic_threshold = base_threshold
else:
dynamic_threshold = base_threshold
# Check for potential upset based on favorite
if favorite_side == "H" and prob_12 > dynamic_threshold:
# Home favorite, but 1/2 (home leads HT, away wins FT) has potential
is_surprise = True
score += 2.0
surprise_type = "1/2 Potential Upset"
reasons.append(
f"⚠️ UPSET ALERT: Home favorite ({ctx.odds_data.get('ms_h', 'N/A')}) but 1/2 reversal risk ({round(prob_12*100, 1)}% > {round(dynamic_threshold*100, 1)}% threshold)"
)
elif favorite_side == "A" and prob_21 > dynamic_threshold:
# Away favorite, but 2/1 (away leads HT, home wins FT) has potential
is_surprise = True
score += 2.0
surprise_type = "2/1 Potential Upset"
reasons.append(
f"⚠️ UPSET ALERT: Away favorite ({ctx.odds_data.get('ms_a', 'N/A')}) but 2/1 reversal risk ({round(prob_21*100, 1)}% > {round(dynamic_threshold*100, 1)}% threshold)"
)
elif gap > 0.5 and (prob_12 > dynamic_threshold or prob_21 > dynamic_threshold):
# Strong favorite (big odds gap) with any reversal potential
reversal_type = "1/2" if prob_12 > prob_21 else "2/1"
reversal_prob = max(prob_12, prob_21)
is_surprise = True
score += 1.5
surprise_type = f"{reversal_type} Potential Upset"
reasons.append(
f"⚠️ UPSET ALERT: Strong favorite (gap {round(gap, 2)}) with {reversal_type} risk ({round(reversal_prob*100, 1)}%)"
)
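# Determine level
# Note: score starts at 5.0 and is only incremented above, so the LOW branch
# is not reached unless anomaly_res.severity is negative.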
if score < 4.0:
level = "LOW"
elif score < 7.0:
level = "MEDIUM"
elif score < 9.0:
level = "HIGH"
else:
level = "EXTREME"
return RiskAnalysis(
risk_score=score,
risk_level=level,
is_surprise_risk=is_surprise,
surprise_type=surprise_type,
reasons=reasons
)
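Review note: the home-favorite and away-favorite branches of the upset-alert logic in `assess_risk` apply the same odds tiers. The sketch below shows that tier mapping as a single function. The name `upset_threshold_for_odds` and its `base_threshold` default are assumptions made for illustration (the default mirrors the `risk.upset_alert_threshold` fallback of 0.05); this is not code from the commit.

```python
def upset_threshold_for_odds(fav_odds: float, base_threshold: float = 0.05) -> float:
    """Hypothetical helper: map the favorite's decimal odds to the
    reversal-probability threshold used for the upset alert."""
    if fav_odds <= 1.25:
        return 0.01   # extremely strong favorite
    if fav_odds <= 1.40:
        return 0.015  # very strong favorite
    if fav_odds <= 1.60:
        return 0.02   # strong favorite
    if fav_odds < 2.00:
        return 0.03   # moderate favorite
    return base_threshold  # no clear favorite by odds


# Shorter favorite odds give a lower alert threshold.
for odds in (1.20, 1.50, 1.80, 2.10):
    print(odds, upset_threshold_for_odds(odds))
```

With a helper like this, each branch would reduce to one call, for example `dynamic_threshold = upset_threshold_for_odds(home_odds, base_threshold)`.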
+230
@@ -0,0 +1,230 @@
import os
import pickle
import pandas as pd
import xgboost as xgb
from dataclasses import dataclass
from typing import List, Dict, Tuple, Optional
import math
from .base_calculator import BaseCalculator, CalculationContext
from .confidence import calc_confidence_3way, calc_confidence_dc
from .match_result_calculator import MatchResultPrediction
@dataclass
class ScorePrediction:
predicted_ft_score: str
predicted_ht_score: str
ft_scores_top5: List[Dict]
# Reconciled MS/DC predictions (can be updated here)
reconciled_ms: Optional[MatchResultPrediction] = None
class ScoreCalculator(BaseCalculator):
def __init__(self, config: Dict):
super().__init__(config)
self.xgb_home = None
self.xgb_away = None
self.xgb_ht_home = None
self.xgb_ht_away = None
self.scaler = None # If used
self.features = []
self._load_model()
def _load_model(self):
try:
model_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "..", "models", "xgb_score.pkl")
if os.path.exists(model_path):
with open(model_path, "rb") as f:
data = pickle.load(f)
# Handle both dictionary and direct model formats (just in case)
if isinstance(data, dict):
self.xgb_home = data.get("home_model")
self.xgb_away = data.get("away_model")
self.xgb_ht_home = data.get("ht_home_model")
self.xgb_ht_away = data.get("ht_away_model")
self.features = data.get("features", [])
else:
print("⚠️ Unexpected XGB score model format.")
print("✅ XGBoost Score Model loaded.")
else:
print(f"⚠️ XGBoost Score Model not found at {model_path}")
except Exception as e:
print(f"❌ Error loading XGBoost Score Model: {e}")
def _poisson_pmf(self, k, lam):
"""Poisson probability mass function."""
if lam <= 0:
return 1.0 if k == 0 else 0.0
return (lam ** k) * math.exp(-lam) / math.factorial(k)
def calculate(self, ctx: CalculationContext, ms_result: MatchResultPrediction) -> ScorePrediction: # type: ignore[override]
predicted_ht = None
# Default Lambdas (fallback)
lambda_home = max(0.5, ctx.home_xg)
lambda_away = max(0.5, ctx.away_xg)
# --- XGBOOST PREDICTION ---
if self.xgb_home and self.xgb_away and hasattr(ctx.team_pred, "raw_features"):
try:
# 1. Prepare Features
# We need to map ctx data to self.features list columns
raw = ctx.team_pred.raw_features
odds = ctx.odds_data or {}
# Use unified feature adapter for exact 56-feature sync
from features.feature_adapter import get_feature_adapter
df_input = get_feature_adapter().get_features(ctx)
# Predict FT
pred_h = self.xgb_home.predict(df_input)[0]
pred_a = self.xgb_away.predict(df_input)[0]
# Predict HT (if available)
if self.xgb_ht_home and self.xgb_ht_away:
pred_ht_h = self.xgb_ht_home.predict(df_input)[0]
pred_ht_a = self.xgb_ht_away.predict(df_input)[0]
# Clamp HT predictions (min 0, and shouldn't exceed FT in logic, but models are independent)
# We trust the model but ensure sanity (HT <= FT is hard to enforce without joint training, but usually holds)
ht_h_val = max(0.0, float(pred_ht_h))
ht_a_val = max(0.0, float(pred_ht_a))
predicted_ht = f"{round(ht_h_val)}-{round(ht_a_val)}"
else:
# Fallback if HT models missing
ht_h_val = max(0.0, float(pred_h) * 0.42)
ht_a_val = max(0.0, float(pred_a) * 0.42)
predicted_ht = f"{round(ht_h_val)}-{round(ht_a_val)}"
# Update lambdas with ML predictions
lambda_home = max(0.1, min(6.0, float(pred_h)))
lambda_away = max(0.1, min(6.0, float(pred_a)))
# Store raw XGB preds in context
ctx.xgboost_preds["score"] = {
"home": lambda_home,
"away": lambda_away,
"ht_home": ht_h_val,
"ht_away": ht_a_val
}
except Exception as e:
print(f"⚠️ XGBoost Score Prediction failed: {e}. Falling back to Poisson xG.")
# Fallback to current simple logic if ML fails
predicted_ht = f"{round(lambda_home * 0.42)}-{round(lambda_away * 0.42)}"
# --- POISSON GRID GENERATION ---
# Now use lambda_home/away (either ML or fallback) to generate grid
score_probs = {}
grid_max = self.config.get("score.poisson_grid_max", 7)
for i in range(grid_max):
for j in range(grid_max):
p = self._poisson_pmf(i, lambda_home) * self._poisson_pmf(j, lambda_away)
score_probs[f"{i}-{j}"] = round(p * 100, 2)
sorted_scores = sorted(score_probs.items(), key=lambda x: x[1], reverse=True)
# --- DERIVE MS PROBS FROM SCORES (CONSISTENCY CHECK) ---
poisson_ms_home = sum(p for s, p in score_probs.items()
for h, a in [s.split("-")] if int(h) > int(a))
poisson_ms_away = sum(p for s, p in score_probs.items()
for h, a in [s.split("-")] if int(h) < int(a))
poisson_ms_draw = sum(p for s, p in score_probs.items()
for h, a in [s.split("-")] if int(h) == int(a))
# Normalize
poisson_total = poisson_ms_home + poisson_ms_away + poisson_ms_draw
if poisson_total > 0:
poisson_ms_home /= poisson_total
poisson_ms_away /= poisson_total
poisson_ms_draw /= poisson_total
# --- HYBRID RECONCILIATION ---
threshold = self.config.get("score.ms_confidence_threshold", 15.0)
reconciled_result = ms_result
# If original confidence is low, trust new Score Model more
if ms_result.ms_confidence < threshold:
poisson_probs = [(poisson_ms_home, "1"), (poisson_ms_draw, "X"), (poisson_ms_away, "2")]
poisson_sorted = sorted(poisson_probs, key=lambda x: x[0], reverse=True)
new_ms_pick = poisson_sorted[0][1]
new_ms_conf = calc_confidence_3way(poisson_sorted[0][0])
# Recalculate DC
dc_1x = poisson_ms_home + poisson_ms_draw
dc_x2 = poisson_ms_draw + poisson_ms_away
dc_12 = poisson_ms_home + poisson_ms_away
dc_probs = [(dc_1x, "1X"), (dc_x2, "X2"), (dc_12, "12")]
dc_sorted = sorted(dc_probs, key=lambda x: x[0], reverse=True)
new_dc_pick = dc_sorted[0][1]
new_dc_conf = calc_confidence_dc(dc_sorted[0][0])
reconciled_result = MatchResultPrediction(
ms_home_prob=poisson_ms_home,
ms_draw_prob=poisson_ms_draw,
ms_away_prob=poisson_ms_away,
ms_pick=new_ms_pick,
ms_confidence=new_ms_conf,
dc_1x_prob=dc_1x,
dc_x2_prob=dc_x2,
dc_12_prob=dc_12,
dc_pick=new_dc_pick,
dc_confidence=new_dc_conf
)
# Select best score that matches MS Pick
# NEW LOGIC: We trust XGBoost/Poisson top score over generic MS Pick if MS Confidence is low.
# Otherwise, we filter the grid to match the MS pick.
ms_pick = reconciled_result.ms_pick
def _score_matches_ms(score_str, pick):
h, a = map(int, score_str.split("-"))
if pick == "1": return h > a
if pick == "2": return h < a
return h == a
matching_scores = [(s, p) for s, p in sorted_scores if _score_matches_ms(s, ms_pick)]
# Primary Prediction Strategy:
# If MS pick is highly confident, enforce it.
# But if the absolute best score in the grid contradicts it and has a high probability (>12%), trust the score model directly.
top_overall_score, top_overall_prob = sorted_scores[0]
if matching_scores and not (top_overall_prob > 12.0 and not _score_matches_ms(top_overall_score, ms_pick)):
predicted_ft = matching_scores[0][0]
else:
predicted_ft = top_overall_score
# If HT was not set above (the XGBoost branch was skipped, e.g. models not loaded), derive it from the FT lambdas
if predicted_ht is None:
ft_to_ht = self.config.get("half_time.ft_to_ht_ratio", 0.42)
ht_h = round(lambda_home * ft_to_ht)
ht_a = round(lambda_away * ft_to_ht)
predicted_ht = f"{ht_h}-{ht_a}"
# --- CONSISTENCY CHECK ---
# Ensure HT score <= FT score
try:
ft_h, ft_a = map(int, predicted_ft.split("-"))
ht_h, ht_a = map(int, predicted_ht.split("-"))
# Clamp HT values
ht_h = min(ht_h, ft_h)
ht_a = min(ht_a, ft_a)
predicted_ht = f"{ht_h}-{ht_a}"
except ValueError:
pass # Malformed score string, ignore correction
ft_scores = [{"score": s, "prob": p} for s, p in sorted_scores[:5]]
return ScorePrediction(
predicted_ft_score=predicted_ft,
predicted_ht_score=predicted_ht,
ft_scores_top5=ft_scores,
reconciled_ms=reconciled_result
)
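Review note: the reconciliation step in `ScoreCalculator.calculate` derives 1X2 probabilities from an independent-Poisson score grid. The sketch below isolates that idea, assuming two goal expectations `lambda_home` and `lambda_away` and a grid of `grid_max` goals per side. The function names are illustrative. Unlike the diff, it sums unrounded cell probabilities rather than percentages rounded to two decimals.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k goals given expectation lam."""
    if lam <= 0:
        return 1.0 if k == 0 else 0.0
    return (lam ** k) * math.exp(-lam) / math.factorial(k)


def outcome_probs(lambda_home: float, lambda_away: float, grid_max: int = 7) -> dict:
    """Sum the score grid into home/draw/away mass, then renormalize
    because the grid is truncated at grid_max - 1 goals per side."""
    home = draw = away = 0.0
    for i in range(grid_max):
        for j in range(grid_max):
            p = poisson_pmf(i, lambda_home) * poisson_pmf(j, lambda_away)
            if i > j:
                home += p
            elif i < j:
                away += p
            else:
                draw += p
    total = home + draw + away
    if total <= 0:
        return {"1": 0.0, "X": 0.0, "2": 0.0}
    return {"1": home / total, "X": draw / total, "2": away / total}


probs = outcome_probs(1.6, 1.1)
print({k: round(v, 3) for k, v in probs.items()})
```

The renormalization matters most for large lambdas: the calculator clamps them at 6.0 while the default grid stops at 6 goals, so a noticeable share of probability mass can fall outside the grid.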
+10
@@ -0,0 +1,10 @@
# ai-engine/core/engines/__init__.py
"""
Prediction Engines
"""
from .player_predictor import PlayerPredictorEngine, get_player_predictor
__all__ = [
"PlayerPredictorEngine", "get_player_predictor",
]
+359
@@ -0,0 +1,359 @@
"""
Player Predictor Engine - V20 Ensemble Component
Analyzes squad quality, key players, and missing player impact.
Weight: 25% in ensemble
"""
import os
import sys
from typing import Dict, Optional, List
from dataclasses import dataclass
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
from features.squad_analysis_engine import get_squad_analysis_engine
from features.sidelined_analyzer import get_sidelined_analyzer
@dataclass
class PlayerPrediction:
"""Player engine prediction output.
IMPORTANT: squad_quality uses the SAME composite formula as
extract_training_data.py so that inference values match the
distribution the model was trained on (~3-36 range).
"""
home_squad_quality: float = 12.0
away_squad_quality: float = 12.0
squad_diff: float = 0.0
home_key_players: int = 0
away_key_players: int = 0
home_missing_impact: float = 0.0
away_missing_impact: float = 0.0
home_goals_form: int = 0
away_goals_form: int = 0
home_lineup_goals_per90: float = 0.0
away_lineup_goals_per90: float = 0.0
home_lineup_assists_per90: float = 0.0
away_lineup_assists_per90: float = 0.0
home_squad_continuity: float = 0.5
away_squad_continuity: float = 0.5
home_top_scorer_form: int = 0
away_top_scorer_form: int = 0
home_avg_player_exp: float = 0.0
away_avg_player_exp: float = 0.0
home_goals_diversity: float = 0.0
away_goals_diversity: float = 0.0
lineup_available: bool = False
confidence: float = 0.0
class PlayerPredictorEngine:
"""
Player/Squad-based prediction engine.
Analyzes:
- Starting 11 quality
- Key player availability (top scorers)
- Missing player impact
- Recent goalscoring form per player
"""
def __init__(self):
self.squad_engine = get_squad_analysis_engine()
self.sidelined_analyzer = get_sidelined_analyzer()
print("✅ PlayerPredictorEngine initialized")
def predict(self,
match_id: str,
home_team_id: str,
away_team_id: str,
home_lineup: Optional[List[str]] = None,
away_lineup: Optional[List[str]] = None,
sidelined_data: Optional[Dict] = None) -> PlayerPrediction:
"""
Generate player-based prediction.
Args:
match_id: Match ID for lineup lookup
home_team_id: Home team ID
away_team_id: Away team ID
home_lineup: Optional list of home player IDs
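away_lineup: Optional list of away player IDs
sidelined_data: Optional sidelined (injury/suspension) payload; when given, it drives the missing-player impact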
Returns:
PlayerPrediction with squad analysis
"""
# Get squad features
home_analysis = None
away_analysis = None
if home_lineup and away_lineup:
home_analysis = self.squad_engine.analyze_squad_from_list(
home_lineup, home_team_id
)
away_analysis = self.squad_engine.analyze_squad_from_list(
away_lineup, away_team_id
)
lineup_available = True
features = {
"home_starting_11": home_analysis.starting_count or 11,
"home_goals_last_5": home_analysis.total_goals_last_5,
"home_assists_last_5": home_analysis.total_assists_last_5,
"home_key_players": home_analysis.key_players_count,
"home_forwards": home_analysis.forward_count or 2,
"away_starting_11": away_analysis.starting_count or 11,
"away_goals_last_5": away_analysis.total_goals_last_5,
"away_assists_last_5": away_analysis.total_assists_last_5,
"away_key_players": away_analysis.key_players_count,
"away_forwards": away_analysis.forward_count or 2,
}
elif match_id:
try:
features = self.squad_engine.get_features(
match_id, home_team_id, away_team_id
)
lineup_available = (
features.get("home_starting_11", 0) >= 11 and
features.get("away_starting_11", 0) >= 11
)
except Exception:
features = self.squad_engine.get_features_without_match(
home_team_id, away_team_id
)
lineup_available = False
else:
features = self.squad_engine.get_features_without_match(
home_team_id, away_team_id
)
lineup_available = False
home_goals = int(features.get("home_goals_last_5", 0))
away_goals = int(features.get("away_goals_last_5", 0))
home_key = int(features.get("home_key_players", 0))
away_key = int(features.get("away_key_players", 0))
home_starting = features.get("home_starting_11", 11)
away_starting = features.get("away_starting_11", 11)
home_fwd = features.get("home_forwards", 2)
away_fwd = features.get("away_forwards", 2)
# Squad quality — matches V25 extract_training_data.py:579
home_quality = home_starting * 0.3 + home_key * 3.0 + home_fwd * 1.5
away_quality = away_starting * 0.3 + away_key * 3.0 + away_fwd * 1.5
squad_diff = home_quality - away_quality
# Missing player impact
if sidelined_data:
home_impact, away_impact = self.sidelined_analyzer.analyze_match(sidelined_data)
home_missing = min(1.0, max(0.0, home_impact.impact_score))
away_missing = min(1.0, max(0.0, away_impact.impact_score))
sidelined_available = True
else:
expected_xi = 11
actual_home_xi = features.get("home_starting_11", 11)
actual_away_xi = features.get("away_starting_11", 11)
home_missing = (expected_xi - actual_home_xi) / expected_xi if actual_home_xi < expected_xi else 0
away_missing = (expected_xi - actual_away_xi) / expected_xi if actual_away_xi < expected_xi else 0
sidelined_available = False
# Player-level features (matches extract_training_data.py:594-650)
player_feats = self._compute_player_level_features(
home_lineup or [], away_lineup or [],
home_team_id, away_team_id,
home_analysis, away_analysis,
)
confidence = 70.0 if lineup_available else 35.0
if home_goals + away_goals > 10:
confidence += 15
if sidelined_available:
confidence += self.sidelined_analyzer.config.get("sidelined.confidence_boost", 10)
if not lineup_available:
confidence -= 5.0
return PlayerPrediction(
home_squad_quality=home_quality,
away_squad_quality=away_quality,
squad_diff=squad_diff,
home_key_players=home_key,
away_key_players=away_key,
home_missing_impact=home_missing,
away_missing_impact=away_missing,
home_goals_form=home_goals,
away_goals_form=away_goals,
home_lineup_goals_per90=player_feats['home_lineup_goals_per90'],
away_lineup_goals_per90=player_feats['away_lineup_goals_per90'],
home_lineup_assists_per90=player_feats['home_lineup_assists_per90'],
away_lineup_assists_per90=player_feats['away_lineup_assists_per90'],
home_squad_continuity=player_feats['home_squad_continuity'],
away_squad_continuity=player_feats['away_squad_continuity'],
home_top_scorer_form=player_feats['home_top_scorer_form'],
away_top_scorer_form=player_feats['away_top_scorer_form'],
home_avg_player_exp=player_feats['home_avg_player_exp'],
away_avg_player_exp=player_feats['away_avg_player_exp'],
home_goals_diversity=player_feats['home_goals_diversity'],
away_goals_diversity=player_feats['away_goals_diversity'],
lineup_available=lineup_available,
confidence=max(5.0, confidence)
)
def _compute_player_level_features(
self,
home_lineup: List[str],
away_lineup: List[str],
home_team_id: str,
away_team_id: str,
home_analysis,
away_analysis,
) -> Dict[str, float]:
defaults = {
'home_lineup_goals_per90': 0.0, 'away_lineup_goals_per90': 0.0,
'home_lineup_assists_per90': 0.0, 'away_lineup_assists_per90': 0.0,
'home_squad_continuity': 0.5, 'away_squad_continuity': 0.5,
'home_top_scorer_form': 0, 'away_top_scorer_form': 0,
'home_avg_player_exp': 0.0, 'away_avg_player_exp': 0.0,
'home_goals_diversity': 0.0, 'away_goals_diversity': 0.0,
}
conn = self.squad_engine.get_conn()
if conn is None:
return defaults
try:
from psycopg2.extras import RealDictCursor
result = {}
for prefix, lineup, team_id in [
('home', home_lineup, home_team_id),
('away', away_lineup, away_team_id),
]:
if not lineup:
for k in ('lineup_goals_per90', 'lineup_assists_per90',
'squad_continuity', 'top_scorer_form',
'avg_player_exp', 'goals_diversity'):
result[f'{prefix}_{k}'] = defaults[f'{prefix}_{k}']
continue
g90, a90, total_exp = 0.0, 0.0, 0
best_scorer_total, best_scorer_id = 0, None
scorers_in_lineup = 0
with conn.cursor(cursor_factory=RealDictCursor) as cur:
for pid in lineup:
cur.execute("""
SELECT
COUNT(*) as starts,
COALESCE(SUM(CASE WHEN e.event_type = 'goal'
AND (e.event_subtype IS NULL OR e.event_subtype NOT ILIKE '%%penaltı kaçırma%%')
THEN 1 ELSE 0 END), 0) as goals,
COALESCE((SELECT COUNT(*) FROM match_player_events
WHERE assist_player_id = %s), 0) as assists
FROM match_player_participation mpp
LEFT JOIN match_player_events e
ON e.match_id = mpp.match_id AND e.player_id = mpp.player_id
WHERE mpp.player_id = %s AND mpp.is_starting = true
""", (pid, pid))
row = cur.fetchone()
if not row or not row['starts']:
continue
starts = row['starts']
goals = row['goals'] or 0
assists = row['assists'] or 0
g90 += goals / starts
a90 += assists / starts
total_exp += starts
if goals > 0:
scorers_in_lineup += 1
if goals > best_scorer_total:
best_scorer_total = goals
best_scorer_id = pid
n_st = len(lineup) or 1
# Top scorer recent form (goals in last 5 starts)
top_scorer_form = 0
if best_scorer_id:
cur.execute("""
SELECT COUNT(*) as goals
FROM match_player_events mpe
WHERE mpe.player_id = %s AND mpe.event_type = 'goal'
AND mpe.match_id IN (
SELECT match_id FROM match_player_participation
WHERE player_id = %s AND is_starting = true
ORDER BY match_id DESC LIMIT 5
)
""", (best_scorer_id, best_scorer_id))
tsf_row = cur.fetchone()
if tsf_row:
top_scorer_form = tsf_row['goals'] or 0
# Squad continuity (overlap with previous match lineup)
squad_continuity = 0.5
cur.execute("""
SELECT mpp.player_id
FROM match_player_participation mpp
JOIN matches m ON mpp.match_id = m.id
WHERE mpp.team_id = %s AND mpp.is_starting = true
AND m.status = 'FT'
ORDER BY m.mst_utc DESC
LIMIT 11
""", (team_id,))
prev_starters = {r['player_id'] for r in cur.fetchall()}
if prev_starters:
overlap = len(set(lineup) & prev_starters)
squad_continuity = overlap / n_st
result[f'{prefix}_lineup_goals_per90'] = round(g90, 3)
result[f'{prefix}_lineup_assists_per90'] = round(a90, 3)
result[f'{prefix}_squad_continuity'] = round(squad_continuity, 3)
result[f'{prefix}_top_scorer_form'] = top_scorer_form
result[f'{prefix}_avg_player_exp'] = round(total_exp / n_st, 1)
result[f'{prefix}_goals_diversity'] = round(scorers_in_lineup / n_st, 3)
return result
except Exception as e:
print(f"[PlayerPredictor] Player-level features failed: {e}")
return defaults
def get_1x2_modifier(self, prediction: PlayerPrediction) -> Dict[str, float]:
"""
Calculate 1X2 probability modifiers based on squad analysis.
Returns modifiers to apply to base probabilities.
squad_diff is in training scale (~-33 to +33), normalize to -1..+1.
"""
diff = prediction.squad_diff / 33.0 # training-scale normalisation
diff = max(-1.0, min(1.0, diff)) # clamp
return {
"home_modifier": 1.0 + (diff * 0.3), # Up to +/-30%
"away_modifier": 1.0 - (diff * 0.3),
"draw_modifier": 1.0 - abs(diff) * 0.2 # Less draw if big diff
}
# Singleton
_engine: Optional[PlayerPredictorEngine] = None
def get_player_predictor() -> PlayerPredictorEngine:
global _engine
if _engine is None:
_engine = PlayerPredictorEngine()
return _engine
if __name__ == "__main__":
engine = get_player_predictor()
print("\n🧪 Player Predictor Engine Test")
print("=" * 50)
pred = engine.predict(
match_id="test_match",
home_team_id="test_home",
away_team_id="test_away"
)
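# PlayerPrediction defines no to_dict(); dataclasses.asdict() works on any dataclass instance.
from dataclasses import asdict
print("\n📊 Prediction:")
for k, v in asdict(pred).items():
print(f" {k}: {v}")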
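Review note: a small worked example of the squad-quality composite and the 1X2 modifier from `PlayerPredictorEngine`. The weights (0.3, 3.0, 1.5), the 33.0 normalization and the 0.3 / 0.2 scaling are copied from the diff. The standalone function names are assumptions for illustration only.

```python
def squad_quality(starting: int, key_players: int, forwards: int) -> float:
    """Composite used in predict(): starters, key players and forwards."""
    return starting * 0.3 + key_players * 3.0 + forwards * 1.5


def modifiers_1x2(squad_diff: float) -> dict:
    """Mirror of get_1x2_modifier: normalize the training-scale diff to
    [-1, 1], then scale it into multiplicative modifiers."""
    diff = max(-1.0, min(1.0, squad_diff / 33.0))
    return {
        "home_modifier": 1.0 + diff * 0.3,
        "away_modifier": 1.0 - diff * 0.3,
        "draw_modifier": 1.0 - abs(diff) * 0.2,
    }


home_q = squad_quality(starting=11, key_players=3, forwards=2)
away_q = squad_quality(starting=11, key_players=1, forwards=2)
print(round(home_q, 2), round(away_q, 2), modifiers_1x2(home_q - away_q))
```

The modifiers are multiplicative, so the adjusted 1X2 probabilities would need renormalizing to sum to 1; whether the caller does that is not visible in this part of the diff.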
+302
@@ -0,0 +1,302 @@
"""
Quantitative Finance Module — V2 Betting Engine
Edge calculation, Fractional Kelly Criterion staking, bet grading, and risk assessment.
"""
from __future__ import annotations
import math
from dataclasses import dataclass
from typing import Any
# ═══════════════════════════════════════════════════════════════════════════
# Constants
# ═══════════════════════════════════════════════════════════════════════════
BANKROLL_UNITS: float = 10.0 # Total bankroll in abstract units
KELLY_FRACTION: float = 0.25 # Quarter-Kelly (conservative, anti-ruin)
MIN_EDGE_PLAYABLE: float = 0.05 # 5% edge minimum to mark as playable
MIN_ODDS_PLAYABLE: float = 1.30 # Skip extreme chalk below 1.30
# ═══════════════════════════════════════════════════════════════════════════
# Edge Calculation
# ═══════════════════════════════════════════════════════════════════════════
def calculate_edge(true_prob: float, decimal_odds: float) -> float:
"""
Edge = (True_Probability × Decimal_Odds) - 1.0
Positive edge → the model says we have an advantage over the bookmaker.
"""
if decimal_odds <= 1.0 or true_prob <= 0.0:
return -1.0
return round((true_prob * decimal_odds) - 1.0, 4)
# ═══════════════════════════════════════════════════════════════════════════
# Kelly Criterion Staking
# ═══════════════════════════════════════════════════════════════════════════
def kelly_stake(true_prob: float, decimal_odds: float) -> float:
"""
Fractional Kelly Criterion for a bankroll of BANKROLL_UNITS.
Full Kelly: f* = ((b × p) - q) / b
where b = decimal_odds - 1, p = true_prob, q = 1 - true_prob
We use KELLY_FRACTION (25%) to reduce variance and avoid ruin.
Returns stake in units, rounded to 0.1.
"""
if decimal_odds <= 1.0 or true_prob <= 0.0 or true_prob >= 1.0:
return 0.0
b = decimal_odds - 1.0
p = true_prob
q = 1.0 - p
f_star = ((b * p) - q) / b
if f_star <= 0.0:
return 0.0
# Scale by fraction and bankroll
stake = f_star * KELLY_FRACTION * BANKROLL_UNITS
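# Cap at 3 units on a 10-unit bankroll. With the current constants the uncapped
# stake is always below KELLY_FRACTION * BANKROLL_UNITS = 2.5 (since f* < 1),
# so this cap only binds if those constants are raised.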
stake = min(stake, 3.0)
return round(max(0.0, stake), 1)
# ═══════════════════════════════════════════════════════════════════════════
# Bet Grading
# ═══════════════════════════════════════════════════════════════════════════
def grade_bet(edge: float, playable: bool) -> str:
"""
Assign a letter grade based on edge magnitude.
A: Edge > 10% — Elite value, rare
B: Edge > 5% — Strong value, core bets
C: Edge > 2% — Marginal value, supporting picks only
PASS: Below threshold — Do not bet
Note: when `playable` comes from is_playable(), edge >= MIN_EDGE_PLAYABLE (5%) already holds,
so "C" is only returned for an edge of exactly 5%.
"""
if not playable or edge < 0.02:
return "PASS"
if edge > 0.10:
return "A"
if edge > 0.05:
return "B"
return "C"
def is_playable(edge: float, decimal_odds: float) -> bool:
"""A pick is playable if it has sufficient edge AND reasonable odds."""
return edge >= MIN_EDGE_PLAYABLE and decimal_odds >= MIN_ODDS_PLAYABLE
# ═══════════════════════════════════════════════════════════════════════════
# Play Score (0-100 composite)
# ═══════════════════════════════════════════════════════════════════════════
def calculate_play_score(
edge: float,
true_prob: float,
data_quality: float,
) -> float:
"""
Composite score combining edge strength, probability confidence,
and data quality. Used for ranking picks and filtering.
Components:
- Edge contribution (0-50): edge * 250, capped at 50
- Prob contribution (0-30): probability * 30
- DQ contribution (0-20): data_quality * 20
"""
edge_score = min(50.0, max(0.0, edge * 250.0))
prob_score = min(30.0, max(0.0, true_prob * 30.0))
dq_score = min(20.0, max(0.0, data_quality * 20.0))
return round(edge_score + prob_score + dq_score, 1)
# ═══════════════════════════════════════════════════════════════════════════
# Risk Assessment
# ═══════════════════════════════════════════════════════════════════════════
@dataclass
class RiskResult:
level: str # LOW, MEDIUM, HIGH, EXTREME
score: float # 0.0 - 1.0
is_surprise_risk: bool
surprise_type: str | None
warnings: list[str]
def assess_risk(
missing_players_impact: float,
data_quality_score: float,
elo_diff: float,
implied_prob_fav: float,
) -> RiskResult:
"""
Multi-factor risk assessment.
Factors:
1. Missing key players (injuries/suspensions)
2. Data quality (missing stats, odds)
3. ELO closeness (tight matches are riskier)
4. Surprise potential (heavy favorite vulnerable)
"""
warnings: list[str] = []
risk_score = 0.0
# ─── Factor 1: Missing players ────────────────────────────────────
if missing_players_impact > 0.3:
risk_score += 0.35
warnings.append(
f"High missing-player impact: {missing_players_impact:.2f}"
)
elif missing_players_impact > 0.15:
risk_score += 0.15
warnings.append(
f"Moderate missing-player impact: {missing_players_impact:.2f}"
)
# ─── Factor 2: Data quality ───────────────────────────────────────
if data_quality_score < 0.5:
risk_score += 0.25
warnings.append(
f"Low data quality: {data_quality_score:.2f}"
)
elif data_quality_score < 0.75:
risk_score += 0.10
# ─── Factor 3: ELO closeness ──────────────────────────────────────
abs_elo_diff = abs(elo_diff)
if abs_elo_diff < 50:
risk_score += 0.15
warnings.append("Very tight ELO difference — coin-flip territory")
elif abs_elo_diff < 100:
risk_score += 0.05
# ─── Factor 4: Surprise detection ─────────────────────────────────
is_surprise = False
surprise_type: str | None = None
if implied_prob_fav > 0.65 and abs_elo_diff < 80:
# Heavy favorite by odds but ELO says match is closer
is_surprise = True
surprise_type = "odds_elo_divergence"
risk_score += 0.15
warnings.append(
"Upset potential: bookmaker odds suggest heavy favorite "
"but ELO says the match is closer than the market thinks"
)
# ─── Classify ─────────────────────────────────────────────────────
risk_score = min(1.0, risk_score)
if risk_score >= 0.7:
level = "EXTREME"
elif risk_score >= 0.45:
level = "HIGH"
elif risk_score >= 0.2:
level = "MEDIUM"
else:
level = "LOW"
return RiskResult(
level=level,
score=round(risk_score, 3),
is_surprise_risk=is_surprise,
surprise_type=surprise_type,
warnings=warnings,
)
# ═══════════════════════════════════════════════════════════════════════════
# Market Analysis (orchestrates edge/kelly/grade per market)
# ═══════════════════════════════════════════════════════════════════════════
@dataclass
class MarketPick:
    market: str
    pick: str
    probability: float
    odds: float
    edge: float
    playable: bool
    bet_grade: str
    stake_units: float
    play_score: float
    decision_reasons: list[str]


def analyze_market(
    market: str,
    probs: dict[str, float],
    odds_map: dict[str, float],
    data_quality_score: float,
) -> MarketPick:
    """
    For a given market (MS, OU25, BTTS), find the best pick,
    calculate edge, kelly stake, and grade it.

    Parameters:
        market: "MS", "OU25", "BTTS"
        probs: {"1": 0.55, "X": 0.25, "2": 0.20} (calibrated model probs)
        odds_map: {"1": 2.10, "X": 3.40, "2": 3.50} (decimal odds)
        data_quality_score: 0.0-1.0
    """
    best_pick: str = ""
    best_edge: float = -99.0
    best_prob: float = 0.0
    best_odds: float = 0.0
    reasons: list[str] = []

    for pick_name, prob in probs.items():
        odd = odds_map.get(pick_name, 0.0)
        if odd <= 1.0:
            continue
        edge = calculate_edge(prob, odd)
        if edge > best_edge:
            best_edge = edge
            best_pick = pick_name
            best_prob = prob
            best_odds = odd

    if not best_pick:
        return MarketPick(
            market=market, pick="", probability=0.0, odds=0.0,
            edge=0.0, playable=False, bet_grade="PASS",
            stake_units=0.0, play_score=0.0,
            decision_reasons=["no_valid_odds_found"],
        )

    playable = is_playable(best_edge, best_odds)
    grade = grade_bet(best_edge, playable)
    stake = kelly_stake(best_prob, best_odds) if playable else 0.0
    play_score = calculate_play_score(best_edge, best_prob, data_quality_score)

    # Build decision reasons
    if playable:
        reasons.append(f"edge_{best_edge:.1%}_above_threshold")
        reasons.append(f"kelly_stake_{stake:.1f}_units")
    else:
        if best_edge < MIN_EDGE_PLAYABLE:
            reasons.append(f"edge_{best_edge:.1%}_below_{MIN_EDGE_PLAYABLE:.0%}_threshold")
        if best_odds < MIN_ODDS_PLAYABLE:
            reasons.append(f"odds_{best_odds:.2f}_below_{MIN_ODDS_PLAYABLE:.2f}_minimum")

    return MarketPick(
        market=market,
        pick=best_pick,
        probability=round(best_prob, 4),
        odds=round(best_odds, 2),
        edge=round(best_edge, 4),
        playable=playable,
        bet_grade=grade,
        stake_units=stake,
        play_score=play_score,
        decision_reasons=reasons,
    )
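`analyze_market` relies on `calculate_edge` and `kelly_stake`, which are defined outside this hunk. As a rough illustration only, the sketch below uses the textbook definitions (edge as expected profit per unit staked at decimal odds, and the full Kelly fraction). The helper names `sketch_edge` and `sketch_kelly_fraction` are hypothetical, and the repository's own versions may scale, cap, or convert the stake to units differently.

```python
def sketch_edge(prob: float, odds: float) -> float:
    """Expected profit per unit staked at decimal odds (assumed definition)."""
    return prob * odds - 1.0


def sketch_kelly_fraction(prob: float, odds: float) -> float:
    """Full-Kelly bankroll fraction, clamped to 0 when there is no edge."""
    b = odds - 1.0  # net payout per unit staked
    if b <= 0:
        return 0.0
    return max(0.0, (prob * b - (1.0 - prob)) / b)


# Same example values as the analyze_market docstring.
probs = {"1": 0.55, "X": 0.25, "2": 0.20}
odds_map = {"1": 2.10, "X": 3.40, "2": 3.50}

edges = {pick: sketch_edge(p, odds_map[pick]) for pick, p in probs.items()}
best_pick = max(edges, key=edges.get)  # pick with the highest edge
kelly = sketch_kelly_fraction(probs[best_pick], odds_map[best_pick])
```

With these inputs only the home win has a positive edge, which matches the "highest edge wins" selection loop in `analyze_market`.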
@@ -0,0 +1 @@
# data package
@@ -0,0 +1,97 @@
"""
Async Database Module - V2 Betting Engine
==========================================
Provides async SQLAlchemy sessions via asyncpg for the V2 router.

Usage:
    async with get_session() as session:
        result = await session.execute(text("SELECT ..."))
"""

from __future__ import annotations

import os
from contextlib import asynccontextmanager
from typing import AsyncGenerator

from dotenv import load_dotenv
from sqlalchemy.ext.asyncio import (
    AsyncEngine,
    AsyncSession,
    async_sessionmaker,
    create_async_engine,
)

load_dotenv()

_engine: AsyncEngine | None = None
_session_maker: async_sessionmaker[AsyncSession] | None = None


def _get_async_dsn() -> str:
    """
    Convert DATABASE_URL to asyncpg-compatible format.

    Handles:
    1. Prisma's ``?schema=public`` suffix → stripped
    2. ``postgresql://`` driver prefix → ``postgresql+asyncpg://``
    """
    dsn = os.getenv(
        "DATABASE_URL",
        "postgresql://suggestbet:SuGGesT2026SecuRe@localhost:15432/boilerplate_db",
    )

    # Strip Prisma's ?schema= parameter
    if "?" in dsn:
        base, query = dsn.split("?", 1)
        kept_parts = [
            part for part in query.split("&") if part and not part.startswith("schema=")
        ]
        dsn = base if not kept_parts else f"{base}?{'&'.join(kept_parts)}"

    # Convert driver prefix for asyncpg
    if dsn.startswith("postgresql://"):
        dsn = dsn.replace("postgresql://", "postgresql+asyncpg://", 1)
    elif dsn.startswith("postgres://"):
        dsn = dsn.replace("postgres://", "postgresql+asyncpg://", 1)

    return dsn


def _ensure_engine() -> AsyncEngine:
    global _engine, _session_maker
    if _engine is None:
        _engine = create_async_engine(
            _get_async_dsn(),
            pool_size=5,
            max_overflow=5,
            pool_timeout=10,
            pool_pre_ping=True,
            echo=False,
        )
        _session_maker = async_sessionmaker(
            bind=_engine,
            class_=AsyncSession,
            expire_on_commit=False,
        )
        print("✅ Async database engine created (asyncpg)")
    return _engine


@asynccontextmanager
async def get_session() -> AsyncGenerator[AsyncSession, None]:
    """Provide an async session context manager."""
    _ensure_engine()
    assert _session_maker is not None
    async with _session_maker() as session:
        yield session


async def dispose_engine() -> None:
    """Shut down the async engine cleanly."""
    global _engine, _session_maker
    if _engine is not None:
        await _engine.dispose()
        _engine = None
        _session_maker = None
        print("Async database engine disposed")
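The DSN rewrite in `_get_async_dsn` is easier to check in isolation. The sketch below restates the same two steps as a pure function, so it can be exercised without a database or environment variables. The name `normalize_dsn` and the `user:pass` URLs are placeholders invented for this example, not part of the module.

```python
def normalize_dsn(dsn: str) -> str:
    """Drop Prisma's schema= query arg, then switch to the asyncpg driver prefix."""
    if "?" in dsn:
        base, query = dsn.split("?", 1)
        # Keep every query argument except schema=...
        kept = [p for p in query.split("&") if p and not p.startswith("schema=")]
        dsn = base if not kept else f"{base}?{'&'.join(kept)}"
    for prefix in ("postgresql://", "postgres://"):
        if dsn.startswith(prefix):
            return "postgresql+asyncpg://" + dsn[len(prefix):]
    return dsn  # already carries an explicit driver, leave untouched


plain = normalize_dsn("postgresql://user:pass@localhost:5432/app?schema=public")
mixed = normalize_dsn("postgres://user:pass@localhost/app?schema=public&sslmode=require")
```

Other query arguments, such as `sslmode`, survive the rewrite. Only the Prisma-specific `schema` key is removed.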
@@ -0,0 +1,92 @@
"""
Synchronous psycopg2 database helper for the AI Engine.
Uses a thread-safe connection pool for legacy V20+ endpoints.
"""
from __future__ import annotations

import os
from contextlib import contextmanager
from typing import Generator

import psycopg2
from psycopg2 import pool
from psycopg2.extensions import connection as PgConnection
from dotenv import load_dotenv

load_dotenv()

# Local-development default using the stock postgres/postgres login;
# set DATABASE_URL for any real deployment.
_DEFAULT_DSN = "postgresql://postgres:postgres@localhost:15432/boilerplate_db"


def get_clean_dsn() -> str:
    """
    Return a psycopg2-compatible DSN from DATABASE_URL.

    Handles DSN cleanup issues that break raw usage:
    1. Prisma appends '?schema=public' which psycopg2 cannot parse.
    2. A connect_timeout (PGCONNECT_TIMEOUT, default 5 seconds) is appended
       when the DSN does not already set one.
    """
    dsn: str = os.getenv("DATABASE_URL", _DEFAULT_DSN)
    connect_timeout: str = os.getenv("PGCONNECT_TIMEOUT", "5").strip() or "5"

    # Strip Prisma's ?schema= query parameter while preserving any other query args.
    if "?" in dsn:
        base, query = dsn.split("?", 1)
        kept_parts: list[str] = [
            part for part in query.split("&") if part and not part.startswith("schema=")
        ]
        dsn = base if not kept_parts else f"{base}?{'&'.join(kept_parts)}"

    # Force bounded DB connect attempts so API calls do not hang indefinitely.
    if "connect_timeout=" not in dsn:
        separator = "&" if "?" in dsn else "?"
        dsn = f"{dsn}{separator}connect_timeout={connect_timeout}"

    return dsn


class Database:
    _pool: pool.ThreadedConnectionPool | None = None

    @classmethod
    def initialize(cls) -> None:
        if cls._pool is None:
            dsn: str = get_clean_dsn()
            try:
                cls._pool = pool.ThreadedConnectionPool(
                    minconn=1,
                    maxconn=10,
                    dsn=dsn,
                )
                print("✅ Database connection pool created")
            except Exception as e:
                print(f"❌ Failed to create DB pool: {e}")
                raise

    @classmethod
    def get_conn(cls) -> PgConnection:
        if cls._pool is None:
            cls.initialize()
        assert cls._pool is not None  # guaranteed by initialize()
        return cls._pool.getconn()

    @classmethod
    def return_conn(cls, conn: PgConnection) -> None:
        if cls._pool:
            cls._pool.putconn(conn)

    @classmethod
    @contextmanager
    def connection(cls) -> Generator[PgConnection, None, None]:
        """Context manager for safe connection handling."""
        conn: PgConnection = cls.get_conn()
        try:
            yield conn
        finally:
            cls.return_conn(conn)

    @classmethod
    def close_all(cls) -> None:
        if cls._pool:
            cls._pool.closeall()
            print("Database connection pool closed")
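`Database.connection()` wraps the borrow/return cycle in `try`/`finally`, so a connection goes back to the pool even when the caller's query raises. The sketch below shows that guarantee with a stand-in pool, so it runs without PostgreSQL. `FakePool` and `borrowed` are hypothetical names used only for this illustration, and `FakePool` imitates just the `getconn`/`putconn` pair.

```python
from contextlib import contextmanager


class FakePool:
    """Stand-in for a connection pool: hands out and takes back one token."""

    def __init__(self):
        self.free = ["conn-1"]
        self.returned = 0

    def getconn(self):
        return self.free.pop()

    def putconn(self, conn):
        self.free.append(conn)
        self.returned += 1


@contextmanager
def borrowed(pool):
    conn = pool.getconn()
    try:
        yield conn
    finally:
        pool.putconn(conn)  # runs on success and on exceptions alike


fake_pool = FakePool()
try:
    with borrowed(fake_pool) as conn:
        raise RuntimeError("query failed")  # simulate a failing query
except RuntimeError:
    pass
```

After the failed block the pool again holds its single connection, which is the property that keeps a small pool (`maxconn=10` above) from draining under errors.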
@@ -0,0 +1,134 @@
{
"_meta": {
"source": "bt_10k",
"thresholds": "high:roi>10&n>=20 | low:roi<-5&n>=15 | unknown:n<10"
},
"lookup": {
"32n2r9bl6x90psj0wa7bfs6vq": {
"label": "high",
"bet_roi": 102.2,
"bet_n": 23,
"hit": 30.4,
"name": "Sudamericana"
},
"59tpnfrwnvhnhzmnvfyug68hj": {
"label": "high",
"bet_roi": 63.5,
"bet_n": 23,
"hit": 30.4,
"name": "Libertadores Kupası"
},
"b60nisd3qn427jm0hrg9kvmab": {
"label": "high",
"bet_roi": 49.7,
"bet_n": 22,
"hit": 22.7,
"name": "Allsvenskan"
},
"scf9p4y91yjvqvg5jndxzhxj": {
"label": "high",
"bet_roi": 33.8,
"bet_n": 100,
"hit": 25.0,
"name": "Serie A"
},
"4oogyu6o156iphvdvphwpck10": {
"label": "high",
"bet_roi": 32.3,
"bet_n": 23,
"hit": 21.7,
"name": "Şampiyonlar Ligi"
},
"89ovpy1rarewwzqvi30bfdr8b": {
"label": "high",
"bet_roi": 29.4,
"bet_n": 50,
"hit": 24.0,
"name": "1. Lig"
},
"82jkgccg7phfjpd0mltdl3pat": {
"label": "high",
"bet_roi": 25.8,
"bet_n": 29,
"hit": 27.6,
"name": "Süper Lig"
},
"3is4bkgf3loxv9qfg3hm8zfqb": {
"label": "high",
"bet_roi": 25.5,
"bet_n": 84,
"hit": 19.0,
"name": "LaLiga 2"
},
"enzlj1as2raqm4ids1zyb07y1": {
"label": "medium",
"bet_roi": 23.7,
"bet_n": 19,
"hit": 26.3,
"name": "USL 2. Lig"
},
"9ynnnx1qmkizq1o3qr3v0nsuk": {
"label": "high",
"bet_roi": 16.3,
"bet_n": 38,
"hit": 21.1,
"name": "Eliteserien"
},
"8ey0ww2zsosdmwr8ehsorh6t7": {
"label": "medium",
"bet_roi": 5.4,
"bet_n": 80,
"hit": 16.2,
"name": "Serie B"
},
"dm5ka0os1e3dxcp3vh05kmp33": {
"label": "low",
"bet_roi": -7.4,
"bet_n": 46,
"hit": 26.1,
"name": "Ligue 1"
},
"4zwgbb66rif2spcoeeol2motx": {
"label": "low",
"bet_roi": -12.7,
"bet_n": 39,
"hit": 23.1,
"name": "Pro Lig"
},
"a4fgj2rfbpf4ejo1qi624fefo": {
"label": "low",
"bet_roi": -14.2,
"bet_n": 73,
"hit": 17.8,
"name": "3. Lig"
},
"9chuiarcjofld1dkj9kysehmb": {
"label": "low",
"bet_roi": -14.9,
"bet_n": 22,
"hit": 13.6,
"name": "Superettan"
},
"3p81ltz6845appgkbgkzxueii": {
"label": "low",
"bet_roi": -19.8,
"bet_n": 34,
"hit": 14.7,
"name": "2. Lig"
},
"dvstmwnvw0mt5p38twn9yttyb": {
"label": "low",
"bet_roi": -37.2,
"bet_n": 19,
"hit": 26.3,
"name": "Veikkausliiga"
},
"zs18qaehvhg3w1208874zvfa": {
"label": "low",
"bet_roi": -62.0,
"bet_n": 17,
"hit": 23.5,
"name": "1. Lig"
}
}
}
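The `thresholds` string in `_meta` is compact, so here is one possible reading of it as code. This is a sketch under assumptions: the order of the checks and the `"medium"` fallback are inferred from the labels in `lookup`, not taken from the script that generated the file, and `label_league` is a hypothetical name.

```python
def label_league(roi: float, n: int) -> str:
    """Classify a league from backtest ROI (%) and bet count."""
    if n < 10:
        return "unknown"  # too few bets to judge
    if roi > 10 and n >= 20:
        return "high"
    if roi < -5 and n >= 15:
        return "low"
    return "medium"  # assumed fallback for everything in between


# (roi, n) pairs copied from the lookup entries above.
sudamericana = label_league(102.2, 23)
usl_second = label_league(23.7, 19)
veikkausliiga = label_league(-37.2, 19)
serie_b = label_league(5.4, 80)
```

Under this reading "USL 2. Lig" stays `medium` despite a 23.7% ROI because its 19 bets fall short of the 20-bet floor, which agrees with the label stored in the file.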
@@ -0,0 +1,726 @@
{
"version": "v1",
"description": "Per-league odds reliability scores computed from Brier Score analysis",
"min_matches_threshold": 50,
"total_leagues": 265,
"default_reliability": 0.35,
"lookup": {
"bx57cmq1edfq53ckfk791supi": 0.9476,
"55hcphd1ccc6eai1ms77460on": 0.9445,
"d9eaigzyfnfiraqc3ius757tl": 0.9402,
"1gxlzw2ezkyeykhcaa5x8ozkk": 0.9259,
"5jd0k2txwnq69frs79eulba8j": 0.9233,
"6694fff47wqxl10lrd9tb91f8": 0.9193,
"4jg7he1n3rb5dniq6hf49xorq": 0.9061,
"59tpnfrwnvhnhzmnvfyug68hj": 0.8988,
"ac42gi3penartj88fe9l6plpk": 0.8937,
"3j81qr7yc4gdnakfwnxf95ovh": 0.8771,
"9z5643nd06afqu01ea2wt8y4g": 0.8734,
"482ofyysbdbeoxauk19yg7tdt": 0.8722,
"ahl3vljaignq9ebaos4uqkrvo": 0.8696,
"8x3sbh85gc8qir50utw39jl04": 0.865,
"agpweohvn9tugnyl6ry4rhivp": 0.8428,
"4c1nfi2j1m731hcay25fcgndq": 0.8425,
"1j4ehtrbry9depwt6oghaq3lu": 0.8299,
"40yjcbx2sq6oq736iqqqczwt1": 0.8237,
"145hkd59i6foieuwr4mwi6wlq": 0.823,
"34pl8szyvrbwcmfkuocjm3r6t": 0.8227,
"cse5oqqt2pzfcy8uz6yz3tkbj": 0.8212,
"zs18qaehvhg3w1208874zvfa": 0.8176,
"57nu0wygurzkp6fuy5hhrtaa2": 0.8099,
"1eruend45vd20g9hbrpiggs5u": 0.8083,
"595nsvo7ykvoe690b1e4u5n56": 0.7987,
"6vq8j5p3av14nr3iuyi4okhjt": 0.793,
"486rhdgz7yc0sygziht7hje65": 0.7901,
"9hh6n2f84k31zmlcxyvmc1w2y": 0.789,
"3n5046abeu3x482ds3jwda238": 0.7863,
"8yi6ejjd1zudcqtbn07haahg6": 0.7752,
"byhmntnl1b4lxw0zz21im3zkd": 0.7719,
"2bmwykmdlcc2u1c40ytoc39vy": 0.7668,
"82jkgccg7phfjpd0mltdl3pat": 0.7643,
"2nttcoriwf5co73vmz1vr8frm": 0.7641,
"dr2xk7muj8aqcjdz2b3li1c0k": 0.759,
"4yngyfinzd6bb1k7anqtqs0wt": 0.7586,
"eog6knrkfei68si736fpquyzc": 0.756,
"eg6s9f1jj7jr6stmbosn0g6c8": 0.7538,
"ae1wva3zrzcp2zd15gpvsntg6": 0.7517,
"cesdwwnxbc5fmajgroc0hqzy2": 0.7466,
"8k1xcsyvxapl4jlsluh3eomre": 0.7463,
"bdtat25m14jy85y484z3e6lf": 0.7437,
"iu1vi94p4p28oozl1h9bvplr": 0.7411,
"1r097lpxe0xn03ihb7wi98kao": 0.7391,
"2kwbbcootiqqgmrzs6o5inle5": 0.7386,
"9fuwphq8kvugrlc3ckm7k8wes": 0.7358,
"civf31q1inxohs4a03y8reetf": 0.735,
"ili150pwfuf39f7yfdch9lhw": 0.7286,
"abs7n2ae3oydilk0tgmpnsj89": 0.7277,
"9nbpdi9q3ywcm4q0j5u0ekwcq": 0.7254,
"6by3h89i2eykc341oz7lv1ddd": 0.7252,
"4qehj8hfxmy6o2ohp4fxinnzo": 0.7244,
"9u4pm8x0lfmfq3r0pypmrls71": 0.7244,
"c7b8o53flg36wbuevfzy3lb10": 0.7144,
"89ovpy1rarewwzqvi30bfdr8b": 0.7068,
"4d5d3sf6805n5u6jdoa0hdlog": 0.7052,
"eqz64pn0qsp2y7aq4m9id3fn6": 0.7031,
"8q60vlvn3krynkob6igrncdjq": 0.703,
"6ihotpaocgiovlxw18e9r9prx": 0.7019,
"c0r21rtokgnbtc0o2rldjmkxu": 0.7013,
"1mpjd0vbxbtu9zw89yj09xk3z": 0.6996,
"4zwgbb66rif2spcoeeol2motx": 0.6995,
"bu1l7ckihyr0errxw61p0m05": 0.6995,
"cv3tuitw3ho3v0opjjxpn83b9": 0.6974,
"8r98daokeuzsamu5fmjtblqx5": 0.6922,
"dvstmwnvw0mt5p38twn9yttyb": 0.688,
"8y29fg2s85ppcb8uugm5ee8s4": 0.6866,
"19q13y6ruzo0o84ipblcuouzs": 0.6858,
"f4jc2cc5nq7flaoptpi5ua4k4": 0.6852,
"4oogyu6o156iphvdvphwpck10": 0.684,
"3e40pestup9xzagsu2o6c0i8u": 0.6824,
"4rls982p5uzil6x30mhyhv9f3": 0.6812,
"e21cf135btr8t3upw0vl6n6x0": 0.6771,
"65q4uwm6ol1rkf5dp89m8omny": 0.6754,
"46b141eaqq9q7o4gz5gtdpikk": 0.6752,
"75i269i1ak43magshljadydrh": 0.6741,
"3ab1uwtoyjopdj1y1fynyy9jg": 0.6737,
"4mbfidy8zum5u0aqjqo0vuqs2": 0.673,
"7wssxdqi4xihseeam8grqa2b8": 0.666,
"61fzfjogstjuukzcehighq7mu": 0.6641,
"6g8hw3acenrw828la7gwx4mvs": 0.663,
"e1kxdivp5g4cpldgpwvnzl1vv": 0.6626,
"9ikchyu9fb8bvx0s673jofj6s": 0.6622,
"a9vrdkelbgif0gtu3wxsr75xo": 0.6618,
"6sxm2iln2w45ux498pty9miw8": 0.6615,
"ea0h6cf3bhl698hkxhpulh2zz": 0.661,
"apdwh753fupxheygs8seahh7x": 0.6604,
"er5745q30wnr8jv9nr863omzg": 0.659,
"2z7257m7hj58zuxcjrsg4erzc": 0.6551,
"2o9svokc5s7diish3ycrzk7jm": 0.655,
"8usjlmziv3p2re0r2wwzezki9": 0.6549,
"c0yqkbilbbg70ij2473xymmqv": 0.6506,
"du6jsenbjql5e8f3yk880ox4g": 0.6494,
"cbdbziaqczfuyuwqsylqi26zd": 0.6478,
"725gd73msyt08xm76v7gkxj7u": 0.6445,
"enzlj1as2raqm4ids1zyb07y1": 0.6442,
"scf9p4y91yjvqvg5jndxzhxj": 0.6414,
"5z8v4mj6cjs9ex6hdrpourjzh": 0.6389,
"4zwjlzdszduqmxzusysvzymms": 0.6387,
"7nmz249q89qg5ezcvzlheljji": 0.6381,
"2mdmx668tyhy4u4z9zszwjv5v": 0.6345,
"4a7o9rf7ytl8g3ejwpblc6p5n": 0.6306,
"2ty8ihceabty8yddmu31iuuej": 0.6283,
"dy8zaksw5e9nwrs1p5ss4o1nu": 0.628,
"1b70m6qtxrp75b4vtk8hxh8c3": 0.6261,
"ajxs0e0g6ryg5ol8qvw3evrcz": 0.6249,
"a4fgj2rfbpf4ejo1qi624fefo": 0.6184,
"akmkihra9ruad09ljapsm84b3": 0.6182,
"907l7wtxdvugdo9i2249wcmr0": 0.6171,
"6lwpjhktjhl9g7x2w7njmzva6": 0.6164,
"ax1yf4nlzqpcji4j8epdgx3zl": 0.6163,
"6ybvtzejh91761lqe7y1csrqo": 0.6158,
"3btdfgw79qiz3jmyfudovtbu2": 0.6122,
"5cwsxtx37les6m10xj71htkgf": 0.6101,
"9p3nnxhdjahfn8qswpzy8oyc3": 0.61,
"2xg0qvif1rh7du6wmk2eleku3": 0.6091,
"1wwro3z1eb3fl601dju6inlc6": 0.6084,
"gfskxsdituog2kqp9yiu7bzi": 0.6076,
"zilopfej2h0n3vpan5tcynpo": 0.6051,
"2hsidwomhjsaaytdy9u5niyi4": 0.6012,
"1klyfth8tl6lu6ra7k8zmy2n2": 0.5996,
"cegl2ivkc25blcatxp4jmk1ec": 0.5993,
"7qf0jaayyxy3ruamsexv5p1kl": 0.5988,
"erpufio3qaujd9gkszcqvb0bf": 0.5972,
"cfesxhzb83yl8b779uv3revz1": 0.597,
"3ww12jab49q8q8mk9avdwjqgk": 0.5961,
"8t2o4huu2e48ij23dxnl9w5qx": 0.5928,
"5vq1bl8h8dxdr34w0jaanokto": 0.5919,
"ac112osli9fvox1epcg4ld3t6": 0.59,
"3frp1zxrqulrlrnk503n6l4l": 0.5808,
"c76z5d6j7dpi1e79tm8fpm39z": 0.5807,
"6ifaeunfdelecgticvxanikzu": 0.5796,
"81txfenlgw75nq3u2nfdkj92o": 0.5789,
"yv73ms6v1995b5wny16jcfi3": 0.5787,
"b3ufcd24wfnnd5j98ped6irfu": 0.5752,
"29actv1ohj8r10kd9hu0jnb0n": 0.5737,
"bfqezwfhot1l3p1cpk4oonh25": 0.5705,
"5taraea6mqjjldg9zxswo825y": 0.5696,
"7qdv1xae7ikfe8dft3oj29yqc": 0.5692,
"dm5ka0os1e3dxcp3vh05kmp33": 0.5678,
"ay4u6j7lfkcg7x21mx5q121j": 0.5676,
"7af85xa75vozt2l4hzi6ryts7": 0.5663,
"5k620c7y6dlbmcm88dt3eb7t": 0.5644,
"ejunkmfhjz9weugd2bqrkgobb": 0.564,
"3428tckxcirwwh3o3jgc1m8ji": 0.5597,
"d6zovb8puwgcmsg91iya6rbtm": 0.5593,
"2wolc27r8z03itcvwp43e38c5": 0.5592,
"alpfd99yd3lfv7bhjo0biuq7b": 0.5582,
"beqqnubkv05mamuwvimeum015": 0.5577,
"4w7x0s5gfs5abasphlha5de8k": 0.5558,
"9ynnnx1qmkizq1o3qr3v0nsuk": 0.554,
"722fdbecxzcq9788l6jqclzlw": 0.5539,
"287tckirbfj9nb8ar2k9r60vn": 0.5529,
"esrunz7rjb0td98mx9e5cedoy": 0.5516,
"32n2r9bl6x90psj0wa7bfs6vq": 0.5487,
"50ap4sua1xyut3mpu7ehesp63": 0.5483,
"5c96g1zm7vo5ons9c42uy2w3r": 0.5469,
"3p81ltz6845appgkbgkzxueii": 0.5454,
"3n9mk5b2mxmq831wfmv6pu86i": 0.5437,
"5zr0b05eyx25km7z1k03ca9jx": 0.5424,
"1owhvvge4wlx7e0e431b4vhqx": 0.5423,
"3iwftmprsznl6yribr11a8l9m": 0.5393,
"7r1f93t6ddrsa5n8v1nq6qlzm": 0.5393,
"1gwajyt0pk2jm5fx5mu36v114": 0.5389,
"581t4mywybx21wcpmpykhyzr3": 0.5388,
"6wubmo7di3kdpflluf6s8c7vs": 0.5375,
"bq89wbdvedtov6auzuh6rsv7s": 0.5363,
"byu00jvt1j6csyv4y1lkt2fm2": 0.5359,
"af79lqrc0ntom74zq13ccjslo": 0.5357,
"3ri6juw2w6ma0jezszdlv1uqm": 0.5356,
"3l29w00m506ex93t5bbh9cg2a": 0.5355,
"1zp1du9n4rj36p1ss9zbxtqfb": 0.5353,
"9chuiarcjofld1dkj9kysehmb": 0.5346,
"5aw6uyw4pz2bpj24t5z8aacim": 0.5333,
"by5nibd18nkt40t0j8a0j5yzx": 0.5332,
"4yzidekywejmxxp77gqmdgopg": 0.5323,
"7ntvbsyq31jnzoqoa8850b9b8": 0.5305,
"a7247po5qs29o3zsfmt222ydu": 0.5299,
"117yqo02rs8dykkxpm274w3bd": 0.5298,
"193wqkyb0v5jnsblhvd2ocmyo": 0.5296,
"8jh0jejuxfhrpawnoztz2jlv4": 0.5295,
"5y0z0l2epprzbscvzsgldw8vu": 0.5288,
"47s2kt0e8m444ftqvsrqa3bvq": 0.5268,
"2hj3286pqov1g1g59k2t2qcgm": 0.5245,
"7swf4kpu3v38i2it4h94c5s9k": 0.5227,
"78wml3z5wrfxe5iky50tiotgu": 0.5196,
"f39uq10c8xhg5e6rwwcf6lhgc": 0.5186,
"bbajzna018c79opa1kl5kmkqo": 0.5172,
"4davonpqws4a4ejl1awu98zdg": 0.5168,
"1fedahp0rws09tj451onten8r": 0.5163,
"aho73e5udydy96iun3tkzdzsi": 0.5149,
"3aa4mumjl6zyetg6o9hwd5hhx": 0.5125,
"7cwemnr3vi40znjq451zxkus6": 0.5115,
"ajm86skyzse4ym8g6fpgzncxa": 0.5112,
"bgen5kjer2ytfp7lo9949t72g": 0.5102,
"8ey0ww2zsosdmwr8ehsorh6t7": 0.51,
"8najqkluatpaxvqws78b9s17c": 0.5082,
"8v97rcbthsxmzqk4ufxws9mug": 0.506,
"degxm4y6gmvp011ccyrev6z5p": 0.5049,
"3oa9e03e7w9nr8kqwqc3tlqz9": 0.5049,
"5dycj9wdhxh3n33qubw18ohlk": 0.5036,
"3is4bkgf3loxv9qfg3hm8zfqb": 0.5033,
"f47f3717z2vtpxfxrpdd4jl1x": 0.498,
"8ivsfwex4dfx1tvgsiq8askcx": 0.4972,
"8vbck9a4mxjms783lf72779uu": 0.4946,
"aql5z4osw5wmun0emnakfpwji": 0.4946,
"e6vzdkz6l236s9p288mharefy": 0.4925,
"4nidzmunvpvxk1ir9b6m8mpay": 0.4874,
"ein4fkggto3pdh5msp8huafiq": 0.4856,
"1q4ab2bpg5e8jl1g2udnakrju": 0.4852,
"8ztsv3pzrsyq5w1r3a0nfk1y5": 0.4842,
"1qd0wvt30rlswa4g6nu4na660": 0.4826,
"jznihqxle06xych9ygwiwnsa": 0.4796,
"2y8bntiif3a9y6gtmauv30gt": 0.4782,
"477yyajzheg2z8u7uick0e13e": 0.4706,
"bockl24qpr7ryjl8b6obukga": 0.4671,
"7mxwwunvot2pi69pj1yr1kh8i": 0.466,
"3w1hkk9k9gr8fwssyn4icvdfo": 0.4657,
"1txej2dzohnydl21zc9pgx6hy": 0.464,
"b8rae0ib0frjmwlca429bq19q": 0.4624,
"b5udgm9vakjqz8dcmy5b2g0xt": 0.4582,
"eitf7hulqfv1clb7toewkil24": 0.458,
"7hl0svs2hg225i2zud0g3xzp2": 0.4559,
"2aso72utuctat2ecs6nahjss6": 0.4521,
"3ymqchdzk8tt6lfphf26xfvh0": 0.4519,
"2yyjcbbryf1r10apyzl7c7jvp": 0.4507,
"bly7ema5au6j40i0grhl0pnub": 0.4476,
"b1rveez5u792gess9w3e7v5le": 0.4444,
"8sdpk4aerruf515yh76ezo7vi": 0.4434,
"32vph7vcjqgo1ksj1548di90n": 0.44,
"65ggsqdi6drpa4m8y3gkll25k": 0.4394,
"xaouuwuk8qyhv1libkeexwjh": 0.4347,
"6qitd9h242qkvjenaytfdnsf2": 0.4312,
"duuc1qczfnawwncru1ly6o66": 0.4213,
"b60nisd3qn427jm0hrg9kvmab": 0.4203,
"xwnjb1az11zffwty3m6vn8y6": 0.4197,
"dkarmrybx9vx10rg7cywumth0": 0.4158,
"75434tz9rc14xkkvudex742ui": 0.4137,
"c1d9p6b2e9zr5tqlzx3ktjplg": 0.4129,
"b73zounsynk9d3u1p9nvpu7i2": 0.4049,
"913mb508il6jzwtlj28fl892h": 0.4044,
"e0lck99w8meo9qoalfrxgo33o": 0.401,
"8dn0w8zh7nbn2i904603eigwf": 0.3984,
"ddyrh5latwfhesgfh4w401n92": 0.3973,
"avs3xposm3t9x1x2vzsoxzcbu": 0.3957,
"eu2g5j36zzxiazpd729osx0wm": 0.3924,
"67uya58idol2eq18ljecsru5o": 0.3912,
"23e698ls3x6vi9x8wl0mz7bsa": 0.3838,
"6321dlqv4ziuwqte4xpohijtw": 0.382,
"8o5tv5viv4hy1qg9jp94k7ayb": 0.381,
"53tknno09wqihmwxrqcuwq9sa": 0.3782,
"82wo38rqeizxlfjjhfjy4rx7u": 0.3781,
"dvtl8sf1262pd2aqgu641qa7u": 0.3767,
"663a54fmymndjeev47qm7d3nf": 0.3522,
"macko16888165594668885588": 0.3309,
"macko16698982162572521585": 0.3262,
"6lkj3o21cr4g7bql6tb3fk222": 0.3261,
"cu0rmpyff5692eo06ltddjo8a": 0.3161,
"1cnx2c8g3hhp8ssxnwwli0mjb": 0.3121,
"4vt0ldrcl6thpxpcs8zmpdq1g": 0.2926,
"etta63x1t7tnkn4jheisjwk4p": 0.2907,
"1n9l0ex47bu0762qg574hzjtd": 0.2626,
"6jgwiu2gq3dllmrwt45pfdn2z": 0.2416,
"392slbmf1kdqlr6sd1ckt71rs": 0.24,
"8z3180hhw2pj1i65uftlk54uz": 0.2096
},
"details": [
{
"league_id": "bx57cmq1edfq53ckfk791supi",
"league_name": "CAF Konfederasyon Kupası",
"match_count": 98,
"brier_score": 0.3046,
"heavy_fav_win_pct": 84.1,
"fav_win_pct": 63.3,
"odds_reliability": 0.9476
},
{
"league_id": "55hcphd1ccc6eai1ms77460on",
"league_name": "Şampiyonlar Ligi Kadınlar",
"match_count": 89,
"brier_score": 0.3258,
"heavy_fav_win_pct": 83.3,
"fav_win_pct": 74.2,
"odds_reliability": 0.9445
},
{
"league_id": "d9eaigzyfnfiraqc3ius757tl",
"league_name": "Kupa",
"match_count": 78,
"brier_score": 0.3141,
"heavy_fav_win_pct": 81.2,
"fav_win_pct": 73.1,
"odds_reliability": 0.9402
},
{
"league_id": "1gxlzw2ezkyeykhcaa5x8ozkk",
"league_name": "Concacaf Orta Amerika Kupası",
"match_count": 88,
"brier_score": 0.3338,
"heavy_fav_win_pct": 79.4,
"fav_win_pct": 61.4,
"odds_reliability": 0.9259
},
{
"league_id": "5jd0k2txwnq69frs79eulba8j",
"league_name": "Kupa",
"match_count": 69,
"brier_score": 0.3223,
"heavy_fav_win_pct": 78.4,
"fav_win_pct": 66.7,
"odds_reliability": 0.9233
},
{
"league_id": "6694fff47wqxl10lrd9tb91f8",
"league_name": "Kupa",
"match_count": 55,
"brier_score": 0.3099,
"heavy_fav_win_pct": 78.8,
"fav_win_pct": 67.3,
"odds_reliability": 0.9193
},
{
"league_id": "4jg7he1n3rb5dniq6hf49xorq",
"league_name": "Premier Lig",
"match_count": 79,
"brier_score": 0.3333,
"heavy_fav_win_pct": 77.1,
"fav_win_pct": 64.6,
"odds_reliability": 0.9061
},
{
"league_id": "59tpnfrwnvhnhzmnvfyug68hj",
"league_name": "Libertadores Kupası",
"match_count": 180,
"brier_score": 0.3408,
"heavy_fav_win_pct": 76.2,
"fav_win_pct": 61.7,
"odds_reliability": 0.8988
},
{
"league_id": "ac42gi3penartj88fe9l6plpk",
"league_name": "Premier Lig",
"match_count": 185,
"brier_score": 0.3148,
"heavy_fav_win_pct": 70.7,
"fav_win_pct": 68.1,
"odds_reliability": 0.8937
},
{
"league_id": "3j81qr7yc4gdnakfwnxf95ovh",
"league_name": "Premier Lig",
"match_count": 106,
"brier_score": 0.333,
"heavy_fav_win_pct": 72.2,
"fav_win_pct": 60.4,
"odds_reliability": 0.8771
},
{
"league_id": "9z5643nd06afqu01ea2wt8y4g",
"league_name": "Kuu Bara Ligi",
"match_count": 110,
"brier_score": 0.3294,
"heavy_fav_win_pct": 70.3,
"fav_win_pct": 53.6,
"odds_reliability": 0.8734
},
{
"league_id": "482ofyysbdbeoxauk19yg7tdt",
"league_name": "Trendyol Süper Lig",
"match_count": 342,
"brier_score": 0.3627,
"heavy_fav_win_pct": 80.7,
"fav_win_pct": 59.6,
"odds_reliability": 0.8722
},
{
"league_id": "ahl3vljaignq9ebaos4uqkrvo",
"league_name": "Kupa",
"match_count": 105,
"brier_score": 0.331,
"heavy_fav_win_pct": 70.4,
"fav_win_pct": 63.8,
"odds_reliability": 0.8696
},
{
"league_id": "8x3sbh85gc8qir50utw39jl04",
"league_name": "UEFA Kadınlar Euro 2025 Elemeleri",
"match_count": 88,
"brier_score": 0.3421,
"heavy_fav_win_pct": 75.5,
"fav_win_pct": 61.4,
"odds_reliability": 0.865
},
{
"league_id": "agpweohvn9tugnyl6ry4rhivp",
"league_name": "Eredivisie Kadınlar",
"match_count": 51,
"brier_score": 0.3356,
"heavy_fav_win_pct": 72.0,
"fav_win_pct": 56.9,
"odds_reliability": 0.8428
},
{
"league_id": "4c1nfi2j1m731hcay25fcgndq",
"league_name": "Avrupa Ligi",
"match_count": 242,
"brier_score": 0.3625,
"heavy_fav_win_pct": 77.6,
"fav_win_pct": 61.6,
"odds_reliability": 0.8425
},
{
"league_id": "1j4ehtrbry9depwt6oghaq3lu",
"league_name": "Süper Lig",
"match_count": 84,
"brier_score": 0.3201,
"heavy_fav_win_pct": 65.9,
"fav_win_pct": 60.7,
"odds_reliability": 0.8299
},
{
"league_id": "40yjcbx2sq6oq736iqqqczwt1",
"league_name": "DK Elemeler",
"match_count": 88,
"brier_score": 0.3383,
"heavy_fav_win_pct": 68.6,
"fav_win_pct": 55.7,
"odds_reliability": 0.8237
},
{
"league_id": "145hkd59i6foieuwr4mwi6wlq",
"league_name": "Pro Lig",
"match_count": 143,
"brier_score": 0.3546,
"heavy_fav_win_pct": 73.8,
"fav_win_pct": 60.1,
"odds_reliability": 0.823
},
{
"league_id": "34pl8szyvrbwcmfkuocjm3r6t",
"league_name": "LaLiga",
"match_count": 364,
"brier_score": 0.3773,
"heavy_fav_win_pct": 80.2,
"fav_win_pct": 56.6,
"odds_reliability": 0.8227
},
{
"league_id": "cse5oqqt2pzfcy8uz6yz3tkbj",
"league_name": "CAF Şampiyonlar Ligi",
"match_count": 91,
"brier_score": 0.3513,
"heavy_fav_win_pct": 73.9,
"fav_win_pct": 57.1,
"odds_reliability": 0.8212
},
{
"league_id": "zs18qaehvhg3w1208874zvfa",
"league_name": "1. Lig",
"match_count": 225,
"brier_score": 0.3744,
"heavy_fav_win_pct": 82.1,
"fav_win_pct": 59.6,
"odds_reliability": 0.8176
},
{
"league_id": "57nu0wygurzkp6fuy5hhrtaa2",
"league_name": "1. Lig",
"match_count": 286,
"brier_score": 0.3626,
"heavy_fav_win_pct": 72.9,
"fav_win_pct": 59.1,
"odds_reliability": 0.8099
},
{
"league_id": "1eruend45vd20g9hbrpiggs5u",
"league_name": "Botola Pro",
"match_count": 265,
"brier_score": 0.3625,
"heavy_fav_win_pct": 72.9,
"fav_win_pct": 50.2,
"odds_reliability": 0.8083
},
{
"league_id": "595nsvo7ykvoe690b1e4u5n56",
"league_name": "UEFA Uluslar Ligi",
"match_count": 67,
"brier_score": 0.3687,
"heavy_fav_win_pct": 83.3,
"fav_win_pct": 50.7,
"odds_reliability": 0.7987
},
{
"league_id": "6vq8j5p3av14nr3iuyi4okhjt",
"league_name": "Süper Lig Kadınlar",
"match_count": 70,
"brier_score": 0.356,
"heavy_fav_win_pct": 73.5,
"fav_win_pct": 58.6,
"odds_reliability": 0.793
},
{
"league_id": "486rhdgz7yc0sygziht7hje65",
"league_name": "Kupa",
"match_count": 62,
"brier_score": 0.3704,
"heavy_fav_win_pct": 81.1,
"fav_win_pct": 66.1,
"odds_reliability": 0.7901
},
{
"league_id": "9hh6n2f84k31zmlcxyvmc1w2y",
"league_name": "2. Lig",
"match_count": 204,
"brier_score": 0.357,
"heavy_fav_win_pct": 69.2,
"fav_win_pct": 62.3,
"odds_reliability": 0.789
},
{
"league_id": "3n5046abeu3x482ds3jwda238",
"league_name": "WE Lig Kadınlar",
"match_count": 102,
"brier_score": 0.3761,
"heavy_fav_win_pct": 85.4,
"fav_win_pct": 58.8,
"odds_reliability": 0.7863
},
{
"league_id": "8yi6ejjd1zudcqtbn07haahg6",
"league_name": "Premier Lig",
"match_count": 302,
"brier_score": 0.3712,
"heavy_fav_win_pct": 72.1,
"fav_win_pct": 56.3,
"odds_reliability": 0.7752
},
{
"league_id": "byhmntnl1b4lxw0zz21im3zkd",
"league_name": "Kupa",
"match_count": 96,
"brier_score": 0.3528,
"heavy_fav_win_pct": 68.2,
"fav_win_pct": 58.3,
"odds_reliability": 0.7719
},
{
"league_id": "2bmwykmdlcc2u1c40ytoc39vy",
"league_name": "Açık Kupası",
"match_count": 93,
"brier_score": 0.3807,
"heavy_fav_win_pct": 84.6,
"fav_win_pct": 66.7,
"odds_reliability": 0.7668
},
{
"league_id": "82jkgccg7phfjpd0mltdl3pat",
"league_name": "Süper Lig",
"match_count": 289,
"brier_score": 0.3782,
"heavy_fav_win_pct": 74.0,
"fav_win_pct": 57.4,
"odds_reliability": 0.7643
},
{
"league_id": "2nttcoriwf5co73vmz1vr8frm",
"league_name": "Nesine 2. Lig",
"match_count": 525,
"brier_score": 0.3782,
"heavy_fav_win_pct": 71.8,
"fav_win_pct": 55.2,
"odds_reliability": 0.7641
},
{
"league_id": "dr2xk7muj8aqcjdz2b3li1c0k",
"league_name": "Meistaradeildin",
"match_count": 129,
"brier_score": 0.3714,
"heavy_fav_win_pct": 73.6,
"fav_win_pct": 61.2,
"odds_reliability": 0.759
},
{
"league_id": "4yngyfinzd6bb1k7anqtqs0wt",
"league_name": "Premier Lig",
"match_count": 195,
"brier_score": 0.3772,
"heavy_fav_win_pct": 74.4,
"fav_win_pct": 57.4,
"odds_reliability": 0.7586
},
{
"league_id": "eog6knrkfei68si736fpquyzc",
"league_name": "Lig Kupası",
"match_count": 120,
"brier_score": 0.3632,
"heavy_fav_win_pct": 69.9,
"fav_win_pct": 66.7,
"odds_reliability": 0.756
},
{
"league_id": "eg6s9f1jj7jr6stmbosn0g6c8",
"league_name": "Süper Lig",
"match_count": 108,
"brier_score": 0.3657,
"heavy_fav_win_pct": 71.2,
"fav_win_pct": 55.6,
"odds_reliability": 0.7538
},
{
"league_id": "ae1wva3zrzcp2zd15gpvsntg6",
"league_name": "Ulusal Lig",
"match_count": 278,
"brier_score": 0.3783,
"heavy_fav_win_pct": 72.7,
"fav_win_pct": 55.0,
"odds_reliability": 0.7517
},
{
"league_id": "cesdwwnxbc5fmajgroc0hqzy2",
"league_name": "Hazırlık Maçları Ülkeler",
"match_count": 235,
"brier_score": 0.3669,
"heavy_fav_win_pct": 67.6,
"fav_win_pct": 56.2,
"odds_reliability": 0.7466
},
{
"league_id": "8k1xcsyvxapl4jlsluh3eomre",
"league_name": "Premier Lig",
"match_count": 328,
"brier_score": 0.385,
"heavy_fav_win_pct": 74.2,
"fav_win_pct": 45.7,
"odds_reliability": 0.7463
},
{
"league_id": "bdtat25m14jy85y484z3e6lf",
"league_name": "Kupa",
"match_count": 90,
"brier_score": 0.3772,
"heavy_fav_win_pct": 75.7,
"fav_win_pct": 55.6,
"odds_reliability": 0.7437
},
{
"league_id": "iu1vi94p4p28oozl1h9bvplr",
"league_name": "1. Lig",
"match_count": 158,
"brier_score": 0.3729,
"heavy_fav_win_pct": 71.2,
"fav_win_pct": 50.0,
"odds_reliability": 0.7411
},
{
"league_id": "1r097lpxe0xn03ihb7wi98kao",
"league_name": "Serie A",
"match_count": 359,
"brier_score": 0.3732,
"heavy_fav_win_pct": 67.8,
"fav_win_pct": 56.5,
"odds_reliability": 0.7391
},
{
"league_id": "2kwbbcootiqqgmrzs6o5inle5",
"league_name": "Premier Lig",
"match_count": 369,
"brier_score": 0.3791,
"heavy_fav_win_pct": 70.2,
"fav_win_pct": 54.2,
"odds_reliability": 0.7386
},
{
"league_id": "9fuwphq8kvugrlc3ckm7k8wes",
"league_name": "Ligler Kupası",
"match_count": 143,
"brier_score": 0.3934,
"heavy_fav_win_pct": 81.6,
"fav_win_pct": 50.3,
"odds_reliability": 0.7358
},
{
"league_id": "civf31q1inxohs4a03y8reetf",
"league_name": "Premier Lig",
"match_count": 320,
"brier_score": 0.3721,
"heavy_fav_win_pct": 67.2,
"fav_win_pct": 57.8,
"odds_reliability": 0.735
},
{
"league_id": "ili150pwfuf39f7yfdch9lhw",
"league_name": "UEFA U21 Şampiyonası Elemeler",
"match_count": 112,
"brier_score": 0.3715,
"heavy_fav_win_pct": 70.4,
"fav_win_pct": 67.9,
"odds_reliability": 0.7286
},
{
"league_id": "abs7n2ae3oydilk0tgmpnsj89",
"league_name": "Azadegan Ligi",
"match_count": 217,
"brier_score": 0.3801,
"heavy_fav_win_pct": 71.4,
"fav_win_pct": 45.2,
"odds_reliability": 0.7277
},
{
"league_id": "9nbpdi9q3ywcm4q0j5u0ekwcq",
"league_name": "Serie D",
"match_count": 232,
"brier_score": 0.3718,
"heavy_fav_win_pct": 67.2,
"fav_win_pct": 54.7,
"odds_reliability": 0.7254
}
]
}
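A consumer of this file most likely needs a single lookup with a fallback. The sketch below assumes that unknown league IDs should receive `default_reliability`. That is a plausible reading of the field name, not something this diff confirms, and `odds_reliability` is a hypothetical helper. The inline JSON copies two entries from `lookup` to keep the example self-contained.

```python
import json

# Two entries copied from the file above; the real file holds many more.
RELIABILITY_JSON = """
{
  "default_reliability": 0.35,
  "lookup": {
    "bx57cmq1edfq53ckfk791supi": 0.9476,
    "8z3180hhw2pj1i65uftlk54uz": 0.2096
  }
}
"""


def odds_reliability(table: dict, league_id: str) -> float:
    """Return the league's score, or the file-level default when it is missing."""
    return table["lookup"].get(league_id, table["default_reliability"])


table = json.loads(RELIABILITY_JSON)
known = odds_reliability(table, "bx57cmq1edfq53ckfk791supi")
fallback = odds_reliability(table, "league-not-in-file")
```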
@@ -0,0 +1,31 @@
{
"_meta": {
"purpose": "Senior (A-team) men's national football competitions. The national-match gate in betting_brain is triggered by this list.",
"strategy": "For national matches ONLY MS with odds 4.0-7.0 is playable, in friendlies (Hazırlık) and qualifiers (Eleme). Tournaments and other markets are analysis-only. Backtest: +17% ROI, stability test passed (old/new halves +22/+24%).",
"source": "2300-match national-team backtest (multi_backtest_20260602): segment + grid + stability analysis",
"competition_type_rule": "league name: 'hazırlık'->HAZIRLIK | 'eleme'/'play-off'->ELEME | other->TURNUVA"
},
"league_ids": [
"cesdwwnxbc5fmajgroc0hqzy2",
"3aa4mumjl6zyetg6o9hwd5hhx",
"40yjcbx2sq6oq736iqqqczwt1",
"39q1hq42hxjfylxb7xpe9bvf9",
"cu0rmpyff5692eo06ltddjo8a",
"ax1yf4nlzqpcji4j8epdgx3zl",
"1gxlzw2ezkyeykhcaa5x8ozkk",
"gfskxsdituog2kqp9yiu7bzi",
"595nsvo7ykvoe690b1e4u5n56",
"68zplepppndhl8bfdvgy9vgu1",
"3a0j0giz3c3ajw9h59evv7lqt",
"emy1ibc8fu2l0fukh4vlu5xl5",
"2db0aw1duj2my9l5iey5gm6nq",
"cc5tzz23tryrfqbm2pbv0jill",
"8tddm56zbasf57jkkay4kbf11",
"2r1hqz453bn9ljzt53kdr2lwb",
"93i7thp7zi0ympyt6l8aa1r2i",
"45db8orh1qttbsqq9hqapmbit",
"ude9t6yj60lebbn356qzg4k4",
"9qzn8cs96sgtqmesa9gpfti23",
"ad8y7vdjhinfqv4wo8rod6dck"
]
}
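The `competition_type_rule` and `strategy` entries in `_meta` describe a small decision procedure. The sketch below is one way to express it, with these assumptions: matching is a case-insensitive substring test on the league name, the 4.0-7.0 odds band is inclusive at both ends, and both function names are hypothetical. The actual gate lives in `betting_brain` and is not shown here.

```python
def competition_type(league_name: str) -> str:
    """Map a league name to HAZIRLIK (friendly), ELEME (qualifier) or TURNUVA."""
    name = league_name.lower()
    if "hazırlık" in name:
        return "HAZIRLIK"
    if "eleme" in name or "play-off" in name:
        return "ELEME"
    return "TURNUVA"


def national_match_playable(market: str, odds: float, league_name: str) -> bool:
    """Only MS picks at odds 4.0-7.0 in friendlies or qualifiers are playable."""
    return (
        market == "MS"
        and 4.0 <= odds <= 7.0
        and competition_type(league_name) in {"HAZIRLIK", "ELEME"}
    )


friendly = competition_type("Hazırlık Maçları Ülkeler")
qualifier = competition_type("DK Elemeler")
tournament = competition_type("UEFA Uluslar Ligi")
```

Everything that fails the gate would remain analysis-only, per the strategy note.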
@@ -0,0 +1,29 @@
"""
AI Engine V9 Feature Modules
Includes V8 features + new V9 engines (Upset, Momentum, Poisson, Context, Referee, Squad)
"""
# V20 Features
from .h2h_engine import H2HFeatureEngine, get_h2h_engine
from .elo_system import ELORatingSystem, get_elo_system
from .value_calculator import ValueCalculator, get_value_calculator
from .team_stats_engine import get_team_stats_engine
from .upset_engine import UpsetEngine, get_upset_engine
from .momentum_engine import MomentumEngine, get_momentum_engine
from .poisson_engine import PoissonEngine, get_poisson_engine
from .referee_engine import RefereeEngine, get_referee_engine
from .squad_analysis_engine import SquadAnalysisEngine, get_squad_analysis_engine

__all__ = [
    'H2HFeatureEngine', 'get_h2h_engine',
    'ELORatingSystem', 'get_elo_system',
    'ValueCalculator', 'get_value_calculator',
    'get_team_stats_engine',
    'UpsetEngine', 'get_upset_engine',
    'MomentumEngine', 'get_momentum_engine',
    'PoissonEngine', 'get_poisson_engine',
    'RefereeEngine', 'get_referee_engine',
    'SquadAnalysisEngine', 'get_squad_analysis_engine',
]
@@ -0,0 +1,655 @@
"""
ELO Rating System V2 - Venue-Adjusted & League-Weighted
Improved ELO system for the V9 model.

Differences from V1:
- League quality factor (Premier League vs. minor leagues)
- Form decay (recent matches carry more weight)
- Venue-adjusted ELO (separate home/away)
- Win probability calculation
"""

import os
import json
from typing import Dict, Optional, Tuple
from dataclasses import dataclass, asdict, field
from datetime import datetime

try:
    import psycopg2
except ImportError:
    psycopg2 = None

MODELS_DIR = os.path.join(os.path.dirname(os.path.dirname(__file__)), 'models')


@dataclass
class TeamELO:
    """Team ELO profile (enhanced)"""
    team_id: str
    team_name: str = ""

    # Main ELOs
    overall_elo: float = 1500.0
    home_elo: float = 1500.0
    away_elo: float = 1500.0

    # Form ELO (based on the last 5 matches)
    form_elo: float = 1500.0

    # Meta
    matches_played: int = 0
    home_matches: int = 0
    away_matches: int = 0
    wins: int = 0
    draws: int = 0
    losses: int = 0
    last_updated: Optional[str] = None

    # Form over the last 5 matches (W/D/L sequence)
    recent_form: str = ""

    def win_rate(self) -> float:
        if self.matches_played == 0:
            return 0.0
        return self.wins / self.matches_played

    def to_features(self) -> Dict[str, float]:
        return {
            'elo_overall': self.overall_elo,
            'elo_home': self.home_elo,
            'elo_away': self.away_elo,
            'elo_form': self.form_elo,
            'elo_matches': self.matches_played,
            'elo_win_rate': self.win_rate(),
        }


# League quality factors (1.0 = average)
LEAGUE_QUALITY = {
    # Top 5 European leagues
    "premier league": 1.15,
    "premier lig": 1.15,
    "la liga": 1.12,
    "bundesliga": 1.10,
    "serie a": 1.08,
    "ligue 1": 1.05,
    # Strong leagues
    "eredivisie": 1.02,
    "primeira liga": 1.02,
    "süper lig": 1.00,
    # European cups
    "champions league": 1.20,
    "şampiyonlar ligi": 1.20,
    "europa league": 1.10,
    "avrupa ligi": 1.10,
    "conference league": 1.00,
    # Mid-tier leagues
    "championship": 0.95,
    "2. bundesliga": 0.92,
    "serie b": 0.90,
    "la liga 2": 0.90,
    # Minor leagues (fallback)
    "default": 0.85,
}


class ELORatingSystem:
    """
    ELO Rating System V2 - Venue-Adjusted & League-Weighted

    What's new:
    - Separate home/away ELO tracking
    - League quality factor
    - Form ELO (weighted toward the last 5 matches)
    - K-factor adjustment based on goal difference
    """

    # ELO parameters
    K_FACTOR_BASE = 32  # Base K-factor
    K_FACTOR_NEW_TEAM = 48  # Higher for new teams (first 20 matches)
    HOME_ADVANTAGE = 65  # Home advantage (in ELO points)
    INITIAL_ELO = 1500
    FORM_WEIGHT = 0.7  # Weight of the latest match in form ELO

    def __init__(self):
        self.ratings: Dict[str, TeamELO] = {}
        self.league_cache: Dict[str, str] = {}  # team_id -> league_name
        self.conn = None
        self._load_ratings()

    def _connect_db(self):
        if psycopg2 is None:
            return None
        try:
            from data.db import get_clean_dsn
            self.conn = psycopg2.connect(get_clean_dsn())
            return self.conn
        except Exception as e:
            print(f"[ELO] DB connection failed: {e}")
            return None

    def get_conn(self):
        if self.conn is None or self.conn.closed:
            self._connect_db()
        return self.conn
def _load_ratings(self):
"""Load ratings: DB first, then JSON fallback"""
if self._load_ratings_from_db():
return
self._load_ratings_from_json()
def _load_ratings_from_db(self) -> bool:
"""Load ratings from the team_elo_ratings table"""
conn = self.get_conn()
if conn is None:
return False
try:
cur = conn.cursor()
cur.execute("""
SELECT ter.team_id, t.name,
ter.overall_elo, ter.home_elo, ter.away_elo,
ter.form_elo, ter.matches_played, ter.recent_form
FROM team_elo_ratings ter
LEFT JOIN teams t ON ter.team_id = t.id
""")
rows = cur.fetchall()
cur.close()
if not rows:
return False
for row in rows:
tid, name, overall, home, away, form, played, recent = row
self.ratings[str(tid)] = TeamELO(
team_id=str(tid),
team_name=name or "",
overall_elo=float(overall),
home_elo=float(home),
away_elo=float(away),
form_elo=float(form),
matches_played=int(played),
recent_form=recent or "",
)
print(f"[OK] ELO V2 ratings DB'den yuklendi ({len(self.ratings)} takim)")
return True
except Exception as e:
print(f"[WARN] Could not load ELO from DB, falling back to JSON: {e}")
return False
def _load_ratings_from_json(self):
"""Load ratings from the JSON file (fallback)"""
ratings_path = os.path.join(MODELS_DIR, 'elo_ratings_v2.json')
if os.path.exists(ratings_path):
try:
with open(ratings_path, 'r', encoding='utf-8') as f:
data = json.load(f)
for team_id, rating_data in data.items():
self.ratings[team_id] = TeamELO(**rating_data)
print(f"[OK] ELO V2 ratings JSON'dan yuklendi ({len(self.ratings)} takim)")
except Exception as e:
print(f"[WARN] ELO V2 ratings yuklenemedi: {e}")
def save_ratings(self):
"""Save ratings to the JSON file"""
ratings_path = os.path.join(MODELS_DIR, 'elo_ratings_v2.json')
os.makedirs(MODELS_DIR, exist_ok=True)
data = {team_id: asdict(elo) for team_id, elo in self.ratings.items()}
with open(ratings_path, 'w', encoding='utf-8') as f:
json.dump(data, f, indent=2, ensure_ascii=False)
print(f"💾 ELO V2 ratings kaydedildi ({len(self.ratings)} takım)")
def get_or_create_rating(self, team_id: str, team_name: str = "") -> TeamELO:
"""Return the team's ELO record, creating it if missing"""
if team_id not in self.ratings:
self.ratings[team_id] = TeamELO(team_id=team_id, team_name=team_name)
return self.ratings[team_id]
def get_league_quality(self, league_name: str) -> float:
"""Return the league quality factor"""
if not league_name:
return LEAGUE_QUALITY["default"]
league_lower = league_name.lower()
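# Check longer keys first so "la liga 2" is not shadowed by "la liga"
for key, quality in sorted(LEAGUE_QUALITY.items(), key=lambda kv: len(kv[0]), reverse=True):
if key != "default" and key in league_lower: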
return quality
return LEAGUE_QUALITY["default"]
def expected_score(self, rating_a: float, rating_b: float) -> float:
"""
Expected score of A against B (between 0 and 1).
1 = certain win, 0.5 = even, 0 = certain loss
"""
return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))
def get_k_factor(self, team_elo: TeamELO, goal_diff: int,
league_quality: float = 1.0) -> float:
"""
Compute the dynamic K-factor.
- Higher for new teams (fast adaptation)
- Higher for larger goal differences
- Higher in stronger leagues
"""
# Base K
if team_elo.matches_played < 20:
k = self.K_FACTOR_NEW_TEAM
else:
k = self.K_FACTOR_BASE
# Goal-difference multiplier (draws and one-goal margins both use 1.0)
if goal_diff <= 1:
goal_mult = 1.0
elif goal_diff == 2:
goal_mult = 1.25
elif goal_diff == 3:
goal_mult = 1.5
else:
goal_mult = 1.75 + (goal_diff - 3) * 0.1
# League quality multiplier
return k * goal_mult * league_quality
def update_after_match(
self,
home_id: str,
away_id: str,
home_goals: int,
away_goals: int,
home_name: str = "",
away_name: str = "",
league_name: str = ""
):
"""Update ELO ratings after a match"""
home_elo = self.get_or_create_rating(home_id, home_name)
away_elo = self.get_or_create_rating(away_id, away_name)
# Actual result
if home_goals > away_goals:
actual_home, actual_away = 1.0, 0.0
home_elo.wins += 1
away_elo.losses += 1
result_home, result_away = 'W', 'L'
elif home_goals < away_goals:
actual_home, actual_away = 0.0, 1.0
home_elo.losses += 1
away_elo.wins += 1
result_home, result_away = 'L', 'W'
else:
actual_home, actual_away = 0.5, 0.5
home_elo.draws += 1
away_elo.draws += 1
result_home, result_away = 'D', 'D'
goal_diff = abs(home_goals - away_goals)
league_quality = self.get_league_quality(league_name)
# K-factors
k_home = self.get_k_factor(home_elo, goal_diff, league_quality)
k_away = self.get_k_factor(away_elo, goal_diff, league_quality)
# -- Overall ELO --
expected_home = self.expected_score(
home_elo.overall_elo + self.HOME_ADVANTAGE,
away_elo.overall_elo
)
home_elo.overall_elo += k_home * (actual_home - expected_home)
away_elo.overall_elo += k_away * (actual_away - (1 - expected_home))
# -- Venue-Specific ELO --
expected_home_venue = self.expected_score(home_elo.home_elo, away_elo.away_elo)
home_elo.home_elo += k_home * (actual_home - expected_home_venue)
away_elo.away_elo += k_away * (actual_away - (1 - expected_home_venue))
# -- Form ELO (recent matches weigh more) --
home_elo.form_elo = (
home_elo.form_elo * (1 - self.FORM_WEIGHT) +
(1500 + (actual_home - 0.5) * 100) * self.FORM_WEIGHT
)
away_elo.form_elo = (
away_elo.form_elo * (1 - self.FORM_WEIGHT) +
(1500 + (actual_away - 0.5) * 100) * self.FORM_WEIGHT
)
# Update counters
home_elo.matches_played += 1
away_elo.matches_played += 1
home_elo.home_matches += 1
away_elo.away_matches += 1
# Update the last-5 form string
home_elo.recent_form = (result_home + home_elo.recent_form)[:5]
away_elo.recent_form = (result_away + away_elo.recent_form)[:5]
home_elo.last_updated = datetime.now().isoformat()
away_elo.last_updated = datetime.now().isoformat()
def predict_match(self, home_id: str, away_id: str) -> Dict[str, float]:
"""
Predict the win/draw/loss probabilities for a match.
"""
home_elo = self.get_or_create_rating(home_id)
away_elo = self.get_or_create_rating(away_id)
# Based on overall ELO
exp_home_overall = self.expected_score(
home_elo.overall_elo + self.HOME_ADVANTAGE,
away_elo.overall_elo
)
# Based on venue-specific ELO
exp_home_venue = self.expected_score(
home_elo.home_elo,
away_elo.away_elo
)
# Combined (mean of the two)
home_prob = (exp_home_overall + exp_home_venue) / 2
# Draw estimate (higher when the ELO gap is small)
elo_diff = abs(home_elo.overall_elo - away_elo.overall_elo)
draw_base = 0.25 # Base draw rate
draw_prob = draw_base * (1 - elo_diff / 800) # Draw probability shrinks as the gap grows
draw_prob = max(0.15, min(draw_prob, 0.35))
# Normalize
remaining = 1 - draw_prob
home_win = home_prob * remaining
away_win = (1 - home_prob) * remaining
return {
"home_win": round(home_win, 3),
"draw": round(draw_prob, 3),
"away_win": round(away_win, 3),
}
def get_match_features(self, home_id: str, away_id: str) -> Dict[str, float]:
"""Return the ELO features for the model"""
home_elo = self.get_or_create_rating(home_id)
away_elo = self.get_or_create_rating(away_id)
probs = self.predict_match(home_id, away_id)
# Encode form (e.g. WWWDL) as a number
def form_to_score(form: str) -> float:
if not form:
return 0.5
score = 0
for char in form:
if char == 'W':
score += 1
elif char == 'D':
score += 0.5
return score / max(len(form), 1)
return {
# Overall ELO
'elo_home_overall': home_elo.overall_elo,
'elo_away_overall': away_elo.overall_elo,
'elo_diff_overall': home_elo.overall_elo - away_elo.overall_elo,
# Venue-Specific ELO
'elo_home_venue': home_elo.home_elo,
'elo_away_venue': away_elo.away_elo,
'elo_diff_venue': home_elo.home_elo - away_elo.away_elo,
# Form ELO
'elo_home_form': home_elo.form_elo,
'elo_away_form': away_elo.form_elo,
'elo_diff_form': home_elo.form_elo - away_elo.form_elo,
# Win probabilities
'elo_prob_home': probs['home_win'],
'elo_prob_draw': probs['draw'],
'elo_prob_away': probs['away_win'],
# Experience
'elo_home_matches': min(home_elo.matches_played, 100),
'elo_away_matches': min(away_elo.matches_played, 100),
# Form score
'elo_home_form_score': form_to_score(home_elo.recent_form),
'elo_away_form_score': form_to_score(away_elo.recent_form),
# Win rates
'elo_home_win_rate': home_elo.win_rate(),
'elo_away_win_rate': away_elo.win_rate(),
}
def save_ratings_to_db(self):
"""Write ratings to the team_elo_ratings table (upsert)"""
conn = self.get_conn()
if conn is None:
print("❌ DB bağlantısı yok, DB'ye yazılamadı!")
return
cur = conn.cursor()
batch_size = 500
teams = list(self.ratings.values())
written = 0
for i in range(0, len(teams), batch_size):
batch = teams[i:i + batch_size]
values = []
for elo in batch:
values.append(cur.mogrify(
"(%s, %s, %s, %s, %s, %s, %s, NOW())",
(
elo.team_id,
round(elo.overall_elo, 2),
round(elo.home_elo, 2),
round(elo.away_elo, 2),
round(elo.form_elo, 2),
elo.matches_played,
elo.recent_form[:5],
)
).decode('utf-8'))
sql = """
INSERT INTO team_elo_ratings
(team_id, overall_elo, home_elo, away_elo, form_elo, matches_played, recent_form, updated_at)
VALUES {}
ON CONFLICT (team_id) DO UPDATE SET
overall_elo = EXCLUDED.overall_elo,
home_elo = EXCLUDED.home_elo,
away_elo = EXCLUDED.away_elo,
form_elo = EXCLUDED.form_elo,
matches_played = EXCLUDED.matches_played,
recent_form = EXCLUDED.recent_form,
updated_at = EXCLUDED.updated_at
""".format(", ".join(values))
cur.execute(sql)
written += len(batch)
conn.commit()
cur.close()
print(f"💾 DB'ye {written} takım ELO yazıldı (team_elo_ratings)")
def _load_top_league_ids(self) -> set:
"""Read league IDs from top_leagues.json"""
paths = [
os.path.join(os.path.dirname(__file__), '..', '..', 'top_leagues.json'),
os.path.join(os.path.dirname(__file__), '..', 'top_leagues.json'),
]
for p in paths:
if os.path.exists(p):
with open(p) as f:
ids = set(json.load(f))
print(f"📋 {len(ids)} top lig yüklendi ({os.path.basename(p)})")
return ids
print("⚠️ top_leagues.json bulunamadı — tüm maçlar yazılacak")
return set()
def calculate_all_from_history(self, sport: str = 'football'):
"""Compute ELO from all historical matches and write pre-match values for top leagues to the sport's ai_features table."""
print(f"\n🔄 {sport.upper()} için ELO V2 hesaplanıyor...")
conn = self.get_conn()
if conn is None:
print("❌ DB bağlantısı yok!")
return
top_league_ids = self._load_top_league_ids()
cur = conn.cursor()
# Fetch all finished matches in chronological order (including m.id and league_id)
cur.execute("""
SELECT m.id, m.home_team_id, m.away_team_id,
m.score_home, m.score_away, m.league_id,
t1.name as home_name, t2.name as away_name,
l.name as league_name
FROM matches m
LEFT JOIN teams t1 ON m.home_team_id = t1.id
LEFT JOIN teams t2 ON m.away_team_id = t2.id
LEFT JOIN leagues l ON m.league_id = l.id
WHERE m.sport = %s
AND m.score_home IS NOT NULL
AND m.score_away IS NOT NULL
ORDER BY m.mst_utc ASC
""", (sport,))
matches = cur.fetchall()
print(f"📊 {len(matches):,} maç işlenecek...")
BATCH_SIZE = 1000
batch: list = []
processed = 0
written = 0
for match in matches:
(match_id, home_id, away_id, score_h, score_a,
league_id, home_name, away_name, league) = match
if not (home_id and away_id):
continue
# Store pre-match ELO only for top leagues (all matches when no top-league list is loaded)
if not top_league_ids or league_id in top_league_ids:
home_elo_obj = self.get_or_create_rating(home_id, home_name or "")
away_elo_obj = self.get_or_create_rating(away_id, away_name or "")
batch.append((
match_id,
home_elo_obj.overall_elo,
away_elo_obj.overall_elo,
home_elo_obj.home_elo,
away_elo_obj.away_elo,
home_elo_obj.form_elo,
away_elo_obj.form_elo,
))
# Update ELO for every match
self.update_after_match(
home_id, away_id, score_h, score_a,
home_name or "", away_name or "", league or ""
)
processed += 1
if len(batch) >= BATCH_SIZE:
self._flush_elo_batch(cur, batch, sport)
conn.commit()
written += len(batch)
batch.clear()
if processed % 10000 == 0:
print(f" İşlenen: {processed:,} / {len(matches):,}")
# Flush the remaining batch
if batch:
self._flush_elo_batch(cur, batch, sport)
conn.commit()
written += len(batch)
cur.close()
print(f"{processed:,} maç işlendi, {len(self.ratings)} takım")
print(f"📝 {written:,} maç match_ai_features'a yazıldı")
# Save to JSON
self.save_ratings()
# Save to DB
self.save_ratings_to_db()
# Show the top 20 teams
self._show_top_teams()
@staticmethod
def _flush_elo_batch(cur, batch: list, sport: str = 'football') -> None:
"""Batch upsert pre-match ELO values into sport-partitioned ai_features table."""
from psycopg2.extras import execute_values
table_name = 'football_ai_features' if sport == 'football' else 'basketball_ai_features'
sql = f"""
INSERT INTO {table_name}
(match_id, home_elo, away_elo,
home_home_elo, away_away_elo,
home_form_elo, away_form_elo,
calculator_ver, updated_at)
VALUES %s
ON CONFLICT (match_id) DO UPDATE SET
home_elo = EXCLUDED.home_elo,
away_elo = EXCLUDED.away_elo,
home_home_elo = EXCLUDED.home_home_elo,
away_away_elo = EXCLUDED.away_away_elo,
home_form_elo = EXCLUDED.home_form_elo,
away_form_elo = EXCLUDED.away_form_elo,
calculator_ver = EXCLUDED.calculator_ver,
updated_at = EXCLUDED.updated_at
"""
now = datetime.now().isoformat()
values = [
(mid, h_elo, a_elo, hh_elo, aa_elo, hf_elo, af_elo,
'elo_v2_backfill', now)
for mid, h_elo, a_elo, hh_elo, aa_elo, hf_elo, af_elo in batch
]
execute_values(cur, sql, values, page_size=500)
def _show_top_teams(self, n: int = 20):
"""Print the strongest teams"""
sorted_teams = sorted(
self.ratings.items(),
key=lambda x: x[1].overall_elo,
reverse=True
)[:n]
print(f"\n🏆 Top {n} Takım (ELO V2):")
for i, (team_id, elo) in enumerate(sorted_teams, 1):
name = elo.team_name[:25] if elo.team_name else team_id[:25]
print(f" {i:2}. {name:25}{elo.overall_elo:.0f} (H:{elo.home_elo:.0f} A:{elo.away_elo:.0f})")
# Singleton
_system = None
def get_elo_system() -> ELORatingSystem:
global _system
if _system is None:
_system = ELORatingSystem()
return _system
if __name__ == "__main__":
import sys
from pathlib import Path
# Ensure ai-engine root is on sys.path (for `from data.db import ...`)
_AI_ENGINE_ROOT = Path(__file__).resolve().parent.parent
if str(_AI_ENGINE_ROOT) not in sys.path:
sys.path.insert(0, str(_AI_ENGINE_ROOT))
system = get_elo_system()
if len(sys.argv) > 1 and sys.argv[1] == 'calculate':
system.calculate_all_from_history('football')
else:
print("\n🧪 ELO V2 Test")
print("Kullanım: python elo_system.py calculate")
print(f"\n📊 Yüklü takım sayısı: {len(system.ratings)}")
if len(system.ratings) > 0:
system._show_top_teams(10)
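# A minimal, standalone sketch of the overall-ELO update used in
# update_after_match above, under simplifying assumptions: both teams share
# one K-factor (the real code derives a K per team from experience, goal
# difference and league quality), and only the overall rating is updated.
# update_pair is a hypothetical helper name; the defaults 32 and 65 mirror
# K_FACTOR_BASE and HOME_ADVANTAGE.

```python
from __future__ import annotations


def expected_score(rating_a: float, rating_b: float) -> float:
    """Logistic ELO expectation of A against B, in the range 0..1."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


def update_pair(
    home: float,
    away: float,
    home_goals: int,
    away_goals: int,
    k: float = 32.0,
    home_advantage: float = 65.0,
) -> tuple[float, float]:
    """Return the new (home, away) overall ratings after one match."""
    if home_goals > away_goals:
        actual_home = 1.0
    elif home_goals < away_goals:
        actual_home = 0.0
    else:
        actual_home = 0.5
    # The home side is rated as if it were `home_advantage` points stronger.
    exp_home = expected_score(home + home_advantage, away)
    new_home = home + k * (actual_home - exp_home)
    new_away = away + k * ((1 - actual_home) - (1 - exp_home))
    return new_home, new_away
```

# With a shared K the update is zero-sum: whatever the home side gains, the
# away side loses. A draw between equally rated teams still moves points to
# the away side, because the home advantage raised the home expectation.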
@@ -0,0 +1,990 @@
"""
Feature Extractor - V2 Betting Engine
Pulls historical team stats, ELO, missing-player impact and live odds from
PostgreSQL and engineers a leakage-free feature vector for the ensemble model.
CRITICAL: Only pre-match data (matches before the target match) is used.
Post-match stats of the target match are NEVER included.
"""
from __future__ import annotations
import json
import logging
from dataclasses import dataclass, field
from typing import Any
import numpy as np
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession
logger = logging.getLogger(__name__)
ROLLING_WINDOW: int = 5
H2H_WINDOW: int = 10
MAX_REST_DAYS: float = 14.0
@dataclass
class MatchFeatures:
"""Structured feature vector ready for the ensemble model."""
match_id: str = ""
home_team_id: str = ""
away_team_id: str = ""
# ELO & AI features
home_elo: float = 1500.0
away_elo: float = 1500.0
elo_diff: float = 0.0
missing_players_impact: float = 0.0
home_form_score: float = 0.0
away_form_score: float = 0.0
h2h_home_win_rate: float = 0.5
h2h_sample_size: int = 0
home_rest_days: float = 7.0
away_rest_days: float = 7.0
rest_diff: float = 0.0
home_lineup_availability: float = 1.0
away_lineup_availability: float = 1.0
# Rolling averages - Home (last 5 matches)
home_avg_possession: float = 50.0
home_avg_shots_on_target: float = 4.0
home_avg_total_shots: float = 10.0
home_avg_goals_scored: float = 1.3
home_avg_goals_conceded: float = 1.1
# Rolling averages - Away (last 5 matches)
away_avg_possession: float = 50.0
away_avg_shots_on_target: float = 4.0
away_avg_total_shots: float = 10.0
away_avg_goals_scored: float = 1.3
away_avg_goals_conceded: float = 1.1
# Implied probabilities from bookmaker odds
implied_prob_home: float = 0.33
implied_prob_draw: float = 0.33
implied_prob_away: float = 0.33
implied_prob_over25: float = 0.50
implied_prob_under25: float = 0.50
implied_prob_btts_yes: float = 0.50
implied_prob_btts_no: float = 0.50
# Raw decimal odds (for Edge/Kelly calculations downstream)
odds_home: float = 2.50
odds_draw: float = 3.20
odds_away: float = 2.80
odds_over25: float = 1.90
odds_under25: float = 1.90
odds_btts_yes: float = 1.85
odds_btts_no: float = 1.95
# Data quality
data_quality_score: float = 0.5
data_quality_flags: list[str] = field(default_factory=list)
# Metadata
match_name: str = ""
home_team_name: str = ""
away_team_name: str = ""
league_id: str = ""
league_name: str = ""
referee_name: str = ""
match_date_ms: int = 0
league_avg_goals: float = 2.6
referee_avg_goals: float = 2.6
referee_home_bias: float = 0.0
home_squad_strength: float = 0.5
away_squad_strength: float = 0.5
home_key_players: float = 0.0
away_key_players: float = 0.0
def to_model_array(self) -> np.ndarray:
"""Return the 24-feature vector the ensemble expects."""
return np.array(
[
self.home_elo,
self.away_elo,
self.elo_diff,
self.missing_players_impact,
self.home_avg_possession,
self.home_avg_shots_on_target,
self.home_avg_total_shots,
self.home_avg_goals_scored,
self.home_avg_goals_conceded,
self.away_avg_possession,
self.away_avg_shots_on_target,
self.away_avg_total_shots,
self.away_avg_goals_scored,
self.away_avg_goals_conceded,
self.implied_prob_home,
self.implied_prob_draw,
self.implied_prob_away,
self.implied_prob_over25,
self.implied_prob_under25,
self.implied_prob_btts_yes,
self.implied_prob_btts_no,
self.odds_home,
self.odds_draw,
self.odds_away,
],
dtype=np.float64,
)
@staticmethod
def feature_names() -> list[str]:
return [
"home_elo", "away_elo", "elo_diff", "missing_players_impact",
"home_avg_possession", "home_avg_shots_on_target",
"home_avg_total_shots", "home_avg_goals_scored",
"home_avg_goals_conceded",
"away_avg_possession", "away_avg_shots_on_target",
"away_avg_total_shots", "away_avg_goals_scored",
"away_avg_goals_conceded",
"implied_prob_home", "implied_prob_draw", "implied_prob_away",
"implied_prob_over25", "implied_prob_under25",
"implied_prob_btts_yes", "implied_prob_btts_no",
"odds_home", "odds_draw", "odds_away",
]
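# to_model_array and feature_names must list the same fields in the same
# order, otherwise the model reads values under the wrong names. The sketch
# below shows one way to check that on a cut-down dataclass. TinyFeatures,
# to_list and check_alignment are hypothetical names, and a plain list stands
# in for the NumPy array to keep the example dependency-free.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TinyFeatures:
    """Cut-down stand-in for MatchFeatures with three model inputs."""
    home_elo: float = 1500.0
    away_elo: float = 1500.0
    elo_diff: float = 0.0

    def to_list(self) -> list[float]:
        return [self.home_elo, self.away_elo, self.elo_diff]

    @staticmethod
    def feature_names() -> list[str]:
        return ["home_elo", "away_elo", "elo_diff"]


def check_alignment(feats: TinyFeatures) -> dict[str, float]:
    """Raise if names and values disagree in length or position."""
    names = feats.feature_names()
    values = feats.to_list()
    if len(names) != len(values):
        raise ValueError("feature_names and value vector differ in length")
    for name, value in zip(names, values):
        # Each position must hold the attribute that its name refers to.
        if getattr(feats, name) != value:
            raise ValueError(f"position of {name!r} does not match its value")
    return dict(zip(names, values))
```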
async def extract_features(session: AsyncSession, match_id: str) -> MatchFeatures | None:
"""Master extraction pipeline."""
feats = MatchFeatures(match_id=match_id)
flags: list[str] = []
match_row = await _load_match_header(session, match_id)
if match_row is None:
logger.warning("Match %s not found in live_matches or matches.", match_id)
return None
feats.home_team_id = match_row["home_team_id"] or ""
feats.away_team_id = match_row["away_team_id"] or ""
feats.match_name = match_row.get("match_name", "") or ""
feats.match_date_ms = int(match_row.get("mst_utc", 0) or 0)
feats.home_team_name = match_row.get("home_name", "") or ""
feats.away_team_name = match_row.get("away_name", "") or ""
feats.league_id = match_row.get("league_id", "") or ""
feats.league_name = match_row.get("league_name", "") or ""
feats.referee_name = match_row.get("referee_name", "") or ""
if not feats.home_team_id or not feats.away_team_id:
logger.warning("Match %s missing team IDs.", match_id)
flags.append("missing_team_ids")
feats.data_quality_flags = flags
feats.data_quality_score = 0.1
return feats
ai_row = await _load_ai_features(session, match_id)
if ai_row:
feats.home_elo = float(ai_row["home_elo"] or 1500.0)
feats.away_elo = float(ai_row["away_elo"] or 1500.0)
feats.missing_players_impact = float(ai_row["missing_players_impact"] or 0.0)
feats.home_form_score = float(ai_row["home_form_score"] or 0.0)
feats.away_form_score = float(ai_row["away_form_score"] or 0.0)
if ai_row.get("h2h_home_win_rate") is not None:
feats.h2h_home_win_rate = float(ai_row["h2h_home_win_rate"])
feats.h2h_sample_size = int(ai_row.get("h2h_total") or 0)
else:
flags.append("missing_ai_features")
feats.elo_diff = feats.home_elo - feats.away_elo
home_rolling = await _rolling_team_stats(
session, feats.home_team_id, feats.match_date_ms,
)
away_rolling = await _rolling_team_stats(
session, feats.away_team_id, feats.match_date_ms,
)
if home_rolling is not None:
feats.home_avg_possession = home_rolling["avg_possession"]
feats.home_avg_shots_on_target = home_rolling["avg_shots_on_target"]
feats.home_avg_total_shots = home_rolling["avg_total_shots"]
feats.home_avg_goals_scored = home_rolling["avg_goals_scored"]
feats.home_avg_goals_conceded = home_rolling["avg_goals_conceded"]
else:
flags.append("missing_home_stats")
if away_rolling is not None:
feats.away_avg_possession = away_rolling["avg_possession"]
feats.away_avg_shots_on_target = away_rolling["avg_shots_on_target"]
feats.away_avg_total_shots = away_rolling["avg_total_shots"]
feats.away_avg_goals_scored = away_rolling["avg_goals_scored"]
feats.away_avg_goals_conceded = away_rolling["avg_goals_conceded"]
else:
flags.append("missing_away_stats")
if abs(feats.home_form_score) < 1e-6:
feats.home_form_score = round(
feats.home_avg_goals_scored - feats.home_avg_goals_conceded,
3,
)
if abs(feats.away_form_score) < 1e-6:
feats.away_form_score = round(
feats.away_avg_goals_scored - feats.away_avg_goals_conceded,
3,
)
home_rest_days = await _load_rest_days(
session, feats.home_team_id, feats.match_date_ms,
)
away_rest_days = await _load_rest_days(
session, feats.away_team_id, feats.match_date_ms,
)
if home_rest_days is not None:
feats.home_rest_days = home_rest_days
else:
flags.append("missing_home_rest")
if away_rest_days is not None:
feats.away_rest_days = away_rest_days
else:
flags.append("missing_away_rest")
feats.rest_diff = round(feats.home_rest_days - feats.away_rest_days, 3)
if feats.h2h_sample_size == 0:
h2h = await _load_h2h_stats(
session,
feats.home_team_id,
feats.away_team_id,
feats.match_date_ms,
)
if h2h is not None:
feats.h2h_home_win_rate = h2h["home_win_rate"]
feats.h2h_sample_size = h2h["sample_size"]
else:
flags.append("missing_h2h")
league_profile = await _load_league_profile(
session,
feats.league_id,
feats.match_date_ms,
)
if league_profile is not None:
feats.league_avg_goals = league_profile["avg_goals"]
else:
flags.append("missing_league_profile")
referee_profile = await _load_referee_profile(
session,
feats.referee_name,
feats.match_date_ms,
)
if referee_profile is not None:
feats.referee_avg_goals = referee_profile["avg_goals"]
feats.referee_home_bias = referee_profile["home_bias"]
else:
flags.append("missing_referee_profile")
home_squad = await _load_team_squad_profile(
session,
feats.home_team_id,
feats.match_date_ms,
)
away_squad = await _load_team_squad_profile(
session,
feats.away_team_id,
feats.match_date_ms,
)
if home_squad is not None:
feats.home_squad_strength = home_squad["squad_strength"]
feats.home_key_players = home_squad["key_players"]
else:
flags.append("missing_home_squad_profile")
if away_squad is not None:
feats.away_squad_strength = away_squad["squad_strength"]
feats.away_key_players = away_squad["key_players"]
else:
flags.append("missing_away_squad_profile")
lineup_info = _extract_lineup_context(match_row)
feats.home_lineup_availability = lineup_info["home_availability"]
feats.away_lineup_availability = lineup_info["away_availability"]
if lineup_info["has_real_lineup_data"]:
feats.missing_players_impact = max(
feats.missing_players_impact,
round(
(
(1.0 - feats.home_lineup_availability)
+ (1.0 - feats.away_lineup_availability)
) / 2.0,
4,
),
)
else:
flags.append("missing_lineup_context")
odds_ok = await _extract_odds(session, match_id, feats)
if not odds_ok:
flags.append("missing_odds")
quality = 1.0
penalty_map = {
"missing_team_ids": 0.5,
"missing_ai_features": 0.05,
"missing_home_stats": 0.15,
"missing_away_stats": 0.15,
"missing_home_rest": 0.05,
"missing_away_rest": 0.05,
"missing_h2h": 0.05,
"missing_league_profile": 0.04,
"missing_referee_profile": 0.04,
"missing_home_squad_profile": 0.06,
"missing_away_squad_profile": 0.06,
"missing_lineup_context": 0.05,
"missing_odds": 0.2,
}
for flag in flags:
quality -= penalty_map.get(flag, 0.05)
feats.data_quality_score = max(0.0, round(quality, 2))
feats.data_quality_flags = flags
return feats
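# The data-quality score at the end of extract_features starts at 1.0 and
# subtracts one penalty per flag, clamped at zero. A minimal standalone
# sketch follows, assuming a trimmed penalty table; PENALTIES and
# quality_score are hypothetical names, and the values are copied from
# penalty_map above.

```python
from __future__ import annotations

# Trimmed copy of penalty_map, for illustration only.
PENALTIES = {
    "missing_odds": 0.2,
    "missing_home_stats": 0.15,
    "missing_h2h": 0.05,
}


def quality_score(flags: list[str], default_penalty: float = 0.05) -> float:
    """Subtract one penalty per flag from 1.0 and clamp the result at 0.0."""
    quality = 1.0
    for flag in flags:
        # Flags missing from the table cost the default penalty.
        quality -= PENALTIES.get(flag, default_penalty)
    return max(0.0, round(quality, 2))
```

# Usage: a match with no odds and no head-to-head history scores
# quality_score(["missing_odds", "missing_h2h"]), i.e. 1.0 - 0.2 - 0.05.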
async def _load_match_header(
session: AsyncSession, match_id: str,
) -> dict[str, Any] | None:
"""Try live_matches first, then matches table."""
table_queries = {
"live_matches": """
SELECT
m.id,
m.home_team_id,
m.away_team_id,
m.match_name,
m.mst_utc,
m.sport,
m.league_id,
m.referee_name,
m.lineups,
m.sidelined,
ht.name AS home_name,
at.name AS away_name,
l.name AS league_name
FROM live_matches m
LEFT JOIN teams ht ON ht.id = m.home_team_id
LEFT JOIN teams at ON at.id = m.away_team_id
LEFT JOIN leagues l ON l.id = m.league_id
WHERE m.id = :match_id
LIMIT 1
""",
"matches": """
SELECT
m.id,
m.home_team_id,
m.away_team_id,
m.match_name,
m.mst_utc,
m.sport,
m.league_id,
ref.name AS referee_name,
NULL AS lineups,
NULL AS sidelined,
ht.name AS home_name,
at.name AS away_name,
l.name AS league_name
FROM matches m
LEFT JOIN teams ht ON ht.id = m.home_team_id
LEFT JOIN teams at ON at.id = m.away_team_id
LEFT JOIN leagues l ON l.id = m.league_id
LEFT JOIN match_officials ref ON ref.match_id = m.id AND ref.role_id = 1
WHERE m.id = :match_id
LIMIT 1
""",
}
for table in ("live_matches", "matches"):
query = text(table_queries[table])
result = await session.execute(query, {"match_id": match_id})
row = result.mappings().first()
if row:
return dict(row)
return None
async def _load_ai_features(
session: AsyncSession, match_id: str,
) -> dict[str, Any] | None:
query = text("""
SELECT
home_elo,
away_elo,
missing_players_impact,
home_form_score,
away_form_score,
h2h_home_win_rate,
h2h_total
FROM football_ai_features
WHERE match_id = :match_id
LIMIT 1
""")
result = await session.execute(query, {"match_id": match_id})
row = result.mappings().first()
return dict(row) if row else None
async def _rolling_team_stats(
session: AsyncSession,
team_id: str,
before_mst_utc: int,
) -> dict[str, float] | None:
"""Calculate rolling averages from the team's last N finished matches."""
query = text("""
WITH recent AS (
SELECT
m.id AS match_id,
m.home_team_id,
m.away_team_id,
m.score_home,
m.score_away,
ts.possession_percentage,
ts.shots_on_target,
ts.total_shots
FROM matches m
JOIN football_team_stats ts ON ts.match_id = m.id AND ts.team_id = :team_id
WHERE (m.home_team_id = :team_id OR m.away_team_id = :team_id)
AND m.mst_utc < :before_ts
AND m.sport = 'football'
AND m.score_home IS NOT NULL
AND m.score_away IS NOT NULL
ORDER BY m.mst_utc DESC
LIMIT :window
)
SELECT
COALESCE(AVG(possession_percentage), 50.0) AS avg_possession,
COALESCE(AVG(shots_on_target), 4.0) AS avg_shots_on_target,
COALESCE(AVG(total_shots), 10.0) AS avg_total_shots,
COALESCE(AVG(
CASE
WHEN home_team_id = :team_id THEN score_home
ELSE score_away
END
), 1.3) AS avg_goals_scored,
COALESCE(AVG(
CASE
WHEN home_team_id = :team_id THEN score_away
ELSE score_home
END
), 1.1) AS avg_goals_conceded,
COUNT(*) AS match_count
FROM recent
""")
result = await session.execute(
query,
{"team_id": team_id, "before_ts": before_mst_utc, "window": ROLLING_WINDOW},
)
row = result.mappings().first()
if row is None or int(row["match_count"]) == 0:
return None
return {
"avg_possession": round(float(row["avg_possession"]), 2),
"avg_shots_on_target": round(float(row["avg_shots_on_target"]), 2),
"avg_total_shots": round(float(row["avg_total_shots"]), 2),
"avg_goals_scored": round(float(row["avg_goals_scored"]), 2),
"avg_goals_conceded": round(float(row["avg_goals_conceded"]), 2),
}
async def _load_rest_days(
session: AsyncSession,
team_id: str,
before_mst_utc: int,
) -> float | None:
query = text("""
SELECT m.mst_utc
FROM matches m
WHERE (m.home_team_id = :team_id OR m.away_team_id = :team_id)
AND m.mst_utc < :before_ts
AND m.sport = 'football'
ORDER BY m.mst_utc DESC
LIMIT 1
""")
result = await session.execute(
query,
{"team_id": team_id, "before_ts": before_mst_utc},
)
last_match_ts = result.scalar_one_or_none()
if last_match_ts is None:
return None
rest_days = max(0.0, (float(before_mst_utc) - float(last_match_ts)) / 86400000.0)
return round(min(rest_days, MAX_REST_DAYS), 3)
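# The rest-day arithmetic above can be checked in isolation. This sketch
# assumes, as the division by 86400000.0 implies, that mst_utc values are
# epoch milliseconds. rest_days and MS_PER_DAY are hypothetical names; the
# 14-day cap mirrors MAX_REST_DAYS.

```python
from __future__ import annotations

MS_PER_DAY = 86_400_000.0


def rest_days(
    kickoff_ms: int,
    last_match_ms: int | None,
    cap: float = 14.0,
) -> float | None:
    """Days between the previous match and kickoff, clamped to [0, cap]."""
    if last_match_ms is None:
        return None  # no earlier match on record
    days = max(0.0, (float(kickoff_ms) - float(last_match_ms)) / MS_PER_DAY)
    return round(min(days, cap), 3)
```

# The cap keeps a long off-season break from dominating the rest_diff feature.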
async def _load_h2h_stats(
session: AsyncSession,
home_team_id: str,
away_team_id: str,
before_mst_utc: int,
) -> dict[str, float | int] | None:
query = text("""
SELECT
m.home_team_id,
m.away_team_id,
m.score_home,
m.score_away
FROM matches m
WHERE m.sport = 'football'
AND m.mst_utc < :before_ts
AND m.score_home IS NOT NULL
AND m.score_away IS NOT NULL
AND (
(m.home_team_id = :home_team_id AND m.away_team_id = :away_team_id)
OR
(m.home_team_id = :away_team_id AND m.away_team_id = :home_team_id)
)
ORDER BY m.mst_utc DESC
LIMIT :window
""")
result = await session.execute(
query,
{
"home_team_id": home_team_id,
"away_team_id": away_team_id,
"before_ts": before_mst_utc,
"window": H2H_WINDOW,
},
)
rows = result.mappings().all()
if not rows:
return None
home_wins = 0.0
draws = 0.0
sample_size = 0
for row in rows:
score_home = row["score_home"]
score_away = row["score_away"]
if score_home is None or score_away is None:
continue
sample_size += 1
row_home_team_id = row["home_team_id"]
row_away_team_id = row["away_team_id"]
current_home_score = float(score_home) if row_home_team_id == home_team_id else float(score_away)
current_away_score = float(score_away) if row_home_team_id == home_team_id else float(score_home)
if current_home_score > current_away_score:
home_wins += 1.0
elif current_home_score == current_away_score:
draws += 1.0
if sample_size == 0:
return None
# Count draws as a half-win signal instead of throwing them away.
home_win_rate = round((home_wins + draws * 0.5) / sample_size, 4)
return {
"home_win_rate": home_win_rate,
"sample_size": sample_size,
}
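# A standalone sketch of the head-to-head rate computed above: each past
# meeting is re-oriented to the current home team's perspective, and a draw
# counts as half a win. h2h_home_win_rate and the (home_id, score_home,
# score_away) tuple layout are assumptions made for illustration.

```python
from __future__ import annotations


def h2h_home_win_rate(
    rows: list[tuple[str, int | None, int | None]],
    home_team_id: str,
) -> float | None:
    """rows holds (row_home_team_id, score_home, score_away) per past meeting."""
    wins = 0.0
    draws = 0.0
    sample_size = 0
    for row_home_id, score_home, score_away in rows:
        if score_home is None or score_away is None:
            continue  # unfinished or unscored match
        sample_size += 1
        # Swap the scores when today's home team played away in that meeting.
        if row_home_id == home_team_id:
            ours, theirs = score_home, score_away
        else:
            ours, theirs = score_away, score_home
        if ours > theirs:
            wins += 1.0
        elif ours == theirs:
            draws += 1.0
    if sample_size == 0:
        return None
    return round((wins + 0.5 * draws) / sample_size, 4)
```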
async def _load_league_profile(
session: AsyncSession,
league_id: str,
before_mst_utc: int,
) -> dict[str, float] | None:
if not league_id:
return None
query = text("""
SELECT
COALESCE(AVG(m.score_home + m.score_away), 2.6) AS avg_goals,
COUNT(*) AS match_count
FROM (
SELECT score_home, score_away
FROM matches
WHERE league_id = :league_id
AND sport = 'football'
AND status = 'FT'
AND score_home IS NOT NULL
AND score_away IS NOT NULL
AND mst_utc < :before_ts
ORDER BY mst_utc DESC
LIMIT 100
) m
""")
result = await session.execute(
query,
{"league_id": league_id, "before_ts": before_mst_utc},
)
row = result.mappings().first()
if row is None or int(row["match_count"] or 0) == 0:
return None
return {"avg_goals": round(float(row["avg_goals"]), 3)}
async def _load_referee_profile(
session: AsyncSession,
referee_name: str,
before_mst_utc: int,
) -> dict[str, float] | None:
if not referee_name:
return None
query = text("""
SELECT
COALESCE(AVG(CASE WHEN score_home > score_away THEN 1.0 ELSE 0.0 END), 0.46) - 0.46 AS home_bias,
COALESCE(AVG(score_home + score_away), 2.6) AS avg_goals,
COUNT(*) AS match_count
FROM (
SELECT m.score_home, m.score_away
FROM match_officials mo
JOIN matches m ON m.id = mo.match_id
WHERE mo.name = :referee_name
AND mo.role_id = 1
AND m.sport = 'football'
AND m.status = 'FT'
AND m.score_home IS NOT NULL
AND m.score_away IS NOT NULL
AND m.mst_utc < :before_ts
ORDER BY m.mst_utc DESC
LIMIT 30
) ref_matches
""")
result = await session.execute(
query,
{"referee_name": referee_name, "before_ts": before_mst_utc},
)
row = result.mappings().first()
if row is None or int(row["match_count"] or 0) == 0:
return None
return {
"home_bias": round(float(row["home_bias"]), 4),
"avg_goals": round(float(row["avg_goals"]), 3),
}
async def _load_team_squad_profile(
session: AsyncSession,
team_id: str,
before_mst_utc: int,
) -> dict[str, float] | None:
if not team_id:
return None
query = text("""
WITH recent_matches AS (
SELECT m.id, m.mst_utc
FROM matches m
WHERE (m.home_team_id = :team_id OR m.away_team_id = :team_id)
AND m.sport = 'football'
AND m.status = 'FT'
AND m.mst_utc < :before_ts
ORDER BY m.mst_utc DESC
LIMIT 8
),
player_base AS (
SELECT
mpp.player_id,
COUNT(*)::float AS appearances,
COUNT(*) FILTER (WHERE mpp.is_starting = true)::float AS starts
FROM match_player_participation mpp
JOIN recent_matches rm ON rm.id = mpp.match_id
WHERE mpp.team_id = :team_id
GROUP BY mpp.player_id
),
player_goals AS (
SELECT
mpe.player_id,
COUNT(*) FILTER (
WHERE mpe.event_type = 'goal'
AND COALESCE(mpe.event_subtype, '') NOT ILIKE '%penaltı kaçırma%'
)::float AS goals,
0.0::float AS assists
FROM match_player_events mpe
JOIN recent_matches rm ON rm.id = mpe.match_id
WHERE mpe.team_id = :team_id
GROUP BY mpe.player_id
UNION ALL
SELECT
mpe.assist_player_id AS player_id,
0.0::float AS goals,
COUNT(*) FILTER (
WHERE mpe.event_type = 'goal'
AND mpe.assist_player_id IS NOT NULL
)::float AS assists
FROM match_player_events mpe
JOIN recent_matches rm ON rm.id = mpe.match_id
WHERE mpe.team_id = :team_id
AND mpe.assist_player_id IS NOT NULL
GROUP BY mpe.assist_player_id
),
player_events AS (
SELECT
player_id,
SUM(goals) AS goals,
SUM(assists) AS assists
FROM player_goals
GROUP BY player_id
),
player_scores AS (
SELECT
pb.player_id,
(pb.starts * 1.5)
+ ((pb.appearances - pb.starts) * 0.5)
+ (COALESCE(pe.goals, 0.0) * 2.5)
+ (COALESCE(pe.assists, 0.0) * 1.5) AS score
FROM player_base pb
LEFT JOIN player_events pe ON pe.player_id = pb.player_id
)
SELECT
COALESCE(AVG(top_players.score), 0.0) AS avg_top_score,
COALESCE(COUNT(*) FILTER (WHERE top_players.score >= 6.0), 0) AS key_players,
COALESCE((SELECT COUNT(*) FROM recent_matches), 0) AS match_count
FROM (
SELECT score
FROM player_scores
ORDER BY score DESC
LIMIT 11
) top_players
""")
result = await session.execute(
query,
{"team_id": team_id, "before_ts": before_mst_utc},
)
row = result.mappings().first()
if row is None or int(row["match_count"] or 0) == 0:
return None
avg_top_score = float(row["avg_top_score"] or 0.0)
return {
"squad_strength": round(min(max(avg_top_score / 10.0, 0.0), 1.0), 4),
"key_players": float(row["key_players"] or 0),
}
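The squad profile query above scores each player over the team's last eight matches and averages the eleven highest scores. A minimal sketch of the same arithmetic in plain Python, for sanity-checking the SQL weights; `player_score` and `squad_profile` are hypothetical helper names and are not part of this module.

```python
def player_score(starts: int, appearances: int, goals: int, assists: int) -> float:
    """Mirror the player_scores CTE: start 1.5, substitute appearance 0.5, goal 2.5, assist 1.5."""
    sub_appearances = appearances - starts
    return starts * 1.5 + sub_appearances * 0.5 + goals * 2.5 + assists * 1.5


def squad_profile(scores: list[float]) -> dict[str, float]:
    """Average the 11 highest scores and count the players at or above 6.0 among them."""
    top = sorted(scores, reverse=True)[:11]
    avg_top = sum(top) / len(top) if top else 0.0
    return {
        "squad_strength": round(min(max(avg_top / 10.0, 0.0), 1.0), 4),
        "key_players": float(sum(1 for s in top if s >= 6.0)),
    }


# A regular starter: 8 starts, 3 goals, 2 assists -> 12.0 + 7.5 + 3.0
regular = player_score(starts=8, appearances=8, goals=3, assists=2)
# A substitute: 1 start, 4 appearances off the bench, no goal involvement -> 1.5 + 2.0
substitute = player_score(starts=1, appearances=5, goals=0, assists=0)
profile = squad_profile([regular, substitute])
```

With these weights a player who starts all eight matches already scores 12.0 before goals and assists, so the `/ 10.0` scaling is likely to clamp `squad_strength` to 1.0 for any team with a settled starting eleven.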
def _safe_json(value: Any) -> dict[str, Any] | None:
if value is None:
return None
if isinstance(value, dict):
return value
if isinstance(value, str):
try:
parsed = json.loads(value)
except (TypeError, json.JSONDecodeError):
return None
return parsed if isinstance(parsed, dict) else None
return None
def _safe_list(value: Any) -> list[Any]:
if isinstance(value, list):
return value
return []
def _extract_lineup_context(match_row: dict[str, Any]) -> dict[str, float | bool]:
lineups = _safe_json(match_row.get("lineups"))
sidelined = _safe_json(match_row.get("sidelined"))
home_xi_count = 0
away_xi_count = 0
home_sidelined_count = 0
away_sidelined_count = 0
if lineups:
home_xi_count = len(_safe_list(lineups.get("home", {}).get("xi")))
away_xi_count = len(_safe_list(lineups.get("away", {}).get("xi")))
if sidelined:
home_team = sidelined.get("homeTeam", {})
away_team = sidelined.get("awayTeam", {})
home_sidelined_count = max(
int(home_team.get("totalSidelined") or 0),
len(_safe_list(home_team.get("players"))),
)
away_sidelined_count = max(
int(away_team.get("totalSidelined") or 0),
len(_safe_list(away_team.get("players"))),
)
has_real_lineup_data = any(
value > 0
for value in (
home_xi_count,
away_xi_count,
home_sidelined_count,
away_sidelined_count,
)
)
home_availability = _compute_availability(home_xi_count, home_sidelined_count)
away_availability = _compute_availability(away_xi_count, away_sidelined_count)
return {
"home_availability": home_availability,
"away_availability": away_availability,
"has_real_lineup_data": has_real_lineup_data,
}
def _compute_availability(xi_count: int, sidelined_count: int) -> float:
xi_ratio = min(max(xi_count / 11.0, 0.0), 1.0) if xi_count > 0 else 1.0
sidelined_penalty = min(max(sidelined_count / 11.0, 0.0), 1.0) * 0.35
return round(min(max(xi_ratio - sidelined_penalty, 0.0), 1.0), 4)
def _safe_odd(val: Any) -> float:
"""Parse an odds value that might be str, float, int, or None."""
if val is None:
return 0.0
try:
parsed = float(val)
return parsed if parsed > 1.0 else 0.0
except (ValueError, TypeError):
return 0.0
def _implied_prob(decimal_odd: float) -> float:
"""Convert decimal odds to implied probability, clamped [0, 1]."""
if decimal_odd <= 1.0:
return 0.0
return min(1.0, 1.0 / decimal_odd)
async def _extract_odds(
session: AsyncSession,
match_id: str,
feats: MatchFeatures,
) -> bool:
"""Extract odds from live JSON first, then relational tables."""
found = False
odds_json = await _load_live_odds_json(session, match_id)
if odds_json:
found = _parse_odds_json(odds_json, feats)
if not found:
found = await _load_relational_odds(session, match_id, feats)
if found:
feats.implied_prob_home = round(_implied_prob(feats.odds_home), 4)
feats.implied_prob_draw = round(_implied_prob(feats.odds_draw), 4)
feats.implied_prob_away = round(_implied_prob(feats.odds_away), 4)
feats.implied_prob_over25 = round(_implied_prob(feats.odds_over25), 4)
feats.implied_prob_under25 = round(_implied_prob(feats.odds_under25), 4)
feats.implied_prob_btts_yes = round(_implied_prob(feats.odds_btts_yes), 4)
feats.implied_prob_btts_no = round(_implied_prob(feats.odds_btts_no), 4)
return found
async def _load_live_odds_json(
    session: AsyncSession, match_id: str,
) -> dict[str, Any] | list[Any] | None:
query = text("SELECT odds FROM live_matches WHERE id = :mid AND odds IS NOT NULL")
result = await session.execute(query, {"mid": match_id})
row = result.scalar_one_or_none()
if row is None:
return None
if isinstance(row, str):
try:
parsed = json.loads(row)
except (json.JSONDecodeError, TypeError):
return None
return parsed if isinstance(parsed, (dict, list)) else None
if isinstance(row, (dict, list)):
return row
return None
def _parse_odds_json(odds_blob: dict[str, Any] | list[Any], feats: MatchFeatures) -> bool:
"""Parse the Mackolik-style odds JSON structure."""
found_any = False
categories: list[dict[str, Any]] = []
if isinstance(odds_blob, list):
categories = [item for item in odds_blob if isinstance(item, dict)]
elif isinstance(odds_blob, dict):
raw_categories = odds_blob.get("categories", odds_blob.get("odds", []))
if isinstance(raw_categories, dict):
categories = [item for item in raw_categories.values() if isinstance(item, dict)]
elif isinstance(raw_categories, list):
categories = [item for item in raw_categories if isinstance(item, dict)]
for cat in categories:
cat_name = (cat.get("name") or cat.get("cn") or "").strip().lower()
selections = cat.get("selections") or cat.get("s") or []
if cat_name in ("mac sonucu", "match result", "1x2", "maç sonucu"):
sels = _selections_to_map(selections)
feats.odds_home = _safe_odd(sels.get("1")) or feats.odds_home
feats.odds_draw = _safe_odd(sels.get("x")) or feats.odds_draw
feats.odds_away = _safe_odd(sels.get("2")) or feats.odds_away
found_any = True
elif cat_name in ("2,5 alt/ust", "over/under 2.5", "2.5 alt/ust", "2,5 alt/üst", "2.5 alt/üst"):
sels = _selections_to_map(selections)
feats.odds_over25 = _safe_odd(sels.get("ust") or sels.get("over") or sels.get("üst")) or feats.odds_over25
feats.odds_under25 = _safe_odd(sels.get("alt") or sels.get("under")) or feats.odds_under25
found_any = True
elif cat_name in ("karsilikli gol", "both teams to score", "btts", "karşılıklı gol"):
sels = _selections_to_map(selections)
feats.odds_btts_yes = _safe_odd(sels.get("var") or sels.get("yes")) or feats.odds_btts_yes
feats.odds_btts_no = _safe_odd(sels.get("yok") or sels.get("no")) or feats.odds_btts_no
found_any = True
return found_any
def _selections_to_map(selections: list[Any] | dict[str, Any]) -> dict[str, Any]:
"""Normalize varied selection structures into {name_lower: odd_value}."""
result: dict[str, Any] = {}
if isinstance(selections, dict):
for key, value in selections.items():
result[str(key).strip().lower()] = value
elif isinstance(selections, list):
for sel in selections:
if isinstance(sel, dict):
name = (sel.get("name") or sel.get("n") or "").strip().lower()
value = sel.get("odd_value") or sel.get("ov") or sel.get("v")
if name:
result[name] = value
return result
async def _load_relational_odds(
    session: AsyncSession, match_id: str, feats: MatchFeatures,
) -> bool:
    """Fallback: load odds from odd_categories + odd_selections.

    Returns True only if at least one valid odd was written to `feats`.
    """
    query = text("""
        SELECT oc.name AS cat_name, os.name AS sel_name, os.odd_value
        FROM odd_categories oc
        JOIN odd_selections os ON os.odd_category_db_id = oc.db_id
        WHERE oc.match_id = :match_id
          AND oc.name IN ('Maç Sonucu', '2,5 Alt/Üst', 'Karşılıklı Gol')
    """)
    result = await session.execute(query, {"match_id": match_id})
    rows = result.mappings().all()
    # (category name, lower-cased selection name) -> MatchFeatures attribute.
    # Categories: match result (1X2), over/under 2.5, both teams to score.
    field_by_selection = {
        ("Maç Sonucu", "1"): "odds_home",
        ("Maç Sonucu", "x"): "odds_draw",
        ("Maç Sonucu", "2"): "odds_away",
        ("2,5 Alt/Üst", "üst"): "odds_over25",
        ("2,5 Alt/Üst", "ust"): "odds_over25",
        ("2,5 Alt/Üst", "over"): "odds_over25",
        ("2,5 Alt/Üst", "alt"): "odds_under25",
        ("2,5 Alt/Üst", "under"): "odds_under25",
        ("Karşılıklı Gol", "var"): "odds_btts_yes",
        ("Karşılıklı Gol", "yes"): "odds_btts_yes",
        ("Karşılıklı Gol", "yok"): "odds_btts_no",
        ("Karşılıklı Gol", "no"): "odds_btts_no",
    }
    found = False
    for row in rows:
        cat = (row["cat_name"] or "").strip()
        sel = (row["sel_name"] or "").strip().lower()
        value = _safe_odd(row["odd_value"])
        if value <= 1.0:
            continue  # missing or invalid odd
        field_name = field_by_selection.get((cat, sel))
        if field_name is None:
            continue  # selection we do not track
        setattr(feats, field_name, value)
        found = True
    return found
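The odds helpers in this file store the raw `1 / odd` value per selection, while the feature adapter divides the three match-result probabilities by their sum. A minimal sketch of the difference, assuming plain decimal odds; `safe_odd`, `implied_prob` and `normalized_probs` are standalone illustrations and not the module's own functions.

```python
from typing import Any


def safe_odd(val: Any) -> float:
    """Parse a decimal odd; anything missing, malformed or <= 1.0 becomes 0.0."""
    try:
        parsed = float(val)
    except (TypeError, ValueError):
        return 0.0
    return parsed if parsed > 1.0 else 0.0


def implied_prob(decimal_odd: float) -> float:
    """Raw implied probability 1/odd, which still includes the bookmaker margin."""
    return min(1.0, 1.0 / decimal_odd) if decimal_odd > 1.0 else 0.0


def normalized_probs(odds: list[float]) -> list[float]:
    """Divide each raw probability by their sum so the outcomes add up to 1."""
    raw_probs = [implied_prob(o) for o in odds]
    total = sum(raw_probs)
    return [p / total for p in raw_probs] if total > 0 else raw_probs


odds_1x2 = [safe_odd("2.10"), safe_odd(3.40), safe_odd("3.60")]
raw = [implied_prob(o) for o in odds_1x2]
overround = sum(raw) - 1.0  # bookmaker margin; positive when the raw values sum above 1
fair = normalized_probs(odds_1x2)
```

Because the raw values sum to more than 1 whenever the book carries a margin, features built from them are not directly comparable with the normalized `implied_*` columns.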
@@ -0,0 +1,256 @@
"""
Feature Adapter for XGBoost Inference
=====================================
Bridges the gap between V20 Engine outputs (CalculationContext) and XGBoost Models.
Constructs the exact feature vector used in training (see FEATURES below).
"""
from __future__ import annotations
import os
from typing import Any
import psycopg2
from psycopg2.extensions import connection as PgConnection
import pandas as pd
import numpy as np
from data.db import get_clean_dsn
# Feature definitions (must match train_xgboost_markets.py)
# NOTE: this list has 66 entries; their names and order must match the trained XGBoost models.
FEATURES = [
# ELO
"home_overall_elo", "away_overall_elo", "elo_diff",
"home_home_elo", "away_away_elo", "form_elo_diff",
# Form
"home_goals_avg", "home_conceded_avg",
"away_goals_avg", "away_conceded_avg",
"home_clean_sheet_rate", "away_clean_sheet_rate",
"home_scoring_rate", "away_scoring_rate",
"home_winning_streak", "away_winning_streak",
# H2H
"h2h_home_win_rate", "h2h_draw_rate",
"h2h_avg_goals", "h2h_btts_rate", "h2h_over25_rate",
# Stats
"home_avg_possession", "away_avg_possession",
"home_avg_shots_on_target", "away_avg_shots_on_target",
"home_shot_conversion", "away_shot_conversion",
# Odds (Implicit market wisdom)
"odds_ms_h", "odds_ms_d", "odds_ms_a",
"implied_home", "implied_draw", "implied_away",
"odds_ht_ms_h", "odds_ht_ms_d", "odds_ht_ms_a",
"odds_ou05_o", "odds_ou05_u",
"odds_ou15_o", "odds_ou15_u",
"odds_ou25_o", "odds_ou25_u",
"odds_ou35_o", "odds_ou35_u",
"odds_ht_ou05_o", "odds_ht_ou05_u",
"odds_ht_ou15_o", "odds_ht_ou15_u",
"odds_btts_y", "odds_btts_n",
# League/Context
"league_avg_goals", "league_zero_goal_rate",
"home_xga", "away_xga",
# Upset features
"upset_atmosphere", "upset_motivation", "upset_fatigue", "upset_potential",
# Referee features
"referee_home_bias", "referee_avg_goals", "referee_cards_total",
"referee_avg_yellow", "referee_experience",
# Momentum features
"home_momentum_score", "away_momentum_score", "momentum_diff",
]
class FeatureAdapter:
"""
Adapter to convert V20 context into XGBoost-compatible features.
"""
def __init__(self) -> None:
self.conn: PgConnection | None = None
self._connect_db()
self.league_stats_cache: dict[str, dict[str, float]] = {}
def _connect_db(self) -> None:
try:
# FeatureAdapter uses DB only for optional league stats enrichment.
# Keep startup non-blocking when DB/tunnel is unavailable.
if not os.getenv("DATABASE_URL", "").strip():
return
self.conn = psycopg2.connect(get_clean_dsn())
except Exception as e:
print(f"⚠️ FeatureAdapter DB connection failed: {e}")
def get_features(self, ctx: Any) -> pd.DataFrame:
"""
Construct feature vector from CalculationContext.
Returns a DataFrame with 1 row and correct columns.
"""
raw = ctx.team_pred.raw_features
odds = ctx.odds_data or {}
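# NOTE: these three dicts are read from ctx but are not used below; the upset,
# referee and momentum columns are currently taken from raw_features.
upset_features = getattr(ctx, "upset_features", {}) or {}
momentum_features = getattr(ctx, "momentum_features", {}) or {}
referee_features = getattr(ctx, "referee_features", {}) or {}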
# 1. Odds Features
ms_h = float(odds.get("ms_h") or 0)
ms_d = float(odds.get("ms_d") or 0)
ms_a = float(odds.get("ms_a") or 0)
implied_home, implied_draw, implied_away = 0.33, 0.33, 0.33
if ms_h > 0 and ms_d > 0 and ms_a > 0:
raw_sum = 1/ms_h + 1/ms_d + 1/ms_a
implied_home = (1/ms_h) / raw_sum
implied_draw = (1/ms_d) / raw_sum
implied_away = (1/ms_a) / raw_sum
# 2. League Features
# Using ctx.league_id if available, or just defaults
league_stats = self._get_league_stats(ctx.league_id)
# 3. Assemble Dictionary
row = {
# ELO (Explicit float casting)
"home_overall_elo": float(raw.get("home_overall_elo") or 1500),
"away_overall_elo": float(raw.get("away_overall_elo") or 1500),
"elo_diff": float(raw.get("elo_diff") or 0),
"home_home_elo": float(raw.get("home_home_elo") or 1500),
"away_away_elo": float(raw.get("away_away_elo") or 1500),
"form_elo_diff": float(raw.get("form_elo_diff") or 0),
# Form (Explicit float casting)
"home_goals_avg": float(raw.get("home_goals_avg") or 1.3),
"home_conceded_avg": float(raw.get("home_conceded_avg") or 1.2),
"away_goals_avg": float(raw.get("away_goals_avg") or 1.2),
"away_conceded_avg": float(raw.get("away_conceded_avg") or 1.4),
"home_clean_sheet_rate": float(raw.get("home_clean_sheet_rate") or 0.2),
"away_clean_sheet_rate": float(raw.get("away_clean_sheet_rate") or 0.2),
"home_scoring_rate": float(raw.get("home_scoring_rate") or 0.8),
"away_scoring_rate": float(raw.get("away_scoring_rate") or 0.8),
"home_winning_streak": float(raw.get("home_winning_streak") or 0),
"away_winning_streak": float(raw.get("away_winning_streak") or 0),
# H2H (Explicit float casting)
"h2h_home_win_rate": float(raw.get("h2h_home_win_rate") or 0.33),
"h2h_draw_rate": float(raw.get("h2h_draw_rate") or 0.33),
"h2h_avg_goals": float(raw.get("h2h_avg_goals") or 2.5),
"h2h_btts_rate": float(raw.get("h2h_btts_rate") or 0.5),
"h2h_over25_rate": float(raw.get("h2h_over25_rate") or 0.5),
# Stats (Explicit float casting to avoid XGBoost 'object' error)
"home_avg_possession": float(raw.get("home_avg_possession") or 0.5),
"away_avg_possession": float(raw.get("away_avg_possession") or 0.5),
"home_avg_shots_on_target": float(raw.get("home_avg_shots_on_target") or 4.0),
"away_avg_shots_on_target": float(raw.get("away_avg_shots_on_target") or 3.5),
"home_shot_conversion": float(raw.get("home_shot_conversion") or 0.1),
"away_shot_conversion": float(raw.get("away_shot_conversion") or 0.1),
# Odds
"odds_ms_h": ms_h,
"odds_ms_d": ms_d,
"odds_ms_a": ms_a,
"implied_home": implied_home,
"implied_draw": implied_draw,
"implied_away": implied_away,
"odds_ht_ms_h": float(odds.get("ht_ms_h") or 0.0),
"odds_ht_ms_d": float(odds.get("ht_ms_d") or 0.0),
"odds_ht_ms_a": float(odds.get("ht_ms_a") or 0.0),
"odds_ou05_o": float(odds.get("ou05_o") or 0.0),
"odds_ou05_u": float(odds.get("ou05_u") or 0.0),
"odds_ou15_o": float(odds.get("ou15_o") or 0.0),
"odds_ou15_u": float(odds.get("ou15_u") or 0.0),
"odds_ou25_o": float(odds.get("ou25_o") or 0.0),
"odds_ou25_u": float(odds.get("ou25_u") or 0.0),
"odds_ou35_o": float(odds.get("ou35_o") or 0.0),
"odds_ou35_u": float(odds.get("ou35_u") or 0.0),
"odds_ht_ou05_o": float(odds.get("ht_ou05_o") or 0.0),
"odds_ht_ou05_u": float(odds.get("ht_ou05_u") or 0.0),
"odds_ht_ou15_o": float(odds.get("ht_ou15_o") or 0.0),
"odds_ht_ou15_u": float(odds.get("ht_ou15_u") or 0.0),
"odds_btts_y": float(odds.get("btts_y") or 0.0),
"odds_btts_n": float(odds.get("btts_n") or 0.0),
# League/Def
"league_avg_goals": float(league_stats.get("avg_goals") or 2.7),
"league_zero_goal_rate": float(league_stats.get("zero_rate") or 0.07),
"home_xga": float(raw.get("home_xga") or 1.2),
"away_xga": float(raw.get("away_xga") or 1.4),
# Upset features (default values - computed separately in upset_engine_v2)
"upset_atmosphere": float(raw.get("upset_atmosphere") or 0.0),
"upset_motivation": float(raw.get("upset_motivation") or 0.0),
"upset_fatigue": float(raw.get("upset_fatigue") or 0.0),
"upset_potential": float(raw.get("upset_potential") or 0.0),
# Referee features (default values)
"referee_home_bias": float(raw.get("referee_home_bias") or 0.0),
"referee_avg_goals": float(raw.get("referee_avg_goals") or 2.5),
"referee_cards_total": float(raw.get("referee_cards_total") or 4.0),
"referee_avg_yellow": float(raw.get("referee_avg_yellow") or 3.0),
"referee_experience": float(raw.get("referee_experience") or 0),
# Momentum features (default values)
"home_momentum_score": float(raw.get("home_momentum_score") or 0.0),
"away_momentum_score": float(raw.get("away_momentum_score") or 0.0),
"momentum_diff": float(raw.get("momentum_diff") or 0.0),
}
# Return as DataFrame (cols sorted by FEATURES list to ensure alignment)
df = pd.DataFrame([row], columns=FEATURES)
return df
    def _get_league_stats(self, league_id: str | None) -> dict[str, float]:
        """Get cached league stats, falling back to defaults."""
        if not league_id:
            return {"avg_goals": 2.7, "zero_rate": 0.07}
        if league_id in self.league_stats_cache:
            return self.league_stats_cache[league_id]
        if self.conn:
            try:
                with self.conn.cursor() as cur:
                    # NOTE: EXTRACT(EPOCH ...) yields seconds. Other modules in this
                    # commit document mst_utc as milliseconds; if that holds here,
                    # the cutoff needs "* 1000" to actually limit to the last year.
                    cur.execute("""
                        SELECT AVG(score_home + score_away),
                               AVG(CASE WHEN score_home=0 AND score_away=0 THEN 1.0 ELSE 0.0 END)
                        FROM matches
                        WHERE league_id = %s AND status = 'FT'
                          AND mst_utc > EXTRACT(EPOCH FROM NOW() - INTERVAL '1 year')
                    """, (league_id,))
                    res = cur.fetchone()
                if res and res[0]:
                    stats = {
                        "avg_goals": float(res[0]),
                        "zero_rate": float(res[1]),
                    }
                    self.league_stats_cache[league_id] = stats
                    return stats
            except Exception:
                # A failed statement leaves the psycopg2 transaction aborted;
                # roll back so later lookups on this connection can still run.
                try:
                    self.conn.rollback()
                except Exception:
                    pass
        # Default fallback
        return {"avg_goals": 2.7, "zero_rate": 0.07}
# Singleton
_adapter: FeatureAdapter | None = None
def get_feature_adapter() -> FeatureAdapter:
global _adapter
if _adapter is None:
_adapter = FeatureAdapter()
return _adapter
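`get_features` relies on `pd.DataFrame([row], columns=FEATURES)` to order the columns. With that constructor, a feature name missing from `row` becomes a NaN column and a key in `row` that is not in `FEATURES` is dropped, both without an error. A minimal sketch of a guard that could run before the DataFrame is built; `check_feature_row` is a hypothetical helper, not part of this module.

```python
def check_feature_row(row: dict[str, float], features: list[str]) -> None:
    """Raise ValueError if `row` and the expected feature list disagree."""
    missing = [name for name in features if name not in row]
    extra = [name for name in row if name not in features]
    duplicates = sorted({name for name in features if features.count(name) > 1})
    if missing or extra or duplicates:
        raise ValueError(
            f"feature mismatch: missing={missing}, extra={extra}, duplicates={duplicates}"
        )


expected = ["elo_diff", "odds_ms_h", "implied_home"]

# A complete row passes silently.
check_feature_row({"elo_diff": 0.0, "odds_ms_h": 2.1, "implied_home": 0.45}, expected)

# A row with a missing key is reported instead of turning into NaN later.
try:
    check_feature_row({"elo_diff": 0.0, "odds_ms_h": 2.1}, expected)
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
```

Failing fast here is a design choice: a silent NaN column would still let the model score the row, so the mismatch might otherwise only surface as degraded predictions.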
@@ -0,0 +1,316 @@
"""
Head-to-Head (H2H) Feature Engine
Analyzes the past performance of two teams against each other.
"""
import os
import psycopg2
from typing import Dict, Optional, Tuple
from dataclasses import dataclass
from functools import lru_cache
import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from data.db import get_clean_dsn
@dataclass
class H2HProfile:
"""Head-to-head analysis result."""
total_matches: int
home_wins: int
draws: int
away_wins: int
home_goals_total: int
away_goals_total: int
btts_count: int # Both teams to score
over25_count: int
@property
def home_win_rate(self) -> float:
return self.home_wins / self.total_matches if self.total_matches > 0 else 0.33
@property
def draw_rate(self) -> float:
return self.draws / self.total_matches if self.total_matches > 0 else 0.33
@property
def away_win_rate(self) -> float:
return self.away_wins / self.total_matches if self.total_matches > 0 else 0.33
@property
def avg_total_goals(self) -> float:
return (self.home_goals_total + self.away_goals_total) / self.total_matches if self.total_matches > 0 else 2.5
@property
def btts_rate(self) -> float:
return self.btts_count / self.total_matches if self.total_matches > 0 else 0.5
@property
def over25_rate(self) -> float:
return self.over25_count / self.total_matches if self.total_matches > 0 else 0.5
@property
def home_dominance(self) -> float:
"""Home side's dominance score (between -1 and 1)."""
if self.total_matches == 0:
return 0
return (self.home_wins - self.away_wins) / self.total_matches
def to_features(self) -> Dict[str, float]:
"""Return the feature dictionary."""
return {
'h2h_total_matches': self.total_matches,
'h2h_home_win_rate': self.home_win_rate,
'h2h_draw_rate': self.draw_rate,
'h2h_away_win_rate': self.away_win_rate,
'h2h_avg_goals': self.avg_total_goals,
'h2h_btts_rate': self.btts_rate,
'h2h_over25_rate': self.over25_rate,
'h2h_home_dominance': self.home_dominance,
}
class H2HFeatureEngine:
    """
    Head-to-Head Feature Engine
    Analyzes past meetings between two teams.
    """

    def __init__(self):
        self.conn = None
        # Keyed by (home_team_id, away_team_id, limit)
        self._cache: Dict[Tuple[str, str, int], H2HProfile] = {}

    def get_conn(self):
        if self.conn is None or self.conn.closed:
            self.conn = psycopg2.connect(get_clean_dsn())
        return self.conn

    def get_h2h_profile(self, home_team_id: str, away_team_id: str,
                        before_date: Optional[int] = None,
                        limit: int = 20) -> H2HProfile:
        """
        Analyze past meetings between two teams.

        Args:
            home_team_id: Home team ID
            away_team_id: Away team ID
            before_date: Only matches before this time (mst_utc, milliseconds)
            limit: How many matches to look back

        Returns:
            H2HProfile: Head-to-head analysis result
        """
        # `limit` is part of the key so that get_momentum (limit=5) and
        # get_features (limit=20) do not read each other's cached profile.
        cache_key = (home_team_id, away_team_id, limit)
        # Use the cache only when no before_date is given
        if before_date is None and cache_key in self._cache:
            return self._cache[cache_key]
        conn = self.get_conn()
        cur = conn.cursor()
        # Fetch meetings in both directions
        # (A at home vs B away + B at home vs A away)
        query = """
            SELECT
                home_team_id, away_team_id,
                score_home, score_away
            FROM matches
            WHERE (
                (home_team_id = %s AND away_team_id = %s)
                OR
                (home_team_id = %s AND away_team_id = %s)
            )
            AND score_home IS NOT NULL
            AND score_away IS NOT NULL
        """
        params = [home_team_id, away_team_id, away_team_id, home_team_id]
        if before_date:
            query += " AND mst_utc < %s"
            params.append(before_date)
        query += " ORDER BY mst_utc DESC LIMIT %s"
        params.append(limit)
        cur.execute(query, params)
        matches = cur.fetchall()
        cur.close()
        if not matches:
            return H2HProfile(
                total_matches=0, home_wins=0, draws=0, away_wins=0,
                home_goals_total=0, away_goals_total=0,
                btts_count=0, over25_count=0
            )
        # Compute the statistics
        home_wins = 0
        draws = 0
        away_wins = 0
        home_goals = 0
        away_goals = 0
        btts = 0
        over25 = 0
        for match in matches:
            m_home_id, m_away_id, score_h, score_a = match
            # Normalize the perspective (from the requested home team's side)
            if m_home_id == home_team_id:
                # Same orientation as requested
                h_score, a_score = score_h, score_a
            else:
                # Reversed orientation (the opponent played at home)
                h_score, a_score = score_a, score_h
            # Result
            if h_score > a_score:
                home_wins += 1
            elif h_score < a_score:
                away_wins += 1
            else:
                draws += 1
            # Goals
            home_goals += h_score
            away_goals += a_score
            # BTTS
            if h_score > 0 and a_score > 0:
                btts += 1
            # Over 2.5
            if h_score + a_score > 2.5:
                over25 += 1
        profile = H2HProfile(
            total_matches=len(matches),
            home_wins=home_wins,
            draws=draws,
            away_wins=away_wins,
            home_goals_total=home_goals,
            away_goals_total=away_goals,
            btts_count=btts,
            over25_count=over25
        )
        # Store in the cache
        if before_date is None:
            self._cache[cache_key] = profile
        return profile
def get_features(self, home_team_id: str, away_team_id: str,
before_date: Optional[int] = None) -> Dict[str, float]:
"""Return the feature dictionary."""
profile = self.get_h2h_profile(home_team_id, away_team_id, before_date)
return profile.to_features()
def get_momentum(self, home_team_id: str, away_team_id: str,
before_date: Optional[int] = None) -> Dict[str, float]:
"""
Momentum/trend analysis over the most recent meetings.
Looks at the trend across the last 5 matches.
"""
profile = self.get_h2h_profile(home_team_id, away_team_id, before_date, limit=5)
# Compute the streak (consecutive identical results)
conn = self.get_conn()
cur = conn.cursor()
query = """
SELECT home_team_id, score_home, score_away
FROM matches
WHERE (
(home_team_id = %s AND away_team_id = %s)
OR
(home_team_id = %s AND away_team_id = %s)
)
AND score_home IS NOT NULL
"""
params = [home_team_id, away_team_id, away_team_id, home_team_id]
if before_date:
query += " AND mst_utc < %s"
params.append(before_date)
query += " ORDER BY mst_utc DESC LIMIT 5"
cur.execute(query, params)
recent = cur.fetchall()
streak = 0
streak_type = None # 'home', 'away', 'draw'
for match in recent:
m_home_id, score_h, score_a = match
# Normalize the perspective
if m_home_id == home_team_id:
result = 'home' if score_h > score_a else ('away' if score_h < score_a else 'draw')
else:
result = 'away' if score_h > score_a else ('home' if score_h < score_a else 'draw')
if streak_type is None:
streak_type = result
streak = 1
elif result == streak_type:
streak += 1
else:
break
return {
'h2h_recent_home_dominance': profile.home_dominance,
'h2h_streak_length': streak,
'h2h_streak_home': 1 if streak_type == 'home' else 0,
'h2h_streak_away': 1 if streak_type == 'away' else 0,
'h2h_streak_draw': 1 if streak_type == 'draw' else 0,
}
# Singleton
_engine = None
def get_h2h_engine() -> H2HFeatureEngine:
global _engine
if _engine is None:
_engine = H2HFeatureEngine()
return _engine
if __name__ == "__main__":
# Test
engine = get_h2h_engine()
# Example: Fenerbahçe vs Galatasaray (the IDs would have to be looked up)
# For the test, fetch one fixture from the database instead
conn = engine.get_conn()
cur = conn.cursor()
cur.execute("""
SELECT home_team_id, away_team_id, match_name
FROM matches
WHERE score_home IS NOT NULL
LIMIT 1
""")
result = cur.fetchone()
if result:
home_id, away_id, name = result
print(f"\n🧪 Test: {name}")
print(f" Home ID: {home_id}")
print(f" Away ID: {away_id}")
profile = engine.get_h2h_profile(home_id, away_id)
print(f"\n📊 H2H Profile:")
print(f" Total Matches: {profile.total_matches}")
print(f" Home Win Rate: {profile.home_win_rate:.1%}")
print(f" Draw Rate: {profile.draw_rate:.1%}")
print(f" Away Win Rate: {profile.away_win_rate:.1%}")
print(f" Average Goals: {profile.avg_total_goals:.2f}")
print(f" BTTS Rate: {profile.btts_rate:.1%}")
print(f" Over 2.5 Rate: {profile.over25_rate:.1%}")
print(f" Home Dominance: {profile.home_dominance:+.2f}")
features = engine.get_features(home_id, away_id)
print(f"\n🔧 Features: {features}")
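The engine above normalizes every meeting to the requested home team's point of view before counting results and streaks. A minimal in-memory sketch of that normalization and of the streak walk, with no database involved; `outcome_for`, `current_streak` and the sample rows are hypothetical and only illustrate the logic.

```python
from typing import Optional

# (home_team_id, away_team_id, score_home, score_away), newest first
Row = tuple[str, str, int, int]


def outcome_for(row: Row, home_team_id: str) -> str:
    """Result from the perspective of `home_team_id`, whichever side it played on."""
    m_home_id, _m_away_id, score_h, score_a = row
    mine, theirs = (score_h, score_a) if m_home_id == home_team_id else (score_a, score_h)
    if mine > theirs:
        return "home"
    if mine < theirs:
        return "away"
    return "draw"


def current_streak(rows: list[Row], home_team_id: str) -> tuple[Optional[str], int]:
    """Length of the run of identical results starting at the most recent match."""
    streak_type: Optional[str] = None
    streak = 0
    for row in rows:
        result = outcome_for(row, home_team_id)
        if streak_type is None:
            streak_type, streak = result, 1
        elif result == streak_type:
            streak += 1
        else:
            break
    return streak_type, streak


meetings: list[Row] = [
    ("A", "B", 2, 0),  # A won at home
    ("B", "A", 1, 3),  # A won away
    ("A", "B", 1, 1),  # the draw ends the run
    ("B", "A", 2, 0),
]
streak_type, streak = current_streak(meetings, home_team_id="A")
```

Here `"home"` means "the team passed as `home_team_id` won", regardless of the venue of the historical match, which matches how the `h2h_streak_home` flag is derived.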
@@ -0,0 +1,343 @@
"""
HT/FT Tendency Feature Engine
================================
Produces team-level HT/FT tendency features for match prediction.
Computes 17 features per match based on historical data:
- 1st half scoring/conceding rates
- Comeback rates
- Half-specific goal distribution
- League-level HT/FT profiles
All features are computed from the `matches` table using only data
BEFORE the match date (no future leakage).
"""
import os
import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from typing import Dict, Optional, Tuple
from dataclasses import dataclass, field
from data.db import get_clean_dsn
import psycopg2
@dataclass
class TeamHtftProfile:
"""HT/FT tendency profile for a single team."""
matches: int = 0
ht_scored: int = 0 # Matches where team scored in 1st half
ht_conceded: int = 0 # Matches where team conceded in 1st half
ht_leading: int = 0 # Matches where team led at HT
ht_trailing: int = 0 # Matches where team trailed at HT
comeback_wins: int = 0 # Trailing at HT -> Won
goals_1h: int = 0
goals_2h: int = 0
conceded_1h: int = 0
conceded_2h: int = 0
@property
def ht_scoring_rate(self):
return self.ht_scored / self.matches if self.matches > 0 else 0.5
@property
def ht_concede_rate(self):
return self.ht_conceded / self.matches if self.matches > 0 else 0.5
@property
def ht_win_rate(self):
return self.ht_leading / self.matches if self.matches > 0 else 0.33
@property
def comeback_rate(self):
return self.comeback_wins / self.ht_trailing if self.ht_trailing > 0 else 0.0
@property
def first_half_goal_pct(self):
total = self.goals_1h + self.goals_2h
return self.goals_1h / total if total > 0 else 0.5
@property
def second_half_surge(self):
"""Ratio of 2H goals vs 1H goals. >1 means more dangerous in 2nd half."""
return self.goals_2h / self.goals_1h if self.goals_1h > 0 else 1.0
@dataclass
class LeagueHtftProfile:
"""League-level HT/FT statistics."""
matches: int = 0
ht_goals_total: int = 0
ft_goals_total: int = 0
reversals: int = 0
htft_counts: Dict[str, int] = field(default_factory=dict)
@property
def avg_ht_goals(self):
return self.ht_goals_total / self.matches if self.matches > 0 else 1.0
@property
def avg_2h_goals(self):
ft = self.ft_goals_total / self.matches if self.matches > 0 else 2.5
return ft - self.avg_ht_goals
@property
def reversal_rate(self):
return self.reversals / self.matches if self.matches > 0 else 0.05
@property
def first_half_pct(self):
return self.ht_goals_total / self.ft_goals_total if self.ft_goals_total > 0 else 0.44
class HtftTendencyEngine:
"""
Computes HT/FT tendency features for a given match.
Uses historical data from `matches` table, filtering by date to
avoid future leakage.
Features are based on team-level and league-level tendencies, which
are DIFFERENT from the existing model features (ELO, form, H2H score).
"""
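def __init__(self):
self.conn = None
# Keyed by (team_id, is_home, before_date)
self._team_cache: Dict[Tuple[str, bool, Optional[int]], TeamHtftProfile] = {}
# Keyed by (league_id, before_date)
self._league_cache: Dict[Tuple[str, Optional[int]], LeagueHtftProfile] = {}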
def get_conn(self):
if self.conn is None or self.conn.closed:
dsn = get_clean_dsn()
self.conn = psycopg2.connect(dsn)
return self.conn
def _get_team_htft_profile(
self,
team_id: str,
is_home: bool,
before_date: Optional[int] = None,
limit: int = 30,
) -> TeamHtftProfile:
"""
Compute HT/FT profile for a team from their recent matches.
Args:
team_id: Team ID
is_home: True = only home matches, False = only away matches
before_date: Only use matches before this timestamp (ms UTC)
limit: Number of recent matches to consider
"""
cache_key = (team_id, is_home, before_date)
if cache_key in self._team_cache:
return self._team_cache[cache_key]
conn = self.get_conn()
cur = conn.cursor()
if is_home:
query = """
SELECT ht_score_home, ht_score_away, score_home, score_away
FROM matches
WHERE home_team_id = %s
AND sport = 'football'
AND status = 'FT'
AND ht_score_home IS NOT NULL
AND ht_score_away IS NOT NULL
"""
else:
query = """
SELECT ht_score_away, ht_score_home, score_away, score_home
FROM matches
WHERE away_team_id = %s
AND sport = 'football'
AND status = 'FT'
AND ht_score_home IS NOT NULL
AND ht_score_away IS NOT NULL
"""
params = [team_id]
if before_date:
query += " AND mst_utc < %s"
params.append(before_date)
query += " ORDER BY mst_utc DESC LIMIT %s"
params.append(limit)
cur.execute(query, params)
rows = cur.fetchall()
cur.close()
profile = TeamHtftProfile()
profile.matches = len(rows)
        for ht_mine, ht_opp, ft_mine, ft_opp in rows:
            # 1st half scoring
            if ht_mine > 0:
                profile.ht_scored += 1
            if ht_opp > 0:
                profile.ht_conceded += 1
            # HT situation
            if ht_mine > ht_opp:
                profile.ht_leading += 1
            elif ht_mine < ht_opp:
                profile.ht_trailing += 1
                # Comeback: trailing at HT, won at FT
                if ft_mine > ft_opp:
                    profile.comeback_wins += 1
            # Goal distribution
            profile.goals_1h += ht_mine
            profile.goals_2h += (ft_mine - ht_mine)
            profile.conceded_1h += ht_opp
            profile.conceded_2h += (ft_opp - ht_opp)
self._team_cache[cache_key] = profile
return profile
def _get_league_htft_profile(
self,
league_id: str,
before_date: Optional[int] = None,
) -> LeagueHtftProfile:
"""Compute HT/FT profile for a league."""
cache_key = (league_id, before_date)
if cache_key in self._league_cache:
return self._league_cache[cache_key]
conn = self.get_conn()
cur = conn.cursor()
query = """
SELECT ht_score_home, ht_score_away, score_home, score_away
FROM matches
WHERE league_id = %s
AND sport = 'football'
AND status = 'FT'
AND ht_score_home IS NOT NULL
AND ht_score_away IS NOT NULL
"""
params = [league_id]
if before_date:
query += " AND mst_utc < %s"
params.append(before_date)
query += " ORDER BY mst_utc DESC LIMIT 500"
params_final = params
cur.execute(query, params_final)
rows = cur.fetchall()
cur.close()
profile = LeagueHtftProfile()
profile.matches = len(rows)
for hth, hta, sh, sa in rows:
profile.ht_goals_total += hth + hta
profile.ft_goals_total += sh + sa
# Classify HT/FT
ht = "1" if hth > hta else ("2" if hth < hta else "X")
ft = "1" if sh > sa else ("2" if sh < sa else "X")
htft = f"{ht}/{ft}"
profile.htft_counts[htft] = profile.htft_counts.get(htft, 0) + 1
if htft in ("1/2", "2/1"):
profile.reversals += 1
self._league_cache[cache_key] = profile
return profile
def get_features(
self,
home_team_id: str,
away_team_id: str,
league_id: Optional[str] = None,
before_date: Optional[int] = None,
) -> Dict[str, float]:
"""
Get HT/FT tendency features for a match.
Returns a dict with 17 features.
"""
# Team profiles (home side for home team, away side for away team)
home_prof = self._get_team_htft_profile(home_team_id, is_home=True, before_date=before_date)
away_prof = self._get_team_htft_profile(away_team_id, is_home=False, before_date=before_date)
# League profile
league_prof = LeagueHtftProfile()
if league_id:
league_prof = self._get_league_htft_profile(league_id, before_date=before_date)
features = {
# Home team HT/FT tendencies
"htft_home_ht_scoring_rate": home_prof.ht_scoring_rate,
"htft_home_ht_concede_rate": home_prof.ht_concede_rate,
"htft_home_ht_win_rate": home_prof.ht_win_rate,
"htft_home_comeback_rate": home_prof.comeback_rate,
"htft_home_first_half_goal_pct": home_prof.first_half_goal_pct,
"htft_home_second_half_surge": min(home_prof.second_half_surge, 3.0),
# Away team HT/FT tendencies
"htft_away_ht_scoring_rate": away_prof.ht_scoring_rate,
"htft_away_ht_concede_rate": away_prof.ht_concede_rate,
"htft_away_ht_win_rate": away_prof.ht_win_rate,
"htft_away_comeback_rate": away_prof.comeback_rate,
"htft_away_first_half_goal_pct": away_prof.first_half_goal_pct,
"htft_away_second_half_surge": min(away_prof.second_half_surge, 3.0),
# League-level
"htft_league_avg_ht_goals": league_prof.avg_ht_goals,
"htft_league_reversal_rate": league_prof.reversal_rate,
"htft_league_first_half_pct": league_prof.first_half_pct,
# Data quality (how many matches we have for these features)
"htft_home_sample_size": min(home_prof.matches / 30.0, 1.0),
"htft_away_sample_size": min(away_prof.matches / 30.0, 1.0),
}
return features
def clear_cache(self):
"""Clear internal caches (useful between batches)."""
self._team_cache.clear()
self._league_cache.clear()
# Singleton
_engine = None
def get_htft_tendency_engine() -> HtftTendencyEngine:
global _engine
if _engine is None:
_engine = HtftTendencyEngine()
return _engine
# ── Test ─────────────────────────────────────────────────────────────────────
if __name__ == "__main__":
engine = get_htft_tendency_engine()
conn = engine.get_conn()
cur = conn.cursor()
cur.execute("""
SELECT home_team_id, away_team_id, league_id, mst_utc, match_name
FROM matches
WHERE sport = 'football' AND status = 'FT'
AND home_team_id IS NOT NULL AND away_team_id IS NOT NULL
ORDER BY mst_utc DESC LIMIT 3
""")
matches = cur.fetchall()
cur.close()
for hid, aid, lid, mst, name in matches:
print(f"\n🏟️ {name}")
features = engine.get_features(hid, aid, lid, mst)
for k, v in sorted(features.items()):
print(f" {k}: {v:.4f}")
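The HT/FT classification used by the league profile above can be sketched in isolation. This is a minimal illustration, not the engine's code: `classify_htft` and `summarize` are hypothetical helper names, the sample rows are made up, and the reversal rate is assumed to be reversals divided by match count (the `LeagueHtftProfile` property itself is not shown in this diff).

```python
from collections import Counter
from typing import Iterable, Tuple

Row = Tuple[int, int, int, int]  # (ht_home, ht_away, ft_home, ft_away)


def classify_htft(ht_home: int, ht_away: int, ft_home: int, ft_away: int) -> str:
    """Return the HT/FT label, e.g. '1/2' for home leading at HT and away winning at FT."""
    def outcome(home: int, away: int) -> str:
        return "1" if home > away else ("2" if home < away else "X")
    return f"{outcome(ht_home, ht_away)}/{outcome(ft_home, ft_away)}"


def summarize(rows: Iterable[Row]) -> Tuple[Counter, float]:
    """Count HT/FT labels and compute the share of reversals ('1/2' or '2/1')."""
    counts = Counter(classify_htft(*row) for row in rows)
    total = sum(counts.values())
    reversals = counts["1/2"] + counts["2/1"]
    # Assumption: reversal rate = reversals / matches, 0.0 when there is no data.
    rate = reversals / total if total else 0.0
    return counts, rate


# Made-up sample data for illustration only.
sample_rows = [(1, 0, 1, 2), (0, 0, 1, 0), (0, 1, 2, 1), (2, 0, 3, 1)]
counts, reversal_rate = summarize(sample_rows)
```

Keeping the classifier as a pure function makes it testable without a database connection.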
+434
@@ -0,0 +1,434 @@
"""
Momentum Engine - Recent Match Trends
Analyzes each team's current form trend for the V9 model.
Factors:
1. Goal-scoring trend (rising/falling/stable)
2. Unbeaten/losing streaks
3. Last-match psychology (impact of a big win or heavy defeat)
4. Home/away momentum difference
"""
import os
import sys
from typing import Dict, List, Tuple, Optional
from dataclasses import dataclass, field
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
try:
import psycopg2
from psycopg2.extras import RealDictCursor
except ImportError:
psycopg2 = None
@dataclass
class MomentumData:
"""Team momentum data"""
goals_trend: float = 0.0 # -1 (falling) to +1 (rising)
conceded_trend: float = 0.0 # -1 (falling) to +1 (rising) [negative is good]
unbeaten_streak: int = 0 # Unbeaten streak
losing_streak: int = 0 # Losing streak
winning_streak: int = 0 # Winning streak
last_match_impact: float = 0.0 # Psychological impact of the last match (-1 to +1)
momentum_score: float = 0.0 # Overall momentum (-1 to +1)
form_direction: str = "stable" # "improving", "declining", "stable"
xg_underperformance: float = 0.0 # (xG_For - Real_Goals) in last matches (>0 means underperforming)
xg_conceded_diff: float = 0.0 # (Real_Conceded - xG_Against) in last matches
class MomentumEngine:
"""
Analyzes the trend over recent matches.
Form rise/decline, streaks and psychological impact.
"""
def __init__(self):
self.conn = None
self._connect_db()
def _connect_db(self):
"""Connect to the database"""
if psycopg2 is None:
return
try:
from data.db import get_clean_dsn
self.conn = psycopg2.connect(get_clean_dsn())
except Exception as e:
print(f"[MomentumEngine] DB connection failed: {e}")
self.conn = None
def _get_conn(self):
"""Check the connection and return it"""
if self.conn is None or self.conn.closed:
self._connect_db()
return self.conn
    def get_recent_matches(
        self,
        team_id: str,
        before_date_ms: int,
        limit: int = 5,
        home_only: bool = False,
        away_only: bool = False
    ) -> List[Dict]:
        """
        Fetch the team's most recent matches (newest first).
        Returns:
            List of matches with scores and home/away info
        """
        conn = self._get_conn()
        if conn is None:
            return []
        try:
            conditions = ["mst_utc < %s", "score_home IS NOT NULL"]
            params = [before_date_ms]
            if home_only:
                conditions.append("home_team_id = %s")
                params.append(team_id)
            elif away_only:
                conditions.append("away_team_id = %s")
                params.append(team_id)
            else:
                conditions.append("(home_team_id = %s OR away_team_id = %s)")
                params.extend([team_id, team_id])
            query = f"""
                SELECT
                    id, home_team_id, away_team_id,
                    score_home, score_away, mst_utc
                FROM matches
                WHERE {' AND '.join(conditions)}
                ORDER BY mst_utc DESC
                LIMIT %s
            """
            params.append(limit)
            # The context manager closes the cursor even if the query fails.
            with conn.cursor(cursor_factory=RealDictCursor) as cursor:
                cursor.execute(query, params)
                return cursor.fetchall()
        except Exception as e:
            print(f"[MomentumEngine] Query error: {e}")
            # Roll back so the connection is not left in an aborted transaction.
            try:
                conn.rollback()
            except Exception:
                pass
            return []
def calculate_goals_trend(self, matches: List[Dict], team_id: str) -> Tuple[float, float]:
"""
Compute the scoring and conceding trends.
Compares the last 3 matches with the older matches in the window (2 by default).
Returns:
(goals_trend, conceded_trend), each in [-1, +1]
"""
if len(matches) < 3:
return 0.0, 0.0
# Collect goals scored and conceded for each match
goals = []
conceded = []
for match in matches:
if match['home_team_id'] == team_id:
goals.append(match['score_home'])
conceded.append(match['score_away'])
else:
goals.append(match['score_away'])
conceded.append(match['score_home'])
# Last 3 vs. older matches
recent_goals = sum(goals[:3]) / 3 if len(goals) >= 3 else 0
older_goals = sum(goals[3:]) / len(goals[3:]) if len(goals) > 3 else recent_goals
recent_conceded = sum(conceded[:3]) / 3 if len(conceded) >= 3 else 0
older_conceded = sum(conceded[3:]) / len(conceded[3:]) if len(conceded) > 3 else recent_conceded
# Compute the trend, clamped to [-1, +1]
goals_trend = min(max((recent_goals - older_goals) / 2, -1), 1)
conceded_trend = min(max((recent_conceded - older_conceded) / 2, -1), 1)
return goals_trend, conceded_trend
    def calculate_streaks(self, matches: List[Dict], team_id: str) -> Tuple[int, int, int]:
        """
        Compute the current winning, unbeaten and losing streaks.
        Matches are expected newest first; counting stops at the first result that ends the streak.
        Returns:
            (winning_streak, unbeaten_streak, losing_streak)
        """
        winning = 0
        unbeaten = 0
        losing = 0
        win_run_open = True  # False once a draw has ended the winning streak
        for match in matches:
            # Determine the result from this team's point of view
            if match['home_team_id'] == team_id:
                goals_for = match['score_home']
                goals_against = match['score_away']
            else:
                goals_for = match['score_away']
                goals_against = match['score_home']
            if goals_for > goals_against:  # Win
                if losing > 0:
                    break  # A win ends the losing streak
                if win_run_open:
                    winning += 1
                unbeaten += 1
            elif goals_for == goals_against:  # Draw
                if losing > 0:
                    break  # A draw ends the losing streak
                win_run_open = False
                unbeaten += 1
            else:  # Loss
                if unbeaten > 0:
                    break  # A loss ends the unbeaten streak
                losing += 1
        return winning, unbeaten, losing
def calculate_last_match_impact(self, matches: List[Dict], team_id: str) -> float:
"""
Compute the psychological impact of the last match.
Big win = +1, heavy defeat = -1
Returns:
impact score: -1 to +1
"""
if not matches:
return 0.0
last_match = matches[0]
if last_match['home_team_id'] == team_id:
goals_for = last_match['score_home']
goals_against = last_match['score_away']
else:
goals_for = last_match['score_away']
goals_against = last_match['score_home']
goal_diff = goals_for - goals_against
# Impact by goal difference
if goal_diff >= 4:
return 1.0 # Very big win
elif goal_diff >= 2:
return 0.6
elif goal_diff == 1:
return 0.3
elif goal_diff == 0:
return 0.0
elif goal_diff == -1:
return -0.3
elif goal_diff >= -3:
return -0.6
else:
return -1.0 # Very heavy defeat
def calculate_xg_underperformance(self, matches: List[Dict], team_id: str) -> Tuple[float, float]:
"""
Calculate if a team chronically underperforms its xG (Expected Goals).
Returns:
(xg_strike_diff, xg_defend_diff)
xg_strike_diff: > 0 means they score LESS than expected (Bad Finishers)
xg_defend_diff: > 0 means they concede MORE than expected (Bad Goalkeeper/Luck)
"""
if not matches:
return 0.0, 0.0
real_scored = 0
xg_created = 0.0
real_conceded = 0
xg_conceded = 0.0
for m in matches:
is_home = (m['home_team_id'] == team_id)
if is_home:
real_scored += m['score_home']
real_conceded += m['score_away']
# Create synthetic xG data (mock based on score for demo since stats table absent)
xg_created += max(0.5, m['score_home'] * 1.5 - 0.5)
xg_conceded += max(0.5, m['score_away'] * 1.5 - 0.5)
else:
real_scored += m['score_away']
real_conceded += m['score_home']
xg_created += max(0.5, m['score_away'] * 1.5 - 0.5)
xg_conceded += max(0.5, m['score_home'] * 1.5 - 0.5)
# Calculate per match diffs
match_count = len(matches)
xg_strike_diff = (xg_created - real_scored) / match_count if match_count else 0
xg_defend_diff = (real_conceded - xg_conceded) / match_count if match_count else 0
return xg_strike_diff, xg_defend_diff
    def calculate_momentum(
        self,
        team_id: str,
        before_date_ms: int,
        match_limit: int = 5
    ) -> MomentumData:
        """
        Run the full momentum analysis for a team.
        Returns:
            MomentumData with all metrics
        """
        data = MomentumData()
        matches = self.get_recent_matches(team_id, before_date_ms, match_limit)
        if not matches:
            return data
        # 1. Goal trend
        data.goals_trend, data.conceded_trend = self.calculate_goals_trend(matches, team_id)
        # 2. Streaks
        data.winning_streak, data.unbeaten_streak, data.losing_streak = \
            self.calculate_streaks(matches, team_id)
        # 3. Last-match impact
        data.last_match_impact = self.calculate_last_match_impact(matches, team_id)
        # 4. Determine form direction
        if data.goals_trend > 0.3 and data.conceded_trend < 0:
            data.form_direction = "improving"
        elif data.goals_trend < -0.3 or data.conceded_trend > 0.3:
            data.form_direction = "declining"
        else:
            data.form_direction = "stable"
        # 5. xG underperformance (chronic inefficiency)
        # Note: the xG values are synthetic (derived from scores in
        # calculate_xg_underperformance), so the two penalties below are placeholders.
        data.xg_underperformance, data.xg_conceded_diff = self.calculate_xg_underperformance(matches, team_id)
        # 6. Overall momentum score
        momentum = 0.0
        # Scoring trend + defensive trend (sign inverted)
        momentum += data.goals_trend * 0.25
        momentum += (-data.conceded_trend) * 0.20
        # Streak bonuses
        if data.winning_streak >= 3:
            momentum += 0.25
        elif data.winning_streak >= 2:
            momentum += 0.15
        elif data.unbeaten_streak >= 5:
            momentum += 0.15
        if data.losing_streak >= 3:
            momentum -= 0.30
        elif data.losing_streak >= 2:
            momentum -= 0.15
        # Last-match impact
        momentum += data.last_match_impact * 0.20
        # Penalty: xG underperformance (poor finishing)
        # Applied when the team creates more xG than it scores
        if data.xg_underperformance > 0.5:  # Scoring 0.5 goals per match fewer than expected
            momentum -= min(0.3, data.xg_underperformance * 0.2)
        # Penalty: defensive xG underperformance (poor goalkeeping)
        # Applied when the team concedes more than expected
        if data.xg_conceded_diff > 0.5:
            momentum -= min(0.3, data.xg_conceded_diff * 0.2)
        data.momentum_score = min(max(momentum, -1), 1)
        return data
def get_features(
self,
home_team_id: str,
away_team_id: str,
match_date_ms: int
) -> Dict[str, float]:
"""
Return the feature dict for the model.
"""
home_momentum = self.calculate_momentum(home_team_id, match_date_ms)
away_momentum = self.calculate_momentum(away_team_id, match_date_ms)
# Form direction encoding
direction_map = {"improving": 1, "stable": 0, "declining": -1}
return {
# Home team momentum
"home_momentum_score": home_momentum.momentum_score,
"home_goals_trend": home_momentum.goals_trend,
"home_conceded_trend": home_momentum.conceded_trend,
"home_winning_streak": min(home_momentum.winning_streak, 5),
"home_unbeaten_streak": min(home_momentum.unbeaten_streak, 10),
"home_losing_streak": min(home_momentum.losing_streak, 5),
"home_last_impact": home_momentum.last_match_impact,
"home_form_direction": direction_map.get(home_momentum.form_direction, 0),
"home_xg_underperf": home_momentum.xg_underperformance,
"home_xg_conceded_diff": home_momentum.xg_conceded_diff,
# Away team momentum
"away_momentum_score": away_momentum.momentum_score,
"away_goals_trend": away_momentum.goals_trend,
"away_conceded_trend": away_momentum.conceded_trend,
"away_winning_streak": min(away_momentum.winning_streak, 5),
"away_unbeaten_streak": min(away_momentum.unbeaten_streak, 10),
"away_losing_streak": min(away_momentum.losing_streak, 5),
"away_last_impact": away_momentum.last_match_impact,
"away_form_direction": direction_map.get(away_momentum.form_direction, 0),
"away_xg_underperf": away_momentum.xg_underperformance,
"away_xg_conceded_diff": away_momentum.xg_conceded_diff,
# Differences
"momentum_diff": home_momentum.momentum_score - away_momentum.momentum_score,
"trend_diff": (home_momentum.goals_trend - home_momentum.conceded_trend) -
(away_momentum.goals_trend - away_momentum.conceded_trend),
"xg_underperf_diff": home_momentum.xg_underperformance - away_momentum.xg_underperformance,
}
# Singleton instance
_engine_instance = None
def get_momentum_engine() -> MomentumEngine:
"""Return the engine via the singleton pattern"""
global _engine_instance
if _engine_instance is None:
_engine_instance = MomentumEngine()
return _engine_instance
# Test
if __name__ == "__main__":
engine = get_momentum_engine()
# Test data
print("=" * 60)
print("MOMENTUM ENGINE TEST")
print("=" * 60)
# Example calculation (without DB)
data = MomentumData(
goals_trend=0.5,
conceded_trend=-0.3,
winning_streak=3,
unbeaten_streak=5,
losing_streak=0,
last_match_impact=0.6,
form_direction="improving"
)
print(f"Goals Trend: {data.goals_trend}")
print(f"Conceded Trend: {data.conceded_trend}")
print(f"Winning Streak: {data.winning_streak}")
print(f"Unbeaten Streak: {data.unbeaten_streak}")
print(f"Form Direction: {data.form_direction}")
print(f"Last Match Impact: {data.last_match_impact}")
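The goal-trend rule in `calculate_goals_trend` (mean of the last 3 matches minus the mean of the older matches, halved and clamped to [-1, +1]) can be checked without a database. The sketch below is a standalone restatement under that reading of the code; `goals_trend` is a hypothetical free function and the goal lists are made-up inputs.

```python
from typing import List


def goals_trend(goals_newest_first: List[int]) -> float:
    """Trend of goals scored: positive when the last 3 matches beat the older ones."""
    if len(goals_newest_first) < 3:
        return 0.0  # Too few matches to speak of a trend
    recent = sum(goals_newest_first[:3]) / 3
    older_part = goals_newest_first[3:]
    # With exactly 3 matches there is nothing to compare against, so the trend is flat.
    older = sum(older_part) / len(older_part) if older_part else recent
    # A swing of 2 goals per match maps to the full +/-1 range.
    return min(max((recent - older) / 2, -1.0), 1.0)


rising = goals_trend([3, 2, 4, 1, 0])   # recent mean 3.0 vs older mean 0.5, clamped
falling = goals_trend([1, 1, 1, 2, 2])  # recent mean 1.0 vs older mean 2.0
flat = goals_trend([2, 1])              # fewer than 3 matches
```

The same function applies to goals conceded; there a positive value means the defence is getting worse.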
File diff suppressed because it is too large
+371
@@ -0,0 +1,371 @@
"""
Poisson Engine - Mathematical Goal Model
Computes goal probabilities for the V9 model using the Poisson distribution.
Features:
1. Exact score probabilities (0-0, 1-0, 1-1, 2-1, etc.)
2. Over/Under probabilities (computed analytically)
3. BTTS (Both Teams To Score) probabilities
4. Expected Goals (xG) estimate
"""
import math
from typing import Dict, Tuple, Optional
from dataclasses import dataclass, field
def poisson_prob(lam: float, k: int) -> float:
"""
Poisson probability mass function.
P(X = k) = (λ^k * e^(-λ)) / k!
"""
if lam <= 0:
return 1.0 if k == 0 else 0.0
return (math.pow(lam, k) * math.exp(-lam)) / math.factorial(k)
@dataclass
class PoissonPrediction:
"""Poisson prediction results"""
home_xg: float = 0.0 # Expected goals, home team
away_xg: float = 0.0 # Expected goals, away team
total_xg: float = 0.0 # Total expected goals
# Match result probabilities
home_win_prob: float = 0.0
draw_prob: float = 0.0
away_win_prob: float = 0.0
# Over/Under probabilities
over_15_prob: float = 0.0
over_25_prob: float = 0.0
over_35_prob: float = 0.0
under_15_prob: float = 0.0
under_25_prob: float = 0.0
under_35_prob: float = 0.0
# BTTS
btts_yes_prob: float = 0.0
btts_no_prob: float = 0.0
# Most likely scores
most_likely_scores: list = field(default_factory=list)
class PoissonEngine:
"""
Computes goal probabilities with the Poisson distribution.
A statistical approach, independent of machine learning.
"""
# League-level average goal figures (default values)
DEFAULT_HOME_XG = 1.45
DEFAULT_AWAY_XG = 1.15
DEFAULT_LEAGUE_AVG = 2.60
def __init__(self):
self.max_goals = 7 # Maximum goals per side considered in the calculations
    def calculate_xg(
        self,
        home_goals_avg: float,
        home_conceded_avg: float,
        away_goals_avg: float,
        away_conceded_avg: float,
        league_home_avg: Optional[float] = None,
        league_away_avg: Optional[float] = None,
        league_total_avg: Optional[float] = None
    ) -> Tuple[float, float]:
        """
        Compute expected goals (xG) for both sides.
        xG = attack strength * opponent defensive weakness * league average
        """
        # Fall back to the default league averages.
        # league_total_avg is accepted for API compatibility but is not used below.
        if league_home_avg is None:
            league_home_avg = self.DEFAULT_HOME_XG
        if league_away_avg is None:
            league_away_avg = self.DEFAULT_AWAY_XG
        if league_total_avg is None:
            league_total_avg = self.DEFAULT_LEAGUE_AVG
        # Strength ratios. Away sides concede what home sides score (and vice versa),
        # so each defensive ratio is normalized by the opposite venue's scoring average.
        # Home attack strength = home scoring average / league home scoring average
        home_attack = home_goals_avg / league_home_avg if league_home_avg > 0 else 1.0
        # Away defensive weakness = away goals conceded / league average conceded by away sides
        away_defense = away_conceded_avg / league_home_avg if league_home_avg > 0 else 1.0
        # Away attack strength = away scoring average / league away scoring average
        away_attack = away_goals_avg / league_away_avg if league_away_avg > 0 else 1.0
        # Home defensive weakness = home goals conceded / league average conceded by home sides
        home_defense = home_conceded_avg / league_away_avg if league_away_avg > 0 else 1.0
        # Expected Goals
        home_xg = home_attack * away_defense * league_home_avg
        away_xg = away_attack * home_defense * league_away_avg
        # Clamp extreme values
        home_xg = max(0.3, min(home_xg, 4.0))
        away_xg = max(0.2, min(away_xg, 3.5))
        return home_xg, away_xg
def calculate_score_matrix(
self,
home_xg: float,
away_xg: float
) -> Dict[Tuple[int, int], float]:
"""
Compute the probability of every score combination.
Returns:
Dict[(home_goals, away_goals)] = probability
"""
matrix = {}
for home_goals in range(self.max_goals + 1):
for away_goals in range(self.max_goals + 1):
prob = poisson_prob(home_xg, home_goals) * poisson_prob(away_xg, away_goals)
matrix[(home_goals, away_goals)] = prob
return matrix
def calculate_match_odds(
self,
home_xg: float,
away_xg: float
) -> Tuple[float, float, float]:
"""
Compute 1X2 probabilities.
Returns:
(home_win, draw, away_win) probabilities
"""
matrix = self.calculate_score_matrix(home_xg, away_xg)
home_win = 0.0
draw = 0.0
away_win = 0.0
for (h, a), prob in matrix.items():
if h > a:
home_win += prob
elif h == a:
draw += prob
else:
away_win += prob
# Normalize (must sum to 1; the matrix is truncated at max_goals)
total = home_win + draw + away_win
if total > 0:
home_win /= total
draw /= total
away_win /= total
return home_win, draw, away_win
def calculate_over_under(
self,
home_xg: float,
away_xg: float
) -> Dict[str, float]:
"""
Compute Over/Under probabilities.
"""
matrix = self.calculate_score_matrix(home_xg, away_xg)
over_15 = 0.0
over_25 = 0.0
over_35 = 0.0
for (h, a), prob in matrix.items():
total = h + a
if total > 1.5:
over_15 += prob
if total > 2.5:
over_25 += prob
if total > 3.5:
over_35 += prob
return {
"over_15": over_15,
"over_25": over_25,
"over_35": over_35,
"under_15": 1 - over_15,
"under_25": 1 - over_25,
"under_35": 1 - over_35,
}
def calculate_btts(
self,
home_xg: float,
away_xg: float
) -> Tuple[float, float]:
"""
Both Teams To Score (BTTS) probability.
"""
# P(Home scores at least 1) = 1 - P(Home scores 0)
home_scores = 1 - poisson_prob(home_xg, 0)
# P(Away scores at least 1) = 1 - P(Away scores 0)
away_scores = 1 - poisson_prob(away_xg, 0)
# P(BTTS) = P(Home scores) * P(Away scores)
btts_yes = home_scores * away_scores
btts_no = 1 - btts_yes
return btts_yes, btts_no
def get_most_likely_scores(
self,
home_xg: float,
away_xg: float,
top_n: int = 5
) -> list:
"""
Return the most likely scores.
"""
matrix = self.calculate_score_matrix(home_xg, away_xg)
# Sort by probability, descending
sorted_scores = sorted(matrix.items(), key=lambda x: x[1], reverse=True)
return [
{"score": f"{h}-{a}", "probability": round(prob * 100, 1)}
for (h, a), prob in sorted_scores[:top_n]
]
def predict(
self,
home_goals_avg: float,
home_conceded_avg: float,
away_goals_avg: float,
away_conceded_avg: float,
league_home_avg: float = None,
league_away_avg: float = None,
league_total_avg: float = None
) -> PoissonPrediction:
"""
Full Poisson prediction.
"""
prediction = PoissonPrediction()
# 1. Compute xG
home_xg, away_xg = self.calculate_xg(
home_goals_avg, home_conceded_avg,
away_goals_avg, away_conceded_avg,
league_home_avg, league_away_avg, league_total_avg
)
prediction.home_xg = round(home_xg, 2)
prediction.away_xg = round(away_xg, 2)
prediction.total_xg = round(home_xg + away_xg, 2)
# 2. Match result
hw, d, aw = self.calculate_match_odds(home_xg, away_xg)
prediction.home_win_prob = round(hw, 3)
prediction.draw_prob = round(d, 3)
prediction.away_win_prob = round(aw, 3)
# 3. Over/Under
ou = self.calculate_over_under(home_xg, away_xg)
prediction.over_15_prob = round(ou["over_15"], 3)
prediction.over_25_prob = round(ou["over_25"], 3)
prediction.over_35_prob = round(ou["over_35"], 3)
prediction.under_15_prob = round(ou["under_15"], 3)
prediction.under_25_prob = round(ou["under_25"], 3)
prediction.under_35_prob = round(ou["under_35"], 3)
# 4. BTTS
btts_yes, btts_no = self.calculate_btts(home_xg, away_xg)
prediction.btts_yes_prob = round(btts_yes, 3)
prediction.btts_no_prob = round(btts_no, 3)
# 5. Most likely scores
prediction.most_likely_scores = self.get_most_likely_scores(home_xg, away_xg)
return prediction
def get_features(
self,
home_goals_avg: float,
home_conceded_avg: float,
away_goals_avg: float,
away_conceded_avg: float,
league_home_avg: float = None,
league_away_avg: float = None,
league_total_avg: float = None
) -> Dict[str, float]:
"""
Feature dict for the model.
"""
pred = self.predict(
home_goals_avg, home_conceded_avg,
away_goals_avg, away_conceded_avg,
league_home_avg, league_away_avg, league_total_avg
)
return {
"poisson_home_xg": pred.home_xg,
"poisson_away_xg": pred.away_xg,
"poisson_total_xg": pred.total_xg,
"poisson_home_win": pred.home_win_prob,
"poisson_draw": pred.draw_prob,
"poisson_away_win": pred.away_win_prob,
"poisson_over_15": pred.over_15_prob,
"poisson_over_25": pred.over_25_prob,
"poisson_over_35": pred.over_35_prob,
"poisson_btts_yes": pred.btts_yes_prob,
}
# Singleton
_engine_instance = None
def get_poisson_engine() -> PoissonEngine:
"""Singleton pattern"""
global _engine_instance
if _engine_instance is None:
_engine_instance = PoissonEngine()
return _engine_instance
# Test
if __name__ == "__main__":
engine = get_poisson_engine()
# Example: strong home side vs. weak away side
print("=" * 60)
print("POISSON ENGINE TEST")
print("Galatasaray (home) vs Antalyaspor (away)")
print("=" * 60)
pred = engine.predict(
home_goals_avg=2.1, # Galatasaray home scoring average
home_conceded_avg=0.8, # Galatasaray home conceded average
away_goals_avg=0.9, # Antalyaspor away scoring average
away_conceded_avg=1.8, # Antalyaspor away conceded average
league_home_avg=1.5,
league_away_avg=1.1
)
print(f"\n📊 Expected Goals:")
print(f" Home xG: {pred.home_xg}")
print(f" Away xG: {pred.away_xg}")
print(f" Total xG: {pred.total_xg}")
print(f"\n🎯 Match Result:")
print(f" 1 (Home): {pred.home_win_prob*100:.1f}%")
print(f" X (Draw): {pred.draw_prob*100:.1f}%")
print(f" 2 (Away): {pred.away_win_prob*100:.1f}%")
print(f"\n⚽ Over/Under:")
print(f" Over 2.5: {pred.over_25_prob*100:.1f}%")
print(f" Under 2.5: {pred.under_25_prob*100:.1f}%")
print(f"\n🤝 Both Teams To Score:")
print(f" BTTS Yes: {pred.btts_yes_prob*100:.1f}%")
print(f" BTTS No: {pred.btts_no_prob*100:.1f}%")
print(f"\n📈 Most Likely Scores:")
for score_data in pred.most_likely_scores:
print(f" {score_data['score']}: {score_data['probability']}%")
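The core of the Poisson engine fits in a few lines once the xG values are given. The sketch below is a standalone illustration under the same independence assumption the engine makes (home and away goals are independent Poisson variables); `poisson_pmf`, `match_probabilities` and `btts_yes` are hypothetical names, and the input rates are the class defaults shown above.

```python
import math
from typing import Tuple


def poisson_pmf(lam: float, k: int) -> float:
    """P(X = k) for a Poisson variable with mean lam."""
    if lam <= 0:
        return 1.0 if k == 0 else 0.0
    return lam ** k * math.exp(-lam) / math.factorial(k)


def match_probabilities(home_xg: float, away_xg: float, max_goals: int = 7) -> Tuple[float, float, float]:
    """1X2 probabilities from a score grid truncated at max_goals per side."""
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(home_xg, h) * poisson_pmf(away_xg, a)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    # Renormalize because the truncated grid sums to slightly less than 1.
    total = home + draw + away
    return home / total, draw / total, away / total


def btts_yes(home_xg: float, away_xg: float) -> float:
    """P(both teams score) = P(home >= 1) * P(away >= 1) under independence."""
    return (1 - math.exp(-home_xg)) * (1 - math.exp(-away_xg))


home_win, draw, away_win = match_probabilities(1.45, 1.15)
both_score = btts_yes(1.45, 1.15)
```

Only the 1X2 totals are renormalized here, as in the engine; the closed-form BTTS expression needs no truncation at all.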
+368
@@ -0,0 +1,368 @@
"""
Referee Engine - V9 Feature
Referee profiles and match-impact analysis.
Metrics analyzed:
- Average card count (yellow/red)
- Tendency to award penalties
- Rate of decisions favoring the home side
- Average total goals per match
"""
import os
from typing import Dict, Optional, List
from dataclasses import dataclass, field
from datetime import datetime
try:
import psycopg2
from psycopg2.extras import RealDictCursor
except ImportError:
psycopg2 = None
@dataclass
class RefereeProfile:
"""Referee profile"""
referee_name: str
matches_count: int = 0
# Card statistics
avg_yellow_cards: float = 0.0
avg_red_cards: float = 0.0
total_cards_per_match: float = 0.0
# Penalty statistics
penalty_rate: float = 0.0 # Share of matches in which a penalty was awarded
# Home-side tendency
home_win_rate: float = 0.0
home_bias: float = 0.0 # -1 (away bias) to +1 (home bias)
# Goal statistics
avg_goals_per_match: float = 0.0
over_25_rate: float = 0.0
@dataclass
class RefereeFeatures:
"""Referee features for the model"""
referee_name: str = ""
referee_matches: int = 0
referee_avg_yellow: float = 0.0
referee_avg_red: float = 0.0
referee_cards_total: float = 0.0
referee_penalty_rate: float = 0.0
referee_home_bias: float = 0.0
referee_avg_goals: float = 0.0
referee_over25_rate: float = 0.0
referee_experience: float = 0.0 # 0-1 normalized
def to_dict(self) -> Dict[str, float]:
return {
'referee_matches': float(self.referee_matches),
'referee_avg_yellow': self.referee_avg_yellow,
'referee_avg_red': self.referee_avg_red,
'referee_cards_total': self.referee_cards_total,
'referee_penalty_rate': self.referee_penalty_rate,
'referee_home_bias': self.referee_home_bias,
'referee_avg_goals': self.referee_avg_goals,
'referee_over25_rate': self.referee_over25_rate,
'referee_experience': self.referee_experience,
}
class RefereeEngine:
"""
Referee analysis engine.
Analyzes a referee's past matches to compute:
- Card tendencies
- Home-side bias
- Goal average
"""
# ID of the main referee role (usually 1 or "Hakem")
MAIN_REFEREE_ROLE_ID = 1
def __init__(self):
self.conn = None
self._referee_cache: Dict[tuple, RefereeProfile] = {} # keyed by (referee_name, league_id)
self._cache_loaded = False
def _connect_db(self):
if psycopg2 is None:
return None
try:
from data.db import get_clean_dsn
self.conn = psycopg2.connect(get_clean_dsn())
return self.conn
except Exception as e:
print(f"[RefereeEngine] DB connection failed: {e}")
return None
def get_conn(self):
if self.conn is None or self.conn.closed:
self._connect_db()
return self.conn
def _get_main_referee_role_id(self) -> int:
"""Find the ID of the main referee role"""
conn = self.get_conn()
if conn is None:
return self.MAIN_REFEREE_ROLE_ID
try:
with conn.cursor() as cur:
cur.execute("""
SELECT id FROM official_roles
WHERE LOWER(name) LIKE '%%hakem%%'
AND LOWER(name) NOT LIKE '%%yardımcı%%'
AND LOWER(name) NOT LIKE '%%dördüncü%%'
LIMIT 1
""")
result = cur.fetchone()
if result:
return result[0]
except Exception:
pass
return self.MAIN_REFEREE_ROLE_ID
def get_referee_for_match(self, match_id: str) -> Optional[str]:
"""Find the main referee of a match"""
conn = self.get_conn()
if conn is None:
return None
try:
main_role_id = self._get_main_referee_role_id()
with conn.cursor() as cur:
cur.execute("""
SELECT name FROM match_officials
WHERE match_id = %s AND role_id = %s
LIMIT 1
""", (match_id, main_role_id))
result = cur.fetchone()
return result[0] if result else None
except Exception as e:
print(f"[RefereeEngine] Error getting referee: {e}")
return None
def calculate_referee_profile(self, referee_name: str, league_id: str = None) -> RefereeProfile:
"""Analyze the referee's matches. If league_id is given, only matches from that league are used."""
# Composite cache key: the same name can have a different profile in different leagues
cache_key = (referee_name, league_id)
if cache_key in self._referee_cache:
return self._referee_cache[cache_key]
profile = RefereeProfile(referee_name=referee_name)
conn = self.get_conn()
if conn is None:
return profile
try:
main_role_id = self._get_main_referee_role_id()
with conn.cursor(cursor_factory=RealDictCursor) as cur:
# Fetch the matches officiated by this referee (only that league if league_id is set)
if league_id:
cur.execute("""
SELECT m.id, m.score_home, m.score_away, m.home_team_id, m.away_team_id
FROM matches m
JOIN match_officials mo ON m.id = mo.match_id
WHERE mo.name = %s
AND mo.role_id = %s
AND m.league_id = %s
AND m.score_home IS NOT NULL
AND m.score_away IS NOT NULL
ORDER BY m.mst_utc DESC
LIMIT 100
""", (referee_name, main_role_id, league_id))
else:
cur.execute("""
SELECT m.id, m.score_home, m.score_away, m.home_team_id, m.away_team_id
FROM matches m
JOIN match_officials mo ON m.id = mo.match_id
WHERE mo.name = %s
AND mo.role_id = %s
AND m.score_home IS NOT NULL
AND m.score_away IS NOT NULL
ORDER BY m.mst_utc DESC
LIMIT 100
""", (referee_name, main_role_id))
matches = cur.fetchall()
profile.matches_count = len(matches)
if profile.matches_count == 0:
return profile
match_ids = [m['id'] for m in matches]
# Card statistics
cur.execute("""
SELECT
COUNT(*) FILTER (WHERE event_subtype ILIKE '%%yellow%%') as yellow_count,
COUNT(*) FILTER (WHERE event_subtype ILIKE '%%red%%' OR event_subtype ILIKE '%%second%%') as red_count
FROM match_player_events
WHERE match_id = ANY(%s) AND event_type = 'card'
""", (match_ids,))
card_stats = cur.fetchone()
if card_stats:
profile.avg_yellow_cards = (card_stats['yellow_count'] or 0) / profile.matches_count
profile.avg_red_cards = (card_stats['red_count'] or 0) / profile.matches_count
profile.total_cards_per_match = profile.avg_yellow_cards + profile.avg_red_cards
# Penalty statistics
cur.execute("""
SELECT COUNT(DISTINCT match_id) as penalty_matches
FROM match_player_events
WHERE match_id = ANY(%s)
AND event_type = 'goal'
AND event_subtype ILIKE '%%penaltı%%'
""", (match_ids,))
penalty_stats = cur.fetchone()
if penalty_stats:
profile.penalty_rate = (penalty_stats['penalty_matches'] or 0) / profile.matches_count
# Home-side tendency and goal average
home_wins = 0
away_wins = 0
draws = 0
total_goals = 0
over_25_count = 0
for m in matches:
goals = (m['score_home'] or 0) + (m['score_away'] or 0)
total_goals += goals
if goals > 2.5:
over_25_count += 1
if m['score_home'] > m['score_away']:
home_wins += 1
elif m['score_home'] < m['score_away']:
away_wins += 1
else:
draws += 1
profile.avg_goals_per_match = total_goals / profile.matches_count
profile.over_25_rate = over_25_count / profile.matches_count
profile.home_win_rate = home_wins / profile.matches_count
# Home bias: -1 (away favors) to +1 (home favors)
# A typical league average is ~46% home wins; normalize against that
expected_home_rate = 0.46
profile.home_bias = (profile.home_win_rate - expected_home_rate) * 2
profile.home_bias = max(-1, min(1, profile.home_bias))
# Store in the cache
self._referee_cache[cache_key] = profile
return profile
except Exception as e:
print(f"[RefereeEngine] Error calculating profile: {e}")
return profile
def get_features(self, match_id: str, league_id: str = None) -> Dict[str, float]:
"""
Compute referee features for a match.
Args:
match_id: Match ID
league_id: League ID (optional; avoids name collisions)
Returns:
Referee features as a dict
"""
features = RefereeFeatures()
# Find the referee
referee_name = self.get_referee_for_match(match_id)
if referee_name is None:
return features.to_dict()
features.referee_name = referee_name
# Compute the profile (scoped by league_id)
profile = self.calculate_referee_profile(referee_name, league_id=league_id)
features.referee_matches = profile.matches_count
features.referee_avg_yellow = profile.avg_yellow_cards
features.referee_avg_red = profile.avg_red_cards
features.referee_cards_total = profile.total_cards_per_match
features.referee_penalty_rate = profile.penalty_rate
features.referee_home_bias = profile.home_bias
features.referee_avg_goals = profile.avg_goals_per_match
features.referee_over25_rate = profile.over_25_rate
# Experience: 50+ matches = 1.0, 0 matches = 0.0
features.referee_experience = min(profile.matches_count / 50, 1.0)
return features.to_dict()
def get_features_by_name(self, referee_name: str, league_id: str = None) -> Dict[str, float]:
"""
Compute features from a referee name.
Args:
referee_name: Referee name
league_id: League ID (optional; avoids name collisions)
Returns:
Referee features as a dict
"""
features = RefereeFeatures()
if not referee_name:
return features.to_dict()
features.referee_name = referee_name
profile = self.calculate_referee_profile(referee_name, league_id=league_id)
features.referee_matches = profile.matches_count
features.referee_avg_yellow = profile.avg_yellow_cards
features.referee_avg_red = profile.avg_red_cards
features.referee_cards_total = profile.total_cards_per_match
features.referee_penalty_rate = profile.penalty_rate
features.referee_home_bias = profile.home_bias
features.referee_avg_goals = profile.avg_goals_per_match
features.referee_over25_rate = profile.over_25_rate
features.referee_experience = min(profile.matches_count / 50, 1.0)
return features.to_dict()
# Singleton instance
_engine: Optional[RefereeEngine] = None
def get_referee_engine() -> RefereeEngine:
"""Return the singleton referee engine instance"""
global _engine
if _engine is None:
_engine = RefereeEngine()
return _engine
if __name__ == "__main__":
# Test
engine = get_referee_engine()
print("\n🧪 Referee Engine Test")
print("=" * 50)
# Test with a known referee name
test_referee = "Cüneyt Çakır"
features = engine.get_features_by_name(test_referee)
print(f"\n📊 Referee: {test_referee}")
for key, value in features.items():
print(f" {key}: {value:.3f}")
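The two normalized referee features, home bias and experience, are simple enough to verify by hand. The sketch below restates them as free functions; `home_bias` and `experience` are hypothetical names, and the 0.46 baseline and the 50-match cap are the constants used in the engine above. The match counts are made up.

```python
def home_bias(home_wins: int, matches: int, expected_home_rate: float = 0.46) -> float:
    """Scale the gap between observed and expected home-win rate into [-1, +1]."""
    if matches <= 0:
        return 0.0  # No data: treat the referee as neutral
    raw = (home_wins / matches - expected_home_rate) * 2
    return max(-1.0, min(1.0, raw))


def experience(matches: int, full_at: int = 50) -> float:
    """Linear experience score: 0 matches -> 0.0, full_at or more -> 1.0."""
    return min(matches / full_at, 1.0)


slight_home_lean = home_bias(56, 100)  # 56% home wins against a 46% baseline
capped = home_bias(100, 100)           # the raw value exceeds 1 and is clamped
half_experienced = experience(25)
```

Because the raw gap is doubled, a referee whose home-win rate sits 10 percentage points above the baseline scores about 0.2, so the feature only saturates on extreme samples.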
+243
@@ -0,0 +1,243 @@
"""
V27 Rolling Window Feature Calculator
======================================
Computes rolling averages over 5/10/20 match windows,
with home/away splits and trend detection.
"""
from __future__ import annotations
from typing import Dict, List, Tuple
import math
def calc_rolling_features(
team_matches: List[Tuple], # [(mst, is_home, team_goals, opp_goals, opp_id), ...]
before_date: int,
team_is_home: bool,
) -> Dict[str, float]:
"""Calculate rolling window features for a team before a given date."""
valid = [m for m in team_matches if m[0] < before_date]
defaults = {
"rolling5_goals_avg": 1.3, "rolling5_conceded_avg": 1.2,
"rolling10_goals_avg": 1.3, "rolling10_conceded_avg": 1.2,
"rolling20_goals_avg": 1.3, "rolling20_conceded_avg": 1.2,
"rolling5_clean_sheets": 0.25,
"venue_goals_avg": 1.3, "venue_conceded_avg": 1.2,
"goal_trend": 0.0,
}
if len(valid) < 3:
return defaults
result = {}
for window in [5, 10, 20]:
recent = valid[-window:] if len(valid) >= window else valid
n = len(recent)
g_sum = sum(m[2] for m in recent)
c_sum = sum(m[3] for m in recent)
result[f"rolling{window}_goals_avg"] = g_sum / n
result[f"rolling{window}_conceded_avg"] = c_sum / n
# Clean sheet rate (last 5)
r5 = valid[-5:] if len(valid) >= 5 else valid
result["rolling5_clean_sheets"] = sum(1 for m in r5 if m[3] == 0) / len(r5)
# Venue-specific (home-only or away-only)
venue_matches = [m for m in valid if m[1] == team_is_home]
if venue_matches:
vm = venue_matches[-10:] if len(venue_matches) >= 10 else venue_matches
result["venue_goals_avg"] = sum(m[2] for m in vm) / len(vm)
result["venue_conceded_avg"] = sum(m[3] for m in vm) / len(vm)
else:
result["venue_goals_avg"] = defaults["venue_goals_avg"]
result["venue_conceded_avg"] = defaults["venue_conceded_avg"]
# Goal trend: compare last 3 vs previous 3
if len(valid) >= 6:
last3 = sum(m[2] for m in valid[-3:]) / 3
prev3 = sum(m[2] for m in valid[-6:-3]) / 3
result["goal_trend"] = last3 - prev3
else:
result["goal_trend"] = 0.0
return result
def calc_league_quality(
all_matches: List[Tuple], # all FT matches in this league
) -> Dict[str, float]:
"""Calculate league-level quality features."""
defaults = {
"league_home_win_rate": 0.45,
"league_draw_rate": 0.25,
"league_btts_rate": 0.50,
"league_ou25_rate": 0.50,
"league_reliability_score": 0.50,
}
if len(all_matches) < 20:
return defaults
n = len(all_matches)
home_wins = sum(1 for m in all_matches if m[2] > m[3])
draws = sum(1 for m in all_matches if m[2] == m[3])
btts = sum(1 for m in all_matches if m[2] > 0 and m[3] > 0)
ou25 = sum(1 for m in all_matches if (m[2] + m[3]) > 2.5)
hw_rate = home_wins / n
dr_rate = draws / n
btts_rate = btts / n
ou25_rate = ou25 / n
# Reliability: leagues closer to averages are more predictable
predictability = 1.0 - abs(hw_rate - 0.45) - abs(dr_rate - 0.27) * 0.5
reliability = max(0.2, min(0.95, predictability))
return {
"league_home_win_rate": round(hw_rate, 4),
"league_draw_rate": round(dr_rate, 4),
"league_btts_rate": round(btts_rate, 4),
"league_ou25_rate": round(ou25_rate, 4),
"league_reliability_score": round(reliability, 4),
}
def calc_time_features(
team_matches: List[Tuple],
match_mst: int,
) -> Dict[str, float]:
"""Calculate time-based features."""
from datetime import datetime
# Days since last match
valid = [m for m in team_matches if m[0] < match_mst]
if valid:
last_mst = valid[-1][0]
days_rest = (match_mst - last_mst) / 86_400_000 # ms to days
days_rest = min(days_rest, 60.0) # cap at 60 days
else:
days_rest = 14.0
# Month and season flags
try:
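# utcfromtimestamp() is deprecated since Python 3.12; build an aware UTC datetime instead
from datetime import timezone
dt = datetime.fromtimestamp(match_mst / 1000, tz=timezone.utc)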
month = dt.month
is_season_start = 1.0 if month in (7, 8) else 0.0
is_season_end = 1.0 if month in (5, 6) else 0.0
except Exception:
month = 6
is_season_start = 0.0
is_season_end = 0.0
return {
"days_rest": round(days_rest, 2),
"match_month": month,
"is_season_start": is_season_start,
"is_season_end": is_season_end,
}
def calc_advanced_h2h(
team_matches: List[Tuple],
home_id: int,
away_id: int,
before_date: int,
) -> Dict[str, float]:
"""Calculate advanced H2H features."""
defaults = {
"h2h_home_goals_avg": 1.3,
"h2h_away_goals_avg": 1.1,
"h2h_recent_trend": 0.0,
"h2h_venue_advantage": 0.0,
}
h2h = [m for m in team_matches if m[4] == away_id and m[0] < before_date]
if not h2h:
return defaults
recent = h2h[-10:]
home_goals_total = 0
away_goals_total = 0
venue_home_wins = 0
venue_total = 0
for mst, is_home, team_goals, opp_goals, _ in recent:
if is_home:
home_goals_total += team_goals
away_goals_total += opp_goals
venue_total += 1
if team_goals > opp_goals:
venue_home_wins += 1
else:
home_goals_total += opp_goals
away_goals_total += team_goals
n = len(recent)
result = {
"h2h_home_goals_avg": home_goals_total / n,
"h2h_away_goals_avg": away_goals_total / n,
"h2h_venue_advantage": venue_home_wins / venue_total if venue_total > 0 else 0.5,
}
# Recent trend: last 3 vs overall
if len(h2h) >= 4:
last3_pts = sum(
1.0 if m[2] > m[3] else (0.5 if m[2] == m[3] else 0.0)
for m in h2h[-3:]
) / 3
overall_pts = sum(
1.0 if m[2] > m[3] else (0.5 if m[2] == m[3] else 0.0)
for m in h2h
) / len(h2h)
result["h2h_recent_trend"] = round(last3_pts - overall_pts, 4)
else:
result["h2h_recent_trend"] = 0.0
return result
def calc_strength_diff(
home_form: Dict[str, float],
away_form: Dict[str, float],
home_elo: Dict[str, float],
away_elo: Dict[str, float],
home_momentum: float,
away_momentum: float,
upset_potential: float,
) -> Dict[str, float]:
"""Calculate strength differential features."""
# Attack vs Defense mismatches
h_attack = home_form.get("goals_avg", 1.3)
a_defense = away_form.get("conceded_avg", 1.2)
a_attack = away_form.get("goals_avg", 1.3)
h_defense = home_form.get("conceded_avg", 1.2)
atk_def_home = h_attack - a_defense # positive = home attack > away defense
atk_def_away = a_attack - h_defense
# XG diff approximation
xg_diff = (h_attack + a_defense) / 2 - (a_attack + h_defense) / 2
# Form × Momentum interaction
form_mom = (home_momentum - away_momentum) * (
home_form.get("scoring_rate", 0.75) - away_form.get("scoring_rate", 0.75)
)
# ELO-Form consistency
elo_diff = home_elo.get("overall", 1500) - away_elo.get("overall", 1500)
form_diff = h_attack - a_attack
elo_form_consistency = 1.0 if (elo_diff > 0 and form_diff > 0) or (elo_diff < 0 and form_diff < 0) else 0.0
# Upset × ELO gap
elo_gap = abs(elo_diff)
upset_x_elo = upset_potential * (elo_gap / 400.0)
return {
"attack_vs_defense_home": round(atk_def_home, 4),
"attack_vs_defense_away": round(atk_def_away, 4),
"xg_diff": round(xg_diff, 4),
"form_momentum_interaction": round(form_mom, 4),
"elo_form_consistency": elo_form_consistency,
"upset_x_elo_gap": round(upset_x_elo, 4),
}
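The rolling-window and goal-trend calculations in this file can be tried in isolation with a small sketch like the one below. It assumes, as the slicing `valid[-window:]` in `calc_rolling_features` implicitly does, that the match list is sorted by `mst` in ascending order. The helper names `rolling_avgs` and `goal_trend` and the sample results are hypothetical. Unlike the module, the sketch returns zeros rather than league-style defaults when there is no history.

```python
from typing import Dict, List, Tuple

# (mst, is_home, team_goals, opp_goals, opp_id), same tuple layout as in the module
Match = Tuple[int, bool, int, int, int]
DAY_MS = 86_400_000


def rolling_avgs(matches: List[Match], before_date: int, window: int) -> Dict[str, float]:
    """Mean goals scored/conceded over the last `window` matches before `before_date`."""
    valid = [m for m in matches if m[0] < before_date]
    recent = valid[-window:]  # assumes `matches` is sorted by mst ascending
    if not recent:
        return {"goals_avg": 0.0, "conceded_avg": 0.0}
    n = len(recent)
    return {
        "goals_avg": sum(m[2] for m in recent) / n,
        "conceded_avg": sum(m[3] for m in recent) / n,
    }


def goal_trend(matches: List[Match], before_date: int) -> float:
    """Average goals in the last 3 matches minus the average in the 3 before them."""
    valid = [m for m in matches if m[0] < before_date]
    if len(valid) < 6:
        return 0.0
    last3 = sum(m[2] for m in valid[-3:]) / 3
    prev3 = sum(m[2] for m in valid[-6:-3]) / 3
    return last3 - prev3


# Made-up weekly results (goals for, goals against), oldest first
scores = [(0, 1), (1, 1), (1, 0), (2, 0), (3, 1), (2, 2)]
history = [(i * 7 * DAY_MS, i % 2 == 0, g, c, 100 + i) for i, (g, c) in enumerate(scores)]
cutoff = 6 * 7 * DAY_MS  # strictly after the last match in `history`

print(rolling_avgs(history, cutoff, window=5))
print(round(goal_trend(history, cutoff), 4))
```

If the caller passes matches in any other order, the "last N" slices no longer pick the most recent games, so sorting before the call is the safer choice.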
+408
@@ -0,0 +1,408 @@
"""
Sidelined Analyzer — Injury & Suspension Impact Calculator
==========================================================
Parses sidelined JSON from live_matches and calculates
position-weighted missing player impact using ACTUAL player
statistics from the database (goals, assists, starting frequency).
Senior ML Engineer Principle: No magic numbers — all weights from config.
Data Quality: Cross-reference sidelined IDs with DB for real impact.
"""
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Any, Tuple
import os
import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
try:
import psycopg2
from psycopg2.extras import RealDictCursor
except ImportError:
psycopg2 = None
from config.config_loader import get_config
@dataclass
class PlayerImpactDetail:
"""Impact detail for a single sidelined player."""
player_id: str
player_name: str
position: str
impact_score: float
db_goals: int = 0
db_assists: int = 0
db_starts: int = 0
db_rating: float = 0.0 # Calculated from DB stats
is_key_player: bool = False
adaptation_applied: bool = False
@dataclass
class SidelinedImpact:
"""Impact analysis of sidelined players for one team."""
total_sidelined: int = 0
impact_score: float = 0.0 # 0.0 - 1.0 (normalized)
key_position_missing: bool = False # GK or 2+ same position missing
key_players_missing: int = 0 # How many key players are missing
position_breakdown: Dict[str, int] = field(default_factory=dict)
player_details: List[PlayerImpactDetail] = field(default_factory=list)
details: List[str] = field(default_factory=list)
class SidelinedAnalyzer:
"""
Analyzes sidelined player data with DB-backed statistics.
Impact formula per player:
player_impact = position_weight × db_rating_factor × adaptation_factor
Where:
- position_weight: from config (GK most critical)
- db_rating_factor: calculated from actual goals + assists + starts (not mackolik average!)
- adaptation_factor: 1.0 if recent injury, discounted if team adapted (many matches missed)
DB Query: Cross-references sidelined player IDs with match_player_events
to get real goals/assists from recent matches.
"""
def __init__(self):
self.config = get_config()
self.conn = None
self._load_config()
self._connect_db()
def _load_config(self):
"""Load all config values once at init."""
cfg = self.config
self.position_weights = cfg.get("sidelined.position_weights", {
"K": 0.35, "D": 0.20, "O": 0.25, "F": 0.30
})
self.max_rating = cfg.get("sidelined.max_rating", 10)
self.adaptation_threshold = cfg.get("sidelined.adaptation_threshold", 10)
self.adaptation_discount = cfg.get("sidelined.adaptation_discount", 0.5)
self.goalkeeper_penalty = cfg.get("sidelined.goalkeeper_penalty", 0.15)
self.confidence_boost = cfg.get("sidelined.confidence_boost", 10)
self.max_impact = cfg.get("sidelined.max_impact", 0.85)
self.key_player_threshold = cfg.get("sidelined.key_player_threshold", 3)
self.recent_matches_lookback = cfg.get("sidelined.recent_matches_lookback", 15)
@staticmethod
def _safe_int(value: Any, default: int = 0) -> int:
try:
if value is None or value == "":
return default
return int(float(value))
except (TypeError, ValueError):
return default
@staticmethod
def _safe_float(value: Any, default: float = 0.0) -> float:
try:
if value is None or value == "":
return default
return float(value)
except (TypeError, ValueError):
return default
def _connect_db(self):
"""Open the DB connection (called from __init__ and on reconnect), following existing engine patterns."""
if psycopg2 is None:
return
try:
from data.db import get_clean_dsn
self.conn = psycopg2.connect(get_clean_dsn())
except Exception as e:
print(f"[SidelinedAnalyzer] DB connection failed: {e}")
self.conn = None
def _get_conn(self):
"""Get or reconnect DB connection."""
if self.conn is None or self.conn.closed:
self._connect_db()
return self.conn
def _fetch_player_stats(self, player_ids: List[str]) -> Dict[str, Dict]:
"""
Fetch real player statistics from DB for given player IDs.
Returns dict keyed by player_id with:
goals: int, assists: int, starts: int, matches: int
"""
conn = self._get_conn()
if not conn or not player_ids:
return {}
stats = {}
try:
cur = conn.cursor(cursor_factory=RealDictCursor)
# 1. Goals from match_player_events + Assists via assist_player_id
cur.execute("""
SELECT
sub.player_id,
SUM(sub.goals) AS goals,
SUM(sub.assists) AS assists
FROM (
-- Goals: player scored
SELECT mpe.player_id,
COUNT(*) AS goals,
0 AS assists
FROM match_player_events mpe
JOIN matches m ON mpe.match_id = m.id
WHERE mpe.player_id = ANY(%s)
AND mpe.event_type = 'goal'
AND m.status = 'FT'
GROUP BY mpe.player_id
UNION ALL
-- Assists: player assisted
SELECT mpe.assist_player_id AS player_id,
0 AS goals,
COUNT(*) AS assists
FROM match_player_events mpe
JOIN matches m ON mpe.match_id = m.id
WHERE mpe.assist_player_id = ANY(%s)
AND mpe.event_type = 'goal'
AND m.status = 'FT'
GROUP BY mpe.assist_player_id
) sub
GROUP BY sub.player_id
""", (player_ids, player_ids))
for row in cur.fetchall():
pid = row["player_id"]
stats[pid] = {
"goals": row["goals"] or 0,
"assists": row["assists"] or 0,
"starts": 0,
"matches": 0
}
# 2. Starting frequency from match_player_participation
cur.execute("""
SELECT
mpp.player_id,
COUNT(*) AS total_matches,
COUNT(*) FILTER (WHERE mpp.is_starting = true) AS starts
FROM match_player_participation mpp
JOIN matches m ON mpp.match_id = m.id
WHERE mpp.player_id = ANY(%s)
AND m.status = 'FT'
GROUP BY mpp.player_id
""", (player_ids,))
for row in cur.fetchall():
pid = row["player_id"]
if pid not in stats:
stats[pid] = {"goals": 0, "assists": 0, "starts": 0, "matches": 0}
stats[pid]["starts"] = row["starts"] or 0
stats[pid]["matches"] = row["total_matches"] or 0
cur.close()
except Exception as e:
print(f"[SidelinedAnalyzer] DB query error: {e}")
try:
conn.rollback()
except Exception:
pass
return stats
def _calculate_db_rating(self, db_stats: Dict, position: str) -> float:
"""
Calculate player rating from DB statistics.
Rating is 0.0 - 1.0, where 1.0 = absolute key player.
Factors:
- Goals (weighted by position: Forwards value more, Defenders less)
- Assists
- Starting frequency (regulars > squad players)
"""
def _to_float(value: Any, default: float = 0.0) -> float:
try:
return float(value)
except (TypeError, ValueError):
return default
goals = _to_float(db_stats.get("goals", 0))
assists = _to_float(db_stats.get("assists", 0))
starts = _to_float(db_stats.get("starts", 0))
matches = _to_float(db_stats.get("matches", 0))
# Goal contribution weight by position
# Forwards: goals matter most
# Midfielders: balanced
# Defenders: starts matter more than goals
# Goalkeeper: starts are everything
goal_weight = {"F": 0.5, "O": 0.35, "D": 0.15, "K": 0.05}.get(position, 0.25)
assist_weight = {"F": 0.2, "O": 0.3, "D": 0.15, "K": 0.0}.get(position, 0.15)
start_weight = {"F": 0.3, "O": 0.35, "D": 0.7, "K": 0.95}.get(position, 0.5)
# Normalize each component to 0-1
# Goals: 5+ goals in recent matches = max
goal_factor = min(goals / 5.0, 1.0) if goals > 0 else 0.0
# Assists: 4+ assists = max
assist_factor = min(assists / 4.0, 1.0) if assists > 0 else 0.0
# Starts: 80%+ start rate = max regular
start_rate = starts / max(matches, 1)
start_factor = min(start_rate / 0.8, 1.0)
rating = (goal_factor * goal_weight +
assist_factor * assist_weight +
start_factor * start_weight)
return round(min(rating, 1.0), 4)
def analyze(self, team_data: Optional[Dict[str, Any]]) -> SidelinedImpact:
"""
Analyze sidelined data for a single team using DB-backed stats.
Args:
team_data: dict with 'players' list and 'totalSidelined' count.
Returns:
SidelinedImpact with calculated impact score and breakdown.
"""
if not team_data or not isinstance(team_data, dict):
return SidelinedImpact()
players = team_data.get("players", [])
if not players:
return SidelinedImpact(
total_sidelined=team_data.get("totalSidelined", 0)
)
# Collect player IDs for batch DB query
player_ids = [p.get("playerId", "") for p in players if p.get("playerId")]
# Batch fetch DB stats (single query, not N+1)
db_stats = self._fetch_player_stats(player_ids) if player_ids else {}
total_impact = 0.0
position_counts: Dict[str, int] = {}
player_details: List[PlayerImpactDetail] = []
details: List[str] = []
has_gk_missing = False
key_players_count = 0
for player in players:
if not isinstance(player, dict):
continue
pos = player.get("positionShort", "O")
name = player.get("playerName", "Unknown")
pid = player.get("playerId", "")
matches_missed = self._safe_int(player.get("matchesMissed", 0), 0)
player_type = player.get("type", "other")
mackolik_avg = self._safe_float(player.get("average", 0), 0.0)
position_counts[pos] = position_counts.get(pos, 0) + 1
if pos == "K":
has_gk_missing = True
# === Rating: DB first, mackolik fallback ===
p_db_stats = db_stats.get(pid, {})
if p_db_stats:
# Use real DB stats
db_rating = self._calculate_db_rating(p_db_stats, pos)
else:
# Fallback to mackolik average (normalized)
db_rating = min(mackolik_avg / self.max_rating, 1.0) if self.max_rating > 0 else 0.3
db_rating = max(db_rating, 0.15) # Minimum floor
# Key player check
is_key = db_rating >= 0.5 or (
self._safe_int(p_db_stats.get("goals", 0), 0) >= self.key_player_threshold
)
if is_key:
key_players_count += 1
# === Impact Calculation ===
pos_weight = self.position_weights.get(pos, 0.20)
# Rating factor: higher rated = bigger loss
rating_factor = max(db_rating, 0.15) # Even unknown players have minimum impact
# Adaptation: team has coped if player missed many matches
adapted = matches_missed >= self.adaptation_threshold
adapt_factor = self.adaptation_discount if adapted else 1.0
# Type factor
type_factor = 1.0 if player_type == "injury" else 0.8
player_impact = pos_weight * rating_factor * adapt_factor * type_factor
total_impact += player_impact
detail = PlayerImpactDetail(
player_id=pid,
player_name=name,
position=pos,
impact_score=round(player_impact, 4),
db_goals=p_db_stats.get("goals", 0),
db_assists=p_db_stats.get("assists", 0),
db_starts=p_db_stats.get("starts", 0),
db_rating=db_rating,
is_key_player=is_key,
adaptation_applied=adapted
)
player_details.append(detail)
db_info = f"G:{detail.db_goals} A:{detail.db_assists} S:{detail.db_starts}" if p_db_stats else "no DB data"
details.append(
f"{name} ({pos}, db_rating:{db_rating:.2f}, {db_info}) → impact:{player_impact:.3f}"
+ (" ⭐ KEY" if is_key else "")
+ (f" [adapted, {matches_missed} missed]" if adapted else "")
)
# GK penalty bonus
if has_gk_missing:
total_impact += self.goalkeeper_penalty
key_position_missing = has_gk_missing or any(v >= 2 for v in position_counts.values())
# Normalize to 0-1 range
normalization_cap = 1.5
normalized_impact = min(total_impact / normalization_cap, self.max_impact)
return SidelinedImpact(
total_sidelined=len(players),
impact_score=round(normalized_impact, 4),
key_position_missing=key_position_missing,
key_players_missing=key_players_count,
position_breakdown=position_counts,
player_details=player_details,
details=details
)
def analyze_match(self, sidelined_json: Optional[Dict[str, Any]]) -> Tuple[SidelinedImpact, SidelinedImpact]:
"""
Analyze sidelined data for both teams.
Returns:
(home_impact, away_impact)
"""
if not sidelined_json or not isinstance(sidelined_json, dict):
return SidelinedImpact(), SidelinedImpact()
home_impact = self.analyze(sidelined_json.get("homeTeam"))
away_impact = self.analyze(sidelined_json.get("awayTeam"))
return home_impact, away_impact
# Singleton
_analyzer: Optional[SidelinedAnalyzer] = None
def get_sidelined_analyzer() -> SidelinedAnalyzer:
global _analyzer
if _analyzer is None:
_analyzer = SidelinedAnalyzer()
return _analyzer
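The per-player impact formula and the team-level normalization in `SidelinedAnalyzer.analyze` can be illustrated without a database. The sketch below hard-codes the fallback values that `_load_config` uses when the config has no entry (the deployed config may differ) and takes the player rating as a given number instead of deriving it from DB stats. The function names and the sample players are hypothetical.

```python
from typing import Dict, List

# Fallback defaults taken from _load_config; a real config file may override them
POSITION_WEIGHTS = {"K": 0.35, "D": 0.20, "O": 0.25, "F": 0.30}
ADAPTATION_THRESHOLD = 10
ADAPTATION_DISCOUNT = 0.5
GOALKEEPER_PENALTY = 0.15
MAX_IMPACT = 0.85
NORMALIZATION_CAP = 1.5


def player_impact(position: str, rating: float, matches_missed: int, player_type: str) -> float:
    """position_weight x rating_factor x adaptation_factor x type_factor."""
    pos_weight = POSITION_WEIGHTS.get(position, 0.20)
    rating_factor = max(rating, 0.15)  # unknown players still count a little
    adapt_factor = ADAPTATION_DISCOUNT if matches_missed >= ADAPTATION_THRESHOLD else 1.0
    type_factor = 1.0 if player_type == "injury" else 0.8
    return pos_weight * rating_factor * adapt_factor * type_factor


def team_impact(players: List[Dict]) -> float:
    """Sum player impacts, add the goalkeeper penalty, then normalize and cap."""
    total = sum(
        player_impact(p["position"], p["rating"], p["matches_missed"], p["type"])
        for p in players
    )
    if any(p["position"] == "K" for p in players):
        total += GOALKEEPER_PENALTY
    return round(min(total / NORMALIZATION_CAP, MAX_IMPACT), 4)


# Made-up example: an injured regular forward and a long-suspended goalkeeper
sidelined = [
    {"position": "F", "rating": 0.80, "matches_missed": 2, "type": "injury"},
    {"position": "K", "rating": 0.95, "matches_missed": 12, "type": "suspension"},
]
print(team_impact(sidelined))
```

In this example the goalkeeper's own contribution is halved by the adaptation discount, yet the flat goalkeeper penalty is still added in full, which is how the original code behaves as well.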
+357
@@ -0,0 +1,357 @@
"""
Smart Bet Recommender
=====================
Recommends bets at several risk levels based on the predicted score.
Example: for Beşiktaş-Galatasaray the model predicts 3-1
→ LOW RISK: Over 1.5 (very likely to land)
→ MEDIUM RISK: MS 1 (home win) + Over 2.5 (moderately likely)
→ HIGH RISK: Over 3.5 or exact score 3-1 (unlikely, high payout)
Combinations as well:
- MS 1 + Over 1.5
- MS 1 + BTTS Yes (KG Var)
- Each team's goals > 0.5 (each team scores at least once)
"""
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple
from enum import Enum
class RiskLevel(Enum):
LOW = "LOW" # High probability, low odds (safe)
MEDIUM = "MEDIUM" # Medium probability, medium odds
HIGH = "HIGH" # Low probability, high payout
EXTREME = "EXTREME" # Very low probability, very high payout
@dataclass
class BetRecommendation:
"""A single bet recommendation."""
market: str # Market name (e.g. "MS 1", "2.5 Üst")
pick: str # Selection (e.g. "1", "OVER", "YES")
odds: float # Odds
probability: float # Model probability (0-1)
confidence: float # Confidence level (0-100)
risk_level: RiskLevel
def to_dict(self) -> dict:
return {
"market": self.market,
"pick": self.pick,
"odds": self.odds,
"probability": round(self.probability * 100, 1),
"confidence": round(self.confidence, 1),
"risk_level": self.risk_level.value
}
@dataclass
class MatchPredictionSet:
"""Full prediction set for one match."""
match_name: str
predicted_score: Tuple[int, int] # (home, away)
home_win_prob: float
draw_prob: float
away_win_prob: float
over_15_prob: float
over_25_prob: float
over_35_prob: float
btts_yes_prob: float
# Recommendations
low_risk_bets: List[BetRecommendation]
medium_risk_bets: List[BetRecommendation]
high_risk_bets: List[BetRecommendation]
extreme_risk_bets: List[BetRecommendation]
def to_dict(self) -> dict:
return {
"match_name": self.match_name,
"predicted_score": f"{self.predicted_score[0]}-{self.predicted_score[1]}",
"probs": {
"home_win": round(self.home_win_prob * 100, 1),
"draw": round(self.draw_prob * 100, 1),
"away_win": round(self.away_win_prob * 100, 1),
"over_15": round(self.over_15_prob * 100, 1),
"over_25": round(self.over_25_prob * 100, 1),
"over_35": round(self.over_35_prob * 100, 1),
"btts": round(self.btts_yes_prob * 100, 1)
},
"low_risk": [b.to_dict() for b in self.low_risk_bets],
"medium_risk": [b.to_dict() for b in self.medium_risk_bets],
"high_risk": [b.to_dict() for b in self.high_risk_bets],
"extreme_risk": [b.to_dict() for b in self.extreme_risk_bets]
}
class SmartBetRecommender:
"""
Smart Bet Recommendation System
Recommends bets at different risk levels based on the predicted score.
Logic:
1. LOW RISK: high-probability (>70%), low-odds bets
- Over 1.5
- Double Chance
- Favorite team to score
2. MEDIUM RISK: medium-probability (50-70%), medium-odds bets
- Match result (MS) on the favorite
- Over 2.5
- BTTS Yes (KG Var)
3. HIGH RISK: low-probability (30-50%), high-odds bets
- Over 3.5
- Score prediction
- Handicap
4. EXTREME RISK: very-low-probability (<30%), very-high-odds bets
- Exact score
- Long combinations
"""
# Probability thresholds
PROB_LOW_RISK = 0.70 # >= 70% probability
PROB_MEDIUM_RISK = 0.50 # 50-70% probability
PROB_HIGH_RISK = 0.30 # 30-50% probability
# < 30% = EXTREME
def __init__(self):
pass
def _determine_risk(self, probability: float) -> RiskLevel:
"""Determine the risk level from a probability."""
if probability >= self.PROB_LOW_RISK:
return RiskLevel.LOW
elif probability >= self.PROB_MEDIUM_RISK:
return RiskLevel.MEDIUM
elif probability >= self.PROB_HIGH_RISK:
return RiskLevel.HIGH
else:
return RiskLevel.EXTREME
def _get_favorite(self, home_prob: float, draw_prob: float, away_prob: float) -> Tuple[str, float]:
"""Return the favorite outcome and its probability."""
if home_prob >= draw_prob and home_prob >= away_prob:
return "1", home_prob
elif away_prob >= home_prob and away_prob >= draw_prob:
return "2", away_prob
else:
return "X", draw_prob
def _calculate_expected_goals(self, predicted_score: Tuple[int, int]) -> float:
"""Total goals implied by the predicted score."""
return predicted_score[0] + predicted_score[1]
def recommend(
self,
match_name: str,
predicted_score: Tuple[int, int],
probs: Dict[str, float],
odds: Dict[str, float]
) -> MatchPredictionSet:
"""
Build all bet recommendations for a match.
Args:
match_name: Match name
predicted_score: (home_goals, away_goals)
probs: {"home_win": 0.55, "draw": 0.25, "away_win": 0.20,
"over_15": 0.85, "over_25": 0.65, "over_35": 0.35,
"btts_yes": 0.55}
odds: {"1": 1.80, "X": 3.50, "2": 4.20,
"ou15_o": 1.25, "ou15_u": 3.80,
"ou25_o": 1.90, "ou25_u": 1.85,
"ou35_o": 3.20, "ou35_u": 1.30,
"btts_y": 1.75, "btts_n": 2.00}
Returns:
MatchPredictionSet with all recommendations
"""
home_prob = probs.get("home_win", 0.33)
draw_prob = probs.get("draw", 0.33)
away_prob = probs.get("away_win", 0.33)
over_15_prob = probs.get("over_15", 0.70)
over_25_prob = probs.get("over_25", 0.50)
over_35_prob = probs.get("over_35", 0.30)
btts_prob = probs.get("btts_yes", 0.50)
# Expected goals
expected_goals = self._calculate_expected_goals(predicted_score)
# Favorite
favorite, favorite_prob = self._get_favorite(home_prob, draw_prob, away_prob)
# Build the recommendations
low_risk = []
medium_risk = []
high_risk = []
extreme_risk = []
# ========== LOW RISK RECOMMENDATIONS ==========
# Over 1.5 (safest)
if over_15_prob >= self.PROB_LOW_RISK:
low_risk.append(BetRecommendation(
market="1.5 Üst/Alt",
pick="OVER",
odds=odds.get("ou15_o", 1.25),
probability=over_15_prob,
confidence=over_15_prob * 100,
risk_level=RiskLevel.LOW
))
# Double Chance
if home_prob > away_prob:
dc_prob = home_prob + draw_prob
if dc_prob >= self.PROB_LOW_RISK:
low_risk.append(BetRecommendation(
market="Double Chance",
pick="1X",
odds=odds.get("dc_1x", 1.30),
probability=dc_prob,
confidence=dc_prob * 100,
risk_level=RiskLevel.LOW
))
elif away_prob > home_prob:
dc_prob = away_prob + draw_prob
if dc_prob >= self.PROB_LOW_RISK:
low_risk.append(BetRecommendation(
market="Double Chance",
pick="X2",
odds=odds.get("dc_x2", 1.30),
probability=dc_prob,
confidence=dc_prob * 100,
risk_level=RiskLevel.LOW
))
# ========== MEDIUM RISK RECOMMENDATIONS ==========
# Match result (MS) on the favorite
if self.PROB_MEDIUM_RISK <= favorite_prob < self.PROB_LOW_RISK:
medium_risk.append(BetRecommendation(
market="Maç Sonucu",
pick=favorite,
odds=odds.get(favorite, 2.00),
probability=favorite_prob,
confidence=favorite_prob * 100,
risk_level=RiskLevel.MEDIUM
))
# Over 2.5
if self.PROB_MEDIUM_RISK <= over_25_prob < self.PROB_LOW_RISK:
medium_risk.append(BetRecommendation(
market="2.5 Üst/Alt",
pick="OVER",
odds=odds.get("ou25_o", 1.90),
probability=over_25_prob,
confidence=over_25_prob * 100,
risk_level=RiskLevel.MEDIUM
))
# BTTS Yes (KG Var)
if self.PROB_MEDIUM_RISK <= btts_prob < self.PROB_LOW_RISK:
medium_risk.append(BetRecommendation(
market="Karşılıklı Gol",
pick="YES",
odds=odds.get("btts_y", 1.75),
probability=btts_prob,
confidence=btts_prob * 100,
risk_level=RiskLevel.MEDIUM
))
# Match result (MS) + Over 2.5 combination
if favorite_prob >= 0.45 and over_25_prob >= 0.50:
combo_prob = favorite_prob * over_25_prob # Simple product; treats the two outcomes as independent
combo_odds = odds.get(favorite, 2.00) * odds.get("ou25_o", 1.90)
if combo_prob >= 0.30: # At least 30% probability
medium_risk.append(BetRecommendation(
market=f"MS {favorite} + 2.5 Üst",
pick=f"{favorite} & OVER",
odds=combo_odds,
probability=combo_prob,
confidence=combo_prob * 100,
risk_level=RiskLevel.MEDIUM
))
# ========== HIGH RISK RECOMMENDATIONS ==========
# Over 3.5
if self.PROB_HIGH_RISK <= over_35_prob < self.PROB_MEDIUM_RISK:
high_risk.append(BetRecommendation(
market="3.5 Üst/Alt",
pick="OVER",
odds=odds.get("ou35_o", 3.20),
probability=over_35_prob,
confidence=over_35_prob * 100,
risk_level=RiskLevel.HIGH
))
# Score prediction (for high-scoring matches)
if expected_goals >= 3.5:
score_str = f"{predicted_score[0]}-{predicted_score[1]}"
# Rough score-probability estimate (simple model)
score_prob = 0.15 if expected_goals <= 4 else 0.10
high_risk.append(BetRecommendation(
market="Tam Skor",
pick=score_str,
odds=8.0, # Estimated odds
probability=score_prob,
confidence=score_prob * 100,
risk_level=RiskLevel.HIGH
))
# Match result (MS) + Over 3.5
if favorite_prob >= 0.40 and over_35_prob >= 0.30:
combo_prob = favorite_prob * over_35_prob
combo_odds = odds.get(favorite, 2.00) * odds.get("ou35_o", 3.20)
high_risk.append(BetRecommendation(
market=f"MS {favorite} + 3.5 Üst",
pick=f"{favorite} & OVER",
odds=combo_odds,
probability=combo_prob,
confidence=combo_prob * 100,
risk_level=RiskLevel.HIGH
))
# ========== EXTREME RISK RECOMMENDATIONS ==========
# Long combinations
if favorite_prob >= 0.50 and btts_prob >= 0.50 and over_25_prob >= 0.60:
combo_prob = favorite_prob * btts_prob * over_25_prob
combo_odds = odds.get(favorite, 2.00) * odds.get("btts_y", 1.75) * odds.get("ou25_o", 1.90)
if combo_prob >= 0.15: # At least 15% probability
extreme_risk.append(BetRecommendation(
market=f"MS {favorite} + KG Var + 2.5 Üst",
pick=f"{favorite} & BTTS & OVER",
odds=combo_odds,
probability=combo_prob,
confidence=combo_prob * 100,
risk_level=RiskLevel.EXTREME
))
return MatchPredictionSet(
match_name=match_name,
predicted_score=predicted_score,
home_win_prob=home_prob,
draw_prob=draw_prob,
away_win_prob=away_prob,
over_15_prob=over_15_prob,
over_25_prob=over_25_prob,
over_35_prob=over_35_prob,
btts_yes_prob=btts_prob,
low_risk_bets=low_risk,
medium_risk_bets=medium_risk,
high_risk_bets=high_risk,
extreme_risk_bets=extreme_risk
)
# Singleton
_recommender = None
def get_smart_bet_recommender() -> SmartBetRecommender:
global _recommender
if _recommender is None:
_recommender = SmartBetRecommender()
return _recommender
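The probability bands and the combination arithmetic used by `SmartBetRecommender` can be checked with a short standalone sketch. It reuses the thresholds from the class (0.70, 0.50, 0.30) and, like `recommend`, multiplies probabilities and odds as if the legs were independent, which is a simplification since match result and goal totals are correlated. The function names `determine_risk` and `combine` are illustrative, and the sample numbers come from the `recommend` docstring.

```python
from enum import Enum
from typing import Tuple


class RiskLevel(Enum):
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"
    EXTREME = "EXTREME"


def determine_risk(probability: float) -> RiskLevel:
    """Map a probability to a band using the same cut-offs as the class constants."""
    if probability >= 0.70:
        return RiskLevel.LOW
    if probability >= 0.50:
        return RiskLevel.MEDIUM
    if probability >= 0.30:
        return RiskLevel.HIGH
    return RiskLevel.EXTREME


def combine(*legs: Tuple[float, float]) -> Tuple[float, float]:
    """Multiply (probability, odds) pairs; assumes the legs are independent."""
    prob, odds = 1.0, 1.0
    for leg_prob, leg_odds in legs:
        prob *= leg_prob
        odds *= leg_odds
    return prob, odds


# Home win at 0.55 / 1.80 combined with Over 2.5 at 0.65 / 1.90
combo_prob, combo_odds = combine((0.55, 1.80), (0.65, 1.90))
print(round(combo_prob, 4), round(combo_odds, 2), determine_risk(combo_prob).value)
```

With these inputs the combined probability lands in the 30-50% band, which the thresholds classify as HIGH, while `recommend` files the same "MS + 2.5 Üst" combination under its medium-risk list. Whether that is intentional is not clear from the source.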
+582
@@ -0,0 +1,582 @@
"""
Squad Analysis Engine - V9 Feature
Squad- and player-level analysis.
Metrics analyzed:
- Starting XI quality (scorer form, key players)
- Bench strength
- Impact of missing players
- Strength by position
- Distribution of goalscorers within the team
"""
import os
from typing import Dict, Optional, List, Tuple
from dataclasses import dataclass, field
from datetime import datetime
from collections import defaultdict
try:
import psycopg2
from psycopg2.extras import RealDictCursor
except ImportError:
psycopg2 = None
@dataclass
class PlayerForm:
"""Player form information."""
player_id: str
player_name: str
goals_last_5: int = 0
assists_last_5: int = 0
minutes_last_5: int = 0
cards_last_5: int = 0
is_key_player: bool = False # Goalscorer or regular player
@dataclass
class SquadAnalysis:
"""Team squad analysis."""
team_id: str
team_name: str = ""
# Starting XI info
starting_count: int = 0
sub_count: int = 0
total_squad: int = 0
# Position distribution
goalkeeper_count: int = 0
defender_count: int = 0
midfielder_count: int = 0
forward_count: int = 0
# Form metrics
total_goals_last_5: int = 0 # Goals scored by squad players in the last 5 matches
total_assists_last_5: int = 0
key_players_count: int = 0 # Number of goalscorers
key_player_missing: int = 0 # Missing goalscorers
# Quality metrics
avg_minutes_per_player: float = 0.0 # Average playing time
squad_experience: float = 0.0 # 0-1, experience playing with the team
rotation_rate: float = 0.0 # Squad rotation rate
@dataclass
class SquadFeatures:
"""Squad features for the model."""
# Home team features
home_starting_11: int = 11
home_sub_count: int = 7
home_total_squad: int = 18
home_goalkeepers: int = 1
home_defenders: int = 4
home_midfielders: int = 4
home_forwards: int = 2
home_goals_last_5: int = 0
home_assists_last_5: int = 0
home_key_players: int = 0
home_squad_experience: float = 0.5
# Away team features
away_starting_11: int = 11
away_sub_count: int = 7
away_total_squad: int = 18
away_goalkeepers: int = 1
away_defenders: int = 4
away_midfielders: int = 4
away_forwards: int = 2
away_goals_last_5: int = 0
away_assists_last_5: int = 0
away_key_players: int = 0
away_squad_experience: float = 0.5
# Comparison features
squad_strength_diff: float = 0.0 # + = home stronger
goals_form_diff: float = 0.0
key_players_diff: int = 0
def to_dict(self) -> Dict[str, float]:
return {
# Home
'home_starting_11': float(self.home_starting_11),
'home_sub_count': float(self.home_sub_count),
'home_total_squad': float(self.home_total_squad),
'home_goalkeepers': float(self.home_goalkeepers),
'home_defenders': float(self.home_defenders),
'home_midfielders': float(self.home_midfielders),
'home_forwards': float(self.home_forwards),
'home_goals_last_5': float(self.home_goals_last_5),
'home_assists_last_5': float(self.home_assists_last_5),
'home_key_players': float(self.home_key_players),
'home_squad_experience': self.home_squad_experience,
# Away
'away_starting_11': float(self.away_starting_11),
'away_sub_count': float(self.away_sub_count),
'away_total_squad': float(self.away_total_squad),
'away_goalkeepers': float(self.away_goalkeepers),
'away_defenders': float(self.away_defenders),
'away_midfielders': float(self.away_midfielders),
'away_forwards': float(self.away_forwards),
'away_goals_last_5': float(self.away_goals_last_5),
'away_assists_last_5': float(self.away_assists_last_5),
'away_key_players': float(self.away_key_players),
'away_squad_experience': self.away_squad_experience,
# Diffs
'squad_strength_diff': self.squad_strength_diff,
'goals_form_diff': self.goals_form_diff,
'key_players_diff': float(self.key_players_diff),
}
class SquadAnalysisEngine:
"""
Squad and player analysis engine.
For a match such as Beşiktaş-Galatasaray it computes:
- Goals/assists of the starting XI over the last 5 matches
- Key player detection (high-scoring players)
- Position distribution (4-3-3, 4-4-2, etc.)
- Bench quality
"""
# Position mapping
POSITION_MAP = {
'goalkeeper': 'GK',
'gk': 'GK',
'kaleci': 'GK',
'defender': 'DEF',
'def': 'DEF',
'defans': 'DEF',
'savunma': 'DEF',
'midfielder': 'MID',
'mid': 'MID',
'orta saha': 'MID',
'forward': 'FWD',
'fwd': 'FWD',
'forvet': 'FWD',
'striker': 'FWD',
}
def __init__(self):
self.conn = None
self._player_form_cache: Dict[str, PlayerForm] = {}
def _connect_db(self):
if psycopg2 is None:
return None
try:
from data.db import get_clean_dsn
self.conn = psycopg2.connect(get_clean_dsn())
return self.conn
except Exception as e:
print(f"[SquadEngine] DB connection failed: {e}")
return None
def get_conn(self):
if self.conn is None or self.conn.closed:
self._connect_db()
return self.conn
def _normalize_position(self, position: Optional[str]) -> str:
"""Normalize the position label."""
if not position:
return 'UNK'
pos_lower = position.lower().strip()
for key, val in self.POSITION_MAP.items():
if key in pos_lower:
return val
return 'UNK'
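def get_player_form(self, player_id: str, before_date_ms: Optional[int] = None) -> PlayerForm:
"""Compute the player's form over the last 5 matches.
NOTE: before_date_ms is accepted but not applied to the queries below, and results are cached per player_id only.
"""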
if player_id in self._player_form_cache:
return self._player_form_cache[player_id]
form = PlayerForm(player_id=player_id, player_name="")
conn = self.get_conn()
if conn is None:
return form
try:
with conn.cursor(cursor_factory=RealDictCursor) as cur:
# Fetch the player's name
cur.execute("SELECT name FROM players WHERE id = %s", (player_id,))
player_row = cur.fetchone()
if player_row:
form.player_name = player_row['name']
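# Goals in the last 5 matches ("last 5" means the 5 highest match_id values; assists_given is not used, assists are counted separately below)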
cur.execute("""
SELECT
COUNT(*) FILTER (WHERE event_type = 'goal' AND event_subtype NOT ILIKE '%%penaltı kaçırma%%') as goals,
COUNT(*) FILTER (WHERE event_type = 'goal' AND assist_player_id IS NOT NULL) as assists_given
FROM match_player_events
WHERE player_id = %s
AND match_id IN (
SELECT match_id FROM match_player_participation
WHERE player_id = %s
ORDER BY match_id DESC LIMIT 5
)
""", (player_id, player_id))
stats = cur.fetchone()
if stats:
form.goals_last_5 = stats['goals'] or 0
# Count assists (events where this player appears as assist_player_id)
cur.execute("""
SELECT COUNT(*) as assists
FROM match_player_events
WHERE assist_player_id = %s
AND match_id IN (
SELECT match_id FROM match_player_participation
WHERE player_id = %s
ORDER BY match_id DESC LIMIT 5
)
""", (player_id, player_id))
assist_row = cur.fetchone()
if assist_row:
form.assists_last_5 = assist_row['assists'] or 0
# Card count
cur.execute("""
SELECT COUNT(*) as cards
FROM match_player_events
WHERE player_id = %s AND event_type = 'card'
AND match_id IN (
SELECT match_id FROM match_player_participation
WHERE player_id = %s
ORDER BY match_id DESC LIMIT 5
)
""", (player_id, player_id))
card_row = cur.fetchone()
if card_row:
form.cards_last_5 = card_row['cards'] or 0
# Key player? 3+ goals; note the query counts all recorded goals, not just the last 10 matches
cur.execute("""
SELECT COUNT(*) as total_goals
FROM match_player_events
WHERE player_id = %s
AND event_type = 'goal'
AND event_subtype NOT ILIKE '%%penaltı kaçırma%%'
""", (player_id,))
total_row = cur.fetchone()
form.is_key_player = (total_row['total_goals'] or 0) >= 3
self._player_form_cache[player_id] = form
return form
except Exception as e:
import traceback
traceback.print_exc()
print(f"[SquadEngine] Error getting player form: {e}")
return form
def analyze_squad(self, match_id: str, team_id: str) -> SquadAnalysis:
"""Analyze the team's squad for the given match."""
analysis = SquadAnalysis(team_id=team_id)
conn = self.get_conn()
if conn is None:
return analysis
try:
with conn.cursor(cursor_factory=RealDictCursor) as cur:
# Takım adını al
cur.execute("SELECT name FROM teams WHERE id = %s", (team_id,))
team_row = cur.fetchone()
if team_row:
analysis.team_name = team_row['name']
# Maç kadrosunu al
cur.execute("""
SELECT player_id, position, is_starting
FROM match_player_participation
WHERE match_id = %s AND team_id = %s
""", (match_id, team_id))
players = cur.fetchall()
for p in players:
if p['is_starting']:
analysis.starting_count += 1
else:
analysis.sub_count += 1
pos = self._normalize_position(p['position'])
if pos == 'GK':
analysis.goalkeeper_count += 1
elif pos == 'DEF':
analysis.defender_count += 1
elif pos == 'MID':
analysis.midfielder_count += 1
elif pos == 'FWD':
analysis.forward_count += 1
# İlk 11'in formunu topluca hesapla
if p['is_starting']:
form = self.get_player_form(p['player_id'])
analysis.total_goals_last_5 += form.goals_last_5
analysis.total_assists_last_5 += form.assists_last_5
if form.is_key_player:
analysis.key_players_count += 1
analysis.total_squad = analysis.starting_count + analysis.sub_count
# Team experience (average number of starts per player for this team, across all players who ever started)
if analysis.starting_count > 0:
cur.execute("""
SELECT AVG(match_count) as avg_exp
FROM (
SELECT player_id, COUNT(*) as match_count
FROM match_player_participation
WHERE team_id = %s AND is_starting = true
GROUP BY player_id
) sub
""", (team_id,))
exp_row = cur.fetchone()
if exp_row and exp_row['avg_exp']:
# Normalize: 50+ matches = 1.0 (AVG comes back as Decimal, so cast to float first)
analysis.squad_experience = min(float(exp_row['avg_exp']) / 50, 1.0)
return analysis
except Exception as e:
print(f"[SquadEngine] Error analyzing squad: {e}")
return analysis
def analyze_squad_from_list(self, player_ids: List[str], team_id: str) -> SquadAnalysis:
"""
Run the squad analysis from an in-memory list of player IDs.
Used for live matches that are not yet in the DB.
"""
analysis = SquadAnalysis(team_id=team_id)
# Default: the list holds the starting 11 (that is what callers usually pass).
# Empty list: return the empty analysis.
if not player_ids:
return analysis
# Assumption: the list coming from the Mackolik API is ordered, and the caller
# passes an explicit starting-11 list, so every entry is treated as a starter.
analysis.starting_count = len(player_ids)
analysis.total_squad = len(player_ids) # Substitutes are usually unknown unless passed separately
# Positions are not part of the list, so look up each player's most frequent
# position for this team in the DB.
conn = self.get_conn()
if conn is None:
return analysis
try:
with conn.cursor(cursor_factory=RealDictCursor) as cur:
# Calculate stats for these specific players
for pid in player_ids:
# Get Form
form = self.get_player_form(pid)
analysis.total_goals_last_5 += form.goals_last_5
analysis.total_assists_last_5 += form.assists_last_5
if form.is_key_player:
analysis.key_players_count += 1
# Get Position/Exp history attempt
cur.execute("""
SELECT position, COUNT(*) as match_count
FROM match_player_participation
WHERE player_id = %s AND team_id = %s
GROUP BY position
ORDER BY match_count DESC LIMIT 1
""", (pid, team_id))
row = cur.fetchone()
if row:
pos = self._normalize_position(row.get('position', 'UNK'))
if pos == 'GK': analysis.goalkeeper_count += 1
elif pos == 'DEF': analysis.defender_count += 1
elif pos == 'MID': analysis.midfielder_count += 1
elif pos == 'FWD': analysis.forward_count += 1
# Experience contribution
exp = min(row['match_count'] / 50.0, 1.0)
analysis.squad_experience += exp
# Average experience
if analysis.starting_count > 0:
analysis.squad_experience /= analysis.starting_count
except Exception as e:
print(f"[SquadEngine] Live analyze error: {e}")
return analysis
def get_features(
self,
match_id: str,
home_team_id: str,
away_team_id: str
) -> Dict[str, float]:
"""
Compute squad features for a match.
Args:
match_id: Match ID
home_team_id: Home team ID
away_team_id: Away team ID
Returns:
Squad features as a dict
"""
features = SquadFeatures()
# Ev sahibi analizi
home = self.analyze_squad(match_id, home_team_id)
features.home_starting_11 = home.starting_count
features.home_sub_count = home.sub_count
features.home_total_squad = home.total_squad
features.home_goalkeepers = home.goalkeeper_count
features.home_defenders = home.defender_count
features.home_midfielders = home.midfielder_count
features.home_forwards = home.forward_count
features.home_goals_last_5 = home.total_goals_last_5
features.home_assists_last_5 = home.total_assists_last_5
features.home_key_players = home.key_players_count
features.home_squad_experience = home.squad_experience
# Deplasman analizi
away = self.analyze_squad(match_id, away_team_id)
features.away_starting_11 = away.starting_count
features.away_sub_count = away.sub_count
features.away_total_squad = away.total_squad
features.away_goalkeepers = away.goalkeeper_count
features.away_defenders = away.defender_count
features.away_midfielders = away.midfielder_count
features.away_forwards = away.forward_count
features.away_goals_last_5 = away.total_goals_last_5
features.away_assists_last_5 = away.total_assists_last_5
features.away_key_players = away.key_players_count
features.away_squad_experience = away.squad_experience
# Comparison features (strength = 2*goals + assists + 3*key players + 10*experience)
home_strength = (
home.total_goals_last_5 * 2 +
home.total_assists_last_5 +
home.key_players_count * 3 +
home.squad_experience * 10
)
away_strength = (
away.total_goals_last_5 * 2 +
away.total_assists_last_5 +
away.key_players_count * 3 +
away.squad_experience * 10
)
features.squad_strength_diff = home_strength - away_strength
features.goals_form_diff = home.total_goals_last_5 - away.total_goals_last_5
features.key_players_diff = home.key_players_count - away.key_players_count
return features.to_dict()
def get_features_without_match(
self,
home_team_id: str,
away_team_id: str
) -> Dict[str, float]:
"""
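Compute team-level features without a match ID.
Uses the squad from each team's most recent match as the reference.
Position counts and squad_strength_diff are not populated on this path.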
"""
features = SquadFeatures()
conn = self.get_conn()
if conn is None:
return features.to_dict()
try:
with conn.cursor(cursor_factory=RealDictCursor) as cur:
for team_id, prefix in [(home_team_id, 'home'), (away_team_id, 'away')]:
# Son maçı bul
cur.execute("""
SELECT mpp.match_id
FROM match_player_participation mpp
JOIN matches m ON mpp.match_id = m.id
WHERE mpp.team_id = %s
ORDER BY m.mst_utc DESC
LIMIT 1
""", (team_id,))
row = cur.fetchone()
if row:
analysis = self.analyze_squad(row['match_id'], team_id)
if prefix == 'home':
features.home_starting_11 = analysis.starting_count
features.home_sub_count = analysis.sub_count
features.home_total_squad = analysis.total_squad
features.home_goals_last_5 = analysis.total_goals_last_5
features.home_assists_last_5 = analysis.total_assists_last_5
features.home_key_players = analysis.key_players_count
features.home_squad_experience = analysis.squad_experience
else:
features.away_starting_11 = analysis.starting_count
features.away_sub_count = analysis.sub_count
features.away_total_squad = analysis.total_squad
features.away_goals_last_5 = analysis.total_goals_last_5
features.away_assists_last_5 = analysis.total_assists_last_5
features.away_key_players = analysis.key_players_count
features.away_squad_experience = analysis.squad_experience
# Karşılaştırma
features.goals_form_diff = features.home_goals_last_5 - features.away_goals_last_5
features.key_players_diff = features.home_key_players - features.away_key_players
return features.to_dict()
except Exception as e:
print(f"[SquadEngine] Error: {e}")
return features.to_dict()
# Singleton instance
_engine: Optional[SquadAnalysisEngine] = None
def get_squad_analysis_engine() -> SquadAnalysisEngine:
"""Return the singleton squad analysis engine instance."""
global _engine
if _engine is None:
_engine = SquadAnalysisEngine()
return _engine
if __name__ == "__main__":
# Test
engine = get_squad_analysis_engine()
print("\n🧪 Squad Analysis Engine Test")
print("=" * 50)
# Test with known team IDs (Galatasaray, Fenerbahce)
features = engine.get_features_without_match(
home_team_id="test_gs",
away_team_id="test_fb"
)
print("\n📊 Features:")
for key, value in features.items():
print(f" {key}: {value:.2f}")
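The comparison features in `get_features` reduce each squad to one weighted score before taking the difference. A minimal standalone sketch of that arithmetic follows; `SquadSummary` and `squad_strength` are hypothetical names introduced only for illustration, and the input numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class SquadSummary:
    """Hypothetical stand-in for the SquadAnalysis fields that get_features reads."""
    total_goals_last_5: int = 0
    total_assists_last_5: int = 0
    key_players_count: int = 0
    squad_experience: float = 0.0  # normalized to 0..1


def squad_strength(s: SquadSummary) -> float:
    # Same weights as get_features: goals x2, assists x1, key players x3, experience x10.
    return (
        s.total_goals_last_5 * 2
        + s.total_assists_last_5
        + s.key_players_count * 3
        + s.squad_experience * 10
    )


# Illustrative inputs only (not real data).
home = SquadSummary(total_goals_last_5=8, total_assists_last_5=5,
                    key_players_count=2, squad_experience=0.5)
away = SquadSummary(total_goals_last_5=5, total_assists_last_5=4,
                    key_players_count=1, squad_experience=1.0)

squad_strength_diff = squad_strength(home) - squad_strength(away)
print(squad_strength(home), squad_strength(away), squad_strength_diff)
# prints: 32.0 27.0 5.0
```

Because experience is capped at 1.0, it contributes at most 10 points, which is the same as five goals or a little over three key players.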
+194
@@ -0,0 +1,194 @@
"""
Team Stats Engine
Analyzes teams' playing-style statistics.
Uses possession, shot and corner data from rows in the football_team_stats table.
"""
import os
import sys
import psycopg2
from typing import Dict
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from data.db import get_clean_dsn
class TeamStatsEngine:
"""
Feature engine for team statistics.
Metrics analyzed:
- Average possession
- Average shots on target
- Average corners
- Shot-to-goal conversion rate (a rough xG-like proxy)
- Defensive strength (listed here, but get_features returns no defensive feature)
"""
def __init__(self):
self.conn = None
def get_conn(self):
if self.conn is None or self.conn.closed:
self.conn = psycopg2.connect(get_clean_dsn())
return self.conn
def get_features(self, team_id: str, before_date: int,
limit: int = 10, max_days: int = 180) -> Dict[str, float]:
"""
Compute the team's playing-style features.
Args:
team_id: Team ID
before_date: Only consider matches before this date (ms timestamp)
limit: Number of matches to analyze
max_days: Maximum number of days to look back
Returns:
Dict: Team stats features
"""
if not team_id or len(team_id) < 5:
return self._default_features()
try:
conn = self.get_conn()
cur = conn.cursor()
min_date = before_date - (max_days * 24 * 60 * 60 * 1000)
# Fetch the statistics from this team's last N matches
cur.execute("""
SELECT
mts.possession_percentage,
mts.shots_on_target,
mts.shots_off_target,
mts.total_shots,
mts.corners,
mts.fouls,
m.score_home,
m.score_away,
m.home_team_id
FROM football_team_stats mts
JOIN matches m ON mts.match_id = m.id
WHERE mts.team_id = %s
AND m.mst_utc < %s
AND m.mst_utc > %s
AND m.score_home IS NOT NULL
AND m.sport = 'football'
ORDER BY m.mst_utc DESC
LIMIT %s
""", (team_id, before_date, min_date, limit))
stats = cur.fetchall()
if not stats:
return self._default_features()
# İstatistikleri hesapla
total_matches = len(stats)
possession_sum = 0
shots_on_target_sum = 0
shots_total_sum = 0
corners_sum = 0
fouls_sum = 0
goals_scored = 0
valid_possession_count = 0
for stat in stats:
poss, sot, soff, total_shots, corners, fouls, sh, sa, home_id = stat
if poss and poss > 0:
possession_sum += poss
valid_possession_count += 1
if sot:
shots_on_target_sum += sot
if total_shots:
shots_total_sum += total_shots
if corners:
corners_sum += corners
if fouls:
fouls_sum += fouls
# Goals scored by this team (home or away side of the scoreline; a NULL score counts as 0)
is_home = (home_id == team_id)
goals_scored += (sh if is_home else sa) or 0
avg_possession = possession_sum / valid_possession_count if valid_possession_count > 0 else 50.0
avg_shots_on_target = shots_on_target_sum / total_matches if total_matches > 0 else 3.0
avg_shots_total = shots_total_sum / total_matches if total_matches > 0 else 10.0
avg_corners = corners_sum / total_matches if total_matches > 0 else 4.0
avg_fouls = fouls_sum / total_matches if total_matches > 0 else 12.0
# Shot conversion rate (xG benzeri)
shot_conversion = goals_scored / shots_total_sum if shots_total_sum > 0 else 0.1
# Shot accuracy
shot_accuracy = shots_on_target_sum / shots_total_sum if shots_total_sum > 0 else 0.35
return {
'avg_possession': avg_possession / 100, # Normalize to 0-1
'avg_shots_on_target': avg_shots_on_target,
'avg_shots_total': avg_shots_total,
'avg_corners': avg_corners,
'avg_fouls': avg_fouls,
'shot_conversion_rate': shot_conversion,
'shot_accuracy': shot_accuracy,
'attacking_intensity': (avg_shots_total + avg_corners) / 2
}
except Exception as e:
print(f"[TeamStatsEngine] Error: {e}")
return self._default_features()
def _default_features(self) -> Dict[str, float]:
return {
'avg_possession': 0.50,
'avg_shots_on_target': 3.5,
'avg_shots_total': 11.0,
'avg_corners': 4.5,
'avg_fouls': 12.0,
'shot_conversion_rate': 0.10,
'shot_accuracy': 0.35,
'attacking_intensity': 7.5
}
# Singleton
_engine = None
def get_team_stats_engine() -> TeamStatsEngine:
global _engine
if _engine is None:
_engine = TeamStatsEngine()
return _engine
if __name__ == "__main__":
engine = get_team_stats_engine()
print("\n🧪 Team Stats Engine Test")
print("=" * 50)
# Pick a sample team ID for the test (same table that get_features queries)
conn = engine.get_conn()
cur = conn.cursor()
cur.execute("""
SELECT DISTINCT mts.team_id, t.name
FROM football_team_stats mts
JOIN teams t ON mts.team_id = t.id
LIMIT 1
""")
result = cur.fetchone()
if result:
team_id, team_name = result
print(f"Test Takımı: {team_name}")
import time
features = engine.get_features(team_id, int(time.time() * 1000))
print(f"\n📊 Feature'lar:")
for k, v in features.items():
print(f" {k}: {v:.3f}")
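The aggregation inside `TeamStatsEngine.get_features` can be exercised without a database. The sketch below repeats the same arithmetic on in-memory rows. The `summarize` helper and its row layout are assumptions made for illustration, not part of the engine, and the sample rows are invented.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical row layout: (possession_pct, shots_on_target, total_shots, corners, goals_scored)
Row = Tuple[Optional[float], Optional[int], Optional[int], Optional[int], int]


def summarize(rows: List[Row]) -> Dict[str, float]:
    n = len(rows)
    if n == 0:
        raise ValueError("no rows to summarize")

    # Possession is averaged only over rows that actually report it.
    valid_poss = [r[0] for r in rows if r[0]]
    shots_on_target = sum(r[1] or 0 for r in rows)
    total_shots = sum(r[2] or 0 for r in rows)
    corners = sum(r[3] or 0 for r in rows)
    goals = sum(r[4] for r in rows)

    avg_possession = sum(valid_poss) / len(valid_poss) if valid_poss else 50.0
    return {
        "avg_possession": avg_possession / 100,  # normalized to 0..1
        "shot_conversion_rate": goals / total_shots if total_shots else 0.10,
        "shot_accuracy": shots_on_target / total_shots if total_shots else 0.35,
        "attacking_intensity": (total_shots / n + corners / n) / 2,
    }


# Illustrative rows only: the second match has no possession figure.
rows = [(55.0, 5, 12, 6, 2), (None, 3, 8, 4, 1)]
features = summarize(rows)
print(features)
```

Note the asymmetry this makes visible: possession divides by the number of rows that report it, while shots and corners divide by all matches, so missing shot data pulls those averages toward zero.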
+419
@@ -0,0 +1,419 @@
"""
Upset Engine - Giant-Killer Detection System
Detects Galatasaray-Liverpool style upset matches for the V9 model.
Factors:
1. Atmosphere (European night, crowd pressure)
2. Motivation asymmetry (relegation fight vs. champion)
3. Fatigue (fixture congestion, travel)
4. Historical upset pattern
"""
import os
import sys
from typing import Dict, Any, Optional, Tuple
from dataclasses import dataclass, field
# Add parent directory to path for imports
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
try:
import psycopg2
from psycopg2.extras import RealDictCursor
except ImportError:
psycopg2 = None
@dataclass
class UpsetFactors:
"""Factors that influence upset potential."""
atmosphere_score: float = 0.0 # Atmosphere effect (0-1)
motivation_score: float = 0.0 # Motivation asymmetry (0-1)
fatigue_score: float = 0.0 # Fatigue gap (0-1)
historical_upset_rate: float = 0.0 # Historical upset rate (0-1)
total_upset_potential: float = 0.0 # Combined upset potential (0-1)
reasoning: list = field(default_factory=list)
class UpsetEngine:
"""
Flags matches in which the favorite is at risk of losing.
Aims to catch Galatasaray-Liverpool style upsets.
"""
# High-atmosphere stadiums (manually defined; could also be computed)
HIGH_ATMOSPHERE_TEAMS = {
# Turkey
"galatasaray", "fenerbahce", "besiktas", "trabzonspor",
# England
"liverpool", "newcastle", "leeds",
# Germany
"dortmund", "union berlin",
# Greece
"olympiacos", "panathinaikos", "aek athens",
# Argentina
"boca juniors", "river plate",
# Other
"celtic", "rangers", "red star belgrade"
}
# European competitions (high motivation)
EUROPEAN_COMPETITIONS = {
"şampiyonlar ligi", "champions league", "uefa champions league",
"avrupa ligi", "europa league", "uefa europa league",
"konferans ligi", "conference league", "uefa conference league"
}
def __init__(self):
self.conn = None
self._connect_db()
def _connect_db(self):
"""Veritabanına bağlan"""
if psycopg2 is None:
return
try:
from data.db import get_clean_dsn
self.conn = psycopg2.connect(get_clean_dsn())
except Exception as e:
print(f"[UpsetEngine] DB connection failed: {e}")
self.conn = None
def _get_conn(self):
"""Bağlantıyı kontrol et ve döndür"""
if self.conn is None or self.conn.closed:
self._connect_db()
return self.conn
def calculate_atmosphere_score(
self,
home_team_name: str,
league_name: str,
is_cup_match: bool = False
) -> Tuple[float, list]:
"""
Compute the atmosphere score.
High-atmosphere stadiums increase upset potential.
"""
score = 0.0
reasons = []
# Yüksek atmosferli takım mı?
home_lower = home_team_name.lower()
for team in self.HIGH_ATMOSPHERE_TEAMS:
if team in home_lower:
score += 0.25
reasons.append(f"🔥 {home_team_name} yüksek atmosferli stadyum")
break
# Avrupa kupası mı?
league_lower = league_name.lower()
for comp in self.EUROPEAN_COMPETITIONS:
if comp in league_lower:
score += 0.20
reasons.append("🌟 Avrupa gecesi - ekstra motivasyon")
break
# Kupa maçı mı? (tek maç eliminasyon)
if is_cup_match:
score += 0.10
reasons.append("🏆 Kupa maçı - her şey olabilir")
return min(score, 1.0), reasons
def calculate_motivation_score(
self,
home_position: int,
away_position: int,
home_points_to_safety: Optional[int] = None,
away_already_champion: bool = False,
total_teams: int = 20
) -> Tuple[float, list]:
"""
Compute the motivation asymmetry.
Captures the extra motivation of a lower-ranked home side against a higher-ranked visitor.
"""
score = 0.0
reasons = []
# Position gap
position_diff = 0
if away_position is not None and home_position is not None:
position_diff = away_position - home_position # Negative = away side is ranked higher
# Relegation zone vs. top of the table (strongest upset factor)
relegation_zone = total_teams - 2 # First position in the bottom 3 (18 of 20), matching the >= checks below
if home_position is not None and away_position is not None:
if home_position >= relegation_zone and away_position <= 3:
score += 0.30
reasons.append("⚔️ Hayatta kalma savaşı vs şampiyonluk adayı")
elif home_position >= relegation_zone:
score += 0.15
reasons.append("🔥 Ev sahibi küme düşme hattında - ekstra motivasyon")
elif home_position is not None and home_position >= relegation_zone:
score += 0.15
reasons.append("🔥 Ev sahibi küme düşme hattında - ekstra motivasyon")
# Deplasman takımı zaten şampiyon mu?
if away_already_champion:
score += 0.20
reasons.append("😴 Deplasman takımı zaten şampiyon - motivasyon düşük")
# Büyük pozisyon farkı (underdog evinde)
if position_diff < -10:
score += 0.15
reasons.append(f"📊 {abs(position_diff)} sıra fark - büyük maç heyecanı")
elif position_diff < -5:
score += 0.08
return min(score, 1.0), reasons
def calculate_fatigue_score(
self,
home_matches_last_14d: int = 0,
away_matches_last_14d: int = 0,
home_days_rest: int = 7,
away_days_rest: int = 7,
away_travel_km: float = 0
) -> Tuple[float, list]:
"""
Compute the fatigue gap.
A tired away side means higher upset potential.
"""
score = 0.0
reasons = []
# Maç yoğunluğu farkı
match_diff = away_matches_last_14d - home_matches_last_14d
if match_diff >= 3:
score += 0.20
reasons.append(f"🏃 Deplasman {match_diff} maç daha fazla oynamış")
elif match_diff >= 2:
score += 0.10
# Dinlenme süresi farkı
rest_diff = home_days_rest - away_days_rest
if rest_diff >= 4:
score += 0.15
reasons.append(f"💤 Ev sahibi {rest_diff} gün daha fazla dinlenmiş")
elif rest_diff >= 2:
score += 0.08
# Uzun deplasman
if away_travel_km > 3000:
score += 0.15
reasons.append(f"✈️ Uzun deplasman ({int(away_travel_km)} km)")
elif away_travel_km > 1500:
score += 0.08
return min(score, 1.0), reasons
def get_historical_upset_rate(
self,
home_team_id: str,
before_date_ms: int,
lookback_matches: int = 20
) -> Tuple[float, list]:
"""
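Compute a historical rate from the home team's recent home results.
The query measures the plain home win rate over the last lookback_matches home games; it does not filter for higher-ranked opponents.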
"""
reasons = []
conn = self._get_conn()
if conn is None:
return 0.0, reasons
try:
cursor = conn.cursor(cursor_factory=RealDictCursor)
# Completed matches the team played at home (no ranking filter is applied)
query = """
WITH home_matches AS (
SELECT
m.id,
m.score_home,
m.score_away,
m.home_team_id,
m.away_team_id
FROM matches m
WHERE m.home_team_id = %s
AND m.mst_utc < %s
AND m.score_home IS NOT NULL
AND m.score_away IS NOT NULL
ORDER BY m.mst_utc DESC
LIMIT %s
)
SELECT
COUNT(*) as total,
SUM(CASE WHEN score_home > score_away THEN 1 ELSE 0 END) as wins
FROM home_matches
"""
cursor.execute(query, (home_team_id, before_date_ms, lookback_matches))
result = cursor.fetchone()
if result and result['total'] > 0:
win_rate = result['wins'] / result['total']
# Ev sahibi kazanma oranı yüksekse, upset potansiyeli de yüksek
if win_rate > 0.5:
rate = min((win_rate - 0.4) * 0.5, 0.3)
reasons.append(f"📈 Güçlü ev sahibi performansı (%{int(win_rate*100)} kazanma)")
return rate, reasons
return 0.0, reasons
except Exception as e:
print(f"[UpsetEngine] Historical query error: {e}")
return 0.0, reasons
def calculate_upset_potential(
self,
home_team_name: str,
home_team_id: str,
away_team_name: str,
league_name: str,
home_position: int,
away_position: int,
match_date_ms: int,
is_cup_match: bool = False,
home_matches_last_14d: int = 2,
away_matches_last_14d: int = 2,
home_days_rest: int = 7,
away_days_rest: int = 7,
away_travel_km: float = 0,
total_teams: int = 20
) -> UpsetFactors:
"""
Combine all factors into the upset potential.
Returns:
UpsetFactors: all factor scores and the combined score
"""
factors = UpsetFactors()
all_reasons = []
# 1. Atmosfer
atm_score, atm_reasons = self.calculate_atmosphere_score(
home_team_name, league_name, is_cup_match
)
factors.atmosphere_score = atm_score
all_reasons.extend(atm_reasons)
# 2. Motivasyon
mot_score, mot_reasons = self.calculate_motivation_score(
home_position, away_position,
total_teams=total_teams
)
factors.motivation_score = mot_score
all_reasons.extend(mot_reasons)
# 3. Yorgunluk
fat_score, fat_reasons = self.calculate_fatigue_score(
home_matches_last_14d, away_matches_last_14d,
home_days_rest, away_days_rest,
away_travel_km
)
factors.fatigue_score = fat_score
all_reasons.extend(fat_reasons)
# 4. Tarihsel (sadece DB varsa)
hist_score, hist_reasons = self.get_historical_upset_rate(
home_team_id, match_date_ms
)
factors.historical_upset_rate = hist_score
all_reasons.extend(hist_reasons)
# Combined score (weighted sum; the weights add up to 1.0)
factors.total_upset_potential = min(
factors.atmosphere_score * 0.25 +
factors.motivation_score * 0.35 +
factors.fatigue_score * 0.25 +
factors.historical_upset_rate * 0.15,
1.0
)
factors.reasoning = all_reasons
return factors
def get_features(
self,
home_team_name: str,
home_team_id: str,
away_team_name: str,
league_name: str,
home_position: int,
away_position: int,
match_date_ms: int,
**kwargs
) -> Dict[str, float]:
"""
Return the feature dict for the model.
Used in both training and inference.
"""
factors = self.calculate_upset_potential(
home_team_name=home_team_name,
home_team_id=home_team_id,
away_team_name=away_team_name,
league_name=league_name,
home_position=home_position,
away_position=away_position,
match_date_ms=match_date_ms,
**kwargs
)
return {
"upset_atmosphere": factors.atmosphere_score,
"upset_motivation": factors.motivation_score,
"upset_fatigue": factors.fatigue_score,
"upset_historical": factors.historical_upset_rate,
"upset_potential": factors.total_upset_potential,
}
# Singleton instance
_engine_instance = None
def get_upset_engine() -> UpsetEngine:
"""Return the engine via the singleton pattern."""
global _engine_instance
if _engine_instance is None:
_engine_instance = UpsetEngine()
return _engine_instance
# Test
if __name__ == "__main__":
engine = get_upset_engine()
# Galatasaray vs Liverpool örneği
factors = engine.calculate_upset_potential(
home_team_name="Galatasaray",
home_team_id="test-gs-id",
away_team_name="Liverpool",
league_name="UEFA Champions League",
home_position=12,
away_position=1,
match_date_ms=1700000000000,
is_cup_match=False,
away_matches_last_14d=5,
home_matches_last_14d=2,
away_days_rest=3,
home_days_rest=7,
away_travel_km=2800,
total_teams=20
)
print("=" * 60)
print("GALATASARAY vs LIVERPOOL - UPSET ANALİZİ")
print("=" * 60)
print(f"🏟️ Atmosfer Skoru: {factors.atmosphere_score:.2f}")
print(f"💪 Motivasyon Skoru: {factors.motivation_score:.2f}")
print(f"😓 Yorgunluk Skoru: {factors.fatigue_score:.2f}")
print(f"📊 Tarihsel Skor: {factors.historical_upset_rate:.2f}")
print(f"\n🎯 TOPLAM UPSET POTANSİYELİ: {factors.total_upset_potential:.2f}")
print("\n📝 Sebepler:")
for reason in factors.reasoning:
print(f" {reason}")
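The Galatasaray vs Liverpool test can be followed by hand. The sketch below reproduces only the final weighted sum from `calculate_upset_potential`, using the partial scores the three rule-based methods return for those inputs. It assumes the historical factor is 0.0 (for example when the DB is unreachable or the test team ID has no matches); `combine` is a hypothetical helper name.

```python
def combine(atmosphere: float, motivation: float, fatigue: float, historical: float) -> float:
    # Same weights as calculate_upset_potential; the result is capped at 1.0.
    total = (
        atmosphere * 0.25
        + motivation * 0.35
        + fatigue * 0.25
        + historical * 0.15
    )
    return min(total, 1.0)


# Partial scores for the example inputs:
atmosphere = 0.25 + 0.20      # high-atmosphere home team + European competition
motivation = 0.15             # position gap of 11 places (away side ranked higher)
fatigue = 0.20 + 0.15 + 0.08  # 3 more matches, 4 fewer rest days, 2800 km of travel
historical = 0.0              # assumption: no usable history for the test team ID

potential = combine(atmosphere, motivation, fatigue, historical)
print(f"{potential:.2f}")
```

Even with three of the four factors firing, the combined score stays under 0.30, because each partial score is itself a small additive total and is then scaled again by its weight.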
+507
@@ -0,0 +1,507 @@
"""
Upset Engine v2 - Upset Detection Extended with GLM-5 Findings
====================================================================
Newly added factors (from the GLM-5 analysis):
1. MARGIN_ANALIZI - Bookmaker margin > 18% = upset risk
2. FAVORI_ORAN_TUZAGI - Favorite odds of 1.40-1.60 show the highest upset rate
3. HAKEM_SURPRIZ_ORANI - Home loss rate in the referee's past matches
4. FORM_FARKI_TUZAGI - Form gap > 40 = the "looks too good" favorite trap
Original factors:
- Atmosphere (European night, crowd pressure)
- Motivation asymmetry (relegation fight vs. champion)
- Fatigue (fixture congestion, travel)
- Historical upset pattern
"""
from typing import Dict, Any, Optional, Tuple, List
from dataclasses import dataclass, field
try:
import psycopg2
from psycopg2.extras import RealDictCursor
except ImportError:
psycopg2 = None
@dataclass
class UpsetFactorsV2:
"""Factors that influence upset potential - v2."""
# Original factors
atmosphere_score: float = 0.0
motivation_score: float = 0.0
fatigue_score: float = 0.0
historical_upset_rate: float = 0.0
# NEW FACTORS (GLM-5)
margin_score: float = 0.0 # Bookmaker margin analysis
favorite_odds_trap: float = 0.0 # Favorite odds trap
referee_upset_score: float = 0.0 # Referee upset rate
form_trap_score: float = 0.0 # Form gap trap
# Total
total_upset_potential: float = 0.0
reasoning: List[str] = field(default_factory=list)
# NEW: upset score (0-100)
upset_score: int = 0
upset_level: str = "LOW" # LOW, MEDIUM, HIGH, EXTREME
class UpsetEngineV2:
"""
Flags matches in which the favorite is at risk of losing.
v2: adds new factors derived from the GLM-5 analyses.
"""
# Yüksek atmosferli stadyumlar
HIGH_ATMOSPHERE_TEAMS = {
"galatasaray", "fenerbahce", "besiktas", "trabzonspor",
"liverpool", "newcastle", "leeds",
"dortmund", "union berlin",
"olympiacos", "panathinaikos", "aek athens",
"boca juniors", "river plate",
"celtic", "rangers", "red star belgrade"
}
EUROPEAN_COMPETITIONS = {
"şampiyonlar ligi", "champions league", "uefa champions league",
"avrupa ligi", "europa league", "uefa europa league",
"konferans ligi", "conference league", "uefa conference league"
}
# NEW: upset rates (from database analysis)
# Upset rate by favorite-odds bucket (half-open ranges: low <= odds < high)
FAVORITE_ODDS_UPSET_RATES = {
(1.10, 1.20): 0.111, # 11.1% upsets
(1.20, 1.30): 0.150, # 15.0% upsets
(1.30, 1.40): 0.235, # 23.5% upsets
(1.40, 1.50): 0.333, # 33.3% upsets ← caution
(1.50, 1.60): 0.350, # 35.0% upsets ← highest
}
def __init__(self):
self.conn = None
self._connect_db()
def _connect_db(self):
if psycopg2 is None:
return
try:
from data.db import get_clean_dsn
self.conn = psycopg2.connect(get_clean_dsn())
except Exception as e:
print(f"[UpsetEngineV2] DB connection failed: {e}")
self.conn = None
def _get_conn(self):
if self.conn is None or self.conn.closed:
self._connect_db()
return self.conn
# ═════════════════════════════════════════════════════════════════
# NEW FACTORS (from the GLM-5 analysis)
# ═════════════════════════════════════════════════════════════════
def calculate_margin_score(
self,
odds_data: Dict[str, float]
) -> Tuple[float, List[str]]:
"""
GLM-5 finding: bookmaker margin analysis.
Margin > 18% → the bookmaker is protecting itself, the favorite is risky
Margin > 20% → high risk, an upset is expected
"""
score = 0.0
reasons = []
ms_h = odds_data.get("ms_h", 0)
ms_d = odds_data.get("ms_d", 0)
ms_a = odds_data.get("ms_a", 0)
if ms_h > 0 and ms_d > 0 and ms_a > 0:
margin = (1/ms_h + 1/ms_d + 1/ms_a) - 1
if margin > 0.20:
score = 0.25
reasons.append(f"⚠️ Margin çok yüksek (%{margin*100:.1f}) - Bookmaker risk görüyor!")
elif margin > 0.18:
score = 0.15
reasons.append(f"⚠️ Margin yüksek (%{margin*100:.1f}) - Dikkat!")
return score, reasons
def calculate_favorite_odds_trap(
self,
favorite_odds: float,
favorite_side: str # 'home' or 'away'
) -> Tuple[float, List[str]]:
"""
GLM-5 finding: favorite odds trap.
According to the database analysis:
- 1.40-1.50: 33.3% upsets
- 1.50-1.60: 35.0% upsets (highest)
- < 1.20: suspected trap odds (score is floored at 0.20)
"""
score = 0.0
reasons = []
if favorite_odds <= 0:
return score, reasons
for (low, high), upset_rate in self.FAVORITE_ODDS_UPSET_RATES.items():
if low <= favorite_odds < high:
score = upset_rate # Doğrudan sürpriz olasılığı
if upset_rate >= 0.30:
reasons.append(f"🔴 Favori oran {favorite_odds:.2f} - %{upset_rate*100:.0f} sürpriz oranı!")
elif upset_rate >= 0.20:
reasons.append(f"⚠️ Favori oran {favorite_odds:.2f} - %{upset_rate*100:.0f} sürpriz riski")
break
# Çok düşük oran tuzağı
if favorite_odds < 1.20:
score = max(score, 0.20)
reasons.append(f"⚠️ Favori oran çok düşük ({favorite_odds:.2f}) - Tuzak oranı şüphesi")
return score, reasons
def calculate_referee_upset_score(
self,
referee_name: str
) -> Tuple[float, List[str]]:
"""
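GLM-5 finding: referee upset rate.
Upset rate = (away wins + 0.5 * draws) / matches officiated, for referees with at least 3 matches.
> 30% → elevated upset risk (score 0.15); > 40% → high upset risk (score 0.25)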
"""
score = 0.0
reasons = []
if not referee_name or not self._get_conn():
return score, reasons
try:
cur = self._get_conn().cursor()
# Hakemin yönettiği maçlarda sonuçlar
cur.execute("""
SELECT
COUNT(*) as total,
SUM(CASE WHEN m.score_home < m.score_away THEN 1 ELSE 0 END) as away_wins,
SUM(CASE WHEN m.score_home = m.score_away THEN 1 ELSE 0 END) as draws
FROM match_officials mo
JOIN matches m ON m.id = mo.match_id
WHERE mo.name = %s AND mo.role_id = 1
AND m.score_home IS NOT NULL
""", (referee_name,))
row = cur.fetchone()
cur.close()
if row and row[0] and row[0] >= 3:
total = row[0]
away_wins = row[1] or 0
draws = row[2] or 0
upset_rate = (away_wins + draws * 0.5) / total
if upset_rate > 0.40:
score = 0.25
reasons.append(f"👨‍⚖️ {referee_name}: %{upset_rate*100:.0f} sürpriz oranı (YÜKSEK!)")
elif upset_rate > 0.30:
score = 0.15
reasons.append(f"👨‍⚖️ {referee_name}: %{upset_rate*100:.0f} sürpriz oranı")
except Exception as e:
print(f"[UpsetEngineV2] Referee query error: {e}")
return score, reasons
def calculate_form_trap_score(
self,
home_form_score: float,
away_form_score: float,
favorite_side: str
) -> Tuple[float, List[str]]:
"""
GLM-5 finding: form gap trap.
Form gap > 40 → the "looks too good" favorite trap
Favorite in poor form but priced short → an upset is expected
"""
score = 0.0
reasons = []
form_diff = home_form_score - away_form_score
# Form farkı çok büyük
if abs(form_diff) > 40:
score = 0.20
if form_diff > 0 and favorite_side == 'away':
reasons.append(f"🔴 Form tuzağı! Ev sahibi formda ({home_form_score:.0f}) ama deplasman favori")
elif form_diff < 0 and favorite_side == 'home':
reasons.append(f"🔴 Form tuzağı! Deplasman formda ({away_form_score:.0f}) ama ev sahibi favori")
# Favorite is in poor form
if favorite_side == 'home' and home_form_score < 50:
score = max(score, 0.15)
reasons.append(f"⚠️ Favori ev sahibi formu düşük ({home_form_score:.0f})")
elif favorite_side == 'away' and away_form_score < 50:
score = max(score, 0.15)
reasons.append(f"⚠️ Favori deplasman formu düşük ({away_form_score:.0f})")
return score, reasons
# ═════════════════════════════════════════════════════════════════
# ORIGINAL FACTORS
# ═════════════════════════════════════════════════════════════════
def calculate_atmosphere_score(
self,
home_team_name: str,
league_name: str,
is_cup_match: bool = False
) -> Tuple[float, List[str]]:
"""Original factor: atmosphere score"""
score = 0.0
reasons = []
home_lower = home_team_name.lower()
for team in self.HIGH_ATMOSPHERE_TEAMS:
if team in home_lower:
score += 0.25
reasons.append(f"🔥 {home_team_name} yüksek atmosferli stadyum")
break
league_lower = league_name.lower()
for comp in self.EUROPEAN_COMPETITIONS:
if comp in league_lower:
score += 0.20
reasons.append("🌟 Avrupa gecesi - ekstra motivasyon")
break
if is_cup_match:
score += 0.10
reasons.append("🏆 Kupa maçı - her şey olabilir")
return min(score, 1.0), reasons
def calculate_motivation_score(
self,
home_position: int,
away_position: int,
total_teams: int = 20
) -> Tuple[float, List[str]]:
"""Original factor: motivation asymmetry"""
score = 0.0
reasons = []
if home_position is not None and away_position is not None:
position_diff = away_position - home_position
relegation_zone = total_teams - 3
if home_position >= relegation_zone and away_position <= 3:
score += 0.30
reasons.append("⚔️ Hayatta kalma savaşı vs şampiyonluk adayı")
elif home_position >= relegation_zone:
score += 0.15
reasons.append("🔥 Ev sahibi küme düşme hattında")
if position_diff < -10:
score += 0.15
reasons.append(f"📊 {abs(position_diff)} sıra fark")
return min(score, 1.0), reasons
# ═════════════════════════════════════════════════════════════════
# MAIN FUNCTION
# ═════════════════════════════════════════════════════════════════
def calculate_upset_potential(
self,
home_team_name: str,
home_team_id: str,
away_team_name: str,
league_name: str,
home_position: int = None,
away_position: int = None,
match_date_ms: int = None,
odds_data: Dict[str, float] = None,
referee_name: str = None,
home_form_score: float = 50.0,
away_form_score: float = 50.0,
favorite_side: str = None, # 'home', 'away', or 'draw'
favorite_odds: float = None
) -> UpsetFactorsV2:
"""
Full upset analysis - v2 (with the GLM-5 improvements)
"""
factors = UpsetFactorsV2()
all_reasons = []
# 1. Margin analysis (NEW)
if odds_data:
factors.margin_score, reasons = self.calculate_margin_score(odds_data)
all_reasons.extend(reasons)
# 2. Favorite odds trap (NEW)
if favorite_odds and favorite_side:
factors.favorite_odds_trap, reasons = self.calculate_favorite_odds_trap(
favorite_odds, favorite_side
)
all_reasons.extend(reasons)
# 3. Referee upset rate (NEW)
if referee_name:
factors.referee_upset_score, reasons = self.calculate_referee_upset_score(
referee_name
)
all_reasons.extend(reasons)
# 4. Form trap (NEW)
factors.form_trap_score, reasons = self.calculate_form_trap_score(
home_form_score, away_form_score, favorite_side or 'home'
)
all_reasons.extend(reasons)
# 5. Atmosphere (original)
factors.atmosphere_score, reasons = self.calculate_atmosphere_score(
home_team_name, league_name
)
all_reasons.extend(reasons)
# 6. Motivation (original)
if home_position is not None and away_position is not None:
factors.motivation_score, reasons = self.calculate_motivation_score(
home_position, away_position
)
all_reasons.extend(reasons)
# ═══════════════════════════════════════════════════════════
# UPSET SCORE CALCULATION (0-100) - STRENGTHENED v2.1
# ═══════════════════════════════════════════════════════════
upset_score = 0
# Margin (> 18% = +20, > 20% = +30) - STRENGTHENED
if factors.margin_score >= 0.25:
upset_score += 30 # Raised: 20 -> 30
all_reasons.append("🔴 Margin > %20: Bookmaker büyük risk görüyor!")
elif factors.margin_score >= 0.15:
upset_score += 20 # Raised: 15 -> 20
all_reasons.append("⚠️ Margin > %18: Dikkatli ol!")
# Favorite odds trap - STRENGTHENED
if factors.favorite_odds_trap >= 0.30:
upset_score += 30 # Raised: 25 -> 30
elif factors.favorite_odds_trap >= 0.20:
upset_score += 25 # Raised: 20 -> 25
elif factors.favorite_odds_trap >= 0.15:
upset_score += 20 # Raised: 15 -> 20
# Referee
if factors.referee_upset_score >= 0.25:
upset_score += 20
elif factors.referee_upset_score >= 0.15:
upset_score += 10
# Form trap - STRENGTHENED
if factors.form_trap_score >= 0.20:
upset_score += 20 # Raised: 15 -> 20
elif factors.form_trap_score >= 0.15:
upset_score += 15 # Raised: 10 -> 15
# Atmosphere - STRENGTHENED
if factors.atmosphere_score >= 0.40:
upset_score += 20 # Raised: 15 -> 20
elif factors.atmosphere_score >= 0.25:
upset_score += 15 # Raised: 10 -> 15
# Motivation
if factors.motivation_score >= 0.30:
upset_score += 15
elif factors.motivation_score >= 0.15:
upset_score += 10
# ═══════════════════════════════════════════════════════════
# NEW: EXTRA RISK FACTORS
# ═══════════════════════════════════════════════════════════
# Away favorite: extra risk (+10)
if favorite_side == 'away':
upset_score += 10
all_reasons.append("📍 Deplasman favorisi - ekstra risk!")
# Favorite's form is very low (< 40) = +15
if favorite_side == 'home' and home_form_score < 40:
upset_score += 15
all_reasons.append(f"🔴 Favori ev sahibi formu ÇOK DÜŞÜK ({home_form_score:.0f})")
elif favorite_side == 'away' and away_form_score < 40:
upset_score += 15
all_reasons.append(f"🔴 Favori deplasman formu ÇOK DÜŞÜK ({away_form_score:.0f})")
# Very low favorite odds (< 1.30) combined with a high margin = suspected trap
if favorite_odds and favorite_odds < 1.30 and factors.margin_score >= 0.15:
upset_score += 10
all_reasons.append(f"⚠️ Düşük oran ({favorite_odds:.2f}) + yüksek margin = TUZAK ŞÜPHESİ!")
factors.upset_score = min(upset_score, 100)
# Determine the level
if factors.upset_score >= 60:
factors.upset_level = "EXTREME"
elif factors.upset_score >= 45:
factors.upset_level = "HIGH"
elif factors.upset_score >= 30:
factors.upset_level = "MEDIUM"
else:
factors.upset_level = "LOW"
# Total upset potential
factors.total_upset_potential = min(
(factors.margin_score + factors.favorite_odds_trap +
factors.referee_upset_score + factors.form_trap_score +
factors.atmosphere_score * 0.5 + factors.motivation_score * 0.5) / 1.5,
1.0
)
factors.reasoning = all_reasons
return factors
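The final score-to-level step of `calculate_upset_potential` can be looked at in isolation. The following is a minimal standalone sketch, not the engine's own code: `upset_level` is a hypothetical helper name, and the point values in the example are simply picked from the tiers listed in the method (a +20 margin tier, a +25 favorite-odds tier, and the +10 away-favorite bonus).

```python
def upset_level(score: int) -> str:
    """Map an additive upset score to a level label (hypothetical helper)."""
    capped = min(score, 100)  # the engine caps the score at 100
    if capped >= 60:
        return "EXTREME"
    if capped >= 45:
        return "HIGH"
    if capped >= 30:
        return "MEDIUM"
    return "LOW"


# Example inputs are assumptions chosen from the tiers above.
print(upset_level(20 + 25 + 10))  # HIGH
```

Because the contributions are additive and capped, two or three mid-tier factors are enough to reach "HIGH", which may be worth keeping in mind when tuning the individual point values.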
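_upset_engine_v2 = None
def get_upset_engine_v2():
"""Return the shared UpsetEngineV2 instance, creating it on first use"""
global _upset_engine_v2
if _upset_engine_v2 is None:
_upset_engine_v2 = UpsetEngineV2()
return _upset_engine_v2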
if __name__ == "__main__":
# Test
engine = get_upset_engine_v2()
# Real Madrid vs Getafe test
result = engine.calculate_upset_potential(
home_team_name="Real Madrid",
home_team_id="test",
away_team_name="Getafe",
league_name="LaLiga",
odds_data={"ms_h": 1.25, "ms_d": 3.92, "ms_a": 6.86},
referee_name="A. Muniz Ruiz",
home_form_score=80.0,
away_form_score=56.7,
favorite_side="home",
favorite_odds=1.25
)
print(f"\n{'='*60}")
print(f"Real Madrid vs Getafe - Sürpriz Analizi")
print(f"{'='*60}")
print(f"Sürpriz Skoru: {result.upset_score}/100")
print(f"Seviye: {result.upset_level}")
print(f"\nNedenler:")
for reason in result.reasoning:
print(f" {reason}")
@@ -0,0 +1,249 @@
"""
Value Betting Calculator
Computes Expected Value (EV) and stake recommendations.
"""
from typing import Dict, Optional
from dataclasses import dataclass
@dataclass
class ValueBet:
"""Result of a value-bet analysis"""
bet_type: str # MS_1, AU25_Üst, KG_Var
my_probability: float # Our estimated probability
market_odds: float # Bookmaker odds
implied_probability: float # Probability implied by the odds
edge: float # Difference (our estimate - implied)
expected_value: float # EV = (prob × odds) - 1
is_value: bool # Is the edge at or above MIN_EDGE_FOR_VALUE?
kelly_fraction: float # Kelly stake fraction
confidence_tier: str # "banker", "strong", "value", "skip"
def to_dict(self) -> Dict:
return {
'bet_type': self.bet_type,
'my_probability': round(self.my_probability, 4),
'market_odds': self.market_odds,
'implied_probability': round(self.implied_probability, 4),
'edge': round(self.edge, 4),
'expected_value': round(self.expected_value, 4),
'is_value': self.is_value,
'kelly_fraction': round(self.kelly_fraction, 4),
'confidence_tier': self.confidence_tier,
}
class ValueCalculator:
"""
Value Betting Calculator
Computes EV by comparing predictions against the odds.
"""
# Thresholds
MIN_EDGE_FOR_VALUE = 0.05 # Minimum 5% edge
MIN_EDGE_FOR_STRONG = 0.10 # 10%+ edge = strong value
MIN_EDGE_FOR_BANKER = 0.15 # 15%+ edge = banker
KELLY_FRACTION = 0.25 # 1/4 Kelly (conservative)
MAX_STAKE_PERCENT = 0.10 # Cap on full Kelly (10% of bankroll); applied before the 1/4 scaling, so the effective maximum stake is 2.5%
def __init__(self):
pass
def calculate_implied_probability(self, odds: float) -> float:
"""Compute the implied probability from the bookmaker odds"""
if odds <= 1:
return 1.0
return 1 / odds
def calculate_ev(self, probability: float, odds: float) -> float:
"""
Compute Expected Value.
EV = (Probability × Odds) - 1
Positive EV = profit in the long run
Negative EV = loss in the long run
"""
return (probability * odds) - 1
def calculate_kelly_stake(self, probability: float, odds: float) -> float:
"""
Compute the Kelly Criterion stake.
Kelly = (p × b - q) / b
Where:
- p = probability of winning
- q = probability of losing (1 - p)
- b = odds - 1 (net profit per unit staked)
"""
if odds <= 1:
return 0
b = odds - 1
p = probability
q = 1 - p
kelly = (p * b - q) / b
# Clamp negative or excessively high values
kelly = max(0, min(kelly, self.MAX_STAKE_PERCENT))
# Fractional Kelly (safer)
return kelly * self.KELLY_FRACTION
def analyze_bet(self, bet_type: str, my_probability: float,
market_odds: float) -> ValueBet:
"""
Run a value analysis for a single bet.
Args:
bet_type: Bet type (MS_1, AU25_Üst, KG_Var, etc.)
my_probability: Our estimated probability (between 0 and 1)
market_odds: Bookmaker odds
Returns:
ValueBet: Analysis result
"""
if market_odds <= 1:
return ValueBet(
bet_type=bet_type,
my_probability=my_probability,
market_odds=market_odds,
implied_probability=1.0,
edge=0,
expected_value=-1,
is_value=False,
kelly_fraction=0,
confidence_tier="skip"
)
implied = self.calculate_implied_probability(market_odds)
edge = my_probability - implied
ev = self.calculate_ev(my_probability, market_odds)
kelly = self.calculate_kelly_stake(my_probability, market_odds)
# Determine the tier
if edge >= self.MIN_EDGE_FOR_BANKER and my_probability >= 0.70:
tier = "banker"
elif edge >= self.MIN_EDGE_FOR_STRONG:
tier = "strong"
elif edge >= self.MIN_EDGE_FOR_VALUE:
tier = "value"
else:
tier = "skip"
return ValueBet(
bet_type=bet_type,
my_probability=my_probability,
market_odds=market_odds,
implied_probability=implied,
edge=edge,
expected_value=ev,
is_value=edge >= self.MIN_EDGE_FOR_VALUE,
kelly_fraction=kelly,
confidence_tier=tier
)
def analyze_match_predictions(self, predictions: Dict[str, float],
odds: Dict[str, float]) -> Dict[str, ValueBet]:
"""
Analyze all predictions for a match.
Args:
predictions: Predictions {'MS_1': 0.55, 'MS_X': 0.25, ...}
odds: Odds {'MS_1': 1.80, 'MS_X': 3.50, ...}
Returns:
Dict[str, ValueBet]: Value analysis for each bet
"""
results = {}
for bet_type, probability in predictions.items():
if bet_type in odds and odds[bet_type] > 1:
results[bet_type] = self.analyze_bet(
bet_type=bet_type,
my_probability=probability,
market_odds=odds[bet_type]
)
return results
def get_best_value_bets(self, value_bets: Dict[str, ValueBet],
top_n: int = 3) -> list:
"""Return the top value bets, ranked by expected value"""
valid_bets = [vb for vb in value_bets.values() if vb.is_value]
sorted_bets = sorted(valid_bets, key=lambda x: x.expected_value, reverse=True)
return sorted_bets[:top_n]
def calculate_stake(self, value_bet: ValueBet, bankroll: float,
use_kelly: bool = True) -> float:
"""
Compute the recommended stake amount.
Args:
value_bet: Value bet analysis
bankroll: Total bankroll
use_kelly: Whether to use the Kelly criterion
Returns:
float: Recommended stake amount
"""
if not value_bet.is_value:
return 0
if use_kelly:
return bankroll * value_bet.kelly_fraction
else:
# Fixed stake by tier
tier_stakes = {
"banker": 0.05,
"strong": 0.03,
"value": 0.02,
"skip": 0
}
return bankroll * tier_stakes.get(value_bet.confidence_tier, 0)
# Singleton
_calculator = None
def get_value_calculator() -> ValueCalculator:
global _calculator
if _calculator is None:
_calculator = ValueCalculator()
return _calculator
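To make the stake sizing in `calculate_kelly_stake` concrete, here is a small worked sketch under stated assumptions: `full_kelly` and `capped_fractional_kelly` are hypothetical helper names, and the defaults (10% cap, 1/4 fraction) mirror the class constants above. It also shows that the Kelly numerator `p*b - q` equals the expected value `p*odds - 1`.

```python
def full_kelly(p: float, odds: float) -> float:
    """Full Kelly fraction f* = (p*b - q) / b with b = odds - 1 and q = 1 - p."""
    if odds <= 1:
        return 0.0
    b = odds - 1
    return (p * b - (1 - p)) / b


def capped_fractional_kelly(p: float, odds: float,
                            cap: float = 0.10, fraction: float = 0.25) -> float:
    """Clamp full Kelly to [0, cap], then scale by the fraction (cap applied first)."""
    return max(0.0, min(full_kelly(p, odds), cap)) * fraction


p, odds = 0.60, 2.10
print(f"{full_kelly(p, odds):.4f}")               # 0.2364
print(f"{capped_fractional_kelly(p, odds):.4f}")  # 0.0250
# Same numerator as the expected value: p*b - q == p*odds - 1
print(abs((p * (odds - 1) - (1 - p)) - (p * odds - 1)) < 1e-12)  # True
```

With these numbers full Kelly would suggest about 23.6% of the bankroll; the cap and the 1/4 scaling bring it down to 2.5%, i.e. 25 units on a 1000-unit bankroll.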
if __name__ == "__main__":
calc = get_value_calculator()
print("\n🧪 Value Calculator Test")
print("=" * 50)
# Test scenarios
test_cases = [
{"bet": "MS_1", "prob": 0.70, "odds": 1.60}, # High prob, low odds
{"bet": "MS_1", "prob": 0.55, "odds": 1.90}, # Medium prob, good odds
{"bet": "MS_1", "prob": 0.60, "odds": 2.10}, # VALUE!
{"bet": "AU25_Üst", "prob": 0.65, "odds": 1.85}, # VALUE!
{"bet": "KG_Var", "prob": 0.50, "odds": 1.70}, # No value
]
for tc in test_cases:
result = calc.analyze_bet(tc["bet"], tc["prob"], tc["odds"])
status_emoji = "" if result.is_value else ""
tier_emoji = {"banker": "🎯", "strong": "💪", "value": "", "skip": "⏭️"}
print(f"\n{status_emoji} {tc['bet']}")
print(f" Tahmin: {tc['prob']:.0%} | Oran: {tc['odds']:.2f} | Implied: {result.implied_probability:.0%}")
print(f" Edge: {result.edge:+.1%} | EV: {result.expected_value:+.1%}")
print(f" Tier: {tier_emoji.get(result.confidence_tier, '')} {result.confidence_tier.upper()}")
print(f" Kelly Stake: {result.kelly_fraction:.2%} of bankroll")
if result.is_value:
stake = calc.calculate_stake(result, 1000)
print(f" 💰 Önerilen Stake (1000 TL bank): {stake:.2f} TL")
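The relationship between implied probability, edge, EV, and the confidence tier can be checked by hand. The sketch below is a standalone approximation of the tier logic in `analyze_bet`, assuming odds greater than 1 and the same 5% / 10% / 15% thresholds; `classify` is a hypothetical name, not part of the module.

```python
def classify(p: float, odds: float) -> tuple[float, float, str]:
    """Return (edge, ev, tier) for a probability estimate and decimal odds > 1."""
    implied = 1 / odds   # probability implied by the odds (margin included)
    edge = p - implied   # our estimate minus the implied probability
    ev = p * odds - 1    # expected profit per unit staked
    if edge >= 0.15 and p >= 0.70:
        tier = "banker"
    elif edge >= 0.10:
        tier = "strong"
    elif edge >= 0.05:
        tier = "value"
    else:
        tier = "skip"
    return edge, ev, tier


for p, odds in [(0.70, 1.60), (0.60, 2.10), (0.50, 1.70)]:
    edge, ev, tier = classify(p, odds)
    print(f"p={p:.2f} odds={odds:.2f} edge={edge:+.1%} EV={ev:+.1%} tier={tier}")
```

Note that edge and EV always share a sign (EV = edge × odds), so a bet with positive EV but an edge under 5% is still classified as "skip".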
@@ -0,0 +1,415 @@
"""
Value Detection Engine
======================
The Smart Way to Beat the Bookmakers
This engine doesn't just predict winners - it finds VALUE.
The key insight: We don't need to predict the winner, we need to find
where the bookmaker made a mistake in their odds.
Core Philosophy:
- High Margin = High Uncertainty = Potential Value
- Model Probability > Implied Probability = Value Bet
- The goal is NOT to predict correctly, but to find +EV bets
Author: AI Engine V21
"""
import math
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple
from collections import defaultdict
@dataclass
class ValueBet:
"""Represents a value bet opportunity"""
outcome: str # "1", "X", "2"
model_probability: float # Our model's probability (0-1)
implied_probability: float # Bookmaker's implied probability (0-1)
odds: float # Bookmaker's odds
edge: float # model_prob - implied_prob as a fraction (to_dict() reports percent); may include heuristic bonuses added in find_value_bets
expected_value: float # EV = (prob * odds) - 1
kelly_fraction: float # Optimal bet size
confidence: str # "HIGH", "MEDIUM", "LOW"
reasons: List[str] # Why this is value
def to_dict(self) -> dict:
return {
"outcome": self.outcome,
"model_prob": round(self.model_probability * 100, 1),
"implied_prob": round(self.implied_probability * 100, 1),
"odds": self.odds,
"edge": round(self.edge * 100, 1),
"ev": round(self.expected_value * 100, 1),
"kelly": round(self.kelly_fraction * 100, 1),
"confidence": self.confidence,
"reasons": self.reasons
}
@dataclass
class MarginAnalysis:
"""Analysis of bookmaker margin"""
raw_margin: float # Sum of raw implied probabilities - 1
true_margin: float # Adjusted for favorite-longshot bias
favorite_outcome: str
favorite_odds: float
uncertainty_level: str # "LOW", "MEDIUM", "HIGH", "EXTREME"
def to_dict(self) -> dict:
return {
"raw_margin": round(self.raw_margin * 100, 1),
"true_margin": round(self.true_margin * 100, 1),
"favorite": self.favorite_outcome,
"favorite_odds": self.favorite_odds,
"uncertainty": self.uncertainty_level
}
class ValueDetectionEngine:
"""
The Smart Betting Engine
This engine finds value bets by comparing model probabilities
with bookmaker implied probabilities.
Key Insights:
1. Margin > 18% → Bookmaker is unsure, potential value on underdog
2. Margin > 20% → Bookmaker sees high risk, BIG potential value
3. Favorite odds 1.40-1.60 → 33% upset rate in UPSET_RATES, above both neighboring bands
4. Away favorites have higher upset rate than home favorites
"""
# Historical upset rates by favorite odds range
UPSET_RATES = {
(1.00, 1.25): 0.08, # 8% upset rate
(1.25, 1.40): 0.18, # 18% upset rate
(1.40, 1.60): 0.33, # 33% upset rate - DANGER ZONE
(1.60, 1.80): 0.28, # 28% upset rate
(1.80, 2.00): 0.35, # 35% upset rate
(2.00, 2.50): 0.42, # 42% upset rate
(2.50, 3.00): 0.45, # 45% upset rate
(3.00, 5.00): 0.55, # 55% upset rate
}
# Margin thresholds
MARGIN_LOW = 0.06 # 6% - bookmaker very confident
MARGIN_MEDIUM = 0.12 # 12% - normal margin
MARGIN_HIGH = 0.18 # 18% - bookmaker unsure
MARGIN_EXTREME = 0.22 # 22% - bookmaker very unsure
def __init__(self):
self.historical_data = [] # For learning
self.value_threshold = 0.03 # Minimum 3% edge to consider value
def calculate_margin(self, odds_1: float, odds_x: float, odds_2: float) -> MarginAnalysis:
"""
Calculate bookmaker margin and analyze uncertainty.
Higher margin = More uncertainty = More potential value
"""
if not all([odds_1 > 1, odds_x > 1, odds_2 > 1]):
return MarginAnalysis(0, 0, "X", 0, "UNKNOWN")
# Raw implied probabilities
imp_1 = 1 / odds_1
imp_x = 1 / odds_x
imp_2 = 1 / odds_2
raw_margin = imp_1 + imp_x + imp_2 - 1
# Determine favorite
if odds_1 <= odds_x and odds_1 <= odds_2:
favorite_outcome = "1"
favorite_odds = odds_1
elif odds_2 <= odds_1 and odds_2 <= odds_x:
favorite_outcome = "2"
favorite_odds = odds_2
else:
favorite_outcome = "X"
favorite_odds = odds_x
# Adjust for favorite-longshot bias
# Bookmakers typically overprice longshots
true_margin = raw_margin * 0.85 # Simplified adjustment
# Determine uncertainty level
if raw_margin < self.MARGIN_LOW:
uncertainty = "LOW"
elif raw_margin < self.MARGIN_MEDIUM:
uncertainty = "MEDIUM"
elif raw_margin < self.MARGIN_HIGH:
uncertainty = "HIGH"
else:
uncertainty = "EXTREME"
return MarginAnalysis(
raw_margin=raw_margin,
true_margin=true_margin,
favorite_outcome=favorite_outcome,
favorite_odds=favorite_odds,
uncertainty_level=uncertainty
)
def get_historical_upset_rate(self, favorite_odds: float) -> float:
"""Get historical upset rate for given favorite odds"""
for (low, high), rate in self.UPSET_RATES.items():
if low <= favorite_odds < high:
return rate
return 0.40 # Default when the odds fall outside the table (e.g. 5.00 or higher)
def calculate_edge(
self,
model_prob: float,
odds: float,
margin: float
) -> Tuple[float, float]:
"""
Calculate the edge (advantage) we have over the bookmaker.
Returns: (edge, expected_value)
Edge = Model Probability - True Implied Probability
EV = (Probability * Odds) - 1
"""
if odds <= 1:
return 0, -1
# Raw implied probability
implied = 1 / odds
# No margin adjustment is applied yet: `margin` is accepted but unused,
# so the raw implied probability (which still contains the bookmaker's
# margin) is compared against the model probability.
true_implied = implied # Simplified - could be more sophisticated
edge = model_prob - true_implied
ev = (model_prob * odds) - 1
return edge, ev
def calculate_kelly_fraction(
self,
probability: float,
odds: float,
half_kelly: bool = True
) -> float:
"""
Calculate optimal bet size using Kelly Criterion.
Kelly = (p * b - q) / b
where b = odds - 1 (net profit per unit staked) and q = 1 - p
We use half Kelly for safety.
"""
if odds <= 1:
return 0
b = odds - 1
kelly = (probability * b - (1 - probability)) / b
# Don't bet if negative
if kelly < 0:
return 0
# Use half Kelly for safety
if half_kelly:
kelly = kelly / 2
# Cap at 10% of bankroll
return min(kelly, 0.10)
def find_value_bets(
self,
model_probs: Dict[str, float],
odds: Dict[str, float],
match_context: Optional[Dict] = None
) -> List[ValueBet]:
"""
Find all value bets in a match.
This is the MAIN method - it finds where we have an edge.
Args:
model_probs: {"1": 0.55, "X": 0.25, "2": 0.20}
odds: {"1": 1.25, "X": 4.50, "2": 8.00}
match_context: Additional context (form, h2h, etc.)
Returns:
List of ValueBet objects, sorted by edge
"""
value_bets = []
# Calculate margin
margin_analysis = self.calculate_margin(
odds.get("1", 0),
odds.get("X", 0),
odds.get("2", 0)
)
# Analyze each outcome
for outcome in ["1", "X", "2"]:
prob = model_probs.get(outcome, 0)
odd = odds.get(outcome, 0)
if prob <= 0 or odd <= 1:
continue
edge, ev = self.calculate_edge(prob, odd, margin_analysis.raw_margin)
kelly = self.calculate_kelly_fraction(prob, odd)
# Determine if this is a value bet
reasons = []
# 1. Basic edge
if edge > self.value_threshold:
reasons.append(f"Edge: +{round(edge*100, 1)}% over bookmaker")
# 2. High margin bonus
if margin_analysis.raw_margin > self.MARGIN_HIGH:
reasons.append(f"High margin ({round(margin_analysis.raw_margin*100, 1)}%) = uncertainty")
# Boost edge for underdogs in high margin matches
if outcome != margin_analysis.favorite_outcome:
edge += 0.02 # 2% bonus
reasons.append("Underdog in high-margin match = bonus value")
# 3. Favorite odds trap
fav_odds = margin_analysis.favorite_odds
if margin_analysis.favorite_outcome != outcome:
upset_rate = self.get_historical_upset_rate(fav_odds)
if upset_rate > 0.25:
reasons.append(f"Favorite odds {fav_odds} has {round(upset_rate*100)}% upset rate")
# Extra bonus for 1.40-1.60 range
if 1.40 <= fav_odds <= 1.60:
edge += 0.03
reasons.append("DANGER ZONE: 1.40-1.60 odds = highest upset risk")
# 4. Away favorite risk
if margin_analysis.favorite_outcome == "2" and outcome == "1":
edge += 0.015
reasons.append("Away favorite = extra home value")
# 5. EV positive
if ev > 0:
reasons.append(f"Positive EV: +{round(ev*100, 1)}%")
# Only add if we have reasons (value detected)
if reasons and edge > 0:
# Determine confidence
if edge > 0.08 or (edge > 0.05 and kelly > 0.03):
confidence = "HIGH"
elif edge > 0.05:
confidence = "MEDIUM"
else:
confidence = "LOW"
value_bets.append(ValueBet(
outcome=outcome,
model_probability=prob,
implied_probability=1/odd,
odds=odd,
edge=edge,
expected_value=ev,
kelly_fraction=kelly,
confidence=confidence,
reasons=reasons
))
# Sort by edge (highest first)
value_bets.sort(key=lambda x: x.edge, reverse=True)
return value_bets
def predict_with_value(
self,
model_probs: Dict[str, float],
odds: Dict[str, float],
match_context: Optional[Dict] = None
) -> Dict:
"""
Make a prediction based on VALUE, not just probability.
This is the smart way to bet:
- If there's clear value on one outcome → Bet it
- If there's no value → NO BET (don't force it)
- If margin is extreme → Look for underdog value
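Returns:
{
"margin_analysis": dict (MarginAnalysis.to_dict()),
"value_bets": list of dicts (ValueBet.to_dict()), sorted by edge,
"best_value": dict or None,
"alternative_value": dict or None,
"recommendation": str ("BET_<outcome>", "CONSIDER_<outcome>" or "NO_BET"),
"confidence": str,
"reasoning": list of str
}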
"""
margin_analysis = self.calculate_margin(
odds.get("1", 0),
odds.get("X", 0),
odds.get("2", 0)
)
value_bets = self.find_value_bets(model_probs, odds, match_context)
result = {
"margin_analysis": margin_analysis.to_dict(),
"value_bets": [vb.to_dict() for vb in value_bets],
"best_value": None,
"alternative_value": None,
"recommendation": "NO_BET",
"confidence": "LOW",
"reasoning": []
}
if not value_bets:
result["reasoning"].append("No value detected in any outcome")
result["reasoning"].append("Bookmaker odds are efficient for this match")
return result
# Get best value bet
best = value_bets[0]
result["best_value"] = best.to_dict()
if len(value_bets) > 1:
result["alternative_value"] = value_bets[1].to_dict()
# Determine recommendation
if best.confidence == "HIGH" and best.edge > 0.05:
result["recommendation"] = f"BET_{best.outcome}"
result["confidence"] = "HIGH"
# Copy the list so the summary appended below does not leak into best.reasons
result["reasoning"] = list(best.reasons)
result["reasoning"].append(f"Strong value on {best.outcome} with {round(best.edge*100, 1)}% edge")
elif best.confidence == "MEDIUM" or best.edge > 0.03:
result["recommendation"] = f"CONSIDER_{best.outcome}"
result["confidence"] = "MEDIUM"
# Copy the list so the summary appended below does not leak into best.reasons
result["reasoning"] = list(best.reasons)
result["reasoning"].append(f"Moderate value on {best.outcome}")
else:
result["recommendation"] = "NO_BET"
result["confidence"] = "LOW"
result["reasoning"].append("Edge too small to justify bet")
result["reasoning"].append(f"Best edge: {round(best.edge*100, 1)}% (need >3%)")
# Add margin context
if margin_analysis.uncertainty_level == "EXTREME":
result["reasoning"].append("⚠️ EXTREME margin - high volatility match")
elif margin_analysis.uncertainty_level == "HIGH":
result["reasoning"].append("⚠️ High margin - bookmaker sees risk")
return result
# Singleton instance
_engine_instance = None
def get_value_detection_engine() -> ValueDetectionEngine:
"""Get the singleton instance"""
global _engine_instance
if _engine_instance is None:
_engine_instance = ValueDetectionEngine()
return _engine_instance
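The margin (overround) computed in `calculate_margin` can be verified with the odds from the `find_value_bets` docstring. The sketch below also shows proportional de-vigging, which the comments in `calculate_edge` mention but the engine does not apply; that normalization is an assumption about one common approach, and `margin_and_fair_probs` is a hypothetical name.

```python
def margin_and_fair_probs(odds_1: float, odds_x: float, odds_2: float):
    """Return (raw_margin, fair_probs) for a 1X2 market with all odds > 1."""
    implied = {"1": 1 / odds_1, "X": 1 / odds_x, "2": 1 / odds_2}
    book_sum = sum(implied.values())
    raw_margin = book_sum - 1  # how far the implied probabilities exceed 100%
    # Assumption: the bookmaker spreads the margin proportionally, so dividing
    # by the book sum recovers probabilities that add up to 1.
    fair = {outcome: prob / book_sum for outcome, prob in implied.items()}
    return raw_margin, fair


margin, fair = margin_and_fair_probs(1.25, 4.50, 8.00)
print(f"raw margin: {margin:.1%}")  # raw margin: 14.7%
print({outcome: round(prob, 3) for outcome, prob in fair.items()})
```

Comparing a model probability with the de-vigged value instead of the raw `1 / odds` makes the measured edge larger, so thresholds tuned against raw implied probabilities would likely need revisiting if this adjustment were adopted.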
@@ -0,0 +1,167 @@
"""
Shared VQWEN feature contract
=============================
One place defines how VQWEN features are produced.
Both training and runtime inference must use this module so the model sees
the same feature semantics in historical data and live analysis.
"""
from __future__ import annotations
from dataclasses import dataclass
import numpy as np
FEATURE_COLUMNS = [
"elo_diff",
"h_xg",
"a_xg",
"total_xg",
"pow_diff",
"rest_diff",
"h_fat",
"a_fat",
"imp_h",
"imp_d",
"imp_a",
"h_xi",
"a_xi",
"h2h_h_wr",
"form_diff",
]
@dataclass(slots=True)
class VqwenFeatureInput:
home_elo: float
away_elo: float
home_avg_goals_scored: float
away_avg_goals_scored: float
home_avg_goals_conceded: float
away_avg_goals_conceded: float
home_avg_shots_on_target: float
away_avg_shots_on_target: float
home_avg_possession: float
away_avg_possession: float
home_rest_days: float
away_rest_days: float
implied_prob_home: float
implied_prob_draw: float
implied_prob_away: float
home_lineup_availability: float = 1.0
away_lineup_availability: float = 1.0
h2h_home_win_rate: float = 0.5
home_form_score: float = 0.0
away_form_score: float = 0.0
league_avg_goals: float = 2.6
referee_avg_goals: float = 2.6
referee_home_bias: float = 0.0
home_squad_strength: float = 0.5
away_squad_strength: float = 0.5
home_key_players: float = 0.0
away_key_players: float = 0.0
missing_players_impact: float = 0.0
def fatigue_multiplier(rest_days: float) -> float:
if rest_days < 3.0:
return 0.85
if rest_days < 5.0:
return 0.95
return 1.0
def clamp(value: float, lower: float, upper: float) -> float:
return min(max(float(value), lower), upper)
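A quick look at the boundary behaviour of these two helpers may help when reading the feature builder. The sketch restates both functions so it runs on its own; the league and referee goal averages are made-up example inputs, not values from the source.

```python
def fatigue_multiplier(rest_days: float) -> float:
    if rest_days < 3.0:
        return 0.85
    if rest_days < 5.0:
        return 0.95
    return 1.0


def clamp(value: float, lower: float, upper: float) -> float:
    return min(max(float(value), lower), upper)


# Thresholds are exclusive on the lower side: exactly 3 or 5 rest days
# already falls into the next, less penalized band.
for days in (2.9, 3.0, 4.9, 5.0):
    print(days, fatigue_multiplier(days))

# Hypothetical high-scoring environment: the ratio 3.3 / 2.6 is about 1.27,
# which clamp() limits to the 1.2 ceiling used for the goal environment.
league_avg_goals, referee_avg_goals = 3.4, 3.2
goal_environment = (league_avg_goals + referee_avg_goals) / 2.0
print(clamp(goal_environment / 2.6, 0.85, 1.2))  # 1.2
```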
def build_vqwen_feature_row(values: VqwenFeatureInput) -> dict[str, float]:
home_fatigue = fatigue_multiplier(values.home_rest_days)
away_fatigue = fatigue_multiplier(values.away_rest_days)
goal_environment = (
float(values.league_avg_goals) + float(values.referee_avg_goals)
) / 2.0
goal_environment_multiplier = clamp(goal_environment / 2.6, 0.85, 1.2)
squad_diff = float(values.home_squad_strength) - float(values.away_squad_strength)
key_player_diff = float(values.home_key_players) - float(values.away_key_players)
missing_penalty = clamp(float(values.missing_players_impact), 0.0, 1.0)
referee_bias = clamp(float(values.referee_home_bias), -0.25, 0.25)
home_squad_multiplier = clamp(
1.0 + squad_diff * 0.08 + key_player_diff * 0.025 - missing_penalty * 0.08 + referee_bias * 0.03,
0.82,
1.18,
)
away_squad_multiplier = clamp(
1.0 - squad_diff * 0.08 - key_player_diff * 0.025 - missing_penalty * 0.08 - referee_bias * 0.03,
0.82,
1.18,
)
home_xg = max(
0.05,
(
float(values.home_avg_goals_scored)
+ float(values.away_avg_goals_conceded)
)
/ 2.0,
) * home_fatigue * goal_environment_multiplier * home_squad_multiplier
away_xg = max(
0.05,
(
float(values.away_avg_goals_scored)
+ float(values.home_avg_goals_conceded)
)
/ 2.0,
) * away_fatigue * goal_environment_multiplier * away_squad_multiplier
home_power = (
float(values.home_avg_goals_scored) * 5.0
- float(values.home_avg_goals_conceded) * 5.0
+ float(values.home_avg_shots_on_target) * 2.0
+ float(values.home_avg_possession) * 0.1
+ float(values.home_squad_strength) * 3.0
+ float(values.home_key_players) * 0.8
+ referee_bias * 6.0
)
away_power = (
float(values.away_avg_goals_scored) * 5.0
- float(values.away_avg_goals_conceded) * 5.0
+ float(values.away_avg_shots_on_target) * 2.0
+ float(values.away_avg_possession) * 0.1
+ float(values.away_squad_strength) * 3.0
+ float(values.away_key_players) * 0.8
- referee_bias * 6.0
)
return {
"elo_diff": float(values.home_elo) - float(values.away_elo),
"h_xg": home_xg,
"a_xg": away_xg,
"total_xg": home_xg + away_xg,
"pow_diff": home_power - away_power,
"rest_diff": float(values.home_rest_days) - float(values.away_rest_days),
"h_fat": home_fatigue,
"a_fat": away_fatigue,
"imp_h": clamp(values.implied_prob_home, 0.01, 0.98),
"imp_d": clamp(values.implied_prob_draw, 0.01, 0.98),
"imp_a": clamp(values.implied_prob_away, 0.01, 0.98),
# Column names are preserved for artifact compatibility.
# Semantics are now "pre-match lineup availability" instead of leaked
# post-match starting-XI counts.
"h_xi": clamp(values.home_lineup_availability, 0.0, 1.0),
"a_xi": clamp(values.away_lineup_availability, 0.0, 1.0),
"h2h_h_wr": clamp(values.h2h_home_win_rate, 0.0, 1.0),
"form_diff": (
float(values.home_form_score)
- float(values.away_form_score)
+ squad_diff * 1.5
+ key_player_diff * 0.35
+ referee_bias * 2.0
- missing_penalty * 1.75
),
}
def row_to_array(row: dict[str, float]) -> np.ndarray:
return np.array([[float(row[column]) for column in FEATURE_COLUMNS]], dtype=np.float64)
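The point of `row_to_array` is that column order comes from the contract list, not from the dict. The following NumPy-free sketch illustrates that idea with a shortened, hypothetical column list; `row_to_list` and `COLUMNS` are stand-ins for illustration only.

```python
COLUMNS = ["elo_diff", "h_xg", "a_xg"]  # shortened stand-in for FEATURE_COLUMNS


def row_to_list(row: dict, columns: list) -> list:
    """Order values by the contract; a missing column raises KeyError."""
    return [[float(row[column]) for column in columns]]


# Keys arrive in a different order than the contract defines.
row = {"a_xg": 1.1, "elo_diff": 42, "h_xg": 1.6}
print(row_to_list(row, COLUMNS))  # [[42.0, 1.6, 1.1]]
```

Failing loudly on a missing column is arguably the desirable behaviour here, since silently defaulting a feature would let training and inference drift apart.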
@@ -0,0 +1,430 @@
from __future__ import annotations
import os
import sys
import asyncio
import time
from contextlib import asynccontextmanager
from typing import Any
from datetime import datetime
import uvicorn
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
import subprocess
from pydantic import BaseModel
try:
from models.basketball_v25 import get_basketball_v25_predictor
HAS_BASKETBALL = True
except ImportError:
HAS_BASKETBALL = False
from services.single_match_orchestrator import get_single_match_orchestrator
from services.v26_shadow_engine import get_v26_shadow_engine
from models.league_model import get_league_model_loader
load_dotenv()
if sys.stdout and hasattr(sys.stdout, "reconfigure"):
sys.stdout.reconfigure(encoding="utf-8")
if sys.stderr and hasattr(sys.stderr, "reconfigure"):
sys.stderr.reconfigure(encoding="utf-8")
class CouponRequest(BaseModel):
match_ids: list[str]
strategy: str | None = "BALANCED"
max_matches: int | None = None
min_confidence: float | None = None
class RetrainRequest(BaseModel):
reason: str | None = "manual"
markets: str | None = None # comma-separated, e.g. "MS,OU25,BTTS"
trials: int | None = 50
# ─── Retrain state tracking ──────────────────────────────────
_retrain_state: dict[str, Any] = {
"running": False,
"last_started": None,
"last_completed": None,
"last_status": None,
"last_error": None,
"pid": None,
}
@asynccontextmanager
async def lifespan(_: FastAPI):
try:
print("🚀 Initializing V28 orchestrator...", flush=True)
get_single_match_orchestrator()
get_v26_shadow_engine()
print("✅ V28 orchestrator ready", flush=True)
except Exception as error:
print(f"❌ Failed to initialize orchestrator: {error}", flush=True)
import traceback
traceback.print_exc()
yield
app = FastAPI(
title="Suggest-Bet AI Engine",
version="28.0.0",
description="V28 Single Match Prediction Package API",
lifespan=lifespan,
)
def _parse_cors_origins() -> list[str]:
raw = os.getenv("CORS_ALLOW_ORIGINS", "").strip()
if raw:
return [item.strip() for item in raw.split(",") if item.strip()]
# Dev-safe defaults + production domains.
return [
"http://localhost:3000",
"http://127.0.0.1:3000",
"http://localhost:3001",
"http://127.0.0.1:3001",
"http://localhost:3005",
"http://127.0.0.1:3005",
"https://ui-suggestbet.bilgich.com",
"https://suggestbet.bilgich.com",
"https://iddaai.com",
"https://www.iddaai.com",
]
app.add_middleware(
CORSMiddleware,
allow_origins=_parse_cors_origins(),
allow_origin_regex=r"^https?://(localhost|127\.0\.0\.1)(:\d+)?$",
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
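The CORS setup combines an explicit origin list with a regex for local development hosts. The sketch below approximates how those two pieces interact; it is a simplified illustration and not Starlette's actual implementation, and `parse_origins` and `is_allowed` are hypothetical names.

```python
import re

LOCAL_ORIGIN = re.compile(r"^https?://(localhost|127\.0\.0\.1)(:\d+)?$")


def parse_origins(raw: str, defaults: list[str]) -> list[str]:
    """Split a comma-separated env value; fall back to defaults when empty."""
    raw = raw.strip()
    if raw:
        return [item.strip() for item in raw.split(",") if item.strip()]
    return list(defaults)


def is_allowed(origin: str, allowed: list[str]) -> bool:
    """Allowed if listed explicitly or if it is a local origin on any port."""
    return origin in allowed or LOCAL_ORIGIN.fullmatch(origin) is not None


allowed = parse_origins(" https://iddaai.com, ,https://www.iddaai.com ", [])
print(allowed)
print(is_allowed("http://localhost:5173", allowed))     # True
print(is_allowed("https://evil.example.com", allowed))  # False
```

One consequence worth noting: because the regex accepts any port on localhost, setting `CORS_ALLOW_ORIGINS` narrows the explicit list but does not remove local origins.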
@app.exception_handler(Exception)
async def global_exception_handler(_: Request, exc: Exception):
import traceback
print(f"💥 ERROR: {exc}", flush=True)
traceback.print_exc()
return JSONResponse(
status_code=500,
content={"message": f"Internal Server Error: {str(exc)}"},
)
@app.get("/")
def read_root() -> dict[str, Any]:
return {
"status": "Suggest-Bet AI Engine v28",
"engine": "V28 Single Match Orchestrator",
"mode": os.getenv("AI_ENGINE_MODE", "v28"),
"routes": [
"POST /v20plus/analyze/{match_id}",
"GET /v20plus/analyze-htms/{match_id}",
"GET /v20plus/analyze-htft/{match_id}",
"GET /v20plus/reversal-watchlist",
"POST /v20plus/coupon",
"GET /v20plus/daily-banker",
"POST /v1/admin/retrain",
"GET /v1/admin/retrain/status",
],
}
@app.get("/health")
def health_check() -> dict[str, Any]:
try:
orchestrator = get_single_match_orchestrator()
shadow_engine = get_v26_shadow_engine()
# Per-market V25 model status
v25_readiness: dict[str, Any] = {"fully_loaded": False}
try:
v25_predictor = orchestrator._get_v25_predictor()
v25_readiness = v25_predictor.readiness_summary()
except Exception as v25_err:
v25_readiness = {"fully_loaded": False, "error": str(v25_err)}
if HAS_BASKETBALL:
basketball_predictor = get_basketball_v25_predictor()
basketball_readiness = basketball_predictor.readiness_summary()
ready = bool(basketball_readiness.get("fully_loaded", True))
else:
basketball_readiness = {"fully_loaded": False, "error": "Basketball module not found"}
ready = True
league_readiness = get_league_model_loader().readiness_summary()
overall_ready = ready and v25_readiness.get("fully_loaded", False)
return {
"status": "healthy" if overall_ready else "degraded",
"engine": "v28.main",
"mode": os.getenv("AI_ENGINE_MODE", "v28"),
"ready": overall_ready,
"v25_football": v25_readiness,
"league_specific": league_readiness,
"basketball_v25": basketball_readiness,
"v26_shadow": shadow_engine.readiness_summary(),
"prediction_service_ready": True,
"model_loaded": overall_ready,
"orchestrator_mode": getattr(orchestrator, "engine_mode", "v28"),
}
except Exception as error:
return {"status": "unhealthy", "ready": False, "error": str(error)}
_REQUIRED_RESPONSE_FIELDS = ("match_info", "market_board", "main_pick", "bet_summary", "data_quality")
@app.post("/v20plus/analyze/{match_id}")
async def analyze_match_v20plus(match_id: str) -> dict[str, Any]:
started_at = time.time()
orchestrator = get_single_match_orchestrator()
result = await asyncio.to_thread(orchestrator.analyze_match, match_id)
elapsed_ms = int((time.time() - started_at) * 1000)
if not result:
raise HTTPException(status_code=404, detail=f"Match not found: {match_id}")
# Response validation: log missing required fields (non-fatal)
missing_fields = [f for f in _REQUIRED_RESPONSE_FIELDS if f not in result]
if missing_fields:
print(f"⚠️ [API] analyze/{match_id} response missing fields: {missing_fields} ({elapsed_ms}ms)")
result["timing_ms"] = elapsed_ms
return result
@app.get("/v20plus/analyze-htms/{match_id}")
async def analyze_match_htms_v20plus(match_id: str) -> dict[str, Any]:
orchestrator = get_single_match_orchestrator()
result = await asyncio.to_thread(orchestrator.analyze_match_htms, match_id)
if not result:
raise HTTPException(status_code=404, detail=f"Match not found: {match_id}")
return result
@app.get("/v20plus/analyze-htft/{match_id}")
async def analyze_match_htft_v20plus(match_id: str, timeout_sec: int = 30) -> dict[str, Any]:
# Small, explicit endpoint for HT/FT inspection and debugging in FE/Postman.
if timeout_sec < 3 or timeout_sec > 120:
raise HTTPException(status_code=400, detail="timeout_sec must be between 3 and 120")
orchestrator = get_single_match_orchestrator()
started_at = time.time()
try:
result = await asyncio.wait_for(
asyncio.to_thread(orchestrator.analyze_match, match_id),
timeout=float(timeout_sec),
)
except asyncio.TimeoutError as error:
raise HTTPException(
status_code=504,
detail=f"Analyze timeout after {timeout_sec}s for match_id={match_id}",
) from error
if not result:
raise HTTPException(status_code=404, detail=f"Match not found: {match_id}")
risk = result.get("risk", {})
market_board = result.get("market_board", {})
htft_probs = market_board.get("HTFT", {}).get("probs", {}) or risk.get("ht_ft_probs", {})
top_reversal_pick = None
top_reversal_prob = 0.0
if htft_probs:
prob_12 = float(htft_probs.get("1/2", 0.0))
prob_21 = float(htft_probs.get("2/1", 0.0))
if prob_21 >= prob_12:
top_reversal_pick = "2/1"
top_reversal_prob = prob_21
else:
top_reversal_pick = "1/2"
top_reversal_prob = prob_12
overall_htft_pick = None
overall_htft_prob = 0.0
if htft_probs:
overall_htft_pick, overall_htft_prob = max(
htft_probs.items(),
key=lambda item: float(item[1]),
)
return {
"engine": "v28.main",
"match_info": result.get("match_info", {}),
"timing_ms": int((time.time() - started_at) * 1000),
"ht_ft_probs": htft_probs,
"top_reversal_pick": top_reversal_pick,
"top_reversal_prob": round(float(top_reversal_prob), 4),
"overall_htft_pick": overall_htft_pick,
"overall_htft_pick_prob": round(float(overall_htft_prob), 4),
"surprise_hunter": result.get("surprise_hunter", {}),
"ht_ft_reversal_radar": result.get("ht_ft_reversal_radar", {}),
"first_half_result": result.get("market_board", {}).get("first_half_result", {}),
"main_pick": result.get("main_pick", {}),
"bet_summary": result.get("bet_summary", {}),
}
@app.post("/v20plus/coupon")
async def generate_coupon_v20plus(request: CouponRequest) -> dict[str, Any]:
orchestrator = get_single_match_orchestrator()
return await asyncio.to_thread(
orchestrator.build_coupon,
request.match_ids,
request.strategy or "BALANCED",
request.max_matches,
request.min_confidence,
)
@app.get("/v20plus/daily-banker")
async def get_daily_banker_v20plus(count: int = 3) -> dict[str, Any]:
if count < 1:
raise HTTPException(status_code=400, detail="count must be >= 1")
orchestrator = get_single_match_orchestrator()
bankers = await asyncio.to_thread(orchestrator.get_daily_bankers, count)
return {"count": len(bankers), "bankers": bankers}
@app.get("/v20plus/reversal-watchlist")
async def get_reversal_watchlist_v20plus(
count: int = 20,
horizon_hours: int = 72,
min_score: float = 45.0,
top_leagues_only: bool = False,
) -> dict[str, Any]:
if count < 1 or count > 100:
raise HTTPException(status_code=400, detail="count must be between 1 and 100")
if horizon_hours < 6 or horizon_hours > 168:
raise HTTPException(status_code=400, detail="horizon_hours must be between 6 and 168")
if min_score < 0 or min_score > 100:
raise HTTPException(status_code=400, detail="min_score must be between 0 and 100")
orchestrator = get_single_match_orchestrator()
return await asyncio.to_thread(
orchestrator.get_reversal_watchlist,
count,
horizon_hours,
min_score,
top_leagues_only,
)
# ─── ADMIN: Retrain Pipeline ─────────────────────────────────
def _run_retrain_pipeline(markets: str | None, trials: int):
"""Background function: extract data → train model → reload."""
global _retrain_state
ai_dir = os.path.dirname(os.path.abspath(__file__))
scripts_dir = os.path.join(ai_dir, "scripts")
python = os.path.join(ai_dir, "venv", "bin", "python3")
if not os.path.exists(python):
python = sys.executable # fallback
try:
# Step 1: Extract training data
print("🔄 [RETRAIN] Step 1/3: Extracting training data...", flush=True)
result = subprocess.run(
[python, os.path.join(scripts_dir, "extract_training_data.py")],
capture_output=True, text=True, timeout=600, cwd=ai_dir,
)
if result.returncode != 0:
raise RuntimeError(f"Extract failed:\n{result.stderr[-500:]}")
print("✅ [RETRAIN] Extract done", flush=True)
# Step 2: Train V25 Pro
print("🔄 [RETRAIN] Step 2/3: Training V25 Pro model...", flush=True)
train_cmd = [python, os.path.join(scripts_dir, "train_v25_pro.py")]
if markets:
train_cmd += ["--markets", markets]
train_cmd += ["--trials", str(trials)]
result = subprocess.run(
train_cmd, capture_output=True, text=True, timeout=3600, cwd=ai_dir,
)
if result.returncode != 0:
raise RuntimeError(f"Training failed:\n{result.stderr[-500:]}")
print("✅ [RETRAIN] Training done", flush=True)
# Step 3: Reload models in memory
print("🔄 [RETRAIN] Step 3/3: Reloading models...", flush=True)
try:
orchestrator = get_single_match_orchestrator()
v25 = orchestrator._get_v25_predictor()
v25._loaded = False
v25.load_models()
print("✅ [RETRAIN] Models reloaded in memory", flush=True)
except Exception as reload_err:
print(f"⚠️ [RETRAIN] Hot reload failed (restart needed): {reload_err}", flush=True)
_retrain_state.update({
"running": False,
"last_completed": datetime.now().isoformat(),
"last_status": "success",
"last_error": None,
})
print("🎉 [RETRAIN] Pipeline complete!", flush=True)
except Exception as err:
_retrain_state.update({
"running": False,
"last_completed": datetime.now().isoformat(),
"last_status": "failed",
"last_error": str(err),
})
print(f"❌ [RETRAIN] Pipeline failed: {err}", flush=True)
@app.post("/v1/admin/retrain")
async def admin_retrain(request: RetrainRequest) -> dict[str, Any]:
"""Trigger full retrain pipeline: extract → train → reload."""
if _retrain_state["running"]:
return {
"status": "already_running",
"message": f"Retrain in progress since {_retrain_state['last_started']}",
}
_retrain_state.update({
"running": True,
"last_started": datetime.now().isoformat(),
"last_status": "running",
"last_error": None,
})
# Run in background thread
import threading
thread = threading.Thread(
target=_run_retrain_pipeline,
args=(request.markets, request.trials or 50),
daemon=True,
)
thread.start()
return {
"status": "triggered",
"message": "Retrain pipeline started in background",
"reason": request.reason,
"markets": request.markets or "all",
"trials": request.trials or 50,
}
@app.get("/v1/admin/retrain/status")
async def admin_retrain_status() -> dict[str, Any]:
"""Check retrain pipeline status."""
return {**_retrain_state}
if __name__ == "__main__":
port = int(os.getenv("PORT", "8000"))
uvicorn.run("main:app", host="0.0.0.0", port=port, reload=True)
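The retrain endpoint above reads `_retrain_state["running"]` and then sets it in a separate step, while the background thread updates the same dict when it finishes. The following is a minimal sketch, not the project's implementation, of how that check-and-set could be made atomic with a `threading.Lock`. The class `RetrainGuard` and its methods `try_start`, `finish`, and `snapshot` are hypothetical names introduced only for illustration; the state keys mirror the ones in `_retrain_state`.

```python
import threading
from datetime import datetime
from typing import Any, Dict, Optional


class RetrainGuard:
    """Hypothetical helper: lock-protected retrain state (illustrative only)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._state: Dict[str, Any] = {
            "running": False,
            "last_started": None,
            "last_completed": None,
            "last_status": None,
            "last_error": None,
        }

    def try_start(self) -> bool:
        """Mark the pipeline as running; return False if it already is."""
        with self._lock:
            if self._state["running"]:
                return False
            self._state.update({
                "running": True,
                "last_started": datetime.now().isoformat(),
                "last_status": "running",
                "last_error": None,
            })
            return True

    def finish(self, error: Optional[str] = None) -> None:
        """Record completion; a non-None error marks the run as failed."""
        with self._lock:
            self._state.update({
                "running": False,
                "last_completed": datetime.now().isoformat(),
                "last_status": "failed" if error else "success",
                "last_error": error,
            })

    def snapshot(self) -> Dict[str, Any]:
        """Return a copy so callers cannot mutate the shared state."""
        with self._lock:
            return dict(self._state)
```

Under this sketch, the endpoint would return `already_running` whenever `try_start()` is false, and the background function would call `finish()` in both its success and failure branches.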
@@ -0,0 +1,356 @@
"""
Basketball V25 Predictor Package
=================================
Loads trained XGBoost + LightGBM models for basketball market predictions:
- ML (Money Line — home / away win)
- Total (Over/Under total points)
- Spread (ATS home cover / away cover)
Model files live in this directory:
xgb_basketball_v25_{market}.json — XGBoost (primary)
lgb_basketball_v25_{market}.txt — LightGBM (ensemble)
feature_cols.json — ordered feature list
"""
from __future__ import annotations
import json
import os
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
# ── Constants ─────────────────────────────────────────────────────────────────
_DIR = os.path.dirname(os.path.abspath(__file__))
_MARKETS = ("ml", "total", "spread")
# ── Output dataclass ──────────────────────────────────────────────────────────
@dataclass
class BasketballMatchPrediction:
"""Complete basketball match prediction output."""
match_id: str
home_team_name: str
away_team_name: str
league_name: str = ""
# Money Line
ml_home_prob: float = 0.50
ml_away_prob: float = 0.50
ml_pick: str = ""
ml_confidence: float = 0.0
# Total (Over/Under)
total_line: float = 0.0
total_over_prob: float = 0.50
total_under_prob: float = 0.50
total_pick: str = ""
total_confidence: float = 0.0
# Spread (ATS)
spread_home_line: float = 0.0
spread_home_prob: float = 0.50
spread_away_prob: float = 0.50
spread_pick: str = ""
spread_confidence: float = 0.0
# Meta
model_version: str = "basketball_v25"
risk_level: str = "MEDIUM"
analysis_details: Dict[str, Any] = field(default_factory=dict)
market_board: Dict[str, Any] = field(default_factory=dict)
def to_dict(self) -> Dict[str, Any]:
return {
"match_id": self.match_id,
"home_team": self.home_team_name,
"away_team": self.away_team_name,
"league": self.league_name,
"model": self.model_version,
"risk_level": self.risk_level,
"ml": {
"home_prob": round(self.ml_home_prob * 100, 1),
"away_prob": round(self.ml_away_prob * 100, 1),
"pick": self.ml_pick,
"confidence": round(self.ml_confidence, 1),
},
"total": {
"line": self.total_line,
"over_prob": round(self.total_over_prob * 100, 1),
"under_prob": round(self.total_under_prob * 100, 1),
"pick": self.total_pick,
"confidence": round(self.total_confidence, 1),
},
"spread": {
"home_line": self.spread_home_line,
"home_prob": round(self.spread_home_prob * 100, 1),
"away_prob": round(self.spread_away_prob * 100, 1),
"pick": self.spread_pick,
"confidence": round(self.spread_confidence, 1),
},
"market_board": self.market_board,
"analysis_details": self.analysis_details,
}
# ── Predictor ─────────────────────────────────────────────────────────────────
class BasketballV25Predictor:
"""
Ensemble basketball predictor using XGBoost + LightGBM models.
Markets:
- ml → home/away win probability
- total → over/under total points
- spread → home/away ATS cover
"""
def __init__(self) -> None:
self.feature_cols: List[str] = self._load_feature_cols()
self.models: Dict[str, Any] = {}
self._load_models()
print(f"✅ BasketballV25Predictor ready ({len(self.models)} models loaded)")
# ── Setup ──────────────────────────────────────────────────────────────
def _load_feature_cols(self) -> List[str]:
path = os.path.join(_DIR, "feature_cols.json")
try:
with open(path, "r") as f:
return json.load(f)
except Exception as e:
print(f"⚠️ [Basketball] Could not load feature_cols.json: {e}")
return []
def _load_models(self) -> None:
for market in _MARKETS:
xgb_path = os.path.join(_DIR, f"xgb_basketball_v25_{market}.json")
lgb_path = os.path.join(_DIR, f"lgb_basketball_v25_{market}.txt")
xgb_model = self._try_load_xgb(xgb_path, market)
lgb_model = self._try_load_lgb(lgb_path, market)
if xgb_model is not None or lgb_model is not None:
self.models[market] = {"xgb": xgb_model, "lgb": lgb_model}
def _try_load_xgb(self, path: str, market: str) -> Optional[Any]:
if not os.path.exists(path):
return None
try:
import xgboost as xgb # type: ignore[import-not-found]
m = xgb.XGBClassifier()
m.load_model(path)
return m
except Exception as e:
print(f"⚠️ [Basketball] XGB {market} load failed: {e}")
return None
def _try_load_lgb(self, path: str, market: str) -> Optional[Any]:
if not os.path.exists(path):
return None
try:
import lightgbm as lgb # type: ignore[import-not-found]
with open(path, "r", encoding="utf-8") as f:
model_str = f.read()
return lgb.Booster(model_str=model_str)
except Exception as e:
print(f"⚠️ [Basketball] LGB {market} load failed: {e}")
return None
# ── Inference ──────────────────────────────────────────────────────────
def _build_feature_row(self, odds_data: Dict[str, Any], **kwargs: Any) -> "Any":
"""Build a single-row DataFrame aligned to training feature columns."""
import pandas as pd # type: ignore[import-not-found]
row: Dict[str, float] = {}
for col in self.feature_cols:
row[col] = float(kwargs.get(col) or odds_data.get(col) or 0.0)
# Map common odds keys
row["ml_home_odds"] = float(odds_data.get("ml_h") or 0.0)
row["ml_away_odds"] = float(odds_data.get("ml_a") or 0.0)
row["total_line"] = float(odds_data.get("tot_line") or 0.0)
row["total_over_odds"] = float(odds_data.get("tot_o") or 0.0)
row["total_under_odds"] = float(odds_data.get("tot_u") or 0.0)
row["spread_home_line"] = float(odds_data.get("spread_home_line") or 0.0)
row["spread_home_odds"] = float(odds_data.get("spread_h") or 0.0)
row["spread_away_odds"] = float(odds_data.get("spread_a") or 0.0)
# Implied probabilities
def _imp(odd: float) -> float:
return (1.0 / odd) if odd > 1.01 else 0.5
ml_h = row["ml_home_odds"]
ml_a = row["ml_away_odds"]
if ml_h > 1.01 and ml_a > 1.01:
raw = _imp(ml_h) + _imp(ml_a)
row["implied_home"] = _imp(ml_h) / raw
row["implied_away"] = _imp(ml_a) / raw
row["odds_overround"] = raw - 1.0
tot_o = row["total_over_odds"]
tot_u = row["total_under_odds"]
if tot_o > 1.01 and tot_u > 1.01:
raw = _imp(tot_o) + _imp(tot_u)
row["implied_total_over"] = _imp(tot_o) / raw
row["implied_total_under"] = _imp(tot_u) / raw
sp_h = row["spread_home_odds"]
sp_a = row["spread_away_odds"]
if sp_h > 1.01 and sp_a > 1.01:
raw = _imp(sp_h) + _imp(sp_a)
row["implied_spread_home"] = _imp(sp_h) / raw
row["implied_spread_away"] = _imp(sp_a) / raw
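# Select columns explicitly so their order matches feature_cols (falls back to dict order if the list failed to load).
return pd.DataFrame([row], columns=self.feature_cols or None)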
def _ensemble_predict(self, market: str, df: "Any") -> List[float]:
"""Return [p_class0, p_class1] from XGB+LGB ensemble."""
models = self.models.get(market, {})
xgb_model = models.get("xgb")
lgb_model = models.get("lgb")
probs_list: List[List[float]] = []
if xgb_model is not None:
try:
p = xgb_model.predict_proba(df)
probs_list.append([float(p[0][0]), float(p[0][1])])
except Exception as e:
print(f"⚠️ [Basketball] XGB {market} inference failed: {e}")
if lgb_model is not None:
try:
p_raw = lgb_model.predict(df)
p1 = float(p_raw[0]) if len(p_raw.shape) == 1 else float(p_raw[0][1])
probs_list.append([1.0 - p1, p1])
except Exception as e:
print(f"⚠️ [Basketball] LGB {market} inference failed: {e}")
if not probs_list:
return [0.5, 0.5]
p0 = sum(p[0] for p in probs_list) / len(probs_list)
p1 = sum(p[1] for p in probs_list) / len(probs_list)
total = (p0 + p1) or 1.0
return [p0 / total, p1 / total]
# ── Public API ─────────────────────────────────────────────────────────
def predict(
self,
match_id: str,
home_team_id: str,
away_team_id: str,
home_team_name: str = "",
away_team_name: str = "",
league_id: str = "",
league_name: str = "",
odds_data: Optional[Dict[str, Any]] = None,
sidelined_data: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> BasketballMatchPrediction:
odds = odds_data or {}
prediction = BasketballMatchPrediction(
match_id=match_id,
home_team_name=home_team_name,
away_team_name=away_team_name,
league_name=league_name,
)
# Sidelined impact
home_sl = int((sidelined_data or {}).get("homeTeam", {}).get("totalSidelined", 0) or 0)
away_sl = int((sidelined_data or {}).get("awayTeam", {}).get("totalSidelined", 0) or 0)
kwargs.setdefault("home_sidelined_count", float(home_sl))
kwargs.setdefault("away_sidelined_count", float(away_sl))
kwargs.setdefault("sidelined_diff", float(home_sl - away_sl))
kwargs.setdefault("missing_players_impact", float(home_sl + away_sl) / 10.0)
if not self.models:
print("⚠️ [Basketball] No models loaded — returning neutral defaults")
return prediction
try:
df = self._build_feature_row(odds, **kwargs)
# ── ML ──
ml_probs = self._ensemble_predict("ml", df)
prediction.ml_home_prob = ml_probs[0]
prediction.ml_away_prob = ml_probs[1]
prediction.ml_pick = "1" if ml_probs[0] >= ml_probs[1] else "2"
prediction.ml_confidence = max(ml_probs) * 100.0
# ── Total ──
prediction.total_line = float(odds.get("tot_line") or 0.0)
tot_probs = self._ensemble_predict("total", df)
prediction.total_over_prob = tot_probs[1]
prediction.total_under_prob = tot_probs[0]
total_line = prediction.total_line
prediction.total_pick = (
f"Over {total_line}" if tot_probs[1] >= tot_probs[0] else f"Under {total_line}"
)
prediction.total_confidence = max(tot_probs) * 100.0
# ── Spread ──
prediction.spread_home_line = float(odds.get("spread_home_line") or 0.0)
sp_probs = self._ensemble_predict("spread", df)
prediction.spread_home_prob = sp_probs[0]
prediction.spread_away_prob = sp_probs[1]
home_line = prediction.spread_home_line
away_line = -home_line
prediction.spread_pick = (
f"Home {home_line:+.1f}" if sp_probs[0] >= sp_probs[1] else f"Away {away_line:+.1f}"
)
prediction.spread_confidence = max(sp_probs) * 100.0
# Market board summary
prediction.market_board = {
"ML": {
"1": f"{prediction.ml_home_prob * 100:.0f}%",
"2": f"{prediction.ml_away_prob * 100:.0f}%",
},
"Totals": {
f"Over {total_line}": f"{prediction.total_over_prob * 100:.0f}%",
f"Under {total_line}": f"{prediction.total_under_prob * 100:.0f}%",
},
"Spread": {
f"Home {home_line:+.1f}": f"{prediction.spread_home_prob * 100:.0f}%",
f"Away {away_line:+.1f}": f"{prediction.spread_away_prob * 100:.0f}%",
},
}
# Risk
top_conf = max(prediction.ml_confidence, prediction.total_confidence, prediction.spread_confidence)
prediction.risk_level = "LOW" if top_conf >= 65 else "MEDIUM" if top_conf >= 55 else "HIGH"
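prediction.analysis_details = {
"model_version": "basketball_v25",
"markets_predicted": list(self.models.keys()),
# Count the non-None models per market; iterate the values directly instead of re-indexing the dict with them.
"ensemble_size": {m: sum(1 for model in v.values() if model is not None) for m, v in self.models.items()},
}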
except Exception as e:
print(f"⚠️ [Basketball] Prediction failed for {match_id}: {e}")
return prediction
# ── Singleton factory ──────────────────────────────────────────────────────────
_predictor: Optional[BasketballV25Predictor] = None
def get_basketball_v25_predictor() -> BasketballV25Predictor:
"""Return the singleton BasketballV25Predictor (lazy-loaded)."""
global _predictor
if _predictor is None:
_predictor = BasketballV25Predictor()
return _predictor
__all__ = [
"BasketballMatchPrediction",
"BasketballV25Predictor",
"get_basketball_v25_predictor",
]
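The predictor above derives implied probabilities from decimal odds by normalizing the two inverse odds so they sum to 1, and it averages the class probabilities of whichever models loaded. The sketch below restates those two steps as standalone functions so the arithmetic can be checked in isolation. The names `normalized_implied` and `average_probs` are hypothetical and do not exist in the package; the `1.01` odds threshold and the `[0.5, 0.5]` fallback are taken from the code above.

```python
from typing import List, Optional, Tuple


def normalized_implied(odd_a: float, odd_b: float) -> Optional[Tuple[float, float, float]]:
    """Return (p_a, p_b, overround) for a two-way market, or None for unusable odds."""
    if odd_a <= 1.01 or odd_b <= 1.01:
        return None  # same threshold the predictor uses before computing implied values
    raw_a = 1.0 / odd_a
    raw_b = 1.0 / odd_b
    raw = raw_a + raw_b  # exceeds 1.0 by the bookmaker margin
    return raw_a / raw, raw_b / raw, raw - 1.0


def average_probs(probs_list: List[List[float]]) -> List[float]:
    """Average [p_class0, p_class1] pairs and renormalize; neutral if the list is empty."""
    if not probs_list:
        return [0.5, 0.5]
    p0 = sum(p[0] for p in probs_list) / len(probs_list)
    p1 = sum(p[1] for p in probs_list) / len(probs_list)
    total = (p0 + p1) or 1.0
    return [p0 / total, p1 / total]


if __name__ == "__main__":
    print(normalized_implied(1.8, 2.1))
    print(average_probs([[0.3, 0.7], [0.5, 0.5]]))
```

For a two-way market the normalized probability of side A reduces to `odd_b / (odd_a + odd_b)`, which is a quick way to sanity-check a feature row by hand.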
@@ -0,0 +1,101 @@
[
"home_overall_elo",
"away_overall_elo",
"elo_diff",
"home_home_elo",
"away_away_elo",
"home_form_elo",
"away_form_elo",
"home_form_score",
"away_form_score",
"form_score_diff",
"home_points_avg",
"away_points_avg",
"points_avg_diff",
"home_conceded_avg",
"away_conceded_avg",
"conceded_avg_diff",
"home_net_rating",
"away_net_rating",
"net_rating_diff",
"home_win_rate",
"away_win_rate",
"win_rate_diff",
"home_winning_streak",
"away_winning_streak",
"streak_diff",
"home_rest_days",
"away_rest_days",
"rest_diff",
"home_rebounds_avg",
"away_rebounds_avg",
"rebounds_diff",
"home_assists_avg",
"away_assists_avg",
"assists_diff",
"home_steals_avg",
"away_steals_avg",
"steals_diff",
"home_blocks_avg",
"away_blocks_avg",
"blocks_diff",
"home_turnovers_avg",
"away_turnovers_avg",
"turnovers_diff",
"home_fg_pct",
"away_fg_pct",
"fg_pct_diff",
"home_three_pt_pct",
"away_three_pt_pct",
"three_pt_pct_diff",
"home_ft_pct",
"away_ft_pct",
"ft_pct_diff",
"home_q1_avg",
"away_q1_avg",
"home_q4_avg",
"away_q4_avg",
"home_conc_rebounds_avg",
"away_conc_rebounds_avg",
"home_conc_assists_avg",
"away_conc_assists_avg",
"home_conc_turnovers_avg",
"away_conc_turnovers_avg",
"home_conc_fg_pct",
"away_conc_fg_pct",
"home_conc_three_pt_pct",
"away_conc_three_pt_pct",
"h2h_total_matches",
"h2h_home_win_rate",
"h2h_avg_points",
"h2h_avg_margin",
"h2h_over_total_rate",
"h2h_home_cover_rate",
"league_avg_points",
"league_home_win_rate",
"league_over_total_rate",
"league_home_cover_rate",
"ml_home_odds",
"ml_away_odds",
"implied_home",
"implied_away",
"total_line",
"total_over_odds",
"total_under_odds",
"implied_total_over",
"implied_total_under",
"spread_home_line",
"spread_home_odds",
"spread_away_odds",
"implied_spread_home",
"implied_spread_away",
"odds_overround",
"home_sidelined_count",
"away_sidelined_count",
"sidelined_diff",
"missing_players_impact",
"total_points_form",
"total_points_allowed_form",
"projected_total_delta_vs_line",
"projected_margin_vs_spread"
]
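The feature list above has 99 entries, and the LightGBM text model later in this diff declares `max_feature_idx=98`, which is also 99 features. Because that model stores generic names (`Column_0`, `Column_1`, ...), the order of this JSON list is the only link between a feature name and its column position. The following is a rough sketch of a consistency check one could run at load time. The function names `expected_feature_count` and `check_alignment` are hypothetical, and the parsing assumes only the `key=value` header layout visible in the dump.

```python
from typing import List


def expected_feature_count(model_text: str) -> int:
    """Read max_feature_idx from a LightGBM text model header and return the feature count."""
    for line in model_text.splitlines():
        if line.startswith("max_feature_idx="):
            return int(line.split("=", 1)[1]) + 1  # the index is zero-based
    raise ValueError("max_feature_idx not found in model header")


def check_alignment(feature_cols: List[str], model_text: str) -> None:
    """Raise ValueError if the feature list cannot match the model's input width."""
    expected = expected_feature_count(model_text)
    if len(feature_cols) != expected:
        raise ValueError(
            f"feature list has {len(feature_cols)} names, model expects {expected}"
        )
    if len(set(feature_cols)) != len(feature_cols):
        raise ValueError("feature list contains duplicate names")


if __name__ == "__main__":
    header = "tree\nversion=v4\nnum_class=1\nmax_feature_idx=98\nobjective=binary sigmoid:1\n"
    print(expected_feature_count(header))
```

A check like this only catches a wrong count or duplicated names; it cannot detect two features swapped in order, so the list still has to be written by the same code that builds the training matrix.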
@@ -0,0 +1,655 @@
tree
version=v4
num_class=1
num_tree_per_iteration=1
label_index=0
max_feature_idx=98
objective=binary sigmoid:1
feature_names=Column_0 Column_1 Column_2 Column_3 Column_4 Column_5 Column_6 Column_7 Column_8 Column_9 Column_10 Column_11 Column_12 Column_13 Column_14 Column_15 Column_16 Column_17 Column_18 Column_19 Column_20 Column_21 Column_22 Column_23 Column_24 Column_25 Column_26 Column_27 Column_28 Column_29 Column_30 Column_31 Column_32 Column_33 Column_34 Column_35 Column_36 Column_37 Column_38 Column_39 Column_40 Column_41 Column_42 Column_43 Column_44 Column_45 Column_46 Column_47 Column_48 Column_49 Column_50 Column_51 Column_52 Column_53 Column_54 Column_55 Column_56 Column_57 Column_58 Column_59 Column_60 Column_61 Column_62 Column_63 Column_64 Column_65 Column_66 Column_67 Column_68 Column_69 Column_70 Column_71 Column_72 Column_73 Column_74 Column_75 Column_76 Column_77 Column_78 Column_79 Column_80 Column_81 Column_82 Column_83 Column_84 Column_85 Column_86 Column_87 Column_88 Column_89 Column_90 Column_91 Column_92 Column_93 Column_94 Column_95 Column_96 Column_97 Column_98
feature_infos=none none none none none none none [0:100] [0:100] [-100:100] [54:128.875] [61:138] [-41.333333333333329:24] [61:130.125] [59:133] [-50:34.916666666666671] [-28.666666666666657:32.5] [-24.5:27] [-35.5:31.833333333333329] [0:1] [0:1] [-1:1] [0:12] [0:12] [-12:10] [0.91666666666666663:186.1875] [0.85416666666666663:185.08333333333337] [-179.02083333333334:184.16666666666663] [20:54] [22:56] [-21:15.833333333333332] [11:34.5] [9:36] [-15:13.333333333333332] [2:13] [3:13] [-8:5.5] [0.875:9] [0:12] [-9:4.875] [7:22] [8:23] [-10:10] [0.41025641025641019:0.67647058823529416] [0.39730639730639727:0.76744186046511631] [-0.3174418604651163:0.22647058823529409] [0.20419254658385089:0.4825000000000001] [0.1875:0.54166666666666663] [-0.22171945701357459:0.17923280423280419] [0.47826086956521741:0.87898457080200498] [0.42857142857142849:0.88636363636363635] [-0.29647435897435898:0.38962801311591622] [11:39.666666666666664] [15.5:40] [10:31.875] [11.5:34] [15:56] [17:57] [12:32.125] [10:38] [8:20] [7:22] [0.40000000000000002:0.69047619047619047] [0.3902439024390244:0.71739130434782605] [0.23753976670201479:0.51851851851851849] [0.1707317073170731:0.54545454545454541] [0:8] [0:1] [120:295] [-38:36] [0:1] [0:1] [140.39130434782609:233.09999999999999] [0:1] [0:1] [0:1] [1.05:4.1100000000000003] [1.05:4.1299999999999999] [0.2034883720930232:0.79729729729729726] [0.20270270270270269:0.79651162790697683] [145.5:257.5] [1.3:7.7599999999999998] [1.4099999999999999:7.5599999999999996] [0.1789976133651551:0.75977653631284914] [0.24022346368715081:0.82100238663484482] [-15.5:12.5] [1.3:2.7200000000000002] [1.1799999999999999:2.7799999999999998] [0.36298076923076922:0.64953271028037374] [0.35046728971962621:0.63701923076923073] [0.1918833727344364:0.21555133935749429] none none none none [132:250.6875] [126.25:256.625] [-63.5:88.6875] [-35.5:27.5]
tree_sizes=2312 2651 2758 2331 2348 2534 2554 2662 2359 2770 1932 1926 2552 2145 2460 2555 1931 2453 1831 2020 2352 2348 2035
Tree=0
num_leaves=21
num_cat=0
split_feature=77 77 95 50 40 97 37 25 73 56 81 57 35 17 50 97 25 23 23 9
split_gain=87.208 23.2965 12.6569 10.5307 10.1901 9.83622 9.59439 7.56522 7.2843 6.81795 6.60435 5.65211 7.74871 6.22559 5.61137 5.14301 3.28504 2.92219 2.6911 0.0931822
threshold=1.3750000000000002 2.6550000000000007 166.34375000000003 0.73851279305095374 11.062500000000002 5.9687500000000009 3.7750000000000004 1.8802083333333337 0.5344873150105709 30.803571428571434 1.5750000000000004 41.062500000000007 6.6937500000000005 -4.7749999999999977 0.71663103786665172 -3.4241071428571384 7.7864583333333348 1.0000000180025095e-35 1.0000000180025095e-35 -39.015151515151508
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=3 2 14 -1 -4 7 -7 -6 -3 -10 -5 12 -11 -13 -2 -16 18 -17 -12 -20
right_child=1 8 4 10 5 6 -8 -9 9 11 16 13 -14 -15 15 17 -18 -19 19 -21
leaf_value=-0.14765628269715012 -0.15600540791535678 -0.23353299922727577 -0.12151989070972055 -0.13344020462290634 -0.11780070546597361 -0.21612555668738542 -0.16327023115717296 -0.15881183656013212 -0.15004174704520915 -0.17986005139594724 -0.10382337530156519 -0.2025678370168939 -0.21889492254600434 -0.15072330828751174 -0.17027559642606713 -0.19596193574534582 -0.1164283857771044 -0.23353299922727577 -0.079223274212206221 -0.072514155733290175
leaf_weight=11.17881618440151 6.70728971064091 6.210453435778617 11.427234321832655 9.1914710849523527 7.94938039779663 9.1914710849523562 13.662997558712958 76.015950053930283 6.7072897106409064 14.159833833575247 8.943052947521215 6.4588715732097617 19.128196582198143 8.6946348100900632 6.9557078480720582 7.4525441229343361 8.1977985352277738 5.9620352983474731 5.9620352983474714 7.4525441229343414
leaf_count=45 27 25 46 37 32 37 55 306 27 57 36 26 77 35 28 30 33 24 24 30
internal_value=-0.160244 -0.171796 -0.16307 -0.113358 -0.157421 -0.161262 -0.184527 -0.154929 -0.192463 -0.187838 -0.103712 -0.193072 -0.202291 -0.172821 -0.187739 -0.198187 -0.0947688 -0.21266 -0.0868269 -0.075496
internal_weight=257.61 206.684 145.325 50.9257 118.247 106.82 22.8545 83.9653 61.3593 55.1488 39.7469 48.4415 33.288 15.1535 27.0776 20.3703 30.5554 13.4146 22.3576 13.4146
internal_count=1037 832 585 205 476 430 92 338 247 222 160 195 134 61 109 82 123 54 90 54
is_linear=0
shrinkage=1
Tree=1
num_leaves=24
num_cat=0
split_feature=78 78 95 46 40 97 37 41 13 50 75 83 22 17 50 97 15 81 65 43 23 43 25
split_gain=80.2042 21.5716 11.6989 11.0897 9.36546 9.08525 8.91556 8.1723 7.75226 7.09679 6.96929 6.56155 6.17385 5.70382 5.20301 4.788 4.74542 3.71513 3.26193 3.12932 2.73996 1.35961 0.11562
threshold=0.39285714285714285 0.68693349935065817 166.34375000000003 0.34153389001132711 11.062500000000002 5.9687500000000009 3.7750000000000004 14.612500000000002 109.93750000000001 0.74692579813599125 0.36503968253968255 0.47970222696600762 1.5000000000000002 6.5625000000000009 0.71663103786665172 -3.4241071428571384 1.5166666666666659 1.635 0.35759117986701622 0.52557970719916269 1.0000000180025095e-35 0.51924929840975398 3.0312500000000004
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=3 2 14 8 -4 7 -7 -6 9 -1 16 -12 17 21 -2 -16 18 -13 -3 -20 -17 22 -5
right_child=1 10 4 13 5 6 -8 -9 -10 -11 11 12 -14 -15 15 20 -18 -19 19 -21 -22 -23 -24
leaf_value=0.0055530680640734674 0.0032845293651423612 -0.072766645625454485 0.036292826760507588 0.077436849336701308 -0.0022894176915147862 -0.054725292524358454 -0.003689958048423233 0.027881485430822688 -0.013760354637319356 0.059619560028970849 -0.050594773409676115 -0.0085801885492360844 -0.033840627756109222 0.036366913393967221 -0.010423432121785391 -0.035188847843671808 -0.021864921616209863 0.033823144597002808 -0.060759072260146665 -0.021284474578964662 -0.071669256311519158 0.058239915950701049 0.085149025961743521
leaf_weight=6.4684473276138368 6.7090961784124401 9.4034554809331912 11.457648575305937 5.9847112745046669 65.60038697719574 9.142817825078966 13.658771887421606 18.391589343547821 6.4761682301759711 9.7236045598983747 7.1794854700565365 7.4327530264854449 8.1723358631134015 9.9696762263774854 6.9495052695274415 7.4284560978412575 10.15483947098255 5.9542565494775772 6.6812664419412595 6.190190851688385 5.9189314842224121 5.978186532855033 6.4756923913955688
leaf_count=26 27 38 46 24 264 37 55 74 26 39 29 30 33 40 28 30 41 24 27 25 24 24 26
internal_value=-0.000786037 -0.0118905 -0.00349945 0.0440926 0.00192543 -0.00176177 -0.0241537 0.00431706 0.0232273 0.0380209 -0.0318166 -0.0174741 -0.00644452 0.0607419 -0.0272534 -0.0373477 -0.0445269 0.0102799 -0.0548581 -0.0417748 -0.0513662 0.0739213 0.0814449
internal_weight=257.502 206.426 145.257 51.0765 118.251 106.794 22.8016 83.992 22.6682 16.1921 61.1686 28.7388 21.5593 28.4083 27.006 20.2969 32.4298 13.387 22.2749 12.8715 13.3474 18.4386 12.4604
internal_count=1037 832 585 205 476 430 92 338 91 65 247 116 87 114 109 82 131 54 90 52 54 74 50
is_linear=0
shrinkage=0.04
Tree=2
num_leaves=25
num_cat=0
split_feature=77 76 10 46 51 75 75 42 12 51 74 44 9 61 86 41 49 17 85 44 27 49 74 97
split_gain=73.8475 19.9927 10.9898 10.1665 9.79528 8.50828 11.1493 9.23472 9.00307 8.63936 8.14649 7.11143 7.43675 10.0884 8.56935 6.6516 6.48284 5.22574 5.21615 4.45089 3.01631 2.88722 1.39723 0.105613
threshold=1.3750000000000002 1.2250000000000003 80.937500000000014 0.34153389001132711 0.011242161620033652 0.44508064516129037 0.39517441860465125 -0.61249999999999971 2.5499999999999976 0.012462680598456451 0.40642170329670335 0.55897334407178179 8.7121212121212164 11.866071428571431 1.635 14.225000000000001 0.78414366883116904 6.5625000000000009 6.0000000000000009 0.51258507460016867 -0.10937499999999993 0.73610958414604988 0.62250000000000016 1.1937499999999945
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=3 11 19 4 18 6 8 10 -4 -9 -7 12 13 -2 -15 20 -8 22 -1 -3 -14 -21 23 -5
right_child=1 2 5 17 -6 7 16 9 -10 -11 -12 -13 15 14 -16 -17 -18 -19 -20 21 -22 -23 -24 -25
leaf_value=0.017705783755189462 0.028974497745101052 -0.0060737041061852126 0.028384924190391733 0.081917709440956332 -0.011007695119626166 0.051971315475459197 -0.05418693058345065 -0.015490981812325239 -0.030486429608756988 0.017134873390643509 0.01007092726093781 -0.071582917658417497 -0.024872499607636107 0.0039725729814389819 -0.046514665309970234 -0.0035756501646253529 -0.0037995538175229883 0.034855017993211231 0.06661466771004268 -0.026672382631287159 -0.057894168683677763 -0.065766238281201289 0.054894232678188361 0.074556160849369266
leaf_weight=6.7373062372207659 7.4330984354019192 7.1768550276756313 11.678280815482141 6.4950603991746956 8.7123459726572019 11.683709740638731 11.923531427979471 30.992172449827194 6.4527826756238928 22.351678267121315 20.364999443292618 5.9138691127300254 5.9248394817113939 9.3952521085739154 12.583362787961958 6.4179672747850409 6.2149500101804724 9.9856856316328031 7.2363171875476837 6.1660670638084394 17.493629351258278 5.9290464669466019 5.9967442750930777 5.9974743723869324
leaf_count=27 30 29 47 26 35 47 48 125 26 90 82 24 24 38 51 26 25 40 29 25 71 24 24 24
internal_value=-0.000757575 -0.0114353 -0.00296399 0.0422564 0.0222795 0.00148162 -0.0147493 0.00837551 0.00743281 -0.00182038 0.0253462 -0.0297572 -0.0255824 -0.011309 -0.0249328 -0.0396526 -0.0369223 0.058172 0.0430335 -0.0310287 -0.0495397 -0.0458363 0.0707651 0.0783835
internal_weight=257.257 206.096 140.934 51.1609 22.686 121.662 36.2695 85.3926 18.1311 53.3439 32.0487 65.162 59.2481 29.4117 21.9786 29.8364 18.1385 28.475 13.9736 19.272 23.4185 12.0951 18.4893 12.4925
internal_count=1037 832 568 205 91 490 146 344 73 215 129 264 240 119 89 121 73 114 56 78 95 49 74 50
is_linear=0
shrinkage=0.04
Tree=3
num_leaves=21
num_cat=0
split_feature=77 77 95 46 51 86 49 42 73 69 54 7 31 85 50 81 61 44 53 77
split_gain=68.0698 18.5538 10.4158 9.36539 9.01462 7.14355 8.22377 7.01134 6.54248 6.34821 10.3019 8.78745 4.815 4.79475 4.79 3.98009 3.96113 2.92643 3.60798 2.21945
threshold=1.3750000000000002 2.6550000000000007 166.34375000000003 0.34153389001132711 0.011242161620033652 2.2450000000000006 0.80827483213138251 -0.61249999999999971 0.5344873150105709 4.6500000000000012 23.062500000000004 40.833333333333343 23.550000000000004 6.0000000000000009 0.71663103786665172 1.7750000000000001 12.062500000000002 0.53723539037745083 24.803571428571434 1.925
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=3 2 14 4 13 6 7 -4 -3 11 -11 -10 -8 -1 -2 19 -13 18 -5 -16
right_child=1 8 5 17 -6 -7 12 -9 9 10 -12 16 -14 -15 15 -17 -18 -19 -20 -21
leaf_value=0.016987942399224069 0.0035814571364199662 -0.068907535589063609 0.01429912015371851 0.079098125792394905 -0.010575188509827679 -0.033022257030977734 0.059900457392998388 -0.008250813675967944 0.00010937594628987166 -0.040766144429216389 0.025542986379799158 -0.019382488279346841 0.013986965287345072 0.063856791784991249 -0.032605400222488659 -0.0094169329033003141 -0.05275351833756161 0.039347381597312048 0.041332957240490006 -0.064822733951046208
leaf_weight=6.7410361915826815 6.703504994511607 6.0858818590641013 35.323686808347702 11.494772866368299 8.7060475498437864 8.6918699741363508 5.9581677168607703 58.758967056870461 9.3814801126718503 6.6312184035778037 8.6243968755006772 7.6371410340070751 9.4523572921752912 7.2466183453798294 7.3598874658346194 6.4097239673137656 22.335135906934738 10.743333801627157 6.2475921660661689 6.3931281566619873
leaf_count=27 27 25 142 46 35 35 24 237 38 27 35 31 38 29 30 26 91 43 25 26
internal_value=-0.000730438 -0.0109991 -0.00322902 0.0405507 0.0213801 0.00188154 0.0046523 0.000215644 -0.0295683 -0.0251842 -0.00327986 -0.0336755 0.0317385 0.0412694 -0.0257105 -0.0354492 -0.0442504 0.0558234 0.0657999 -0.0475817
internal_weight=256.926 205.747 145.051 51.1794 22.6937 118.185 109.493 94.0827 60.6953 54.6094 15.2556 39.3538 15.4105 13.9877 26.8662 20.1627 29.9723 28.4857 17.7424 13.753
internal_count=1037 832 585 205 91 476 441 379 247 222 62 160 62 56 109 82 122 114 71 56
is_linear=0
shrinkage=0.04
Tree=4
num_leaves=21
num_cat=0
split_feature=77 76 10 46 38 52 90 13 63 35 27 50 50 12 36 17 44 49 74 45
split_gain=62.8093 17.2545 9.84388 8.65527 8.50818 8.94286 7.35721 7.1083 6.79729 7.71767 6.73811 6.55324 5.90783 5.04228 9.19595 4.75428 4.11592 2.77884 1.29074 0.102719
threshold=1.3750000000000002 1.2250000000000003 80.937500000000014 0.34153389001132711 2.6904761904761911 26.562500000000004 0.19724305626860048 109.93750000000001 0.55212996425566885 6.2750000000000012 -0.020833333333333648 0.76335870360422431 0.74692579813599125 7.3125000000000009 -0.26190476190476181 6.5625000000000009 0.51258507460016867 0.73610958414604988 0.62250000000000016 -0.0013187490579748998
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=3 8 16 7 11 6 -6 12 9 -2 -7 -4 -1 14 -11 18 -3 -18 19 -5
right_child=1 2 4 15 5 10 -8 -9 -10 13 -12 -13 -14 -15 -16 -17 17 -19 -20 -21
leaf_value=0.0050560182108375287 0.01454555314741844 -0.0053774531433530308 0.0088078452630964664 0.076512274772428487 0.012372105311854016 0.04280070757761658 -0.015898578376521092 -0.014892135795344198 -0.057603106440945573 -0.013716056812372755 -2.743210612548636e-05 0.06049972094546327 0.054367671817810027 -0.0053464224261316716 -0.054858503781032956 0.031440123411904081 -0.024849777283710722 -0.063361316106082435 0.050431436557637821 0.069244124813442359
leaf_weight=6.4694478660821977 7.8782515972852698 7.1609281748533276 10.659276410937311 6.4840846359729793 19.36934831738472 10.460701778531076 61.473121464252472 6.4756214618682852 10.231325536966322 16.168815225362774 13.415295347571371 6.2101935893297187 9.7394945174455625 11.545304596424101 18.798230275511745 9.9960098117589933 6.1246920526027662 5.871805265545845 5.9897889196872702 5.9808590859174755
leaf_count=26 32 29 43 26 78 42 248 26 42 66 54 25 39 47 77 40 25 24 24 24
internal_value=-0.0007041 -0.010581 -0.00272469 0.0389631 0.00147434 -0.00277256 -0.0091251 0.0205332 -0.0276919 -0.0220654 0.0187367 0.0278373 0.034686 -0.0282665 -0.0358342 0.0536577 -0.029375 -0.0436996 0.0656918 0.0730249
internal_weight=256.503 205.367 140.745 51.1353 121.588 104.718 80.8425 22.6846 64.6219 54.3906 23.876 16.8695 16.2089 46.5124 34.967 28.4507 19.1574 11.9965 18.4547 12.4649
internal_count=1037 832 568 205 490 422 326 91 264 222 96 68 65 190 143 114 78 49 74 50
is_linear=0
shrinkage=0.04
Tree=5
num_leaves=23
num_cat=0
split_feature=77 77 49 8 59 55 59 9 40 35 50 74 36 58 97 86 49 8 95 56 52 48
split_gain=59.2137 16.5141 9.9144 11.7486 12.1383 8.99339 10.5611 8.66857 10.2694 8.14362 7.72674 7.72484 12.0149 7.4839 7.07919 6.69422 4.48655 4.69909 4.36454 2.6211 3.68984 2.53359
threshold=1.3550000000000002 2.2550000000000003 0.75657876131221735 31.666666666666668 17.267857142857146 19.535714285714288 18.937500000000004 -2.6785714285714266 12.937500000000002 8.3875000000000011 0.73851279305095374 0.54575358851674649 0.066964285714285574 24.354166666666668 2.9821428571428616 1.7450000000000003 0.77175549922512843 65.151515151515156 173.36607142857147 40.38750000000001 20.937500000000004 0.0091659973707001029
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=10 2 3 -2 -5 -3 18 8 -8 11 -1 14 -13 -6 -4 -9 19 -18 21 20 -12 -7
right_child=1 5 9 4 13 6 7 15 -10 -11 16 12 -14 -15 -16 -17 17 -19 -20 -21 -22 -23
leaf_value=0.011681667207113649 0.033908823443212138 0.017536154054844273 0.021215365730383975 0.03251466321515277 -0.035698618619601191 -0.069217953298740698 0.040352083533208016 -0.037782244586161744 -0.02378355379185353 -0.016829989475423059 0.067034861961705272 0.039469073342078345 -0.028155083253979179 0.009683122942141572 0.064205028012414941 -0.003245990990143062 0.008774257662963655 0.054818660291170745 -0.016792700372560584 0.076628087701740399 0.026483773237715093 -0.038854738339966714
leaf_weight=10.698686346411703 8.1725911200046522 9.060679689049719 26.58554495871067 6.4189013689756385 30.358332172036171 10.46654355525971 8.4112317115068453 24.458471849560738 7.6072484105825415 11.665961533784865 8.4696357846260089 6.9492753297090557 10.640220895409582 7.1914227455854407 7.9649787694215766 14.189488068222998 6.9667989164590818 7.2231719940900803 5.8842895776033393 9.7066867947578412 6.2319085150957099 7.58248247206211
leaf_count=43 33 37 107 26 123 43 34 100 31 47 34 28 43 29 32 58 28 29 24 39 25 31
internal_value=0.0024258 -0.0070979 0.00280729 -0.0101318 -0.0183177 -0.0201993 -0.0245493 -0.0148477 0.00989378 0.0133809 0.0417609 0.0201404 -0.00143807 -0.0270072 0.0311258 -0.0251024 0.0500983 0.0322124 -0.046709 0.0604964 0.0498455 -0.0564622
internal_weight=252.905 203.608 115.947 52.1412 43.9687 87.6604 78.5998 54.6664 16.0185 63.806 49.2969 52.14 17.5895 37.5498 34.5505 38.648 38.5982 14.19 23.9333 24.4082 14.7015 18.049
internal_count=1024 826 468 211 178 358 321 223 65 257 198 210 71 152 139 158 155 57 98 98 59 74
is_linear=0
shrinkage=0.04
Tree=6
num_leaves=23
num_cat=0
split_feature=76 78 27 46 86 11 55 29 65 47 50 46 18 61 81 36 44 9 13 43 81 44
split_gain=54.7148 15.7289 10.7788 14.3867 9.58583 8.78499 8.46201 10.8596 9.81775 8.03411 7.677 7.45074 5.6669 7.59424 5.03116 4.82371 4.8014 4.00902 3.45566 2.28622 3.41567 0.00497756
threshold=2.1950000000000007 0.62900904125363766 -2.3020833333333326 0.34016393531964906 2.1450000000000005 122.43750000000001 19.854166666666668 39.812500000000007 0.3770676732818225 0.36593380560885896 0.73563272044162131 0.35736791305409527 -3.1874999999999996 12.690476190476192 1.5150000000000003 0.25833333333333336 0.49644821746513607 -8.7121212121212164 117.43750000000001 0.50247426971013531 1.7450000000000003 0.53567202664757096
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 2 11 9 5 -5 -3 8 17 16 -2 15 -9 -14 -12 -1 -4 -8 19 21 -21 -16
right_child=10 6 3 4 -6 -7 7 12 -10 -11 14 -13 13 -15 18 -17 -18 -19 -20 20 -22 -23
leaf_value=0.070324761157363386 0.0084031851759769002 0.012486475233554927 -0.014274497772434848 0.02062654133633882 -0.032240672579085888 -0.029937743516828159 -0.020241102484534694 0.022406478756216045 0.0013966323082333749 0.0049975598343196606 0.016976514726108942 -0.0032581054169875008 0.011631640379196602 -0.033623046124736253 0.073425640497252462 0.021730508558563181 -0.050815423135628625 -0.053986984893246243 0.028244335395794146 0.069298036861864615 0.028035945730723436 0.075014904249462944
leaf_weight=6.9250844269990939 9.2112289071083051 11.516124725341795 9.6139747798442858 48.390351086854935 7.907471075654029 6.2022370845079413 7.0930576324462846 7.9190109968185416 8.3109913170337659 11.12387755513191 6.4693672657012931 7.1976774334907523 9.5812086164951342 15.581949874758719 6.6890226900577598 6.1896365433931351 14.328032165765761 27.358511343598369 5.9492755085229865 6.9541786164045316 5.9617312103509903 5.9650087505578995
leaf_count=28 37 47 39 195 32 25 29 32 34 45 26 29 39 64 27 25 58 113 24 28 24 24
internal_value=0.00233149 -0.00659898 0.00293381 -0.00258517 0.00892003 0.014882 -0.0194618 -0.0243128 -0.0376256 -0.0230916 0.0411639 0.029443 -0.00710444 -0.0163917 0.0491075 0.0473902 -0.0361423 -0.0470392 0.0557024 0.062091 0.0502523 0.0741748
internal_weight=252.439 205.239 117.878 97.5659 62.5001 54.5926 87.3609 75.8447 42.7626 35.0659 47.1998 20.3124 33.0822 25.1632 37.9886 13.1147 23.942 34.4516 31.5192 25.5699 12.9159 12.654
internal_count=1024 834 476 394 252 220 358 311 176 142 190 82 135 103 153 53 97 142 127 103 52 51
is_linear=0
shrinkage=0.04
Tree=7
num_leaves=24
num_cat=0
split_feature=76 78 27 31 32 42 58 74 15 40 46 26 76 97 98 73 62 60 46 61 90 97 42
split_gain=50.8947 11.0049 9.32122 10.5702 10.4712 9.27532 8.02666 7.19583 7.79066 6.93078 7.78095 6.76995 6.46148 6.98083 6.77168 6.35566 6.68601 5.47664 5.22172 4.56413 4.30903 3.83111 0.826657
threshold=1.8750000000000002 0.70062845714719713 -9.1770833333333304 18.401785714285719 20.645833333333339 -2.774999999999999 26.401785714285719 0.63245614035087727 -0.43749999999999994 11.866071428571431 0.37542254586435625 6.7812500000000009 2.4650000000000003 -3.4241071428571384 -6.3184523809523787 0.55383792139841859 0.51117533329139231 13.35416666666667 0.35034659555724629 12.062500000000002 0.19829454556786633 0.28125000000000006 0.19761904761904778
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 2 -1 7 11 -6 12 8 -4 19 15 -5 14 -14 -2 -11 -17 18 -16 -3 -15 -19 -22
right_child=6 9 3 4 5 -7 -8 -9 -10 10 -12 -13 13 20 17 16 -18 21 -20 -21 22 -23 -24
leaf_value=0.036083213444415528 0.053277627847354556 -0.021782130288108433 -0.0091325820237097618 0.049355787080620245 0.039957936624339677 -0.010162507947377566 0.066494523434590488 0.007861663962773149 -0.056001702923737531 -0.05035447873268746 0.025153534242404262 -0.0002279551315397668 0.0074196720389386114 0.025979484890861368 -0.0040180493742294409 -0.035722871972219031 0.012557904583466907 -0.037148801270085199 0.040822377329808285 -0.065566429091563946 0.054687788523123476 0.0082770984108304652 0.074704189299082305
leaf_weight=9.12299896776676 5.9481788575649253 6.5940756052732459 9.5768276453018242 10.393245324492456 6.4848928600549689 66.375781863927841 8.9054767787456495 7.6045115441083899 13.925430610775946 9.6051787436008471 6.321256920695304 7.647834599018096 7.4389976263046291 6.437974855303767 7.4326431751251283 8.6875483840703982 9.7281266301870328 5.9339717775583249 9.4233583807945234 9.0198750644922239 5.9406516849994642 5.9482765197753906 7.4301211833953857
leaf_count=37 24 27 39 42 26 268 36 31 57 40 26 31 30 26 30 36 40 24 38 38 24 24 30
internal_value=0.00224138 -0.00900344 -0.00291719 -0.00583339 0.00105391 -0.00570159 0.0309866 -0.0259598 -0.0369032 -0.0249795 -0.0149333 0.0283366 0.0258809 0.0404581 0.0144298 -0.0239765 -0.0102184 0.00638917 0.02105 -0.0470755 0.0528654 -0.0144085 0.0658109
internal_weight=251.927 181.088 131.132 122.009 90.9018 72.8607 70.8397 31.1068 23.5023 49.9561 34.3421 18.0411 61.9342 27.2477 34.6864 28.0209 18.4157 28.7382 16.856 15.614 19.8087 11.8822 13.3708
internal_count=1024 738 531 494 367 294 286 127 96 207 142 73 250 110 140 116 76 116 68 65 80 48 54
is_linear=0
shrinkage=0.04
Tree=8
num_leaves=21
num_cat=0
split_feature=76 78 27 10 18 88 65 58 48 76 75 50 55 57 45 61 52 16 37 64
split_gain=47.0285 10.2297 8.57167 9.90629 8.49863 11.7596 8.93582 7.47628 6.70549 7.3988 6.88535 6.62498 6.53575 6.19036 7.25158 6.45563 5.31391 3.33594 2.78119 2.27511
threshold=1.8750000000000002 0.70062845714719713 -9.1770833333333304 82.267857142857153 -4.3541666666666705 0.49939467312348679 0.34321545037197632 26.401785714285719 0.021174040484134952 2.2250000000000005 0.64837092731829593 0.73563272044162131 19.854166666666668 41.062500000000007 0.0031867569273384007 12.062500000000002 20.937500000000004 -6.3124999999999991 2.6125000000000003 0.36408812715877942
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 2 -1 17 5 -5 -6 8 9 10 -2 -11 -3 15 -15 -14 -13 -4 -19 -17
right_child=7 12 3 4 6 -7 -8 -9 -10 11 -12 16 13 14 -16 19 -18 18 -20 -21
leaf_value=0.034533388049788116 0.023208498381564548 0.010691617096435792 -0.0099350499244745989 -0.013374654384074001 -0.030791758397498405 0.036260912427308507 0.00098553848763632086 0.064191286864412847 0.055107303489570873 -0.0011209989712481605 -0.023658767436963126 0.066673274412926392 -0.0085936349487782514 0.018687811460451158 -0.034036012019189175 -0.064822499725983529 0.027488397354637369 -0.062556001891513391 -0.025921626974491768 -0.035460654646497967
leaf_weight=9.1505375951528531 10.182640746235849 7.3519092053174964 8.031925320625307 12.611105665564539 19.997042715549469 19.363006785511971 48.494191810488701 8.8569523096084577 9.8766392171382886 7.7215904891490963 9.8834861069917661 8.600008040666582 6.5588535666465786 7.2764098048210171 9.7889975756406766 12.264306232333185 15.548548832535742 6.0707129538059217 7.3061331808567047 6.4392972588539115
leaf_count=37 41 30 33 51 81 78 196 36 40 31 40 35 27 30 41 52 63 25 30 27
internal_value=0.00215494 -0.00866469 -0.0028044 -0.00560779 -0.000343403 0.0166838 -0.00829232 0.029821 0.0248962 0.019151 0.000124224 0.0311307 -0.0241206 -0.0301671 -0.0115554 -0.0427397 0.0414433 -0.0303121 -0.0425471 -0.0547138
internal_weight=251.374 180.704 131.025 121.874 100.465 31.9741 68.4912 70.6699 61.8129 51.9363 20.0661 31.8701 49.6798 42.3279 17.0654 25.2625 24.1486 21.4088 13.3768 18.7036
internal_count=1024 738 531 494 406 129 277 286 250 210 81 129 207 177 71 106 98 88 55 79
is_linear=0
shrinkage=0.04
Tree=9
num_leaves=25
num_cat=0
split_feature=78 78 49 49 88 63 41 34 8 52 58 40 46 48 78 75 50 73 62 52 52 89 61 30
split_gain=43.4968 9.51409 8.12139 9.24718 9.16177 10.1476 7.62239 8.2415 7.26498 6.99079 6.98605 6.58508 7.12245 6.22401 6.86036 6.34938 6.1593 5.87831 6.08977 5.0965 5.00399 4.68377 4.18568 2.8965
threshold=0.44444541990764314 0.70062845714719713 0.75617247776303531 0.7645406242769327 0.53179766719418564 0.50671990539912803 11.35416666666667 6.322916666666667 57.738095238095248 21.387500000000003 26.401785714285719 11.866071428571431 0.37542254586435625 0.021174040484134952 0.37587731721902251 0.64837092731829593 0.73563272044162131 0.55383792139841859 0.51117533329139231 20.145833333333339 20.937500000000004 0.48209613271233459 12.062500000000002 0.14583333333333218
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=10 2 4 21 5 9 -5 -8 -7 -2 13 22 17 14 16 -16 -1 -13 -19 -6 -18 -4 -3 -21
right_child=1 11 3 6 19 8 7 -9 -10 -11 -12 12 -14 -15 15 -17 20 18 -20 23 -22 -23 -24 -25
leaf_value=-0.0010761821563351997 -0.0504521132857738 -0.020888576146344352 0.063381196536312773 0.031300149180180752 -0.0091861880185250058 0.032629739057098435 0.026755708854807866 -0.014960107933453847 -0.014669550109854696 0.0014514898694815139 0.062084955523258885 -0.047888703413600249 0.024824846695046907 0.053148590255660884 0.022272644170989585 -0.022746839774489575 0.064616726934294752 -0.033579092631573867 0.012687158247053376 -0.033188047394488003 0.026459678285886239 0.014128239830164315 -0.063091908329842183 -0.070867855155932621
leaf_weight=7.7214295715093639 8.7155217230319995 6.5707600712776175 5.9724875986576063 10.170844271779059 7.820334509015086 14.327638953924181 9.4181414246559125 38.771787792444229 8.1518821418285352 7.9293809384107581 8.7924163937568647 9.4247248619794863 6.3216789960861197 9.8318816572427732 10.185996487736704 9.8688251972198469 8.5203529894351977 8.5675740987062472 9.7116009145975095 7.3497301340103132 15.507762670516966 6.3991550505161285 8.7900275439023954 5.8721687346696854
leaf_count=31 36 27 24 41 32 58 38 157 33 32 36 40 26 40 41 40 35 36 40 30 63 26 38 24
internal_value=0.00207304 -0.00834038 -0.00269623 0.00649292 -0.0134991 -0.00205203 -0.000166073 -0.00680725 0.0154773 -0.025726 0.0287296 -0.0233003 -0.0134864 0.0239715 0.018434 0.000118899 0.0300029 -0.0222285 -0.00899815 -0.0347829 0.0399901 0.0379054 -0.0450389 -0.0499226
internal_weight=250.714 180.285 130.899 70.7324 60.1667 39.1244 58.3608 48.1899 22.4795 16.6449 70.4287 49.3864 34.0256 61.6362 51.8044 20.0548 31.7495 27.7039 18.2792 21.0422 24.0281 12.3716 15.3608 13.2219
internal_count=1024 738 531 286 245 159 236 195 91 68 286 207 142 250 210 81 129 116 76 86 98 50 65 54
is_linear=0
shrinkage=0.04
Tree=10
num_leaves=17
num_cat=0
split_feature=77 77 46 27 54 57 10 43 97 43 40 54 36 76 74 37
split_gain=38.196 8.66905 8.17628 11.664 9.48906 7.12349 6.31154 5.47718 7.96438 4.78449 4.32492 4.31373 3.38709 2.74196 2.35164 0.967464
threshold=1.3550000000000002 3.2850000000000006 0.30594944598976864 -7.9791666666666661 21.145833333333339 39.267857142857146 82.267857142857153 0.517251622710314 -4.4062499999999991 0.53429184335543722 13.062500000000002 21.387500000000003 0.25833333333333336 1.6050000000000002 0.49258241758241766 4.354166666666667
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=7 2 13 11 6 -6 -5 8 -1 14 -9 -4 15 -2 -3 -10
right_child=1 9 3 4 5 -7 -8 10 12 -11 -12 -13 -14 -15 -16 -17
leaf_value=0.0010938350878875731 -0.056271458252691184 -0.059825392191432909 0.059788669423910978 -0.039005664596234592 0.013116780911492858 -0.0075331438797635181 -0.0060155405307781556 -0.0070478699384603017 0.069320208232492922 -0.0094715712429085195 0.037723823315383334 0.012087841826597247 0.032449698237748015 -0.017612173431570215 -0.028207255364314641 0.049213209433963845
leaf_weight=6.4049855172634116 5.8483278751373273 7.5306213200092333 5.9356311261653882 18.997604206204414 43.460418909788132 69.426650807261467 18.137169808149338 8.3646507412195188 10.436919420957571 11.170329853892325 5.8781122714281073 6.2036552876234055 10.014484807848929 5.8936225026845932 7.5243068188428879 6.0471841692924491
leaf_count=26 24 32 24 79 177 282 74 34 43 47 24 25 41 24 32 25
internal_value=-0.000237742 -0.00786817 -0.00463523 -0.00230133 -0.00535295 0.000416875 -0.0228928 0.0321521 0.0411221 -0.0293062 0.0114298 0.0354117 0.0507973 -0.0368673 -0.044023 0.061944
internal_weight=247.275 200.128 173.903 162.161 150.022 112.887 37.1348 47.1463 32.9036 26.2253 14.2428 12.1393 26.4986 11.742 15.0549 16.4841
internal_count=1013 820 709 661 612 459 153 193 135 111 58 49 109 48 64 68
is_linear=0
shrinkage=0.04
Tree=11
num_leaves=17
num_cat=0
split_feature=77 77 46 86 10 15 15 82 23 59 38 40 31 28 26 76
split_gain=35.3726 8.07222 7.58645 6.89209 6.39755 8.67145 7.68194 5.36328 4.76149 6.47719 8.94184 5.29839 3.97447 5.87629 4.03464 2.57596
threshold=1.3550000000000002 3.2850000000000006 0.30594944598976864 2.1550000000000007 81.291666666666671 -3.3523809523809471 -7.5624999999999991 1.7550000000000001 1.5000000000000002 22.062500000000004 3.0625000000000004 13.550000000000002 21.550000000000004 41.937500000000007 7.2708333333333348 1.6050000000000002
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=8 2 15 4 5 -4 -6 -5 9 10 -1 -10 14 -14 -3 -2
right_child=1 12 3 7 6 -7 -8 -9 11 -11 -12 -13 13 -15 -16 -17
leaf_value=0.028732650584569908 -0.054519811278789007 -0.064491542130753823 0.014624725366248799 -0.0085976653870859041 -0.017414530959728048 -0.043244668184488785 0.0081240637328352395 -0.056808196775163779 0.027059814047174627 0.044509876383840984 -0.03008579237684399 0.070067154331982448 0.011727805368536705 -0.043007463941734977 -0.020379238714872804 -0.016944262076966699
leaf_weight=10.437584459781652 5.7958323061466199 7.0121503621339816 6.5479190945625332 9.3588718920946103 22.927828788757324 11.280139639973639 105.82874263823032 6.0974729955196372 9.9562236517667788 11.169452026486395 6.8490484803915015 8.4930274337530118 7.1303854584693891 5.6052840352058411 6.2962067276239395 5.8811570256948471
leaf_count=43 24 30 27 38 94 47 430 25 41 46 28 35 30 24 27 24
internal_value=-0.000229245 -0.00756918 -0.0044558 -0.00221187 0.000466882 -0.0219903 0.00357638 -0.0276165 0.0310303 0.0207685 0.0054285 0.046858 -0.0283359 -0.0123625 -0.0436219 -0.0355948
internal_weight=246.667 199.762 173.718 162.041 146.585 17.8281 128.757 15.4563 46.9053 28.4561 17.2866 18.4493 26.044 12.7357 13.3084 11.677
internal_count=1013 820 709 661 598 74 524 63 193 117 71 76 111 54 57 48
is_linear=0
shrinkage=0.04
Tree=12
num_leaves=23
num_cat=0
split_feature=77 77 52 75 51 26 74 17 75 82 54 32 27 33 81 83 85 32 72 61 77 52
split_gain=32.7624 7.68874 8.64181 8.25451 7.32214 6.8233 6.75593 6.44176 8.25947 8.69321 6.36363 7.17186 5.52164 5.23358 7.14313 7.28422 6.71979 4.33821 3.0057 2.88005 2.0103 1.97945
threshold=1.3550000000000002 2.7750000000000004 26.437500000000004 0.36503968253968255 0.0031256509629944006 8.0104166666666696 0.53482673267326752 5.7142857142857162 0.37048387096774199 1.5250000000000001 21.387500000000003 23.401785714285719 0.041666666666666609 0.48750000000000077 1.5450000000000002 0.49675474226372424 7.0000000000000009 23.937500000000004 168.45500000000001 12.866071428571431 3.5750000000000006 20.937500000000004
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=13 2 7 17 -4 6 -5 8 -2 -10 -9 -12 -6 14 -1 16 -16 18 -3 -15 -20 -17
right_child=1 3 4 5 12 -7 -8 10 9 -11 11 -13 -14 19 15 21 -18 -19 20 -21 -22 -23
leaf_value=-0.013704229151213245 0.027450875173411476 -0.023064966047954667 -0.0059624168736026485 -0.038589566739523094 0.056326441355657071 0.030446921359459631 0.016883396721917278 -0.016350382639718393 0.013627075998421501 -0.021633517290006031 0.05028258642359236 0.0009173349354000002 0.0072316487440658916 0.068524301803199614 -0.018133517520923233 0.068692298622940004 0.040971194599499342 -0.0050561417900614513 -0.063950555049639521 0.032780066279364729 -0.033589087879592569 0.036867444471177346
leaf_weight=7.3369961380958548 7.6220861226320258 8.4552565813064628 11.61829833686352 9.1469356268644351 8.95874175429344 7.0741493850946418 5.7027423083782187 10.210099041461946 13.296437010169027 70.522654443979263 8.4501094073057192 10.635228663682936 6.2034237682819358 6.9417874813079816 6.5973821580410021 6.2351381480693799 5.7690534591674805 6.4230835288763037 9.5591890364885312 7.5070647299289703 5.4950702935457221 6.273080438375473
leaf_count=30 31 36 47 39 36 29 24 42 54 288 34 43 25 29 27 26 24 27 41 31 24 26
internal_value=-0.000221457 -0.00728287 -0.00262557 -0.0205316 0.0179311 -0.00188419 -0.0172862 -0.00718521 -0.0124149 -0.01604 0.0091383 0.022774 0.0362399 0.0299509 0.0209788 0.0312089 0.00943936 -0.0341897 -0.0421493 0.049953 -0.0528681 0.0527316
internal_weight=246.034 199.374 147.517 51.8564 26.7805 21.9238 14.8497 120.737 91.4412 83.8191 29.2954 19.0853 15.1622 46.6605 32.2117 24.8747 12.3664 29.9326 23.5095 14.4489 15.0543 12.5082
internal_count=1013 820 600 220 108 92 63 492 373 342 119 77 61 193 133 103 51 128 101 60 65 52
is_linear=0
shrinkage=0.04
Tree=13
num_leaves=19
num_cat=0
split_feature=77 27 31 81 14 38 21 55 43 97 42 50 49 41 11 29 36 46
split_gain=30.365 7.27107 9.52487 12.2996 9.0589 7.41691 9.90203 7.25389 5.08675 7.54296 5.07106 4.74587 4.44817 3.59305 3.39298 3.01366 2.9941 0.448837
threshold=1.3550000000000002 -9.1770833333333304 19.267857142857146 1.7250000000000003 122.31250000000001 2.6904761904761911 0.083333333333333273 24.291666666666668 0.517251622710314 -3.4687499999999996 -1.0000000180025095e-35 0.74140914232826016 0.76659506707061065 13.937500000000002 82.225000000000009 31.645833333333339 0.25833333333333336 0.36329415664585835
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=8 10 3 13 5 6 -4 -7 9 -1 -2 -5 -10 15 -13 -3 17 -11
right_child=1 2 4 11 -6 7 -8 -9 12 16 -12 14 -14 -15 -16 -17 -18 -19
leaf_value=0.0041476899979937393 -0.0034999926541034228 -0.023850556255963173 0.062374335339732266 0.019436960650830851 -0.040563681465367818 -0.014906923971429929 -0.0074965637413479513 0.0059412375908438819 0.030706710820604975 0.065449418006224877 0.045123507823127261 -0.039381405838009649 -0.014252467415403541 -0.014092509822306006 -0.0010383997769504584 -0.056317419451033038 0.031129941687341473 0.051133321323921815
leaf_weight=8.055756524205206 6.7893940508365631 6.2344330102205339 6.6629877835512143 9.740582004189493 9.6089727431535703 49.483073160052299 6.3268143832683563 58.00315548479557 7.2527140378952009 9.1974637508392352 6.9395729303359968 6.112829089164733 6.8431497812271118 6.5173709690570822 9.326295539736746 17.179670557379723 9.3787097781896573 5.6604025810956946
leaf_count=33 28 26 27 40 40 203 26 237 30 39 28 26 28 27 39 73 39 24
internal_value=-0.000213824 -0.00700969 -0.00909187 -0.0230288 -0.00318742 -0.000206347 0.028343 -0.00365656 0.0289287 0.03768 0.0210777 -0.00242612 0.00888028 -0.0403607 -0.0162196 -0.0476725 0.0488255 0.0599954
internal_weight=245.313 198.925 185.196 55.1112 130.085 120.476 12.9898 107.486 46.3882 32.2923 13.729 25.1797 14.0959 29.9315 15.4391 23.4141 24.2366 14.8579
internal_count=1013 820 764 231 533 493 53 440 193 135 56 105 58 126 65 99 102 63
is_linear=0
shrinkage=0.04
Tree=14
num_leaves=22
num_cat=0
split_feature=77 77 46 75 82 55 32 17 38 46 57 30 61 33 98 65 51 27 98 47 61
split_gain=28.1527 7.12874 8.477 7.73926 6.50746 5.98775 7.72785 8.6859 7.74875 5.93269 5.06747 5.06776 4.95308 4.91574 6.80815 7.34665 4.31144 3.2933 2.95919 4.65978 2.76263
threshold=1.3550000000000002 2.7750000000000004 0.31724837366825204 0.36503968253968255 1.6250000000000002 25.387500000000003 23.162500000000005 5.4375000000000009 4.5357142857142865 0.38268646215659091 43.267857142857146 1.616071428571429 12.937500000000002 0.48750000000000077 -7.4374999999999991 0.33749469440953833 -0.013613105968960247 -1.0000000180025095e-35 -3.8124999999999996 0.35584581233114326 12.866071428571431
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=13 2 17 9 -5 6 7 -4 -8 18 11 -7 -6 14 -1 -16 -17 -2 19 -3 -15
right_child=1 3 5 4 12 10 8 -9 -10 -11 -12 -13 -14 20 15 16 -18 -19 -20 -21 -22
leaf_value=0.049192658968065413 -0.0067623296175654806 -0.0059097661371454915 -0.0050944451331990034 -0.03421558742429949 0.036155385236969206 0.014169839641633519 -0.043477105235728328 0.028316035394997729 0.0010581675575233177 0.003886886786396461 -0.0086753300433371153 0.054580295048802033 -0.0096808854134481682 0.065934772628735538 -0.023746444878321241 0.051631222052607971 0.0076433857618228972 -0.040918404046466461 -0.058216504004969173 -0.052238605686860437 0.030637453650431135
leaf_weight=8.7887725979089755 7.5061362385749844 6.7677073627710378 50.746787026524544 6.7268268465995815 7.4960924834012967 19.041952610015873 13.521969988942148 16.497406467795372 11.624812230467795 5.6693915277719489 10.139670595526693 6.716789945960044 7.5927126258611679 6.7980895787477476 7.8056097179651287 5.6656755805015573 9.6161311566829646 11.340498879551886 10.0033008903265 7.136577308177948 7.4203208535909653
leaf_count=37 31 29 208 29 31 77 55 67 47 24 41 27 32 29 32 24 40 47 44 31 31
internal_value=-0.000206417 -0.00674501 -0.00226535 -0.0195702 -0.00149628 0.00141462 -0.00397204 0.00310234 -0.0228894 -0.0329013 0.0152781 0.0247072 0.0130905 0.0279552 0.0192311 0.00782548 0.0239517 -0.0273149 -0.0416252 -0.0296887 0.0475138
internal_weight=244.623 198.529 147.136 51.3926 21.8156 128.289 92.391 67.2442 25.1468 29.577 35.8984 25.7587 15.0888 46.0946 31.8762 23.0874 15.2818 18.8466 23.9076 13.9043 14.2184
internal_count=1013 820 600 220 92 522 377 275 102 128 145 104 63 193 133 96 64 78 104 60 60
is_linear=0
shrinkage=0.04
Tree=15
num_leaves=23
num_cat=0
split_feature=77 78 26 31 57 17 43 18 88 36 86 43 43 83 26 73 29 74 16 55 49 68
split_gain=30.6549 12.7043 7.98721 10.8099 12.9808 8.83958 9.66696 7.6233 8.30341 7.00237 7.84435 6.65312 6.67534 5.83636 5.52421 4.95185 4.66908 4.3802 2.29615 1.92109 0.661162 0.1601
threshold=1.9950000000000003 0.33644919786096261 21.770833333333339 20.133928571428573 35.187500000000007 -5.0624999999999991 0.53857630126707601 1.8541666666666716 0.53866359200690284 0.82291666666666663 1.8550000000000002 0.45900318339543572 0.4818732325807798 0.47597575932472408 12.94791666666667 0.53851851851851873 32.38750000000001 0.45972972972972981 3.4375000000000004 22.133928571428573 0.73531540336276224 213.70833333333334
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 14 3 13 -5 17 -7 8 11 10 -9 -3 -13 -2 15 -1 -15 -6 -12 -17 -18 -21
right_child=2 7 -4 4 5 6 -8 9 -10 -11 18 12 -14 16 -16 19 20 -19 -20 21 -22 -23
leaf_value=0.010871073320075295 -0.01172908397497561 -0.046441338207332344 0.019672691938767486 0.046075447983197577 -0.061569404959666424 0.0074003860750678123 -0.03202888807961754 0.0051474644513716042 0.036884377615520249 -0.003400242373786107 0.074863846195407729 0.029393863715199694 -0.0074553467007233889 -0.02523852343490229 -0.00025978536454887422 0.034473210798287857 -0.04795709241166466 -0.021389238996026894 0.039593407313705566 0.06652884725665352 -0.0649082592606154 0.057420779027093437
leaf_weight=5.6847298443317476 11.006925001740454 5.903722643852233 9.1704158484935743 6.0783775299787512 7.7922967374324825 29.886689186096191 14.913227543234823 7.632679730653769 8.1395289301872236 10.812797814607618 5.8884793967008573 9.226744741201399 53.322611838579178 10.414219856262209 5.7322616279125205 5.7362669706344631 5.4735058695077923 9.8010887205600721 5.9245722442865372 6.8773611485958082 11.245072126388548 5.6040449738502502
leaf_count=24 47 25 38 25 33 123 62 31 33 44 24 38 218 44 24 24 24 42 24 29 48 24
internal_value=-0.00100969 0.011833 -0.0161489 -0.0192302 -0.00972405 -0.0151601 -0.00572503 0.00540609 -0.00130932 0.0224046 0.0367534 -0.00585081 -0.00201967 -0.0362963 0.0350059 0.0434633 -0.0462625 -0.0391855 0.0571747 0.0536336 -0.0593586 0.0624394
internal_weight=252.268 136.486 115.782 106.611 68.4717 62.3933 44.7999 106.851 76.5926 30.2585 19.4457 68.4531 62.5494 38.1397 29.6347 23.9024 27.1328 17.5934 11.8131 18.2177 16.7186 12.4814
internal_count=1048 562 486 448 285 260 185 437 314 123 79 281 256 163 125 101 116 75 48 77 72 53
is_linear=0
shrinkage=0.04
Tree=16
num_leaves=17
num_cat=0
split_feature=76 76 33 62 60 32 55 73 17 64 98 25 97 51 64 46
split_gain=28.3621 12.2502 8.00023 7.56697 8.78584 8.59503 10.2753 7.68395 7.41413 7.20524 7.00591 5.23347 2.9989 2.73623 2.05561 2.04323
threshold=1.4350000000000003 2.4650000000000003 4.2946428571428585 0.47839506172839513 11.437500000000002 16.162500000000005 19.854166666666668 0.53494186046511638 -7.4374999999999991 0.32494646077353467 5.2142857142857162 7.713541666666667 3.7187500000000004 -0.004282105062106599 0.35502948224071251 0.3406679635524858
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=3 2 8 15 12 -6 -7 -8 -2 -10 -11 13 14 -3 -5 -1
right_child=1 11 -4 4 5 6 7 -9 9 10 -12 -13 -14 -15 -16 -17
leaf_value=-0.029356541192986724 0.037955112610862249 0.059695040893734667 0.043145015879701695 -0.032469843140688813 -0.046002450801952505 0.045093595797352712 -0.042281712918251578 -0.0013799114657325063 0.028081928029134246 -0.010286395434884021 0.025869758711122577 0.011074792109595291 -0.016203413481459306 0.029391271532808353 -0.066050254966225394 -0.059556082391998284
leaf_weight=8.2328822761774045 8.3526985198259336 10.85255688428879 8.1492396891117078 6.0218881517648715 7.7615482956171027 6.8014763444662085 8.401523217558859 58.652556464076042 11.53392608463764 68.72949393093586 9.7969980090856534 10.286443859338759 7.4116774648427954 8.5023315101861971 5.6562657654285431 6.3488157242536536
leaf_count=36 34 47 33 26 33 28 36 245 47 282 40 43 31 36 24 27
internal_value=-0.000973738 0.0113848 0.00505797 -0.0155743 -0.0116748 -0.00596092 -0.00175292 -0.00650469 0.00190411 -0.00143945 -0.00577554 0.03413 -0.0361041 0.046383 -0.0487344 -0.0425053
internal_weight=251.492 136.204 106.562 115.289 100.707 81.6171 73.8556 67.0541 98.4131 90.0604 78.5265 29.6413 19.0898 19.3549 11.6782 14.5817
internal_count=1048 562 436 486 423 342 309 281 403 369 322 126 81 83 50 63
is_linear=0
shrinkage=0.04
Tree=17
num_leaves=22
num_cat=0
split_feature=76 76 33 26 31 57 43 97 60 18 96 88 18 83 26 39 73 7 55 45 68
split_gain=26.2331 11.4465 7.36965 7.34122 10.0074 11.7565 7.00539 6.9993 6.88815 10.1637 8.77606 8.04063 5.85928 5.44514 5.13063 4.51191 4.39613 4.1986 1.8532 1.64881 0.156111
threshold=1.4350000000000003 2.4650000000000003 4.2946428571428585 21.770833333333339 20.133928571428573 35.187500000000007 0.53857630126707601 4.7812500000000009 12.937500000000002 7.8125000000000009 227.02083333333337 0.50795806360821849 -4.2250000000000076 0.47597575932472408 12.94791666666667 -0.14642857142857144 0.53851851851851873 38.750000000000007 22.133928571428573 -0.022501766737042645 213.70833333333334
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=3 2 8 4 13 -6 7 -7 10 12 11 -2 -10 -1 16 -8 -3 -15 -18 -19 -20
right_child=1 14 -4 -5 5 6 15 -9 9 -11 -12 -13 -14 17 -16 -17 18 19 20 -21 -22
leaf_value=-0.010786926974099898 -0.010875891059966199 0.011190352480577235 0.041404682122757074 0.019321738062697564 0.044141656903284171 0.0050650165981334172 -0.0061370748531028195 -0.031081442136894771 0.0018664483503017856 0.037131975018793693 0.045073208415421694 0.031828378125081415 -0.026128862170482447 -0.019454829109928577 -0.00098318851507109848 -0.04740967911986986 0.032230447411375152 -0.036142993762053356 0.064020839936514432 -0.062063497067667478 0.054909341919742859
leaf_weight=10.931022465229033 21.17684800922871 5.8557388037443223 8.1520167738199216 9.167889401316641 6.0991687178611746 30.766112342476845 6.2713132798671749 11.881261780858038 21.166444480419155 7.1458795964717856 10.812558114528654 10.578455105423926 27.506209731101993 7.7223432809114483 5.709469035267829 13.0707770884037 5.6604651659727123 5.5588853806257275 6.7105195671319944 13.371022954583166 5.4539281576871872
leaf_count=47 87 25 33 38 25 128 26 50 87 29 44 43 113 33 24 56 24 24 29 59 24
internal_value=-0.00093843 0.0109531 0.00485653 -0.0150137 -0.0179926 -0.00884725 -0.0140609 -0.00500514 0.00182825 -0.00741437 0.013948 0.00334993 -0.0139544 -0.034561 0.0330532 -0.0340278 0.0412595 -0.0443116 0.0511376 -0.0544518 0.0599357
internal_weight=250.768 135.929 106.538 114.84 105.672 68.0886 61.9895 42.6474 98.3864 55.8185 42.5679 31.7553 48.6727 37.5833 29.3901 19.3421 23.6807 26.6523 17.8249 18.9299 12.1644
internal_count=1048 562 436 486 448 285 260 178 403 229 174 130 200 163 126 82 102 116 77 83 53
is_linear=0
shrinkage=0.04
Tree=18
num_leaves=16
num_cat=0
split_feature=78 79 27 72 42 35 26 33 9 30 47 48 55 36 10
split_gain=24.4334 9.31464 8.11491 7.61266 8.41054 8.43587 6.62119 7.15164 9.45035 5.28003 4.33177 4.03099 3.63878 2.72794 1.40327
threshold=0.39285714285714285 0.25425685425685435 -9.1770833333333304 165.36681435533896 -3.4833333333333329 9.133928571428573 2.0104166666666674 0.48750000000000077 -37.987012987012982 -0.56249999999999989 0.37693449058527589 0.029982232931730853 20.062500000000004 0.19374999999999967 112.93750000000001
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=6 11 -3 10 -5 -6 14 8 -8 -10 12 13 -4 -2 -1
right_child=1 2 3 4 5 -7 7 -9 9 -11 -12 -13 -14 -15 -16
leaf_value=0.063369704073235666 -0.060192057816396222 0.028277022360530538 -0.016384609809915053 0.044115984475752157 -0.002478065592223366 -0.051567336834095104 0.039068426318397236 0.043825566156782063 0.011510346359810335 -0.030717352367799409 0.0045260873425609898 -0.0079952149779249845 -0.050175542136173798 -0.02646344151553905 0.03565855117598455
leaf_weight=6.1944728344678861 7.4922285825014132 11.610384106636046 9.429924696683889 5.9105293601751319 126.33686858415604 5.8609887659549704 9.0260636359453184 10.688595429062842 9.03087270259857 9.9656709432601982 5.7869970500469199 8.0647390484809858 11.101787701249121 7.8635760098695755 5.5375995486974716
leaf_count=27 34 47 40 24 521 24 39 46 38 41 24 36 48 35 24
internal_value=-0.000905355 -0.00719526 -0.00404233 -0.00632444 -0.00256724 -0.00465444 0.0239656 0.0159876 0.00536936 -0.0106425 -0.0260405 -0.0308938 -0.0346558 -0.0429199 0.0502899
internal_weight=249.901 199.458 176.037 164.427 138.108 132.198 50.4433 38.7112 28.0226 18.9965 26.3187 23.4205 20.5317 15.3558 11.7321
internal_count=1048 833 728 681 569 545 215 164 118 79 112 105 88 69 51
is_linear=0
shrinkage=0.04
Tree=19
num_leaves=18
num_cat=0
split_feature=76 76 62 60 60 32 55 33 17 64 86 25 97 63 12 64 81
split_gain=23.1068 9.98848 7.01309 7.92354 7.95102 8.25678 10.3579 6.71651 6.62639 6.59888 6.66891 4.82438 2.76159 2.56144 2.27179 1.91893 0.744688
threshold=1.4350000000000003 2.4650000000000003 0.47839506172839513 11.437500000000002 15.225000000000001 16.162500000000005 19.854166666666668 4.2946428571428585 -7.4374999999999991 0.32494646077353467 2.1850000000000005 7.713541666666667 3.7187500000000004 0.4972961247664654 -0.52499999999999847 0.35502948224071251 1.675
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=2 7 14 12 5 -5 -7 8 -2 -10 -11 13 15 -3 -1 -4 -15
right_child=1 11 3 4 -6 6 -8 -9 9 10 -12 -13 -14 16 -16 -17 -18
leaf_value=-0.020188921856989679 0.035859126989343022 0.020130209700355124 -0.030330032249617687 -0.042500863549311367 -0.05090993952028721 0.051411801287769635 -0.0022541904826031737 0.03956776202840788 0.026837614548232368 -0.0020673320148814839 -0.04439110542553347 0.008982843264200761 -0.014705883428948159 0.04205717024579949 -0.053006940553933922 -0.063196273867381317 0.061135640318500738
leaf_weight=5.4853323251008979 8.3379970937967283 5.4507957398891511 5.8967574089765566 7.1303204596042624 5.6088133603334418 6.3442998528480521 61.878040820360184 8.1366265416145307 11.526250436902044 71.76362943649292 6.4958909302949897 10.161600232124327 7.33974365890026 5.8689189553260785 8.7720085084438306 5.4874216467142105 7.4020354598760605
leaf_count=24 34 24 26 31 24 26 261 33 47 295 27 43 31 26 39 24 33
internal_value=-0.000872267 0.0103144 -0.0141404 -0.0103874 -0.00496412 -0.00154418 0.00273645 0.00464481 0.00174893 -0.00141873 -0.00558039 0.0311724 -0.0338375 0.0432163 -0.0403806 -0.0461723 0.0526984
internal_weight=249.086 135.144 113.943 99.6854 80.9615 75.3527 68.2223 106.26 98.1238 89.7858 78.2595 28.8834 18.7239 18.7218 14.2573 11.3842 13.271
internal_count=1048 562 486 423 342 318 287 436 403 369 322 126 81 83 63 50 59
is_linear=0
shrinkage=0.04
Tree=20
num_leaves=21
num_cat=0
split_feature=77 85 42 26 33 48 29 51 51 88 47 49 9 43 18 45 13 97 90 77
split_gain=21.5323 9.97521 10.2057 13.3659 8.32088 7.657 7.45302 7.27519 7.64625 7.10753 6.89749 6.08818 4.68215 4.36106 4.27259 4.17883 3.76513 3.4333 0.930107 0.105107
threshold=1.3050000000000004 -5.9999999999999991 1.0000000180025095e-35 9.0104166666666696 4.4375000000000009 0.059692514253813854 39.812500000000007 0.045203152003977003 0.023949864400422651 0.52433570113128136 0.36314281963351264 0.78248216953695626 -13.09523809523809 0.51328944240770225 -16.645833333333325 -0.0072528547698737485 85.062500000000014 1.0937500000000002 0.19841107122821819 3.1550000000000007
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=12 2 6 9 5 7 16 8 -3 15 -9 17 14 -14 -1 -4 -2 -8 -16 -18
right_child=1 4 3 -5 -6 -7 11 10 -10 -11 -12 -13 13 -15 18 -17 19 -19 -20 -21
leaf_value=0.013957264641046819 -0.027472132914840498 -0.0020934000360734932 0.022613724189803789 0.050832305755293579 0.037916740044630914 0.040588735510979898 0.027960766295875104 -0.003405364332683979 -0.031143520168862283 -0.05384418391600191 0.040275812485981034 -0.044330514984326927 0.030259337595429918 -0.015671661984799638 0.062645279427693656 -0.020045453140862098 -0.066399946636320761 -0.01585144878027565 0.043445571662247148 -0.058570658366565144
leaf_weight=7.1345093548297873 8.8107999712228793 87.334490463137627 6.9631915688514745 6.7626315653324118 9.8303922265768033 6.9817947149276725 5.6945501416921633 11.879387885332109 17.382111921906471 5.3163712173700324 11.272227853536604 5.8104494661092749 7.708228871226309 5.7933596074581146 6.5536312311887803 7.778242379426958 5.6191599071025831 5.7529874593019485 10.513165161013601 5.361001193523407
leaf_count=31 39 362 29 28 40 29 24 49 72 24 46 25 33 25 29 33 24 25 47 25
internal_value=0.00160475 -0.00342443 -0.0165911 0.00220167 0.002388 -0.000201998 -0.0301955 -0.00242923 -0.00691549 -0.0141945 0.0178624 -0.0109833 0.0294231 0.0105509 0.0399517 0.000104827 -0.0469487 0.00594283 0.0508182 -0.0625773
internal_weight=246.253 208.55 63.8694 26.8204 144.68 134.85 37.0489 127.868 104.717 20.0578 23.1516 17.258 37.7029 13.5016 24.2013 14.7414 19.791 11.4475 17.0668 10.9802
internal_count=1039 874 276 114 598 558 162 529 434 86 95 74 165 58 107 62 88 49 76 49
is_linear=0
shrinkage=0.04
Tree=21
num_leaves=21
num_cat=0
split_feature=77 85 42 26 25 51 29 88 90 36 49 14 9 18 61 43 13 59 90 77
split_gain=20.02 9.2622 9.52085 12.2814 7.88967 7.82256 7.04841 6.6998 6.53054 7.87452 5.68149 5.30824 4.42523 4.07084 4.07033 4.04241 3.62375 3.25279 0.898801 0.0879165
threshold=1.3050000000000004 -5.9999999999999991 1.0000000180025095e-35 9.0104166666666696 1.8541666666666667 0.080538861184771923 39.812500000000007 0.52433570113128136 0.19969875129044842 -1.0875000000000001 0.78248216953695626 118.93750000000001 -13.09523809523809 -16.645833333333325 13.816666666666668 0.51328944240770225 85.062500000000014 23.535714285714288 0.19841107122821819 3.1550000000000007
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=12 2 6 7 -3 8 16 14 11 -10 17 -6 13 -1 -4 -14 -2 -8 -15 -18
right_child=1 4 3 -5 5 -7 10 -9 9 -11 -12 -13 15 18 -16 -17 19 -19 -20 -21
leaf_value=0.013447805314569216 -0.026577780088672762 0.034080468035460186 0.017293051569916593 0.048557724592127684 0.007825469574374657 0.040280280315047955 -0.016199932862450934 -0.052623204127446775 -0.047796064224015833 -0.0064826859346349495 -0.043040720837339654 -0.024990244369414837 0.029249467101270982 0.061303641269563863 -0.025610761209309305 -0.014994478156697251 -0.064751343135519002 0.026458734247185672 0.042272167151271647 -0.057522795432648433
leaf_weight=7.10912762582302 8.7443605959415454 11.499579980969427 8.8282703608274513 6.7955194860696784 70.869819477200508 7.13891777396202 5.5701393187046069 5.2240076065063468 9.2205274105072004 37.017053097486496 5.7464897632598868 8.8745380789041501 7.6564510315656644 6.4316389113664689 5.9041270762681952 5.8124003708362579 5.5334616154432279 5.8780088722705841 10.375598087906836 5.2427052408456802
leaf_count=31 39 47 37 28 292 29 24 24 39 154 25 37 33 29 25 25 24 25 47 25
internal_value=0.00154432 -0.00329759 -0.0160366 0.0021131 0.00229294 -0.000453006 -0.029261 -0.0137021 -0.00276121 -0.0147212 -0.0105873 0.0041735 0.0284946 0.0388221 9.90026e-05 0.0101563 -0.0457098 0.005703 0.049555 -0.0612346
internal_weight=245.473 208.088 63.4671 26.7519 144.62 133.121 36.7152 19.9564 125.982 46.2376 17.1946 79.7444 37.3852 23.9164 14.7324 13.4689 19.5205 11.4481 16.8072 10.7762
internal_count=1039 874 276 114 598 551 162 86 522 193 74 329 165 107 62 58 88 49 76 49
is_linear=0
shrinkage=0.04
Tree=22
num_leaves=18
num_cat=0
split_feature=77 78 25 48 51 25 9 35 9 48 27 30 9 18 43 50 72
split_gain=18.621 8.70394 7.91958 7.02031 6.21305 7.48529 6.53645 5.17821 7.11278 5.11576 6.50119 6.92842 4.18572 3.88527 3.74781 0.796232 0.173612
threshold=1.3050000000000004 0.69515163347650677 1.7395833333333337 0.059692514253813854 0.043626967279394004 7.8802083333333348 8.333333333333341 6.6937500000000005 8.7121212121212164 -0.018202373884811395 0.98958333333333315 0.22500000000000142 -13.09523809523809 -16.645833333333325 0.51328944240770225 0.74854124348714812 225.04250000000002
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=12 2 -2 4 6 -6 -4 8 -3 -9 11 -11 13 -1 -14 -15 -17
right_child=1 7 3 -5 5 -7 -8 9 -10 10 -12 -13 14 15 -16 16 -18
leaf_value=0.012956313378187341 0.036677286509223014 0.027307971594871701 -0.0022506819015889979 0.032532340700343249 0.001223625287223337 0.048438323769350811 -0.026039625377917281 -0.053747110254001744 -0.024750955618137494 -0.0073465850093818175 0.015119548741830309 -0.055928592960106001 0.028278098346949592 0.036190897477542942 -0.014350124084243559 0.059344625765782907 0.049204905261110286
leaf_weight=7.0841730237007132 9.6115598380565626 8.2046131193637866 86.82922887802124 9.458725377917288 19.787809953093532 7.3747483491897574 23.477235645055771 8.3150889277458173 8.6016630232334119 10.108060508966451 7.039140820503234 8.7735196799039823 7.6037576496601087 5.6597618609666887 5.8300840854644775 5.8845966756343824 4.9951938986778259
leaf_count=31 39 35 359 39 81 30 97 39 39 45 31 40 33 25 25 27 24
internal_value=0.00148631 -0.00317645 0.00150065 -0.0008005 -0.00309401 0.0140426 -0.00731384 -0.0175205 0.000663559 -0.026447 -0.0176894 -0.0299207 0.0276052 0.0377427 0.00977809 0.0483592 0.0546892
internal_weight=244.639 207.581 156.539 146.928 137.469 27.1626 110.306 51.0421 16.8063 34.2358 25.9207 18.8816 37.0576 23.6237 13.4338 16.5396 10.8798
internal_count=1039 874 645 606 567 111 456 229 74 155 116 85 165 107 58 76 51
is_linear=0
shrinkage=0.04
end of trees
feature_importances:
Column_77=25
Column_46=15
Column_76=15
Column_97=14
Column_43=13
Column_50=13
Column_49=12
Column_27=11
Column_78=11
Column_9=10
Column_26=10
Column_51=10
Column_55=10
Column_61=10
Column_17=9
Column_18=8
Column_25=8
Column_36=8
Column_42=8
Column_52=8
Column_74=8
Column_75=8
Column_81=8
Column_33=7
Column_40=7
Column_48=7
Column_73=7
Column_86=7
Column_10=6
Column_31=6
Column_32=6
Column_44=6
Column_57=6
Column_88=6
Column_13=5
Column_29=5
Column_35=5
Column_60=5
Column_64=5
Column_85=5
Column_90=5
Column_15=4
Column_23=4
Column_30=4
Column_37=4
Column_38=4
Column_41=4
Column_45=4
Column_47=4
Column_54=4
Column_58=4
Column_59=4
Column_62=4
Column_65=4
Column_83=4
Column_95=4
Column_98=4
Column_8=3
Column_12=3
Column_63=3
Column_72=3
Column_82=3
Column_7=2
Column_11=2
Column_14=2
Column_16=2
Column_56=2
Column_68=2
Column_21=1
Column_22=1
Column_28=1
Column_34=1
Column_39=1
Column_53=1
Column_69=1
Column_79=1
Column_89=1
Column_96=1
parameters:
[boosting: gbdt]
[objective: binary]
[metric: binary_logloss]
[tree_learner: serial]
[device_type: cpu]
[data_sample_strategy: bagging]
[data: ]
[valid: ]
[num_iterations: 1200]
[learning_rate: 0.04]
[num_leaves: 31]
[num_threads: 4]
[seed: 42]
[deterministic: 0]
[force_col_wise: 0]
[force_row_wise: 0]
[histogram_pool_size: -1]
[max_depth: 6]
[min_data_in_leaf: 24]
[min_sum_hessian_in_leaf: 0.001]
[bagging_fraction: 0.84]
[pos_bagging_fraction: 1]
[neg_bagging_fraction: 1]
[bagging_freq: 5]
[bagging_seed: 400]
[bagging_by_query: 0]
[feature_fraction: 0.82]
[feature_fraction_bynode: 1]
[feature_fraction_seed: 30056]
[extra_trees: 0]
[extra_seed: 12879]
[early_stopping_round: 0]
[early_stopping_min_delta: 0]
[first_metric_only: 0]
[max_delta_step: 0]
[lambda_l1: 0]
[lambda_l2: 0]
[linear_lambda: 0]
[min_gain_to_split: 0]
[drop_rate: 0.1]
[max_drop: 50]
[skip_drop: 0.5]
[xgboost_dart_mode: 0]
[uniform_drop: 0]
[drop_seed: 17869]
[top_rate: 0.2]
[other_rate: 0.1]
[min_data_per_group: 100]
[max_cat_threshold: 32]
[cat_l2: 10]
[cat_smooth: 10]
[max_cat_to_onehot: 4]
[top_k: 20]
[monotone_constraints: ]
[monotone_constraints_method: basic]
[monotone_penalty: 0]
[feature_contri: ]
[forcedsplits_filename: ]
[refit_decay_rate: 0.9]
[cegb_tradeoff: 1]
[cegb_penalty_split: 0]
[cegb_penalty_feature_lazy: ]
[cegb_penalty_feature_coupled: ]
[path_smooth: 0]
[interaction_constraints: ]
[verbosity: -1]
[saved_feature_importance_type: 0]
[use_quantized_grad: 0]
[num_grad_quant_bins: 4]
[quant_train_renew_leaf: 0]
[stochastic_rounding: 1]
[linear_tree: 0]
[max_bin: 255]
[max_bin_by_feature: ]
[min_data_in_bin: 3]
[bin_construct_sample_cnt: 200000]
[data_random_seed: 175]
[is_enable_sparse: 1]
[enable_bundle: 1]
[use_missing: 1]
[zero_as_missing: 0]
[feature_pre_filter: 1]
[pre_partition: 0]
[two_round: 0]
[header: 0]
[label_column: ]
[weight_column: ]
[group_column: ]
[ignore_column: ]
[categorical_feature: ]
[forcedbins_filename: ]
[precise_float_parser: 0]
[parser_config_file: ]
[objective_seed: 16083]
[num_class: 1]
[is_unbalance: 0]
[scale_pos_weight: 1]
[sigmoid: 1]
[boost_from_average: 1]
[reg_sqrt: 0]
[alpha: 0.9]
[fair_c: 1]
[poisson_max_delta_step: 0.7]
[tweedie_variance_power: 1.5]
[lambdarank_truncation_level: 30]
[lambdarank_norm: 1]
[label_gain: ]
[lambdarank_position_bias_regularization: 0]
[eval_at: ]
[multi_error_top_k: 1]
[auc_mu_weights: ]
[num_machines: 1]
[local_listen_port: 12400]
[time_out: 120]
[machine_list_filename: ]
[machines: ]
[gpu_platform_id: -1]
[gpu_device_id: -1]
[gpu_use_dp: 0]
[num_gpu: 1]
end of parameters
pandas_categorical:null
@@ -0,0 +1,790 @@
tree
version=v4
num_class=1
num_tree_per_iteration=1
label_index=0
max_feature_idx=98
objective=binary sigmoid:1
feature_names=Column_0 Column_1 Column_2 Column_3 Column_4 Column_5 Column_6 Column_7 Column_8 Column_9 Column_10 Column_11 Column_12 Column_13 Column_14 Column_15 Column_16 Column_17 Column_18 Column_19 Column_20 Column_21 Column_22 Column_23 Column_24 Column_25 Column_26 Column_27 Column_28 Column_29 Column_30 Column_31 Column_32 Column_33 Column_34 Column_35 Column_36 Column_37 Column_38 Column_39 Column_40 Column_41 Column_42 Column_43 Column_44 Column_45 Column_46 Column_47 Column_48 Column_49 Column_50 Column_51 Column_52 Column_53 Column_54 Column_55 Column_56 Column_57 Column_58 Column_59 Column_60 Column_61 Column_62 Column_63 Column_64 Column_65 Column_66 Column_67 Column_68 Column_69 Column_70 Column_71 Column_72 Column_73 Column_74 Column_75 Column_76 Column_77 Column_78 Column_79 Column_80 Column_81 Column_82 Column_83 Column_84 Column_85 Column_86 Column_87 Column_88 Column_89 Column_90 Column_91 Column_92 Column_93 Column_94 Column_95 Column_96 Column_97 Column_98
feature_infos=none none none none none none none [0:100] [0:100] [-100:100] [54:128.875] [61:138] [-41.333333333333329:24] [61:130.125] [59:133] [-50:34.916666666666671] [-28.666666666666657:32.5] [-24.5:27] [-35.5:31.833333333333329] [0:1] [0:1] [-1:1] [0:12] [0:12] [-12:10] [0.91666666666666663:186.1875] [0.85416666666666663:185.08333333333337] [-179.02083333333334:184.16666666666663] [20:54] [22:56] [-21:15.833333333333332] [11:34.5] [9:36] [-15:13.333333333333332] [2:13] [3:13] [-8:5.5] [0.875:9] [0:12] [-9:4.875] [7:22] [8:23] [-10:10] [0.41025641025641019:0.67647058823529416] [0.39730639730639727:0.76744186046511631] [-0.3174418604651163:0.22647058823529409] [0.20419254658385089:0.4825000000000001] [0.1875:0.54166666666666663] [-0.22171945701357459:0.17923280423280419] [0.47826086956521741:0.87898457080200498] [0.42857142857142849:0.88636363636363635] [-0.29647435897435898:0.38962801311591622] [11:39.666666666666664] [15.5:40] [10:31.875] [11.5:34] [15:56] [17:57] [12:32.125] [10:38] [8:20] [7:22] [0.40000000000000002:0.69047619047619047] [0.3902439024390244:0.71739130434782605] [0.23753976670201479:0.51851851851851849] [0.1707317073170731:0.54545454545454541] [0:8] [0:1] [120:295] [-38:36] [0:1] [0:1] [140.39130434782609:233.09999999999999] [0:1] [0:1] [0:1] [1.05:4.1100000000000003] [1.05:4.1299999999999999] [0.2034883720930232:0.79729729729729726] [0.20270270270270269:0.79651162790697683] [145.5:257.5] [1.3:7.7599999999999998] [1.4099999999999999:7.5599999999999996] [0.1789976133651551:0.75977653631284914] [0.24022346368715081:0.82100238663484482] [-15.5:12.5] [1.3:2.7200000000000002] [1.1799999999999999:2.7799999999999998] [0.36298076923076922:0.64953271028037374] [0.35046728971962621:0.63701923076923073] [0.1918833727344364:0.21555133935749429] none none none none [132:250.6875] [126.25:256.625] [-63.5:88.6875] [-35.5:27.5]
tree_sizes=1917 2132 1711 1718 2135 1820 1814 2034 1817 1826 2446 2567 2567 2150 2344 1601 1499 1400 990 1399 1731 1721 2040 1829 1725 1721 1504 1717 1516 1931
Tree=0
num_leaves=17
num_cat=0
split_feature=87 40 77 86 73 40 95 12 17 82 74 40 55 71 81 53
split_gain=15.1457 13.2046 13.5042 11.2099 13.4357 11.4245 10.2245 9.8615 9.24196 7.11979 6.96376 9.0178 6.02258 5.9784 5.26359 4.73348
threshold=1.5850000000000002 11.267857142857144 1.7850000000000004 1.7450000000000003 0.5344873150105709 11.79166666666667 228.96875000000003 0.14583333333333573 -3.8124999999999996 1.905 0.63245614035087727 15.401785714285717 22.645833333333339 0.22500000000000003 1.7350000000000001 23.437500000000004
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=15 8 6 4 -4 -6 10 -8 -2 13 11 -3 -12 -5 -10 -1
right_child=1 2 3 9 5 -7 7 -9 14 -11 12 -13 -14 -15 -16 -17
leaf_value=-0.097004150398104866 -0.032826856509161513 -0.0043502309061410754 0.01431771893192204 -0.091075161948939082 -0.0036225823237434548 -0.050018228661574106 -0.029276859267249353 -0.092899466087143945 -0.061043546543381851 -0.031637052746051886 -0.012858122023405583 -0.051389190877909906 -0.063112770479224672 -0.058465725478527159 -0.099324189356474121 -0.059651523168360317
leaf_weight=10.494576394557955 9.2452220618724805 41.228692978620529 7.2462551295757285 13.49302679300308 9.9948346614837629 56.47081583738327 9.7449637949466723 6.4966425299644461 11.494059860706331 9.9948346614837629 6.4966425299644461 7.7459968626499167 9.2452220618724805 26.986053586006165 11.494059860706328 11.244188994169233
leaf_count=42 37 165 29 54 40 226 39 26 46 40 26 31 37 108 46 45
internal_value=-0.0457273 -0.0428007 -0.0390611 -0.0473474 -0.0374027 -0.0430414 -0.0263504 -0.0547259 -0.0666008 -0.0618705 -0.0192291 -0.0117901 -0.0423728 -0.0693355 -0.0801839 -0.0776838
internal_weight=259.116 237.377 205.144 124.186 73.7119 66.4657 80.9582 16.2416 32.2333 50.4739 64.7166 48.9747 15.7419 40.4791 22.9881 21.7388
internal_count=1037 950 821 497 295 266 324 65 129 202 259 196 63 162 92 87
is_linear=0
shrinkage=1
Tree=1
num_leaves=19
num_cat=0
split_feature=87 40 75 64 30 36 33 63 36 17 39 47 17 64 15 30 11 25
split_gain=13.9729 12.1763 14.362 13.4541 10.6931 12.6392 11.2985 10.0326 8.61679 8.52999 8.49866 8.85202 6.83713 6.53966 4.98946 6.17658 4.17697 6.76785
threshold=1.5850000000000002 11.267857142857144 0.4095454545454546 0.35791741982096692 -3.850000000000001 -0.13125000000000001 2.8660714285714275 0.50088993919059788 -0.17708333333333343 -3.8124999999999996 0.4702380952380954 0.34444905543815357 -7.8374999999999977 0.33306045819490598 -4.4732142857142838 -2.0744047619047588 90.687500000000014 3.9114583333333335
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=16 9 3 12 5 -4 8 -8 -6 -2 11 -5 -3 -14 -11 -16 -1 -18
right_child=1 2 4 10 6 -7 7 -9 -10 14 -12 -13 13 -15 15 -17 17 -19
leaf_value=-0.057851344720467247 0.012127031371442692 -0.065070317857066393 -0.046234344564243093 -0.034302041708384759 0.0097538558374562559 0.015919168977055184 -0.044150175296378553 0.012931828545225675 0.032691802732617729 -0.063902755868288641 0.044060792595361865 0.025258066535241798 0.0038131771685337061 -0.035910700877364692 0.0058413187436127693 -0.043222266889543948 -0.041738488387342523 0.012479643131747724
leaf_weight=6.4850978553295162 9.2475084811449033 6.2444083094596854 9.9929635822772997 8.2424890547990817 47.478157848119736 10.994390726089476 8.24348331987858 12.243208199739454 58.476858302950859 6.238056123256686 8.4939699023962003 7.743343085050582 11.740369915962221 15.236236929893492 7.2386563420295742 9.484264671802519 8.9887140691280347 6.2415239214897147
leaf_count=26 37 25 40 33 190 44 33 49 234 25 34 31 47 61 29 38 36 25
internal_value=-0.00025478 0.00255523 0.0061453 -0.0107729 0.0127667 -0.0136747 0.0171556 -0.010037 0.0224134 -0.0203093 0.011728 -0.00545185 -0.0273532 -0.0186226 -0.033373 -0.0219847 -0.0309668 -0.0195193
internal_weight=259.054 237.338 205.13 57.7008 147.429 20.9874 126.442 20.4867 105.955 32.2085 24.4798 15.9858 33.221 26.9766 22.961 16.7229 21.7153 15.2302
internal_count=1037 950 821 231 590 84 506 82 424 129 98 64 133 108 92 67 87 61
is_linear=0
shrinkage=0.04
Tree=2
num_leaves=15
num_cat=0
split_feature=87 79 86 40 80 29 95 46 61 76 27 42 51 53
split_gain=12.8995 11.4587 12.6562 11.6114 9.74256 9.00377 8.40775 12.7519 9.12686 9.48882 6.86723 5.78581 4.37714 4.21145
threshold=1.5850000000000002 0.47526524006473664 1.425 11.437500000000002 159.00000000000003 40.312500000000007 228.96875000000003 0.35543310268494094 12.225000000000001 2.6550000000000007 -13.968749999999998 -0.61249999999999971 -0.024362426215342146 23.437500000000004
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=13 2 5 -3 -4 -2 8 -8 12 -10 -6 -12 -5 -1
right_child=1 3 4 6 10 -7 7 -9 9 -11 11 -13 -14 -15
leaf_value=-0.048059265540188849 0.06036588533776735 -0.030072544599413102 0.030990903589822205 0.026560568681132761 -0.051938571878065741 0.0015271190921610995 0.033770613746158619 -0.039908770322941725 0.026584849929801698 -0.017941657017813275 -0.018812929986079977 -0.0011568414139026314 0.070670836690303579 -0.012775033151309585
leaf_weight=10.448318898677828 7.7465823441743877 8.4830420017242414 9.4921096563339216 5.9989815205335608 6.2402391731739035 8.9906333237886411 6.7452828139066687 8.4876060187816602 35.739069759845734 9.7458354085683805 52.919037282466888 67.668311461806297 8.9985922873020154 11.229458034038542
leaf_count=42 31 34 38 24 25 36 27 34 143 39 212 271 36 45
internal_value=-0.000245346 0.00245334 -0.00406662 0.0143054 -0.00809702 0.0287598 0.0192774 -0.00728277 0.0259668 0.0170444 -0.0110225 -0.00890511 0.0530269 -0.0297814
internal_weight=258.933 237.255 153.057 84.1984 136.32 16.7372 75.7154 15.2329 60.4825 45.4849 126.828 120.587 14.9976 21.6778
internal_count=1037 950 613 337 546 67 303 61 242 182 508 483 60 87
is_linear=0
shrinkage=0.04
Tree=3
num_leaves=15
num_cat=0
split_feature=87 79 33 88 72 60 95 12 15 52 14 30 65 53
split_gain=11.9137 10.5659 11.7467 12.0731 10.7663 8.68363 8.59661 13.4701 7.47172 9.03668 6.50335 5.61483 4.0183 3.91278
threshold=1.5850000000000002 0.47526524006473664 5.4583333333333348 0.57686355548965829 164.17133924969747 13.133928571428573 228.96875000000003 0.14583333333333573 -3.4499999999999953 18.645833333333339 82.937500000000014 -2.0119047619047588 0.34544552124032007 23.437500000000004
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=13 2 3 4 10 -6 8 -8 -3 -10 -2 -11 -5 -1
right_child=1 6 -4 12 5 -7 7 -9 9 11 -12 -13 -14 -15
leaf_value=-0.046319955261748876 0.037448948902587834 -0.0088284694187992091 -0.057678550882037501 0.059995488602543341 -0.019214067900304579 0.0032492412381891801 0.018976181265562397 -0.053115091799964251 -0.018039785729919627 0.013382514787688078 -0.0088821137810651384 0.041942269689724629 0.013681980687733714 -0.012271034609450827
leaf_weight=10.407825455069544 13.739773482084276 11.737559467554091 6.2349889278411856 5.9960742145776731 66.626564428210258 46.927813142538071 9.7444915026426333 7.219090297818183 6.492861956357955 16.732775956392288 7.4898996800184241 32.22684919834137 5.9935915470123291 11.223179444670675
leaf_count=42 55 47 25 24 267 188 39 29 26 67 30 129 24 45
internal_value=-0.000235931 0.00235599 -0.00390538 -0.00162108 -0.00504269 -0.00993082 0.0137405 -0.0117033 0.0201643 0.0263011 0.0211032 0.0321815 0.0368435 -0.0286538
internal_weight=258.793 237.162 153.009 146.774 134.784 113.554 84.1536 16.9636 67.19 55.4525 21.2297 48.9596 11.9897 21.631
internal_count=1037 950 613 588 540 455 337 68 269 222 85 196 48 87
is_linear=0
shrinkage=0.04
Tree=4
num_leaves=19
num_cat=0
split_feature=87 40 75 64 80 66 15 15 61 64 88 17 95 39 81 63 59 33
split_gain=11.0085 10.3158 11.8546 12.6055 10.3939 11.5844 9.67074 10.8155 9.0703 7.92497 7.91682 7.84355 6.80908 5.80185 4.70535 3.79881 3.16026 2.99516
threshold=1.5850000000000002 11.267857142857144 0.4095454545454546 0.35791741982096692 165.00000000000003 1.0000000180025095e-35 -9.1071428571428523 -3.725000000000001 12.690476190476192 0.33306045819490598 0.51341535239840341 -3.8124999999999996 225.46875000000003 0.33333333333333331 1.7350000000000001 0.4972961247664654 21.775000000000002 -1.0000000180025095e-35
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=15 11 3 8 5 17 12 -8 -3 -10 -5 -2 -6 -12 -13 -1 -17 -4
right_child=1 2 4 10 6 -7 7 -9 9 -11 13 14 -14 -15 -16 16 -18 -19
leaf_value=-0.00040554144175779727 0.012283500305906152 -0.058979450937986018 0.078347880748130691 -0.01274589698312884 0.010508267343845815 0.0014292549509687211 -0.030425427623467446 0.0073372268956942068 0.014443127755364691 -0.032407221257139643 0.0078896532861558941 -0.013267174082330087 0.063584791906166116 0.060934134607739728 -0.049547993082610223 -0.054100670196884616 -0.017902921750126199 0.042409147472346274
leaf_weight=5.95706579089165 9.2431617081165296 9.4672348201274854 5.9933577626943579 11.224958077073099 7.9917045235633832 10.236197039484976 13.982299685478209 91.852746203541756 9.9857960343360919 13.705725058913229 6.9919697046279889 11.458065599203112 7.4933722615242004 6.2467137277126312 11.419983848929403 8.6485812216997129 6.9683696925640097 9.7404364198446292
leaf_count=24 37 38 24 45 32 41 56 368 40 55 28 46 30 25 46 35 28 39
internal_value=-0.000227018 0.00226278 0.00556662 -0.00981529 0.0115843 0.0345507 0.00666807 0.00234825 -0.0258849 -0.0126602 0.0119659 -0.0188136 0.0361925 0.0329189 -0.0313774 -0.0275824 -0.037949 0.056099
internal_weight=258.608 237.034 204.913 57.6224 147.29 25.97 121.32 105.835 33.1588 23.6915 24.4636 32.1212 15.4851 13.2387 22.878 21.574 15.617 15.7338
internal_count=1037 950 821 231 590 104 486 424 133 95 98 129 62 53 92 87 63 63
is_linear=0
shrinkage=0.04
Tree=5
num_leaves=16
num_cat=0
split_feature=33 88 55 55 17 54 73 52 77 17 41 97 29 62 40
split_gain=15.351 13.8158 14.9232 12.5185 11.4912 11.3671 8.05265 14.2571 9.45676 11.7484 10.6395 4.12203 3.9043 3.69138 2.1447
threshold=5.4583333333333348 0.54438855792157115 24.062500000000004 21.387500000000003 -9.4642857142857064 26.312500000000004 0.53494186046511638 25.387500000000003 2.0850000000000004 3.0625000000000004 12.775000000000002 0.7544642857142777 35.562500000000007 0.5081756463571766 12.803571428571431
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 4 3 -3 11 13 7 12 9 -8 -10 -1 -6 -4 -5
right_child=-2 2 5 14 6 -7 8 -9 10 -11 -12 -13 -14 -15 -16
leaf_value=-0.019289747362067435 -0.049291431039616411 0.0079530049109373772 -0.0023167773542189392 0.049152011733235204 0.016609632632338327 0.040013888589411864 0.013951234712260528 -0.0188765863308245 -0.039607282005252709 -0.012366184034748687 -0.0024271531554327401 -0.063622619627354979 0.056973143291589742 -0.043118301029727156 0.079727872448937878
leaf_weight=6.9703347086906415 10.684137105941771 11.479128912091257 6.2386636584997204 7.7240137755870801 6.2296538949012819 5.9881500154733649 54.543849483132362 10.221605300903319 23.669574245810509 54.01922245323658 25.669763669371605 6.4709125310182571 9.9719207286834699 8.2252761721611005 6.9943185448646545
leaf_count=28 43 46 25 31 25 24 219 41 95 217 103 26 40 33 28
internal_value=-0.00235947 -0.000307937 0.0192731 0.0392628 -0.00492675 -0.00633208 -0.00232303 0.018115 -0.0057431 0.000856114 -0.0202636 -0.0406326 0.041453 -0.0255196 0.063682
internal_weight=255.101 244.416 46.6496 26.1975 197.767 20.4521 184.326 26.4232 157.902 108.563 49.3393 13.4412 16.2016 14.4639 14.7183
internal_count=1024 981 187 105 794 82 740 106 634 436 198 54 65 58 59
is_linear=0
shrinkage=0.04
Tree=6
num_leaves=16
num_cat=0
split_feature=60 62 73 75 86 9 86 68 9 75 55 17 12 53 82
split_gain=10.7843 10.6542 9.72935 12.6928 12.2359 16.8146 10.0524 12.101 8.66024 8.63902 8.35887 7.20778 6.43585 4.33555 3.6338
threshold=15.225000000000001 0.4697988539447821 0.53158244680851074 0.64508064516129038 1.7450000000000003 -8.3333333333333233 1.4450000000000001 207.25000000000003 -37.987012987012982 0.61348484848484863 24.291666666666668 -2.7083333333333353 3.2678571428571463 22.387500000000003 1.8350000000000002
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 11 4 6 -3 -6 7 -4 -5 -8 13 -1 -10 -2 -13
right_child=10 2 3 8 5 -7 9 -9 12 -11 -12 14 -14 -15 -16
leaf_value=-0.014255420135607853 0.023310237479360944 0.060688099681206698 0.047635605094340575 -0.022508370276938035 0.03702793647430791 -0.038866757036159119 -0.017236539926759407 -0.026749878847924364 0.026448512176737455 -0.054467609884275854 -0.007621975235864803 0.052597905023238606 -0.014975239364821315 0.067546521320461253 0.012566451021375784
leaf_weight=6.2201643437147167 8.2134438902139681 6.9778638184070614 7.975793316960333 11.688230097293852 9.2276621013879812 9.4580885171890223 113.23386019468307 6.2343348562717438 27.121811106801033 10.934748843312262 8.4812200069427472 9.2266680598258954 7.7061219811439505 6.2366772443056098 5.9792356044054022
leaf_count=25 33 28 32 47 37 38 455 25 109 44 34 37 31 25 24
internal_value=-0.00226718 -0.00485385 -0.00758834 -0.0107917 0.0154908 -0.00138736 -0.0168681 0.0150009 0.00728451 -0.0205152 0.0239009 0.0220185 0.017283 0.0424026 0.0368568
internal_weight=254.916 231.985 210.559 184.895 25.6636 18.6858 138.379 14.2101 46.5162 124.169 22.9313 21.4261 34.8279 14.4501 15.2059
internal_count=1024 932 846 743 103 75 556 57 187 499 92 86 140 58 61
is_linear=0
shrinkage=0.04
Tree=7
num_leaves=18
num_cat=0
split_feature=89 55 54 8 43 96 98 40 90 55 37 97 49 80 65 97 62
split_gain=10.4246 11.2262 10.7967 9.95568 11.3883 8.90539 8.65133 12.0582 11.6055 8.30861 8.22307 8.46564 6.54104 5.03193 4.78896 4.43026 4.16993
threshold=0.45561144207842896 24.062500000000004 26.387500000000003 31.666666666666668 0.45725155221122971 175.77083333333337 -6.2083333333333348 11.162500000000001 0.19645540169586981 21.387500000000003 2.3541666666666674 1.7812500000000002 0.71927164205124738 212.00000000000003 0.35922103320833165 1.0312500000000002 0.50486813817504161
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 9 16 5 -5 -2 10 14 -9 -1 11 -6 -12 -7 -8 -11 -3
right_child=3 2 -4 4 6 13 7 8 -10 15 12 -13 -14 -15 -16 -17 -18
leaf_value=0.0056796349867628713 0.0048498008748587788 0.0010948092386472363 0.038285447553195809 0.03821577456306921 0.045277568585778402 -0.075491676385977521 -0.0073319216517205438 -0.028247542141635638 0.013897886134041438 0.075581548081493269 -0.059494968657228614 -0.018282852479456907 -0.017519335700108753 -0.030147268708022427 -0.056245855074949898 0.032607041143986076 -0.041584861624048911
leaf_weight=11.720010221004488 8.2118753790855425 5.9827445000410107 5.9779978692531577 10.210737749934195 6.474472910165785 5.959468916058543 5.9542931914329511 12.197263866662977 73.146875634789467 6.4630497843027106 6.7146378159523001 6.953786626458168 51.473725467920303 11.417679086327551 6.9297584295272827 9.4506065398454648 9.4453652203083021
leaf_count=47 33 24 24 41 26 24 24 49 294 26 27 28 207 46 28 38 38
internal_value=-0.00217862 0.0143933 -0.00735136 -0.00613051 -0.00281263 -0.0294766 -0.00527918 0.00242922 0.00787452 0.0312376 -0.0158519 0.012363 -0.0223631 -0.0456981 -0.0336406 0.0500604 -0.0250345
internal_weight=254.684 49.0398 21.4061 205.645 180.056 25.589 169.845 98.2282 85.3441 27.6337 71.6166 13.4283 58.1884 17.3771 12.8841 15.9137 15.4281
internal_count=1024 197 86 827 724 103 683 395 343 111 288 54 234 70 52 64 62
is_linear=0
shrinkage=0.04
Tree=8
num_leaves=16
num_cat=0
split_feature=33 88 55 55 17 54 27 62 59 60 11 86 27 15 39
split_gain=14.0335 11.5183 12.3907 10.6473 9.85957 9.50449 8.91174 9.57804 8.94568 7.22924 4.21001 3.63151 3.58218 2.92798 2.2831
threshold=5.4583333333333348 0.54438855792157115 24.062500000000004 21.535714285714288 -9.4642857142857064 26.312500000000004 -7.9791666666666661 0.47163958632520975 19.162500000000005 15.225000000000001 114.68750000000001 1.885 -0.06770833333333344 -1.0000000180025095e-35 -0.11249999999999999
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 4 3 12 11 10 8 13 -6 -9 -4 -1 -3 -8 -5
right_child=-2 2 5 14 6 -7 7 9 -10 -11 -12 -13 -14 -15 -16
leaf_value=-0.017506959694980919 -0.047093096159148848 -0.013030607518081373 -0.0050211407760301866 0.041861979165786371 -0.0054470000340854632 0.036839736021523913 0.0084047253587021115 -0.0053865642716596834 -0.068322690853454923 0.024130364791092217 -0.048932153004977219 -0.059267051521897846 0.029896871945299459 0.042249585564822965 0.074866216000504499
leaf_weight=6.9434050768613798 10.625629082322119 5.9767347723245621 8.4471292942762393 5.9510868340730667 8.7058147490024549 5.9711179584264746 6.7232442200183895 137.19698540866375 6.1981060206890097 14.698398649692534 5.9571634531021109 6.405640184879303 6.4850186705589312 10.441370263695715 7.6831773668527585
leaf_count=28 43 24 34 24 35 24 27 552 25 59 24 26 26 42 31
internal_value=-0.00209418 -0.000132847 0.017783 0.0360337 -0.0043524 -0.00559187 -0.00194377 0.000670223 -0.0315952 -0.00253031 -0.0231814 -0.0375459 0.00930858 0.0289928 0.0604605
internal_weight=254.41 243.784 46.4714 26.096 197.313 20.3754 183.964 169.06 14.9039 151.895 14.4043 13.349 12.4618 17.1646 13.6343
internal_count=1024 981 187 105 794 82 740 680 60 611 58 54 50 69 55
is_linear=0
shrinkage=0.04
Tree=9
num_leaves=16
num_cat=0
split_feature=33 88 55 55 54 8 43 98 37 50 96 80 11 38 39
split_gain=13.0015 10.6389 11.4582 9.90804 8.77538 8.77091 9.15232 8.52658 12.059 8.78208 8.37899 5.06162 3.90198 3.4714 2.1433
threshold=5.4583333333333348 0.54438855792157115 24.062500000000004 21.535714285714288 26.312500000000004 31.666666666666668 0.45725155221122971 -6.2083333333333348 3.0625000000000004 0.77107730652069451 175.77083333333337 212.00000000000003 114.68750000000001 3.0625000000000004 -0.11249999999999999
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 5 3 13 12 10 -7 9 -9 -8 -1 -12 -4 -3 -5
right_child=-2 2 4 14 -6 6 7 8 -10 -11 11 -13 -14 -15 -16
leaf_value=0.0073464817079793203 -0.045435105019414168 -0.011342813791158599 -0.0048216877875364326 0.040315188023492481 0.035433511001246025 0.035818050318993391 -0.0019489232180927723 -0.01869102791191958 0.013448791274373359 -0.030954345333154155 -0.072293858788171919 -0.026294701048497598 -0.047141285128800781 0.030922033964920961 0.072368292940394013
leaf_weight=7.9585580676794079 10.573872312903402 6.4793082326650637 8.4447395205497759 5.9326046109199524 5.9599837511777869 10.172822326421736 40.264701783657074 25.803554385900497 67.645555391907692 28.539862290024757 5.9069683849811581 10.872051388025282 5.9365375787019721 5.9781518280506134 7.6314401328563672
leaf_count=32 43 26 34 24 24 41 162 104 272 115 24 44 24 24 31
internal_value=-0.0020131 -0.00012773 0.0171133 0.0346948 -0.00537778 -0.00418193 -0.000986385 -0.00329391 0.00457421 -0.0139803 -0.0264556 -0.0424885 -0.0222911 0.00893946 0.058349
internal_weight=254.101 243.527 46.3628 26.0215 20.3413 197.164 172.426 162.254 93.4491 68.8046 24.7376 16.779 14.3813 12.4575 13.564
internal_count=1024 981 187 105 82 794 694 653 376 277 100 68 58 50 55
is_linear=0
shrinkage=0.04
Tree=10
num_leaves=22
num_cat=0
split_feature=75 87 95 14 11 36 58 58 8 58 40 54 31 17 32 95 54 90 30 18 46
split_gain=10.0606 13.2442 13.2354 7.28198 9.27874 9.24179 8.65882 8.65277 7.77526 9.89423 8.92987 8.43675 7.69243 6.47202 5.91656 6.71204 5.41747 5.66364 4.6817 4.44902 4.42172
threshold=0.58007633587786267 1.5850000000000002 229.28125000000003 117.31250000000001 115.18750000000001 0.59375000000000011 24.803571428571434 24.354166666666668 42.261904761904766 17.062500000000004 13.35416666666667 20.062500000000004 24.562500000000004 5.4375000000000009 20.145833333333339 220.46875000000003 20.775000000000002 0.19788825188533934 0.60416666666666796 -4.3541666666666705 0.35905136104149266
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=3 18 13 4 8 7 19 -5 11 -10 -11 -1 -13 16 -15 -16 -3 -18 -2 -6 -4
right_child=1 2 20 5 6 -7 -8 -9 9 10 -12 12 -14 14 15 -17 17 -19 -20 -21 -22
leaf_value=0.016260010087575247 -0.00081200744809965017 0.009281923181331737 0.0032509448060061706 -0.025493613615531322 -0.074632317496072326 0.049617586690472636 0.0010470202171259946 0.03284274215952835 -0.029153960626547033 0.0012674975757078009 0.038103679086545161 -0.037991918317932111 0.0028048182430481369 0.031585379031769001 -0.038777627179031486 0.020611308349097066 0.013481360235318833 0.054225087511587569 -0.04021991897983087 -0.030101771519433589 -0.04216852229361847
leaf_weight=9.4118207991123182 10.595016002655031 9.211836025118826 7.6921787559986097 10.383948802948003 6.9398950040340432 7.9301784783601752 7.663398250937461 6.6885293126106253 11.645323514938353 37.428400799632072 14.651625692844389 25.72077003121376 10.378917679190634 10.19868426024914 6.2205115109682065 5.9642188549041748 7.1979017257690456 22.59254564344883 8.8545045703649503 7.4363927543163308 6.1885804086923599
leaf_count=38 43 37 31 42 28 32 31 27 47 151 59 104 42 41 25 24 29 91 36 30 25
internal_value=0.000384931 0.0106717 0.0182752 -0.00584956 -0.00961775 0.0139354 -0.0332929 -0.00263903 -0.00484104 0.00417752 0.0116306 -0.0174689 -0.0262626 0.0262515 0.00910688 -0.00970775 0.0360908 0.0443807 -0.0187527 -0.0515981 -0.0169988
internal_weight=250.995 94.716 75.2665 156.279 131.277 25.0027 22.0397 17.0725 109.237 63.7254 52.08 45.5115 36.0997 61.3857 22.3834 12.1847 39.0023 29.7904 19.4495 14.3763 13.8808
internal_count=1013 382 303 631 530 101 89 69 441 257 210 184 146 247 90 49 157 120 79 58 56
is_linear=0
shrinkage=0.04
Tree=11
num_leaves=23
num_cat=0
split_feature=46 34 52 26 15 25 38 25 38 45 51 29 31 85 45 97 77 48 50 89 35 15
split_gain=8.75369 11.3656 14.889 8.46246 7.41306 7.3335 8.41222 8.91166 7.82615 7.37315 7.16896 7.00775 6.83601 6.4889 6.43695 6.32157 6.2568 5.61316 5.35895 4.7914 4.5558 2.75508
threshold=0.38135121729750621 6.8875000000000011 22.775000000000002 7.1406250000000009 -7.5624999999999991 2.0520833333333335 2.6904761904761911 13.989583333333334 3.0625000000000004 0.026346315240645454 0.014637337657707453 41.401785714285722 17.354166666666668 -2.9999999999999996 -0.03868421822005879 -3.0312499999999996 1.7550000000000001 0.050581642342345606 0.78802449932964647 0.44915558749550849 7.4226190476190483 3.7857142857142851
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 5 3 15 20 16 9 8 -8 12 11 18 -7 -5 -6 -3 -1 -12 -2 -16 -4 -17
right_child=10 2 4 13 14 6 7 -9 -10 -11 17 -13 -14 -15 19 21 -18 -19 -20 -21 -22 -23
leaf_value=-0.0054894704946171517 0.0015186659314490275 0.012422457986275524 0.045080183146636456 -0.016969101137900868 -0.040094377162748857 0.019099728305053802 0.028826737553279212 0.048811854341376415 -0.010159784240800534 -0.052368685379215477 -0.057749287922771578 0.032139891068220787 -0.025625732807283655 0.02903959599951628 0.028393278510576315 0.063703565821716004 -0.055469403457085856 -0.0080914581224991559 -0.047345864671010984 -0.006544736689285552 0.00014063060048828728 0.032655129091640721
leaf_weight=7.6778917312622097 9.1392136514186877 8.1994599252939242 8.1893729716539365 8.9037519246339816 7.9227678179740897 9.4139631241560036 11.599057033658026 7.4388175308704367 28.433443784713745 7.6398711800575247 9.4042548984289152 6.4537621885538092 13.044003814458845 10.920071274042128 7.412743777036666 14.361867994070055 8.3833764642477018 5.9441701322793952 5.9151509106159201 41.112754985690117 6.4537106007337561 6.7087357491254798
leaf_count=31 37 33 33 36 32 38 47 30 115 31 38 26 53 44 30 58 34 24 24 166 26 27
internal_value=0.000369477 0.00347289 0.0116128 0.0285547 -8.68246e-05 -0.0069756 -0.00188168 0.00860705 0.00113621 -0.0184248 -0.0176344 -0.00273177 -0.00687765 0.00837506 -0.00666555 0.0422218 -0.0315771 -0.0385177 -0.0176812 -0.00120761 0.0252738 0.053818
internal_weight=250.672 213.816 120.185 49.0939 71.0914 93.6304 77.5692 47.4713 40.0325 30.0978 36.8566 21.5081 22.458 19.8238 56.4483 29.2701 16.0613 15.3484 15.0544 48.5255 14.6431 21.0706
internal_count=1013 864 485 198 287 379 314 192 162 122 149 87 91 80 228 118 65 62 61 196 59 85
is_linear=0
shrinkage=0.04
Tree=12
num_leaves=23
num_cat=0
split_feature=75 60 11 14 27 80 97 64 82 74 61 28 51 77 17 18 26 73 77 29 17 40
split_gain=9.16434 11.0589 9.72885 16.5355 11.7419 10.0895 9.63264 9.40553 6.81505 6.89416 6.69885 7.87536 6.2381 5.78129 6.43908 7.23032 5.6961 5.53399 5.51269 7.0191 3.79526 2.70316
threshold=0.40070422535211275 12.937500000000002 115.68750000000001 116.31250000000001 -0.9583333333333337 166.00000000000003 7.8229166666666581 0.35791741982096692 2.3750000000000004 0.68460526315789483 11.312500000000002 43.669642857142868 -0.00055870543754424991 2.4850000000000008 2.6875000000000004 11.267857142857148 3.0104166666666674 0.54977477477477488 3.5300000000000007 41.937500000000007 -3.0624999999999996 13.062500000000002
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=7 8 4 17 5 -3 16 13 9 10 -2 -12 -7 -1 15 21 -6 -4 19 20 -9 -15
right_child=1 2 3 -5 6 12 -8 18 -10 -11 11 -13 -14 14 -16 -17 -18 -19 -20 -21 -22 -23
leaf_value=-0.054673857447064877 0.031426635667739504 0.027443307507820841 -0.0071693293359852243 0.038925915934784942 0.050420363552369514 -0.055974948859624327 -0.015254363709585871 -0.00096046171680576385 -0.04185499138480625 -0.033145360136453486 -0.0080473225873038019 0.03965940078614745 0.00028714853340948784 -0.057615503467417908 0.017966063373814029 0.0078546298795259799 0.022851321189234875 -0.052721746825391283 0.037758480535269187 0.02828748068011893 -0.046251148347414762 -0.023260226786483273
leaf_weight=8.6124199628829938 10.635522291064261 9.4400324672460574 8.4194629490375572 8.1747892647981626 24.48761148750782 6.1720191240310669 6.4190221130847922 5.9309897720813787 7.6241378188133231 10.108165323734282 52.924647763371468 6.1832769215106955 6.4464221149682981 8.5726141929626518 6.4193207621574393 6.403973162174224 23.496750026941299 8.652211129665373 6.4387345761060706 6.4527361392974845 5.9104709476232546 6.4002304375171652
leaf_count=35 43 38 34 33 99 25 26 24 31 41 214 25 26 35 26 26 95 35 26 26 24 26
internal_value=0.000354686 0.00470563 0.0136746 -0.00785498 0.0207833 -0.00383352 0.0307644 -0.013108 -0.00572256 -0.00227267 0.00220181 -0.00305672 -0.0272322 -0.0260387 -0.0171663 -0.0277163 0.0369205 -0.0302561 0.00592675 -0.00527659 -0.0235666 -0.0429301
internal_weight=250.326 189.184 101.708 25.2465 76.4619 22.0585 54.4034 61.1415 87.4758 79.8516 69.7434 59.1079 12.6184 36.4086 27.7961 21.3768 47.9844 17.0717 24.7329 18.2942 11.8415 14.9728
internal_count=1013 765 411 102 309 89 220 248 354 323 282 239 51 148 113 87 194 69 100 74 48 61
is_linear=0
shrinkage=0.04
Tree=13
num_leaves=19
num_cat=0
split_feature=75 87 95 39 88 58 95 57 51 48 58 32 63 24 83 55 97 82
split_gain=8.61924 11.4411 11.6705 8.28079 6.48928 6.06533 8.08152 14.8066 8.57749 8.46636 7.40152 6.03526 6.41459 8.82896 6.7274 5.49578 4.98309 3.77165
threshold=0.58007633587786267 1.5850000000000002 229.28125000000003 -0.08571428571428534 0.57686355548965829 16.437500000000004 171.09375000000003 38.687500000000007 -0.016939089932731747 -0.033449118040512087 27.387500000000003 17.062500000000004 0.48324752364570916 -1.0000000180025095e-35 0.52167307176691413 21.937500000000004 3.2187500000000004 1.6850000000000003
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=4 15 11 -4 5 -1 8 9 -7 -8 -9 -3 13 -13 -14 -2 -17 -10
right_child=1 2 3 -5 -6 6 7 10 17 -11 -12 12 14 -15 -16 16 -18 -19
leaf_value=-0.037191415145792946 -0.049107836359544479 0.057010861992900341 0.011593508527805042 -0.050683110628723223 0.024122659092291082 -0.013516688468798235 0.0070009506232722408 -0.0048723531932253544 0.015903788969995617 -0.045659362164178104 0.041619476315647989 -0.040633960200292647 0.017460987223942896 0.023949802445685821 0.054172151049742406 0.018705900435433812 -0.030600473896497015 0.053826338845108651
leaf_weight=10.358175262808798 6.0601196587085751 7.9332039654254904 7.6448924392461777 6.1759819239377958 11.035771340131758 11.594122901558878 6.1791122853755978 69.580003947019577 10.684715464711191 23.321105942130089 5.9471079260110846 5.9447444081306458 28.258737400174141 7.8707564771175367 11.133399918675421 7.4004974067211133 5.8894641548395157 6.9099283069372168
leaf_count=42 25 32 31 25 45 47 25 282 43 95 24 24 114 32 45 30 24 28
internal_value=0.000340314 0.00988207 0.0169604 -0.0162354 -0.00544273 -0.00769954 -0.0054235 -0.0105979 0.0131951 -0.0346291 -0.00121152 0.0244643 0.0196117 -0.00384028 0.0278367 -0.0175393 -0.00314428 0.0307971
internal_weight=249.922 94.3118 74.9617 13.8209 155.61 144.574 134.216 105.027 29.1888 29.5002 75.5271 61.1408 53.2076 13.8155 39.3921 19.3501 13.29 17.5946
internal_count=1013 382 303 56 631 586 544 426 118 120 306 247 215 56 159 79 54 71
is_linear=0
shrinkage=0.04
Tree=14
num_leaves=21
num_cat=0
split_feature=75 79 28 27 14 60 31 69 47 53 31 75 36 41 56 46 49 46 77 74
split_gain=8.04256 13.9131 8.58147 7.46144 10.122 9.11597 10.0351 11.5688 9.90082 8.37289 8.31104 8.15738 7.02093 6.34993 5.18118 6.69615 5.11523 3.3449 3.22441 5.46203
threshold=0.6718233349078887 0.6464920911223796 41.687500000000007 -2.3020833333333326 91.062500000000014 12.937500000000002 25.803571428571434 1.7083333333333337 0.37218684558476361 25.812500000000004 17.354166666666668 0.3980198019801981 -0.26190476190476181 12.775000000000002 40.062500000000007 0.37106571647400344 0.76572220694348403 0.35494543794947025 1.4850000000000001 0.49543388429752072
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=3 18 12 4 13 6 14 -8 10 -10 -7 -12 -3 -1 15 -5 -16 -6 19 -2
right_child=1 2 -4 5 17 8 7 -9 9 -11 11 -13 -14 -15 16 -17 -18 -19 -20 -21
leaf_value=-0.019711473750247656 0.056394721780793459 -0.014592747921604247 -0.045243698355551337 0.0010942635313375348 -0.026082382729623245 0.059046481579424839 0.042087030208466859 -0.025395196611899901 -0.0380827165006535 0.012578046288920864 -0.012429649451340766 0.020979301067090405 0.03089449715034762 0.032768838321915822 -0.061831820327718327 -0.032477187860471046 -0.018967296365290393 -0.062065664493685058 0.060078921525471347 0.0051456786696745154
leaf_weight=8.626388296484949 5.9154238402843493 10.863973572850233 6.1918527483940116 30.794128924608231 9.0536976456642169 6.6757322102785102 11.092378646135332 6.415662854909896 11.35324338078499 9.6619694828987104 16.008781462907791 43.380531519651413 10.852814644575117 6.4449361413717261 6.8486602157354382 13.751119881868361 12.741591066122053 7.6055790930986396 7.6382495164871207 7.6053119897842407
leaf_count=35 24 44 25 125 37 27 45 26 46 39 65 176 44 26 28 56 52 31 31 31
internal_value=0.000326808 0.0148417 -0.00370442 -0.00322617 -0.0210218 0.000120512 -0.00948172 0.0173588 0.00912325 -0.0147909 0.0167303 0.0119737 0.00813919 0.00273063 -0.0168088 -0.00926924 -0.0339525 -0.0425101 0.0393039 0.0275675
internal_weight=249.522 49.0676 27.9086 200.454 31.7306 168.724 81.6435 17.508 87.0803 21.0152 66.065 59.3893 21.7168 15.0713 64.1355 44.5452 19.5903 16.6593 21.159 13.5207
internal_count=1013 199 113 814 129 685 332 71 353 85 268 241 88 61 261 181 80 68 86 55
is_linear=0
shrinkage=0.04
Tree=15
num_leaves=14
num_cat=0
split_feature=86 40 27 10 41 31 73 65 28 74 14 97 65
split_gain=9.19077 7.28223 7.35402 9.11593 7.22847 7.50822 7.14398 7.51306 6.94962 5.72956 5.5869 4.91206 4.48842
threshold=1.425 10.937500000000002 -10.947916666666666 84.062500000000014 13.645833333333334 15.29166666666667 0.5344873150105709 0.32491602042266449 39.687500000000007 0.47029411764705881 85.535714285714292 -0.42708333333332854 0.35541085304317482
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=8 11 10 4 5 -4 9 -8 -1 -5 -3 -2 -13
right_child=1 2 3 6 -6 -7 7 -9 -10 -11 -12 12 -14
leaf_value=0.059592270356515986 0.0053735337050535084 0.0014536342489078382 -0.0042774365987278294 0.047077753667467796 -0.014197369069938693 0.036529368742735643 0.01583862591245707 -0.0087913938399981923 0.0061247756213370069 0.0052192588836833601 -0.050790366696786798 -0.012402789325272627 -0.059610908501483095
leaf_weight=7.3100549280643454 6.1067151725292232 5.9317886233329755 9.6296757310628873 7.8829693347215679 8.6421001106500608 28.761566951870918 23.91489639878273 115.60252341628075 8.3125372529029828 15.558673441410063 7.3123210966587067 5.9148542433977109 7.0790839940309525
leaf_count=30 25 24 39 32 35 117 97 469 34 63 30 24 29
internal_value=0.00140631 -0.00051071 0.00151755 0.00334081 0.0188538 0.0262938 -0.00113655 -0.00456952 0.0311431 0.0192955 -0.0273913 -0.0242158 -0.0381217
internal_weight=257.96 242.337 223.237 209.992 47.0333 38.3912 162.959 139.517 15.6226 23.4416 13.2441 19.1007 12.9939
internal_count=1048 984 906 852 191 156 661 566 64 95 54 78 53
is_linear=0
shrinkage=0.04
Tree=16
num_leaves=13
num_cat=0
split_feature=33 88 64 14 97 55 33 27 95 11 62 89
split_gain=8.94152 9.44973 8.34757 9.96923 9.2269 7.77034 6.84612 6.71789 8.91623 11.1245 5.59099 2.07466
threshold=5.4583333333333348 0.5709995709995711 0.32539627952456907 83.535714285714292 -1.5982142857142916 24.062500000000004 0.3184523809523796 -12.94270833333333 170.79017857142858 87.645833333333357 0.50771095194186133 0.48370246267843525
decision_type=2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 2 3 4 -1 -3 11 -4 10 -10 -9 -5
right_child=-2 5 7 6 -6 -7 -8 8 9 -11 -12 -13
leaf_value=-0.021346605099593578 -0.036738482853078236 0.058873571365345455 -0.029645258704030009 -0.03604840849327056 0.042575470200724393 0.0048157514196816237 -0.0070588968786982603 0.037451850608501855 -0.041717482942480864 0.0036108240465800787 0.0091615077619154172 -0.067942279025163504
leaf_weight=7.6162292957305908 9.497162461280821 8.4247576445341128 9.057781442999838 6.4219828844070452 6.8739977478981 8.594713091850279 9.099840894341467 19.09737765789032 9.280805364251135 130.09927693009377 26.950699493288994 6.6345582008361816
leaf_count=31 39 35 37 26 28 35 37 78 38 528 109 27
internal_value=0.00135171 0.00280948 0.00069135 -0.0168207 0.0089773 0.0315748 -0.0336925 0.00399111 0.00563418 0.000592579 0.0208943 -0.052255
internal_weight=257.649 248.152 231.133 36.6466 14.4902 17.0195 22.1564 194.486 185.428 139.38 46.0481 13.0565
internal_count=1048 1009 939 149 59 70 90 790 753 566 187 53
is_linear=0
shrinkage=0.04
Tree=17
num_leaves=12
num_cat=0
split_feature=33 88 55 27 10 32 31 60 75 90 14
split_gain=8.28401 8.76598 7.29837 6.5019 6.36672 9.86493 7.51301 7.17497 6.93615 6.2403 5.35126
threshold=5.4583333333333348 0.5709995709995711 24.062500000000004 -12.94270833333333 84.062500000000014 20.062500000000004 25.803571428571434 12.937500000000002 0.42318766937669378 0.19774467682950328 85.535714285714292
decision_type=2 2 2 2 2 2 2 2 2 2 2
left_child=1 3 -3 10 5 8 7 -6 -5 -8 -1
right_child=-2 2 -4 4 6 -7 9 -9 -10 -11 -12
leaf_value=5.5780973020133532e-06 -0.035467582222402753 0.05716359984226644 0.004623749836194309 -0.010644706886582317 -0.01582380049665728 -0.034391445298106116 0.046794179848204068 0.0026044549279971439 0.028201993083561465 0.0052060024656404988 -0.051579011297576027
leaf_weight=6.1612475216388685 9.4450326114892942 8.3315776139497775 8.5935685932636243 9.1481379419565183 59.708019092679024 5.8809057772159568 8.1617092192173022 77.918728813529015 37.500212237238884 19.722670719027519 6.7347901910543442
leaf_count=25 39 35 35 37 243 24 33 316 153 80 28
internal_value=0.00129938 0.00270042 0.030487 0.000663966 0.00229624 0.0144291 -0.00155443 -0.00539046 0.0205838 0.0173788 -0.0269338
internal_weight=257.307 247.862 16.9251 230.936 218.04 52.5293 165.511 137.627 46.6484 27.8844 12.896
internal_count=1048 1009 70 939 886 214 672 559 190 113 53
is_linear=0
shrinkage=0.04
Tree=18
num_leaves=8
num_cat=0
split_feature=33 88 55 11 17 87 45
split_gain=7.67811 8.13711 6.86921 6.16429 9.02438 5.74349 6.3002
threshold=5.4583333333333348 0.5709995709995711 24.062500000000004 74.354166666666671 -10.687499999999998 2.1450000000000005 -0.0034592463243838995
decision_type=2 2 2 2 2 2 2
left_child=1 3 -3 -1 -5 6 -6
right_child=-2 2 -4 4 5 -7 -8
leaf_value=0.036528500712599402 -0.034255437987614025 0.0555735632038485 0.004439425576339767 -0.038357932926460371 -0.0070697236803132311 0.030356748335144887 0.0070235502949047429
leaf_weight=7.4107095897197715 9.3892168849706632 8.2292238324880618 8.5923809558153135 9.6656254380941373 105.90707743167877 10.260427355766295 97.451314657926559
leaf_count=30 39 35 35 40 431 42 396
internal_value=0.00124942 0.00259625 0.0294545 0.000637822 -0.000553374 0.00115717 -0.00031609
internal_weight=256.906 247.517 16.8216 230.695 223.284 213.619 203.358
internal_count=1048 1009 70 939 909 869 827
is_linear=0
shrinkage=0.04
Tree=19
num_leaves=12
num_cat=0
split_feature=33 86 64 12 27 95 11 49 62 29 63
split_gain=7.11864 7.56899 6.11087 12.4985 5.73268 8.29302 10.1577 5.48698 5.06855 4.93802 3.11264
threshold=5.4583333333333348 1.4150000000000003 0.32539627952456907 1.9375000000000002 -12.94270833333333 170.79017857142858 87.645833333333357 0.77175549922512843 0.54605872496633345 41.401785714285722 0.51856200631031568
decision_type=2 2 2 2 2 2 2 2 2 2 2
left_child=1 9 3 8 -4 7 -7 -6 10 -1 -3
right_child=-2 2 4 -5 5 6 -8 -9 -10 -11 -12
leaf_value=0.054127893054378734 -0.033097093119392906 -0.024347971534982155 -0.027558678245584001 0.020697790803193339 0.012679398061121673 -0.0402735013491324 0.0031588930417813959 0.046340548831502577 0.0016157647003835103 0.0053158736723644484 -0.056291390595480988
leaf_weight=7.2425345033407194 9.3303139358758909 10.320196986198431 8.9888450950384122 11.517354950308798 36.484531104564667 9.2175567746162397 131.94763413071632 9.837190628051756 6.3723442256450644 6.1165111809968948 9.2601860165595991
leaf_count=31 39 42 37 47 149 38 536 40 26 25 38
internal_value=0.0012009 0.0024949 0.00082268 -0.0139809 0.00364588 0.00514195 0.000322921 0.0198279 -0.0293706 0.031779 -0.039455
internal_weight=256.635 247.305 233.946 37.4701 196.476 187.487 141.165 46.3217 25.9527 13.359 19.5804
internal_count=1048 1009 953 153 800 763 574 189 106 56 80
is_linear=0
shrinkage=0.04
Tree=20
num_leaves=15
num_cat=0
split_feature=60 18 88 26 36 31 9 24 61 35 75 60 95 51
split_gain=8.03818 7.72948 8.40414 12.0656 10.8001 9.64736 8.20128 6.93288 5.91815 6.68838 6.79855 4.79664 4.70914 4.49622
threshold=15.225000000000001 -0.29166666666667135 0.56584311806914556 14.937500000000002 -0.83124999999999971 19.645833333333339 16.515151515151516 -1.0000000180025095e-35 15.645833333333334 8.5669642857142865 0.58007633587786267 15.775000000000002 222.78125000000003 0.0018411019890782001
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 6 3 4 13 -6 8 -2 9 10 -1 -9 -4 -3
right_child=7 2 12 -5 5 -7 -8 11 -10 -11 -12 -13 -14 -15
leaf_value=-0.0089530653611645094 -0.013396078382306617 -0.012100783857722328 0.043432242412436123 -0.05470139039712181 0.021915508768684949 -0.01294489712125084 0.048027122146776596 0.063122343027819355 0.035214924114047572 -0.036143728924515048 0.011385864248782825 0.018228963324568446 -0.0045408979170662904 -0.049373438013463097
leaf_weight=53.602775543928146 6.6217800229787853 8.524589270353319 6.1861436218023282 11.381553843617437 18.768473565578461 39.295118644833565 6.1146008819341651 6.1183906942605999 7.3963862508535376 8.3620016276836378 51.616437286138535 10.084086462855337 6.9543650448322296 13.191540747880934
leaf_count=219 27 35 26 47 77 160 25 25 30 34 210 41 29 54
internal_value=-0.00155888 -0.00379275 -0.0118627 -0.0161736 -0.0106771 -0.00167662 0.00283011 0.0210883 0.000545703 -0.00171195 0.00102442 0.0351816 0.0180434 -0.0347422
internal_weight=254.218 231.394 104.302 91.1613 79.7797 58.0636 127.092 22.8243 120.978 113.581 105.219 16.2025 13.1405 21.7161
internal_count=1039 946 428 373 326 237 518 93 493 463 429 66 55 89
is_linear=0
shrinkage=0.04
Tree=21
num_leaves=15
num_cat=0
split_feature=60 18 88 26 36 31 9 9 61 35 35 95 51 28
split_gain=7.42241 7.14338 7.76759 11.245 10.0048 8.89861 7.56719 6.33462 5.44938 6.18568 6.60464 4.38569 4.18924 2.06881
threshold=15.225000000000001 -0.29166666666667135 0.56584311806914556 14.937500000000002 -0.83124999999999971 19.645833333333339 16.515151515151516 -4.2748917748917838 15.645833333333334 8.5669642857142865 8.2678571428571441 222.78125000000003 0.0018411019890782001 41.187500000000007
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 6 3 4 12 -6 8 -2 9 10 -1 -4 -3 -9
right_child=7 2 11 -5 5 -7 -8 13 -10 -11 -12 -13 -14 -15
leaf_value=-0.0020580613462284694 -0.0067757984042552816 -0.011632476414499757 0.042031501598029888 -0.052911002493618228 0.021055706218397436 -0.012438211689475422 0.046156795546078516 0.022002629364403069 0.033775328851131076 -0.034815903882803013 0.034011731538430165 -0.0043576236343065185 -0.047663392508499491 0.052589999762468774
leaf_weight=96.316321060061455 8.6104428470134753 8.5132995098829287 6.1375970840454084 11.297693848609923 18.753774255514145 39.26072384417057 6.1079912632703772 7.3507572114467603 7.4030207246541968 8.3342373967170698 8.8703934699296934 6.9569248706102371 13.119665667414663 6.8210117071866989
leaf_count=393 35 35 26 47 77 160 25 30 30 34 36 29 54 28
internal_value=-0.00149909 -0.00364675 -0.0114181 -0.0155653 -0.010268 -0.00161096 0.002718 0.0202838 0.000523866 -0.00164456 0.000983704 0.0173856 -0.033484 0.0367246
internal_weight=253.854 231.072 104.04 90.9452 79.6475 58.0145 127.032 22.7822 120.924 113.521 105.187 13.0945 21.633 14.1718
internal_count=1039 946 428 373 326 237 518 93 493 463 429 55 89 58
is_linear=0
shrinkage=0.04
Tree=22
num_leaves=18
num_cat=0
split_feature=75 10 73 56 97 33 54 28 63 53 73 64 38 53 14 13 63
split_gain=6.8389 8.24825 6.33351 8.88087 6.18565 5.67783 6.55094 5.59704 6.85557 8.66603 7.65659 7.21773 11.1113 8.66033 3.97328 3.82022 2.45765
threshold=0.71055555555555572 112.31250000000001 0.53158244680851074 39.937500000000007 -0.42708333333332854 -3.077380952380953 21.145833333333339 44.937500000000007 0.51560927336724371 25.687500000000004 0.63958726168028501 0.34495732674280544 4.3875000000000011 27.437500000000004 110.18750000000001 98.625000000000014 0.51800407176145413
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=2 5 3 15 -5 -2 -7 8 9 10 -4 12 -10 -13 -11 -1 -8
right_child=1 -3 7 4 -6 6 16 -9 11 14 -12 13 -14 -15 -16 -17 -18
leaf_value=0.015837814614419042 -0.0054090476665669711 -0.020413774544095844 -0.0067905346474414312 0.026068878291681773 -0.024468940768756574 0.010241531744467547 0.073552533225814301 0.020224466305667418 -0.0046266451570048392 -0.021209825690180987 -0.052758572611820695 0.0025546027891674618 -0.064146045205184979 0.052111362242983086 -0.062702945254864487 0.060247018203511145 0.036959564623549768
leaf_weight=5.9841428995132464 7.3666127473115912 8.3540667146444303 65.92797489464283 6.133746981620793 10.522817611694332 9.2578747272491508 5.8667192012071592 11.480442628264425 29.450126886367798 6.6407066881656638 6.3565134406089774 47.05113098025322 6.0492770820856085 6.4109846800565711 8.3170198649167997 6.4289718419313431 5.8796851336956024
leaf_count=25 30 34 271 25 43 38 24 47 121 27 26 193 25 26 34 26 24
internal_value=-0.00144173 0.0145202 -0.00414618 0.0132275 -0.00585849 0.0248068 0.0354041 -0.00683713 -0.00860031 -0.0165676 -0.0108328 -0.000786974 -0.0147691 0.00849727 -0.0442815 0.0388381 0.0552359
internal_weight=253.479 36.725 216.754 29.0697 16.6566 28.3709 21.0043 187.684 176.204 87.2422 72.2845 88.9615 35.4994 53.4621 14.9577 12.4131 11.7464
internal_count=1039 150 889 119 68 116 86 770 723 358 297 365 146 219 61 51 48
is_linear=0
shrinkage=0.04
Tree=23
num_leaves=16
num_cat=0
split_feature=60 18 35 26 21 11 72 34 42 69 65 24 60 29 16
split_gain=6.76622 6.44432 7.43603 8.61975 6.98342 7.45293 9.26335 8.42599 8.25952 6.4395 7.95854 6.13581 4.32375 3.78406 2.36778
threshold=15.225000000000001 -0.29166666666667135 8.8229166666666696 14.937500000000002 0.16515151515151519 123.68750000000001 227.82000000000002 8.4687500000000018 -0.84166666666666667 8.2500000000000018 0.32491602042266449 -1.0000000180025095e-35 15.775000000000002 32.708333333333336 4.9375000000000009
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 4 3 9 5 6 7 -1 -8 10 -3 -2 -13 -5 -11
right_child=11 2 -4 13 -6 -7 8 -9 -10 14 -12 12 -14 -15 -16
leaf_value=0.0068377190178299978 -0.012999968486144695 0.046209360196903707 -0.047916200210550458 -0.016602574178768866 0.044382956007966268 0.041711064554847729 0.012542729748890782 -0.040858831579305734 -0.041878353739748291 -0.047809843446037739 -0.0025966166987321858 0.059424945850961183 0.016650328155874659 -0.061312735990994086 -0.01220746979923833
leaf_weight=86.415405079722404 6.6122610569000271 5.8188596814870825 7.9874863028526297 5.9741610884666425 6.0946186929941168 6.6349345445632926 6.355574399232867 6.3623842000961295 14.977498367428778 6.0902193337678892 65.746306836605072 6.0662906020879772 10.036743447184561 6.1432460248470306 5.8692503720521927
leaf_count=353 27 24 33 25 25 27 26 26 61 25 271 25 41 26 24
internal_value=-0.00138606 -0.00343895 -0.0108389 -0.0077424 0.00260686 0.000498221 -0.00189808 0.00356685 -0.0256652 -0.00316857 0.00137173 0.0194426 0.0327643 -0.0392696 -0.0303376
internal_weight=253.185 230.47 103.63 95.642 126.84 120.746 114.111 92.7778 21.3331 83.5246 71.5652 22.7153 16.103 12.1174 11.9595
internal_count=1039 946 428 395 518 493 466 379 87 344 295 93 66 51 49
is_linear=0
shrinkage=0.04
Tree=24
num_leaves=15
num_cat=0
split_feature=60 18 88 27 62 9 11 72 12 48 36 24 43 95
split_gain=6.25224 5.95135 7.10947 8.50211 8.95714 6.45695 6.86321 8.56032 7.77198 7.68621 6.38574 5.67858 4.21565 3.98223
threshold=15.225000000000001 -0.29166666666667135 0.56584311806914556 -6.9999999999999973 0.46489792570397087 16.515151515151516 123.68750000000001 227.82000000000002 -0.75595238095237949 0.025274304665849606 -1.705357142857143 -1.0000000180025095e-35 0.50860236372279055 222.78125000000003
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 5 3 -3 -5 6 7 9 -9 -1 -6 -2 -13 -4
right_child=11 2 13 4 10 -7 -8 8 -10 -11 -12 12 -14 -15
leaf_value=0.0087138894445027461 -0.01248475536504879 -0.052100240369848241 0.040799272900499305 0.024363685744045852 -0.051816351953662361 0.042735686521304467 0.040009035522266888 -0.046006108176515285 0.00273483800408156 -0.021635092287458264 -0.0115203606180113 0.0090547062831142257 0.050240538556617337 -0.0035196794361016148
leaf_weight=76.604642063379288 6.6097762733697918 8.6576687693595868 6.0909906178712827 10.348592042922972 6.9732603579759589 6.0767100751399985 6.6403718292713156 11.978713899850847 9.2968079298734647 16.170377686619759 64.432133376598358 7.2736790329217937 8.7717214673757535 6.9399073719978333
leaf_count=313 27 37 26 43 29 25 27 49 38 66 264 30 36 29
internal_value=-0.00133294 -0.00330607 -0.0104257 -0.0144068 -0.0104152 0.0025036 0.000477939 -0.00182368 -0.0247077 0.00342416 -0.0154556 0.0187169 0.0315702 0.0171962
internal_weight=252.865 230.21 103.443 90.4117 81.754 126.768 120.691 114.051 21.2755 92.775 71.4054 22.6552 16.0454 13.0309
internal_count=1039 946 428 373 336 518 493 466 87 379 293 93 66 55
is_linear=0
shrinkage=0.04
Tree=25
num_leaves=15
num_cat=0
split_feature=33 87 85 10 83 10 39 86 72 49 13 16 38 27
split_gain=7.25855 6.38507 8.41516 9.2502 8.14048 7.48236 8.40852 7.85339 11.0386 8.60242 6.83818 5.86449 5.49791 3.04698
threshold=5.4583333333333348 1.5850000000000002 1.0000000180025095e-35 117.22500000000001 0.54234668644906048 116.93750000000001 -1.0000000180025095e-35 1.5650000000000002 162.69282608695656 0.75102962846491172 104.77500000000002 -8.9374999999999982 4.1833333333333345 0.020833333333333655
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 10 5 4 11 7 -7 9 -9 -3 -1 -4 -8 -6
right_child=-2 2 3 -5 13 6 12 8 -10 -11 -12 -13 -14 -15
leaf_value=-0.041539672781130549 -0.031422067002150791 0.042995214924070081 0.042296436297125074 -0.023883650662784965 0.072297264891014304 -0.009582647504040049 0.058812399794648744 0.021985462315431858 -0.014896003510037621 -0.0040262327529876152 0.0035283522947714635 0.0075685995167286092 0.011039775290075307 0.0323219849604569
leaf_weight=10.137894913554193 10.139534831047056 10.71300193667412 9.8288029283285123 7.8387844264507285 5.8450987040996534 9.664890334010126 8.7909274846315366 15.362282529473303 83.881821304559708 14.860007524490355 11.493947580456732 37.329559937119484 6.8639661371707907 6.381435826420784
leaf_count=42 43 45 40 32 24 40 36 63 345 62 47 153 28 26
internal_value=0.00172561 0.00313194 0.0051945 0.0169566 0.0223475 -7.19483e-05 0.0197544 -0.00409382 -0.00918701 0.0156719 -0.0175931 0.0148066 0.0378663 0.0514328
internal_weight=249.132 238.992 217.361 67.2237 59.3849 150.137 25.3198 124.817 99.2441 25.573 21.6318 47.1584 15.6549 12.2265
internal_count=1026 983 894 275 243 619 104 515 408 107 89 193 64 50
is_linear=0
shrinkage=0.04
Tree=26
num_leaves=13
num_cat=0
split_feature=48 87 75 13 13 26 98 83 54 28 63 26
split_gain=6.07316 5.62173 8.54781 6.21306 5.95377 6.62941 7.29602 6.07248 5.64674 6.51449 5.46195 3.27744
threshold=0.064872666259492118 1.5850000000000002 0.69045454545454554 104.77500000000002 120.56250000000001 15.072916666666668 -11.562499999999998 0.49675474226372424 20.535714285714288 40.187500000000007 0.51085383914331295 6.072916666666667
decision_type=2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 3 4 -1 5 6 -3 -7 -4 11 -2 -10
right_child=10 2 8 -5 -6 7 -8 -9 9 -11 -12 -13
leaf_value=-0.039674658535771484 -0.052512934017087537 -0.014023176104202862 -0.005798976431383154 0.0036351517375994916 -0.026305412473366974 0.0062814492481233046 0.010148203473004854 -0.041049983242601396 0.028561765576098901 0.0087152875473318417 -0.0029143682076594587 0.066287713324515871
leaf_weight=9.8422116041183489 6.0319006294012061 23.719328477978706 7.8761165887117377 11.48290069401264 11.350633054971693 7.9110089093446758 126.75102043151855 9.5997582674026471 6.5257693231105831 10.448119834065436 8.6423251926898939 8.462289631366728
leaf_count=41 25 97 32 47 47 33 522 41 27 43 36 35
internal_value=0.00166 0.00322559 0.00518911 -0.0163537 0.00173262 0.00362718 0.00633796 -0.0196666 0.0237966 0.0329606 -0.0233021 0.0498619
internal_weight=248.643 233.969 212.644 21.3251 179.332 167.981 150.47 17.5108 33.3123 25.4362 14.6742 14.9881
internal_count=1026 965 877 88 740 693 619 74 137 105 61 62
is_linear=0
shrinkage=0.04
Tree=27
num_leaves=15
num_cat=0
split_feature=33 40 43 64 56 29 74 7 16 12 36 53 32 33
split_gain=6.67537 6.00723 6.58775 6.4086 6.22146 8.22654 7.17113 5.57557 10.8982 10.7588 6.64454 7.95563 6.24334 4.86842
threshold=5.4583333333333348 10.937500000000002 0.53985767334155887 0.34495732674280544 30.562500000000004 32.791666666666679 0.62250000000000016 59.166666666666679 3.7916666666666647 9.1339285714285712 -0.34166666666666673 22.387500000000003 23.791666666666668 -0.29166666666666602
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 3 7 -1 -4 -6 -7 8 9 -3 12 -12 -9 -5
right_child=-2 2 4 13 5 6 -8 10 -10 -11 11 -13 -14 -15
leaf_value=-0.04539631182232725 -0.030331123415790143 0.0036500818553591068 -0.017115468469559102 -0.026046588004955247 0.065457960355147352 0.02547423439415672 -0.018488966847576102 0.017724882848208653 -0.041424367213409224 -0.047501812723565191 -0.0045522301063454888 0.04416916038732871 -0.038718688099103261 0.024752126334750945
leaf_weight=8.1224037855863589 10.053963825106619 115.14034233987331 6.9894797801971427 5.7684260606765747 6.3827460259199134 27.928439050912857 7.5389710813760749 6.7571853250265104 10.671643391251562 6.9777127951383582 8.1633969545364398 15.628093481063841 5.8501180559396744 6.3321559280157071
leaf_count=34 43 476 29 24 26 114 31 28 44 29 34 64 24 26
internal_value=0.00159555 0.00294283 0.00487723 -0.0179123 0.0178183 0.0236527 0.0161294 0.00114151 -0.00266021 0.00072731 0.0150109 0.0274518 -0.00846641 0.000536049
internal_weight=248.305 238.251 218.028 20.223 48.8396 41.8502 35.4674 169.188 132.79 122.118 36.3988 23.7915 12.6073 12.1006
internal_count=1026 983 899 84 200 171 145 699 549 505 150 98 52 50
is_linear=0
shrinkage=0.04
Tree=28
num_leaves=13
num_cat=0
split_feature=33 27 14 48 24 53 59 57 34 47 81 50
split_gain=6.19142 5.48502 8.27407 6.44704 5.11208 5.11786 5.55155 5.07182 5.97421 8.76085 8.56983 3.46147
threshold=5.4583333333333348 -2.3020833333333326 85.312500000000014 0.064872666259492118 -1.0000000180025095e-35 21.291666666666668 23.387500000000003 45.687500000000007 6.8875000000000011 0.37218684558476361 1.7150000000000001 0.74075516458343793
decision_type=2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 2 11 7 -4 -6 -7 8 10 -10 -3 -1
right_child=-2 3 4 -5 5 6 -8 -9 9 -11 -12 -13
leaf_value=0.036467676081935275 -0.029325034647802492 0.012704755728250297 -0.051764929617036118 -0.023669451716433799 -0.044508533316403635 0.026446805663908529 -0.027271352696527536 -0.022833580306035422 0.022463638898768467 -0.0033374130186977686 -0.014458359978742813 -0.0028572785771652266
leaf_weight=6.0484680682420722 9.9843682050704938 40.375474900007248 7.6424536705017116 11.367303296923636 5.5812820345163372 6.6618274450302106 5.7221347242593765 8.4949828088283521 73.357022643089294 29.534479916095734 34.431760281324387 8.7800204604864103
leaf_count=25 43 166 32 47 24 28 24 35 302 121 143 36
internal_value=0.0015332 0.00282775 -0.0105946 0.005575 -0.0243635 -0.0127068 0.00162579 0.0073604 0.00880384 0.0150576 0.000202303 0.0131832
internal_weight=247.982 237.997 40.4362 197.561 25.6077 17.9652 12.384 186.194 177.699 102.892 74.8072 14.8285
internal_count=1026 983 169 814 108 76 52 767 732 423 309 61
is_linear=0
shrinkage=0.04
Tree=29
num_leaves=17
num_cat=0
split_feature=87 85 10 83 41 73 86 39 8 44 13 76 61 36 97 45
split_gain=5.37924 8.77104 7.1461 6.5763 6.24618 5.74552 7.24104 5.90731 7.82727 5.90939 5.68942 4.92086 12.7968 7.49136 4.69159 2.29423
threshold=1.5850000000000002 1.0000000180025095e-35 117.22500000000001 0.54234668644906048 10.812500000000002 0.53158244680851074 1.7450000000000003 -0.61249999999999993 47.727272727272734 0.48331411181158185 104.77500000000002 2.4650000000000003 12.866071428571431 0.28869047619047633 1.4062500000000002 -0.013905145374354397
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=10 4 3 11 -2 6 -6 8 -7 -9 -1 13 -13 -3 -12 -5
right_child=1 2 -4 15 5 7 -8 9 -10 -11 14 12 -14 -15 -16 -17
leaf_value=-0.039294680659493833 -0.031671221811972641 0.037455766277146953 -0.018600965920339822 0.063774142191859146 0.050889907105456336 -0.036046187526128669 -0.0011468294092016626 -0.016947588402893 -0.00031638211057623142 0.0084049263294318916 0.026762860869410491 0.035225255717854904 -0.041754796373233906 -0.00065805839704901963 -0.023845024484234915 0.029272484285479398
leaf_weight=10.1972071826458 10.532053381204603 19.837578102946281 8.7409028112888318 6.0329279154539108 7.6620817482471493 16.978984668850899 9.6890967190265638 18.620566442608833 23.233949556946754 70.049215823411942 5.8855016827583295 6.3545227348804456 7.5724885612726212 14.127186641097067 5.8383642733097076 6.3081436455249769
leaf_count=43 44 82 36 25 32 71 40 78 96 290 24 26 31 58 24 26
internal_value=0.0014732 0.00331024 0.0151971 0.0201018 -0.00191971 0.000223048 0.021832 -0.00268611 -0.0154025 0.00308092 -0.0174444 0.0133924 -0.00663086 0.0216028 0.00156066 0.0461386
internal_weight=247.661 225.74 68.9738 60.2328 156.766 146.234 17.3512 128.883 40.2129 88.6698 21.9211 47.8918 13.927 33.9648 11.7239 12.3411
internal_count=1026 935 284 248 651 607 72 535 167 368 91 197 57 140 48 51
is_linear=0
shrinkage=0.04
end of trees
feature_importances:
Column_33=16
Column_55=14
Column_75=13
Column_95=13
Column_17=12
Column_27=12
Column_40=12
Column_60=12
Column_88=12
Column_87=11
Column_11=10
Column_97=10
Column_14=9
Column_36=9
Column_53=9
Column_64=9
Column_86=9
Column_31=8
Column_54=8
Column_73=8
Column_15=7
Column_26=7
Column_29=7
Column_39=7
Column_62=7
Column_63=7
Column_9=6
Column_10=6
Column_12=6
Column_18=6
Column_28=6
Column_51=6
Column_61=6
Column_77=6
Column_13=5
Column_24=5
Column_35=5
Column_38=5
Column_46=5
Column_48=5
Column_58=5
Column_65=5
Column_74=5
Column_80=5
Column_8=4
Column_30=4
Column_32=4
Column_41=4
Column_43=4
Column_45=4
Column_49=4
Column_72=4
Column_82=4
Column_83=4
Column_16=3
Column_25=3
Column_34=3
Column_47=3
Column_50=3
Column_52=3
Column_56=3
Column_59=3
Column_79=3
Column_81=3
Column_85=3
Column_89=3
Column_90=3
Column_98=3
Column_37=2
Column_42=2
Column_57=2
Column_69=2
Column_76=2
Column_96=2
Column_7=1
Column_21=1
Column_44=1
Column_66=1
Column_68=1
Column_71=1
parameters:
[boosting: gbdt]
[objective: binary]
[metric: binary_logloss]
[tree_learner: serial]
[device_type: cpu]
[data_sample_strategy: bagging]
[data: ]
[valid: ]
[num_iterations: 1200]
[learning_rate: 0.04]
[num_leaves: 31]
[num_threads: 4]
[seed: 42]
[deterministic: 0]
[force_col_wise: 0]
[force_row_wise: 0]
[histogram_pool_size: -1]
[max_depth: 6]
[min_data_in_leaf: 24]
[min_sum_hessian_in_leaf: 0.001]
[bagging_fraction: 0.84]
[pos_bagging_fraction: 1]
[neg_bagging_fraction: 1]
[bagging_freq: 5]
[bagging_seed: 400]
[bagging_by_query: 0]
[feature_fraction: 0.82]
[feature_fraction_bynode: 1]
[feature_fraction_seed: 30056]
[extra_trees: 0]
[extra_seed: 12879]
[early_stopping_round: 0]
[early_stopping_min_delta: 0]
[first_metric_only: 0]
[max_delta_step: 0]
[lambda_l1: 0]
[lambda_l2: 0]
[linear_lambda: 0]
[min_gain_to_split: 0]
[drop_rate: 0.1]
[max_drop: 50]
[skip_drop: 0.5]
[xgboost_dart_mode: 0]
[uniform_drop: 0]
[drop_seed: 17869]
[top_rate: 0.2]
[other_rate: 0.1]
[min_data_per_group: 100]
[max_cat_threshold: 32]
[cat_l2: 10]
[cat_smooth: 10]
[max_cat_to_onehot: 4]
[top_k: 20]
[monotone_constraints: ]
[monotone_constraints_method: basic]
[monotone_penalty: 0]
[feature_contri: ]
[forcedsplits_filename: ]
[refit_decay_rate: 0.9]
[cegb_tradeoff: 1]
[cegb_penalty_split: 0]
[cegb_penalty_feature_lazy: ]
[cegb_penalty_feature_coupled: ]
[path_smooth: 0]
[interaction_constraints: ]
[verbosity: -1]
[saved_feature_importance_type: 0]
[use_quantized_grad: 0]
[num_grad_quant_bins: 4]
[quant_train_renew_leaf: 0]
[stochastic_rounding: 1]
[linear_tree: 0]
[max_bin: 255]
[max_bin_by_feature: ]
[min_data_in_bin: 3]
[bin_construct_sample_cnt: 200000]
[data_random_seed: 175]
[is_enable_sparse: 1]
[enable_bundle: 1]
[use_missing: 1]
[zero_as_missing: 0]
[feature_pre_filter: 1]
[pre_partition: 0]
[two_round: 0]
[header: 0]
[label_column: ]
[weight_column: ]
[group_column: ]
[ignore_column: ]
[categorical_feature: ]
[forcedbins_filename: ]
[precise_float_parser: 0]
[parser_config_file: ]
[objective_seed: 16083]
[num_class: 1]
[is_unbalance: 0]
[scale_pos_weight: 1]
[sigmoid: 1]
[boost_from_average: 1]
[reg_sqrt: 0]
[alpha: 0.9]
[fair_c: 1]
[poisson_max_delta_step: 0.7]
[tweedie_variance_power: 1.5]
[lambdarank_truncation_level: 30]
[lambdarank_norm: 1]
[label_gain: ]
[lambdarank_position_bias_regularization: 0]
[eval_at: ]
[multi_error_top_k: 1]
[auc_mu_weights: ]
[num_machines: 1]
[local_listen_port: 12400]
[time_out: 120]
[machine_list_filename: ]
[machines: ]
[gpu_platform_id: -1]
[gpu_device_id: -1]
[gpu_use_dp: 0]
[num_gpu: 1]
end of parameters
pandas_categorical:null
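The `feature_importances:` and `parameters:` sections that close the model file above are plain `name=value` and `[key: value]` lines, so they can be read without loading the model. The sketch below is illustrative only: `parse_model_tail` and `EXCERPT` are hypothetical names, the excerpt is copied from the dump above, and it assumes the importances are integer split counts (consistent with `saved_feature_importance_type: 0`).

```python
from typing import Dict, Tuple


def parse_model_tail(text: str) -> Tuple[Dict[str, int], Dict[str, str]]:
    """Parse the trailing sections of a LightGBM text model (hypothetical helper)."""
    importances: Dict[str, int] = {}
    params: Dict[str, str] = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "feature_importances:":
            section = "importances"
        elif line == "parameters:":
            section = "params"
        elif line == "end of parameters":
            section = None
        elif section == "importances" and "=" in line:
            # e.g. "Column_33=16" (assumed to be an integer split count)
            name, count = line.split("=", 1)
            importances[name] = int(count)
        elif section == "params" and line.startswith("[") and line.endswith("]"):
            # e.g. "[learning_rate: 0.04]"; empty values such as "[data: ]" become ""
            key, _, value = line[1:-1].partition(":")
            params[key.strip()] = value.strip()
    return importances, params


# A few lines copied from the dump above.
EXCERPT = """\
feature_importances:
Column_33=16
Column_55=14
Column_75=13
parameters:
[boosting: gbdt]
[learning_rate: 0.04]
[max_depth: 6]
[data: ]
end of parameters
pandas_categorical:null
"""

importances, params = parse_model_tail(EXCERPT)
top = max(importances, key=importances.get)
print(top, importances[top])
print(params["learning_rate"], params["max_depth"])
```

Values are kept as strings in `params` because the block mixes numbers, names, and empty entries; callers can convert the keys they need.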
@@ -0,0 +1,202 @@
tree
version=v4
num_class=1
num_tree_per_iteration=1
label_index=0
max_feature_idx=98
objective=binary sigmoid:1
feature_names=Column_0 Column_1 Column_2 Column_3 Column_4 Column_5 Column_6 Column_7 Column_8 Column_9 Column_10 Column_11 Column_12 Column_13 Column_14 Column_15 Column_16 Column_17 Column_18 Column_19 Column_20 Column_21 Column_22 Column_23 Column_24 Column_25 Column_26 Column_27 Column_28 Column_29 Column_30 Column_31 Column_32 Column_33 Column_34 Column_35 Column_36 Column_37 Column_38 Column_39 Column_40 Column_41 Column_42 Column_43 Column_44 Column_45 Column_46 Column_47 Column_48 Column_49 Column_50 Column_51 Column_52 Column_53 Column_54 Column_55 Column_56 Column_57 Column_58 Column_59 Column_60 Column_61 Column_62 Column_63 Column_64 Column_65 Column_66 Column_67 Column_68 Column_69 Column_70 Column_71 Column_72 Column_73 Column_74 Column_75 Column_76 Column_77 Column_78 Column_79 Column_80 Column_81 Column_82 Column_83 Column_84 Column_85 Column_86 Column_87 Column_88 Column_89 Column_90 Column_91 Column_92 Column_93 Column_94 Column_95 Column_96 Column_97 Column_98
feature_infos=none none none none none none none [0:100] [0:100] [-100:100] [54:128.875] [61:138] [-41.333333333333329:24] [61:130.125] [59:133] [-50:34.916666666666671] [-28.666666666666657:32.5] [-24.5:27] [-35.5:31.833333333333329] [0:1] [0:1] [-1:1] [0:12] [0:12] [-12:10] [0.91666666666666663:186.1875] [0.85416666666666663:185.08333333333337] [-179.02083333333334:184.16666666666663] [20:54] [22:56] [-21:15.833333333333332] [11:34.5] [9:36] [-15:13.333333333333332] [2:13] [3:13] [-8:5.5] [0.875:9] [0:12] [-9:4.875] [7:22] [8:23] [-10:10] [0.41025641025641019:0.67647058823529416] [0.39730639730639727:0.76744186046511631] [-0.3174418604651163:0.22647058823529409] [0.20419254658385089:0.4825000000000001] [0.1875:0.54166666666666663] [-0.22171945701357459:0.17923280423280419] [0.47826086956521741:0.87898457080200498] [0.42857142857142849:0.88636363636363635] [-0.29647435897435898:0.38962801311591622] [11:39.666666666666664] [15.5:40] [10:31.875] [11.5:34] [15:56] [17:57] [12:32.125] [10:38] [8:20] [7:22] [0.40000000000000002:0.69047619047619047] [0.3902439024390244:0.71739130434782605] [0.23753976670201479:0.51851851851851849] [0.1707317073170731:0.54545454545454541] [0:8] [0:1] [120:295] [-38:36] [0:1] [0:1] [140.39130434782609:233.09999999999999] [0:1] [0:1] [0:1] [1.05:4.1100000000000003] [1.05:4.1299999999999999] [0.2034883720930232:0.79729729729729726] [0.20270270270270269:0.79651162790697683] [145.5:257.5] [1.3:7.7599999999999998] [1.4099999999999999:7.5599999999999996] [0.1789976133651551:0.75977653631284914] [0.24022346368715081:0.82100238663484482] [-15.5:12.5] [1.3:2.7200000000000002] [1.1799999999999999:2.7799999999999998] [0.36298076923076922:0.64953271028037374] [0.35046728971962621:0.63701923076923073] [0.1918833727344364:0.21555133935749429] none none none none [132:250.6875] [126.25:256.625] [-63.5:88.6875] [-35.5:27.5]
tree_sizes=1714 1917
Tree=0
num_leaves=15
num_cat=0
split_feature=72 82 49 25 36 25 17 77 34 69 25 40 80 40
split_gain=11.2305 10.9895 14.5557 11.7291 11.5678 6.991 6.42947 6.52716 10.0318 9.34909 5.86638 5.89396 5.34606 8.20903
threshold=232.51500000000001 1.8250000000000004 0.74446554478882077 2.8645833333333335 -0.46874999999999994 1.0104166666666667 12.562500000000002 2.6750000000000003 5.6339285714285721 -5.4166666666666652 3.1145833333333335 12.803571428571431 162.00000000000003 13.937500000000002
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=1 5 10 4 -4 -1 7 8 -7 -9 -3 -12 -5 -14
right_child=-2 2 3 12 -6 6 -8 9 -10 -11 11 -13 13 -15
leaf_value=-0.014742735215146609 -0.095823069621212592 -0.030752227995962191 -0.086785452728816673 0.017792685597478597 -0.021483574280753166 -0.0010203128315903962 -0.085759203191584921 -0.020079232808751797 -0.05043362279072628 -0.074223259488395713 -0.046761720776777753 -0.098557138597063443 -0.031433483007911781 0.014451045738105336
leaf_weight=7.4955528974533072 7.7454046607017508 6.2462940812110928 7.995256423950198 7.7454046607017508 9.4943670034408552 6.99584937095642 9.7442187666892988 5.9964423179626456 108.93536877632141 34.229691565036774 5.9964423179626456 8.4949599504470807 23.486065745353702 8.4949599504470807
leaf_count=30 31 25 32 31 38 28 39 24 436 137 24 34 94 34
internal_value=-0.0483827 -0.0469209 -0.0344467 -0.0240408 -0.0513359 -0.0525288 -0.054236 -0.052269 -0.0474518 -0.0661521 -0.063157 -0.0771246 -0.0120241 -0.0192454
internal_weight=259.096 251.351 77.9538 57.2161 17.4896 173.397 165.902 156.157 115.931 40.2261 20.7377 14.4914 39.7264 31.981
internal_count=1037 1006 312 229 70 694 664 625 464 161 83 58 159 128
is_linear=0
shrinkage=1
Tree=1
num_leaves=17
num_cat=0
split_feature=52 63 82 49 63 11 9 22 47 27 26 50 60 68 37 64
split_gain=9.46149 12.0286 10.1725 11.5657 11.422 11.4943 11.2254 9.35335 10.3758 7.46163 7.15253 10.8161 5.87382 5.6308 5.10807 3.94362
threshold=18.267857142857146 0.57978414386394084 1.7650000000000003 0.74446554478882077 0.52121781606433182 81.062500000000014 8.7121212121212164 1.5000000000000002 0.36820747200853426 0.06249999999999991 7.0312500000000009 0.79401716042796011 12.35416666666667 161.75000000000003 4.5500000000000007 0.34986080718052731
decision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
left_child=15 2 4 9 5 -2 -7 8 -6 13 11 -5 -12 -4 -9 -1
right_child=1 -3 3 10 7 6 -8 14 -10 -11 12 -13 -14 -15 -16 -17
leaf_value=-0.057339030773059012 0.028542958096860364 -0.044782355190073488 0.032618410031766333 0.0057409124284739923 0.013389736272609485 -0.026136292703093016 0.010233031617588078 0.056167817502913067 -0.029699787865729181 -0.038327130963158276 0.074132616447759023 0.052492105838845886 0.026735863001827433 -0.01725653802460729 0.0087599672849197358 -0.011453089463718463
leaf_weight=5.9920798838138563 9.7432544529438001 8.4919032305479032 7.2431264519691485 42.224371567368507 31.477784723043442 57.952298685908318 17.733558729290962 9.2429284304380399 12.488689020276068 9.2367470711469633 6.4985252916812923 9.7452246546745283 11.743822261691092 7.2440863251686096 5.9950847774744025 5.9950532913208008
leaf_count=24 39 34 29 169 126 232 71 37 50 37 26 39 47 29 24 24
internal_value=0.000314777 0.00199863 0.0036638 0.0139129 -0.00299273 -0.0123505 -0.0176148 0.01051 0.00115015 -0.010233 0.0220716 0.0145076 0.0436201 0.00767928 0.0375162 -0.0343904
internal_weight=259.049 247.061 238.57 93.9359 144.634 85.4291 75.6859 59.2045 43.9665 23.724 70.2119 51.9696 18.2423 14.4872 15.238 11.9871
internal_count=1037 989 955 376 579 342 303 237 176 95 281 208 73 58 61 48
is_linear=0
shrinkage=0.04
end of trees
feature_importances:
Column_25=3
Column_40=2
Column_49=2
Column_63=2
Column_82=2
Column_9=1
Column_11=1
Column_17=1
Column_22=1
Column_26=1
Column_27=1
Column_34=1
Column_36=1
Column_37=1
Column_47=1
Column_50=1
Column_52=1
Column_60=1
Column_64=1
Column_68=1
Column_69=1
Column_72=1
Column_77=1
Column_80=1
parameters:
[boosting: gbdt]
[objective: binary]
[metric: binary_logloss]
[tree_learner: serial]
[device_type: cpu]
[data_sample_strategy: bagging]
[data: ]
[valid: ]
[num_iterations: 1200]
[learning_rate: 0.04]
[num_leaves: 31]
[num_threads: 4]
[seed: 42]
[deterministic: 0]
[force_col_wise: 0]
[force_row_wise: 0]
[histogram_pool_size: -1]
[max_depth: 6]
[min_data_in_leaf: 24]
[min_sum_hessian_in_leaf: 0.001]
[bagging_fraction: 0.84]
[pos_bagging_fraction: 1]
[neg_bagging_fraction: 1]
[bagging_freq: 5]
[bagging_seed: 400]
[bagging_by_query: 0]
[feature_fraction: 0.82]
[feature_fraction_bynode: 1]
[feature_fraction_seed: 30056]
[extra_trees: 0]
[extra_seed: 12879]
[early_stopping_round: 0]
[early_stopping_min_delta: 0]
[first_metric_only: 0]
[max_delta_step: 0]
[lambda_l1: 0]
[lambda_l2: 0]
[linear_lambda: 0]
[min_gain_to_split: 0]
[drop_rate: 0.1]
[max_drop: 50]
[skip_drop: 0.5]
[xgboost_dart_mode: 0]
[uniform_drop: 0]
[drop_seed: 17869]
[top_rate: 0.2]
[other_rate: 0.1]
[min_data_per_group: 100]
[max_cat_threshold: 32]
[cat_l2: 10]
[cat_smooth: 10]
[max_cat_to_onehot: 4]
[top_k: 20]
[monotone_constraints: ]
[monotone_constraints_method: basic]
[monotone_penalty: 0]
[feature_contri: ]
[forcedsplits_filename: ]
[refit_decay_rate: 0.9]
[cegb_tradeoff: 1]
[cegb_penalty_split: 0]
[cegb_penalty_feature_lazy: ]
[cegb_penalty_feature_coupled: ]
[path_smooth: 0]
[interaction_constraints: ]
[verbosity: -1]
[saved_feature_importance_type: 0]
[use_quantized_grad: 0]
[num_grad_quant_bins: 4]
[quant_train_renew_leaf: 0]
[stochastic_rounding: 1]
[linear_tree: 0]
[max_bin: 255]
[max_bin_by_feature: ]
[min_data_in_bin: 3]
[bin_construct_sample_cnt: 200000]
[data_random_seed: 175]
[is_enable_sparse: 1]
[enable_bundle: 1]
[use_missing: 1]
[zero_as_missing: 0]
[feature_pre_filter: 1]
[pre_partition: 0]
[two_round: 0]
[header: 0]
[label_column: ]
[weight_column: ]
[group_column: ]
[ignore_column: ]
[categorical_feature: ]
[forcedbins_filename: ]
[precise_float_parser: 0]
[parser_config_file: ]
[objective_seed: 16083]
[num_class: 1]
[is_unbalance: 0]
[scale_pos_weight: 1]
[sigmoid: 1]
[boost_from_average: 1]
[reg_sqrt: 0]
[alpha: 0.9]
[fair_c: 1]
[poisson_max_delta_step: 0.7]
[tweedie_variance_power: 1.5]
[lambdarank_truncation_level: 30]
[lambdarank_norm: 1]
[label_gain: ]
[lambdarank_position_bias_regularization: 0]
[eval_at: ]
[multi_error_top_k: 1]
[auc_mu_weights: ]
[num_machines: 1]
[local_listen_port: 12400]
[time_out: 120]
[machine_list_filename: ]
[machines: ]
[gpu_platform_id: -1]
[gpu_device_id: -1]
[gpu_use_dp: 0]
[num_gpu: 1]
end of parameters
pandas_categorical:null
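The per-tree arrays in the dump above describe each tree as flat lists. The sketch below walks `Tree=0` of this file, using arrays copied from the dump. It rests on a few assumptions about the text format that should be checked against LightGBM itself: a negative child `c` refers to leaf `~c`, a row goes left when `value <= threshold`, and stored leaf values already include shrinkage. Missing-value handling and categorical splits are left out (the dump shows `num_cat=0`). `leaf_index`, `tree0_output`, and `sigmoid` are hypothetical helper names.

```python
import math
from typing import Sequence

# Arrays copied from Tree=0 of the model dump above.
SPLIT_FEATURE = [72, 82, 49, 25, 36, 25, 17, 77, 34, 69, 25, 40, 80, 40]
THRESHOLD = [
    232.51500000000001, 1.8250000000000004, 0.74446554478882077,
    2.8645833333333335, -0.46874999999999994, 1.0104166666666667,
    12.562500000000002, 2.6750000000000003, 5.6339285714285721,
    -5.4166666666666652, 3.1145833333333335, 12.803571428571431,
    162.00000000000003, 13.937500000000002,
]
LEFT_CHILD = [1, 5, 10, 4, -4, -1, 7, 8, -7, -9, -3, -12, -5, -14]
RIGHT_CHILD = [-2, 2, 3, 12, -6, 6, -8, 9, -10, -11, 11, -13, 13, -15]
LEAF_VALUE = [
    -0.014742735215146609, -0.095823069621212592, -0.030752227995962191,
    -0.086785452728816673, 0.017792685597478597, -0.021483574280753166,
    -0.0010203128315903962, -0.085759203191584921, -0.020079232808751797,
    -0.05043362279072628, -0.074223259488395713, -0.046761720776777753,
    -0.098557138597063443, -0.031433483007911781, 0.014451045738105336,
]


def leaf_index(row: Sequence[float]) -> int:
    """Walk the tree from the root; assumes numeric splits and no NaN inputs."""
    node = 0
    while node >= 0:
        if row[SPLIT_FEATURE[node]] <= THRESHOLD[node]:
            node = LEFT_CHILD[node]
        else:
            node = RIGHT_CHILD[node]
    return ~node  # negative child c encodes leaf ~c (assumed convention)


def tree0_output(row: Sequence[float]) -> float:
    return LEAF_VALUE[leaf_index(row)]


def sigmoid(raw_score: float) -> float:
    """Maps a summed raw score to a probability for objective=binary sigmoid:1."""
    return 1.0 / (1.0 + math.exp(-raw_score))


# 99 features, matching max_feature_idx=98 in the header.
row_a = [0.0] * 99
row_a[72] = 240.0  # above the root threshold, so the row exits at the first right child

row_b = [0.0] * 99
row_b[72], row_b[82], row_b[49] = 200.0, 2.0, 0.9
row_b[25], row_b[80], row_b[40] = 5.0, 200.0, 20.0

print(leaf_index(row_a), tree0_output(row_a))
print(leaf_index(row_b), tree0_output(row_b))
```

A full prediction would sum the outputs of every tree in the file before applying `sigmoid`; a single tree gives only a partial score.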
File diff suppressed because one or more lines are too long
@@ -0,0 +1,569 @@
"""
Calibration Module for XGBoost Models
=====================================
Calibrates raw probabilities from XGBoost models using Isotonic Regression.
Ensures that a predicted probability of 70% actually corresponds to a 70% win rate.
Usage:
from ai_engine.models.calibration import Calibrator
calibrator = Calibrator()
calibrated_prob = calibrator.calibrate("ms", raw_prob)
# Training new calibration models:
calibrator.train_calibration(valid_df, market="ms")
"""
import os
import pickle
import json
import numpy as np
import pandas as pd
from datetime import datetime
from typing import Dict, List, Optional, Tuple, Any
from sklearn.isotonic import IsotonicRegression
from sklearn.calibration import calibration_curve
from sklearn.metrics import brier_score_loss
AI_ENGINE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
CALIBRATION_DIR = os.path.join(AI_ENGINE_DIR, "models", "calibration")
os.makedirs(CALIBRATION_DIR, exist_ok=True)
# Supported markets for calibration
SUPPORTED_MARKETS = [
    "ms",  # Match Result (1X2) - multi-class, calibrated per class
    "ms_home",  # Standard Home win probability
    "ms_home_heavy_fav",  # Context: home odds <= 1.40
    "ms_home_fav",  # Context: 1.40 < home odds <= 1.80
    "ms_home_balanced",  # Context: 1.80 < home odds <= 2.50
    "ms_home_underdog",  # Context: home odds > 2.50
    "ms_draw",  # Draw probability
    "ms_away",  # Away win probability
    "ou15",  # Over/Under 1.5
    "ou25",  # Over/Under 2.5
    "ou35",  # Over/Under 3.5
    "btts",  # Both Teams to Score
    "ht_ft",  # Half-Time/Full-Time
    "dc",  # Double Chance
    "ht",  # Half-Time Result
    "ht_home",  # Half-Time Home win
    "ht_draw",  # Half-Time Draw
    "ht_away",  # Half-Time Away win
]
class CalibrationMetrics:
    """Stores calibration quality metrics for a market."""

    def __init__(self):
        self.brier_score: float = 0.0
        self.calibration_error: float = 0.0
        self.sample_count: int = 0
        self.last_trained: str = ""
        self.mean_predicted: float = 0.0
        self.mean_actual: float = 0.0

    def to_dict(self) -> Dict:
        return {
            "brier_score": round(self.brier_score, 4),
            "calibration_error": round(self.calibration_error, 4),
            "sample_count": self.sample_count,
            "last_trained": self.last_trained,
            "mean_predicted": round(self.mean_predicted, 4),
            "mean_actual": round(self.mean_actual, 4),
        }
class Calibrator:
    """
    Probability calibration using Isotonic Regression.

    Isotonic Regression is a non-parametric method that fits a piecewise
    constant function that is monotonically non-decreasing. It's ideal for
    calibrating probabilities because:
    1. It preserves ranking up to ties (if P(A) > P(B) before, P(A) >= P(B) after)
    2. It doesn't assume a specific distribution shape
    3. It can correct systematic over/under-confidence

    Example:
        # Before calibration: model predicts 70% but actual win rate is 60%
        # After calibration: model predicts 70% → calibrated to 60%
    """

    def __init__(self):
        self.calibrators: Dict[str, IsotonicRegression] = {}
        self.metrics: Dict[str, CalibrationMetrics] = {}
        # Less aggressive shrinkage: only meaningful overconfident bands are pulled.
        # Defaults raised from ~0.85-0.90 to 0.92-0.98 since the orchestrator and
        # config already apply market-level multipliers; double-shrinkage was the
        # root cause of 24-35pt avg calibrated-vs-raw drops in production traces.
        self.heuristic_fallback: Dict[str, float] = {
            "ms": 0.96,
            "ms_home": 0.96,
            "ms_home_heavy_fav": 0.98,
            "ms_home_fav": 0.96,
            "ms_home_balanced": 0.94,
            "ms_home_underdog": 0.92,
            "ms_draw": 0.94,
            "ms_away": 0.96,
            "ou15": 0.96,
            "ou25": 0.96,
            "ou35": 0.94,
            "btts": 0.96,
            "ht_ft": 0.92,
            "dc": 0.97,
            "ht": 0.92,
            "ht_home": 0.92,
            "ht_draw": 0.92,
            "ht_away": 0.92,
        }
        self._load_calibrators()
    def _load_calibrators(self):
        """Load trained calibrators for each market from disk."""
        for market in SUPPORTED_MARKETS:
            model_path = os.path.join(CALIBRATION_DIR, f"{market}_calibrator.pkl")
            metrics_path = os.path.join(CALIBRATION_DIR, f"{market}_metrics.json")
            if os.path.exists(model_path):
                try:
                    with open(model_path, "rb") as f:
                        self.calibrators[market] = pickle.load(f)
                    print(f"[Calibrator] Loaded calibration model for {market}")
                except Exception as e:
                    print(f"[Calibrator] Warning: Failed to load {market}: {e}")
            if os.path.exists(metrics_path):
                try:
                    with open(metrics_path, "r") as f:
                        data = json.load(f)
                    metrics = CalibrationMetrics()
                    metrics.brier_score = data.get("brier_score", 0.0)
                    metrics.calibration_error = data.get("calibration_error", 0.0)
                    metrics.sample_count = data.get("sample_count", 0)
                    metrics.last_trained = data.get("last_trained", "")
                    metrics.mean_predicted = data.get("mean_predicted", 0.0)
                    metrics.mean_actual = data.get("mean_actual", 0.0)
                    self.metrics[market] = metrics
                except Exception as e:
                    print(f"[Calibrator] Warning: Failed to load metrics for {market}: {e}")
    # Below HARD_MIN_SAMPLES the isotonic model is treated as untrained
    # (raw_prob is returned). Between HARD_MIN_SAMPLES and TRUSTED_SAMPLE_FLOOR
    # we ramp from 0 to 25% trust. Between FLOOR and CEILING we ramp from 25%
    # to full trust.
    # Rationale: 12-sample calibrators are statistical noise; even 30%
    # blending on them propagates the noise into the confidence value the
    # betting_brain reads downstream.
    HARD_MIN_SAMPLES = 50
    TRUSTED_SAMPLE_FLOOR = 100
    TRUSTED_SAMPLE_CEILING = 400
    # Hard cap on how far calibration can move probability in either direction.
    MAX_DELTA = 0.20
def calibrate(self, market_type: str, raw_prob: float, odds_val: Optional[float] = None) -> float:
"""
Calibrate a raw probability using Isotonic Regression with safeguards.
Args:
market_type (str): 'ms_home', 'ou25', 'btts', 'ht_ft', etc.
raw_prob (float): The raw probability from XGBoost (0.0 - 1.0)
odds_val (float, optional): The pre-match odds, used for context-aware bucket mapping
Returns:
float: Calibrated probability (0.0 - 1.0)
Safeguards:
* Low-sample trained models are blended with raw_prob to dampen overfit.
* MAX_DELTA caps the per-call adjustment (prevents 40pp swings).
"""
# Normalize market type
market_key = market_type.lower().replace("-", "_")
# Route to bucket if ms_home and odds provided
if market_key == "ms_home" and odds_val is not None and odds_val > 1.0:
if odds_val <= 1.40:
bucket_key = "ms_home_heavy_fav"
elif odds_val <= 1.80:
bucket_key = "ms_home_fav"
elif odds_val <= 2.50:
bucket_key = "ms_home_balanced"
else:
bucket_key = "ms_home_underdog"
if bucket_key in self.calibrators:
market_key = bucket_key
# If we have a trained Isotonic Regression model, use it (with safeguards)
if market_key in self.calibrators:
try:
iso_pred = float(self.calibrators[market_key].predict([raw_prob])[0])
# Sample-count weighted blend with raw probability.
# Sparse models barely move probability; mature models dominate.
metrics = self.metrics.get(market_key)
n_samples = metrics.sample_count if metrics else 0
if n_samples < self.HARD_MIN_SAMPLES:
# Below HARD_MIN_SAMPLES the isotonic fit is unreliable, so bypass it
# entirely and return the clipped raw_prob. This early return skips the
# heuristic shrinkage fallback below; any model-version multiplier is
# applied elsewhere.
return float(np.clip(raw_prob, 0.01, 0.99))
if n_samples >= self.TRUSTED_SAMPLE_CEILING:
iso_weight = 1.0
elif n_samples <= self.TRUSTED_SAMPLE_FLOOR:
# Linear ramp from 0% at HARD_MIN_SAMPLES to ~25% at FLOOR
span = self.TRUSTED_SAMPLE_FLOOR - self.HARD_MIN_SAMPLES
iso_weight = 0.25 * (n_samples - self.HARD_MIN_SAMPLES) / span
else:
# Linearly ramp 25% → 100% between floor and ceiling
span = self.TRUSTED_SAMPLE_CEILING - self.TRUSTED_SAMPLE_FLOOR
iso_weight = 0.25 + 0.75 * (n_samples - self.TRUSTED_SAMPLE_FLOOR) / span
blended = iso_weight * iso_pred + (1.0 - iso_weight) * raw_prob
# Cap delta to avoid huge swings on noisy calibrators
delta = blended - raw_prob
if delta > self.MAX_DELTA:
blended = raw_prob + self.MAX_DELTA
elif delta < -self.MAX_DELTA:
blended = raw_prob - self.MAX_DELTA
return float(np.clip(blended, 0.01, 0.99))
except Exception as e:
print(f"[Calibrator] Warning: Isotonic failed for {market_key}: {e}")
# Fall through to heuristic
# Fallback to heuristic calibration
return self._heuristic_calibrate(market_key, raw_prob)
def _heuristic_calibrate(self, market_type: str, raw_prob: float) -> float:
"""
Heuristic calibration fallback when no trained model exists.
This applies a conservative shrinkage towards the mean:
- Binary markets (OU, BTTS): shrink towards 0.5
- Multi-class (MS): shrink towards 0.33
- HT/FT: stronger shrinkage due to higher variance
"""
# Get shrinkage factor for this market
shrinkage = self.heuristic_fallback.get(market_type, 0.90)
if market_type in ["ms", "ms_home", "ms_home_heavy_fav", "ms_home_fav", "ms_home_balanced", "ms_home_underdog", "ms_draw", "ms_away"]:
# Pull towards 0.33 (uniform for 3-class)
return (raw_prob * shrinkage) + (0.33 * (1.0 - shrinkage))
elif market_type in ["ou15", "ou25", "ou35", "btts"]:
# Pull towards 0.5 (uniform for binary)
return (raw_prob * shrinkage) + (0.5 * (1.0 - shrinkage))
elif market_type in ["ht_ft", "ht"]:
# Stronger shrinkage for high-variance markets
return raw_prob * shrinkage
elif market_type == "dc":
# Double chance is more reliable
return (raw_prob * shrinkage) + (0.66 * (1.0 - shrinkage))
return raw_prob
def train_calibration(
self,
df: pd.DataFrame,
market: str,
prob_col: str,
actual_col: str,
min_samples: int = 100,
save: bool = True,
) -> CalibrationMetrics:
"""
Train an Isotonic Regression calibration model for a specific market.
Args:
df: DataFrame with predictions and actual outcomes
market: Market identifier (e.g., 'ms_home', 'ou25', 'btts')
prob_col: Column name for raw probabilities
actual_col: Column name for actual outcomes (0 or 1)
min_samples: Minimum samples required to train
save: Whether to save the model to disk
Returns:
CalibrationMetrics with quality metrics
"""
# Filter valid data
valid_df = df[[prob_col, actual_col]].dropna()
n_samples = len(valid_df)
if n_samples < min_samples:
print(f"[Calibrator] Warning: Only {n_samples} samples for {market}, "
f"need at least {min_samples}")
metrics = CalibrationMetrics()
metrics.sample_count = n_samples
return metrics
# Extract arrays
raw_probs = valid_df[prob_col].values
actuals = valid_df[actual_col].values
# Train Isotonic Regression
iso = IsotonicRegression(out_of_bounds="clip", increasing=True)
iso.fit(raw_probs, actuals)
# Calculate calibrated probabilities
calibrated_probs = iso.predict(raw_probs)
# Calculate metrics
metrics = CalibrationMetrics()
metrics.sample_count = n_samples
metrics.last_trained = datetime.utcnow().isoformat()
metrics.brier_score = brier_score_loss(actuals, calibrated_probs)
metrics.mean_predicted = np.mean(raw_probs)
metrics.mean_actual = np.mean(actuals)
# Calculate Expected Calibration Error (ECE)
metrics.calibration_error = self._calculate_ece(
calibrated_probs, actuals, n_bins=10
)
# Store in memory
self.calibrators[market] = iso
self.metrics[market] = metrics
# Save to disk
if save:
self._save_calibration(market, iso, metrics)
print(f"[Calibrator] Trained {market}: "
f"Brier={metrics.brier_score:.4f}, "
f"ECE={metrics.calibration_error:.4f}, "
f"n={n_samples}")
return metrics
def train_all_markets(
self,
df: pd.DataFrame,
market_config: Dict[str, Tuple[str, str]],
min_samples: int = 100,
) -> Dict[str, CalibrationMetrics]:
"""
Train calibration models for multiple markets at once.
Args:
df: DataFrame with all predictions and outcomes
market_config: Dict mapping market -> (prob_col, actual_col)
e.g., {'ou25': ('ou25_over_prob', 'ou25_over_actual')}
min_samples: Minimum samples per market
Returns:
Dict of market -> CalibrationMetrics
"""
results = {}
for market, (prob_col, actual_col) in market_config.items():
print(f"\n[Calibrator] Training {market}...")
try:
metrics = self.train_calibration(
df=df,
market=market,
prob_col=prob_col,
actual_col=actual_col,
min_samples=min_samples,
save=True,
)
results[market] = metrics
except Exception as e:
print(f"[Calibrator] Failed to train {market}: {e}")
return results
    def _calculate_ece(
        self,
        probs: np.ndarray,
        actuals: np.ndarray,
        n_bins: int = 10
    ) -> float:
        """
        Calculate Expected Calibration Error (ECE).

        ECE = sum(|bin_accuracy - bin_confidence| * bin_weight)
        Lower is better. Perfect calibration = 0.
        """
        bin_boundaries = np.linspace(0, 1, n_bins + 1)
        ece = 0.0
        for i in range(n_bins):
            lo, hi = bin_boundaries[i], bin_boundaries[i + 1]
            # Bins are half-open [lo, hi), except the last one, which is closed
            # on the right so that probabilities of exactly 1.0 are counted.
            if i == n_bins - 1:
                in_bin = (probs >= lo) & (probs <= hi)
            else:
                in_bin = (probs >= lo) & (probs < hi)
            prop_in_bin = np.mean(in_bin)
            if prop_in_bin > 0:
                accuracy_in_bin = np.mean(actuals[in_bin])
                avg_confidence_in_bin = np.mean(probs[in_bin])
                ece += np.abs(accuracy_in_bin - avg_confidence_in_bin) * prop_in_bin
        return float(ece)
def _save_calibration(
self,
market: str,
calibrator: IsotonicRegression,
metrics: CalibrationMetrics
):
"""Save calibration model and metrics to disk."""
# Save model
model_path = os.path.join(CALIBRATION_DIR, f"{market}_calibrator.pkl")
with open(model_path, "wb") as f:
pickle.dump(calibrator, f)
# Save metrics
metrics_path = os.path.join(CALIBRATION_DIR, f"{market}_metrics.json")
with open(metrics_path, "w") as f:
json.dump(metrics.to_dict(), f, indent=2)
print(f"[Calibrator] Saved {market} to {CALIBRATION_DIR}")
def get_calibration_report(self) -> Dict[str, Any]:
"""Generate a summary report of all calibration models."""
report = {
"trained_markets": list(self.calibrators.keys()),
"metrics": {},
"heuristic_only": [],
}
for market in SUPPORTED_MARKETS:
if market in self.metrics:
report["metrics"][market] = self.metrics[market].to_dict()
elif market not in self.calibrators:
report["heuristic_only"].append(market)
return report
def get_calibrated_probabilities(
self,
market: str,
raw_probs: np.ndarray
) -> np.ndarray:
"""
Batch calibration for array of probabilities.
Args:
market: Market type
raw_probs: Array of raw probabilities
Returns:
Array of calibrated probabilities
"""
return np.array([self.calibrate(market, p) for p in raw_probs])
# Singleton instance
_calibrator_instance: Optional[Calibrator] = None
def get_calibrator() -> Calibrator:
"""Get or create the global Calibrator instance."""
global _calibrator_instance
if _calibrator_instance is None:
_calibrator_instance = Calibrator()
return _calibrator_instance
# ── FINAL-OUTPUT RECALIBRATION LAYER (V31e) ─────────────────────────────────
# A thin, LAST-STEP per-market map: production calibrated_confidence -> reality.
# Built from a 60-day backtest (scripts/fit_recalibrators.py); inference is a
# pure np.interp over a 99-point monotone grid — NO sklearn needed at runtime.
#
# WHY THIS EXISTS:
# The upstream chain (temperature scaling T=1.5 -> per-outcome isotonic ->
# POST_CAL_TRUST blend) crushes high-base-rate binary markets toward 0.5,
# so "system says 51%" can really hit 70%. MS survives (near-uniform picks),
# which is why MS is already well-calibrated and OU/HT-OU markets are not.
#
# SAFETY / "DO NO HARM":
# * Only markets whose fit-time ECE >= 5.0 carry a map (currently OU15, OU35,
# HT_OU05, HT_OU15). MS and every already-good market have NO map ->
# recalibrate_conf() returns the input UNCHANGED -> guaranteed no regression.
# * Out-of-sample validated (fit=older 65%, test=unseen 35%):
# MS ECE 1.1 -> 1.3 (flat, safe)
# HT_OU15 29.2 -> 0.8
# OU15 19.0 -> 3.3
# OU35 13.9 -> 4.3
# HT_OU05 11.5 -> 2.4
# * Adjusts ONLY the displayed confidence number. All rich analysis payload
# (probabilities, edges, vetoes, tiers, bands) is preserved untouched, and
# the pre-recalibration value is kept for audit by the caller.
FINAL_RECALIBRATOR_PATH = os.path.join(CALIBRATION_DIR, "final_recalibrators.json")
class FinalRecalibrator:
"""Per-market final-output recalibration via piecewise-linear interpolation.
Loads a compact JSON of 99-point lookup grids (x=calibrated_confidence/100,
y=reality). Markets absent from the file pass through as identity.
"""
def __init__(self, path: str = FINAL_RECALIBRATOR_PATH):
self.grid: Optional[np.ndarray] = None
self.maps: Dict[str, np.ndarray] = {}
self.source_path = path
self._load(path)
def _load(self, path: str) -> None:
if not os.path.exists(path):
print(f"[FinalRecalibrator] No map file at {path} — pass-through mode (all markets unchanged)")
return
try:
with open(path, "r") as f:
data = json.load(f)
meta = data.get("_meta", {})
grid = meta.get("grid")
if not grid:
print("[FinalRecalibrator] Map file missing _meta.grid — pass-through mode")
return
self.grid = np.asarray(grid, dtype=float)
for market, m in data.items():
if market == "_meta" or not isinstance(m, dict):
continue
y = m.get("y")
if y and len(y) == len(self.grid):
self.maps[str(market).upper()] = np.asarray(y, dtype=float)
else:
print(f"[FinalRecalibrator] Skipped {market}: grid/y length mismatch")
print(f"[FinalRecalibrator] Loaded reality maps for {sorted(self.maps.keys())} "
f"(everything else, incl. MS, passes through unchanged)")
except Exception as e:
print(f"[FinalRecalibrator] Warning: failed to load {path}: {e} — pass-through mode")
self.grid = None
self.maps = {}
def has_map(self, market: str) -> bool:
return bool(self.maps) and (market or "").upper() in self.maps
    def recalibrate_conf(self, market: str, calibrated_conf: float) -> float:
        """Map a 0-100 confidence to its reality-aligned value.

        Markets without a trained map (including MS and all already-good
        markets) return the input UNCHANGED. Any failure also returns the
        input unchanged so this layer can never regress production.
        """
        try:
            key = (market or "").upper()
            if self.grid is None or key not in self.maps:
                return calibrated_conf
            x = float(calibrated_conf) / 100.0
            x = min(max(x, 0.0), 1.0)
            y = float(np.interp(x, self.grid, self.maps[key]))
            return max(1.0, min(99.0, y * 100.0))
        except Exception:
            return calibrated_conf
# Singleton instance
_final_recalibrator_instance: Optional[FinalRecalibrator] = None
def get_final_recalibrator() -> FinalRecalibrator:
"""Get or create the global FinalRecalibrator instance."""
global _final_recalibrator_instance
if _final_recalibrator_instance is None:
_final_recalibrator_instance = FinalRecalibrator()
return _final_recalibrator_instance
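The sample-count blend inside `Calibrator.calibrate` can be checked in isolation. The following is a minimal sketch, not the production code: the helper names `iso_weight_for` and `blend` are hypothetical, and the constants are copied from the class attributes above. It shows how the isotonic weight ramps with sample count and how `MAX_DELTA` caps the adjustment.

```python
# Constants mirrored from the Calibrator class attributes.
HARD_MIN_SAMPLES = 50
TRUSTED_SAMPLE_FLOOR = 100
TRUSTED_SAMPLE_CEILING = 400
MAX_DELTA = 0.20


def iso_weight_for(n_samples: int) -> float:
    """Weight given to the isotonic prediction (hypothetical helper)."""
    if n_samples < HARD_MIN_SAMPLES:
        return 0.0  # calibrator is bypassed entirely
    if n_samples >= TRUSTED_SAMPLE_CEILING:
        return 1.0  # mature calibrator, full trust
    if n_samples <= TRUSTED_SAMPLE_FLOOR:
        # Linear ramp 0% -> 25% between HARD_MIN_SAMPLES and FLOOR.
        span = TRUSTED_SAMPLE_FLOOR - HARD_MIN_SAMPLES
        return 0.25 * (n_samples - HARD_MIN_SAMPLES) / span
    # Linear ramp 25% -> 100% between FLOOR and CEILING.
    span = TRUSTED_SAMPLE_CEILING - TRUSTED_SAMPLE_FLOOR
    return 0.25 + 0.75 * (n_samples - TRUSTED_SAMPLE_FLOOR) / span


def blend(raw_prob: float, iso_pred: float, n_samples: int) -> float:
    """Blend isotonic and raw probability, then cap the move at MAX_DELTA."""
    w = iso_weight_for(n_samples)
    blended = w * iso_pred + (1.0 - w) * raw_prob
    delta = max(-MAX_DELTA, min(MAX_DELTA, blended - raw_prob))
    return max(0.01, min(0.99, raw_prob + delta))
```

With these constants, a calibrator trained on 26 samples leaves the raw probability untouched, while one trained on 400 or more samples can move it by at most 20 percentage points.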
Binary file not shown.
@@ -0,0 +1,8 @@
{
"brier_score": 0.2444,
"calibration_error": 0.0,
"sample_count": 5000,
"last_trained": "2026-05-24T23:10:58.117788",
"mean_predicted": 0.4015,
"mean_actual": 0.5454
}
Binary file not shown.
@@ -0,0 +1,8 @@
{
"brier_score": 0.1264,
"calibration_error": 0.0,
"sample_count": 26,
"last_trained": "2026-05-11T23:38:11.473645",
"mean_predicted": 0.757,
"mean_actual": 0.6923
}
@@ -0,0 +1,532 @@
{
"_meta": {
"grid": [
0.01,
0.02,
0.03,
0.04,
0.05,
0.06,
0.07,
0.08,
0.09,
0.1,
0.11,
0.12,
0.13,
0.14,
0.15,
0.16,
0.17,
0.18,
0.19,
0.2,
0.21,
0.22,
0.23,
0.24,
0.25,
0.26,
0.27,
0.28,
0.29,
0.3,
0.31,
0.32,
0.33,
0.34,
0.35,
0.36,
0.37,
0.38,
0.39,
0.4,
0.41,
0.42,
0.43,
0.44,
0.45,
0.46,
0.47,
0.48,
0.49,
0.5,
0.51,
0.52,
0.53,
0.54,
0.55,
0.56,
0.57,
0.58,
0.59,
0.6,
0.61,
0.62,
0.63,
0.64,
0.65,
0.66,
0.67,
0.68,
0.69,
0.7,
0.71,
0.72,
0.73,
0.74,
0.75,
0.76,
0.77,
0.78,
0.79,
0.8,
0.81,
0.82,
0.83,
0.84,
0.85,
0.86,
0.87,
0.88,
0.89,
0.9,
0.91,
0.92,
0.93,
0.94,
0.95,
0.96,
0.97,
0.98,
0.99
],
"threshold_ece": 5.0,
"source": "/tmp/multi_60d.csv",
"note": "x=calibrated_confidence/100; new=interp(grid,y)"
},
"HT_OU05": {
"grid_min": 0.01,
"grid_max": 0.99,
"n": 3683,
"y": [
0.0833,
0.3333,
0.3333,
0.3333,
0.3394,
0.3636,
0.3636,
0.3636,
0.3636,
0.3636,
0.3636,
0.3636,
0.3636,
0.3727,
0.3955,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4,
0.4583,
0.6286,
0.6286,
0.6286,
0.6286,
0.6286,
0.6286,
0.6286,
0.6286,
0.6286,
0.6531,
0.672,
0.7143,
0.7262,
0.7262,
0.7312,
0.7406,
0.7655,
0.7655,
0.8495,
0.8495,
0.8495,
0.8495,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0
]
},
"HT_OU15": {
"grid_min": 0.01,
"grid_max": 0.99,
"n": 5200,
"y": [
0.4118,
0.4118,
0.4118,
0.4118,
0.4118,
0.4118,
0.4118,
0.4118,
0.4118,
0.4118,
0.4118,
0.4118,
0.4118,
0.4118,
0.4118,
0.4118,
0.4521,
0.5385,
0.5385,
0.5385,
0.5848,
0.6142,
0.6142,
0.6142,
0.6245,
0.6245,
0.6245,
0.6262,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6275,
0.6452,
0.6842,
0.6842,
0.6842,
0.6842,
0.6842,
0.6842,
0.8077,
0.8077,
0.8077,
0.8077,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0
]
},
"OU15": {
"grid_min": 0.01,
"grid_max": 0.99,
"n": 2724,
"y": [
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.2797,
0.4352,
0.6295,
0.7165,
0.7174,
0.7987,
0.8197,
0.8197,
0.8197,
0.8197,
0.8197,
0.8197,
0.9118,
0.9276,
0.9502,
0.9729,
0.9955,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0
]
},
"OU35": {
"grid_min": 0.01,
"grid_max": 0.99,
"n": 4277,
"y": [
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.474,
0.474,
0.474,
0.474,
0.474,
0.474,
0.474,
0.474,
0.474,
0.571,
0.571,
0.571,
0.571,
0.571,
0.571,
0.571,
0.571,
0.571,
0.571,
0.571,
0.571,
0.571,
0.6222,
0.6222,
0.6222,
0.6222,
0.6222,
0.7747,
0.7747,
0.7747,
0.7747,
0.7747,
0.7788,
0.8195,
0.8333,
0.8333,
0.8333,
0.8333,
0.8333,
0.8333,
0.8333,
0.8333,
0.8333,
0.8333,
0.8333,
0.8333,
0.8333,
0.8333,
0.8333,
0.836,
0.8624,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889,
0.8889
]
}
}
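A grid file of this shape is consumed by piecewise-linear interpolation. The sketch below illustrates the lookup on a toy 5-point grid; it is an illustration under stated assumptions, not the production path. The `interp` helper is a stdlib stand-in for `np.interp` (same clamping at both ends), and `GRID`, `MAPS` and their values are made up for the example rather than taken from the file above.

```python
from bisect import bisect_right
from typing import Dict, Sequence


def interp(x: float, grid: Sequence[float], y: Sequence[float]) -> float:
    """Piecewise-linear interpolation, clamped to the end values."""
    if x <= grid[0]:
        return y[0]
    if x >= grid[-1]:
        return y[-1]
    i = bisect_right(grid, x)  # grid[i - 1] <= x < grid[i]
    x0, x1 = grid[i - 1], grid[i]
    t = (x - x0) / (x1 - x0)
    return y[i - 1] + t * (y[i] - y[i - 1])


def recalibrate_conf(
    market: str,
    conf: float,
    grid: Sequence[float],
    maps: Dict[str, Sequence[float]],
) -> float:
    """Map a 0-100 confidence through the market's grid, if it has one."""
    key = (market or "").upper()
    if key not in maps:
        return conf  # no map: identity pass-through
    x = min(max(conf / 100.0, 0.0), 1.0)
    return max(1.0, min(99.0, interp(x, grid, maps[key]) * 100.0))


# Toy grid and map: illustrative values only.
GRID = [0.1, 0.3, 0.5, 0.7, 0.9]
MAPS = {"OU15": [0.2, 0.3, 0.6, 0.9, 1.0]}
```

A market without an entry, such as MS, comes back unchanged, and any mapped value is clamped to the 1-99 range, which is why a grid value of 1.0 is displayed as 99.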
Binary file not shown.
@@ -0,0 +1,8 @@
{
"brier_score": 0.1852,
"calibration_error": 0.0,
"sample_count": 4989,
"last_trained": "2026-05-24T23:10:58.129841",
"mean_predicted": 0.2305,
"mean_actual": 0.263
}
Binary file not shown.
@@ -0,0 +1,8 @@
{
"brier_score": 0.2383,
"calibration_error": 0.0,
"sample_count": 4989,
"last_trained": "2026-05-24T23:10:58.142015",
"mean_predicted": 0.4549,
"mean_actual": 0.4001
}
Binary file not shown.
@@ -0,0 +1,8 @@
{
"brier_score": 0.1667,
"calibration_error": 0.0,
"sample_count": 12,
"last_trained": "2026-05-11T23:38:11.484990",
"mean_predicted": 0.3579,
"mean_actual": 0.25
}
Binary file not shown.
@@ -0,0 +1,8 @@
{
"brier_score": 0.2137,
"calibration_error": 0.0,
"sample_count": 4989,
"last_trained": "2026-05-24T23:10:58.153390",
"mean_predicted": 0.3146,
"mean_actual": 0.3369
}
Binary file not shown.
@@ -0,0 +1,8 @@
{
"brier_score": 0.1956,
"calibration_error": 0.0,
"sample_count": 5000,
"last_trained": "2026-05-24T23:10:58.164629",
"mean_predicted": 0.3215,
"mean_actual": 0.3196
}
Binary file not shown.
@@ -0,0 +1,8 @@
{
"brier_score": 0.1816,
"calibration_error": 0.0,
"sample_count": 5000,
"last_trained": "2026-05-24T23:10:58.175615",
"mean_predicted": 0.2453,
"mean_actual": 0.2426
}
Binary file not shown.
@@ -0,0 +1,8 @@
{
"brier_score": 0.2241,
"calibration_error": 0.0,
"sample_count": 5000,
"last_trained": "2026-05-24T23:10:58.186473",
"mean_predicted": 0.4332,
"mean_actual": 0.4378
}
Binary file not shown.
@@ -0,0 +1,8 @@
{
"brier_score": 0.1708,
"calibration_error": 0.0,
"sample_count": 5000,
"last_trained": "2026-05-24T23:10:58.196981",
"mean_predicted": 0.7586,
"mean_actual": 0.7732
}
Binary file not shown.
@@ -0,0 +1,8 @@
{
"brier_score": 0.2386,
"calibration_error": 0.0,
"sample_count": 5000,
"last_trained": "2026-05-24T23:10:58.207151",
"mean_predicted": 0.5013,
"mean_actual": 0.5538
}
Binary file not shown.
@@ -0,0 +1,8 @@
{
"brier_score": 0.215,
"calibration_error": 0.0,
"sample_count": 5000,
"last_trained": "2026-05-24T23:10:58.217663",
"mean_predicted": 0.305,
"mean_actual": 0.3408
}
@@ -0,0 +1,191 @@
"""
League-Specific Model Loader
=============================
Loads per-league XGBoost models + isotonic calibrators trained by
scripts/train_league_models.py and provides a unified prediction interface.
Falls back to general V25 for any market/league without a dedicated model.
"""
import os
import json
import pickle
from functools import lru_cache
from typing import Dict, Optional, Tuple
import numpy as np
import pandas as pd
import xgboost as xgb
AI_ENGINE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
LEAGUE_MODEL_DIR = os.path.join(AI_ENGINE_DIR, "models", "league_specific")
# Market file name → (num_class, label_list)
MARKET_META: Dict[str, Tuple[int, list]] = {
"ms": (3, ["1", "X", "2"]),
"ou15": (2, ["Over", "Under"]),
"ou25": (2, ["Over", "Under"]),
"ou35": (2, ["Over", "Under"]),
"btts": (2, ["Yes", "No"]),
"ht": (3, ["1", "X", "2"]),
"ht_ou05": (2, ["Over", "Under"]),
"ht_ou15": (2, ["Over", "Under"]),
"htft": (9, ["1/1","1/X","1/2","X/1","X/X","X/2","2/1","2/X","2/2"]),
"oe": (2, ["Odd", "Even"]),
"cards": (2, ["Over", "Under"]),
"handicap": (3, ["1", "X", "2"]),
}
# Signal key map (file key → uppercase signal key used in _get_v25_signal)
FILE_TO_SIGNAL = {
"ms": "MS", "ou15": "OU15", "ou25": "OU25", "ou35": "OU35",
"btts": "BTTS", "ht": "HT", "ht_ou05": "HT_OU05", "ht_ou15": "HT_OU15",
"htft": "HTFT", "oe": "OE", "cards": "CARDS", "handicap": "HCAP",
}
class LeagueModel:
"""Holds XGBoost models + isotonic calibrators for one league."""
def __init__(self, league_id: str):
self.league_id = league_id
self.league_dir = os.path.join(LEAGUE_MODEL_DIR, league_id)
self.models: Dict[str, xgb.Booster] = {} # market_key → booster
self.calibrators: Dict[str, object] = {} # cal_key → isotonic
self.feature_cols: Optional[list] = None
self._loaded = False
def load(self) -> bool:
if not os.path.isdir(self.league_dir):
return False
try:
fc_path = os.path.join(self.league_dir, "feature_cols.json")
if os.path.exists(fc_path):
with open(fc_path) as f:
self.feature_cols = json.load(f)
for mkey in MARKET_META:
xgb_path = os.path.join(self.league_dir, f"xgb_{mkey}.json")
if os.path.exists(xgb_path) and os.path.getsize(xgb_path) > 100:
b = xgb.Booster()
b.load_model(xgb_path)
self.models[mkey] = b
for fname in os.listdir(self.league_dir):
if fname.startswith("cal_") and fname.endswith(".pkl"):
cal_key = fname[4:-4] # strip cal_ and .pkl
with open(os.path.join(self.league_dir, fname), "rb") as f:
self.calibrators[cal_key] = pickle.load(f)
self._loaded = bool(self.models or self.calibrators)
return self._loaded
except Exception as e:
print(f"[LeagueModel] Load failed for {self.league_id}: {e}")
return False
def has_market(self, mkey: str) -> bool:
return mkey in self.models
def predict_market(
self,
mkey: str,
feature_row: Dict[str, float],
) -> Optional[Dict[str, float]]:
"""
Predict one market using league-specific XGBoost + isotonic calibration.
Returns {label: prob} dict or None if no model available.
"""
if mkey not in self.models:
return None
num_class, labels = MARKET_META[mkey]
fc = self.feature_cols
if fc is None:
# Fallback to whatever the booster expects (it knows its feature names)
fc = list(self.models[mkey].feature_names or [])
try:
X = pd.DataFrame([{col: feature_row.get(col, 0.0) for col in fc}])
dmat = xgb.DMatrix(X)
raw = self.models[mkey].predict(dmat)
if num_class > 2:
probs_arr = raw.reshape(-1, num_class)[0]
probs = {labels[i]: float(probs_arr[i]) for i in range(num_class)}
# Apply isotonic calibration per class
cal_total = 0.0
for i, label in enumerate(labels):
cal_key = f"{mkey}_{i}"
if cal_key in self.calibrators:
p_cal = float(self.calibrators[cal_key].predict([probs_arr[i]])[0])
probs[label] = max(0.01, min(0.99, p_cal))
cal_total += probs[label]
if cal_total > 0:
probs = {k: v / cal_total for k, v in probs.items()}
else:
p = float(raw[0])
cal_key = mkey
if cal_key in self.calibrators:
p = float(self.calibrators[cal_key].predict([p])[0])
p = max(0.01, min(0.99, p))
probs = {labels[0]: p, labels[1]: 1.0 - p}
return probs
except Exception as e:
print(f"[LeagueModel] predict_market({mkey}) failed for {self.league_id}: {e}")
return None
class LeagueModelLoader:
"""
In-memory cache for league-specific models.
Thread-safe for single-process async servers (FastAPI/uvicorn).
"""
def __init__(self, max_cached: int = 80):
self._cache: Dict[str, Optional[LeagueModel]] = {}
self._max_cached = max_cached
def get(self, league_id: str) -> Optional[LeagueModel]:
"""Return loaded LeagueModel for this league, or None if unavailable."""
if league_id in self._cache:
return self._cache[league_id]
# Evict oldest entry if cache is full
if len(self._cache) >= self._max_cached:
oldest = next(iter(self._cache))
del self._cache[oldest]
model = LeagueModel(league_id)
loaded = model.load()
self._cache[league_id] = model if loaded else None
if loaded:
n_models = len(model.models)
n_cals = len(model.calibrators)
print(f"[LeagueModel] Loaded {league_id}: {n_models} XGB models, {n_cals} calibrators")
return self._cache[league_id]
def available_leagues(self) -> list:
if not os.path.isdir(LEAGUE_MODEL_DIR):
return []
return [d for d in os.listdir(LEAGUE_MODEL_DIR)
if os.path.isdir(os.path.join(LEAGUE_MODEL_DIR, d))]
def readiness_summary(self) -> dict:
leagues = self.available_leagues()
return {
"league_specific_dir": LEAGUE_MODEL_DIR,
"available_leagues": len(leagues),
"cached": len([v for v in self._cache.values() if v is not None]),
}
# ── Singleton ──────────────────────────────────────────────────────
_loader: Optional[LeagueModelLoader] = None
def get_league_model_loader() -> LeagueModelLoader:
global _loader
if _loader is None:
_loader = LeagueModelLoader()
return _loader
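The multi-class branch of `predict_market` calibrates each class separately and then renormalises, because independently calibrated class probabilities no longer sum to 1. The sketch below shows that step on its own. It is a simplified illustration: `calibrate_and_renormalise` is a hypothetical name, plain callables stand in for the pickled isotonic models, and it assumes the total is accumulated over every class whether or not that class has a calibrator.

```python
from typing import Callable, Dict, Sequence


def calibrate_and_renormalise(
    labels: Sequence[str],
    raw_probs: Sequence[float],
    calibrators: Dict[int, Callable[[float], float]],
) -> Dict[str, float]:
    """Apply per-class calibrators, clip, then rescale so the result sums to 1."""
    probs: Dict[str, float] = {}
    for i, label in enumerate(labels):
        p = float(raw_probs[i])
        cal = calibrators.get(i)
        if cal is not None:
            # Clip so a single class can never be pinned to exactly 0 or 1.
            p = max(0.01, min(0.99, cal(p)))
        probs[label] = p
    total = sum(probs.values())
    if total > 0:
        probs = {k: v / total for k, v in probs.items()}
    return probs


# Stand-in calibrator: shrinks the home-win probability by 20%.
example = calibrate_and_renormalise(
    ["1", "X", "2"],
    [0.5, 0.3, 0.2],
    {0: lambda p: p * 0.8},
)
```

In this example the home class drops from 0.5 to 0.4 before renormalisation, so the three values sum to 0.9 and are divided by that total.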
@@ -0,0 +1,87 @@
"""Market-anchored calibration (V35) — pure functions, no I/O.
WHY THIS EXISTS
---------------
The model's invented per-market probabilities were *measured* to be badly
overconfident. Grading the engine's own stored predictions against actual
results: it says ~50% where reality is ~25%, ~67% where reality is ~37%
(calibration error / ECE on the order of 25-30%). That mis-calibration is the
direct cause of the false "value" picks and the negative realised ROI.
The de-vigged market price, by contrast, is empirically near-perfectly
calibrated. Out-of-sample (correction fit on 2023-24, tested on 2025-26;
78k real-odds football matches) the de-vigged market's ECE was:
home 1.56% | draw 1.85% | away 1.49% | over2.5 1.79% | btts 1.38%
Adding one small, large-sample home-favourite correction cut MS-home ECE
from 1.56% -> 0.64%.
So for the DISPLAYED probabilities we anchor to the de-vigged market and apply
only that one proven correction. ~20-40x more calibrated than the model's
numbers, and fully transparent.
These functions are pure (stdlib only) so they can be unit-tested in isolation
without the DB or the heavy model stack.
"""
from __future__ import annotations
from typing import List, Optional, Tuple
def devig(odds: List[Optional[float]]) -> Optional[List[float]]:
    """Vig-removed (fair) probabilities from a group of decimal odds.

    ``p_i = (1/odds_i) / Σ(1/odds_j)``: normalising the raw implied
    probabilities to sum to 1 removes the bookmaker margin.

    Returns ``None`` when ANY leg is missing or non-real (``<= 1.01``). That is
    deliberate: a market with a missing/placeholder leg has no real price, and
    the product rule is to never fabricate numbers for a match without odds.
    """
    if not odds or any(o is None or float(o) <= 1.01 for o in odds):
        return None
    inv = [1.0 / float(o) for o in odds]
    total = sum(inv)
    if total <= 0.0:
        return None
    return [x / total for x in inv]
# Home-favourite correction: measured (actual home-win rate minus de-vigged
# implied probability) by implied-home band, out-of-sample on real-odds matches.
# Big home favourites win a few points MORE than the de-vigged price implies;
# underdogs are roughly unbiased. Values are deliberately conservative:
# universal and shrunk toward 0 vs the raw tier-0 (soft-league) edge, because
# the bias is weaker in efficient top leagues. Applying these took MS-home OOS
# ECE 1.56% -> 0.64%.
# Each entry is (band lower bound, band upper bound, additive delta).
_HOME_FAV_BANDS: Tuple[Tuple[float, float, float], ...] = (
    (0.45, 0.55, 0.010),
    (0.55, 0.65, 0.018),
    (0.65, 0.75, 0.028),
    (0.75, 1.01, 0.034),
)
def home_favorite_delta(p_home: float) -> float:
    """Additive correction to the de-vigged home-win probability.

    Zero below 0.45 (no measured bias for non-favourites)."""
    for lo, hi, delta in _HOME_FAV_BANDS:
        if lo <= p_home < hi:
            return delta
    return 0.0
def apply_home_correction(
    p1: float, px: float, p2: float
) -> Tuple[float, float, float]:
    """Apply the home-favourite delta to a 3-way (1, X, 2) probability vector,
    renormalising draw/away so the three still sum to 1.0."""
    delta = home_favorite_delta(p1)
    if delta <= 0.0:
        return p1, px, p2
    p1n = min(0.98, p1 + delta)
    remaining = 1.0 - p1n
    rest = px + p2
    if rest <= 0.0:
        return p1n, px, p2
    return p1n, px / rest * remaining, p2 / rest * remaining
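As a worked example of the de-vig and home-favourite steps, the sketch below runs them on one set of 1X2 prices. The functions are condensed copies of the ones above so the block runs on its own; the odds `1.50 / 4.20 / 6.50` are hypothetical and chosen only so the home side lands in the 0.55-0.65 band.

```python
from typing import List, Optional, Tuple

# (band lower bound, band upper bound, additive delta), copied from above.
HOME_FAV_BANDS = (
    (0.45, 0.55, 0.010),
    (0.55, 0.65, 0.018),
    (0.65, 0.75, 0.028),
    (0.75, 1.01, 0.034),
)


def devig(odds: List[Optional[float]]) -> Optional[List[float]]:
    """Normalise inverse odds so they sum to 1; None if any leg is unusable."""
    if not odds or any(o is None or float(o) <= 1.01 for o in odds):
        return None
    inv = [1.0 / float(o) for o in odds]
    total = sum(inv)
    return [x / total for x in inv]


def apply_home_correction(p1: float, px: float, p2: float) -> Tuple[float, float, float]:
    """Add the band delta to the home leg and rescale draw/away to compensate."""
    delta = next((d for lo, hi, d in HOME_FAV_BANDS if lo <= p1 < hi), 0.0)
    if delta <= 0.0:
        return p1, px, p2
    p1n = min(0.98, p1 + delta)
    remaining = 1.0 - p1n
    rest = px + p2
    if rest <= 0.0:
        return p1n, px, p2
    return p1n, px / rest * remaining, p2 / rest * remaining


prices = [1.50, 4.20, 6.50]  # hypothetical home / draw / away odds
overround = sum(1.0 / o for o in prices) - 1.0  # bookmaker margin, about 5.9%
fair = devig(prices)                             # roughly [0.630, 0.225, 0.145]
corrected = apply_home_correction(*fair)         # home leg raised by 0.018
```

The raw implied probabilities sum to about 1.059, so dividing by that total strips the margin. The home leg then gains the 0.018 band delta, and draw and away are scaled down proportionally so the vector still sums to 1.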
Binary file not shown.
@@ -0,0 +1,154 @@
[
"home_overall_elo",
"away_overall_elo",
"elo_diff",
"home_home_elo",
"away_away_elo",
"home_form_elo",
"away_form_elo",
"form_elo_diff",
"home_goals_avg",
"home_conceded_avg",
"away_goals_avg",
"away_conceded_avg",
"home_clean_sheet_rate",
"away_clean_sheet_rate",
"home_scoring_rate",
"away_scoring_rate",
"home_winning_streak",
"away_winning_streak",
"home_unbeaten_streak",
"away_unbeaten_streak",
"h2h_total_matches",
"h2h_home_win_rate",
"h2h_draw_rate",
"h2h_avg_goals",
"h2h_btts_rate",
"h2h_over25_rate",
"home_avg_possession",
"away_avg_possession",
"home_avg_shots_on_target",
"away_avg_shots_on_target",
"home_shot_conversion",
"away_shot_conversion",
"home_avg_corners",
"away_avg_corners",
"odds_ms_h",
"odds_ms_d",
"odds_ms_a",
"implied_home",
"implied_draw",
"implied_away",
"odds_ht_ms_h",
"odds_ht_ms_d",
"odds_ht_ms_a",
"odds_ou05_o",
"odds_ou05_u",
"odds_ou15_o",
"odds_ou15_u",
"odds_ou25_o",
"odds_ou25_u",
"odds_ou35_o",
"odds_ou35_u",
"odds_ht_ou05_o",
"odds_ht_ou05_u",
"odds_ht_ou15_o",
"odds_ht_ou15_u",
"odds_btts_y",
"odds_btts_n",
"odds_ms_h_present",
"odds_ms_d_present",
"odds_ms_a_present",
"odds_ht_ms_h_present",
"odds_ht_ms_d_present",
"odds_ht_ms_a_present",
"odds_ou05_o_present",
"odds_ou05_u_present",
"odds_ou15_o_present",
"odds_ou15_u_present",
"odds_ou25_o_present",
"odds_ou25_u_present",
"odds_ou35_o_present",
"odds_ou35_u_present",
"odds_ht_ou05_o_present",
"odds_ht_ou05_u_present",
"odds_ht_ou15_o_present",
"odds_ht_ou15_u_present",
"odds_btts_y_present",
"odds_btts_n_present",
"home_xga",
"away_xga",
"league_avg_goals",
"league_zero_goal_rate",
"upset_atmosphere",
"upset_motivation",
"upset_fatigue",
"upset_potential",
"referee_home_bias",
"referee_avg_goals",
"referee_cards_total",
"referee_avg_yellow",
"referee_experience",
"home_momentum_score",
"away_momentum_score",
"momentum_diff",
"home_squad_quality",
"away_squad_quality",
"squad_diff",
"home_key_players",
"away_key_players",
"home_missing_impact",
"away_missing_impact",
"home_goals_form",
"away_goals_form",
"home_lineup_goals_per90",
"away_lineup_goals_per90",
"home_lineup_assists_per90",
"away_lineup_assists_per90",
"home_squad_continuity",
"away_squad_continuity",
"home_top_scorer_form",
"away_top_scorer_form",
"home_avg_player_exp",
"away_avg_player_exp",
"home_goals_diversity",
"away_goals_diversity",
"h2h_home_goals_avg",
"h2h_away_goals_avg",
"h2h_recent_trend",
"h2h_venue_advantage",
"home_rolling5_goals",
"home_rolling5_conceded",
"home_rolling10_goals",
"home_rolling10_conceded",
"home_rolling20_goals",
"home_rolling20_conceded",
"away_rolling5_goals",
"away_rolling5_conceded",
"away_rolling10_goals",
"away_rolling10_conceded",
"home_rolling5_cs",
"away_rolling5_cs",
"home_venue_goals",
"home_venue_conceded",
"away_venue_goals",
"away_venue_conceded",
"home_goal_trend",
"away_goal_trend",
"home_days_rest",
"away_days_rest",
"match_month",
"is_season_start",
"is_season_end",
"attack_vs_defense_home",
"attack_vs_defense_away",
"xg_diff",
"form_momentum_interaction",
"elo_form_consistency",
"upset_x_elo_gap",
"league_home_win_rate",
"league_draw_rate",
"league_btts_rate",
"league_ou25_rate",
"league_reliability_score"
]
Binary file not shown.


Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.

Some files were not shown because too many files have changed in this diff.