Add backtest pipeline, betting_brain filters, score coherence + social v3

betting_brain.py:
- HARD_MIN_SAMPLES=50 floor for calibrator bypass
- Hard vetoes for ev_edge < 0 and ev_edge >= 0.20
- BTTS muted (grid search found no profitable config)
- Per-market optimal envelopes (MS, OU25)
- Score coherence filter: main_pick must agree with score prediction
- HTFT reversal cross-check for MS picks
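
A minimal sketch of how the ev_edge vetoes and the score coherence check listed above might look. The function names, the pick labels ("1", "X", "2", "OVER", "UNDER"), and the reading of the ev_edge rule as "reject below 0 or at/above 0.20" are assumptions for illustration, not the actual betting_brain.py code.

```python
from typing import Tuple

# Assumed bounds, taken from the "ev_edge < 0" and ">= 0.20" vetoes above.
EV_EDGE_MIN = 0.0
EV_EDGE_MAX = 0.20


def passes_ev_veto(ev_edge: float) -> bool:
    """Return True when ev_edge lies inside the assumed [0, 0.20) band."""
    return EV_EDGE_MIN <= ev_edge < EV_EDGE_MAX


def is_score_coherent(market: str, pick: str, score: Tuple[int, int]) -> bool:
    """Check that a pick agrees with the predicted (home, away) score.

    Only the MS (match result) and OU25 (over/under 2.5 goals) markets
    are checked here; the pick labels are hypothetical.
    """
    home, away = score
    if market == "MS":
        implied = "1" if home > away else "2" if away > home else "X"
        return pick == implied
    if market == "OU25":
        implied = "OVER" if home + away > 2.5 else "UNDER"
        return pick == implied
    # Other markets are passed through unchecked in this sketch.
    return True
```

For example, under these assumptions an MS pick of "1" with a predicted score of 1-1 would be rejected as incoherent.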

feature_builder.py / data_loader.py:
- Real home/away_position from data (was hardcoded 10)
- Cup detection wired into UpsetEngine
- _estimate_league_position with 300-day season filter
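
The 300-day season filter can be sketched in isolation as below. The helper names and the row shape are hypothetical, and the fallback to position 10 for a team missing from the table is an assumption modelled on the existing default for a missing team or league id.

```python
from typing import Dict, List

# 300 days expressed in milliseconds, matching _SEASON_LOOKBACK_MS in the diff.
SEASON_LOOKBACK_MS = 300 * 24 * 60 * 60 * 1000
DEFAULT_POSITION = 10  # assumed fallback when the team is not in the table


def in_season_window(mst_utc: int, before_date_ms: int) -> bool:
    """Mirror the SQL bounds: season_start_ms <= mst_utc < before_date_ms."""
    season_start_ms = before_date_ms - SEASON_LOOKBACK_MS
    return season_start_ms <= mst_utc < before_date_ms


def position_from_rows(rows: List[Dict[str, int]], team_id: int) -> int:
    """Return the 1-based rank of team_id in rows sorted by points descending."""
    for rank, row in enumerate(rows, start=1):
        if row["team_id"] == team_id:
            return rank
    return DEFAULT_POSITION
```

The lower bound is inclusive and the upper bound exclusive, so a match played exactly at before_date_ms is never counted toward the table it is being predicted from.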

New scripts:
- diagnostic_backtest.py: per-bet diagnostic backtest with loss patterns
- optimize_filters.py: grid search per-market optimal thresholds
- analyze_backtest_csv.py: root-cause hypothesis testing on CSV
- compare_backtests.py: side-by-side validation with verdict
- test_score_coherence.py: smoke test for coherence filter (20/20 pass)

Reports:
- diagnostic_backtest_20260525_024437 (50-match smoke)
- diagnostic_backtest_20260525_035649 (1000-match in-sample)
- filter_optimization_patch.json (grid search winners per market)

Social poster v3:
- satori + resvg HTML/CSS rendering pipeline
- Twemoji football/basketball + flag SVGs
- caption SEO: 12 curated hashtags per post
- image SEO: descriptive filenames + .json metadata sidecar
- /health, /preview-png, /run-now endpoints

Docs:
- mds/SESSION_HANDOFF.md: full session state for cross-machine continuity
- mds/SOCIAL_POSTER_SETUP.md: API keys + test commands

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 20:43:28 +03:00
parent b619c2454a
commit 988ee2f50d
36 changed files with 5268 additions and 46 deletions
+13 -1
@@ -449,6 +449,12 @@ class DataLoaderMixin:
return 1.5, 1.2
return weighted_for / total_weight, weighted_against / total_weight
# Approximate European season window — Eredivisie/PL/La Liga start late
# July / mid-August, end May. Using 300 days as a buffer covers most
# competitions while excluding "career points" from previous seasons.
# When a proper seasons table lands this should query season boundaries.
_SEASON_LOOKBACK_MS = 300 * 24 * 60 * 60 * 1000
def _estimate_league_position(
self,
cur: RealDictCursor,
@@ -458,6 +464,7 @@ class DataLoaderMixin:
) -> int:
if not team_id or not league_id:
return 10
season_start_ms = before_date_ms - self._SEASON_LOOKBACK_MS
try:
cur.execute(
"""
@@ -478,6 +485,7 @@ class DataLoaderMixin:
AND m.score_home IS NOT NULL
AND m.score_away IS NOT NULL
AND m.mst_utc < %s
AND m.mst_utc >= %s
UNION ALL
SELECT
m.away_team_id AS team_id,
@@ -492,11 +500,15 @@ class DataLoaderMixin:
AND m.score_home IS NOT NULL
AND m.score_away IS NOT NULL
AND m.mst_utc < %s
AND m.mst_utc >= %s
) tm
GROUP BY tm.team_id
ORDER BY points DESC
""",
(league_id, before_date_ms, league_id, before_date_ms),
(
league_id, before_date_ms, season_start_ms,
league_id, before_date_ms, season_start_ms,
),
)
rows = cur.fetchall()
if not rows: