Statistical evidence that our edge is real, not luck: bootstrap confidence intervals, calibration analysis, and risk metrics across 3 sports.
Generated 2026-03-24 | Model v4
All results reflect the deployed strategy configuration including pregame filters. Backtested on real Polymarket bid/ask prices.
| Metric | Value |
|---|---|
| Trades | 169 |
| Win Rate | 70.4% |
| Avg c/Trade | +12.6c |
| Total P&L | $21.22 |
| Backtest Type | poly_realistic |
| Bootstrap Mean | +12.6c |
| 95% CI | [+5.4c, +19.6c] |
| 99% CI | [+3.1c, +21.5c] |
| p-value | 0.0007 |
| Interpretation | Highly significant |
| Edge | Trades | Win Rate | Avg c/Trade |
|---|---|---|---|
| 5-10c | 43 | 74.4% | +6.7c |
| 10-15c | 42 | 69.0% | +5.0c |
| 15-20c | 25 | 72.0% | +12.9c |
| 20+c | 59 | 67.8% | +22.1c |
| Matchup | Side | Score | Market | Model Fair | Edge | Result | P&L |
|---|---|---|---|---|---|---|---|
| CHI @ HOU | home | 57-61 | 73.0c | 87.1c | +14.1c | WIN | +27.0c |
| PHX @ DET | home | 62-66 | 54.0c | 68.1c | +14.1c | WIN | +46.0c |
| IND @ PHI | home | 50-55 | 58.0c | 72.1c | +14.1c | WIN | +42.0c |
| WSH vs LAC | away | 72-67 | 53.0c | 67.2c | +14.2c | WIN | +47.0c |
| CHI @ DET | home | 57-61 | 57.0c | 72.1c | +15.1c | WIN | +43.0c |
| Month | Trades | Win Rate | Avg c/Trade | Total P&L |
|---|---|---|---|---|
| 2026-01 | 161 | 72.7% | +14.7c | +2368c |
| Component | Detail |
|---|---|
| Architecture | Split-Phase XGBoost (early-game + clutch-time models) |
| Features | 14 engineered features |
| Calibration | Isotonic regression |
| Training Data | 5,285 games |
Features: score_diff, time_fraction, home_elo, away_elo, elo_diff, home_court, pregame_wp, score_diff_x_tf, score_diff_sq, total_score, score_diff_x_elo, pace_diff, ortg_diff, drtg_diff
| Metric | Value |
|---|---|
| Trades | 781 |
| Win Rate | 73.5% |
| Avg c/Trade | +13.5c |
| Total P&L | $105.45 |
| Backtest Type | poly_realistic |
| Bootstrap Mean | +13.5c |
| 95% CI | [+10.3c, +16.6c] |
| 99% CI | [+9.3c, +17.6c] |
| p-value | < 0.0001 |
| Interpretation | Highly significant |
| Edge | Trades | Win Rate | Avg c/Trade |
|---|---|---|---|
| 5-10c | 204 | 72.5% | +3.9c |
| 10-15c | 240 | 71.7% | +4.3c |
| 15-20c | 97 | 71.1% | +9.7c |
| 20+c | 240 | 77.1% | +32.4c |
| Matchup | Side | Score | Market | Model Fair | Edge | Result | P&L |
|---|---|---|---|---|---|---|---|
| DAY @ LAS | home | 51-41 | 61.0c | 72.3c | +11.3c | WIN | +37.0c |
| UGA @ TEX | home | 53-49 | 68.0c | 79.3c | +11.3c | WIN | +30.0c |
| GMU @ URI | home | 53-48 | 61.0c | 72.3c | +11.3c | WIN | +37.0c |
| WEB @ MTST | home | 45-42 | 68.0c | 79.3c | +11.3c | WIN | +30.0c |
| HC @ COLG | home | 37-40 | 61.0c | 72.3c | +11.3c | WIN | +37.0c |
| Month | Trades | Win Rate | Avg c/Trade | Total P&L |
|---|---|---|---|---|
| 2026-01 | 781 | 73.5% | +13.5c | +10545c |
| Component | Detail |
|---|---|
| Architecture | Split-Phase XGBoost (early-game + clutch-time models) |
| Features | 14 engineered features |
| Calibration | Isotonic regression |
| Training Data | 12,285 games |
Features: score_diff, time_fraction, home_elo, away_elo, elo_diff, home_court, pregame_wp, score_diff_x_tf, score_diff_sq, total_score, score_diff_x_elo, pace_diff, ortg_diff, drtg_diff
| Metric | Value |
|---|---|
| Trades | 50 |
| Win Rate | 74.0% |
| Avg c/Trade | +5.2c |
| Total P&L | $2.60 |
| Backtest Type | poly_realistic |
| Bootstrap Mean | +5.2c |
| 95% CI | [-7.5c, +17.0c] |
| 99% CI | [-11.8c, +20.9c] |
| p-value | 0.2000 |
| Interpretation | Not significant |
| Edge | Trades | Win Rate | Avg c/Trade |
|---|---|---|---|
| 5-10c | 11 | 81.8% | +8.5c |
| 10-15c | 22 | 72.7% | +1.0c |
| 15-20c | 11 | 81.8% | +16.9c |
| 20+c | 6 | 50.0% | -7.0c |
| Matchup | Side | Score | Market | Model Fair | Edge | Result | P&L |
|---|---|---|---|---|---|---|---|
| NYR @ WSH | home | 2-1 | 65.0c | 77.7c | +12.7c | WIN | +35.0c |
| ANA vs TB | away | 0-1 | 75.0c | 88.0c | +13.0c | WIN | +25.0c |
| ANA @ EDM | home | 3-2 | 78.0c | 91.0c | +13.0c | WIN | +22.0c |
| CBJ @ VGK | home | 3-2 | 76.0c | 89.1c | +13.1c | WIN | +24.0c |
| NYR @ LA | home | 3-2 | 65.0c | 79.3c | +14.3c | WIN | +35.0c |
| Month | Trades | Win Rate | Avg c/Trade | Total P&L |
|---|---|---|---|---|
| 2026-01 | 42 | 73.8% | +5.3c | +221c |
| Component | Detail |
|---|---|
| Architecture | XGBoost + Isotonic calibration |
| Features | 12 engineered features |
| Calibration | Isotonic regression |
| Training Data | 4,225 games |
Features: score_diff, time_fraction, home_elo, away_elo, elo_diff, home_ice, pregame_wp, score_diff_x_tf, score_diff_sq, pace_diff, ortg_diff, drtg_diff
How our model performs vs naive strategies. A model that can't beat simple baselines isn't worth using.
| Strategy | Win Rate | c/Trade |
|---|---|---|
| Our Model | 70.4% | +12.6c |
| Random (50/50) | 50.0% | -2.0c |
| Market-Efficient | 57.9% | +0.0c |
| Strategy | Win Rate | c/Trade |
|---|---|---|
| Our Model | 73.5% | +13.5c |
| Random (50/50) | 50.0% | -2.0c |
| Market-Efficient | 58.5% | 0.0c |
| Strategy | Win Rate | c/Trade |
|---|---|---|
| Our Model | 74.0% | +5.2c |
| Random (50/50) | 50.0% | -2.0c |
| Market-Efficient | 68.8% | +0.0c |
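One way to read the baseline rows is through the expected value of a binary contract. The sketch below assumes a contract that pays 100c on a win and is bought at `price_cents`. The function `expected_pnl_cents`, its parameters, and the example numbers are illustrative, not taken from the backtest code. Under that assumption, a bet whose win probability equals its price earns 0c before fees and -2.0c after a 2c fee, which is consistent with the Random row. How the Market-Efficient row treats fees is not stated here.

```python
def expected_pnl_cents(win_prob, price_cents, fee_cents=0.0):
    """Expected P&L per contract, in cents, for a binary contract paying 100c."""
    win_pnl = 100.0 - price_cents   # profit if the bet wins
    loss_pnl = -price_cents         # loss if the bet loses
    return win_prob * win_pnl + (1.0 - win_prob) * loss_pnl - fee_cents


# A fairly priced bet (win probability equals price) has zero gross edge.
fair_gross = expected_pnl_cents(0.58, 58.0)
# The same bet net of a 2c fee loses the fee on average.
fair_net = expected_pnl_cents(0.58, 58.0, fee_cents=2.0)
# Hypothetical mispriced bet: true win probability 72%, market price 58c.
edge_net = expected_pnl_cents(0.72, 58.0, fee_cents=2.0)
```

The expression simplifies to `100 * win_prob - price_cents - fee_cents`, so a strategy only beats these baselines when its win rate exceeds its average entry price by more than the fee.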
Our approach is grounded in peer-reviewed research on sports prediction markets and probabilistic forecasting.
Training / test split — Models are trained on historical ESPN game-state snapshots (multiple seasons), then tested on held-out recent-season data using real Polymarket prices the model never saw during training. No future data leaks into features.
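A minimal sketch of a season-based split like the one described above. The record layout (a dict with a `season` key) and the function name `split_by_season` are assumptions for illustration; the actual pipeline's data structures are not shown in this report.

```python
def split_by_season(snapshots, test_season):
    """Train on seasons strictly before test_season; test on test_season only.

    Splitting on time rather than at random keeps later games out of training.
    """
    train = [s for s in snapshots if s["season"] < test_season]
    test = [s for s in snapshots if s["season"] == test_season]
    return train, test


# Hypothetical game-state snapshots tagged with the season they belong to.
snapshots = [
    {"game_id": "a", "season": 2023},
    {"game_id": "b", "season": 2024},
    {"game_id": "c", "season": 2025},
    {"game_id": "d", "season": 2025},
]
train_set, test_set = split_by_season(snapshots, test_season=2025)
```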
Realistic backtesting — Poly-price backtests use actual Polymarket bid/ask prices from enriched market snapshots, including a 2c taker fee per trade. Entry prices reflect real market conditions, not simulated fills.
Bootstrap confidence intervals — 10,000 resamples with replacement. The p-value is the fraction of bootstrap means ≤ 0, testing H0: "the model has no edge." A p-value below 0.05 means fewer than 5% of the resampled means were at or below zero, which we treat as significant evidence of a positive edge.
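The procedure can be sketched with the standard library alone. This is an illustration under stated assumptions: percentile intervals with a nearest-rank cut-off, a fixed seed for repeatability, and made-up per-trade P&L values. The function name `bootstrap_mean_ci` is hypothetical and the production implementation may differ.

```python
import random
import statistics


def bootstrap_mean_ci(pnl_cents, n_resamples=10_000, seed=0):
    """Percentile-bootstrap intervals and one-sided p-value for mean P&L per trade."""
    rng = random.Random(seed)
    n = len(pnl_cents)
    # Resample trades with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(pnl_cents, k=n)) for _ in range(n_resamples)
    )

    def percentile(q):
        # Nearest-rank percentile on the sorted bootstrap means.
        idx = min(n_resamples - 1, max(0, round(q * (n_resamples - 1))))
        return means[idx]

    return {
        "mean": statistics.fmean(means),
        "ci95": (percentile(0.025), percentile(0.975)),
        "ci99": (percentile(0.005), percentile(0.995)),
        # Fraction of bootstrap means at or below zero.
        "p_value": sum(m <= 0 for m in means) / n_resamples,
    }


# Hypothetical per-trade P&L in cents (wins positive, losses negative).
example_pnl = [27, 46, 42, -55, 43, -60, 35, 25, -70, 30]
result = bootstrap_mean_ci(example_pnl, n_resamples=2_000, seed=1)
```

With 10,000 resamples the smallest nonzero p-value this estimator can report is 0.0001, so a result where no resampled mean falls at or below zero is best read as p < 0.0001.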
Calibration — Predictions are bucketed into 5%-wide bins (min 5 trades each). A well-calibrated model's dots land on the diagonal; points below the line indicate overconfidence.
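The binning step might look like the sketch below. It assumes predictions are probabilities in [0, 1] and outcomes are 1 for a win and 0 for a loss. The function name `calibration_table` and the example data are hypothetical.

```python
from collections import defaultdict


def calibration_table(predicted, outcomes, bin_width=0.05, min_trades=5):
    """Group predictions into fixed-width bins and compare with realized win rate."""
    n_bins = round(1 / bin_width)
    bins = defaultdict(list)
    for p, won in zip(predicted, outcomes):
        # The small epsilon guards against float error at bin edges (e.g. 0.70);
        # the min() keeps p == 1.0 inside the last bin.
        idx = min(int(p / bin_width + 1e-9), n_bins - 1)
        bins[idx].append((p, won))

    rows = []
    for idx in sorted(bins):
        items = bins[idx]
        if len(items) < min_trades:
            continue  # too few trades for a stable win-rate estimate
        mean_pred = sum(p for p, _ in items) / len(items)
        win_rate = sum(w for _, w in items) / len(items)
        rows.append({
            "bin_start": round(idx * bin_width, 4),
            "n": len(items),
            "mean_pred": mean_pred,
            "win_rate": win_rate,
            # A negative gap is a point below the diagonal (overconfidence).
            "gap": win_rate - mean_pred,
        })
    return rows


# Hypothetical data: six trades predicted at 72%, three at 91%.
predicted = [0.72] * 6 + [0.91] * 3
outcomes = [1, 1, 1, 0, 0, 0] + [1, 1, 1]
rows = calibration_table(predicted, outcomes)
```

In this made-up example only the 70-75% bin survives the minimum-trade rule, and its realized win rate of 50% sits below the 72% prediction.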
Sharpe ratio — Mean per-trade P&L divided by its standard deviation, annualized with sqrt(252) scaling, which treats each trade as one daily observation. Values above 1.0 indicate strong risk-adjusted returns; above 3.0 is exceptional.
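A sketch of the calculation, assuming a zero risk-free rate and the sample standard deviation. Neither choice is stated in this report, and `annualized_sharpe` is an illustrative name.

```python
import math
import statistics


def annualized_sharpe(pnl_cents, periods_per_year=252):
    """Mean / sample standard deviation of per-trade P&L, scaled by sqrt(periods)."""
    mean = statistics.fmean(pnl_cents)
    stdev = statistics.stdev(pnl_cents)  # sample (n - 1) standard deviation
    if stdev == 0:
        return float("nan")  # undefined when every trade has the same P&L
    return mean / stdev * math.sqrt(periods_per_year)


# Hypothetical per-trade P&L in cents.
sharpe = annualized_sharpe([10, -10, 10, -10, 20])
```

Because the scaling factor is applied per trade, a strategy that trades more or less often than once per day would need a different `periods_per_year` to be comparable with conventional daily Sharpe ratios.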
Profit factor — Gross wins / gross losses. Any value above 1.0 is net profitable; we read above 1.25 as comfortably profitable, above 1.5 as strong, and above 2.0 as excellent.
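For concreteness, the ratio can be computed as below. The handling of a loss-free sample (returning infinity) is an assumption of this sketch, and `profit_factor` is an illustrative name.

```python
def profit_factor(pnl_cents):
    """Gross wins divided by gross losses, both as positive amounts."""
    gross_wins = sum(x for x in pnl_cents if x > 0)
    gross_losses = -sum(x for x in pnl_cents if x < 0)
    if gross_losses == 0:
        # No losing trades: the ratio is unbounded (or undefined with no wins).
        return float("inf") if gross_wins > 0 else float("nan")
    return gross_wins / gross_losses


# Hypothetical trades: 50c of wins against 25c of losses.
pf = profit_factor([30, 20, -25])
```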
Fee assumptions — All results are net of a 2c flat taker fee per contract. Live Polymarket fees may vary by sport (e.g., NCAAMB has a 2% taker fee effective Feb 2026).
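The per-trade arithmetic can be sketched as follows, assuming one contract per trade, a 100c payout on a win, and the fee subtracted from every trade regardless of outcome. The function `trade_pnl_cents` is a hypothetical helper, and whether the backtest deducts the fee this way or folds it into the entry price is not stated here.

```python
def trade_pnl_cents(entry_price_cents, won, fee_cents=2.0):
    """P&L in cents for one contract bought at entry_price_cents."""
    payout = 100.0 if won else 0.0
    return payout - entry_price_cents - fee_cents


# A winning entry at 73.0c, gross and net of the flat 2c fee.
gross_win = trade_pnl_cents(73.0, won=True, fee_cents=0.0)
net_win = trade_pnl_cents(73.0, won=True)
net_loss = trade_pnl_cents(73.0, won=False)
```

With `fee_cents=0.0` a winning 73.0c entry yields +27.0c, which matches the P&L shown for the 73.0c winner in the sample trades table.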
Pregame filter — For NBA and NHL, the deployed strategy only takes a trade when the pregame market price for the model's bet side was ≥55c. This filters out trades where the model disagrees with the pregame market consensus, reducing adverse selection.
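Read literally, the rule is a single threshold check. The sketch below assumes the input is the pregame price of the side the model wants to bet. The name `passes_pregame_filter` and the sport labels are illustrative.

```python
def passes_pregame_filter(
    sport,
    pregame_price_cents,
    threshold_cents=55.0,
    filtered_sports=("NBA", "NHL"),
):
    """Return True if a candidate trade survives the pregame filter.

    pregame_price_cents is the pregame market price of the side the model bets.
    Sports outside filtered_sports are passed through unchanged.
    """
    if sport not in filtered_sports:
        return True
    return pregame_price_cents >= threshold_cents


# Hypothetical candidates: (sport, pregame price of the model's side).
candidates = [("NBA", 60.0), ("NBA", 50.0), ("NHL", 55.0), ("NCAAMB", 40.0)]
kept = [c for c in candidates if passes_pregame_filter(*c)]
```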