The same data and models that power our API. Train your own models, backtest strategies, or learn to build a prediction market bot from scratch.
Six Jupyter notebooks covering ESPN scraping, Elo ratings, win probability (WP) models, backtesting, a live bot, and deployment. Build a complete prediction system from scratch.
Tick-level orderbook data from Polymarket and Kalshi. These platforms do not publish historical market data — we recorded it ourselves.
2,203 games and 737K+ candle bars from the 2025 MLB season. OHLCV plus bid/ask at 1-minute resolution, with game outcomes. Kalshi does not publish historical data.
Tick-level bid/ask/spread/volume snapshots from Polymarket sports markets. Score-synced with live game state. 30+ days of continuous recording. Polymarket has no historical orderbook API.
Research-grade analysis of Kalshi market dynamics: spread compression events, quote freezes, recovery curves, leader-lag clusters, reversion patterns. Includes charts and methodology.
25.6M rows of in-game state snapshots across NBA, NCAAMB, NCAAWB, CFB, NFL, NHL, MLB. Parquet format. 2020-2026.
2015-2024: leak-free features (140 columns), raw box scores, player stats, lineups, recruiting, KenPom-style ratings, SRS, and shooting splits.
Historical odds data (2020-2025) + MoneyPuck advanced analytics. Player-level and goalie stats with data dictionary.
Retrosheet play-by-play (2020-2024), game info, player stats, and historical odds. Ready for pitcher modeling and game simulation.
Everything: WP training data, NCAAB mega-pack, NHL odds, MLB analytics, Kalshi candles, tennis data, microstructure pack, AND the bot course.
Win probability models — 25M+ labeled game-state snapshots with ESPN WP as baseline. Train LR, XGBoost, or neural nets.
Prediction market bots — the course walks you through building a complete Polymarket bot with edge detection and live execution.
Market microstructure research — tick-level orderbook data for studying price discovery, spread dynamics, and liquidity patterns in prediction markets.
Custom backtests — test entry criteria against real game outcomes and real market prices. Score, period, Elo, bid/ask — all included.
Elo rating systems — cleaned game results for 258+ teams across 10 sports. Build your own Elo, Glicko, or TrueSkill.
Academic papers — prediction market efficiency, price reaction to scoring events, sports betting market analysis. Cite-ready with provenance metadata.
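To give a feel for the Elo use case above, here is a minimal sketch of a rating update loop over game results. It assumes the textbook logistic Elo formula with a 400-point scale, a K-factor of 20, a starting rating of 1500, and no home-field adjustment; the team codes and the `(home, away, home_won)` tuple layout are hypothetical and are not the schema of the shipped files.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings, home, away, home_won, k=20.0, start=1500.0):
    """Apply one game result to the ratings dict in place and return it.

    k and start are illustrative defaults (assumptions), not tuned values.
    """
    r_home = ratings.get(home, start)
    r_away = ratings.get(away, start)
    exp_home = expected_score(r_home, r_away)
    # Points gained by the home team equal points lost by the away team.
    delta = k * ((1.0 if home_won else 0.0) - exp_home)
    ratings[home] = r_home + delta
    ratings[away] = r_away - delta
    return ratings


# Hypothetical results: (home team, away team, did the home team win?)
games = [("BOS", "NYY", True), ("NYY", "BOS", True), ("BOS", "TBR", False)]

ratings = {}
for home, away, home_won in games:
    update_elo(ratings, home, away, home_won)

for team, rating in sorted(ratings.items(), key=lambda item: -item[1]):
    print(f"{team}: {rating:.1f}")
```

Because every update is zero-sum, the average rating stays at the starting value, which makes a handy sanity check when replaying a full season. Glicko or TrueSkill would replace `update_elo` with their own update rules.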
Datasets are delivered as ZIP files containing Apache Parquet and CSV. Compatible with pandas, polars, DuckDB, and Spark.
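As a rough illustration of working with that delivery format, the sketch below reads one CSV member straight out of a ZIP archive using only the Python standard library. The member name `games.csv` and its columns are hypothetical stand-ins; the example builds a tiny archive in memory so it runs without a download.

```python
import csv
import io
import zipfile


def read_csv_from_zip(zip_source, member):
    """Return the rows of one CSV member of a ZIP archive as dicts."""
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


# Build a small stand-in archive in memory (hypothetical file and columns).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "games.csv",
        "game_id,home,away,home_won\n1,BOS,NYY,1\n2,NYY,BOS,0\n",
    )
buffer.seek(0)

rows = read_csv_from_zip(buffer, "games.csv")
print(len(rows), rows[0]["home"])
```

With a real download, `zip_source` would be the path to the ZIP file. Parquet members are usually easier to extract first and then load with `pandas.read_parquet`, which needs `pyarrow` or `fastparquet` installed.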
Instant download after purchase. Payments processed securely via Stripe.
Questions? admin@zenhodl.net
Want real-time signals instead? See API plans →