Research & Analytics

What I've Studied

Controlled experiments, statistical rigor, and predictive models validated on real data with measurable outcomes.

835+ Experiments
7,330+ Games Analyzed
770K+ Plays Processed
2 Deployed Models

How I Work

Every project starts with a question and a dataset. I design controlled experiments, validate with proper statistical methods (walk-forward CV, permutation tests, Benjamini-Hochberg correction), and only trust results that survive out-of-sample testing. If it ships, it has numbers behind it.

Research | Live

NHL Betting Market Efficiency Study

MGS 649 Practicum | University at Buffalo

Systematic analysis of NHL betting market inefficiencies. Designed 235+ controlled experiments across 6,560 games and 100K+ odds records, applying Benjamini-Hochberg correction and permutation testing to rigorously evaluate market efficiency.

235+ Experiments
60.9% Accuracy
+9-11% ROI
6,560 Games

Key Findings

  • 158-feature logistic regression with 5 custom Elo systems and an expected goals pipeline
  • 60.9% walk-forward accuracy on 6,560 out-of-sample games
  • Identified exploitable inefficiencies yielding +9-11% ROI
  • Deployed as PuckCast (puckcast.ai) serving 5,500+ unique visitors

Methodology

Walk-forward cross-validation on rolling 3-season windows. Each experiment tests a specific hypothesis about market behavior (e.g., 'home favorites are overpriced after back-to-back losses'). Statistical significance is assessed via permutation testing with 10,000 shuffles per hypothesis, with Benjamini-Hochberg correction controlling the false discovery rate across the full set of hypotheses.
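The two statistical steps named above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the function names, the one-sided difference-in-means test statistic, and the alpha of 0.05 are all assumptions made for the example.

```python
import random

def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """One-sided permutation test: is mean(group_a) greater than mean(group_b)?"""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_shuffles):
        # Shuffling the pooled values breaks any link between label and outcome.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a)
        if diff >= observed:
            hits += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_shuffles + 1)

def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True if rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected
```

In a setting like the one described, `group_a` might hold per-bet returns for games matching a hypothesis and `group_b` the returns for all other games. One p-value per hypothesis would then be passed to `benjamini_hochberg` together.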

Results

The final model uses 158 engineered features including 5 custom Elo rating systems (overall, home/away, recent form, goaltender, situational), an expected goals pipeline built from 100K+ shot-level records, and rolling team performance metrics. Walk-forward accuracy is 60.9% on 6,560 fully out-of-sample games across 5 NHL seasons, with a Brier score (lower is better) significantly below the bookmaker baseline.
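For readers unfamiliar with the two building blocks mentioned here, the sketch below shows a textbook Elo update and the Brier score. It is an assumption-laden illustration: the K-factor of 20, the 400-point scale, and the `home_advantage` parameter are standard Elo conventions chosen for the example, and none of the five custom systems described above is reproduced.

```python
def elo_expected(rating_home, rating_away, home_advantage=0.0):
    """Expected home win probability under the standard Elo logistic curve."""
    gap = rating_away - rating_home - home_advantage
    return 1.0 / (1.0 + 10 ** (gap / 400.0))

def elo_update(rating_home, rating_away, home_won, k=20.0, home_advantage=0.0):
    """Return updated (home, away) ratings after a single game."""
    expected = elo_expected(rating_home, rating_away, home_advantage)
    actual = 1.0 if home_won else 0.0
    delta = k * (actual - expected)
    # Elo is zero-sum: whatever one side gains, the other loses.
    return rating_home + delta, rating_away - delta

def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)
```

A comparison against a bookmaker baseline would compute `brier_score` once with the model's probabilities and once with probabilities implied by the odds, on the same out-of-sample games.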

Impact

Findings deployed as PuckCast (puckcast.ai), a live analytics platform generating real-time win probabilities and value betting signals for every NHL game. Companion developer API (puckapi.com) serves model predictions and historical data. Platform has served 5,500+ unique visitors with sub-second response times.

Python | scikit-learn | Supabase | Vercel | The Odds API

Research | In Progress

NFL Game Prediction Model (NoPUNT)

Independent Research

Large-scale NFL prediction system benchmarking 600+ model configurations across 35 optimization rounds on 770,000+ plays of play-by-play data, systematically isolating the strongest predictive signals for game outcomes.

600+ Configs
77.6% Consensus
+41.6% ROI
770K+ Plays

Key Findings

  • Discovered two independent ~67% accuracy signals with zero feature overlap
  • Consensus reaches 77.6% accuracy at high confidence thresholds
  • Validated at +41.6% ROI against closing odds over 6 backtested seasons
  • Proprietary prediction pipelines and backtesting infrastructure in Python

Methodology

Systematic grid search over 600+ model configurations combining feature sets, algorithms (Random Forest, Gradient Boosting, Logistic Regression), hyperparameters, and training window sizes. Each configuration is evaluated via walk-forward backtesting across 6 complete NFL seasons. Play-by-play data from nfl_data_py, covering 770,000+ individual plays, is aggregated into game-level predictive features.
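The shape of such a search can be sketched as a configuration grid crossed with walk-forward season splits. Everything concrete below is hypothetical and chosen only for illustration: the season list, the feature-set names, the window sizes, and the helper names `walk_forward_splits` and `build_grid`. Model fitting is deliberately left out.

```python
from itertools import product

def walk_forward_splits(seasons, train_window):
    """Yield (train_seasons, test_season) pairs in chronological order."""
    ordered = sorted(seasons)
    for i in range(train_window, len(ordered)):
        # Train only on seasons strictly before the test season.
        yield ordered[i - train_window:i], ordered[i]

def build_grid(feature_sets, algorithms, train_windows):
    """Enumerate every combination of the supplied options."""
    return [
        {"features": f, "algorithm": a, "train_window": w}
        for f, a, w in product(feature_sets, algorithms, train_windows)
    ]

# Hypothetical values for illustration only.
seasons = [2018, 2019, 2020, 2021, 2022, 2023]
grid = build_grid(
    feature_sets=["signal_a", "signal_b"],
    algorithms=["random_forest", "gradient_boosting", "logistic_regression"],
    train_windows=[2, 3],
)
for config in grid:
    for train, test in walk_forward_splits(seasons, config["train_window"]):
        # Fit on `train`, score on `test`; the model itself is omitted here.
        pass
```

Because each test season is scored by a model that has only seen earlier seasons, every prediction in the backtest is out-of-sample by construction.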

Results

Identified two statistically independent signal families with zero feature overlap, each achieving ~67% standalone accuracy. When both signals agree (high-confidence consensus), accuracy reaches 77.6%. Backtested against historical closing lines, this consensus generates +41.6% ROI, surviving robustness checks across all 6 test seasons individually.
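The consensus rule and the ROI figure can be made concrete with a short sketch. It assumes each signal outputs a home-win probability, that "agreement" means both clear the same threshold on the same side, and that ROI is profit per unit staked with flat one-unit bets at decimal odds. The 0.60 threshold and the function names are illustrative assumptions, not the project's settings.

```python
def consensus_picks(probs_a, probs_b, threshold=0.60):
    """Return 'home', 'away', or None per game; None means the signals disagree."""
    picks = []
    for p_a, p_b in zip(probs_a, probs_b):
        if p_a >= threshold and p_b >= threshold:
            picks.append("home")
        elif p_a <= 1 - threshold and p_b <= 1 - threshold:
            picks.append("away")
        else:
            picks.append(None)
    return picks

def flat_stake_roi(picks, winners, decimal_odds):
    """Profit per unit staked, betting one unit on every non-None pick."""
    staked = 0
    profit = 0.0
    for pick, winner, odds in zip(picks, winners, decimal_odds):
        if pick is None:
            continue  # No bet when the two signals do not agree.
        staked += 1
        profit += (odds[pick] - 1.0) if pick == winner else -1.0
    return profit / staked if staked else 0.0
```

Here `decimal_odds` is a list of dicts such as `{"home": 1.9, "away": 2.0}`, one per game. Raising the threshold trades bet volume for confidence, so any reported accuracy or ROI should be read alongside the number of games that pass the filter.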

Impact

Proprietary backtesting infrastructure enables rapid iteration on new feature hypotheses. The dual-signal architecture provides natural confidence calibration: predictions are only surfaced when both independent models agree, which sharply reduces the false positive rate. A public frontend is currently in development.

Python | Random Forest | Gradient Boosting | nfl_data_py