EPL Projection Model
Why Dixon-Coles?
Football (soccer) has a fundamental modeling challenge that standard Poisson doesn't handle well: low-scoring outcomes are correlated. In a 0-0 or 1-1 game, both teams' goal counts reflect the same underlying match conditions (defensive tactics, poor pitch, weather) suppressing both sides at once. Standard Poisson treats the two teams' scores as independent draws, and that assumption systematically underestimates the probability of draws.
The Dixon-Coles model (1997) fixes this by adding a tau correction that adjusts exactly four cells in the score matrix (the 0-0, 1-0, 0-1, and 1-1 outcomes) using a parameter rho (-0.07) fitted from historical match data. The correction shifts draw probability by 2-3 percentage points, which the model treats as its highest-Sharpe edge in football betting markets: draws are the most systematically mispriced outcome because recreational bettors rarely back them.
The model also incorporates time-decay weighting (recent matches count for more than older ones) because football teams change significantly within a season through transfers, injuries, tactical evolution, and manager changes. Football is also unusual among major sports in having a three-way outcome (Home/Draw/Away), so the model must correctly distribute probability across all three outcomes and cannot treat the market as a binary yes/no.
Component Weights
The model computes each team's scoring rate (lambda) from six weighted components. Expected goals (xG) differential is the foundation because xG measures the quality of chances created and conceded, stripping out finishing luck. The remaining components capture factors that xG alone misses.
Per-Team Home/Away Splits: Each team's home and away attack and defense strengths are computed from its actual goals scored and conceded at each venue, drawn from finished fixtures. The samples are Bayesian-blended with the league venue baseline using 10 phantom matches as a prior, so early-season noise gets shrunk toward the league mean. This replaces a flat venue factor that assumed every team shared the league-wide H/A asymmetry. In 2025-26 the per-team H-A spread ranges from +0.89 (Newcastle) to -0.41 (Chelsea), so a uniform factor erased real venue effects.
Defensive Suppression (15%): How well a team limits opponents' xG. Separating this from overall xG differential lets the model weight defensive solidity independently, which matters most in low-scoring league phases.
Recent Form (18%): An exponentially-weighted moving average over the last 8 matches. Captures tactical shifts and momentum that season-long data is slow to absorb.
Congestion (8%): Fixture pile-ups from Champions League, Europa League, and cup competitions force rotation and fatigue, reducing attacking output.
Home Venue (8%): Captured at two layers. The per-team H/A splits already encode each ground's actual scoring and conceding profile. On top of that, a default 6.0% multiplicative boost is applied to the home λ, with extra premiums for select grounds (Anfield, St James' Park).
Set Pieces (5%): Teams with above-average set-piece xG share get a small lambda boost. Set pieces are a repeatable, coachable skill that general xG models tend to blur together with open-play chances.
Player Absence (0-25%): A separate multiplicative reduction applied on top of the six core components. When the pipeline runs within 90 minutes of kickoff, the official starting XI is fetched from PulseLive (the backend that powers premierleague.com), and any expected starter not in it is treated as a rotation-out absence alongside FPL-flagged injuries and suspensions. Each absence reduces lambda proportional to that player's xG/90 share of team output, capped at 25% total. This catches B-team rotations before midweek cup matches and end-of-season dead rubbers, where the manager rests healthy starters and FPL injury data has no signal.
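Two of the mechanics above reduce to short formulas. The sketch below is illustrative only: the function names are hypothetical, the shrinkage is assumed to be a plain pseudo-count blend (10 phantom matches played at the league rate), and the absence reduction is assumed to equal the summed xG/90 shares one-for-one before the 25% cap.

```python
PRIOR_MATCHES = 10   # phantom matches used as the prior (from the text)
ABSENCE_CAP = 0.25   # maximum total lambda reduction (from the text)


def shrink_rate(goals: float, matches: int, league_rate: float,
                prior_matches: int = PRIOR_MATCHES) -> float:
    """Blend a team's per-match rate at one venue with the league baseline.

    Assumption: the prior acts as `prior_matches` extra matches in which
    the team scored (or conceded) exactly at the league rate.
    """
    return (goals + prior_matches * league_rate) / (matches + prior_matches)


def absence_multiplier(absent_xg_shares: list[float],
                       cap: float = ABSENCE_CAP) -> float:
    """Multiplier applied to lambda given each absent player's xG/90 share.

    Assumption: the reduction is the sum of the shares, capped at `cap`.
    """
    return 1.0 - min(sum(absent_xg_shares), cap)


# 12 home goals in 5 home matches is 2.4 per match raw; the prior pulls
# that toward the 1.45 league home baseline.
early_season_rate = shrink_rate(goals=12, matches=5, league_rate=1.45)

# Two absent starters carrying 20% and 15% of team xG hit the 25% cap.
lam_multiplier = absence_multiplier([0.20, 0.15])
```

With only 5 real matches the prior carries two thirds of the weight; by 30 matches it carries a quarter, which is the intended early-season damping.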
Dixon-Coles Tau Correction
The tau correction is the heart of the model. Standard Poisson produces a 9x9 matrix of scoreline probabilities (0 to 8 goals per side) assuming independence. The tau correction then adjusts four specific cells:
- P(0,0) is increased, because goalless draws happen more often than independence predicts
- P(1,1) is increased, because 1-1 draws are also more common
- P(1,0) and P(0,1) are decreased, because narrow wins are slightly less likely
With rho = -0.07, this shifts draw probability by 2-3 percentage points and is the primary source of model edge. The parameter is refitted at mid-season and end-of-season from EPL data; it has remained stable near this value across multiple seasons.
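The correction can be sketched in a few lines. This is a minimal illustration and not the production pipeline: the tau factors follow the standard Dixon-Coles (1997) form, while the function names, the renormalization step, and the example rates (the league home and away xG baselines) are assumptions made for the sketch.

```python
import math

RHO = -0.07     # fitted dependence parameter (from the text)
MAX_GOALS = 8   # 0..8 goals per side gives the 9x9 matrix


def poisson_pmf(k: int, rate: float) -> float:
    return math.exp(-rate) * rate ** k / math.factorial(k)


def tau(x: int, y: int, lam: float, mu: float, rho: float) -> float:
    """Dixon-Coles factor for home goals x, away goals y."""
    if x == 0 and y == 0:
        return 1.0 - lam * mu * rho
    if x == 0 and y == 1:
        return 1.0 + lam * rho
    if x == 1 and y == 0:
        return 1.0 + mu * rho
    if x == 1 and y == 1:
        return 1.0 - rho
    return 1.0  # every other cell is left untouched


def score_matrix(lam: float, mu: float, rho: float = RHO,
                 max_goals: int = MAX_GOALS) -> list[list[float]]:
    """matrix[x][y] = P(home scores x, away scores y)."""
    m = [[tau(x, y, lam, mu, rho) * poisson_pmf(x, lam) * poisson_pmf(y, mu)
          for y in range(max_goals + 1)]
         for x in range(max_goals + 1)]
    total = sum(map(sum, m))  # assumption: renormalize the truncated grid
    return [[p / total for p in row] for row in m]


def draw_probability(matrix: list[list[float]]) -> float:
    return sum(matrix[k][k] for k in range(len(matrix)))


corrected = score_matrix(1.45, 1.25)
independent = score_matrix(1.45, 1.25, rho=0.0)
draw_shift = draw_probability(corrected) - draw_probability(independent)
```

Before truncation the four adjustments cancel exactly, so the matrix still sums to one and the renormalization only absorbs the small probability mass beyond 8 goals. With a negative rho, `draw_shift` is positive because the 0-0 and 1-1 cells both sit on the diagonal.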
Time Decay
Football teams evolve within a season through January transfers, tactical adjustments, and injuries. The model applies exponential time-decay to historical results so that recent matches influence the ratings more than older ones. A short-term form window sits on top of that to capture streaks and momentum.
| Parameter | Value |
|---|---|
| Decay rate (xi) | 0.003 |
| Form window | 8 matches |
| Form vs season split | 18% / 82% |
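As a rough illustration of how these parameters could interact, the sketch below weights each historical match by exp(-xi * t) and then blends a form rating with a season rating. The function names are hypothetical, and measuring t in days is an assumption: the table does not state the unit of the decay rate.

```python
import math

XI = 0.003           # decay rate from the table
FORM_WEIGHT = 0.18   # recent-form share from the component weights


def decay_weight(days_ago: float, xi: float = XI) -> float:
    """Exponential time-decay weight; assumes age is measured in days."""
    return math.exp(-xi * days_ago)


def decayed_mean(values: list[float], days_ago: list[float],
                 xi: float = XI) -> float:
    """Weighted average of per-match values, newer matches counting more."""
    weights = [decay_weight(d, xi) for d in days_ago]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def blended_rating(season_rating: float, form_rating: float,
                   form_weight: float = FORM_WEIGHT) -> float:
    """Mix the 8-match form rating with the decayed season rating."""
    return form_weight * form_rating + (1.0 - form_weight) * season_rating


half_life = math.log(2) / XI  # about 231 time units at xi = 0.003
```

Under the days assumption, a match played roughly 231 days ago counts half as much as one played today, so most of a season stays relevant while the previous season fades.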
Home Advantage
Home advantage in the EPL is applied as a multiplicative boost to the home team's expected scoring rate. The default 6% uplift reflects the average effect across all 20 grounds. A few historically imposing venues get an additional premium.
| Parameter | Value |
|---|---|
| Default boost | 6% uplift |
| Liverpool | +3% extra |
| Newcastle | +2% extra |
Contextual Adjustments
These adjustments capture situations that baseline team strength doesn't account for. Fixture congestion from European competition forces rotation and fatigue. A new manager appointment produces a well-documented short-term performance boost as players respond to fresh methods and increased scrutiny.
| Parameter | Value |
|---|---|
| Fixture congestion penalty | -0.075 |
| New manager bounce | +12% |
League Baselines
The league-wide scoring averages that anchor all lambda calculations. Home and away baselines are separated because home advantage in football is baked into the attack/defense strength ratios, with home teams creating better chances on average.
| Parameter | Value |
|---|---|
| Goals per game | 2.73 |
| Home xG | 1.45 |
| Away xG | 1.25 |
| BTTS rate | 52% |
Minimum Edge Thresholds
Football markets vary widely in efficiency. Asian handicap is the sharpest market (lowest threshold) because it attracts the most sophisticated bettors. Correct score and cards require the largest edge because they're high-variance, low-liquidity markets where the model's estimation error is largest.
| Parameter | Value |
|---|---|
| 1X2 | 3.0% |
| Asian Handicap | 2.0% |
| Total | 2.0% |
| BTTS | 3.0% |
| Correct Score | 5.0% |
| Team Total | 2.5% |
| Anytime Goalscorer | 4.0% |
| Cards | 5.0% |
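A threshold check of this kind might look like the following. This is a sketch under one stated assumption: "edge" is taken to mean expected value per unit staked at the offered decimal odds. The dictionary keys and function names are hypothetical; the percentages are copied from the table.

```python
MIN_EDGE = {
    "1x2": 0.030,
    "asian_handicap": 0.020,
    "total": 0.020,
    "btts": 0.030,
    "correct_score": 0.050,
    "team_total": 0.025,
    "anytime_goalscorer": 0.040,
    "cards": 0.050,
}


def expected_value(model_prob: float, decimal_odds: float) -> float:
    """Expected profit per unit staked (assumed definition of edge)."""
    return model_prob * decimal_odds - 1.0


def qualifies(market: str, model_prob: float, decimal_odds: float) -> bool:
    """True when the model's edge clears the market's minimum threshold."""
    return expected_value(model_prob, decimal_odds) >= MIN_EDGE[market]


# A draw the model rates at 30% and priced at 3.50 carries about a 5% edge,
# which clears the 3.0% bar for 1X2.
draw_is_bet = qualifies("1x2", model_prob=0.30, decimal_odds=3.50)
```

The same probability at 3.40 would yield a 2% edge and be passed over in the 1X2 market, even though that edge would meet the Asian handicap threshold.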
Markets Explained
Football has more distinct market types than any other sport the model covers. Each one asks a different question, and the Dixon-Coles score matrix lets the model answer all of them from a single unified probability distribution.
1X2 (Match Result)
Example: LIV 1.75 / Draw 3.50 / ARS 4.50
Three-way outcome: home win, draw, or away win. Football is the only major sport where a tie is a regular outcome, with about 25% of matches ending in a draw. Shown in decimal odds here (1.75 = -133 American). This is where Dixon-Coles does its primary work: the tau correction shifts draw probability by 2-3 points vs standard Poisson, and draws are consistently underpriced because casual bettors avoid them.
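Decimal and American odds express the same price, and the conversion is mechanical. The helper names below are illustrative and not part of the model; the prices are the ones from the 1X2 example.

```python
def decimal_to_american(decimal_odds: float) -> int:
    """Convert decimal odds to American odds, rounded to the nearest integer."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    if decimal_odds >= 2.0:
        # Underdog or even-money prices: profit on a 100 stake.
        return round((decimal_odds - 1.0) * 100)
    # Favorite prices: stake needed to win 100, shown as a negative number.
    return round(-100.0 / (decimal_odds - 1.0))


def implied_probability(decimal_odds: float) -> float:
    """Probability implied by a price, before removing the bookmaker margin."""
    return 1.0 / decimal_odds


prices = {"LIV": 1.75, "Draw": 3.50, "ARS": 4.50}
american = {name: decimal_to_american(odds) for name, odds in prices.items()}
overround = sum(implied_probability(odds) for odds in prices.values()) - 1.0
```

Here `american` comes out as -133, +250, and +350, and the three implied probabilities sum to about 1.079, so the book carries a margin of roughly 7.9% that the model's probabilities have to overcome.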
Asian Handicap
Example: LIV -1.5 +120 / ARS +1.5 -140
A goal-based spread that removes the draw as a betting outcome. On whole-number lines, a result that lands exactly on the handicap is a push and you get your stake back. Quarter lines (-1.25, -1.75) split your bet across the two adjacent handicaps. Asian handicap is the sharpest, most efficient football market, with the lowest EV threshold and the cleanest prices, and it's the market most favored by sharp bettors worldwide.
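Settlement on whole, half, and quarter lines can be sketched as below. The function names are hypothetical; the rules coded here are the standard ones (win, push, or loss on each half of the stake), and the return is expressed per unit staked at decimal odds.

```python
def settle_single(margin: int, handicap: float, decimal_odds: float) -> float:
    """Return per unit staked on a whole or half line.

    `margin` is the backed team's goals minus the opponent's goals.
    """
    adjusted = margin + handicap
    if adjusted > 0:
        return decimal_odds   # win: stake plus profit
    if adjusted == 0:
        return 1.0            # push: stake refunded
    return 0.0                # loss


def settle_asian(margin: int, handicap: float, decimal_odds: float) -> float:
    """Return per unit staked, splitting quarter lines into two half-stakes."""
    is_quarter_line = (handicap * 4) % 2 == 1
    if is_quarter_line:
        lower = settle_single(margin, handicap - 0.25, decimal_odds)
        upper = settle_single(margin, handicap + 0.25, decimal_odds)
        return 0.5 * (lower + upper)
    return settle_single(margin, handicap, decimal_odds)


# Backing a team at -1.25 and decimal 2.00: a one-goal win pushes the -1.0
# half and loses the -1.5 half, so half the stake comes back.
one_goal_win = settle_asian(margin=1, handicap=-1.25, decimal_odds=2.00)
```

A two-goal win at the same line wins both halves and returns the full 2.00 per unit.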
Goal Totals (Over/Under)
Example: O 2.5 -110 / U 2.5 -110
Combined goals across both teams. 2.5 is the most common line. Driven by both teams' xG rates, defensive suppression, and fixture-specific factors like congestion (tired legs produce fewer goals). Quarter lines (2.25, 2.75) work like Asian handicaps, splitting the stake between the adjacent whole and half lines to soften pushes.
BTTS (Both Teams To Score)
Example: Yes -125 / No +105
Does each team score at least one goal? Independent of final margin: a 3-1 and a 1-1 both cash 'Yes'. Derived directly from the score matrix: sum all cells where both home and away goals > 0. The league BTTS rate sits around 52%, and the model's edge here comes from defensive matchups where one side is unusually likely to be shut out.
Correct Score
Example: 2-1 +650 / 1-1 +550
Pick the exact final score. The highest-variance market football offers: correct score cashes rarely but pays huge. The model reads each scoreline directly from the Dixon-Coles matrix cell, so priced bets are always internally consistent with the rest of the markets. Requires the largest edge threshold because pricing errors compound at long odds.
Team Totals
Example: LIV O 1.5 -130 / LIV U 1.5 +110
Over/under on one team's goals. Useful when the model has a strong view on one side's attack (Manchester City at home vs. a relegation defense) but an unclear read on the opponent. Derived by marginalizing the score matrix across one team's axis.
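The market derivations described in this section all reduce to sums over cells of one score matrix. The sketch below shows the idea with hypothetical function names. To stay self-contained it builds a plain independent-Poisson grid as a stand-in, which is an assumption of the example; the model's matrix would also carry the tau correction.

```python
import math


def poisson_matrix(lam: float, mu: float, max_goals: int = 8):
    """Stand-in grid: matrix[h][a] = P(home scores h, away scores a)."""
    def pmf(k, rate):
        return math.exp(-rate) * rate ** k / math.factorial(k)
    m = [[pmf(h, lam) * pmf(a, mu) for a in range(max_goals + 1)]
         for h in range(max_goals + 1)]
    total = sum(map(sum, m))
    return [[p / total for p in row] for row in m]


def cells(matrix):
    """Yield (home_goals, away_goals, probability) for every scoreline."""
    for h, row in enumerate(matrix):
        for a, p in enumerate(row):
            yield h, a, p


def result_1x2(matrix):
    home = sum(p for h, a, p in cells(matrix) if h > a)
    draw = sum(p for h, a, p in cells(matrix) if h == a)
    away = sum(p for h, a, p in cells(matrix) if h < a)
    return home, draw, away


def total_over(matrix, line: float) -> float:
    return sum(p for h, a, p in cells(matrix) if h + a > line)


def btts_yes(matrix) -> float:
    return sum(p for h, a, p in cells(matrix) if h > 0 and a > 0)


def home_total_over(matrix, line: float) -> float:
    # Marginalize over away goals: only the home count matters.
    return sum(p for h, a, p in cells(matrix) if h > line)


matrix = poisson_matrix(1.45, 1.25)
home, draw, away = result_1x2(matrix)
over_2_5 = total_over(matrix, 2.5)
```

Because every price comes from the same grid, the markets cannot contradict each other: the three 1X2 probabilities sum to one, and the over and under on any half line do the same.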
Model Track Record
EPL presents an honest backtesting challenge because the season is only 380 matches, a sample too small for hit-rate metrics to be meaningful on their own. Dixon-Coles is validated primarily through Brier score and CLV. The core question is whether the model's probabilities are better calibrated than the market's opening lines. Historical academic replications of Dixon-Coles on multi-decade European football data consistently show it beats closing lines, and our implementation follows the same framework. Live validation happens continuously against bet results on the History page.
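For readers unfamiliar with the two metrics, here is a minimal sketch of each. The function names are hypothetical, the Brier score is the multi-class form over (home, draw, away), and the CLV definition used (price taken relative to the closing price) is one common convention assumed for illustration.

```python
def brier_score(forecasts: list[tuple[float, float, float]],
                outcomes: list[int]) -> float:
    """Mean multi-class Brier score; lower is better.

    Each forecast is (P(home), P(draw), P(away)); each outcome is the index
    of what happened: 0 = home win, 1 = draw, 2 = away win.
    """
    total = 0.0
    for probs, outcome in zip(forecasts, outcomes):
        total += sum((p - (1.0 if i == outcome else 0.0)) ** 2
                     for i, p in enumerate(probs))
    return total / len(forecasts)


def closing_line_value(bet_odds: float, closing_odds: float) -> float:
    """Relative price advantage over the closing line (decimal odds)."""
    return bet_odds / closing_odds - 1.0


# A uniform forecast scores 2/3 whatever the result, a useful reference
# point when reading model and market scores side by side.
baseline = brier_score([(1 / 3, 1 / 3, 1 / 3)], [1])

# Taking a draw at 3.50 that closes at 3.25 is positive CLV.
clv = closing_line_value(3.50, 3.25)
```

Both metrics are usable on a single 380-match season because they score every forecast and every price, not only the bets that happened to win.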
When to Trust This Model (and When Not To)
EPL's weekly rhythm means most matches are clean projection environments. The exceptions are European-competition weeks and international breaks, where fixture context fights the model's assumptions.
Higher confidence:
- Standard Saturday 3pm/5pm/7pm matches with a full week of rest
- Mid-season form (matchweeks 6-34, after early-season noise settles)
- Matches between non-European teams (no congestion variable)
- Asian handicap and totals markets (sharpest lines)
- Draw-friendly matchups where Dixon-Coles' edge is strongest
Lower confidence:
- Post-Champions League midweek (heavy rotation, hard to predict XI)
- First match after an international break (player fatigue and injuries)
- First match under a new manager (bounce is modeled but volatile)
- Final-day matches with nothing to play for (motivation flips)
- Matchweeks 1-3 (priors dominate, little current-season data)
- Correct score markets (always treat as entertainment bets)
The parameter values on this page are served live from the model configuration and refresh periodically; when a weight or threshold changes, this page reflects it automatically.