The lineup optimizer takes a slate’s projected players, the user’s exposure preferences, and the contest’s roster rules, and returns a lineup that maximizes expected outcome subject to the constraints. This post walks through why we use integer linear programming for that job, what “expected outcome” actually means when the optimizer is solving over a Bayesian posterior, and the distinction between hard and soft constraints in our formulation.
The setup
A DFS lineup is a constrained combinatorial problem. The user has to select a fixed-size roster (eight or nine players, depending on the contest) under several layers of constraints:
- Salary cap — the lineup’s total cost cannot exceed the contest’s budget
- Position eligibility — each roster slot accepts a specific set of player positions, and the same player can occupy different slots in different sports’ rule sets
- Sport-specific rules — DraftKings NHL contests require at least three teams represented and prohibit skaters from rostering against their own goalie; FanDuel MLB has different stacking rules than DraftKings; and so on
- User-imposed constraints — minimum or maximum exposure per player or team, required stacks, locked-in players
The combinatorics are large. A typical MLB main slate has 250+ priced players; an NHL slate has 150+. Solving this by exhaustively enumerating roster combinations is computationally infeasible. We need a method that finds the best lineup under the constraints without searching every option.
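To put a number on "computationally infeasible": even ignoring positions and salary, the raw count of nine-player rosters is already out of reach. A two-line check:

```python
import math

# raw roster counts, ignoring all position and salary constraints
mlb = math.comb(250, 9)   # 9-player roster from a 250-player MLB pool
nhl = math.comb(150, 9)   # 9-player roster from a 150-player NHL pool
print(f"{mlb:.2e} {nhl:.2e}")
```

That is on the order of 10^16 candidate rosters for the MLB slate and 10^14 for the NHL slate, before any of the constraints above prune anything.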
Why integer linear programming
Three families of methods can find good lineups under constraints. We considered each.
Greedy heuristics — sort by points-per-dollar, fill slots top-down — are fast but produce systematically suboptimal lineups under tight salary constraints. They cannot respect multi-player rules like stacks or exposure ceilings except through post-hoc filtering.
Genetic algorithms can reach near-optimal solutions through mutation and crossover, but give no guarantee of optimality, require parameter tuning, and produce different answers across runs. The behavior is hard to reason about when a constraint needs to be added or removed.
Integer linear programming formulates the problem as a linear objective over binary roster-inclusion variables, with the constraints expressed as linear inequalities. A modern solver finds the provably optimal lineup (or proves none exists) in seconds for the problem sizes we deal with.
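Schematically, with $x_i \in \{0,1\}$ the inclusion variable for player $i$, $\mu_i$ the projected score, $c_i$ the salary, $B$ the cap, and $R$ the roster size (slot eligibility and sport rules are further linear inequalities, omitted here for brevity):

```latex
\begin{aligned}
\max_{x \in \{0,1\}^{n}} \quad & \sum_{i=1}^{n} \mu_i x_i \\
\text{subject to} \quad & \sum_{i=1}^{n} c_i x_i \le B && \text{(salary cap)} \\
& \sum_{i=1}^{n} x_i = R && \text{(roster size)}
\end{aligned}
```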
We use ILP. The formulation maps cleanly onto the structure of the problem — select a subset of players that satisfies these conditions and maximizes this objective — and the solver’s guarantees mean we can iterate the formulation (add a constraint, remove a constraint, change a weight) without worrying that the result is a search artifact.
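As a minimal sketch of the formulation (this uses `scipy.optimize.milp` for illustration, not our production solver, and the six-player pool is made up; slot eligibility is omitted):

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, milp

# made-up pool: projected points and salaries for six players
points = np.array([20, 9, 8, 7, 6, 5], dtype=float)
salary = np.array([10, 6, 5, 4, 3, 2], dtype=float)
cap, roster_size = 15, 3

# one binary inclusion variable per player; milp minimizes, so negate points
res = milp(
    c=-points,
    constraints=[
        LinearConstraint(salary, 0, cap),                                  # salary cap
        LinearConstraint(np.ones_like(points), roster_size, roster_size),  # exact roster size
    ],
    integrality=np.ones_like(points),  # all variables integer
    bounds=Bounds(0, 1),               # ...and binary
)
picked = np.flatnonzero(res.x > 0.5)
print(picked, points[picked].sum())
```

The solver returns the unique optimum for this pool: the expensive star plus the two cheapest players beats any roster built from the mid-priced players, which is exactly the kind of trade-off greedy points-per-dollar sorting gets wrong.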
Solving over the posterior
The objective the optimizer maximizes is not a single projected score. Each player carries a distribution of projected fantasy-points outcomes — drawn from the Bayesian model’s posterior — and the optimizer can target any percentile of that distribution: the mean (cash-game default), the 75th percentile (tournament leverage), the 90th (high-ceiling GPP), or any other level the user selects.
Pre-computed histograms make this efficient. At slate-load we sample the posterior once per player and bin the draws; the optimizer then sees a discrete distribution per player rather than a single number, and the objective is computed over those distributions. The same solver call that produces the mean-maximizing lineup can produce the ceiling-maximizing lineup just by changing which percentile we target.
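A toy version of the histogram mechanics (the draw count, bin count, and the player's distribution are made up; the production binning differs):

```python
import random
from bisect import bisect_left
from itertools import accumulate

random.seed(7)

def make_histogram(draws, n_bins=50):
    """Bin posterior draws once, at slate-load."""
    lo, hi = min(draws), max(draws)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for d in draws:
        counts[min(int((d - lo) / width), n_bins - 1)] += 1
    return lo, width, counts

def percentile(hist, q):
    """Read any percentile off the binned distribution (bin midpoint)."""
    lo, width, counts = hist
    cum = list(accumulate(counts))
    i = bisect_left(cum, q * cum[-1])
    return lo + (i + 0.5) * width

# hypothetical player: 10,000 posterior draws of fantasy points
hist = make_histogram([random.gauss(12.0, 4.0) for _ in range(10_000)])
p50, p90 = percentile(hist, 0.50), percentile(hist, 0.90)
```

Sampling happens once; after that, switching the optimizer from a cash-game target (`p50`) to a GPP target (`p90`) is a cheap lookup, not a re-sample.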
Optimizing over a point estimate (the posterior mean) collapses this distinction. The result is a single “highest expected score” lineup, which is the right answer for cash games but the wrong answer for tournaments where variance is your friend. Solving over the posterior is what lets the optimizer serve both contest types from the same engine.
Hard versus soft constraints
The optimizer’s constraint set has two layers.
Hard constraints filter the feasible set — they must hold, or the lineup is rejected. Salary cap, position eligibility, sport-specific rules (the DraftKings NHL three-teams minimum, the hard-zero skater-versus-own-goalie rule), and the user’s locked-in players are all hard.
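The hard layer is a pure filter. A sketch of the NHL checks (the record fields and the `feasible_dk_nhl` helper are illustrative, not our actual code):

```python
def feasible_dk_nhl(lineup, cap=50_000):
    """Hard-constraint check for a DraftKings NHL lineup sketch."""
    if sum(p["salary"] for p in lineup) > cap:
        return False                                   # salary cap
    if len({p["team"] for p in lineup}) < 3:
        return False                                   # at least three teams represented
    goalie_opponents = {p["opp"] for p in lineup if p["pos"] == "G"}
    # hard zero: no skater may play against a rostered goalie
    return not any(p["team"] in goalie_opponents
                   for p in lineup if p["pos"] != "G")

lineup = [
    {"pos": "G", "team": "BOS", "opp": "TOR", "salary": 8_000},
    {"pos": "C", "team": "NYR", "opp": "PIT", "salary": 7_000},
    {"pos": "W", "team": "CHI", "opp": "STL", "salary": 6_000},
]
print(feasible_dk_nhl(lineup))  # passes all three checks
```

In the ILP these checks are not post-hoc filters like this sketch; they are compiled into linear inequalities so the solver never visits an infeasible lineup at all.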
Soft weights tilt the objective within the feasible set. A line-mate bonus on (team, ev_line, position_group) adds expected-points credit when two skaters from the same forward line or defense pair appear together — encoding the correlation that line-mates exhibit on the ice. A goalie-team-anchor bonus rewards lineups that pair a goalie with several of his own team’s skaters. These are tilts, not filters: a lineup that would benefit from breaking a soft correlation can still be optimal if the projection delta is large enough.
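A sketch of the tilt (the bonus weight and skater tuples are made up, and the pair key is collapsed to (team, ev_line), dropping position_group for brevity):

```python
from itertools import combinations

LINEMATE_BONUS = 1.5  # illustrative weight, not the production value

# hypothetical skaters: (team, ev_line, projected_points)
lineup = [("BOS", 1, 14.0), ("BOS", 1, 12.5), ("TOR", 2, 13.0), ("NYR", 1, 11.0)]

def objective(players, bonus=LINEMATE_BONUS):
    """Sum of projections plus a soft credit per shared (team, ev_line) pair."""
    score = sum(proj for _, _, proj in players)
    for (t1, l1, _), (t2, l2, _) in combinations(players, 2):
        if (t1, l1) == (t2, l2):
            score += bonus  # a tilt, not a filter: nothing is rejected here
    return score

print(objective(lineup))  # 50.5 raw projection + 1.5 for the BOS line-1 pair
```

Because the bonus is additive in the pair indicators, it stays linear in the ILP's variables; the solver trades it off against raw projection like any other objective term.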
The split is a design choice. Hard constraints encode rules that must not be violated; soft weights encode preferences that should hold all else equal. Putting line-mate correlation in the hard layer would force every lineup to stack a forward line, which is too prescriptive for a tool that has to serve both cash games and contrarian tournament builds.
Sport generalization
A single optimizer, generalized across sports. The MLB-, NFL-, and NHL-specific rules live in a configuration layer that defines roster slots, position mappings, and the hard-constraint set for each sport-and-contest combination. The solver itself does not change.
Adding a new sport requires defining the roster slots, the position eligibility map, and the sport-specific hard rules. The objective function, the constraint compilation, and the solver call are sport-agnostic.
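In shape, a configuration entry looks something like this (the key names, slot labels, and rule identifiers are illustrative, not our actual schema):

```python
# hypothetical configuration layer: one entry per (sport, site) combination
SPORT_CONFIG = {
    ("NHL", "DraftKings"): {
        "slots": ["C", "C", "W", "W", "W", "D", "D", "G", "UTIL"],
        "eligibility": {
            "C": {"C"}, "W": {"LW", "RW"}, "D": {"D"}, "G": {"G"},
            "UTIL": {"C", "LW", "RW", "D"},   # any skater, never the goalie
        },
        "hard_rules": ["min_three_teams", "no_skater_vs_own_goalie"],
    },
}
```

The solver consumes this declaratively: slots become assignment variables, the eligibility map becomes per-slot constraints, and each hard-rule identifier maps to a constraint compiler. A new sport is a new entry, not new solver code.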
What the optimizer does not encode
The optimizer maximizes lineup score under constraints. It does not handle several decisions that sit upstream and downstream of it.
- Ownership. The optimizer optimizes over projected fantasy points, not over projected ownership. A high-projection low-ownership pivot is structurally identical to a high-projection high-ownership chalk play from the optimizer’s perspective. Adjusting for ownership is the gauntlet’s job, not the optimizer’s.
- Portfolio sizing. How many lineups to enter is a separate decision the user makes. The optimizer can produce N diverse lineups under exposure constraints, but it does not advise on what N should be.
- Live lineup management. The optimizer is invoked at slate-load and at refresh; it does not run continuously through the slate. Late-swap recompiles the optimization with the updated player pool.
Limitations
The largest limitation is that the optimizer is exact for the problem we formulate, not for the real-world objective. “Maximize expected fantasy points subject to cap and stacking” is a model; “win the most money” is the actual goal. The two align in cash games and diverge in tournaments — where ownership, field composition, and variance preference all matter. The optimizer’s output is one input to a tournament strategy, not the strategy itself.
The pre-computed histograms also assume the posterior is fixed at the moment we sampled. When news arrives — a starting pitcher gets scratched, a goalie change is announced — the posterior shifts but the optimizer’s histograms do not, until inference re-runs. Lineups generated against stale histograms will reflect the old posterior, not the new one. We mitigate this with intraday delta inference, but the lag is real and worth knowing about.