Methodology

How we think about the OneTouch gauntlet

OneTouch serves a small number of pre-built lineups on every slate, each tagged with a strategy chip (Chalk 4-Stack, Game Stack, Naked Stars, Punt, and others) and a pair of metrics — win percentage and in-the-money percentage — drawn from a simulation against a synthetic GPP field. This post walks through how that simulation works, why we use a synthetic field instead of a uniform-random one, and the design decisions inside the gauntlet that produce the rankings.

The problem

A tournament contest is a comparison problem: a lineup is good not because it scored a high number, but because it scored higher than enough of the field to cash. If the field is full of correlated chalk stacks on the highest-implied team, a lineup that goes contrarian on a leverage spot carries a different edge than it would in a recreational-heavy field of randomly assembled rosters.

A simulation that compares each candidate lineup to a uniform-random field — pick eight or nine players at random subject to roster constraints — systematically overestimates edge. Real tournament fields cluster around chalk teams, include heavy stacks, and have a long tail of well-constructed contrarian builds. The synthetic field we simulate against has to look like that, not like a coin flip.
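The comparison framing above can be made concrete with a small Monte Carlo sketch. This is illustrative only: scores are drawn as independent normals per entry, whereas the real gauntlet models within-lineup correlation, and the function name and payout rule here are stand-ins, not the product's actual code.

```python
import random

def gauntlet_metrics(cand_mean, cand_sd, field, n_sims=2000, paid_frac=0.2, seed=7):
    """Estimate win% and ITM% of one candidate against a synthetic field.

    `field` is a list of (mean, sd) pairs, one per synthetic entry.
    Scores are i.i.d. normal here purely for illustration.
    """
    rng = random.Random(seed)
    wins = itm = 0
    for _ in range(n_sims):
        cand = rng.gauss(cand_mean, cand_sd)
        scores = [rng.gauss(m, s) for m, s in field]
        beaten = sum(s < cand for s in scores)
        if beaten == len(scores):          # outscored the whole field
            wins += 1
        rank = len(scores) - beaten        # entries at or above the candidate
        if rank < paid_frac * (len(scores) + 1):   # inside the paying places
            itm += 1
    return wins / n_sims, itm / n_sims
```

The same lineup run against a chalk-heavy field versus a recreational-heavy field (different `field` lists) will return different metrics, which is the whole point of modeling the field at all.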

The four-bucket archetype mix

For MLB tournaments, we generate the synthetic field as a mix of four archetypes:

  • Recreational random (37%) — weighted-random rosters drawn from the full player pool, no stacking constraints imposed. These approximate casual entrants playing without an optimizer.
  • Chalk 4-stack (28%) — full four-batter stacks from the top-implied-total teams of the slate. The bucket that fields cluster on.
  • Strategic 3-4 stack, ceiling-weighted (20%) — stacks built with a tilt toward upside rather than mean, often including one or two players outside the slate’s top-ownership positions.
  • Contrarian value-per-dollar (15%) — rosters that optimize points-per-dollar rather than absolute points, pulling in low-ownership leverage plays.
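Generating the field from this mix is a weighted draw over archetype generators. A minimal sketch, assuming each archetype is exposed as a zero-argument function that returns one roster (the generator functions and dict shape here are hypothetical; only the bucket names and percentages come from the post):

```python
import random

# Bucket names and weights are from the post; everything else is a stand-in.
BUCKET_MIX = [
    ("recreational_random", 0.37),
    ("chalk_4_stack",       0.28),
    ("strategic_ceiling",   0.20),
    ("contrarian_value",    0.15),
]

def build_synthetic_field(generators, field_size=10000, seed=42):
    """Draw each synthetic entry from an archetype chosen by the bucket mix.

    `generators` maps archetype name -> zero-arg function returning a lineup.
    """
    rng = random.Random(seed)
    names = [n for n, _ in BUCKET_MIX]
    weights = [w for _, w in BUCKET_MIX]
    field = []
    for _ in range(field_size):
        bucket = rng.choices(names, weights=weights, k=1)[0]
        field.append((bucket, generators[bucket]()))
    return field
```

Drawing per-entry (rather than allocating fixed counts per bucket) keeps the mix a distributional assumption rather than an exact quota, which matches how the percentages are described: a prior, not a measurement.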

The percentages are not derived from an ownership-distribution fit; we have not run that analysis. They reflect a calibration against the GPP entries we have observed at similar slate sizes — the team’s prior, not a posterior. The gauntlet’s metrics inherit whatever bias comes with that assumption.

For NHL tournaments the archetypes are NHL-native: forward line stacks (three forwards from the same trio), goalie-anchor builds (the goalie’s team well-represented), 3-3 game stacks, naked-stars builds, and punts.

Candidate diversity

The other half of the gauntlet is the candidate side — the lineups we serve to the user. A naive optimizer would return the salary-cap-maximizing lineup, repeatedly, with minor exposure changes. That single lineup family is not a portfolio; it is a single bet rendered seven times.

The candidate generator iterates the optimizer over four additive constraint hooks, producing structurally distinct lineups:

  • Required game stack — force a 4-2 / 2-4 / 3-3 split across two teams in the same game so the lineup carries a game’s worth of correlation
  • Bring-back — when a stack is forced, require a player from the opposing team in the same game as a hedge
  • Max-team-batters override — a “naked stars” build capped at two players per team, so the lineup is wide rather than stacked
  • Punt min/max — force a low-salary player into a premium position, freeing salary for stars elsewhere
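The hook mechanism can be sketched as additive transforms over a base constraint set, one optimizer run per hook. This is a shape sketch, not the product's implementation: the hook names mirror the list above, but the constraint-dict keys and the idea that hooks are plain functions are assumptions made for illustration.

```python
# Hypothetical hooks: each takes a base constraint dict and returns a new
# dict with its structural requirement added. Keys are illustrative only.
def game_stack(c):   return {**c, "game_stack": ("4-2", "2-4", "3-3")}
def bring_back(c):   return {**c, "bring_back": True}
def naked_stars(c):  return {**c, "max_team_batters": 2}
def punt(c):         return {**c, "punt_slot": "premium"}

HOOKS = [game_stack, bring_back, naked_stars, punt]

def candidate_constraints(base):
    """Yield one constraint set per hook, applied additively to the base.

    A real generator would hand each set to the optimizer and collect the
    resulting lineup; here we just enumerate the constraint sets.
    """
    for hook in HOOKS:
        yield hook(dict(base))
```

Because each hook forces a different roster structure, the optimizer cannot collapse all four runs onto the same salary-maximizing skeleton, which is what makes the resulting diversity structural rather than exposure-level.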

The candidate set across these hooks is several times larger than the unconstrained optimizer would produce, and the diversity is structural — not exposure shifts on the same skeleton.

What the gauntlet does not encode

The gauntlet estimates how a given lineup performs against the synthetic field we built, not how it performs against the actual field on the night. The synthetic field is a reasonable model; it is not a measurement.

Two specific things the gauntlet does not encode:

  • Ownership shifts during late-swap. The synthetic field is generated once at slate-load and held fixed through the gauntlet run. If a high-implied team’s pitcher gets scratched at 6:55 and ownership re-distributes across the slate, the gauntlet’s metrics from earlier in the day are no longer current. The candidate set is regenerated whenever inference is rerun, but the gauntlet’s reading of the field is a snapshot, not a stream.
  • Cross-slate correlation. The gauntlet treats each slate independently. A player who appears in lineups on both the early slate and the main slate is sampled twice, with no shared draws across slates.

Limitations

The largest limitation is the bucket-mix calibration. The 37 / 28 / 20 / 15 split is, as noted above, a prior rather than a posterior: it is a reasonable estimate of GPP field composition, but it has not been fit to actual contest results, and adjusting the split would change the win-percentage and ITM-percentage estimates the user sees.

The candidate side has a similar limitation: the constraint hooks produce diverse lineups, but the diversity is along the dimensions we chose to vary. A user who wants a build type that does not fit one of the four archetypes — a “stars and scrubs” pitcher pivot, for example — would not see it represented in OneTouch’s candidates.

Both of these are knobs the gauntlet exposes deliberately. They are not free parameters fit to data; they are design choices we are willing to defend, but not without acknowledging that the calibration is an open question.

See it inside the product.

Get Started