Statistics can lie — here's how, and what Rankquant does about it
By Ryan Siegal · Founder and Principal
Why this series exists
Rankquant's pitch is that statistics done well produces better consumer decisions than averaging done badly. That pitch has a corollary: statistics done badly can be worse than averaging — because it adds the appearance of rigor without the substance. Every well-known statistical fallacy has a review-aggregation version, and a methodology that doesn't name and pre-commit against them is just averaging in a tuxedo.
We'd rather name them. Each post in this series picks one fallacy, walks through a worked example, and points at the specific methodology constant that prevents Rankquant from falling into it. Where a defence is structural (pre-committed in writing), we say so. Where it's editorial ("trust us not to abuse this"), we say so even more loudly, because that's where the real risk lives.
The series so far
Simpson's paradox in product reviews
When a wine ranked #1 in Burgundy is #100 globally — and how cohort percentiles tell both stories without flipping either.
The small-sample illusion
The math behind why a 5-star average from 4 reviewers is not a 5-star product. The 90% CI-floor encodes Tversky and Kahneman's "law of small numbers" in code.
Locked source weights
Goodhart's Law applied to review aggregation. Why pre-committing to source weights and version-bumping them publicly is the structural defence against picking winners after the fact.
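To make the small-sample entry concrete, here is a minimal sketch of a one-tailed 90% CI-floor. It assumes scores are expressed in units where a single review has a standard deviation of 1, which is what SE = 1/√N_eff implies. The function name `ci_floor` and the 4.6-star, 400-reviewer comparison product are illustrative assumptions, not Rankquant's published code or data.

```python
from math import sqrt
from statistics import NormalDist

# Critical value for a one-tailed 90% bound (about 1.2816).
Z_90_ONE_TAILED = NormalDist().inv_cdf(0.90)


def ci_floor(mean: float, n_eff: float) -> float:
    """Lower confidence bound: mean minus z times SE, with SE = 1/sqrt(n_eff).

    Assumption: the per-review standard deviation is 1 in score units.
    """
    if n_eff <= 0:
        raise ValueError("n_eff must be positive")
    return mean - Z_90_ONE_TAILED / sqrt(n_eff)


thin = ci_floor(5.0, 4)     # 5-star average from 4 reviewers
thick = ci_floor(4.6, 400)  # hypothetical 4.6 average from 400 reviewers

print(round(thin, 2), round(thick, 2))  # 4.36 4.54
```

Under these assumptions the thin-sample product's floor lands below the thick-sample product's, even though its raw mean is higher.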
Planned future posts
Each will follow the same structure: name the fallacy, show the worked example, point at the structural defence in our methodology.
- Survivorship bias — why we publish the full database, not just the winners
- The ecological fallacy — what a category mean does and does not tell you about an individual product
- P-hacked confidence levels — why we declared 90% one-tailed in writing before computing anything
- Selection bias in reviewer admission — what the n_u ≥ 2 rule is actually for
- Texas-sharpshooter cohorts — why the ±20% price band was chosen ex ante
- Goodhart's Law for affiliate routing — separating the score from the buy-link
The principle behind the series
The right way to defend against statistical malpractice is to make malpractice structurally visible — to commit to the constants, the cohort definitions, and the confidence levels in writing, on a dated page, before any score is published. If we change them later, the change is itself a published event with a version bump and a rationale. Anyone — readers, journalists, statisticians, AI engines — can see the difference between the methodology as it was when a score was computed and the methodology as it is today.
That is what separates an editorial product from a marketing one. The editorial product invites you to argue with the methodology. The marketing product invites you to trust the brand. Rankquant is the first kind.
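One way to picture the "dated page plus version bump" commitment is as an append-only log of methodology versions that can be queried by date. The sketch below is an illustration only: the class names, the field set, the dates, and the version 1.1 band change are all hypothetical, and nothing here describes how Rankquant actually stores its methodology.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MethodologyVersion:
    version: str
    effective: date          # date of the page this version was published on
    confidence_level: float  # 0.90 for the one-tailed 90% level
    price_band: float        # 0.20 for the ±20% cohort price band
    rationale: str           # why this version was published


class MethodologyLog:
    """Append-only record: versions are added, never edited or removed."""

    def __init__(self) -> None:
        self._versions: list[MethodologyVersion] = []

    def publish(self, v: MethodologyVersion) -> None:
        if self._versions and v.effective <= self._versions[-1].effective:
            raise ValueError("new versions must be dated after the last one")
        self._versions.append(v)

    def as_of(self, day: date) -> MethodologyVersion:
        """Return the version that was in force on the given day."""
        in_force = [v for v in self._versions if v.effective <= day]
        if not in_force:
            raise LookupError("no methodology published on or before that date")
        return in_force[-1]


log = MethodologyLog()
# Hypothetical dates and values, for illustration only.
log.publish(MethodologyVersion("1.0", date(2024, 1, 1), 0.90, 0.20, "initial commitment"))
log.publish(MethodologyVersion("1.1", date(2025, 1, 1), 0.90, 0.25, "hypothetical band change"))

print(log.as_of(date(2024, 6, 1)).version)  # methodology behind a mid-2024 score
print(log.as_of(date(2025, 6, 1)).version)  # methodology in force later
```

Because old versions are never overwritten, a score computed in mid-2024 can always be traced to the constants that produced it.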
| Fallacy | Structural defence |
|---|---|
| Simpson's paradox | Cohort percentiles are computed as a re-ranking of the same global CI-floor, never as a separate computation. Both numbers are always shown together so cross-cohort flips are visible, not hidden. |
| Small-sample illusion | 90% one-tailed CI-floor with SE = 1/√N_eff. A thin-sample mean cannot rank ahead of a thick-sample mean unless the gap exceeds the SE penalty. |
| Goodhart's Law on weights | Source weights published on day one and version-bumped publicly when they change. Historical scores stay queryable at the weights that produced them. |
| Survivorship bias | Full database — all 0–100 percentiles — is published. We don't curate "best of" without showing the rest. |
| Ecological fallacy | Every score is labelled with the peer set it was computed against. Cross-peer-set comparisons are explicitly flagged as not directly meaningful. |
| P-hacked confidence | 90% one-tailed pre-committed in writing before any score was computed. Documented in /methodology and /theory/confidence-intervals. |
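
The Simpson's paradox row can be sketched as follows: a single CI-floor per product is computed once, and both the global and the cohort percentile are rankings of that same number. The product names, cohort labels, floor values, and the "share of peers at or below" percentile definition are all assumptions made for illustration.

```python
def percentile_rank(value: float, peers: list[float]) -> float:
    """Percentage of peers whose floor is at or below the given value (0-100)."""
    return 100.0 * sum(1 for p in peers if p <= value) / len(peers)


# Hypothetical products: each carries one global CI-floor, computed once.
products = [
    {"name": "A", "cohort": "Burgundy", "floor": 3.1},
    {"name": "B", "cohort": "Burgundy", "floor": 2.8},
    {"name": "C", "cohort": "Burgundy", "floor": 2.5},
    {"name": "D", "cohort": "Other", "floor": 4.2},
    {"name": "E", "cohort": "Other", "floor": 4.0},
    {"name": "F", "cohort": "Other", "floor": 3.9},
    {"name": "G", "cohort": "Other", "floor": 3.6},
]


def both_percentiles(items: list[dict], name: str) -> dict:
    """Rank the same floor twice: against everyone, and against the cohort."""
    target = next(p for p in items if p["name"] == name)
    global_floors = [p["floor"] for p in items]
    cohort_floors = [p["floor"] for p in items if p["cohort"] == target["cohort"]]
    return {
        "cohort": target["cohort"],
        "global_percentile": percentile_rank(target["floor"], global_floors),
        "cohort_percentile": percentile_rank(target["floor"], cohort_floors),
    }


result = both_percentiles(products, "A")
print(result)
```

In this toy data, product A sits at the top of its cohort and below the middle of the global ranking. Reporting the two percentiles side by side shows the flip instead of hiding it.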
Frequently asked questions
Are these fallacies hypothetical, or has Rankquant actually run into them?
Why publish the failure modes? Doesn't this give competitors a playbook?
Where does editorial trust enter the picture?
Related: Full methodology · Founding metrics — the five primitives · Why we rank on the 90% CI-floor