Why Rotten Tomatoes' Tomatometer is statistically meaningless
By Ryan Siegal · Founder and Principal
What the Tomatometer actually measures
Despite looking like a 0-100 score, the Tomatometer is a percentage of reviewers, not a percentage of quality. Each critic's full-text review is assigned a binary fresh/rotten label by a Rotten Tomatoes editor using a threshold (typically around 60% on the reviewer's native scale, or overall positive sentiment). The Tomatometer is the fraction of those labels that come out "fresh".
So a film where 100 critics all wrote glowing 95/100 reviews has the same Tomatometer as one where 100 critics all wrote barely-positive 62/100 reviews: 100%. The first film is universally acclaimed; the second is universally considered mediocre-but-watchable. The Tomatometer cannot distinguish them.
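The comparison above can be reproduced in a few lines. This is a minimal sketch, not Rotten Tomatoes' actual pipeline: the cutoff of 60 and the choice to treat a score of exactly 60 as fresh are assumptions, and `tomatometer` and `mean_score` are illustrative names.

```python
# Assumed cutoff on a 0-100 scale; the article says "typically around 60%".
FRESH_THRESHOLD = 60


def tomatometer(scores, threshold=FRESH_THRESHOLD):
    """Percentage of reviews labelled fresh (score at or above the threshold)."""
    fresh = sum(1 for s in scores if s >= threshold)
    return 100 * fresh / len(scores)


def mean_score(scores):
    """Plain unweighted mean, standing in for a cardinal aggregate."""
    return sum(scores) / len(scores)


acclaimed = [95] * 100   # 100 critics, all glowing
watchable = [62] * 100   # 100 critics, all barely positive

for name, scores in [("acclaimed", acclaimed), ("watchable", watchable)]:
    print(name, tomatometer(scores), mean_score(scores))
```

Both films print a Tomatometer of 100.0, while the means come out at 95.0 and 62.0. Under the same assumed cutoff, a single review at 59 scores 0% and a single review at 61 scores 100%.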
The Tomatometer is a binary — every critic review is reduced to one bit of information.
Rotten Tomatoes methodology documentation
Metacritic's weighted Metascore preserves the cardinal score with editor-assigned source weights.
Metacritic methodology
Rotten Tomatoes labels a reviewer's score 'fresh' at a threshold of around 60%. A 59 is rotten; a 61 is fresh. A two-point difference in the review flips that critic's contribution to the aggregate.
Published editorial guidelines
The information loss is quantifiable
In information-theoretic terms, a binary label carries at most 1 bit of information. A 100-point score carries up to about 6.6 bits (log₂ 100 ≈ 6.64). The Tomatometer therefore throws away roughly 85% (1 − 1/6.64) of the information capacity in each critic's review before aggregation.
That loss compounds at the aggregate level. With 100 critic reviews of a film:
- Cardinal aggregate (Metacritic): 100 critics × 6.6 bits ≈ 660 bits of signal before weighting, with the weighted mean recovering most of it.
- Tomatometer: 100 critics × 1 bit = 100 bits, with the percentage recovering a small fraction.
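The arithmetic in this section can be checked directly. The sketch below treats each figure as an upper bound, assuming every value on a scale is equally likely (real review scores are not uniform, so the true information content is lower on both sides). `max_bits` is an illustrative helper name.

```python
import math


def max_bits(levels):
    """Upper bound on the information in one value with `levels` equally likely outcomes."""
    return math.log2(levels)


bits_binary = max_bits(2)      # one fresh/rotten label
bits_cardinal = max_bits(100)  # one score on a 100-point scale
fraction_lost = 1 - bits_binary / bits_cardinal

critics = 100
print(f"per review: {bits_binary:.0f} vs {bits_cardinal:.2f} bits, {fraction_lost:.0%} lost")
print(f"across {critics} reviews: {critics * bits_binary:.0f} vs {critics * bits_cardinal:.0f} bits")
```

The unrounded totals are about 6.64 bits per review and 664 bits across 100 reviews, which the text rounds to 6.6 and 660.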
In practice, this shows up as the Tomatometer's inability to distinguish "broadly liked" from "universally acclaimed." Both can pin at 95-100%. A cardinal aggregate would show the former at 70-80 and the latter at 90-95.
Why the binary exists anyway
The fresh/rotten binary made sense when RT launched in 1998: most critics weren't using numeric scales, and a binary classification worked around the heterogeneity of star counts, letter grades, and pure-prose reviews. "Recommendation or not" was a lowest common denominator.
It's 2026 and almost every publication now publishes a numeric score or at minimum supplies a 5-star equivalent. The technical justification for the binary has disappeared. The Tomatometer format survives because it's instantly understandable and has become a marketing signal in its own right.
Rotten Tomatoes has become, despite itself, the Michelin Guide of American cinema. A binary built as a convenience now shapes what studios finance.
Metacritic's approach has its own problems
Metacritic's Metascore is a weighted mean of critic scores on a 0-100 scale, with source weights assigned by Metacritic editors. That preserves the cardinal information the Tomatometer discards, but it has problems of its own:
- The source weights are opaque — Metacritic doesn't publish them.
- Not all critics supply a numeric score; Metacritic's editors convert prose to a number using unpublished heuristics.
- The scale still suffers inflation pressure from contemporary critic norms.
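For reference, the weighted mean described at the top of this section looks roughly like this. Because Metacritic does not publish its weights, every weight below is made up for illustration, and `weighted_mean_score` is a hypothetical name rather than anything Metacritic exposes.

```python
def weighted_mean_score(reviews):
    """Weighted mean of (score, weight) pairs, with scores on a 0-100 scale."""
    total_weight = sum(weight for _, weight in reviews)
    if total_weight == 0:
        raise ValueError("at least one review needs a non-zero weight")
    return sum(score * weight for score, weight in reviews) / total_weight


# Hypothetical reviews: (score, source weight). The weights are invented.
reviews = [(90, 1.5), (70, 1.0), (60, 0.5)]
print(round(weighted_mean_score(reviews), 1))
```

Changing only the weights moves the result even though the scores stay the same, which is why unpublished weights make the Metascore hard to audit.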
Metacritic is better than Rotten Tomatoes but still not what a consumer needs for a confident recommendation. The ideal tool combines the Metacritic-style cardinal aggregate with within-category z-scoring, which is what Rankquant does for movies. Its source weights are published, its scale is deterministic, and its output is a 1-5 score where a 5 means the top 5% of the peer set.
What Rankquant does for movies
Rankquant's movie source weights (published at /methodology):
- Metacritic (pre-weighted professional aggregate) — 8
- Rotten Tomatoes (critics) — 6 (binary loses info)
- Letterboxd (enthusiast audience, cleaner signal than IMDb) — 6
- IMDb (weighted average) — 5
The Tomatometer is included but down-weighted because of the binary-information-loss problem. Metacritic carries more weight because it preserves cardinal information. Letterboxd is a cleaner crowd signal because its audience is cinephiles rather than general viewers.
Normalized across the peer set (genre × era × budget tier), the resulting 1-5 score distinguishes "broadly liked" from "universally acclaimed" — something no single source alone can do.
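A rough sketch of that two-step idea, a weighted composite followed by z-scoring within a peer set, is below. The source weights are the ones listed above; everything else is an assumption: the three films and their scores are invented, all sources are assumed to be pre-scaled to 0-100, and the final mapping from z-score to the 1-5 output is left out because the article does not specify it. This is not Rankquant's actual implementation.

```python
from statistics import mean, pstdev

# Source weights as listed in this section.
SOURCE_WEIGHTS = {"metacritic": 8, "rotten_tomatoes": 6, "letterboxd": 6, "imdb": 5}


def composite(scores):
    """Weighted mean over the sources present; scores assumed pre-scaled to 0-100."""
    total = sum(SOURCE_WEIGHTS[src] for src in scores)
    return sum(SOURCE_WEIGHTS[src] * val for src, val in scores.items()) / total


def z_scores(values):
    """Standardize values against their own (population) mean and spread."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical peer set (same genre, era, and budget tier); all numbers invented.
peer_set = {
    "Film A": {"metacritic": 92, "rotten_tomatoes": 98, "letterboxd": 88, "imdb": 85},
    "Film B": {"metacritic": 74, "rotten_tomatoes": 97, "letterboxd": 70, "imdb": 72},
    "Film C": {"metacritic": 55, "rotten_tomatoes": 48, "letterboxd": 60, "imdb": 62},
}

composites = {title: composite(scores) for title, scores in peer_set.items()}
zs = z_scores(list(composites.values()))
for title, z in zip(composites, zs):
    print(title, round(composites[title], 1), round(z, 2))
```

Film A and Film B have nearly identical Tomatometers in this made-up data, yet their composites separate clearly once the cardinal sources are weighted in.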
Frequently asked questions
Is Rotten Tomatoes useless?
Doesn't RT also have an "Audience Score"?
What's the argument FOR binary aggregates?
Does Rankquant include audience scores or only critics?