Theory & Derivations
Theory & derivations
Start here: the founding metrics
New to the methodology? Read Founding metrics — the five statistics primitives behind every Rankquant percentile first. It is a first-principles tour of the five textbook ingredients (z-score, standard error, 90% CI-floor, empirical CDF, Kish design effect) the rest of the pipeline composes.
The three load-bearing concepts
| Founding metrics (start here) | First-principles tour of the five primitives the pipeline rests on: z-score, standard error, 90% CI-floor, empirical CDF, Kish design effect. Each primitive paired with its canonical statistical reference. |
|---|---|
| Degrees of freedom (df) | Why reviewer σ_u uses n_u − 1 (Bessel), why we require n_u ≥ 2, how effective sample size N_eff enters the R2 weighted aggregation, and why a reviewer with df = 1 is admitted but implicitly penalized at the CI-floor step. |
| Confidence intervals (CI-floor ranking) | Why we rank on the lower bound of the 90% CI rather than the mean. Derivation of SE(Ẑ) = 1/√N_eff for R1, R2, R3. Why 90% and not 95%. Analytical comparison to Wilson score and IMDb-style Bayesian shrinkage. |
| Inter-rater reliability (reviewer-level) | ICC(1,1) variance decomposition across reviewers, how per-reviewer normalization removes reviewer main-effects, Cohen's κ for binary reviewer decisions, and how low agreement informs R2 weight calibration. |
Each step of the pipeline, with its statistical justification
Rankquant's methodology (/methodology) is a four-step compositional estimator. The table below maps each step to the statistical property that keeps it rigorous, and points at the deep-dive page that justifies it.
| Step | Estimator | Statistical property |
|---|---|---|
| 1 | zu,i = (ru,i − μu) / σu | Per-reviewer studentization. σu uses Bessel correction (df = nu − 1). Removes reviewer main-effects; dimensionless output. |
| 2a | R1(i) = meanu zu,i | Unweighted reviewer-fixed-effects aggregation. Unbiased estimator of the product's population mean standardized rating. |
| 2b | R2(i) = weighted mean with ws | Source-weighted aggregation. Effective sample size Neff = (Σ ws)² / Σ ws² (Kish). |
| 2c | R3(i) with imputed σ̃s | Broadened pool includes σu = 0 reviewers via pooled source-level SD imputation. Trades a small bias for higher N; see /theory/degrees-of-freedom/. |
| 3 | floor = Ẑ − 1.645 · SE(Ẑ) | One-tailed 90% CI lower bound. SE(Ẑ) = 1/√Neff under unit-variance z-scores. Penalizes uncertainty; same family as Wilson score / Reddit's "best" sort. |
| 4 | p = empirical-CDF rank of floor | Nonparametric ranking. Robust to non-normality of the underlying floor distribution. Global p uses all products; cohort p uses the ±20%-price peer set. |
| Cohort | re-rank same floor within Cohort(i) | No new estimator. Cohort score is a view, not a computation. Guarantees internal consistency: cohort rank is a deterministic function of CI-floors. |
Each sub-tab above unpacks one of the three load-bearing concepts in full. Read in order (Degrees of freedom → Confidence intervals → Inter-rater reliability), or jump to the one you need. All pages cite primary statistical sources; nothing here is proprietary theory — what's proprietary is publishing the details, committing to the constants, and running the pipeline on real data at scale.
Prerequisites: basic familiarity with mean, standard deviation, and the normal distribution. For term-by-term definitions see /glossary/statistics-terms.