How We Score Confidence: A Transparent Method
Every data point on BehindTheHate carries a confidence score. Here is exactly how we calculate it and why 'insufficient evidence' is a valid answer.
Every claim, chart, and timeline entry on BehindTheHate carries a confidence level. This is not decoration. It is a core feature that distinguishes evidence-backed assertions from editorial inference.
The four factors
Confidence scores are derived from four weighted factors:
1. **Source diversity** (30% weight) - How many independent source families corroborate the claim? A finding backed by official statistics, survey data, and academic research scores higher than one backed only by media reports.
2. **Coverage** (25% weight) - What is the geographic and temporal completeness of the underlying data? FBI hate crime data covers approximately 85% of the US population. ACLED covers 200+ countries but with varying depth.
3. **Recency** (20% weight) - When was the source last updated? A 2024 survey scores higher than a 2018 survey on the same topic.
4. **Methodological rigor** (25% weight) - Was the data collected via peer-reviewed methodology? Is the sample size adequate? Are known biases documented?
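The weighting above can be sketched as a simple weighted sum. This is a minimal illustration, not the production implementation: it assumes each factor has already been normalized to a value between 0 and 1, which the method description does not specify, and the names `WEIGHTS` and `weighted_confidence` are hypothetical.

```python
# Weights as stated in the four-factor list (they sum to 100%).
WEIGHTS = {
    "source_diversity": 0.30,
    "coverage": 0.25,
    "recency": 0.20,
    "methodological_rigor": 0.25,
}


def weighted_confidence(factor_scores):
    """Combine four factor scores into one number between 0 and 1.

    Assumption: every factor score is already normalized to [0, 1].
    How each factor is normalized is not defined here.
    """
    missing = set(WEIGHTS) - set(factor_scores)
    if missing:
        raise ValueError(f"missing factor scores: {sorted(missing)}")
    for name in WEIGHTS:
        value = factor_scores[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return sum(WEIGHTS[name] * factor_scores[name] for name in WEIGHTS)


# Illustrative input values only, not real BehindTheHate data.
example = {
    "source_diversity": 1.0,
    "coverage": 0.85,
    "recency": 0.5,
    "methodological_rigor": 1.0,
}
print(round(weighted_confidence(example), 2))
```

With these illustrative inputs the combined value comes out at roughly 0.86. Raising an error on a missing factor, rather than treating it as zero, keeps an absent measurement from being silently read as a bad one.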
Scoring thresholds
- **High**: 3+ independent source families, 80%+ coverage, updated within 2 years, peer-reviewed methodology
- **Medium**: 2+ source families, 50%+ coverage, updated within 5 years, documented methodology
- **Low**: any one of the following: a single source family, limited coverage, data older than 5 years, or methodology concerns
- **Insufficient**: Cannot be scored due to lack of data
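The thresholds above can be read as a small decision procedure. The sketch below is one possible reading, under stated assumptions: all conditions within a tier must hold together, anything that can be scored but misses the Medium bar falls to Low, and a record with no sources or a missing measurement is treated as insufficient. The `Evidence` class, its field names, and the methodology labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evidence:
    """Hypothetical record describing the evidence behind one data point."""
    source_families: int                  # independent source families
    coverage: Optional[float]             # fraction of population/area, 0 to 1
    years_since_update: Optional[float]   # age of the newest source, in years
    methodology: Optional[str]            # "peer_reviewed", "documented", "concerns"


def confidence_level(e):
    """Map evidence to high / medium / low / insufficient.

    Assumption: the conditions listed for a tier are combined with AND.
    """
    # Nothing to score: no sources, or a required measurement is unknown.
    if e.source_families < 1 or None in (
        e.coverage, e.years_since_update, e.methodology
    ):
        return "insufficient"
    if (
        e.source_families >= 3
        and e.coverage >= 0.80
        and e.years_since_update <= 2
        and e.methodology == "peer_reviewed"
    ):
        return "high"
    if (
        e.source_families >= 2
        and e.coverage >= 0.50
        and e.years_since_update <= 5
        and e.methodology in ("peer_reviewed", "documented")
    ):
        return "medium"
    # Scoreable, but below the Medium bar on at least one condition.
    return "low"


# Illustrative values only.
print(confidence_level(Evidence(3, 0.85, 1, "peer_reviewed")))
print(confidence_level(Evidence(0, None, None, None)))
```

Checking for missing data first means a gap is reported as a gap instead of being mistaken for weak evidence.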
Why "insufficient" matters
Many datasets about inter-group hostility have enormous gaps. Sub-Saharan Africa, Central Asia, and many Pacific Island nations have minimal hate crime reporting infrastructure. Rather than forcing a score from thin data, we mark these as "insufficient evidence" and show the coverage gap explicitly.
This is uncomfortable for users who want a complete global picture. But showing confident numbers where none exist would be worse.