
Methodology & data quality

Every data point in the AI Toll corpus is collected from verified primary sources, classified according to a published taxonomy, and documented with full provenance. This page describes how.

Data collection

The corpus is populated by automated collection pipelines that harvest data from institutional APIs, government databases, academic repositories, and verified news sources. No data is AI-generated. No data is fabricated. Every entry traces back to a verifiable primary source.

Primary source categories

| Source type | Examples | Collection method |
|---|---|---|
| Academic databases | OpenAlex, PubMed, arXiv, CrossRef, DBLP | REST API queries with domain-specific filters |
| Government databases | NVD (NIST), NHTSA, SEC/EDGAR, FDA, BLS | Official public APIs |
| International organisations | World Bank, WHO, ILO, OECD, V-Dem, SIPRI | Statistical APIs, bulk CSV |
| News aggregators | GDELT, Google News RSS | Filtered by domain keywords, deduplicated |
| Technical and practitioner forums | Reddit, Hacker News, Stack Exchange | Public JSON/search APIs; used for early signal detection, not scored directly in the index |
| Incident databases | AI Incident Database (AIID) | Direct API integration |
| Ecosystem trackers | HuggingFace, GitHub, PyPI | Platform APIs for metadata |
| Consumer protection | CFPB, UK Police Data | Bulk data downloads, keyword filtering |
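As an illustration of the REST-based collection above, the sketch below builds a query URL for the public OpenAlex `/works` endpoint, filtering by title keywords and a publication-date cutoff. The keyword list and date are placeholders; the pipeline's actual filters are not published.

```python
from urllib.parse import urlencode

def build_openalex_query(keywords, from_date, per_page=200):
    """Construct an OpenAlex /works query URL with a title-keyword
    filter and a from-date cutoff. Filter syntax follows the public
    OpenAlex REST API; the keyword list here is illustrative."""
    filters = [
        f"title.search:{'|'.join(keywords)}",       # OR over keywords
        f"from_publication_date:{from_date}",       # ISO date, inclusive
    ]
    params = {"filter": ",".join(filters), "per-page": per_page}
    return "https://api.openalex.org/works?" + urlencode(params)
```

The returned URL can then be fetched and paginated by any HTTP client; per-source adapters of this shape keep the collection layer auditable.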

Quality assurance

Data quality is enforced at every stage of the pipeline.

Verification and exclusion criteria

The company applies strict inclusion criteria: an entry is accepted only if it meets every condition in the verification checklist, and material failing any check is excluded by design.

Excluded entries are moved to a quarantine archive, not deleted. Every exclusion decision is logged and auditable.

Incident methodology

A core distinction in the corpus is between articles and incidents. An article is a single document from a single source. An incident is a real-world event; it may be covered by one article or by dozens.

When a teenager's suicide is linked to an AI chatbot, that single event generates coverage across multiple outlets and languages. Each article is collected and classified individually, but for scoring purposes, they are clustered into a single incident record using title similarity, date proximity, country, and domain matching.

The Toll AI Index uses incident density, not article density. A country is not penalised for having more media coverage of the same event. This prevents media attention from inflating risk scores and ensures that countries with a free, active press are not artificially rated as higher-risk than countries where incidents go unreported.

Each incident record carries full provenance back to its constituent articles.

Severity framework

Every incident and regulatory entry is assigned a severity level from 1 to 5 based on documented consequences, not potential harm or media coverage. The scale is anchored on observable, verifiable criteria:

| Level | Label | Observable criterion |
|---|---|---|
| 1 | Informational | No incident. Analysis, research, policy discussion, or industry announcement with no documented harm. |
| 2 | Low | Incident documented but contained or corrected. No formal legal or regulatory consequence. No physical harm. |
| 3 | Medium | Official investigation or inquiry opened, formal complaint filed, or documented harm to specific individuals with measurable impact. |
| 4 | High | Formal legal action filed, regulatory enforcement with fine or ban, or systematic discrimination documented and affecting an identifiable group. |
| 5 | Critical | Documented death, serious physical injury, or mass fundamental rights violation directly caused by or linked to an AI system. |

The threshold between levels is defined by formal action: no authority action caps an entry at Level 2; an investigation opened raises it to Level 3; a lawsuit filed or fine issued to Level 4; documented death or serious injury to Level 5. This escalation ladder is binary, not interpretive.
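The escalation ladder can be expressed as a short decision function. The boolean flags below are hypothetical field names for illustration; the real pipeline derives these signals from entry metadata.

```python
def severity_level(death_or_serious_injury, lawsuit_or_fine,
                   investigation_opened, documented_incident):
    """Binary escalation ladder: each documented formal action unlocks
    the next level, and absence of authority action caps an entry at 2."""
    if death_or_serious_injury:   # documented death or serious injury
        return 5
    if lawsuit_or_fine:           # lawsuit filed or fine issued
        return 4
    if investigation_opened:      # official investigation or complaint
        return 3
    if documented_incident:       # harm documented but no formal action
        return 2
    return 1                      # analysis or discussion, no incident
```

Because each branch tests a verifiable fact rather than a judgment, two reviewers given the same evidence reach the same level.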

Entries where severity cannot be assessed from available information are left unscored rather than estimated. The full severity framework, including decision rules and regulation-specific scoring, is available to institutional reviewers on request.

Toll AI Index

The Toll AI Index is the company's flagship product — a composite score that measures where AI harm occurs. Each incident is attributed to the country or countries where the impact is documented. Technology origin is tracked separately and does not affect country scores.

The index is built from five dimensions, each of which requires the company's full dataset to compute:

1. Verified Harm Score (30%)

Counts verified L4 (high) and L5 (critical/death) incidents per country: each L5 incident scores 10 points and each L4 incident 5. Only entries with high-confidence verification or an LLM-based severity assessment are counted. This dimension measures actual documented harm, not media noise.
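A minimal sketch of the point tally, assuming each entry carries a `severity` level and a `verified` flag (illustrative field names, not the production schema):

```python
def verified_harm_points(incidents):
    """Sum harm points over verified incidents: L5 contributes 10,
    L4 contributes 5, lower levels contribute nothing."""
    points = {5: 10, 4: 5}
    return sum(points.get(i["severity"], 0)
               for i in incidents
               if i.get("verified"))
```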

2. Legal Action Density (25%)

Counts entries involving court cases, legislation, enforcement actions, or documents containing legal keywords (lawsuit, fine, penalty, enforcement, ruling, sanctions). Only entries with severity ≥ 3 qualify. Legal actions are independently verifiable and indicate real consequences.

3. Regulatory Gap (20%)

Measures whether a country's regulatory framework matches its AI risk exposure. Formula: verified incidents divided by regulatory coverage score. A well-regulated country (e.g. EU member states) with incidents sees its score reduced by its regulatory response. Higher score means more problems relative to regulatory capacity.
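Under the stated formula, this dimension reduces to a simple ratio. The function below is a sketch with hypothetical parameter names; how the coverage score itself is computed is not published.

```python
def regulatory_gap(verified_incident_count, regulatory_coverage_score):
    """Verified incidents divided by regulatory coverage: a higher
    coverage score (stronger framework) divides the raw count down,
    so the result rises when problems outpace regulatory capacity."""
    if regulatory_coverage_score <= 0:
        raise ValueError("coverage score must be positive")
    return verified_incident_count / regulatory_coverage_score
```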

4. Domain Breadth (15%)

Number of distinct impact domains with at least one moderately verified incident of severity 3 or above. A country with AI problems across many sectors (health, transport, privacy, finance) is more broadly exposed than one with a single-sector issue.

5. Trend Severity (10%)

Compares the count of verified severe incidents in the last 12 months versus the previous 12 months. Captures acceleration of AI harm, not just static totals. A rising trend scores higher.

Country attribution

The index uses only countries_impacted — the countries where harm physically occurs or where people are directly affected. countries_responsible (country where the technology originates) is stored but not used in the index calculation; it powers a separate "AI Origin Risk" view.

If an incident impacts multiple countries, each impacted country receives the full weight of the incident independently.

Each dimension is normalised to a 0–100 scale. The composite score is a weighted average of the five dimensions. Weighting details and full methodology are available to institutional partners and academic reviewers on request.
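The normalisation and weighting steps can be sketched as follows. The weights come from the dimension headings above; the min-max normalisation and the data layout are assumptions, since the exact published method is available only on request.

```python
# Weights from the five dimension headings (30/25/20/15/10).
WEIGHTS = {"verified_harm": 0.30, "legal_density": 0.25,
           "regulatory_gap": 0.20, "domain_breadth": 0.15,
           "trend_severity": 0.10}

def normalise(raw_by_country):
    """Min-max scale one dimension's raw values to 0-100 across
    countries (assumed scheme; the published method may differ)."""
    lo, hi = min(raw_by_country.values()), max(raw_by_country.values())
    span = hi - lo
    if span == 0:
        return {c: 0.0 for c in raw_by_country}
    return {c: 100.0 * (v - lo) / span for c, v in raw_by_country.items()}

def composite(dimensions):
    """dimensions: {dim_name: {country: raw_value}}. Returns the
    weighted composite per country after per-dimension normalisation."""
    norm = {d: normalise(vals) for d, vals in dimensions.items()}
    countries = next(iter(dimensions.values())).keys()
    return {c: sum(WEIGHTS[d] * norm[d][c] for d in WEIGHTS)
            for c in countries}
```

Normalising each dimension independently before weighting keeps a single extreme dimension from dominating the composite.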

Score interpretation:

27-domain taxonomy

Every entry is classified into one of 27 impact domains. The taxonomy covers both direct harms (incidents, accidents) and systemic impacts (workforce displacement, democratic erosion, environmental cost).

| # | Domain | Scope |
|---|---|---|
| 01 | Health | Mental health harms, AI diagnostics errors, surgical robots, algorithmic prescriptions, triage failures |
| 02 | Children and youth | Exposure of minors to harmful content, AI-generated CSAM, age-inappropriate chatbots |
| 03 | Education | Academic fraud, AI-generated plagiarism, assessment integrity, teacher displacement |
| 04 | Employment | Job displacement, algorithmic hiring bias, workplace surveillance, gig economy automation |
| 05 | Creativity | Copyright infringement, generative AI vs human artists, music and image theft |
| 06 | Democracy | Deepfakes in elections, AI-generated propaganda, voter manipulation, political bots |
| 07 | Privacy | Facial recognition, mass surveillance, data harvesting, biometric tracking |
| 08 | Justice and bias | Algorithmic discrimination in sentencing, policing, credit scoring, insurance |
| 09 | Fraud | AI-powered scams, voice cloning, deepfake identity theft, phishing at scale |
| 10 | Cybersecurity | AI-powered cyberattacks, automated vulnerability exploitation, deepfake phishing, adversarial ML |
| 11 | Environment | Energy consumption of data centres, water usage, carbon footprint of training runs |
| 12 | Military | Autonomous weapons, lethal autonomous systems, AI in targeting and surveillance |
| 13 | Sovereignty | National dependence on foreign AI, data colonialism, strategic AI autonomy |
| 14 | Finance | Algorithmic trading failures, AI-driven market manipulation, robo-advisory risks |
| 15 | Enterprise | AI system failures in business operations, hallucination in enterprise tools, vendor lock-in |
| 16 | Science | AI-fabricated research, paper mills, peer review manipulation, reproducibility crisis |
| 17 | Info pollution | AI-generated misinformation, synthetic media flooding, dead internet theory |
| 18 | Vulnerable people | Exploitation of elderly, disabled, and marginalised groups by AI systems |
| 19 | Language | Linguistic homogenisation, low-resource language erasure, translation bias |
| 20 | Sexuality | Non-consensual deepfake pornography, AI companions, exploitation of intimacy |
| 21 | Marketing | Hyper-targeted manipulation, dark patterns, synthetic influencers, deceptive ads |
| 22 | Food and agriculture | AI in precision farming failures, food supply chain disruption, land-use algorithms |
| 23 | Housing | Algorithmic rent pricing, AI-driven gentrification, discriminatory mortgage models |
| 24 | Transport | Autonomous vehicle accidents, AI traffic management failures, aviation automation |
| 25 | Sport | AI in doping detection, algorithmic refereeing, performance prediction ethics |
| 26 | Religion | AI-generated sermons, chatbot spiritual advisors, theological disruption |
| 27 | Human identity | Digital resurrection, grief bots, consciousness debate, human-AI boundaries, post-mortem data rights |

Incident clustering

A single real-world event may generate dozens of articles across multiple outlets and languages. Without deduplication, one incident counted 20 times would inflate a country's score. The index applies incident clustering before scoring:

Clustering reduces the total entry count by approximately 85%, primarily affecting countries with extensive multilingual media coverage of the same events.
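A greedy version of the clustering heuristic, using the four signals named in the methodology (title similarity, date proximity, country, and domain). The thresholds and field names are illustrative, not the production values.

```python
from datetime import date
from difflib import SequenceMatcher

def same_incident(a, b, title_threshold=0.75, max_days=3):
    """Pairwise test: two articles describe the same incident when their
    titles are similar enough, their dates are close, and their country
    and domain labels match. Thresholds here are placeholders."""
    title_sim = SequenceMatcher(None, a["title"].lower(),
                                b["title"].lower()).ratio()
    days_apart = abs((a["date"] - b["date"]).days)
    return (title_sim >= title_threshold
            and days_apart <= max_days
            and a["country"] == b["country"]
            and a["domain"] == b["domain"])

def cluster_articles(articles):
    """Greedy single pass: each article joins the first existing cluster
    whose representative (first member) it matches, else starts its own."""
    clusters = []
    for art in articles:
        for cl in clusters:
            if same_incident(cl[0], art):
                cl.append(art)
                break
        else:
            clusters.append([art])
    return clusters
```

Because country and domain must match exactly, identical coverage of one event in two different countries is deliberately kept as two incident records, one per impacted country.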

Known limitations

Incident documentation density varies by country and is influenced by media landscape, language coverage, and public reporting norms. The Toll AI Index reflects documented exposure based on available sources.

English-language sources remain overrepresented. The corpus now covers 17 languages, but coverage of incidents in Chinese, Russian, Japanese, and Korean remains uneven compared to Western European languages.

US dominance caveat: the United States consistently scores significantly higher than all other countries. This gap partly reflects a genuine concentration of AI deployment and litigation in the US, but is amplified by the dominance of English-language American sources in the scraping pipeline. The absolute gap should not be interpreted as proportional real-world risk difference — it reflects documentation density.

Regulatory data for jurisdictions in Sub-Saharan Africa and Central Asia is sparse, reflecting limited public availability rather than absence of activity. Some data sources carry an inherent reporting lag of days to weeks, particularly court filings and regulatory decisions. Community signals (forums, practitioner discussions) are used for early detection only and are not scored directly in the Toll AI Index.

Academic exclusion: entries classified as academic papers or preprints are excluded from all index scoring. They are retained in the corpus for research purposes but do not contribute to any dimension of the Toll AI Index.

Update frequency

The raw database is updated daily through automated pipelines. The Toll AI Index is recalculated quarterly, with the next publication scheduled for Q2 2026. Incident alerts and regulatory updates are processed within 24 hours of source publication.

About the company

The AI Toll is an independent company created to document, structure, and make accessible the full scope of artificial intelligence's impact on society. The project emerged from the recognition that while AI's benefits receive extensive coverage, its costs, risks, and harms are fragmented across thousands of sources in dozens of languages.

The project is not funded by any technology company. The data is collected, structured, and maintained independently. Academic partnerships are available on request. The methodology is documented and available for review.

For enquiries: contact@theaitoll.com