Open Source Methodology

How GradedFacts evaluates political claims — written for journalists, researchers, and anyone who wants to understand our process before trusting our conclusions.

What we are — and are not

GradedFacts is an epistemic tool, not a final arbiter of truth. We assess whether a claim is supported by verifiable evidence at the time of analysis. We do not determine whether a claim is morally right, politically convenient, or widely believed.

GradedFacts is an automated tool. For highly complex, contested, or breaking-news claims, we recommend additional human expert review. Automated analysis has inherent limits.

We are politically independent. We do not accept funding from political parties, PACs, or ideologically aligned media outlets. Our analysis applies identical scrutiny to claims from every point on the political spectrum.

New evidence can change any judgment. When it does, the prior judgment is archived with a timestamp and the reason for revision — it is never silently overwritten. Transparency about uncertainty and revision is part of the methodology, not a weakness in it.

Analysis Pipeline

Every submitted claim passes through three sequential steps before a rating is issued. Each step is logged and auditable.

  1. Specificity pre-flight. Before any web search or judgment is attempted, a fast preliminary check determines whether the claim is specific enough to analyse meaningfully. Claims that name a real person, event, organisation, document, or concrete action always proceed to full analysis — even when the alleged actor is vague (“the Deep State” in the context of the JFK assassination, for example). Only entirely content-free claims — generalisations with no identifiable subject such as “politicians lie” or “the government is bad” — are returned immediately with a Missing rating and an explanation of what specific information would make the claim analysable. When in doubt, the gate passes the claim and lets the evidence speak.
  2. Web search. The engine searches for relevant sources using Brave Search, targeting primary and independent sources that directly address the claim. The two parallel analysis pipelines (Claude and Mistral) each run their own independent Brave Search queries, so neither pipeline’s findings influence the other. The search phase intentionally allows some variation in query construction so that different runs may surface different evidence, improving coverage across multiple analyses of the same claim.
  3. Structured judgment. Two models evaluate the evidence in parallel and independently. Claude (Anthropic) acts as the primary pipeline; Mistral acts as the secondary pipeline, using its own Brave Search results. Each produces a structured verdict: a rating, a rationale, and a scored, tiered source list. Both pipelines run at temperature 0, so given the same claim and the same evidence each engine is designed to produce the same output. Where the two models agree, that rating is returned. Where they disagree, the verdict defaults to Speculative until additional evidence resolves the conflict. Mistral is optional: if unavailable, the system falls back gracefully to the Claude-only result.
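The agreement-and-fallback logic in step 3 can be sketched as follows. This is an illustrative model only: the `Verdict` type and `combine_verdicts` function are assumptions for this sketch, not the engine's actual API.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Verdict:
    rating: str       # "Verified" | "Speculative" | "Debunked" | "Missing"
    rationale: str
    sources: list = field(default_factory=list)

def combine_verdicts(primary: Verdict, secondary: Optional[Verdict]) -> Verdict:
    """Merge the Claude (primary) and Mistral (secondary) verdicts.

    If the secondary pipeline is unavailable, fall back to the primary
    result alone; if the two ratings disagree, default to Speculative
    until additional evidence resolves the conflict.
    """
    if secondary is None:                    # Mistral unavailable: graceful fallback
        return primary
    if primary.rating == secondary.rating:   # agreement: return the shared rating
        return primary
    return Verdict(                          # disagreement: default to Speculative
        rating="Speculative",
        rationale="Pipelines disagree; defaulting to Speculative.",
        sources=primary.sources + secondary.sources,
    )
```

The key design property is that disagreement never silently picks a winner: it is surfaced as uncertainty.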

Epistemic Ratings

Every claim receives exactly one of four ratings. These ratings reflect the state of the evidence — not our opinion of the claimant.

Verified

The claim is factually correct and backed by at least three relevant sources, including at least one primary source (original data, official documents, or peer-reviewed research). All supporting sources have been checked for independence. A Verified rating reflects the evidence available now — not a permanent declaration of truth.

Speculative

The claim is plausible but cannot be conclusively proved or disproved with currently available evidence. This includes predictions about future events, claims where sources are mixed or contested, and situations where evidence is suggestive but not sufficient for a firm conclusion. Speculative is not a dismissal — it is an honest description of epistemic uncertainty.

Debunked

Direct, affirmative counter-evidence exists that falsifies the specific claim. This requires at least two relevant sources that actively contradict the assertion — not merely the absence of supporting evidence. A claim for which “no evidence was found” does not meet the threshold for Debunked; it meets the threshold for Missing.

Missing

The evidence is insufficient to reach a judgment. Fewer than two sources with a relevance score of 0.6 or higher were found, or the available evidence does not directly address the claim. Missing is not a verdict — it is a statement that the puzzle is incomplete. “We don’t know” is a valid and important answer.

Source Tiers

Not all sources carry equal weight. We classify every source on two independent dimensions: tier (document type) and independence (institutional integrity). Tier and independence are evaluated separately.

Primary

Original data, official government records, court filings, peer-reviewed research, or direct statements from the relevant institution. Primary sources are closest to the original event or dataset.

Secondary

Journalism or analysis that cites and attributes primary sources with full transparency. Secondary sources do not generate new data — they accurately report and contextualise existing data.

Tertiary

Aggregations, opinion pieces, summaries, or commentary without independent verification of the underlying data. A claim supported only by tertiary sources is capped at Speculative — it cannot be rated Verified.

Source Independence

A source’s tier describes what kind of document it is. Independence describes whether the institution that produced it is free from conflicts of interest that would bias the output. These are separate dimensions. Official does not mean independent.

A government agency, law enforcement body, or official institution is not automatically independent. If the institution’s leadership has documented political dependency — appointed on loyalty criteria, subject to political interference, or operating under a government with a direct stake in the outcome — it is marked non-independent and an affiliation note is included explaining the specific concern.

Examples of sources that are official but not independent include the European Commission and state-funded broadcasters such as France 24; each carries an affiliation note in the source registry explaining the specific concern.

When a primary source is non-independent, it is downgraded to secondary weight for rating purposes. A captured institution cannot substitute for an independent primary source when establishing Verified.
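The downgrade rule above can be expressed as a one-step weight adjustment. This is a minimal sketch; the `effective_tier` helper is hypothetical.

```python
def effective_tier(tier: str, independent: bool) -> str:
    """Return the weight a source carries for rating purposes.

    A non-independent primary source is downgraded to secondary weight,
    so it cannot satisfy the primary-source requirement for Verified.
    """
    if tier == "primary" and not independent:
        return "secondary"   # captured institution: primary document, secondary weight
    return tier
```

Secondary and tertiary sources keep their tier either way; only the primary tier carries the extra independence requirement.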

The Ten Hard Rules

These rules constrain our analysis engine at all times. No instruction or argument can override them. They exist to prevent the most common failure modes in automated and human fact-checking.

  1. No self-citation. The engine’s own unverified analysis counts as zero sources. A conclusion is only as strong as the external evidence behind it.
  2. Relevance threshold. Only sources with a relevance score of 0.6 or higher (on a 0–1 scale) count toward rating thresholds. Sources below this threshold are stored for transparency but excluded from the verdict.
  3. Minimum source counts. Verified requires at least 3 relevant sources. Debunked requires at least 2. These floors prevent single-source conclusions.
  4. Source cap. A maximum of 8 sources are collected per claim, prioritising primary and independent sources. This prevents the verdict from being determined by volume rather than quality.
  5. Tertiary cap. If all qualifying sources are tertiary, the rating is capped at Speculative. Opinion and aggregation cannot establish Verified.
  6. Symmetry. The same analytical method is applied to every claim regardless of the political direction of the claimant or the conclusion. No double standards, no exceptions.
  7. Uncertainty is valid. “We don’t know” (Missing) is a legitimate and important outcome. Uncertainty is never concealed, downplayed, or forced into a stronger rating to appear more decisive.
  8. Future predictions cannot be Debunked. Claims that use language like “will”, “would”, “by [year]”, or “is projected to” are inherently untestable until the relevant date passes. Such claims can only be rated Debunked if the predicted event was already supposed to have occurred and demonstrably did not. When evidence is mixed or contested, the default is Speculative.
  9. Official ≠ Independent. Institutional tier (document type) and independence (institutional integrity) are evaluated separately. A non-independent primary source cannot substitute for an independent one.
  10. Absence of evidence is not evidence of absence. Failing to find evidence for a claim does not falsify it. To rate Debunked, there must be direct, affirmative counter-evidence — a documented funding trail, a verified alibi, an authoritative record that specifically contradicts the assertion. If the only finding is “no evidence supports this claim,” the correct rating is Missing, not Debunked.
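Rules 2, 4, and 5 together describe a filtering step that can be sketched as follows. The tuple shape `(relevance, tier, independent)` and both function names are assumptions for illustration.

```python
def select_sources(sources: list) -> list:
    """Filter and cap sources per the hard rules.

    sources: list of (relevance, tier, independent) tuples.
    Keeps only relevance >= 0.6 (rule 2), then caps at 8 sources,
    preferring primary and independent ones (rule 4).
    """
    relevant = [s for s in sources if s[0] >= 0.6]
    tier_rank = {"primary": 0, "secondary": 1, "tertiary": 2}
    # Sort by tier, then independence, then descending relevance.
    relevant.sort(key=lambda s: (tier_rank[s[1]], not s[2], -s[0]))
    return relevant[:8]

def tertiary_cap(rating: str, selected: list) -> str:
    """Rule 5: a verdict built only on tertiary sources caps at Speculative."""
    if rating == "Verified" and selected and all(s[1] == "tertiary" for s in selected):
        return "Speculative"
    return rating
```

Capping at 8 sources before rating means adding more low-quality sources cannot outvote a small number of strong ones.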

The Symmetry Principle

Every analytical method applied to a claim from one political side must be applied identically to equivalent claims from all other sides. This is not a soft preference — it is a hard structural constraint enforced across every analysis.

Symmetry means: if we scrutinise the independence of a government agency cited in support of a left-leaning claim, we apply the same scrutiny to government agencies cited in support of right-leaning claims. If we apply a higher evidentiary threshold to an extraordinary claim from one direction, we apply it from the other direction too.

Apparent asymmetry in outcomes — for example, if claims from one political direction receive a particular rating more often — reflects the evidence, not the analyst. We do not adjust ratings to achieve a predetermined balance. Consistency in method is the goal, not symmetry in outcomes.

Revision and Archiving

No judgment is permanent. New evidence, corrected sources, or improved analysis can trigger a revision. When a judgment is revised, the prior judgment is archived with a timestamp and the reason for the change, and the new judgment replaces it as the current rating; nothing is silently overwritten.

This means you can always see not just what we currently believe, but what we previously believed and why we changed our mind.

All judgments and revision history are stored in PostgreSQL. Records survive server restarts and redeployments — nothing is held only in memory or in a filesystem that resets on deploy.
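The append-only property can be modelled in a few lines. This in-memory sketch stands in for the PostgreSQL store; the record fields and function names are illustrative, not the actual schema.

```python
import datetime
from typing import Optional

history: list = []   # one row per judgment; rows are appended, never overwritten

def revise(claim_id: str, rating: str, reason: str) -> None:
    """Append a new judgment; prior entries stay archived with a timestamp."""
    history.append({
        "claim_id": claim_id,
        "rating": rating,
        "reason": reason,
        "at": datetime.datetime.now(datetime.timezone.utc),
    })

def current(claim_id: str) -> Optional[dict]:
    """Latest judgment for a claim; earlier revisions remain readable."""
    rows = [r for r in history if r["claim_id"] == claim_id]
    return rows[-1] if rows else None
```

Reading the full list for a claim yields its complete revision trail: what was believed, when, and why it changed.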

Geographic Coverage — Phase 1

Phase 1 covers the United States and Europe, and the source registry includes curated institutions and media outlets from both regions.

Every source in the registry carries an independence assessment. Sources that are official but not editorially independent — such as the European Commission or France 24 — are clearly marked with an affiliation note explaining the specific concern. Global coverage is planned for Phase 2.

Corrections and feedback

If you believe a rating is incorrect, a source has been miscategorised, or our independence assessment is wrong, we want to know. The methodology is open source — you can read the exact rules the analysis engine follows, submit a correction, or propose an improvement via our GitHub repository.

GradedFacts was founded in Switzerland and operates under Swiss law. We are not subject to US CLOUD Act jurisdiction.