Concept · 2026-05-16
RAG vs signed-claim verification (VERITAS) — when to use each
RAG retrieves prose chunks. VERITAS retrieves typed atomic claims with signatures. They're different shapes, different use cases, different cost models. Most production systems use both.
TL;DR
RAG is the right tool when your knowledge lives in prose — documentation, articles, customer-support tickets, internal wikis. You vector-index it, retrieve semantically similar chunks at query time, paste them into the prompt as context. Works at any scale; tolerates messy unstructured input.
VERITAS (or any signed-claim system) is the right tool when you need atomic facts with verification — specific assertions like "GPT-4 was released on 2023-03-14" with sources you can cite and signatures you can re-verify. Bounded coverage; high precision on what it covers.
They're complementary. RAG covers breadth; VERITAS covers atoms. Most production LLM applications eventually run both.
The shape difference
The two systems retrieve different things:
| Aspect | RAG | VERITAS |
|---|---|---|
| Retrieves | Prose chunks (200-2000 tokens) | Atomic claims (subject + predicate + object) |
| Returns | Semantically similar text | Verified facts with confidence + signature |
| Trust model | Trust the corpus you indexed | Trust the curation methodology + HMAC signature |
| Scale | Millions of chunks easy | Limited by curation effort (today: 91, target Q3: ~150) |
| Hallucination on covered domain | ~10-15% | <1% |
| Coverage | Whatever you index | Whatever the catalog covers (AI/ML today; new verticals Y2) |
| Citation precision | Chunk-level (paragraph at best) | Fact-level (single assertion) |
| Auditability | Manual | Programmatic (signature) |
| Infra requirement | Vector DB + embedding model | HTTP fetch (zero infra) |
Why RAG alone leaks
RAG works well most of the time and fails in a specific class of cases:
- Semantically similar but factually wrong chunks. A chunk stating "OpenAI launched ChatGPT in 2022" retrieves on a query about GPT-4's release. Embeddings see the same topic; the dates differ. The model stitches the retrieved date into the wrong context.
- Multiple chunks contradict. Two retrieved chunks disagree on a fact. The model picks one — sometimes the wrong one — without surfacing the contradiction.
- Chunk boundaries split facts. The relevant fact straddles the boundary between two retrieved chunks. The model gets half of it and fabricates the other half.
- Indexed corpus is wrong. RAG retrieves faithfully from a corpus. If the corpus has wrong facts, RAG confidently surfaces them as authoritative.
- Citation drift. The model cites "chunk #4 says X" but actually emitted X from its parametric memory and pasted the chunk citation as plausible cover.
Signed-claim verification doesn't have these failure modes because the unit is the typed fact, not a prose chunk. Either the catalog has the claim (return it, confidence-stamped) or it doesn't (return null, force fallback path).
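That binary outcome can be sketched as a thin lookup wrapper. The claim schema, the in-memory catalog, and the `lookup` helper below are illustrative assumptions, not the real VERITAS API:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Claim:
    id: str
    statement: str
    confidence: float
    signature: str  # HMAC over the canonical claim payload

# Hypothetical in-memory stand-in for the signed-claim catalog.
CATALOG = {
    ("GPT-4", "released_on"): Claim(
        id="clm_001",
        statement="GPT-4 was released on 2023-03-14",
        confidence=0.99,
        signature="hmac:...",
    ),
}

def lookup(subject: str, predicate: str) -> Optional[Claim]:
    """Return the signed claim if the catalog covers it, else None to force a fallback."""
    return CATALOG.get((subject, predicate))
```

The point is the contract: a covered query returns a confidence-stamped, signed claim; an uncovered query returns `None` instead of a plausible-sounding guess.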
Why VERITAS alone is insufficient
The signed-claim approach has its own limits:
- Coverage gaps. If the user asks about something the catalog doesn't cover — say, a specific product feature or a niche academic result — VERITAS returns null and your code falls through to whatever else you have. Without RAG as the fallback, your application is silent.
- Curation latency. New facts (a model released yesterday) take time to verify and enter the catalog. RAG over a freshly-indexed news corpus is faster.
- Prose vs claim shape. Some queries genuinely want explanatory prose, not a typed fact. "How does attention work?" isn't an atomic claim — it's a paragraph. RAG over the right corpus answers; VERITAS doesn't.
- Subjective questions. "Is GPT-4 better than Claude 3 for code generation?" isn't a fact — it's an evaluation. VERITAS explicitly doesn't ship performance-comparison claims (see the methodology post). RAG over benchmark reports + community discussion fills this gap.
The hybrid pattern
Three layers, ordered by precision:
```python
# Pseudocode
async def answer(question):
    # Layer 1 — VERITAS for atomic facts (high precision)
    veritas_claims = await veritas_search(question, limit=3)

    # Layer 2 — RAG for explanatory context
    rag_chunks = await vector_search(question, limit=5)

    # Layer 3 — model prompt with both, ordered by trust
    context = ""
    if veritas_claims:
        context += "Verified atomic facts (cite [claim_id]):\n"
        context += "\n".join(f"- {c.statement} [{c.id}]" for c in veritas_claims)
    if rag_chunks:
        context += "\n\nDocumentation context (cite [chunk_id]):\n"
        context += "\n".join(f"- {c.text} [{c.id}]" for c in rag_chunks)

    return await llm.generate(
        f"Use the verified facts first. Cite every assertion.\n\n{context}\n\nQ: {question}"
    )
```

The model is instructed to prefer verified facts over RAG chunks when both cover the same assertion. Verification badges in the UI distinguish the two: clicking [claim_id] opens the canonical SourceScore page; clicking [chunk_id] opens your indexed source.
Cost comparison
For a typical production query (~1,000 queries/day):
- RAG: embedding API ~$0.0001/query + hosted vector DB ~$30/mo + storage. ~$0.001/query total.
- VERITAS Free tier: 1,000 calls/mo free, then ~€0.0004/call on the next tier (Indie €19 for 50,000 calls = ~€0.00038/call). Volume tiers (€99 / €499) drop per-call cost further.
- Hybrid: additive. ~$0.0018/query for both layers.
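As a sanity check on the tier arithmetic, using only the figures quoted in this post (check current pricing before relying on them):

```python
# Indie tier as quoted above: EUR 19 for 50,000 calls.
INDIE_PRICE_EUR = 19
INDIE_CALLS = 50_000

per_call = INDIE_PRICE_EUR / INDIE_CALLS
print(f"EUR {per_call:.5f}/call")  # EUR 0.00038/call
```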
The marginal cost of adding VERITAS to an existing RAG stack is small relative to the value of reducing hallucination on covered atoms. The ROI is dominated by your hallucination cost — if it's zero, neither matters; if it's high, both are cheap.
When to use RAG only (skip VERITAS)
- Your domain isn't AI/ML and isn't covered by any other signed-claim catalog.
- Your queries are explanatory, not factual (how does X work, why is Y, summarize Z).
- You have a high-quality indexed corpus you trust completely.
- Latency is dominant; you can't afford the verification step.
When to use VERITAS only (skip RAG)
- Your application is bounded to AI/ML facts.
- You don't have time to build a RAG pipeline.
- You need zero-infra grounding — VERITAS is one HTTP call, no vector DB.
- You need programmatic auditability of cited facts (signatures).
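The "one HTTP call" path can be sketched with nothing but the standard library. The base URL, endpoint path, and query parameters below are assumptions for illustration, not the documented VERITAS API:

```python
import json
import urllib.request
from urllib.parse import urlencode

def build_search_url(base: str, query: str, limit: int = 3) -> str:
    """Construct the single search request: no vector DB, no embedding model."""
    return f"{base}/claims/search?{urlencode({'q': query, 'limit': limit})}"

def fetch_claims(base: str, query: str, limit: int = 3):
    """Perform the one HTTP call and parse the JSON response."""
    with urllib.request.urlopen(build_search_url(base, query, limit)) as resp:
        return json.load(resp)
```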
Is RAG dead?
No. The "RAG is dead" takes you see on Twitter are marketing for the framing-of-the-month, not a methodological shift. RAG remains the default pattern for indexing your own corpus of prose, and that need isn't going away.
What's changed is the recognition that RAG isn't a complete grounding solution — it's one layer in a stack. Signed-claim verification, tool-use grounding, prompt-stuffed invariants, and structured-output schemas all sit alongside RAG in a production-grade pipeline.
Getting started
If you have RAG today: add VERITAS as a parallel retrieval step. It's a five-line code change with no infrastructure change. See the 5-line Python tutorial.
If you don't have RAG yet: start with VERITAS for atomic facts (covers ~40-60% of typical AI/ML domain queries). Add RAG once your application has shipped and you've seen which queries fall outside the VERITAS catalog. The data tells you which layer to invest in.
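Collecting that data is a few lines: try the catalog first and count the misses. The helper names here (`veritas_lookup`, `rag_fallback`) are illustrative stand-ins for whatever retrieval functions you actually wire in:

```python
from collections import Counter

miss_log = Counter()

def answer_with_fallback(question, veritas_lookup, rag_fallback):
    """Try the claim catalog first; count misses to see where RAG is needed."""
    claims = veritas_lookup(question)
    if claims:
        return claims
    miss_log[question] += 1  # every query landing here is a RAG candidate
    return rag_fallback(question)
```

Once `miss_log` has a week of traffic, its most common entries tell you which corpus to index first.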
Further reading
- LLM grounding — the broader concept
- LLM hallucination — what grounding fixes
- 5-line Python verification tutorial
- LangChain integration — including a retrieve-then-cite pattern that mirrors classic RAG flow
- Quickstart