Integration guide
DSPy + VERITAS
DSPy is Stanford's compound-AI-system framework — programs instead of prompts. This guide shows two integration patterns: a custom dspy.Retrieve backed by the VERITAS catalog, and a verify-and-flag post-processor module.
Why DSPy + VERITAS
DSPy programs declare what the system should do (signatures + modules) and leave the how (exact prompts) to the optimizer. That separation makes external retrieval modules first-class — a VERITAS retriever fits naturally into the existing dspy.Retrieve interface.
The compound system gains a typed retrieval path that DSPy's optimizers can reason about — verified-claim retrieval becomes a tunable step, not a brittle prompt-stuffing decision.
Install
```shell
pip install dspy-ai requests
```

Pattern 1 — Custom dspy.Retrieve
Subclass dspy.Retrieve and translate VERITAS search hits into DSPy passages. Each passage carries the claim id, confidence, and source URLs as metadata so downstream modules can render citations.
```python
import dspy
import requests
from typing import List, Union

VERITAS = "https://sourcescore.org/api/v1"


class VeritasRetriever(dspy.Retrieve):
    def __init__(self, k: int = 5, min_confidence: float = 0.8):
        super().__init__(k=k)
        self.min_confidence = min_confidence

    def forward(self, query_or_queries: Union[str, List[str]], k: int = None) -> dspy.Prediction:
        queries = [query_or_queries] if isinstance(query_or_queries, str) else query_or_queries
        passages = []
        for q in queries:
            r = requests.get(
                f"{VERITAS}/search",
                params={"q": q, "limit": k or self.k},
                timeout=8,
            )
            r.raise_for_status()
            for hit in r.json().get("matches", []):
                if hit.get("confidence", 0) < self.min_confidence:
                    continue  # drop anything below the confidence floor
                passages.append(
                    dspy.Example(
                        long_text=hit["statement"],
                        claim_id=hit["id"],
                        confidence=hit["confidence"],
                        canonical_url=f"https://sourcescore.org/claims/{hit['id']}/",
                        tags=hit.get("tags", []),
                    ).with_inputs("long_text")
                )
        # Return a Prediction so callers can use the standard `.passages` accessor.
        return dspy.Prediction(passages=passages)
```
Wire into a DSPy program
```python
import dspy

# Set up the LM and register the retriever as the default RM
lm = dspy.OpenAI(model="gpt-4o-mini", temperature=0)
rm = VeritasRetriever(k=5, min_confidence=0.85)
dspy.settings.configure(lm=lm, rm=rm)


# Define the signature
class CitedAnswer(dspy.Signature):
    """Answer the question using only the verified claims. Cite [claim_id] inline."""

    question: str = dspy.InputField()
    context: list[str] = dspy.InputField(desc="Verified claims with [claim_id] tags")
    answer: str = dspy.OutputField(desc="Answer with [claim_id] citations after every fact")


# Build the program
class CitedRAG(dspy.Module):
    def __init__(self):
        super().__init__()
        # Use the VERITAS retriever directly (rather than the generic
        # dspy.Retrieve wrapper) so claim_id and confidence metadata survive.
        self.retrieve = VeritasRetriever(k=5, min_confidence=0.85)
        self.generate = dspy.ChainOfThought(CitedAnswer)

    def forward(self, question: str):
        passages = self.retrieve(question).passages
        context = [
            f"{p.long_text} [{p.claim_id}] (conf={p.confidence:.2f})"
            for p in passages
        ]
        return self.generate(question=question, context=context)


program = CitedRAG()
result = program(question="When was the Transformer architecture introduced?")
print(result.answer)
```
The signature forces a [claim_id] citation after every assertion. DSPy's optimizer can later tune the exact prompt around this signature without changing the contract — VERITAAS aside, VERITAS continues to feed verified passages regardless of which prompt template the optimizer settles on.
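That citation contract is also easy to check mechanically. Here is a minimal sketch of a citation-coverage check — the function name and regex are ours, not part of DSPy or VERITAS — which can double as a cheap quality signal during development:

```python
import re

def citation_coverage(answer: str) -> float:
    """Hypothetical helper: fraction of sentences carrying a [claim_id]-style tag."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    if not sentences:
        return 0.0
    # Count sentences containing at least one bracketed identifier.
    cited = sum(1 for s in sentences if re.search(r"\[[\w:-]+\]", s))
    return cited / len(sentences)
```

For example, `citation_coverage("Transformers debuted in 2017 [clm_042]. They use attention.")` returns 0.5 — one of two sentences is cited.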
Pattern 2 — Verify post-processor module
When you want free-form generation but a verification layer afterwards, wrap /api/v1/verify in a DSPy module that runs after the answer generation.
```python
class VeritasVerify(dspy.Module):
    """Post-process an answer — verify each assertion against the catalog."""

    def __init__(self, min_confidence: float = 0.85):
        super().__init__()
        self.min_confidence = min_confidence

    def forward(self, answer: str) -> dspy.Prediction:
        lines = [l.strip() for l in answer.split("\n") if l.strip()]
        verified, unverified = [], []
        for line in lines:
            r = requests.post(
                f"{VERITAS}/verify",
                json={"claim": line, "minConfidence": self.min_confidence},
                timeout=8,
            ).json()
            if r.get("bestMatch"):
                verified.append({
                    "text": line,
                    "claim_id": r["bestMatch"]["id"],
                    "confidence": r["bestMatch"]["confidence"],
                    "url": f"https://sourcescore.org/claims/{r['bestMatch']['id']}/",
                })
            else:
                unverified.append(line)
        return dspy.Prediction(
            verified=verified,
            unverified=unverified,
            verification_rate=len(verified) / max(1, len(lines)),
        )


# Composing it into a larger program
class AnswerAndVerify(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.ChainOfThought("question -> answer")
        self.verify = VeritasVerify(min_confidence=0.85)

    def forward(self, question: str):
        a = self.generate(question=question)
        v = self.verify(a.answer)
        return dspy.Prediction(
            answer=a.answer,
            verified_claims=v.verified,
            unverified_claims=v.unverified,
            verification_rate=v.verification_rate,
        )
```
DSPy optimizer compatibility
DSPy's optimizers (BootstrapFewShot, MIPRO, COPRO) can tune around the VERITAS retriever — they'll adjust the prompts that consume the passages, but they can't change what the passages contain. That's by design: the catalog is the trusted layer, the optimizer improves how the model uses it.
A good metric for optimization: verification_rate from AnswerAndVerify. Maximizing it tunes the program toward producing more verifiable assertions — the catalog acts as the ground-truth signal.
Compose with other DSPy modules
The VERITAS retriever composes with any DSPy pattern: ReAct, MultiHopProgram, ProgramOfThought. A multi-hop pattern with verification:
```python
class MultiHopVerified(dspy.Module):
    def __init__(self):
        super().__init__()
        self.retrieve = VeritasRetriever(k=3)
        self.hop1 = dspy.ChainOfThought("question -> sub_question")
        self.hop2 = dspy.ChainOfThought("question, sub_answer -> final_answer")
        self.verify = VeritasVerify()

    def forward(self, question: str):
        sub_q = self.hop1(question=question).sub_question
        passages = self.retrieve(sub_q).passages
        sub_a = "\n".join(p.long_text for p in passages)
        final = self.hop2(question=question, sub_answer=sub_a).final_answer
        return self.verify(final)
```
Next steps
- Full API reference
- LangChain guide — similar patterns in a different framework
- LlamaIndex guide — Retriever + NodePostprocessor
- OpenAI tool-calls — native function-calling
- Vercel AI SDK — TypeScript/Next.js
- Citation chains — local verification of signed envelopes
- Browse the catalog — 206 verified AI/ML claims