Pydantic AI + SourceScore VERITAS
Type-safe claim verification as a Pydantic AI tool. The model calls verify_claim() with a structured input, gets back a typed verification envelope, and the agent loop continues with validated data — not free-text.
Why Pydantic AI fits this well
Pydantic AI's design principle is: tools are typed functions with Pydantic models for inputs and outputs. The model gets a JSON schema; the runtime validates every tool call against the schema before the function runs.
That maps cleanly onto VERITAS's envelope format. We define a VerificationResult Pydantic model, the agent emits structured tool calls, and the downstream consumer (your application) gets a typed object — not a free-text claim with maybe-a-link.
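That contract is easy to see in isolation. In the sketch below (the `ToolInput` model is hypothetical, not part of the guide's API), Pydantic emits the JSON schema the model receives, and a malformed tool call fails validation before the tool body ever runs:

```python
from pydantic import BaseModel, Field, ValidationError


class ToolInput(BaseModel):
    """Hypothetical tool input — the schema below is what the model sees."""

    claim: str = Field(description="Natural-language claim to verify")
    min_confidence: float = Field(default=0.85, ge=0.0, le=1.0)


# The runtime presents this schema to the model as the tool's signature.
schema = ToolInput.model_json_schema()
print(sorted(schema["properties"]))  # ['claim', 'min_confidence']

# An out-of-range argument is rejected before any tool code executes.
try:
    ToolInput.model_validate({"claim": "x", "min_confidence": 1.5})
except ValidationError as e:
    print("rejected:", e.error_count(), "error")
```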
Installation
```shell
pip install pydantic-ai httpx
```
Pattern: verify-claim tool with typed envelope
Define the Pydantic models for tool input + output, register the tool with the agent, and let the LLM call it when it needs to verify a factual claim.
```python
from pydantic import BaseModel, ConfigDict, Field
from pydantic.alias_generators import to_camel
from pydantic_ai import Agent, RunContext
import httpx


class VerifyClaimInput(BaseModel):
    """Input for the verify_claim tool."""

    claim: str = Field(description="Natural-language claim to verify")
    min_confidence: float = Field(default=0.85, ge=0.0, le=1.0)


class ApiModel(BaseModel):
    """Base for envelope models: the API uses camelCase keys
    (bestMatch, detailUrl), so map them onto snake_case fields."""

    model_config = ConfigDict(alias_generator=to_camel, populate_by_name=True)


class VerificationSource(ApiModel):
    url: str
    title: str
    publisher: str
    type: str
    published_date: str | None = None


class VerifiedClaim(ApiModel):
    id: str
    subject: str
    predicate: str
    object: str
    statement: str
    confidence: float
    sources: list[VerificationSource]
    detail_url: str


class VerifySignature(ApiModel):
    algorithm: str
    signed_by: str
    signed_at: str
    signature: str


class VerificationResult(ApiModel):
    """Typed envelope from VERITAS /verify."""

    query: str
    best_match: VerifiedClaim | None
    confidence: float
    matches_count: int
    signature: VerifySignature | None = None
```
```python
agent = Agent(
    "openai:gpt-4o",
    system_prompt=(
        "You are a research assistant. When a user makes a factual "
        "claim about AI/ML, call verify_claim() before responding. "
        "If unverified, say so explicitly."
    ),
)


@agent.tool
async def verify_claim(ctx: RunContext, input: VerifyClaimInput) -> VerificationResult:
    """Verify a natural-language claim against SourceScore VERITAS."""
    async with httpx.AsyncClient() as client:
        r = await client.post(
            "https://sourcescore.org/api/v1/verify",
            json={"claim": input.claim, "minConfidence": input.min_confidence},
            timeout=5.0,
        )
        r.raise_for_status()  # surface API errors instead of validating an error body
        data = r.json()
    return VerificationResult.model_validate(data)
```
```python
# Use it:
result = await agent.run("When was Llama 3.1 released?")
print(result.output)
# Agent calls verify_claim(input=VerifyClaimInput(claim="Llama 3.1 release date"))
# Gets back a VerificationResult with a typed VerifiedClaim
# Responds: "Llama 3.1 was released 2024-07-23 per the Meta AI announcement
# and the Hugging Face model card (source: https://sourcescore.org/claims/...)."
```

Pattern: structured agent output with required verification
Force the agent to emit a typed answer with verification metadata via Pydantic AI's output_type parameter (named result_type in older releases). The model can't return a free-text answer; it must populate a structured object including the verification source.
```python
from typing import Literal


class AnsweredQuestion(BaseModel):
    """Required output shape — every answer must include verification."""

    question: str
    answer: str
    verification_status: Literal["verified", "unverified", "refuted"]
    primary_sources: list[str]  # URLs from verify_claim
    confidence: float
```
```python
agent_with_typed_output = Agent(
    "openai:gpt-4o",
    output_type=AnsweredQuestion,
    system_prompt=(
        "Answer questions about AI/ML by calling verify_claim() for "
        "each factual assertion. Populate the AnsweredQuestion fields "
        "with the verification data; never invent sources."
    ),
)

# @agent.tool returns the function unchanged, so the same tool can be
# registered on this agent too — no need to reach into private attributes.
agent_with_typed_output.tool(verify_claim)
```
```python
result = await agent_with_typed_output.run(
    "When was GPT-4 released and how many parameters does it have?"
)

# result.output is now strictly typed:
print(result.output.question)             # str
print(result.output.answer)               # str
print(result.output.verification_status)  # "verified" | "unverified" | "refuted"
print(result.output.primary_sources)      # list[str]
print(result.output.confidence)           # 0.0-1.0
```

Pattern: multi-claim verification with validation
For research-assistant agents that return multiple claims, verify all of them in parallel before composing the response.
```python
import asyncio


class ClaimWithVerification(BaseModel):
    claim_text: str
    verified: bool
    confidence: float
    source_url: str | None


class MultiClaimResponse(BaseModel):
    summary: str
    claims: list[ClaimWithVerification]
    verification_rate: float  # fraction of claims that resolved with ≥0.85 confidence
```
```python
@agent.tool
async def verify_many(ctx: RunContext, claims: list[str]) -> list[ClaimWithVerification]:
    """Verify multiple claims in parallel."""
    async with httpx.AsyncClient() as client:
        responses = await asyncio.gather(*[
            client.post(
                "https://sourcescore.org/api/v1/verify",
                json={"claim": c, "minConfidence": 0.85},
                timeout=5.0,
            )
            for c in claims
        ])
    out = []
    for claim, response in zip(claims, responses):
        data = response.json()
        match = data.get("bestMatch")
        out.append(ClaimWithVerification(
            claim_text=claim,
            verified=match is not None and match["confidence"] >= 0.85,
            confidence=match["confidence"] if match else 0.0,
            source_url=match["detailUrl"] if match else None,
        ))
    return out
```
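The verification_rate field on MultiClaimResponse can be computed from the tool's output before composing the summary. A small helper sketch (hypothetical, using plain dicts in place of ClaimWithVerification instances):

```python
def verification_rate(claims: list[dict]) -> float:
    """Fraction of claims verified at >= 0.85 confidence (0.0 for an empty list)."""
    if not claims:
        return 0.0
    hits = sum(1 for c in claims if c["verified"] and c["confidence"] >= 0.85)
    return hits / len(claims)


# Two of three claims cleared the bar:
sample = [
    {"verified": True, "confidence": 0.92},
    {"verified": True, "confidence": 0.88},
    {"verified": False, "confidence": 0.0},
]
print(round(verification_rate(sample), 2))  # 0.67
```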
```python
# In the agent's response composition:
result = await agent.run(
    "What can you tell me about Llama 3.1, GPT-4, and Claude 3?"
)
# Agent calls verify_many(["Llama 3.1 release", "GPT-4 release", "Claude 3 release"])
# Composes the response only from verified claims; flags unverified ones explicitly
```

Why this pattern beats free-text
- No hallucinated sources. The model can't cite a URL it didn't get from verify_claim — the source field is a typed string from the API response, not a hallucinated string from the model.
- No silent confidence drift. Pydantic validates that confidence is in [0.0, 1.0]; the model can't emit "high confidence" as a free-text string.
- Downstream code is type-safe. Your application that consumes the agent output gets a Pydantic object, not a JSON-shaped string that might be missing fields.
- Validators catch errors early. Add Pydantic @field_validator decorators to reject results that don't pass your business rules.
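As a sketch of that last point, a @field_validator enforcing one hypothetical business rule — a "verified" answer must cite at least one source (model and rule are illustrative, not part of the guide's API):

```python
from pydantic import BaseModel, ValidationError, ValidationInfo, field_validator


class CheckedAnswer(BaseModel):
    """Illustrative model with a business-rule validator."""

    answer: str
    verification_status: str
    primary_sources: list[str]

    @field_validator("primary_sources")
    @classmethod
    def verified_needs_sources(cls, v: list[str], info: ValidationInfo) -> list[str]:
        # verification_status is declared first, so it is already in info.data here
        if info.data.get("verification_status") == "verified" and not v:
            raise ValueError("'verified' answers must cite at least one source")
        return v


try:
    CheckedAnswer(answer="...", verification_status="verified", primary_sources=[])
except ValidationError:
    print("rejected: verified answer with no sources")
```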
What VERITAS is not (for Pydantic AI agents)
VERITAS today covers AI/ML research — model releases, foundational papers, organizations, datasets, benchmarks. If your agent asks about "the capital of France" the verify_claim tool will return best_match=None and your agent should fall through to a different retrieval path.
Catalog: 206 claims today, growing weekly. New verticals (cybersecurity, data engineering, scientific computing) ship Year 2.
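In code, that fall-through is a branch on the envelope, sketched here with plain dicts whose keys mirror the model's snake_case fields. The `web_search` callable is a placeholder for whatever retrieval path your stack provides:

```python
import asyncio


async def answer_with_fallback(envelope: dict, web_search) -> str:
    """Route around VERITAS when it has no coverage for the claim."""
    if envelope.get("best_match") is None:
        return await web_search(envelope["query"])
    match = envelope["best_match"]
    return f'{match["statement"]} (source: {match["detail_url"]})'


async def demo_search(query: str) -> str:  # stand-in retrieval path
    return f"[web result for: {query}]"


miss = {"query": "capital of France", "best_match": None}
print(asyncio.run(answer_with_fallback(miss, demo_search)))
# [web result for: capital of France]
```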
Next steps
- Browser playground — try /verify before wiring it up
- OpenAPI 3.1 spec — generate Pydantic models from the spec via datamodel-code-generator
- DSPy guide — compound-AI-system framework
- OpenAI tool-calls — the underlying primitive
- Browse the catalog — 206 verified AI/ML claims
Bug in this guide? Tell us. Pydantic AI's API surface evolves fast; we update this guide on every minor release.