Instructor + SourceScore VERITAS
Instructor (Jason Liu's library) is the canonical way to get typed, structured outputs from LLMs. Pair it with VERITAS to produce structured responses in which every cited fact resolves to a verified envelope; unverified claims are caught by Pydantic validators before the response ever reaches the user.
Installation
```bash
pip install instructor httpx
```
Pattern: typed claim with verified-source field
Define a Pydantic model where the LLM populates structured fields, including a source_url field that is validated against a VERITAS lookup. The validator runs at response-parsing time; if VERITAS can't verify the claim, validation fails and Instructor retries.
```python
from pydantic import BaseModel, model_validator
from openai import OpenAI
import instructor
import httpx

client = instructor.from_openai(OpenAI())


class ClaimAnswer(BaseModel):
    """LLM response with a verified claim."""

    claim: str
    answer: str
    source_url: str | None = None
    confidence: float = 0.0

    @model_validator(mode="after")
    def verify_with_veritas(self) -> "ClaimAnswer":
        """Look up the claim in SourceScore VERITAS; populate source_url + confidence."""
        r = httpx.post(
            "https://sourcescore.org/api/v1/verify",
            json={"claim": self.claim, "minConfidence": 0.85},
            timeout=5.0,
        )
        result = r.json()
        match = result.get("bestMatch")
        if match and match["confidence"] >= 0.85:
            self.source_url = match["detailUrl"]
            self.confidence = match["confidence"]
        else:
            # Raising triggers an Instructor retry; the model sees the error
            # message and can rephrase the claim on the next attempt.
            raise ValueError(
                f"Claim '{self.claim}' could not be verified. "
                "Please rephrase using a more specific fact."
            )
        return self
```
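As written, verify_with_veritas lets any httpx error (timeout, DNS failure, HTTP 5xx) escape as an unhandled exception inside Pydantic validation. A minimal hardening sketch; veritas_verify is a hypothetical helper you could call in place of the raw httpx.post inside the validator:

```python
import httpx


def veritas_verify(claim: str, min_confidence: float = 0.85) -> dict:
    """Hypothetical helper: call SourceScore VERITAS, converting network
    and HTTP errors into ValueError so a Pydantic validator fails cleanly
    (and Instructor retries) instead of crashing the request."""
    try:
        r = httpx.post(
            "https://sourcescore.org/api/v1/verify",
            json={"claim": claim, "minConfidence": min_confidence},
            timeout=5.0,
        )
        r.raise_for_status()  # surface HTTP 4xx/5xx explicitly
    except httpx.HTTPError as exc:
        raise ValueError(f"Verification service unavailable: {exc}") from exc
    return r.json()
```

Raising ValueError keeps the failure inside Pydantic's validation flow, so Instructor treats an outage as a retryable validation error rather than an application crash.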
```python
# Use it:
result = client.chat.completions.create(
    model="gpt-4o",
    response_model=ClaimAnswer,
    messages=[
        {"role": "user", "content": "When was Llama 3.1 released?"},
    ],
    max_retries=3,  # Instructor retries on validation failure
)

print(result.claim)       # "Llama 3.1 release date"
print(result.answer)      # "2024-07-23"
print(result.source_url)  # "https://sourcescore.org/api/v1/claims/.../"
print(result.confidence)  # 1.0
```
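Because the validator runs on every parse attempt, identical claim strings (a retry that keeps the same phrasing, or duplicates across the list pattern shown next) trigger repeat HTTP calls. A memoization sketch over the hypothetical veritas_verify helper from earlier:

```python
from functools import lru_cache


@lru_cache(maxsize=256)
def cached_veritas_verify(claim: str) -> dict:
    # One round-trip per unique claim string; duplicate claims across a
    # response (or across parse attempts) are served from the cache.
    return veritas_verify(claim)  # hypothetical helper sketched above
```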
Pattern: list of verified claims
For research-assistant agents that produce multiple claims, extract a list of verified-claim objects:
```python
from typing import List

import httpx
from pydantic import BaseModel, field_validator


class VerifiedClaim(BaseModel):
    statement: str
    source_url: str
    confidence: float

    @field_validator("source_url", mode="before")
    @classmethod
    def verify(cls, v, info):
        # `statement` is declared before `source_url`, so it is already
        # present in info.data when this validator runs.
        statement = info.data.get("statement", "")
        r = httpx.post(
            "https://sourcescore.org/api/v1/verify",
            json={"claim": statement, "minConfidence": 0.85},
            timeout=5.0,
        )
        result = r.json()
        match = result.get("bestMatch")
        if not match or match["confidence"] < 0.85:
            raise ValueError(f"Unverified claim: {statement!r}")
        return match["detailUrl"]
```
```python
class ResearchSummary(BaseModel):
    topic: str
    summary: str
    key_claims: List[VerifiedClaim]


result = client.chat.completions.create(
    model="gpt-4o",
    response_model=ResearchSummary,
    messages=[
        {"role": "user", "content": "Summarize the foundational papers behind modern LLMs."},
    ],
    max_retries=3,
)

# result is a fully typed ResearchSummary;
# every key_claims entry was verified by SourceScore before parsing succeeded.
for c in result.key_claims:
    print(f"{c.statement} — {c.source_url} (conf: {c.confidence})")
```
Why this pattern beats free-text + post-hoc verification
- Validation happens at parse time. Failed verification triggers Instructor's retry mechanism with the original prompt, so the model gets to self-correct before the user ever sees a response.
- Type safety at the application boundary. Downstream code receives a typed Pydantic object and can't accidentally render an unverified claim, because the field is never populated without verification.
- No regex extraction. Free-text plus post-hoc verification needs heuristic claim extraction, which fails on multi-clause sentences; the Instructor-validated approach extracts claims at structured-output time.
- Retries are automatic. max_retries=3 means up to three attempts at a verifiable response before failing, tunable per call (see the backoff sketch after this list).
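Instructor also accepts a tenacity retry controller in place of an integer, which lets you add backoff between attempts. A sketch under that assumption; the backoff numbers are illustrative:

```python
# Assumption: Instructor accepts a tenacity Retrying object for max_retries.
from tenacity import Retrying, stop_after_attempt, wait_exponential

result = client.chat.completions.create(
    model="gpt-4o",
    response_model=ClaimAnswer,
    messages=[{"role": "user", "content": "When was Llama 3.1 released?"}],
    max_retries=Retrying(
        stop=stop_after_attempt(3),                      # same cap as max_retries=3
        wait=wait_exponential(multiplier=0.5, max=4.0),  # back off between attempts
    ),
)
```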
When this pattern fits
- Production AI/ML research assistants where every cited fact needs verification
- Documentation chatbots that summarize technical content
- Internal company knowledge tools where the LLM cites verified-only facts
- Citation-heavy reports or briefs where unverified claims are unacceptable
Comparison: Instructor vs Pydantic AI
Both libraries solve the "typed LLM outputs" problem. Differences:
- Instructor — older, larger ecosystem, supports more LLM providers, no built-in tool-calling abstractions. Pair-with-anything design.
- Pydantic AI — newer, agent-loop-aware, native tool registration, type-safety end-to-end. More opinionated; bigger framework.
Pick Instructor for one-shot structured-output use cases. Pick Pydantic AI for agent loops with multiple tool calls. Both work with VERITAS the same way: ClaimAnswer is plain Pydantic, so its validator fires no matter which library parses the output.
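For illustration, a minimal Pydantic AI sketch reusing the ClaimAnswer model above. This assumes a recent pydantic-ai release where Agent takes output_type and results expose .output (older releases used result_type and .data):

```python
# Sketch only: assumes a recent pydantic-ai API (output_type / .output).
# The ClaimAnswer validator runs unchanged, and validation errors feed
# Pydantic AI's own retry loop (capped here by retries=3).
from pydantic_ai import Agent

agent = Agent("openai:gpt-4o", output_type=ClaimAnswer, retries=3)
run = agent.run_sync("When was Llama 3.1 released?")
print(run.output.source_url)
```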
Next steps
- Pydantic AI guide — agent-loop variant
- Research citation use case — Instructor-shape patterns
- Playground — try /verify before wiring it up
- OpenAPI 3.1 spec
- Catalog — 206 verified AI/ML claims