Instructor + SourceScore VERITAS

Instructor (Jason Liu's library) is the canonical pattern for getting typed structured outputs from LLMs. Pair it with VERITAS to produce structured responses in which every cited fact resolves to a verified envelope; unverified claims are caught by Pydantic validators before the response ever reaches the user.

Installation

pip install instructor httpx

Pattern: typed claim with verified-source field

Define a Pydantic model where the LLM populates structured fields including a source_url field validated against a VERITAS lookup. The validator runs at response-parsing time; if VERITAS doesn't verify, the response fails and Instructor retries.

from pydantic import BaseModel, model_validator
from openai import OpenAI
import instructor
import httpx

client = instructor.from_openai(OpenAI())

class ClaimAnswer(BaseModel):
    """LLM response with a verified claim."""
    claim: str
    answer: str
    source_url: str | None = None
    confidence: float = 0.0

    @model_validator(mode="after")
    def verify_with_veritas(self) -> "ClaimAnswer":
        """Look up the claim in SourceScore VERITAS; populate source_url + confidence."""
        r = httpx.post(
            "https://sourcescore.org/api/v1/verify",
            json={"claim": self.claim, "minConfidence": 0.85},
            timeout=5.0,
        )
        r.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
        result = r.json()
        match = result.get("bestMatch")
        if match and match["confidence"] >= 0.85:
            self.source_url = match["detailUrl"]
            self.confidence = match["confidence"]
        else:
            # Raising here fails validation; Instructor feeds this message
            # back to the LLM and retries with it as context
            raise ValueError(
                f"Claim '{self.claim}' could not be verified. "
                "Please rephrase using a more specific fact."
            )
        return self

# Use it:
result = client.chat.completions.create(
    model="gpt-4o",
    response_model=ClaimAnswer,
    messages=[
        {"role": "user", "content": "When was Llama 3.1 released?"},
    ],
    max_retries=3,  # Instructor retries on validation failure
)

print(result.claim)        # "Llama 3.1 release date"
print(result.answer)       # "2024-07-23"
print(result.source_url)   # "https://sourcescore.org/api/v1/claims/.../"
print(result.confidence)   # 1.0

Pattern: list of verified claims

For research-assistant agents that produce multiple claims, extract a list of verified-claim objects:

from typing import List
from pydantic import BaseModel, field_validator
import httpx

class VerifiedClaim(BaseModel):
    statement: str
    source_url: str
    confidence: float

    @field_validator("source_url", mode="before")
    @classmethod
    def verify(cls, v, info):
        statement = info.data.get("statement", "")
        r = httpx.post(
            "https://sourcescore.org/api/v1/verify",
            json={"claim": statement, "minConfidence": 0.85},
            timeout=5.0,
        )
        r.raise_for_status()
        result = r.json()
        match = result.get("bestMatch")
        if not match or match["confidence"] < 0.85:
            raise ValueError(f"Unverified claim: {statement!r}")
        return match["detailUrl"]

class ResearchSummary(BaseModel):
    topic: str
    summary: str
    key_claims: List[VerifiedClaim]

result = client.chat.completions.create(
    model="gpt-4o",
    response_model=ResearchSummary,
    messages=[
        {"role": "user", "content": "Summarize the foundational papers behind modern LLMs."},
    ],
    max_retries=3,
)

# result is a fully-typed ResearchSummary
# every key_claims entry was verified by SourceScore before parsing succeeded
for c in result.key_claims:
    print(f"{c.statement} — {c.source_url} (conf: {c.confidence})")

Why this pattern beats free-text + post-hoc verification

  • Validation happens at parse-time. Failed verification raises inside the validator; Instructor feeds the error message back to the model and retries, so the model can self-correct before the user ever sees a response.
  • Type-safety at the application boundary. Downstream code receives a typed Pydantic object and can't accidentally render an unverified claim, because the field is never populated without verification.
  • No regex extraction. Free-text plus post-hoc verification needs heuristic claim extraction, which fails on multi-clause sentences. The Instructor-validated approach extracts claims at structured-output time.
  • Retries are automatic. max_retries=3 allows three attempts at a verifiable response before failing, and is tunable per call.
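The retry mechanics can be seen offline: a ValueError raised inside a validator surfaces as a pydantic ValidationError, which is exactly what Instructor catches and feeds back to the model. A minimal sketch, where the hypothetical retry_parse stands in for Instructor's internal loop and a hard-coded string check stands in for the VERITAS lookup:

```python
from pydantic import BaseModel, ValidationError, model_validator

class Claim(BaseModel):
    text: str
    verified: bool = False

    @model_validator(mode="after")
    def check(self) -> "Claim":
        # Stand-in for the VERITAS lookup: only one phrasing "verifies".
        if self.text != "Llama 3.1 was released on 2024-07-23":
            raise ValueError(f"Unverified claim: {self.text!r}")
        self.verified = True
        return self

def retry_parse(candidate_outputs, max_retries=3):
    """Mimic Instructor's loop: try successive model outputs until one
    validates or attempts run out. The ValidationError caught here is
    what Instructor appends to the conversation on each retry."""
    last_error = None
    for raw in candidate_outputs[:max_retries]:
        try:
            return Claim(text=raw)
        except ValidationError as err:
            last_error = err
    raise last_error

claim = retry_parse([
    "Llama 3.1 came out sometime in 2024",   # fails validation, retried
    "Llama 3.1 was released on 2024-07-23",  # passes
])
print(claim.verified)  # → True
```

In production the successive "candidate outputs" come from fresh LLM completions that have seen the previous validation error, which is why a specific, actionable error message in the validator matters.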

When this pattern fits

  • Production AI/ML research assistants where every cited fact needs verification
  • Documentation chatbots that summarize technical content
  • Internal company knowledge tools where the LLM cites verified-only facts
  • Citation-heavy reports or briefs where unverified claims are unacceptable

Comparison: Instructor vs Pydantic AI

Both libraries solve the "typed LLM outputs" problem. Differences:

  • Instructor — older, larger ecosystem, supports more LLM providers, no built-in tool-calling abstractions. Pair-with-anything design.
  • Pydantic AI — newer, agent-loop-aware, native tool registration, type-safety end-to-end. More opinionated; bigger framework.

Pick Instructor for one-shot structured-output use cases. Pick Pydantic AI for agent loops with multiple tool calls. Both work with VERITAS the same way.

Next steps