Anthropic SDK + SourceScore VERITAS
Expose VERITAS as a Claude tool via the Anthropic SDK. When Claude needs to ground a factual claim, it emits a tool_use block calling verify_claim; you execute the API call and feed the result back as a tool_result; Claude composes the final answer with the verified data.
Installation
# Python
pip install anthropic httpx

# TypeScript / Node
npm install @anthropic-ai/sdk
Pattern: tool definition + agent loop (Python)
Claude's tool-use protocol is a multi-turn loop: model emits tool_use, you execute, you reply with tool_result, model composes final response. SourceScore's response envelope drops in as a tool_result content block verbatim.
import anthropic
import httpx
import json

client = anthropic.AsyncAnthropic()  # async client; picks up ANTHROPIC_API_KEY

tools = [
    {
        "name": "verify_claim",
        "description": (
            "Verify a natural-language factual claim about AI/ML "
            "research (model releases, papers, dates, parameter counts). "
            "Returns a verified-claim envelope with primary sources and "
            "HMAC signature, or no match if the claim isn't in the catalog."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "claim": {
                    "type": "string",
                    "description": "Natural-language claim to verify",
                },
                "min_confidence": {
                    "type": "number",
                    "description": "Minimum confidence threshold (0.0-1.0)",
                    "default": 0.85,
                },
            },
            "required": ["claim"],
        },
    },
]

async def execute_verify_claim(claim: str, min_confidence: float = 0.85) -> dict:
    """Call SourceScore VERITAS /verify endpoint."""
    async with httpx.AsyncClient() as http:
        r = await http.post(
            "https://sourcescore.org/api/v1/verify",
            json={"claim": claim, "minConfidence": min_confidence},
            timeout=5.0,
        )
        return r.json()
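
# For reference, a successful /verify response looks roughly like the
# sketch below. Illustrative only: best_match, confidence, and detail_url
# appear in the system-prompt example later in this guide; the remaining
# field names are assumptions, not a spec (see the OpenAPI reference
# under "Next steps").
#
#   {
#     "best_match": {
#       "claim": "Llama 3.1 was released on 2024-07-23",
#       "confidence": 0.97,
#       "detail_url": "https://sourcescore.org/claims/...",
#       "sources": [...]
#     },
#     "signature": "..."
#   }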
async def chat(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]
    while True:
        response = await client.messages.create(
            model="claude-opus-4-1",
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )

        # If Claude wants to use a tool, execute it
        if response.stop_reason == "tool_use":
            tool_use_block = next(
                b for b in response.content if b.type == "tool_use"
            )
            if tool_use_block.name == "verify_claim":
                result = await execute_verify_claim(
                    claim=tool_use_block.input["claim"],
                    min_confidence=tool_use_block.input.get("min_confidence", 0.85),
                )
            else:
                # Unknown tool: report an error so the loop can't stall
                result = {"error": f"unknown tool: {tool_use_block.name}"}

            # Continue the loop with the tool result
            messages.append({"role": "assistant", "content": response.content})
            messages.append({
                "role": "user",
                "content": [
                    {
                        "type": "tool_result",
                        "tool_use_id": tool_use_block.id,
                        "content": json.dumps(result),
                    }
                ],
            })
            continue

        # No more tool use; return Claude's final response
        return "".join(b.text for b in response.content if b.type == "text")
# Use it:
import asyncio
answer = asyncio.run(chat("When was Llama 3.1 released?"))
print(answer)
# → "Llama 3.1 was released on 2024-07-23, per the Meta AI announcement
# and the model card on Hugging Face."Pattern: TypeScript with the @anthropic-ai/sdk
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools: Anthropic.Tool[] = [
  {
    name: "verify_claim",
    description:
      "Verify a natural-language factual claim about AI/ML research. " +
      "Returns a verified-claim envelope with primary sources and HMAC signature.",
    input_schema: {
      type: "object",
      properties: {
        claim: { type: "string" },
        min_confidence: { type: "number", default: 0.85 },
      },
      required: ["claim"],
    },
  },
];
async function executeVerifyClaim(claim: string, minConfidence = 0.85) {
  const r = await fetch("https://sourcescore.org/api/v1/verify", {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ claim, minConfidence }),
    signal: AbortSignal.timeout(5_000), // match the 5 s timeout in the Python example
  });
  return r.json();
}
async function chat(userMessage: string): Promise<string> {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userMessage },
  ];

  while (true) {
    const response = await client.messages.create({
      model: "claude-opus-4-1",
      max_tokens: 1024,
      tools,
      messages,
    });

    if (response.stop_reason === "tool_use") {
      const toolUseBlock = response.content.find(
        (b): b is Anthropic.ToolUseBlock => b.type === "tool_use",
      );
      if (!toolUseBlock) break;

      if (toolUseBlock.name === "verify_claim") {
        const result = await executeVerifyClaim(
          (toolUseBlock.input as { claim: string }).claim,
          (toolUseBlock.input as { min_confidence?: number }).min_confidence ?? 0.85,
        );
        messages.push({ role: "assistant", content: response.content });
        messages.push({
          role: "user",
          content: [
            {
              type: "tool_result",
              tool_use_id: toolUseBlock.id,
              content: JSON.stringify(result),
            },
          ],
        });
        continue;
      }
    }

    return response.content
      .filter((b): b is Anthropic.TextBlock => b.type === "text")
      .map((b) => b.text)
      .join("");
  }

  return "";
}
const answer = await chat("When was Llama 3.1 released?");
console.log(answer);

When Claude self-corrects with verify_claim
One useful pattern: a system prompt that instructs Claude to verify any AI/ML factual claim before including it in a response. Claude will autonomously decide to call verify_claim mid-reasoning, then either confirm or correct its initial assertion.
system_prompt = """You are a research assistant for AI/ML topics. CRITICAL: When you make ANY factual claim about an AI model, paper, release date, parameter count, or architecture decision — you MUST verify it via the verify_claim tool BEFORE including it in your response. If verify_claim returns best_match with confidence >= 0.85, cite the detail_url in your response. If best_match is null OR confidence < 0.85, explicitly mark the assertion as "unverified" in your response. NEVER assert a release date or parameter count without first calling verify_claim. The cost of being wrong is higher than the latency of the API call."""
With this system prompt, Claude grounds itself: the downstream application doesn't need to extract and verify claims separately, because Claude does it inline.
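Wiring this in is one extra parameter on the agent loop: pass the prompt via system on messages.create. A minimal sketch, reusing client, tools, execute_verify_claim, and json from the Python pattern above (grounded_chat is just an illustrative name):

async def grounded_chat(user_message: str) -> str:
    """Same loop as chat() above, plus the self-verification system prompt."""
    messages = [{"role": "user", "content": user_message}]
    while True:
        response = await client.messages.create(
            model="claude-opus-4-1",
            max_tokens=1024,
            system=system_prompt,  # Claude now verifies before asserting
            tools=tools,
            messages=messages,
        )
        if response.stop_reason != "tool_use":
            return "".join(b.text for b in response.content if b.type == "text")

        block = next(b for b in response.content if b.type == "tool_use")
        result = await execute_verify_claim(**block.input)  # claim, min_confidence
        messages.append({"role": "assistant", "content": response.content})
        messages.append({
            "role": "user",
            "content": [{
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": json.dumps(result),
            }],
        })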
When this pattern fits
- Conversational AI/ML research assistants — where users ask factual questions and you want the model to ground itself
- Documentation chatbots over AI/ML knowledge — internal team support tools, public-facing FAQs
- Citation-required production systems — papers, technical reports, audit trails
- Multi-step agentic flows where one step is "look up a fact"
Next steps
- OpenAI tool-calls guide — the parallel pattern for GPT-4
- Pydantic AI guide — typed-tool pattern with validators
- Playground — try /verify before wiring it up
- OpenAPI 3.1 spec — full endpoint reference
- Catalog — 206 verified AI/ML claims