Anthropic SDK + SourceScore VERITAS

Expose VERITAS as a Claude tool via the Anthropic SDK. When Claude needs to ground a factual claim, it emits a tool_use block calling verify_claim; you execute the API call and feed the result back as a tool_result; Claude composes the final answer with the verified data.

Installation

# Python
pip install anthropic httpx

# TypeScript / Node
npm install @anthropic-ai/sdk

Pattern: tool definition + agent loop (Python)

Claude's tool-use protocol is a multi-turn loop: the model emits a tool_use block, you execute the call, you reply with a tool_result, and the model composes the final response. SourceScore's response envelope drops straight into the tool_result content, no transformation needed.
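
Before the code, here is roughly what a /verify envelope looks like. This is an illustrative sketch, not SourceScore's published schema: the field names (best_match, confidence, detail_url, sources, signature) are inferred from the tool description and the system prompt later in this section.

example_envelope = {
    "best_match": {                      # null when the claim isn't in the catalog
        "claim": "Llama 3.1 was released on 2024-07-23",
        "confidence": 0.97,              # assumed 0.0-1.0 scale, matching min_confidence
        "detail_url": "https://sourcescore.org/claims/...",   # hypothetical path
        "sources": [
            {"title": "Meta AI announcement", "url": "https://ai.meta.com/..."},
        ],
    },
    "signature": "hmac-sha256:...",      # tamper evidence; verify out of band if needed
}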

import anthropic
import httpx
import json

client = anthropic.AsyncAnthropic()  # picks up ANTHROPIC_API_KEY; async client to match the httpx call below

tools = [
    {
        "name": "verify_claim",
        "description": (
            "Verify a natural-language factual claim about AI/ML "
            "research (model releases, papers, dates, parameter counts). "
            "Returns a verified-claim envelope with primary sources and "
            "HMAC signature, or no match if the claim isn't in the catalog."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "claim": {
                    "type": "string",
                    "description": "Natural-language claim to verify",
                },
                "min_confidence": {
                    "type": "number",
                    "description": "Minimum confidence threshold (0.0-1.0)",
                    "default": 0.85,
                },
            },
            "required": ["claim"],
        },
    },
]

async def execute_verify_claim(claim: str, min_confidence: float = 0.85) -> dict:
    """Call SourceScore VERITAS /verify endpoint."""
    async with httpx.AsyncClient() as http:
        r = await http.post(
            "https://sourcescore.org/api/v1/verify",
            json={"claim": claim, "minConfidence": min_confidence},
            timeout=5.0,
        )
        r.raise_for_status()  # surface HTTP errors instead of parsing an error body
        return r.json()

async def chat(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = await client.messages.create(
            model="claude-opus-4-7",
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )

        # If Claude wants to use a tool, execute it
        if response.stop_reason == "tool_use":
            tool_use_block = next(
                b for b in response.content if b.type == "tool_use"
            )

            if tool_use_block.name == "verify_claim":
                result = await execute_verify_claim(
                    claim=tool_use_block.input["claim"],
                    min_confidence=tool_use_block.input.get("min_confidence", 0.85),
                )

                # Continue the loop with the tool result
                messages.append({"role": "assistant", "content": response.content})
                messages.append({
                    "role": "user",
                    "content": [
                        {
                            "type": "tool_result",
                            "tool_use_id": tool_use_block.id,
                            "content": json.dumps(result),
                        }
                    ],
                })
                continue

        # No more tool use; return Claude's final response
        return "".join(b.text for b in response.content if b.type == "text")

# Use it:
import asyncio
answer = asyncio.run(chat("When was Llama 3.1 released?"))
print(answer)
# → "Llama 3.1 was released on 2024-07-23, per the Meta AI announcement
#    and the model card on Hugging Face."
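
One caveat: the loop above handles only the first tool_use block in a turn. Claude can emit several tool_use blocks at once (parallel tool use), and each one must receive a matching tool_result in the next user message. A sketch of a more general loop body, reusing the names defined above:

# Inside the `while True` loop, replacing the single-block handling:
if response.stop_reason == "tool_use":
    tool_results = []
    for block in response.content:
        if block.type != "tool_use":
            continue
        if block.name == "verify_claim":
            result = await execute_verify_claim(
                claim=block.input["claim"],
                min_confidence=block.input.get("min_confidence", 0.85),
            )
        else:
            result = {"error": f"unknown tool: {block.name}"}
        tool_results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": json.dumps(result),
        })
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": tool_results})
    continue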

Pattern: TypeScript with the @anthropic-ai/sdk

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools: Anthropic.Tool[] = [
  {
    name: "verify_claim",
    description:
      "Verify a natural-language factual claim about AI/ML research. " +
      "Returns a verified-claim envelope with primary sources and HMAC signature.",
    input_schema: {
      type: "object",
      properties: {
        claim: { type: "string" },
        min_confidence: { type: "number", default: 0.85 },
      },
      required: ["claim"],
    },
  },
];

async function executeVerifyClaim(claim: string, minConfidence = 0.85) {
  const r = await fetch("https://sourcescore.org/api/v1/verify", {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ claim, minConfidence }),
    signal: AbortSignal.timeout(5_000), // mirror the Python example's 5s timeout (Node 18+)
  });
  if (!r.ok) throw new Error(`verify_claim request failed: ${r.status}`);
  return r.json();
}

async function chat(userMessage: string): Promise<string> {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userMessage },
  ];

  while (true) {
    const response = await client.messages.create({
      model: "claude-opus-4-7",
      max_tokens: 1024,
      tools,
      messages,
    });

    if (response.stop_reason === "tool_use") {
      const toolUseBlock = response.content.find(
        (b): b is Anthropic.ToolUseBlock => b.type === "tool_use",
      );
      if (!toolUseBlock) break;

      if (toolUseBlock.name === "verify_claim") {
        const result = await executeVerifyClaim(
          (toolUseBlock.input as { claim: string }).claim,
          (toolUseBlock.input as { min_confidence?: number }).min_confidence ?? 0.85,
        );

        messages.push({ role: "assistant", content: response.content });
        messages.push({
          role: "user",
          content: [
            {
              type: "tool_result",
              tool_use_id: toolUseBlock.id,
              content: JSON.stringify(result),
            },
          ],
        });
        continue;
      }
    }

    return response.content
      .filter((b): b is Anthropic.TextBlock => b.type === "text")
      .map((b) => b.text)
      .join("");
  }
  return "";
}

const answer = await chat("When was Llama 3.1 released?");
console.log(answer);

When Claude self-corrects with verify_claim

One useful pattern: a system prompt that instructs Claude to verify any AI/ML factual claim before including it in a response. Claude will autonomously decide to call verify_claim mid-reasoning, then either confirm or correct its initial assertion.

system_prompt = """You are a research assistant for AI/ML topics.

CRITICAL: When you make ANY factual claim about an AI model, paper,
release date, parameter count, or architecture decision — you MUST
verify it via the verify_claim tool BEFORE including it in your response.

If verify_claim returns best_match with confidence >= 0.85, cite the
detail_url in your response. If best_match is null OR confidence < 0.85,
explicitly mark the assertion as "unverified" in your response.

NEVER assert a release date or parameter count without first calling
verify_claim. The cost of being wrong is higher than the latency of
the API call."""

With this system prompt, Claude self-grounds: the downstream application doesn't need to extract and verify claims itself, because Claude does it inline.
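
Wiring this in is a one-line change to the agent loop above: pass the prompt via the system parameter of messages.create.

response = await client.messages.create(
    model="claude-opus-4-1",
    max_tokens=1024,
    system=system_prompt,   # the self-grounding instructions above
    tools=tools,
    messages=messages,
)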

When this pattern fits

  • Conversational AI/ML research assistants — where users ask factual questions and you want the model to ground itself
  • Documentation chatbots over AI/ML knowledge — internal team support tools, public-facing FAQs
  • Citation-required production systems — papers, technical reports, audit trails
  • Multi-step agentic flows where one step is "look up a fact"

Next steps