SourceScore
SourceScore VERITAS · verified claim · 100% confidence

SWE-bench, a software engineering benchmark built from GitHub issues, was introduced in Jimenez et al. 2024.

Subject
SWE-bench
Predicate
introduced_in
Object
Jimenez et al. 2024 — software engineering benchmark from GitHub issues
Primary source · preprint · 2023-10-10
SWE-bench: Can Language Models Resolve Real-World GitHub Issues? arXiv (Jimenez, Yang, Wettig, Yao, Pei, Press, Narasimhan / Princeton + Chicago)
Last verified 2026-05-16 · 2 sources · b16b5f5297e5f621