GPT-5.4 Thinking System Card
SWE-bench Verified is increasingly contaminated and mismeasures frontier coding progress. Our analysis finds flawed tests and training-data leakage. We recommend SWE-bench Pro instead.
Why we no longer evaluate SWE-bench Verified
GPT‑5.3-Codex is the most capable agentic coding model to date, combining the frontier coding performance of GPT‑5.2-Codex with the reasoning and professional knowledge capabilities of GPT‑5.2.
GPT-5.3-Codex System Card