Publication


Where the goblins came from

ai, AI (Artificial Intelligence), Artificial Intelligence, Publication

How goblin outputs spread in AI models: timeline, root cause, and fixes behind personality-driven quirks in GPT-5 behavior.


GPT-5.4 Thinking System Card



GPT-5.3 Instant System Card



Why we no longer evaluate SWE-bench Verified


SWE-bench Verified is increasingly contaminated and mismeasures frontier coding progress. Our analysis shows flawed tests and training leakage. We recommend SWE-bench Pro.


GPT-5.3-Codex System Card


GPT‑5.3-Codex is the most capable agentic coding model to date, combining the frontier coding performance of GPT‑5.2-Codex with the reasoning and professional knowledge capabilities of GPT‑5.2.
