Scaling laws for reward model overoptimization
OpenAI o3-mini System Card
This report outlines the safety work carried out for the OpenAI o3-mini model, including safety evaluations, external red teaming, and Preparedness Framework evaluations.
Introducing deep research
An agent that uses reasoning to synthesize large amounts of online information and complete multi-step research tasks for you. Available to Pro users today, with Plus and Team to follow.
Computer-Using Agent
A universal interface for AI to interact with the digital world.
Trading Inference-Time Compute for Adversarial Robustness
OpenAI o1 System Card
This report outlines the safety work carried out prior to releasing OpenAI o1 and o1-mini, including external red teaming and frontier risk evaluations according to our Preparedness Framework.
Advancing red teaming with people and AI
Introducing SimpleQA
A factuality benchmark called SimpleQA that measures the ability of language models to answer short, fact-seeking questions.
Simplifying, stabilizing, and scaling continuous-time consistency models
We’ve simplified, stabilized, and scaled continuous-time consistency models, achieving sample quality comparable to leading diffusion models while using only two sampling steps.