Safety & Alignment


Forecasting potential misuses of language models for disinformation campaigns and how to reduce risk

OpenAI researchers collaborated with Georgetown University’s Center for Security and Emerging Technology and the Stanford Internet Observatory to investigate how large language models might be misused for disinformation purposes. The collaboration included an October 2021 workshop bringing together 30 disinformation researchers, machine learning experts, and policy analysts, and culminated in a co-authored report building on …

Forecasting potential misuses of language models for disinformation campaigns and how to reduce risk Read More »

Operator System Card

Drawing on OpenAI’s established safety frameworks, this document highlights our multi-layered approach, including the model and product mitigations we’ve implemented to defend against prompt engineering and jailbreaks and to protect privacy and security. It also details our external red teaming efforts, safety evaluations, and ongoing work to further refine these safeguards.

Operator System Card Read More »