Safety & Alignment


Forecasting potential misuses of language models for disinformation campaigns and how to reduce risk

OpenAI researchers collaborated with Georgetown University’s Center for Security and Emerging Technology and the Stanford Internet Observatory to investigate how large language models might be misused for disinformation purposes. The collaboration included an October 2021 workshop bringing together 30 disinformation researchers, machine learning experts, and policy analysts, and culminated in a co-authored report building on …

Forecasting potential misuses of language models for disinformation campaigns and how to reduce risk Read More »

Operator System Card

Drawing on OpenAI’s established safety frameworks, this document highlights our multi-layered approach, including the model and product mitigations we’ve implemented to defend against prompt engineering and jailbreaks and to protect privacy and security. It also details our external red teaming efforts, safety evaluations, and ongoing work to further refine these safeguards.

Operator System Card Read More »