Adversarial attacks on neural network policies

AI safety needs social scientists
We’ve written a paper arguing that long-term AI safety research needs social scientists to ensure AI alignment algorithms succeed when actual humans are involved. Properly aligning advanced AI systems with human values requires resolving many uncertainties related to the psychology of human rationality, emotion, and biases. The aim of this paper is to spark further …

Learning complex goals with iterated amplification
We’re proposing an AI safety technique called iterated amplification that lets us specify complicated behaviors and goals that are beyond human scale, by demonstrating how to decompose a task into simpler sub-tasks, rather than by providing labeled data or a reward function. Although this idea is in its very early stages and we have only …
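
To make the decomposition idea concrete, here is a minimal sketch of one amplification step, assuming a recursive question-answering setting; `amplify`, `decompose`, `base_answer`, and `combine` are hypothetical placeholders for illustration, not code from the paper.

```python
# Illustrative sketch only: a hard question is decomposed into easier
# sub-questions, each answered at a lower recursion depth, and the
# sub-answers are combined into a final answer.

def amplify(question: str, depth: int) -> str:
    """Answer `question` by recursively solving simpler sub-questions."""
    if depth == 0:
        return base_answer(question)  # fall back to the unamplified model
    sub_questions = decompose(question)
    sub_answers = [amplify(q, depth - 1) for q in sub_questions]
    return combine(question, sub_answers)

def base_answer(question: str) -> str:
    return f"<answer to: {question}>"  # stands in for a learned model

def decompose(question: str) -> list:
    # Stands in for a human-demonstrated decomposition of the task.
    return [f"part 1 of {question}", f"part 2 of {question}"]

def combine(question: str, sub_answers: list) -> str:
    # Stands in for aggregating sub-answers into a final answer.
    return " and ".join(sub_answers)

print(amplify("plan a citywide transit schedule", depth=2))
```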

Improving language understanding with unsupervised learning
We’ve obtained state-of-the-art results on a suite of diverse language tasks with a scalable, task-agnostic system, which we’re also releasing. Our approach is a combination of two existing ideas: transformers and unsupervised pre-training. These results provide a convincing example that pairing supervised learning methods with unsupervised pre-training works very well; this is an idea that many have explored …
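
A schematic sketch of the two-stage recipe the excerpt describes: unsupervised pre-training on unlabeled text, then supervised fine-tuning of the same weights on a labeled task. `TransformerLM`, `lm_loss`, and `task_loss` are hypothetical stand-ins, not the released system.

```python
# Illustrative sketch only: the stubs below stand in for a real model,
# a next-token prediction loss, and a downstream task loss.

class TransformerLM:
    """Placeholder for a transformer language model."""
    def update(self, loss):
        pass  # one gradient step in a real implementation

def lm_loss(model, text_batch):
    return 0.0  # next-token prediction loss (no labels needed)

def task_loss(model, inputs, label):
    return 0.0  # supervised loss for the downstream task

def pretrain(model, unlabeled_corpus):
    # Stage 1: unsupervised pre-training on raw text.
    for text_batch in unlabeled_corpus:
        model.update(lm_loss(model, text_batch))
    return model

def finetune(model, labeled_examples):
    # Stage 2: supervised fine-tuning reuses the pre-trained weights,
    # so far less labeled data is needed than training from scratch.
    for inputs, label in labeled_examples:
        model.update(task_loss(model, inputs, label))
    return model

model = finetune(pretrain(TransformerLM(), ["raw text ..."]),
                 [("premise/hypothesis", "entailment")])
```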

AI safety via debate
We’re proposing an AI safety technique which trains agents to debate topics with one another, using a human to judge who wins.
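
The sketch below illustrates one round-based version of that setup: two agents alternate short arguments and a human judge picks the winner, whose decision supplies the training signal. `Agent` and `human_judge` are hypothetical placeholders, not code from the paper.

```python
# Illustrative sketch only: a zero-sum debate game judged by a human.

class Agent:
    def argue(self, transcript):
        return "<argument>"  # produced by a learned policy in a real system
    def reward(self, r):
        pass  # reinforcement-learning update in a real system

def human_judge(transcript):
    return "A"  # a real human judgment in practice

def run_debate(question, agent_a, agent_b, rounds=3):
    transcript = [("question", question)]
    for _ in range(rounds):
        transcript.append(("A", agent_a.argue(transcript)))
        transcript.append(("B", agent_b.argue(transcript)))
    winner = human_judge(transcript)  # "A" or "B"
    # Zero-sum reward: the hope is that the strongest strategy in this
    # game is honest, informative argument.
    agent_a.reward(+1 if winner == "A" else -1)
    agent_b.reward(+1 if winner == "B" else -1)
    return winner

print(run_debate("Which city should the family visit?", Agent(), Agent()))
```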

Preparing for malicious uses of AI
We’ve co-authored a paper that forecasts how malicious actors could misuse AI technology and outlines ways we can prevent and mitigate these threats. This paper is the outcome of almost a year of sustained work with our colleagues at the Future of Humanity Institute, the Centre for the Study of Existential Risk, the Center for …

OpenAI safety practices
Artificial general intelligence has the potential to benefit nearly every aspect of our lives, so it must be developed and deployed responsibly.

How OpenAI is approaching 2024 worldwide elections
We’re working to prevent abuse, provide transparency on AI-generated content, and improve access to accurate voting information.

Democratic inputs to AI grant program: lessons learned and implementation plans
We funded 10 teams from around the world to design ideas and tools to collectively govern AI. We summarize the innovations, outline our learnings, and call for researchers and engineers to join us as we continue this work.