training

LLMs believe false statements even after explicit warnings that they’re false

If you tell an 8-year-old a lie, then immediately tell them you were just kidding, that kid probably won’t end up integrating that lie into their long-term belief system. But new research on so-called “negation neglect” finds that LLMs have a robust tendency to accept false or fictitious statements even when they are clearly and […]

Anthropic blames dystopian sci-fi for training AI models to act “evil”

Those with an interest in the concept of AI alignment (i.e., getting AIs to stick to human-authored ethical rules) may remember when Anthropic claimed its Opus 4 model resorted to blackmail to stay online in a theoretical testing scenario last year. Now, Anthropic says it thinks this “misalignment” was primarily the result of training on

Meta will use employee-tracking software to help train AI agents: Report

Meta will begin tracking the mouse movements, clicks, and keystrokes of its US employees to generate high-quality training data for future AI agents, Reuters reports. The news organization cites internal memos posted by the Meta Superintelligence Labs team in reporting on the new Model Capability Initiative employee-tracking software. That software will operate on specific work-related

From folding boxes to fixing vacuums, GEN-1 robotics model hits 99% reliability

Robotic machine learning company Generalist has announced GEN-1, a new physical AI system that it says “crosses into production-level success rates” on “a broad range of physical skills” that used to require the dexterity and muscle memory of human hands. Generalist is also touting the new model’s ability to respond to disruptions by improvising new

Figuring out why AIs get flummoxed by some games

With its Alpha series of game-playing AIs, Google’s DeepMind group seemed to have found a way for its AIs to tackle any game, mastering games like chess and Go by repeatedly playing themselves during training. But then some odd things happened as people started identifying Go positions that would lose against relative newcomers to the

Large genome model: Open source AI trained on trillions of bases

Late in 2025, we covered the development of an AI system called Evo that was trained on massive numbers of bacterial genomes. So many that, when prompted with sequences from a cluster of related genes, it could correctly identify the next one or suggest a completely novel protein. That system worked because bacteria tend to

Anthropic: Claude faces ‘industrial-scale’ AI model distillation

Anthropic has detailed three “industrial-scale” AI model distillation campaigns by overseas labs designed to extract abilities from Claude. These competitors generated over 16 million exchanges using approximately 24,000 deceptive accounts. Their goal was to acquire proprietary logic to improve their competing platforms. The extraction technique, known as distillation, involves training a weaker system on the
