MIT-IBM Watson AI Lab

Guided learning lets “untrainable” neural networks realize their potential

Even networks long considered “untrainable” can learn effectively with a bit of a helping hand. Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have shown that a brief period of alignment between neural networks, a method they call guidance, can dramatically improve the performance of architectures previously thought unsuitable for modern tasks. Their findings […]

A new way to increase the capabilities of large language models

Most languages rely on word order and sentence structure to convey meaning. For example, “The cat sat on the box” is not the same as “The box was on the cat.” Over a long text, like a financial document or a novel, the syntax of these words likely evolves. Similarly, a person might be tracking variables in […]

Enabling small language models to solve complex reasoning tasks

As language models (LMs) improve at tasks like image generation, trivia questions, and simple math, you might think that human-like reasoning is around the corner. In reality, they still trail us by a wide margin on complex tasks. Try playing Sudoku with one, for instance, where you fill in numbers one through nine in such […]

A smarter way for large language models to think about hard problems

To make large language models (LLMs) more accurate when answering harder questions, researchers can let the model spend more time thinking about potential solutions. But common approaches that give LLMs this capability set a fixed computational budget for every problem, regardless of how complex it is. This means the LLM might waste computational resources on simpler questions […]

Charting the future of AI, from safer answers to faster thinking

Adoption of new tools and technologies occurs when users largely perceive them as reliable, accessible, and an improvement over the available methods and workflows for the cost. Five PhD students from the inaugural class of the MIT-IBM Watson AI Lab Summer Program are utilizing state-of-the-art resources, alleviating AI pain points, and creating new features and […]

Method teaches generative AI models to locate personalized objects

Say a person takes their French Bulldog, Bowser, to the dog park. Identifying Bowser as he plays among the other canines is easy for the dog owner to do while onsite. But if someone wants to use a generative AI model like GPT-5 to monitor their pet while they are at work, the model could fail at […]

Creating AI that matters

When it comes to artificial intelligence, MIT and IBM were there at the beginning: laying foundations, creating some of the first programs — AI predecessors — and theorizing how machine “intelligence” might come to be. Today, collaborations like the MIT-IBM Watson AI Lab, which launched eight years ago, are continuing to deliver expertise for […]

How to build AI scaling laws for efficient LLM training and budget maximization

When researchers build large language models (LLMs), they aim to maximize performance under a particular computational and financial budget. Since training a model can cost millions of dollars, developers need to be judicious with cost-impacting decisions about, for instance, the model architecture, optimizers, and training datasets before committing to a model. To anticipate […]