MIT-IBM Watson AI Lab

Guided learning lets “untrainable” neural networks realize their potential

Even networks long considered “untrainable” can learn effectively with a bit of a helping hand. Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have shown that a brief period of alignment between neural networks, a method they call guidance, can dramatically improve the performance of architectures previously thought unsuitable for modern tasks. Their findings […]

A new way to increase the capabilities of large language models

Most languages rely on word order and sentence structure to convey meaning. For example, “The cat sat on the box” is not the same as “The box was on the cat.” Over a long text, like a financial document or a novel, the syntax of these words likely evolves. Similarly, a person might be tracking variables in […]

Enabling small language models to solve complex reasoning tasks

As language models (LMs) improve at tasks like image generation, trivia questions, and simple math, you might think that human-like reasoning is around the corner. In reality, they still trail us by a wide margin on complex tasks. Try playing Sudoku with one, for instance, where you fill in numbers one through nine in such […]

A smarter way for large language models to think about hard problems

To make large language models (LLMs) more accurate when answering harder questions, researchers can let the model spend more time thinking about potential solutions. But common approaches that give LLMs this capability set a fixed computational budget for every problem, regardless of how complex it is. This means the LLM might waste computational resources on simpler questions […]

Charting the future of AI, from safer answers to faster thinking

Adoption of new tools and technologies occurs when users largely perceive them as reliable, accessible, and an improvement over the available methods and workflows for the cost. Five PhD students from the inaugural class of the MIT-IBM Watson AI Lab Summer Program are utilizing state-of-the-art resources, alleviating AI pain points, and creating new features and […]

Method teaches generative AI models to locate personalized objects

Say a person takes their French Bulldog, Bowser, to the dog park. Identifying Bowser as he plays among the other canines is easy for the dog owner to do while onsite. But if someone wants to use a generative AI model like GPT-5 to monitor their pet while they are at work, the model could fail at […]

Creating AI that matters

When it comes to artificial intelligence, MIT and IBM were there at the beginning: laying foundations, creating some of the first programs — AI predecessors — and theorizing how machine “intelligence” might come to be. Today, collaborations like the MIT-IBM Watson AI Lab, which launched eight years ago, are continuing to deliver expertise for […]

How to build AI scaling laws for efficient LLM training and budget maximization

When researchers build large language models (LLMs), they aim to maximize performance under a particular computational and financial budget. Since training a model can cost millions of dollars, developers need to be judicious with cost-impacting decisions about, for instance, the model architecture, optimizers, and training datasets before committing to a model. To anticipate […]