Machine Learning

NVIDIA AI Releases Nemotron 3: A Hybrid Mamba Transformer MoE Stack for Long Context Agentic AI

NVIDIA has released the Nemotron 3 family of open models as part of a full stack for agentic AI, including model weights, datasets and reinforcement learning tools. The family comes in three sizes, Nano, Super and Ultra, and targets multi-agent systems that need long-context reasoning with tight control over inference cost. Nano has about […]

Google Introduces T5Gemma 2: Encoder Decoder Models with Multimodal Inputs via SigLIP and 128K Context

Google has released T5Gemma 2, a family of open encoder-decoder Transformer checkpoints built by adapting Gemma 3 pretrained weights into an encoder-decoder layout, then continuing pretraining with the UL2 objective. The release is pretrained only, intended for developers to post-train for specific tasks, and Google explicitly notes it is not releasing post-trained or IT checkpoints

Unsloth AI and NVIDIA are Revolutionizing Local LLM Fine-Tuning: From RTX Desktops to DGX Spark

Fine-tune popular AI models faster with Unsloth on NVIDIA RTX AI PCs, from GeForce RTX desktops and laptops to RTX PRO workstations and the new DGX Spark, to build personalized assistants for coding, creative work, and complex agentic workflows. The landscape of modern AI is shifting. We are moving away from a total reliance

Guided learning lets “untrainable” neural networks realize their potential

Even networks long considered “untrainable” can learn effectively with a bit of a helping hand. Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have shown that a brief period of alignment between neural networks, a method they call guidance, can dramatically improve the performance of architectures previously thought unsuitable for modern tasks. Their findings

A new way to increase the capabilities of large language models

Most languages use word position and sentence structure to extract meaning. For example, “The cat sat on the box,” is not the same as “The box was on the cat.” Over a long text, like a financial document or a novel, the syntax of these words likely evolves. Similarly, a person might be tracking variables in

OpenAI’s new ChatGPT image generator makes faking photos easy

For most of photography’s roughly 200-year history, altering a photo convincingly required either a darkroom, some Photoshop expertise, or, at minimum, a steady hand with scissors and glue. On Tuesday, OpenAI released a tool that reduces the process to typing a sentence. It’s not the first company to do so. While OpenAI had a conversational

The afterparty: Hyperparameter autotuning revisited

In my first article on hyperparameter autotuning, I used a cake analogy to show how to use hyperparameter autotuning with Optuna and the sasviya.ml package in Python to improve detecting Higgs bosons in a particle accelerator. SAS Viya Workbench now supports hyperparameter autotuning in SAS code with a variety of […]

A “scientific sandbox” lets researchers explore the evolution of vision systems

Why did humans evolve the eyes we have today? While scientists can’t go back in time to study the environmental pressures that shaped the evolution of the diverse vision systems that exist in nature, a new computational framework developed by MIT researchers allows them to explore this evolution in artificial intelligence agents. The framework they developed, in

Thinking Machines Lab Makes Tinker Generally Available: Adds Kimi K2 Thinking And Qwen3-VL Vision Input

Thinking Machines Lab has moved its Tinker training API into general availability and added three major capabilities: support for the Kimi K2 Thinking reasoning model, OpenAI-compatible sampling, and image input through Qwen3-VL vision language models. For AI engineers, this turns Tinker into a practical way to fine-tune frontier models without building distributed training

“Robot, make me a chair”

Computer-aided design (CAD) systems are tried-and-true tools used to design many of the physical objects we use each day. But CAD software requires extensive expertise to master, and many tools incorporate such a high level of detail they don’t lend themselves to brainstorming or rapid prototyping. In an effort to make design faster and more accessible
