Staff

Auto Added by WPeMatico

Google DeepMind Releases Gemma 4 12B: An Encoder-Free Multimodal Model with Native audio that runs on a 16 GB laptop

Google DeepMind just released Gemma 4 12B, a dense multimodal model that strips out traditional encoders entirely. Vision and audio flow straight into the LLM backbone. The result is a model that runs agentic workflows on a consumer laptop with 16 GB of RAM. It ships under the Apache 2.0 license. Model Overview & Access […]

Google DeepMind Releases Gemma 4 12B: An Encoder-Free Multimodal Model with Native audio that runs on a 16 GB laptop Read More »

Nous Research Releases Hermes Desktop: A Native Cross-Platform Front End for Hermes Agent v0.15.2 with Streaming Tool Output

Nous Research has released Hermes Desktop in public preview. It is a native application for macOS, Windows, and Linux. It gives the open-source Hermes Agent a graphical interface. Until now, users ran Hermes through a CLI and messaging gateways. The current build is Hermes Agent v0.15.2. Per Nous Research’s documentation, the desktop reuses the same

Nous Research Releases Hermes Desktop: A Native Cross-Platform Front End for Hermes Agent v0.15.2 with Streaming Tool Output Read More »

NVIDIA Releases Cosmos 3: A Two-Tower Mixture-of-Transformers Foundation Model Unifying Physical Reasoning, World Generation, and Action Generation

NVIDIA AI team have released Cosmos 3. It is a family of omnimodal world models for physical AI. The models combine physical reasoning, world generation, and action generation. All three capabilities live inside one open model. NVIDIA open sourced the checkpoints, training scripts, deployment tools, and datasets. The Cosmos 3 release targets robotics, autonomous vehicles,

NVIDIA Releases Cosmos 3: A Two-Tower Mixture-of-Transformers Foundation Model Unifying Physical Reasoning, World Generation, and Action Generation Read More »

How to Fine-Tune LFM2 Using QLoRA and DPO: A Complete Step-by-Step Coding Tutorial on Google Colab

In this tutorial, we fine-tune Liquid AI’s LFM2 model through a complete open-source workflow. We start by loading the base LFM2 checkpoint with QLoRA, preparing a chat-style supervised fine-tuning dataset, training a lightweight LoRA adapter using TRL and PEFT, and then merging the adapter back into the model. We also extend the workflow with DPO

How to Fine-Tune LFM2 Using QLoRA and DPO: A Complete Step-by-Step Coding Tutorial on Google Colab Read More »

TinyFish Launches BigSet: An Open-Source Multi-Agent System That Builds Structured Live Datasets from Plain-English Descriptions

Building a structured dataset from the web is still a pipeline problem. You identify a data source, write or configure a scraper, design a schema, handle deduplication, schedule refreshes, and fix breakage when upstream sites change. That process stays roughly the same whether you do it once or a hundred times. TinyFish is releasing BigSet

TinyFish Launches BigSet: An Open-Source Multi-Agent System That Builds Structured Live Datasets from Plain-English Descriptions Read More »

Alibaba’s Qwen Team Launches Qwen3.7-Plus, Adding Vision, Deep Reasoning, Tool Invocation, and Autonomous Iteration on the Bailian Platform

Alibaba’s Qwen team has released Qwen3.7-Plus. The model is now available through Alibaba Cloud’s Bailian platform. Bailian is the console international users access as Model Studio. It offers API services to external developers. The release follows Alibaba’s May unveiling of the Qwen3.7 generation. Qwen3.7-Plus Qwen3.7-Plus is a multimodal large language model. The model understands images

Alibaba’s Qwen Team Launches Qwen3.7-Plus, Adding Vision, Deep Reasoning, Tool Invocation, and Autonomous Iteration on the Bailian Platform Read More »

JetBrains Releases Mellum2: A 12B MoE Model for Fast, Specialized Tasks in Multi-Model AI Pipelines

JetBrains released Mellum2, open-sourcing the weights under the Apache 2.0 license. The first version of Mellum was a completion-focused 4B dense model. Mellum2 is its successor: a general-purpose model specialized in software engineering. It covers code generation and editing, debugging, multi-step reasoning, tool use and function calling, agentic coding, and conversational programming assistance. JetBrains team

JetBrains Releases Mellum2: A 12B MoE Model for Fast, Specialized Tasks in Multi-Model AI Pipelines Read More »

How to Speed Up Transformer Training Using NVIDIA Apex (FusedAdam, FusedLayerNorm) and Native torch.amp

In this tutorial, we work through an implementation of NVIDIA Apex, focusing on the components that still matter in modern GPU training workflows. Instead of treating Apex as a general mixed-precision library, we separate the older parts from the still-useful ones and test them directly. We begin by checking the CUDA runtime, building Apex with

How to Speed Up Transformer Training Using NVIDIA Apex (FusedAdam, FusedLayerNorm) and Native torch.amp Read More »

MiniMax Releases MiniMax M3 with MSA Architecture Supporting 1M-Token Context, Native Multimodality, and Agentic Coding

MiniMax officially released MiniMax M3 on June 1, 2026. The model introduces MSA (MiniMax Sparse Attention), a new sparse attention architecture that gives M3 a 1M-token context window. M3 also supports image and video input and desktop computer operation natively. The API is live now. MiniMax M3 is available today via MiniMax Code, the MiniMax

MiniMax Releases MiniMax M3 with MSA Architecture Supporting 1M-Token Context, Native Multimodality, and Agentic Coding Read More »

🪽

Meet Memory OS: A 6-Layer Open-Source Memory Stack Built on Top of Hermes Agent

Hermes Agent already remembers across sessions. The open-source agent from Nous Research ships with curated memory files and full-text session search. But a new community project argues that built-in memory is too shallow for serious work. A new library named ‘Memory OS‘ has been released under an MIT license by a developer (ClaudioDrews). It stacks

Meet Memory OS: A 6-Layer Open-Source Memory Stack Built on Top of Hermes Agent Read More »