Tech News

AMD Releases Instella-MoE-16B-A3B: A Fully Open Mixture-of-Experts LLM With 2.8B Active Parameters Trained On Instinct GPUs

agentic ai, ai, AI (Artificial Intelligence), AI Infrastructure, AI Shorts, Applications, Artificial Intelligence, Editors Pick, Language Model, Large Language Model, Machine Learning, New Releases, Open Source, Python, Tech News, Technology

AMD released Instella-MoE-16B-A3B, a fully open Mixture-of-Experts language model trained from scratch on Instinct MI300X and MI325X GPUs. The model holds 16B total parameters but activates only 2.8B per token. AMD is publishing weights from every training stage, along with data mixtures, training configs, and inference code. Two systems-level choices carry the release: Gated Multi-head […]

Supabase Releases Evals: an Open Source Benchmark That Scores Claude Code, Codex and OpenCode on Real Supabase Tasks

agentic ai, ai, AI (Artificial Intelligence), AI Agents, AI Shorts, Applications, Artificial Intelligence, Editors Pick, Machine Learning, Model Context Protocol (MCP), New Releases, Open Source, Staff, Tech News, Technology

Supabase has open sourced Supabase Evals, its benchmark and framework for testing how well AI agents build using Supabase. It runs coding agents including Claude Code, Codex, and OpenCode against real tasks, such as building a schema, debugging a failed Edge Function, or fixing a broken RLS policy, then scores the result. It powers the

MiniMax Releases MiniMax H3: An Omni-Modal Video Model That Generates 15-Second 2K Clips With Native Stereo Audio

ai, AI (Artificial Intelligence), AI Shorts, Applications, Artificial Intelligence, Audio Language Model, Editors Pick, Language Model, Large Language Model, Machine Learning, New Releases, Staff, Tech News, Technology, Voice AI

MiniMax has released MiniMax H3. It is not a text-to-video model with add-ons: MiniMax describes it as a general-purpose multimodal generation model that reads text, images, video, and audio as one unified context and returns video with native stereo sound. The main specs include: 2K output, 4–15 seconds, integer durations

DeepSeek Upgrades DeepSeek-V4-Flash-0731 with Major Agentic and Coding Gains

agentic ai, ai, AI (Artificial Intelligence), AI Infrastructure, AI Paper Summary, AI Shorts, Applications, Artificial Intelligence, Editors Pick, Language Model, Large Language Model, Machine Learning, New Releases, Staff, Tech News, Technology

DeepSeek published DeepSeek-V4-Flash-0731 on Hugging Face and moved the official V4-Flash API into public beta on July 31, 2026. The model card is explicit that this is the official release superseding the preview, and that the architecture and size are unchanged. The gains come from re-post-training, not a new design. The checkpoint ships with the

JetBrains Open-Sources KotlinLLM: Smart Macros That Generate Kotlin Source Code at Runtime and Hot-Reload It Through JDI

agentic ai, ai, AI (Artificial Intelligence), AI Infrastructure, AI Shorts, Applications, Artificial Intelligence, Editors Pick, New Releases, Open Source, Software engineering, Staff, Tech News, Technology

JetBrains Research has open-sourced KotlinLLM, an IntelliJ IDEA plugin for Kotlin/JVM projects that adds a language feature called Smart macros. A Smart macro is a regular Kotlin function call whose body is generated Kotlin code. The public API is deliberately small. asLlm<F, T>(from, hint) converts an input of type F into a typed value

Google DeepMind Ships Three Physical AI Models For Whole Body Control, Dexterity And Multi Robot Collaboration

agentic ai, ai, AI (Artificial Intelligence), AI Infrastructure, Applications, Artificial Intelligence, Editors Pick, Language Model, Large Language Model, New Releases, physical ai, Robotics, Staff, Tech News, Technology, Vision Language Model

Google DeepMind has released Gemini Robotics 2, the intelligence layer for its next generation of robots. The release moves the stack past table-top manipulation into whole-body control, five-finger dexterity, and multi-robot teamwork. It ships as three separate models with three different access tiers. Most robots today are pre-programmed or tele-operated for narrow,

Tencent Open-Sources AngelSpec: A Unified Training Framework for MTP and Block-Parallel Speculative Decoding on Hy3 Models

agentic ai, ai, AI (Artificial Intelligence), AI Infrastructure, AI Paper Summary, AI Shorts, Artificial Intelligence, Editors Pick, Language Model, New Releases, Open Source, Software engineering, Staff, Tech News, Technology, Uncategorized

Tencent has released AngelSpec, an open-source, torch-native training framework for speculative-decoding draft models. The release covers both autoregressive multi-token prediction (MTP) and the block-parallel DFlash family. Most speculative-decoding work searches for one drafter that scores well on an averaged benchmark mixture. Real serving traffic does not look like that mixture. AngelSpec treats workload heterogeneity as

Meet Token Saver: An Open-Source MCP Extension Using Local Hybrid RAG to Cut Claude PDF Token Costs 90-99%

agentic ai, ai, AI (Artificial Intelligence), AI Shorts, Applications, Artificial Intelligence, Computer vision, Editors Pick, Embedding Model, Machine Learning, New Releases, OCR, Open Source, Python, Software engineering, Staff, Tech News, Technology, Vision Language Model

AI developers, researchers, and professionals frequently hit a frustrating wall when analyzing large documents with LLMs: the hidden, compounding cost of context windows. Pasting a 200-page PDF into a chat isn’t a one-time charge. Because the conversation history is re-sent to the model on every single turn, that massive document is paid for again with

Moonshot AI Open-Sources MoonEP: A Perfectly Balanced Expert Parallelism Library for MoE Training

agentic ai, ai, AI (Artificial Intelligence), AI Infrastructure, AI Shorts, Applications, Artificial Intelligence, Editors Pick, Open Source, Software engineering, Staff, Tech News, Technology

Moonshot AI has open-sourced MoonEP, an Expert Parallelism (EP) communication library for distributed Mixture-of-Experts (MoE) workloads. The team announced the release as a library built to make expert-parallel communication more efficient at scale. It ships under an MIT license. MoonEP arrived as part of Kimi K3 Open Day. Alongside the K3 model weights and technical

Liquid AI Releases LFM2.5-Encoder-230M and LFM2.5-Encoder-350M: Bidirectional Encoders That Stay Fast at 8K Context on CPU

Liquid AI has released two open-weight bidirectional encoders, LFM2.5-Encoder-230M and LFM2.5-Encoder-350M. Both are masked language models built on the LFM2 hybrid backbone. Both carry an 8,192-token context. Encoders sit underneath classifiers, intent routers, safety filters, and PII detectors. Those jobs run continuously, usually without a GPU, and increasingly on longer inputs. BERT established the class.
