New Releases

Auto Added by WPeMatico

Sakana AI Commercializes AB-MCTS in Sakana Marlin, an Enterprise Agent Generating Up to 100-Page Research Reports With Slides

Tokyo-based Sakana AI shipped its first commercial product ‘Sakana Marlin’ this week. Sakana team positions it as a Virtual CSO (Chief Strategy Officer). It is a B2B autonomous research agent built for enterprises. Marlin does not answer in seconds like a chatbot. You give it one research topic. It then runs autonomously for up to […]

Sakana AI Commercializes AB-MCTS in Sakana Marlin, an Enterprise Agent Generating Up to 100-Page Research Reports With Slides Read More »

Meet Flash-KMeans: An IO-Aware, Exact K-Means That Runs Over 200× Faster Than FAISS on GPUs

k-means has been an offline tool for decades. You run it once to preprocess data, then move on. A team of researchers from UC Berkeley and UT Austin released Flash-KMeans, a new open-source library that targets a different setting. Modern AI pipelines now call k-means inside training and inference loops. At that frequency, latency per

Meet Flash-KMeans: An IO-Aware, Exact K-Means That Runs Over 200× Faster Than FAISS on GPUs Read More »

Z.ai Launches GLM-5.2 With a Usable 1M-Token Context, Two Thinking-Effort Levels, and No Benchmarks at Launch

GLM-5.2 is the latest large language model from Z.ai, becoming the third major release in the GLM-5 line. It follows GLM-5 (February 11), GLM-5-Turbo (March 15), and GLM-5.1 (April 7). That makes four flagship-tier coding releases in roughly four months. Usable 1M-Token Context Window GLM-5.2’s standout spec is a 1,000,000-token context window. Z.ai labels the

Z.ai Launches GLM-5.2 With a Usable 1M-Token Context, Two Thinking-Effort Levels, and No Benchmarks at Launch Read More »

Databricks Open-Sources Omnigent: A Meta-Harness That Composes, Governs, and Shares AI Agents Across Claude Code, Codex, and Pi

Databricks released Omnigent, an open source ‘meta-harness’ for AI agents. The project ships under the Apache 2.0 license. The Databricks AI team built it with Neon. A harness is the wrapper around a model that turns it into an agent. Claude Code, Codex, and Pi are harnesses. Omnigent sits one level above them. It treats

Databricks Open-Sources Omnigent: A Meta-Harness That Composes, Governs, and Shares AI Agents Across Claude Code, Codex, and Pi Read More »

Anthropic Disables Claude Fable 5 and Mythos 5 After US Government Order

Anthropic has disabled its two most capable models for every customer. The shutdown followed a US government export control directive. The order arrived on June 12, 2026. It named Claude Fable 5 and Claude Mythos 5 specifically. Both models had launched only three days earlier, on June 9. The directive cited national security authorities, according

Anthropic Disables Claude Fable 5 and Mythos 5 After US Government Order Read More »

Google Releases Gemini-SQL2: Gemini 3.1 Pro Text-to-SQL Scores 80.04% on BIRD Single-Model Leaderboard

Google Research team has announced the launch of Gemini-SQL2 on X. They described this system as a breakthrough text-to-SQL capability powered by Gemini 3.1 Pro. Gemini-SQL2 posted 80.04% execution accuracy on the BIRD Text-to-SQL Leaderboard (Single Model). Google’s chart places it above its own Gemini-SQL, the prior top entry. The metric measures whether generated SQL

Google Releases Gemini-SQL2: Gemini 3.1 Pro Text-to-SQL Scores 80.04% on BIRD Single-Model Leaderboard Read More »

↗

Moonshot AI Launches Kimi Work, a Local Desktop Agent Reportedly Running on Kimi K2.6 With a 300-Sub-Agent Agent Swarm

Moonshot AI has introduced Kimi Work, an AI agent that runs on your own desktop. The Beijing-based AI entity announced it this week along with downloads for macOS and Windows. Kimi Work reads local files, drives your real browser, and runs scheduled tasks. It targets knowledge workers whose bottleneck is access to files and live

Moonshot AI Launches Kimi Work, a Local Desktop Agent Reportedly Running on Kimi K2.6 With a 300-Sub-Agent Agent Swarm Read More »

Zyphra Release Zamba2-VL: Hybrid Mamba2–Transformer Vision-Language Models That Cut Time-to-First-Token by About an Order of Magnitude

Zyphra has released Zamba2-VL, a family of open vision-language models. The release covers three sizes: 1.2B, 2.7B, and 7B parameters. Each model is built on the Zamba2 hybrid SSM–Transformer backbone. Vision-language models (VLMs) read images and text together. They answer questions about charts, documents, and photos. Most open VLMs use a dense Transformer as the

Zyphra Release Zamba2-VL: Hybrid Mamba2–Transformer Vision-Language Models That Cut Time-to-First-Token by About an Order of Magnitude Read More »

Perplexity Moves Deep Research Into Computer, Routing Research Subtasks Across 20+ Frontier Models For Reports, Decks, And Dashboards

Perplexity has moved Deep Research into Computer, its multi-model orchestration system. The upgrade improves accuracy, depth of analysis, and citation quality. Deep Research now breaks hard questions into subtasks and routes them across 20+ frontier models. It returns work-ready reports, decks, and dashboards, all inside Computer. Deep Research in Computer Deep Research is a mode

Perplexity Moves Deep Research Into Computer, Routing Research Subtasks Across 20+ Frontier Models For Reports, Decks, And Dashboards Read More »

xAI Ships Grok Build Plugin Marketplace With MongoDB, Vercel, Sentry, Chrome DevTools, Cloudflare, and Superpowers Plugins at Launch

Today, xAI shipped the Grok Build Plugin Marketplace. It is a built-in catalog of plugins for Grok Build, the company’s terminal coding agent. A plugin bundles skills, slash commands, agents, hooks, MCP servers, and LSPs into one package. You browse, install, and update these packages without leaving the terminal. Plugin Marketplace Grok Build is xAI’s

xAI Ships Grok Build Plugin Marketplace With MongoDB, Vercel, Sentry, Chrome DevTools, Cloudflare, and Superpowers Plugins at Launch Read More »