Tech News

Auto Added by WPeMatico

Anthropic Releases Claude Fable 5 and Claude Mythos 5: Same Underlying Model, Different Safeguards, New Mythos-Class Tier

Anthropic released two models on June 9, 2026: Claude Fable 5 and Claude Mythos 5. Both belong to a tier called “Mythos-class.” This tier sits above the Opus class in capability. Fable 5 is the version claimed to be made safe for general use. Mythos 5 is the same model with some safeguards lifted, kept […]

Anthropic Releases Claude Fable 5 and Claude Mythos 5: Same Underlying Model, Different Safeguards, New Mythos-Class Tier Read More »

A New Study from Harvard and Perplexity Finds AI Agents Perform 26 Minutes of Autonomous Work per Session vs 33 Seconds for Search

A new working research from Perplexity and Harvard offers field evidence on what AI agents do to knowledge work. It draws on production data from two Perplexity products: Search and Computer. The setup is a natural comparison. Search is a conversational answer engine. Computer is an agent that plans and executes tasks end to end.

A New Study from Harvard and Perplexity Finds AI Agents Perform 26 Minutes of Autonomous Work per Session vs 33 Seconds for Search Read More »

Xiaomi MiMo and TileRT Push a 1-Trillion-Parameter Model Past 1000 Tokens Per Second on Commodity GPUs

Inference speed is becoming a competitive metric for large language models. Xiaomi’s MiMo team just released MiMo-V2.5-Pro-UltraSpeed, built in collaboration with the TileRT systems group. It decodes faster than 1000 tokens per second on a 1-trillion-parameter model. Xiaomi team describes this as a first at trillion-parameter scale. Demos show generation peaks near 1200 tokens per

Xiaomi MiMo and TileRT Push a 1-Trillion-Parameter Model Past 1000 Tokens Per Second on Commodity GPUs Read More »

Best 21 Low-Code and No-Code AI Tools in 2026

Low-code and no-code platforms have moved from simple drag-and-drop builders to AI-native development environments. In 2026, most of them ship a built-in assistant that turns a text prompt into a working app, agent, or automation. This list covers 21 tools that AI practitioners use today, grouped by what they do best. Each tool name links

Best 21 Low-Code and No-Code AI Tools in 2026 Read More »

Meet Harness-1: A 20B Retrieval Subagent Trained With Reinforcement Learning Inside a Stateful Search Harness on gpt-oss-20b

Most search agents are trained as policies over a growing transcript. The model decides how to search. It must also remember what it saw, which evidence matters, and which claims it checked. A team of researchers from University of Illinois Urbana-Champaign, UC Berkeley, and Chroma argues this asks too much. Reinforcement learning ends up optimizing

Meet Harness-1: A 20B Retrieval Subagent Trained With Reinforcement Learning Inside a Stateful Search Harness on gpt-oss-20b Read More »

Google’s New Colab CLI Lets Developers and AI Agents Run Python on Remote Colab GPUs and TPUs From the Terminal

This week, Google AI team released the Colab CLI. The tool connects your local terminal to remote Colab runtimes. It lets developers and AI agents run code on cloud GPUs and TPUs. You stay in your terminal the entire time. The CLI is open source under the Apache 2.0 license. What is Google Colab CLI

Google’s New Colab CLI Lets Developers and AI Agents Run Python on Remote Colab GPUs and TPUs From the Terminal Read More »

Moonshot AI Releases Kimi Code CLI: A Terminal AI Coding Agent Built in TypeScript for Next-Gen Agents

Moonshot AI has released Kimi Code CLI, an open-source coding agent that runs in the terminal. The tool reads and edits code, runs shell commands, searches files, and fetches web pages. It then chooses its next step based on the feedback it receives. The project is MIT-licensed and lives on GitHub.. Kimi Code CLI is

Moonshot AI Releases Kimi Code CLI: A Terminal AI Coding Agent Built in TypeScript for Next-Gen Agents Read More »

Google DeepMind Releases Gemma 4 QAT Checkpoints: Q4_0 and a New Mobile Format Cut On-Device Memory

Google DeepMind released Quantization-Aware Training (QAT) checkpoints for the Gemma 4 family. The release targets local deployment on edge devices and consumer GPUs. It follows the Gemma 4 launch in April and a 12B model two days earlier. We compared the available Gemma 4 edge-model formats using only published numbers. The goal was simple. Show

Google DeepMind Releases Gemma 4 QAT Checkpoints: Q4_0 and a New Mobile Format Cut On-Device Memory Read More »

NVIDIA AI Releases Dynamo Snapshot: A CRIU-Based Fast Startup System for AI Inference on Kubernetes

In production inference deployments, demand fluctuates over time, requiring inference replicas to scale elastically. Cold-starting inference workloads on Kubernetes can take several minutes. During that time, GPUs are allocated but idle, generating no tokens and serving no requests. ‘Cold start’ means the full sequence a model server must complete before serving any request: pulling the

NVIDIA AI Releases Dynamo Snapshot: A CRIU-Based Fast Startup System for AI Inference on Kubernetes Read More »

Perplexity AI Introduces Hybrid Local-Server Inference Orchestrator for Personal Computer: Automatic On-Device and Cloud Task Routing

Perplexity AI announced what it calls the first hybrid local-server inference orchestrator at Computex 2026. The system is designed to automatically route AI tasks between a user’s local device and cloud-based frontier models without requiring the user to decide in advance. The feature is expected come to Perplexity Computer in July 2026. What is Hybrid

Perplexity AI Introduces Hybrid Local-Server Inference Orchestrator for Personal Computer: Automatic On-Device and Cloud Task Routing Read More »