New Releases

Auto Added by WPeMatico

🇺🇦

What is Clawdbot? How a Local First Agent Stack Turns Chats into Real Automations

Clawdbot is an open source personal AI assistant that you run on your own hardware. It connects large language models from providers such as Anthropic and OpenAI to real tools such as messaging apps, files, shell, browser and smart home devices, while keeping the orchestration layer under your control. The interesting part is not that […]

What is Clawdbot? How a Local First Agent Stack Turns Chats into Real Automations Read More »

StepFun AI Introduce Step-DeepResearch: A Cost-Effective Deep Research Agent Model Built Around Atomic Capabilities

StepFun has introduced Step-DeepResearch, a 32B parameter end to end deep research agent that aims to turn web search into actual research workflows with long horizon reasoning, tool use and structured reporting. The model is built on Qwen2.5 32B-Base and is trained to act as a single agent that plans, explores sources, verifies evidence and

StepFun AI Introduce Step-DeepResearch: A Cost-Effective Deep Research Agent Model Built Around Atomic Capabilities Read More »

GitHub Releases Copilot-SDK to Embed Its Agentic Runtime in Any App

GitHub has opened up the internal agent runtime that powers GitHub Copilot CLI and exposed it as a programmable SDK. The GitHub Copilot-SDK, now in technical preview, lets you embed the same agentic execution loop into any application so the agent can plan, invoke tools, edit files, and run commands as part of your own

GitHub Releases Copilot-SDK to Embed Its Agentic Runtime in Any App Read More »

Qwen Researchers Release Qwen3-TTS: an Open Multilingual TTS Suite with Real-Time Latency and Fine-Grained Voice Control

Alibaba Cloud’s Qwen team has open-sourced Qwen3-TTS, a family of multilingual text-to-speech models that target three core tasks in one stack, voice clone, voice design, and high quality speech generation. https://arxiv.org/pdf/2601.15621v1 Model family and capabilities Qwen3-TTS uses a 12Hz speech tokenizer and 2 language model sizes, 0.6B and 1.7B, packaged into 3 main tasks. The

Qwen Researchers Release Qwen3-TTS: an Open Multilingual TTS Suite with Real-Time Latency and Fine-Grained Voice Control Read More »

Microsoft Releases VibeVoice-ASR: A Unified Speech-to-Text Model Designed to Handle 60-Minute Long-Form Audio in a Single Pass

Microsoft has released VibeVoice-ASR as part of the VibeVoice family of open source frontier voice AI models. VibeVoice-ASR is described as a unified speech-to-text model that can handle 60-minute long-form audio in a single pass and output structured transcriptions that encode Who, When, and What, with support for Customized Hotwords. VibeVoice sits in a single

Microsoft Releases VibeVoice-ASR: A Unified Speech-to-Text Model Designed to Handle 60-Minute Long-Form Audio in a Single Pass Read More »

FlashLabs Researchers Release Chroma 1.0: A 4B Real Time Speech Dialogue Model With Personalized Voice Cloning

Chroma 1.0 is a real time speech to speech dialogue model that takes audio as input and returns audio as output while preserving the speaker identity across multi turn conversations. It is presented as the first open source end to end spoken dialogue system that combines low latency interaction with high fidelity personalized voice cloning

FlashLabs Researchers Release Chroma 1.0: A 4B Real Time Speech Dialogue Model With Personalized Voice Cloning Read More »

Inworld AI Releases TTS-1.5 For Realtime, Production Grade Voice Agents

Inworld AI has introduced Inworld TTS-1.5, an upgrade to its TTS-1 family that targets realtime voice agents with strict constraints on latency, quality, and cost. TTS-1.5 is described as the number top ranked text to speech system on Artificial Analysis and is designed to be more expressive and more stable than prior generations while remaining

Inworld AI Releases TTS-1.5 For Realtime, Production Grade Voice Agents Read More »

Liquid AI Releases LFM2.5-1.2B-Thinking: a 1.2B Parameter Reasoning Model That Fits Under 1 GB On-Device

Liquid AI has released LFM2.5-1.2B-Thinking, a 1.2 billion parameter reasoning model that runs fully on device and fits in about 900 MB on a modern phone. What needed a data center 2 years ago can now run offline on consumer hardware, with a focus on structured reasoning traces, tool use, and math, rather than general

Liquid AI Releases LFM2.5-1.2B-Thinking: a 1.2B Parameter Reasoning Model That Fits Under 1 GB On-Device Read More »

Zhipu AI Releases GLM-4.7-Flash: A 30B-A3B MoE Model for Efficient Local Coding and Agents

GLM-4.7-Flash is a new member of the GLM 4.7 family and targets developers who want strong coding and reasoning performance in a model that is practical to run locally. Zhipu AI (Z.ai) describes GLM-4.7-Flash as a 30B-A3B MoE model and presents it as the strongest model in the 30B class, designed for lightweight deployment where

Zhipu AI Releases GLM-4.7-Flash: A 30B-A3B MoE Model for Efficient Local Coding and Agents Read More »

Microsoft Research Releases OptiMind: A 20B Parameter Model that Turns Natural Language into Solver Ready Optimization Models

Microsoft Research has released OptiMind, an AI based system that converts natural language descriptions of complex decision problems into mathematical formulations that optimization solvers can execute. It targets a long standing bottleneck in operations research, where translating business intent into mixed integer linear programs usually needs expert modelers and days of work. What OptiMind Is

Microsoft Research Releases OptiMind: A 20B Parameter Model that Turns Natural Language into Solver Ready Optimization Models Read More »