Tech News


NVIDIA AI Releases Nemotron 3 Embed: An Open Embedding Collection Whose 8B Checkpoint Ranks #1 on RTEB


Embedding models decide which passages an agent ever sees. NVIDIA released the Nemotron 3 Embed model collection to work on that layer. It targets production-scale RAG, agentic retrieval, code retrieval, and agent memory. What is Nemotron 3 Embed? The collection includes three open checkpoints. Nemotron-3-Embed-8B-BF16 is the accuracy-first option. Nemotron-3-Embed-1B-BF16 carries the same design into a […]


Moonshot AI Releases Kimi K3: A 2.8 Trillion Parameter Open MoE Model With Kimi Delta Attention and 1M Context


Moonshot AI just released Kimi K3. It is a 2.8-trillion-parameter model with native vision and a 1-million-token context window. Moonshot calls it the world’s first open 3T-class model. What is Kimi K3? Kimi K3 is a sparse Mixture-of-Experts (MoE) model built on two architectural updates. Those are Kimi Delta Attention (KDA) and Attention Residuals (AttnRes).


OpenAI Details GPT-Red: An Internal Automated Red-Teaming Model That Beat Human Red-Teamers 84% To 13% On Prompt Injection


This week, OpenAI published details of GPT-Red, an internal-only automated red-teaming model. Its job is to attack OpenAI’s own models and find prompt injection vulnerabilities. OpenAI gives two reasons. Human red-teaming is time-intensive and does not scale. Commonly used robustness evaluations are already saturated by its latest models. Meanwhile, the attack surface grows. Agents read


SpaceXAI Open-Sources Grok Build: The Rust Agent Harness, TUI, and Tool Layer Behind Its Coding CLI


SpaceXAI has open-sourced Grok Build, the terminal-based AI coding agent behind its grok CLI. The source landed on GitHub today. The release covers the agent harness, TUI, CLI shell, and developer tooling under the Apache 2.0 license. What is Grok Build? A harness is the scaffolding around a model. It assembles context, calls the model,


Thinking Machines Lab Releases Inkling: A 975B-Parameter Open-Weights Multimodal MoE With 41B Active Parameters And Controllable Thinking Effort


Thinking Machines Lab just released Inkling, its first model trained from scratch. The weights are open and fine-tunable on Tinker. The lab pitches it as a base for customization. What is Inkling? Inkling is a Mixture-of-Experts transformer with 975B total parameters and 41B active. It supports a context window of up to 1M tokens. Pretraining covered 45


Soofi Consortium Releases Soofi S 30B-A3B: An Open Hybrid Mamba-Transformer MoE Foundation Model For German And English


A German research consortium has published the pretraining report for Soofi S 30B-A3B. It is an open base model for German and English. Training ran end to end on Deutsche Telekom’s Industrial AI Cloud in Munich. Preview weights are on Hugging Face. It is worth noting that among some of the fully open base models


Google Releases LiteRT.js: A JavaScript Binding of LiteRT That Runs .tflite Models in Browsers via WebGPU


Google released LiteRT.js, a JavaScript binding of LiteRT. LiteRT is Google’s on-device inference library, previously called TensorFlow Lite. LiteRT.js runs .tflite models directly inside the browser. Because inference stays local, Google cites enhanced user privacy, zero server costs, and ultra-low latency. What is LiteRT.js? It is not a new model format. Rather, Google compiled its


PrismML Releases Bonsai 27B: 1-bit and Ternary Builds of Qwen3.6-27B That Run on Laptops and Phones

PrismML just released Bonsai 27B. It is a low-bit representation of Qwen3.6-27B, not a new pretrain. The architecture is unchanged. Two variants ship under Apache 2.0. Ternary Bonsai 27B uses {−1, 0, +1} weights at a true 1.71 bits per weight. Its ideal size is 5.9GB. 1-bit Bonsai 27B uses binary {−1, +1} weights at


Mistral Vibe for Code vs Claude Code vs Cursor vs Codex: Four Agents Scored on One Scaffold-to-PR Task


Coding agents are the most contested category in developer tooling right now. Four names dominate the shortlist: Mistral Vibe for Code, Claude Code, Cursor, and OpenAI Codex. Each claims to take a feature from prompt to pull request. This comparison runs all four against one practical workflow. Not a toy script. A real unit of


Meet Blume: An Open-Source, Zero-Config Documentation Framework That Ships AI-Ready Docs From a Markdown Folder


Hayden Bleasel, a developer from OpenAI, released Blume, an open-source documentation framework. Blume shipped to npm as version 1.0.3 the same day. The pitch is simple: drop Markdown into a folder and ship a docs site. No app boilerplate is written or maintained afterward. The project is MIT-licensed. What is
