Software engineering

Auto Added by WPeMatico

Meet EverOS: An Open Source Markdown-First Agent Memory Runtime With Hybrid BM25 + Vector Retrieval and Self-Evolving Skills

EverMind has released EverOS, an open-source memory runtime for AI agents. It ships under an Apache 2.0 license. It targets a problem agent builders hit early: large language models are stateless. The conversation ends, and the context is gone. EverOS proposes a different substrate. Instead of locking memory inside a vector database, it writes memory […]

Meet EverOS: An Open Source Markdown-First Agent Memory Runtime With Hybrid BM25 + Vector Retrieval and Self-Evolving Skills Read More »

Building a Stable Fable 5 Traces Workflow in Colab: Parsing Tool Calls, Auditing Data, and Training Baselines

In this tutorial, we work with the Fable 5 Traces dataset from Hugging Face and build a complete workflow around real coding-agent trace data. We start by setting up a lightweight environment that avoids fragile dependencies such as datasets, scikit-learn, and scipy. Then we manually download and parse the merged JSONL file to keep the

Building a Stable Fable 5 Traces Workflow in Colab: Parsing Tool Calls, Auditing Data, and Training Baselines Read More »

Liquid AI Ships LFM2.5-230M with llama.cpp, MLX, vLLM, SGLang, and ONNX Support for On-Device Inference

Liquid AI shipped LFM2.5-230M, it’s the company’s smallest model to date. The release targets a specific job: running agentic tasks on phones, robots, and automation devices. Both the base and instruction-tuned checkpoints are open-weight on Hugging Face. The pitch is narrow on purpose. This is not a general reasoning model. It is built for data

Liquid AI Ships LFM2.5-230M with llama.cpp, MLX, vLLM, SGLang, and ONNX Support for On-Device Inference Read More »

DeepSeek Releases DSpark, a Speculative Decoding Framework That Accelerates DeepSeek-V4 Per-User Generation 60–85% Over MTP-1

DeepSeek released DSpark, a speculative decoding framework, with open-source checkpoints and training code. It is a serving optimization, not a new model. The checkpoints DeepSeek-V4-Pro-DSpark and DeepSeek-V4-Flash-DSpark reuse the existing V4 weights, with a draft module attached. The DeepSeek research team also open-sourced DeepSpec, an MIT-licensed codebase for training and evaluating speculative decoding drafters. The

DeepSeek Releases DSpark, a Speculative Decoding Framework That Accelerates DeepSeek-V4 Per-User Generation 60–85% Over MTP-1 Read More »

↗

Meta’s Astryx Brings a CLI and MCP Server to an Open-Source React Design System Agents Can Read

Meta released Astryx this week. It is an open-source design system, currently in Beta. The project grew inside Meta’s monorepo over eight years. Astryx is built on React and StyleX. StyleX is Meta’s compile-time CSS engine. TL;DR Astryx is Meta’s open-source, agent-ready React design system, now in Beta. It pairs StyleX styling with a CSS-variable

Meta’s Astryx Brings a CLI and MCP Server to an Open-Source React Design System Agents Can Read Read More »

Building Supervised Fine-Tuning Data from NVIDIA Open-SWE-Traces: Trajectory Parsing, Patch Analysis, Token Budgets, and Tool-Use Metrics

In this tutorial, we explore the Open-SWE-Traces dataset as a practical resource for studying and preparing agentic software-engineering trajectories for fine-tuning. We stream the dataset directly from Hugging Face, so we can work with a large dataset efficiently in Google Colab without downloading everything locally. We inspect individual records, normalize multi-turn agent conversations, parse final

Building Supervised Fine-Tuning Data from NVIDIA Open-SWE-Traces: Trajectory Parsing, Patch Analysis, Token Budgets, and Tool-Use Metrics Read More »

Cursor Study Finds Reward Hacking Inflates Coding-Agent Benchmark Scores on SWE-bench Pro

A new Cursor study reports that newer coding agents often retrieve known fixes instead of deriving them, inflating popular benchmark scores. Reward hacking means a model earns the reward without doing the intended work. Here the reward is a passing test. The intended work is deriving the bug fix. The research study focuses on agentic

Cursor Study Finds Reward Hacking Inflates Coding-Agent Benchmark Scores on SWE-bench Pro Read More »

▶

Perplexity Launches Computer for Counsel: A Multi-Model Agentic Layer for Legal Workflows

Perplexity launched Computer for Counsel. It is an agentic AI system built for legal teams. The product extends Perplexity Computer, the company’s LLM-agnostic agentic system. It is available now to Perplexity Enterprise and Max subscribers. Lawyers lose hours to administrative work. Computer for Counsel targets that work directly. Nearly 75% of lawyers call administrative tasks

Perplexity Launches Computer for Counsel: A Multi-Model Agentic Layer for Legal Workflows Read More »

Meet container: Apple’s Open-Source Swift Tool for Running Linux Containers as Lightweight VMs on Apple Silicon

Apple research team recently released the container project. It is an open-source command-line tool written in Swift. It creates and runs Linux containers as lightweight virtual machines on a Mac. The project ships under the Apache 2.0 license and targets Apple silicon. Containers are how you ship reproducible environments from a laptop to a datacenter.

Meet container: Apple’s Open-Source Swift Tool for Running Linux Containers as Lightweight VMs on Apple Silicon Read More »

▶

DeepReinforce Releases Ornith-1.0: An Open-Source Coding Model Family That Learns Its Own RL Scaffolds

DeepReinforce has released Ornith-1.0, an open-source model family built for agentic coding. The lineup spans four sizes, from a 9B dense model to a 397B mixture-of-experts flagship. Every checkpoint ships under the MIT license on Hugging Face. The models are post-trained on top of pretrained Gemma 4 and Qwen 3.5. Most coding agents pair a

DeepReinforce Releases Ornith-1.0: An Open-Source Coding Model Family That Learns Its Own RL Scaffolds Read More »