agentic ai

Auto Added by WPeMatico

Perplexity Launches Brain, a Self-Improving Memory System That Builds a Context Graph of an Agent’s Work and Learns Overnight

Most AI memory remembers the user. It stores your preferences, your tastes, and your role. Perplexity is taking a different path. Today, Perplexity launched Brain, a self-improving memory system for its agent product, Computer. Brain does not focus on remembering you. It remembers what the agent did. That reframes what memory in AI is for. […]

Perplexity Launches Brain, a Self-Improving Memory System That Builds a Context Graph of an Agent’s Work and Learns Overnight Read More »

OpenAI Releases LifeSciBench, a 750-Task Benchmark Grading AI Models on Real Life-Science Research With Expert-Written Rubric

Most biology benchmarks ask narrow, fact-based questions with clean answers. Scientists weigh imperfect evidence and make decisions. OpenAI released LifeSciBench and it targets that gap directly. Even the strongest model passes roughly one task in three. The benchmark is far from saturated. What is LifeSciBench LifeSciBench contains 750 expert-authored tasks. They span seven workflows and

OpenAI Releases LifeSciBench, a 750-Task Benchmark Grading AI Models on Real Life-Science Research With Expert-Written Rubric Read More »

⚠

NVIDIA SkillSpector Guide: Scanning AI Skills for Security Risks with Static Analysis and SARIF Reports

In this tutorial, we explore how NVIDIA SkillSpector helps us evaluate AI skills for security risks before they are used in real-world workflows. We build a controlled corpus containing both benign and deliberately vulnerable skills, scan them through SkillSpector’s programmatic LangGraph workflow, and organize the resulting risk scores and findings with pandas. We then visualize

NVIDIA SkillSpector Guide: Scanning AI Skills for Security Risks with Static Analysis and SARIF Reports Read More »

Vercel Releases Eve: An Open-Source AI Agent Framework Where Each Agent is a Directory of Files Mapped to Capabilities

Vercel has released eve, an open-source framework for building, running, and scaling agents. The project is published as the npm package eve, licensed under Apache-2.0. Building an agent should mean defining what it does. It should not mean assembling all the plumbing that an agent needs to run in production. eve is the framework Vercel

Vercel Releases Eve: An Open-Source AI Agent Framework Where Each Agent is a Directory of Files Mapped to Capabilities Read More »

MiniMax Sparse Attention (MSA): a Two-Branch Block-Sparse Attention Trained on a 109B-Parameter MoE With a 3T-Token Budget

MiniMax released MSA (MiniMax Sparse Attention), a sparse attention method built directly on Grouped Query Attention (GQA). It targets one bottleneck: the quadratic cost of softmax attention at long context. The MiniMax research team tested it inside a 109B-parameter Mixture-of-Experts model trained with native multimodal data. They also open-sourced an inference kernel and shipped a

MiniMax Sparse Attention (MSA): a Two-Branch Block-Sparse Attention Trained on a 109B-Parameter MoE With a 3T-Token Budget Read More »

OpenAI’s Deployment Simulation Extends Pre-Deployment Risk Assessment to Agentic Coding Through Simulated Tool Calls

OpenAI published a new pre-deployment safety method called Deployment Simulation. The idea is direct. Before a model ships, simulate its deployment first. Replay past conversations through the new candidate model. Then study how it behaves in realistic contexts. OpenAI already uses insights from the method during model development. It has informed mitigations and deployment decisions,

OpenAI’s Deployment Simulation Extends Pre-Deployment Risk Assessment to Agentic Coding Through Simulated Tool Calls Read More »

🪽

Hermes Agent Adds Asynchronous Subagents, So Delegated Work No Longer Blocks the Parent Chat

Nous Research has shipped a change to Hermes Agent. Its delegate tool can now run subagents asynchronously. Per the announcement, delegated work no longer blocks the parent chat. Hermes Agent is an open-source personal agent from Nous Research. A parent agent can spawn child agents, called subagents, to fan out work. Until now, that delegation

Hermes Agent Adds Asynchronous Subagents, So Delegated Work No Longer Blocks the Parent Chat Read More »

Meet Atoms: A Vibe Coding Tool That Uses AI Agents to Build, Deploy, and Market Your App (No Code)

The concept of vibe coding is interesting; you don’t need to be a developer or software engineer to build your own applications. You can describe your idea to an AI in plain language, and it will build, edit, and refine your applications so you don’t have to write code line by line. It sounds simple

Meet Atoms: A Vibe Coding Tool That Uses AI Agents to Build, Deploy, and Market Your App (No Code) Read More »

Google Cloud Introduces Open Knowledge Format (OKF): A Vendor-Neutral Markdown Spec for Giving AI Agents Curated Context

Foundation models keep getting stronger, yet they still stall on the same thing: context. A model can write code or analyze a dataset, but only with the right internal knowledge. That knowledge includes table schemas, metric definitions, runbooks, join paths and it lives scattered across catalogs, wikis, and a few senior engineers’ heads. Google Cloud

Google Cloud Introduces Open Knowledge Format (OKF): A Vendor-Neutral Markdown Spec for Giving AI Agents Curated Context Read More »