AI Infrastructure

Nous Research Releases Token Superposition Training to Speed Up LLM Pre-Training by Up to 2.5x Across 270M to 10B Parameter Models

Pre-training large language models is expensive enough that even modest efficiency improvements can translate into meaningful cost and time savings. Nous Research is releasing Token Superposition Training (TST), a method that substantially reduces pre-training wall-clock time at fixed compute without touching the model architecture, optimizer, tokenizer, parallelism strategy, or training data. At the 10B-A1B mixture-of-experts […]

Enterprise AI Governance in 2026: Why the Tools Employees Use Are Ahead of the Policies That Cover Them

By the time a company’s legal team finishes drafting its generative AI acceptable use policy, a meaningful percentage of its engineers, analysts, and product managers have already moved past it. Not deliberately. Not maliciously. Just practically. This is the core dynamic of what the industry now calls shadow AI: the unauthorized, ungoverned use of AI

Fastino Labs Open-Sources GLiGuard: A 300M Parameter Safety Moderation Model That Matches or Exceeds Accuracy of Models 23–90x Its Size

As LLM-powered applications move into production — and as AI agents take on more consequential tasks like browsing the web, writing and executing code, and interacting with external services — safety moderation has quietly become one of the most operationally expensive parts of the stack. Most developers who’ve deployed a production LLM system know the

Mira Murati’s Thinking Machines Lab Introduces Interaction Models: A Native Multimodal Architecture for Real-Time Human-AI Collaboration

Most AI systems today work in turns. You type or speak, the model waits, processes your input, and then responds. That’s the entire interaction loop. Thinking Machines Lab, an AI research lab, argues that this turn-based model of interaction is a fundamental bottleneck. The Thinking Machines Lab team has introduced a research preview of a new class

Tilde Research Introduces Aurora: A Leverage-Aware Optimizer That Fixes a Hidden Neuron Death Problem in Muon

Researchers at Tilde Research have released Aurora, a new optimizer for training neural networks that addresses a structural flaw in the widely used Muon optimizer. The flaw quietly kills off a significant fraction of MLP neurons during training and keeps them permanently dead. Aurora comes with a 1.1B parameter pretraining experiment, a new state-of-the-art result on

Meta and Stanford Researchers Propose Fast Byte Latent Transformer That Reduces Inference Memory Bandwidth by Over 50% Without Tokenization

A team of researchers from Meta, Stanford University, and the University of Washington has introduced three new methods that substantially accelerate generation in the Byte Latent Transformer (BLT) — a language model architecture that operates directly on raw bytes instead of tokens.

Byte-Level Models Are Slow at Inference

To understand what this new research solves,

Sakana AI and NVIDIA Introduce TwELL with CUDA Kernels for 20.5% Inference and 21.9% Training Speedup in LLMs

Scaling large language models (LLMs) is expensive. Every token processed during inference and every gradient computed during training flows through feedforward layers that account for over two-thirds of model parameters and more than 80% of total FLOPs in larger models. A team of researchers from Sakana AI and NVIDIA has worked on new research that

NVIDIA AI Just Released cuda-oxide: An Experimental Rust-to-CUDA Compiler Backend that Compiles SIMT GPU Kernels Directly to PTX

NVIDIA AI researchers recently released cuda-oxide, an experimental compiler that allows developers to write CUDA SIMT (Single Instruction, Multiple Threads) GPU kernels in standard Rust code. The project compiles Rust directly to PTX (Parallel Thread Execution) — the assembly-like intermediate representation that CUDA uses to target NVIDIA GPUs — without requiring domain-specific languages, foreign function

NVIDIA AI Releases Star Elastic: One Checkpoint that Contains 30B, 23B, and 12B Reasoning Models with Zero-Shot Slicing

Training a family of large language models (LLMs) has always come with a painful multiplier: every model variant in the family—whether 8B, 30B, or 70B—typically requires its own full training run, its own storage, and its own deployment stack. For a dev team running inference at scale, this means multiplying compute costs by the number

Wave-Powered Ocean Data Centres: Inside Panthalassa’s $140M Bet

Peter Thiel just backed Panthalassa with $140M to build wave-powered ocean data centres. Here’s how the tech works and why it matters for AI infrastructure. The post Wave-Powered Ocean Data Centres: Inside Panthalassa’s $140M Bet appeared first on 1redDrop.
