Sakana AI Launches Sakana Fugu: An Orchestration Model That Routes Tasks Across a Swappable Pool of Frontier LLMs

Today, Sakana AI launched Sakana Fugu. It is a multi-agent orchestration system that behaves like one model. You send a request to a single endpoint. Fugu decides how to handle it internally. It solves a task directly when that is enough. It also assembles and coordinates a team of expert models when needed. The complexity of a multi-agent system never reaches your code.

TL;DR

Fugu delivers a multi-agent system behind one OpenAI-compatible API.

Fugu Ultra leads most published coding and reasoning benchmarks.

The orchestrator beats the individual models it coordinates.

Opt-out and provider routing target compliance and single-vendor risk.

Routing is proprietary, so per-query model selection stays hidden.

What is Sakana Fugu

Fugu is itself a language model. It is trained to call other LLMs in an agent pool. That pool includes instances of itself, called recursively. Fugu manages model selection, delegation, verification, and synthesis internally.

Instead of hard-coded roles or workflows, Fugu learns how to coordinate. It decides when to delegate and how agents should communicate. It then combines their work into one answer. From the outside, you call a single model. Inside, a coordinated system of experts does the work.

Sakana AI frames this as a hedge against single-vendor dependency. If one provider restricts access, Fugu routes around the disruption. The research team cites recent export controls on Anthropic’s Fable and Mythos models as motivation. Over time, newer models can be folded into the pool.

Fugu and Fugu Ultra: Two Models, One API

Fugu ships in two variants, both behind one OpenAI-compatible API:

Fugu balances strong performance with low latency. It is a default for everyday coding, code review, and chatbots. It also fits tools like Codex. You can opt specific agents out of its pool. That helps teams meet data, privacy, and compliance requirements.

Fugu Ultra is tuned for maximum answer quality on hard, multi-step problems. It coordinates a deeper pool of expert agents. Its pool is fixed, so opt-out is not available. The current model ID is fugu-ultra-20260615.

The Research Behind the Orchestrator

Fugu builds on two ICLR 2026 papers Trinity and the Conductor on learned orchestration.

TRINITY uses a lightweight evolved coordinator across several turns. It assigns Thinker, Worker, or Verifier roles to delegate work adaptively. Conductor is trained with reinforcement learning. It discovers natural-language coordination strategies and focused prompts for diverse LLM pools.

Together, they show systems can learn to assemble and route agents per task. That replaces hand-designed workflows.

Interactive Explainer

(function(){
window.addEventListener(“message”, function(e){
if (e && e.data && e.data.type === “fugu-sim-height”) {
var f = document.getElementById(“fugu-sim-frame”);
if (f && e.data.height) { f.style.height = e.data.height + “px”; }
}
});
})();

Benchmark

Sakana AI compares Fugu against the foundation models it orchestrates. Baselines use provider-reported scores. SWE Bench Pro uses the mini-swe-agent as scaffolding.

BenchmarkFuguFugu UltraOpus 4.8Gemini 3.1 ProGPT 5.5SWE Bench Pro*59.073.769.254.258.6TerminalBench 2.180.282.174.670.378.2LiveCodeBench92.993.287.888.585.3LiveCodeBench Pro87.890.884.882.988.4Humanity’s Last Exam47.250.049.844.441.4CharXiv Reasoning85.186.684.283.384.1GPQA-D95.595.592.094.393.6SciCode60.158.753.558.956.1τ³ Banking21.720.620.68.420.6Long Context Reasoning74.773.367.772.774.3MRCRv286.693.687.984.994.8

The orchestrator posts the top score on 10 of 11 rows. Fugu Ultra tops the four coding benchmarks, CharXiv Reasoning, and Humanity’s Last Exam. It ties regular Fugu on GPQA-D. Regular Fugu leads SciCode, τ³ Banking, and Long Context Reasoning. GPT 5.5 wins MRCRv2, the only baseline win here.

Its Fugu models stand shoulder-to-shoulder with Anthropic’s Fable 5 and Mythos Preview. Those two are not in Fugu’s pool, since they are not publicly accessible.

Use Cases

Sakana AI ran a beta with close to 500 early users. The published examples favor long, multi-step tasks.

AutoResearch: An agent improved a small GPT’s training recipe autonomously. It ran 123 experiments over roughly 14 hours on one H100 GPU. Fugu Ultra reached the best mean validation BPB of 0.9774, with a best single run of 0.9748.

Rubik’s Cube solver: Each model wrote a pure-Python solver, no libraries allowed. Fugu Ultra solved all 300 held-out cubes, averaging 19.72 moves. One baseline matched it closely at 19.76 moves. Two others crashed and solved none.

Classical Japanese kana reading order: On a 1610 letter, Fugu Ultra scored NED 0.80. The nearest baseline reached only 0.24.

Blindfold chess: Fugu played four games from memory, with no board shown. It beat three frontier models and a 2100-Elo Stockfish engine.

Online trading: On one 50-week window, Fugu Ultra returned +19.43% on average across five runs. The other frontier models stayed below +15%. Sakana AI notes past performance does not guarantee future results.

A Minimal API Example

Fugu uses an OpenAI-compatible API, so no SDK migration is required. Point an existing client at your console-provided endpoint.

Copy CodeCopiedUse a different Browserfrom openai import OpenAI

# Endpoint and key come from your Sakana console (console.sakana.ai).
client = OpenAI(
base_url=”https://<your-fugu-endpoint>/v1″, # from console.sakana.ai
api_key=”YOUR_SAKANA_API_KEY”,
)

resp = client.chat.completions.create(
model=”fugu-ultra-20260615″, # or “fugu”
messages=[
{“role”: “user”,
“content”: “Reproduce the method in this paper and report the gap.”},
],
)

print(resp.choices[0].message.content)

Token usage and cost are reported per request. So you can monitor spend in real time.

Community Reactions

#fugu-sent-root *{box-sizing:border-box;margin:0;padding:0}
#fugu-sent-root{
–bg:#fff;–ink:#0a0a0a;–mut:#6b6b6b;–line:#dcdcdc;–soft:#f5f5f5;–soft2:#ebebeb;
font-family:”IBM Plex Mono”,ui-monospace,SFMono-Regular,Menlo,Consolas,monospace;
background:var(–bg);color:var(–ink);border:1px solid var(–ink);
max-width:920px;margin:0 auto;-webkit-font-smoothing:antialiased;line-height:1.5;
}
#fugu-sent-root .hd{border-bottom:1px solid var(–ink);padding:18px 20px;display:flex;justify-content:space-between;align-items:flex-start;gap:12px;flex-wrap:wrap}
#fugu-sent-root .hd h2{font-size:17px;letter-spacing:.03em;font-weight:700}
#fugu-sent-root .hd p{font-size:11.5px;color:var(–mut);margin-top:6px;max-width:560px}
#fugu-sent-root .tag{font-size:10px;letter-spacing:.12em;text-transform:uppercase;border:1px solid var(–ink);padding:4px 8px;white-space:nowrap}
#fugu-sent-root .panel{padding:18px 20px;border-bottom:1px solid var(–line)}
#fugu-sent-root .lbl{font-size:10px;letter-spacing:.16em;text-transform:uppercase;color:var(–mut);margin-bottom:10px;display:block}
/* overview bar */
#fugu-sent-root .obar{display:flex;height:32px;border:1px solid var(–ink);overflow:hidden}
#fugu-sent-root .seg{display:flex;align-items:center;justify-content:center;white-space:nowrap;border-right:1px solid var(–ink)}
#fugu-sent-root .seg:last-child{border-right:0}
#fugu-sent-root .seg.sup{background:#0a0a0a}
#fugu-sent-root .seg.ske{background:repeating-linear-gradient(45deg,#0a0a0a,#0a0a0a 1px,#fff 1px,#fff 6px)}
#fugu-sent-root .seg.cri{background:#fff}
#fugu-sent-root .seg .t{font-size:10.5px;font-weight:700;background:#fff;color:#0a0a0a;border:1px solid #0a0a0a;padding:1px 7px;line-height:1.4}
#fugu-sent-root .legend{display:flex;gap:18px;flex-wrap:wrap;margin-top:12px;font-size:11px;color:var(–mut)}
#fugu-sent-root .legend span{display:inline-flex;align-items:center;gap:7px}
#fugu-sent-root .sw{width:14px;height:14px;border:1px solid var(–ink);display:inline-block}
#fugu-sent-root .sw.sup{background:#0a0a0a}
#fugu-sent-root .sw.ske{background:repeating-linear-gradient(45deg,#0a0a0a,#0a0a0a 1px,#fff 1px,#fff 6px)}
#fugu-sent-root .sw.cri{background:#fff}
#fugu-sent-root .summary{font-size:12.5px;margin-top:14px;border-left:3px solid var(–ink);padding-left:12px}
/* filters */
#fugu-sent-root .filters{display:flex;gap:0;flex-wrap:wrap;border:1px solid var(–ink);width:max-content;max-width:100%}
#fugu-sent-root .filters button{font-family:inherit;font-size:12px;background:var(–bg);color:var(–ink);border:0;padding:9px 15px;cursor:pointer;letter-spacing:.02em;border-right:1px solid var(–ink)}
#fugu-sent-root .filters button:last-child{border-right:0}
#fugu-sent-root .filters button.on{background:#0a0a0a;color:#fff}
/* cards */
#fugu-sent-root .cards{padding:8px 20px 18px;display:grid;grid-template-columns:1fr 1fr;gap:12px}
#fugu-sent-root .card{border:1px solid var(–ink);padding:13px 14px;display:flex;flex-direction:column;gap:9px;background:var(–bg)}
#fugu-sent-root .card .top{display:flex;justify-content:space-between;align-items:center;gap:8px}
#fugu-sent-root .who{display:flex;align-items:center;gap:8px;min-width:0}
#fugu-sent-root .plat{font-size:9px;letter-spacing:.08em;border:1px solid var(–ink);padding:2px 5px;font-weight:700;flex:none}
#fugu-sent-root .plat.x{background:#0a0a0a;color:#fff}
#fugu-sent-root .handle{font-size:12px;font-weight:700;white-space:nowrap;overflow:hidden;text-overflow:ellipsis}
#fugu-sent-root .chip{font-size:9px;letter-spacing:.08em;text-transform:uppercase;border:1px solid var(–ink);padding:2px 7px;flex:none}
#fugu-sent-root .chip.sup{background:#0a0a0a;color:#fff}
#fugu-sent-root .chip.ske{background:var(–soft2)}
#fugu-sent-root .chip.cri{background:#fff;border-style:dashed}
#fugu-sent-root .card .body{font-size:12.5px;line-height:1.5}
#fugu-sent-root .card .q{font-style:italic}
#fugu-sent-root .card .foot{display:flex;justify-content:space-between;align-items:center;gap:8px;margin-top:auto;padding-top:4px;border-top:1px dotted var(–line)}
#fugu-sent-root .theme{font-size:10px;color:var(–mut);letter-spacing:.04em}
#fugu-sent-root a.src{font-size:11px;color:var(–ink);text-decoration:none;border-bottom:1px solid var(–ink);white-space:nowrap;font-weight:700}
#fugu-sent-root a.src:hover{background:#0a0a0a;color:#fff;border-bottom-color:#0a0a0a;padding:0 3px}
#fugu-sent-root .affil{font-size:9px;color:var(–mut)}
/* press row */
#fugu-sent-root .press{display:flex;gap:10px;flex-wrap:wrap}
#fugu-sent-root .press a{font-size:11.5px;color:var(–ink);text-decoration:none;border:1px solid var(–ink);padding:7px 11px;display:inline-flex;gap:6px;align-items:center}
#fugu-sent-root .press a:hover{background:#0a0a0a;color:#fff}
#fugu-sent-root .note{font-size:10px;color:var(–mut);line-height:1.6;padding:0 20px 16px}
#fugu-sent-root .ft{padding:12px 20px;border-top:1px solid var(–ink);display:flex;justify-content:space-between;align-items:center;gap:10px;flex-wrap:wrap;font-size:10.5px;color:var(–mut)}
#fugu-sent-root .ft b{color:var(–ink);letter-spacing:.04em}
@media(max-width:640px){
#fugu-sent-root .cards{grid-template-columns:1fr}
#fugu-sent-root .hd h2{font-size:15px}
#fugu-sent-root .seg{font-size:10px}
}

Sakana Fugu — Early Community Sentiment
A manual review of public reaction on X and Hacker News, with links to every source. Captured June 22, 2026.

12 posts reviewed

Sentiment split (n = 12)

Supportive 3
Skeptical 6
Critical 3

Supportive
Skeptical
Critical

Early reaction skews skeptical. The “is this just a router or wrapper?” question dominates. The clearest supportive voices are Sakana‑affiliated.

All
Supportive
Skeptical
Critical

Press & analysis

Hacker News thread · 50 pts &nearr;
VentureBeat report &nearr;
Clanker Cloud analysis &nearr;

Method: sentiment was assigned by hand from a small sample of public posts on June 22, 2026. This is not a statistical survey, and the split can shift as more reactions arrive. Two of the three supportive posts are from Sakana AI or its CEO. Quotes are shortened; follow each link for full context. The Reddit quote is as reported by VentureBeat.

Marktechpost · Sakana Fugu sentiment tracker
Sources: X · Hacker News · VentureBeat

(function(){
var root = document.getElementById(‘fugu-sent-root’);
var DATA = [
// SUPPORTIVE
{s:’sup’, plat:’X’, handle:’@SakanaAILabs’, affil:’Sakana AI (official)’,
body:’Launch announcement. Positions Fugu Ultra to match Fable and Mythos, without export-control risk.’,
theme:’Announcement’, url:’https://x.com/SakanaAILabs/status/2068861630327443966′},
{s:’sup’, plat:’X’, handle:’@hardmaru’, affil:’David Ha, Sakana CEO’,
body:’“Orchestration Models are the next frontier, beyond bigger models.” Frames it as a hedge against single-vendor risk.’,
theme:’Vision’, url:’https://x.com/hardmaru/status/2068884466056225025′},
{s:’sup’, plat:’Blog’, handle:’Clanker Cloud’, affil:’Independent analysis’,
body:’Calls Fugu a productized orchestration layer and a healthy debate — but wants real observability into which agents ran.’,
theme:’Analysis’, url:’https://clankercloud.ai/blog/sakana-fugu-release-model-orchestration-clanker-cloud’},
// SKEPTICAL
{s:’ske’, plat:’HN’, handle:’ed_mercer’, affil:’Hacker News’,
body:’“So basically… openrouter?”’,
theme:’Router framing’, url:’https://news.ycombinator.com/item?id=48625104′},
{s:’ske’, plat:’HN’, handle:’embedding-shape’, affil:’Hacker News’,
body:’Asks how one API is not just swapping one single-vendor dependency for another.’,
theme:’Sovereignty’, url:’https://news.ycombinator.com/item?id=48625312′},
{s:’ske’, plat:’HN’, handle:’bprasanna’, affil:’Hacker News’,
body:’“Isn’t this what perplexity is?”’,
theme:’Router framing’, url:’https://news.ycombinator.com/item?id=48625401′},
{s:’ske’, plat:’HN’, handle:’stygiansonic’, affil:’Hacker News’,
body:’Reads it as a coordinator, not just fusion — an agent-of-agents, with token usage rising accordingly.’,
theme:’Architecture’, url:’https://news.ycombinator.com/item?id=48625273′},
{s:’ske’, plat:’HN’, handle:’alasano’, affil:’Hacker News’,
body:’Sees Fugu Ultra building a dynamic multi-model mini-plan, more than OpenRouter Fusion.’,
theme:’Architecture’, url:’https://news.ycombinator.com/item?id=48625361′},
{s:’ske’, plat:’Reddit’, handle:’GreedyWorking1499′, affil:’Reddit (via VentureBeat)’,
body:’“A highly advanced router/wrapper” — not a fundamental leap like Mythos or Fable, until proven otherwise.’,
theme:’Router framing’, url:’https://venturebeat.com/orchestration/no-claude-fable-5-no-problem-sakana-achieves-frontier-performance-with-new-fugu-multi-model-auto-synthesis-system’},
// CRITICAL
{s:’cri’, plat:’X’, handle:’@eliebakouch’, affil:’Prime Intellect’,
body:’“This is not ‘AI sovereignty’.” Calls regular Fugu a router and flags opaque “Model A/B/C” baselines.’,
theme:’Sovereignty / transparency’, url:’https://x.com/eliebakouch/status/2068939729811468503′},
{s:’cri’, plat:’X’, handle:’@teortaxesTex’, affil:’Independent’,
body:’Withholds excitement pending cost analysis. An orchestrator spending many frontier tokens may not beat best-of-n.’,
theme:’Cost’, url:’https://x.com/teortaxesTex/status/2068986775796687229′},
{s:’cri’, plat:’HN’, handle:’adamnemecek’, affil:’Hacker News’,
body:’“Seems kinda underwhelming considering they raised like $400M.”’,
theme:’Expectations’, url:’https://news.ycombinator.com/item?id=48625429′}
];

var labelS = {sup:’Supportive’, ske:’Skeptical’, cri:’Critical’};

function render(filter){
var box = root.querySelector(‘#cards’);
box.innerHTML = ”;
DATA.filter(function(d){ return filter===’all’ || d.s===filter; }).forEach(function(d){
var platCls = d.plat===’X’ ? ‘plat x’ : ‘plat’;
var card = document.createElement(‘div’);
card.className = ‘card’;
card.innerHTML =
”+
”+d.plat+”+
”+d.handle+’