This Week in AI: Multivendor Strategy

This episode of This Week in AI arrived at a moment when the AI infrastructure most teams take for granted suddenly looked a lot less stable. Andreas Welsch, founder and chief human AI officer at Intelligence Briefing, was joined by Matt Palmer, head of developer experience at Conductor and developer educator on LinkedIn Learning, to work through what the US government’s export restrictions on frontier AI models actually mean for practitioners, why delegating to agents isn’t as effortless as it sounds, and what Sakana AI’s new Fugu system offers as an alternative architecture.

When the API disappears

Andreas and Matt kicked things off by following up on the latest on the Fable 5 and Mythos saga. The US government has now loosened restrictions on Anthropic’s Fable 5 and Mythos Preview, limiting them to 100 handpicked US organizations. OpenAI followed with similar restrictions on GPT-5.6, capping early access at roughly 20 organizations. For most practitioners, those models simply vanished.

Andreas named what a lot of European technology leaders were already thinking: The export restrictions may reflect policy concerns, but they’re really an infrastructure story. If your stack depends on a single frontier model that can become unavailable without warning, you’ve built a hard dependency into your architecture, not a vendor relationship.

Matt made a complementary point from a builder’s perspective. Anyone who spent time with Fable 5 before the restrictions took effect was starting to get a feel for the capability gap between it and the next available option. That gap is a business risk when a competitor has access and you don’t.

The conversation here lands in territory O’Reilly has been tracking for a while: The question that organizations should keep top of mind is how to build with enough flexibility that you can route across models when circumstances change. That means thinking about multivendor strategy as a baseline architectural requirement, the same way teams treat database portability or cloud provider independence. Anthropic has said it hopes access restrictions will evolve quickly. That may be true. . .but it also may not be. Building as if it is seems like the riskier bet.

The delegation trap

As agentic development becomes more widespread, we’ve been hearing more and more about cognitive fatigue. As developers delegate more work to coding agents, they’re reporting higher exhaustion. Last weekend, as Andreas pointed out, another article made the rounds, highlighting even more stories of engineers checking in on their agents around the clock, from their children’s soccer games to their beds. More agents running means more sessions to monitor, more approvals to give, more half-finished work to review in the morning. The promise of “it runs while you sleep” turns into something closer to managing a shift across multiple workstreams at once.

As Matt pointed out:

I think everybody is in some ways a manager of a bunch of agents now, or they’re just orchestrating workflows across these agents. Sometimes what it feels like is being a manager of a mid-sized team. You’re just sending messages all the time, and you’re checking in to make sure things are being done. Writing code, which was once a really relaxing activity—you sit down, you know, cup of coffee, you’re listening to jazz, you’re chilling out, focused on a task—it doesn’t feel like there’s that focus so much anymore.

Andreas connected this to a Harvard Business Review study from earlier this year that tracked a 200-person software company: As AI tools became more capable, people started taking on work that previously belonged to adjacent roles. Product managers were prototyping. Developers were doing design work. The tools expanded what felt possible, and what felt possible became what felt necessary, which meant more work, not less.

Andreas also drew on his own background moving from individual contributor to leadership in the corporate world, where delegation was a formalized skill with a framework behind it: What’s the task? What’s the goal? What data should be used? What does good output look like? How long should it take? Most professionals building with AI today are doing this without training, improvising delegation protocols on the fly.

This is an area where the industry’s investment in tooling has run well ahead of its investment in the organizational skills that make the tooling usable. More capable agents don’t automatically reduce load; they redistribute it in ways that are harder to see and manage. The practitioners who will continue doing this well over the long term are the ones who figure out how to set scope clearly, check output efficiently, and protect the focused work time that deep collaboration still requires.

One API call, many models

The episode’s technical centerpiece was Matt’s walkthrough of Sakana Fugu, a new model/multi-agent system from the Tokyo-based research lab Sakana AI. Fugu is a trained coordinator model that routes your query to a pool of frontier models, assembles a team of specialists, and returns a synthesized result, all through one OpenAI-compatible endpoint. The multi-agent orchestration happens entirely behind that single API call.

Matt walked through the architecture step-by-step. A query hits a lightweight coordinator model that assigns roles. One model thinks through the best approach, another does the implementation work, and a third acts as a verifier. The system can be recursive, with the coordinator assigning a subset of work back through the same process at a smaller scale. Sakana calls this learned orchestration, and the concept is backed by two papers—“TRINITY: An Evolved LLM Coordinator” and “Learning to Orchestrate Agents in Natural Language with the Conductor”—that explore how systems can learn to route and coordinate rather than follow hand-designed workflows. Matt also showed how to quickly set up Fugu as a direct API call via curl (it’s a drop-in replacement for OpenAI-compatible endpoints), through the Codex harness with a one-line installer, and through the open source OpenCode harness via OpenRouter.

Sakana is claiming its novel orchestration method extracts better performance from existing models. Fugu’s Ultra model scores comparably to Fable 5 on agentic benchmarks like Terminal-Bench, and it’s priced identically to GPT-5.5. Whether the performance claims hold up across a wider range of real workloads will be determined by the community, but the portability argument stands regardless of how those benchmarks are eventually validated.

Sakana launched Fugu 10 days after the US export restrictions on Fable 5 and Mythos took effect, with an explicit pitch around AI sovereignty. Because Fugu orchestrates models from multiple providers, a restriction on any single model won’t take the system down, and you can opt specific providers out. For teams in regions facing access uncertainty (Europe is currently locked out pending regulatory compliance, for example), that architecture is a direct response to the problem Andreas opened the episode with.

Qualcomm’s acquisition of Modular, announced the same week for roughly $3.9 billion, fits the same pattern at the hardware layer. Modular’s platform lets AI models run across different chip architectures, including NVIDIA, AMD, and custom ASICs, without requiring developers to rewrite code for each one. Qualcomm gets a hardware-agnostic abstraction layer, and the market gets another data point that portability is becoming a priority investment across the entire stack.

What’s next

Join us for the next episode of This Week in AI on Monday, July 6, from 10:00–10:30am EST, when Christina Stathopoulos breaks down the latest developments in AI.

Register to attend episodes live on the O’Reilly learning platform. If you’re not yet a member, you try it out with a free 10-day trial.

This Week in AI is available on YouTube, Spotify, Apple, or wherever you get your podcasts.