Sakana AI is taking a less common route in the agent race. Instead of pitching a single, ever-larger model as the answer to every task, the Tokyo-based lab is opening beta access to Sakana Fugu, a system designed to coordinate several models and agents around a problem.
The idea is simple but important: the best answer may not come from choosing one model. It may come from letting a smaller conductor decide which model should reason, code, verify, or summarize at each step.
The bet: coordination over size
In its Fugu announcement, Sakana describes the product as a multi-agent orchestration system offered through an API compatible with OpenAI-style endpoints. The company says Fugu dynamically selects models from a pool, assigns roles, dispatches subtasks, and builds collaboration patterns without relying on fixed human-written workflows.
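Sakana has not published the endpoint details, but "OpenAI-style endpoints" usually means the standard chat-completions request shape. A minimal sketch of such a payload, with a hypothetical model name standing in for whatever Fugu actually exposes:

```python
# Sketch of a request body for an OpenAI-compatible endpoint.
# The model name "fugu-mini" is a placeholder, not a confirmed
# Sakana identifier.
import json

def build_chat_request(model: str, user_message: str) -> dict:
    """Build a chat-completions payload in the OpenAI-style schema."""
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": user_message},
        ],
    }

payload = build_chat_request("fugu-mini", "Summarize this bug report.")
print(json.dumps(payload, indent=2))
```

The practical upside of schema compatibility is that existing OpenAI-client tooling can point at a different base URL without code changes beyond configuration.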
That is a different pitch from most enterprise AI tooling. Many platforms still ask developers to wire up chains, routers, prompts, and tool calls by hand. Sakana is arguing that the orchestration layer itself can be learned.
The system comes in two planned variants: Fugu Mini for lower latency and Fugu Ultra for more demanding tasks. Sakana says the product is based on its Conductor and Trinity research and is now open for external beta applications.
Why a small conductor matters
VentureBeat reported that Sakana trained a 7B-parameter Qwen2.5 model with reinforcement learning to act as the conductor. Its job is not to be the smartest model in the system. Its job is to decide how the other models should work together.
That distinction matters for cost and flexibility. If a small coordinator can route work across GPT, Claude, Gemini, open-source models, or internal tools, companies may not need to hardcode a separate pipeline for every product feature. They can expose a task and let the conductor assemble the workflow.
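The routing pattern itself is easy to picture. A toy sketch, where the task labels and model pool are invented for illustration and have nothing to do with Sakana's learned policy:

```python
# Toy illustration of conductor-style routing: a small policy maps
# each subtask to the model judged best suited for it, with a
# generalist fallback. Task labels and model names are invented.
ROUTES = {
    "code": "code-specialist-model",
    "verify": "verifier-model",
    "summarize": "small-local-model",
}

def route(subtask: str, default: str = "general-model") -> str:
    """Pick a model for a subtask; fall back to a generalist."""
    return ROUTES.get(subtask, default)

# An unseen subtask ("translate") falls through to the default.
plan = [route(t) for t in ["code", "verify", "summarize", "translate"]]
```

The point Sakana is making is that this mapping need not be a hand-written table at all: a trained conductor produces the routing decision, and the table above is only the shape of its output.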
Sakana says Fugu can also call itself recursively, using test-time compute as a knob. In practice, that means the conductor may inspect its own earlier coordination plan and revise it without retraining. That is still an early research-to-product claim, but it points toward agent systems that improve by spending more compute on difficult cases instead of using the same static path every time.
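Sakana has not described the mechanism, but "test-time compute as a knob" can be sketched as a plan-inspect-revise loop with an iteration budget. Everything below is illustrative: the scoring rule and stopping condition stand in for whatever critic the real system uses:

```python
# Sketch of test-time compute as a knob: the conductor drafts a
# coordination plan, scores it, and revises until the budget runs
# out or the plan stops improving. All names are illustrative.
def refine_plan(task: str, budget: int) -> list[str]:
    plan = ["draft with general model"]  # initial coordination plan
    for step in range(budget):
        score = len(plan)               # stand-in for a real critic
        if score >= 3:                  # stand-in stopping rule
            break
        plan.append(f"revision {step + 1}: add a verifier pass")
    return plan

cheap = refine_plan("hard proof", budget=1)     # low test-time compute
thorough = refine_plan("hard proof", budget=5)  # more compute, deeper plan
```

The knob is `budget`: an easy request exits early, a hard one earns more revision passes, and no retraining is involved either way.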
The practical appeal
The immediate use cases are familiar: coding, math, scientific reasoning, research, strategy work, and other tasks where multiple models may have complementary strengths. The more interesting part is operational.
Today, many teams have agent demos that work well on narrow examples and degrade when user requests become messy. Static chains are easy to understand, but they are brittle. A learned conductor promises a more adaptive middle layer: not a replacement for model evaluation, security checks, or workflow design, but a way to reduce manual routing work.
There are trade-offs. A dynamic orchestration system can be harder to debug than a fixed chain. Enterprises will want observability into which models were called, why they were selected, what data moved between them, and how costs were controlled. If Fugu hides too much of that decision-making, adoption will be limited to experiments.
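That observability requirement is concrete: every orchestration step should leave a record of which model ran, why, and at what cost. A minimal sketch of such a trace (the field names are invented for illustration, not a documented Fugu feature):

```python
# Sketch of an orchestration trace: each model call is logged with
# the model chosen, the conductor's reason, and the cost, so a
# dynamic system stays debuggable. Field names are invented.
from dataclasses import dataclass, field

@dataclass
class CallRecord:
    model: str
    reason: str      # why the conductor picked this model
    cost_usd: float

@dataclass
class Trace:
    records: list[CallRecord] = field(default_factory=list)

    def log(self, model: str, reason: str, cost_usd: float) -> None:
        self.records.append(CallRecord(model, reason, cost_usd))

    def total_cost(self) -> float:
        return sum(r.cost_usd for r in self.records)

trace = Trace()
trace.log("code-model", "subtask tagged as coding", 0.004)
trace.log("verifier-model", "output needed checking", 0.001)
```

Whether Fugu exposes something like this is exactly the kind of question enterprise evaluators will ask first.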
What to watch next
Fugu is still in beta, so the important questions are not just benchmark scores. Watch whether developers can inspect the conductor's decisions, constrain model choices, keep sensitive data inside approved boundaries, and measure when orchestration is worth the extra complexity.
The broader signal is clear: the next phase of agent infrastructure may not be about one model winning every category. It may be about systems that know when to use several models, when to stop, and when a simple local model is enough. That is a quieter story than another frontier model launch, but it could be more useful for teams trying to make agents work in production.