Claude Opus 4.8 dynamic workflows in Claude Code, explained

Shivam Malani
Dynamic workflows is the headline addition that shipped with Claude Opus 4.8 inside Claude Code. Instead of working turn by turn, Claude writes a JavaScript orchestration script from your plain-language request, hands it to a separate runtime, and that runtime spins up dozens to hundreds of subagents in parallel. Your chat session stays free while the agents work, and only the final, consolidated result comes back to you. It launched as a research preview on May 28, 2026.

Quick answer: Put the word workflow in your prompt, or run /effort ultracode, on Claude Code v2.1.154 or later. Claude plans the job, fans it out across parallel subagents, verifies the results, and returns one report.

What a dynamic workflow is in Claude Code

A dynamic workflow is not a new model, a plugin, or a separate command-line tool. It is a shift in who holds the plan. With a normal subagent, Claude decides each step in its own context window, and every intermediate result eats into that context. With a workflow, the plan moves into code. The script decides what to launch, in what order, and with what loop or branching logic, and it keeps intermediate state in script variables that live outside the conversation.

That separation is the whole point. Because the plan and the partial results sit in the runtime rather than Claude's context, the model's window holds only the final answer. The session is responsive the entire time, and you get a single report at the end instead of a long turn-by-turn transcript. You can read the official write-up on dynamic workflows in Claude Code.
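
To make the contrast concrete, here is a minimal Python sketch of the idea. It is an analogy only: real workflow scripts are JavaScript, and the runtime's actual API is not shown in this article, so `run_agent` and `audit_workflow` are hypothetical stand-ins. The point it illustrates is that intermediate findings sit in ordinary variables and only the final summary is returned.

```python
from typing import Callable


def run_agent(task: str) -> str:
    """Hypothetical stand-in for a subagent; a real one would call a model."""
    return "missing auth check" if "admin" in task else "ok"


def audit_workflow(files: list[str], agent: Callable[[str], str] = run_agent) -> str:
    """The plan is code: the loop decides what runs next, not a chat turn."""
    findings: dict[str, str] = {}  # intermediate state lives in script variables
    for path in files:
        findings[path] = agent(f"audit {path}")
    flagged = [path for path, result in findings.items() if result != "ok"]
    summary = f"{len(flagged)} of {len(files)} files flagged"
    # Only this consolidated report would reach the conversation.
    return summary + (": " + ", ".join(flagged) if flagged else "")


print(audit_workflow(["routes/admin.js", "routes/health.js"]))
```

However many files the loop visits, the caller sees one short string, which is the property the article attributes to workflows.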


How a workflow runs

When a workflow starts, Claude plans dynamically from your prompt and breaks the task into subtasks. Subagents then run in parallel, each tackling the problem from an independent angle. Other agents try to refute those findings, and the run keeps iterating until the answers converge. Results are checked before anything is folded into the final output.

The runtime enforces hard limits. It allows up to 16 agents running at the same time and caps a single run at 1,000 agents total. The orchestration script itself cannot touch the filesystem or shell. Only the spawned agents read files, write files, and run commands. Subagents inside a workflow always operate in acceptEdits mode, so their file changes are auto-approved regardless of the session's permission mode.
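
The two limits can be pictured with a small asyncio sketch. This is not the runtime's code: `fake_agent` and `run_capped` are hypothetical, and rejecting an oversized run outright is an assumption, since the article does not say what happens when a script asks for more than 1,000 agents.

```python
import asyncio

MAX_CONCURRENT = 16  # concurrency ceiling quoted in the article
MAX_TOTAL = 1000     # per-run agent cap quoted in the article


async def fake_agent(task_id: int) -> int:
    """Hypothetical agent: yields control once, then returns a result."""
    await asyncio.sleep(0)
    return task_id * 2


async def run_capped(task_ids: list[int]) -> tuple[list[int], int]:
    """Run every task with at most MAX_CONCURRENT in flight; return results and peak concurrency."""
    if len(task_ids) > MAX_TOTAL:
        # Assumption: this sketch refuses the run instead of queueing or truncating it.
        raise ValueError(f"run would spawn {len(task_ids)} agents; cap is {MAX_TOTAL}")
    gate = asyncio.Semaphore(MAX_CONCURRENT)
    running = 0
    peak = 0

    async def guarded(task_id: int) -> int:
        nonlocal running, peak
        async with gate:  # blocks while MAX_CONCURRENT agents are already running
            running += 1
            peak = max(peak, running)
            try:
                return await fake_agent(task_id)
            finally:
                running -= 1

    results = await asyncio.gather(*(guarded(t) for t in task_ids))
    return list(results), peak


results, peak = asyncio.run(run_capped(list(range(100))))
print(len(results), "agents finished; peak concurrency", peak)
```

A semaphore is one common way to bound in-flight work; the real runtime may schedule agents differently.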

Progress is saved as the run proceeds. If you interrupt a workflow, the agents that already finished return cached results on resume and only the remaining ones run live, because the runtime tracks the state of every agent. Coordination happens outside the conversation, which keeps the plan on track even across a long job.
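
The resume behavior resembles ordinary checkpointing. The sketch below is a rough illustration under stated assumptions: the JSON state file, `run_with_checkpoints`, and `demo_agent` are all hypothetical, and the article does not describe how the runtime actually stores agent state.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def run_with_checkpoints(
    tasks: list[str], agent: Callable[[str], str], state_file: Path
) -> dict[str, str]:
    """Run each task once; results already recorded in state_file are reused."""
    done: dict[str, str] = json.loads(state_file.read_text()) if state_file.exists() else {}
    for task in tasks:
        if task in done:
            continue  # finished before the interruption: reuse the cached result
        done[task] = agent(task)
        state_file.write_text(json.dumps(done))  # persist progress after every agent
    return done


# Demo: the first call stands in for an interrupted run, the second for the resume.
live_calls: list[str] = []


def demo_agent(task: str) -> str:
    live_calls.append(task)  # record which tasks actually ran live
    return task.upper()


with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "run_state.json"
    run_with_checkpoints(["a", "b"], demo_agent, state)               # stopped after a and b
    final = run_with_checkpoints(["a", "b", "c"], demo_agent, state)  # resume: only c runs live

print(live_calls)
print(final)
```

In the demo, tasks `a` and `b` run once in total, even though the second call lists them again.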


Three ways to start a workflow

Option 1: Trigger one explicitly by including the word workflow anywhere in your prompt. Claude Code highlights it and writes a script for that single task instead of working turn by turn.


Run a workflow to audit every API endpoint under src/routes/ for missing auth checks

Option 2: Turn on automatic orchestration with /effort ultracode. Ultracode combines xhigh reasoning with auto orchestration, so Claude decides on its own when a task is big enough to justify a workflow. It is available only on models that support xhigh effort, and it lasts for the current session.


/effort ultracode

Option 3: Run the bundled /deep-research workflow to try the feature without describing a complex task. It launches web searches from multiple angles, fetches the sources, cross-checks them, votes on each claim, and returns a cited report.


/deep-research "What changed in the Node.js permission model between v20 and v22?"
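
The "votes on each claim" step can be pictured as a tally over independent verdicts. The article does not say how /deep-research counts votes, so the strict-majority rule, the verdict labels, and the `vote_on_claim` helper below are all assumptions made for illustration.

```python
from collections import Counter


def vote_on_claim(verdicts: list[str], threshold: float = 0.5) -> str:
    """Return the verdict backed by more than `threshold` of the votes, else 'unresolved'."""
    if not verdicts:
        return "unresolved"
    verdict, count = Counter(verdicts).most_common(1)[0]
    return verdict if count / len(verdicts) > threshold else "unresolved"


# Hypothetical verdicts from agents that each checked a claim against a different source.
claims = {
    "claim A": ["supported", "supported", "refuted"],
    "claim B": ["supported", "refuted"],
}
report = {claim: vote_on_claim(votes) for claim, votes in claims.items()}
print(report)
```

Here claim A passes with two of three votes, while the tied claim B is reported as unresolved instead of being forced into one side.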

Note: when ultracode is on, a single request can become several workflows in sequence, such as one to understand the code, one to apply the change, and one to verify. Drop back to /effort high once the heavy task is done, because ultracode typically burns more tokens than a standard session.


Approval, dashboard, and saving a workflow

Before execution, Claude Code shows you the planned phases and asks for approval. The prompt offers four choices: run it, run it and stop asking for this name in this path, show the raw script (open it with Ctrl+G), or decline. In auto mode the prompt only appears on the first launch for that workflow. With bypass permissions, or through claude -p and the Agent SDK, confirmation is disabled and the workflow starts immediately.

Once a run is going, open the dashboard with /workflows. Each phase appears with its agent count, total tokens consumed, and elapsed time. Inside the dashboard you can navigate with the arrows, pause with p, kill a single agent with x, restart it with r, and save the generated script as a reusable command with s. Saved scripts live in .claude/workflows/ or ~/.claude/workflows/ and can be re-run on any branch or project with a slash command.


Where it runs and which plans get it

Requirement | Detail
Minimum version | Claude Code v2.1.154 or later
Surfaces | CLI, Desktop, and VS Code extension
Plans | Max, Team, and Enterprise
Default state | On by default on Max and Team; off on Enterprise until an admin enables it
Also available on | Claude API, Amazon Bedrock, Vertex AI, Microsoft Foundry
Concurrent agents | Up to 16
Total agents per run | Up to 1,000

The Bun rewrite, the clearest real example

The standout proof point is a port of Bun, the JavaScript runtime, from Zig to Rust. Jarred Sumner used dynamic workflows to produce roughly 750,000 lines of Rust in 11 days from first commit to merge, keeping 99.8% of the existing test suite passing. One workflow mapped the correct Rust lifetime for every struct field. The next wrote each .rs file as a behavior-identical port. Hundreds of agents worked in parallel, with two reviewers assigned to each file, and a fix loop drove the build and tests until they were clean. The result is not yet in production.

The same pattern fits other large jobs. Repo-wide bug hunting runs parallel searches and then verifies each finding to weed out false positives. Large-scale migrations cover framework swaps, API deprecations, and language ports across thousands of files. Critical verification uses adversarial agents that critique each other and deliver only the conclusions that survive the cross-check.
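
The "only conclusions that survive the cross-check" idea reduces to a filter: a finding is kept only if no reviewer manages to refute it. The sketch below assumes that rule; the reviewers are trivial hypothetical functions standing in for adversarial agents, and the sample findings are made up.

```python
from typing import Callable

Reviewer = Callable[[str], bool]  # returns True when the reviewer fails to refute the finding


def surviving_findings(findings: list[str], reviewers: list[Reviewer]) -> list[str]:
    """Keep only the findings that every reviewer fails to refute."""
    return [f for f in findings if all(review(f) for review in reviewers)]


def cites_a_line(finding: str) -> bool:
    """Hypothetical reviewer: rejects findings that point at no specific line."""
    return "line " in finding


def not_in_tests(finding: str) -> bool:
    """Hypothetical reviewer: rejects findings located in test code."""
    return "tests/" not in finding


candidates = [
    "src/auth.py line 42: missing permission check",
    "tests/test_auth.py line 7: missing permission check",
    "src/db.py: looks suspicious",
]
print(surviving_findings(candidates, [cites_a_line, not_in_tests]))
```

Only the first candidate survives both reviewers, which is how false positives get weeded out before the final report.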


When to use a workflow versus subagents or skills

Aspect | Subagent | Skill | Workflow
What it is | Worker spawned by Claude | Markdown instructions | Script run by the runtime
Who decides the next step | Claude, turn by turn | Claude, following the prompt | The script
Intermediate results | Claude's context window | Claude's context window | Script variables
Scale | A few tasks per turn | Same as subagents | Dozens to hundreds per run
Interruption | The turn restarts | The turn restarts | Resumable within the session

The rule is simple. If the plan fits in two or three steps Claude can hold in its head, stick with subagents or skills. Once the plan needs to be expressed as code, repeated, and scaled to hundreds of independent operations, reach for a workflow. Skills still work inside workflows, since each spawned agent can use installed skills.


Cost and how to keep it under control

A workflow burns substantially more tokens than a standard Claude Code session, because every agent carries its own context overhead. A run that spawns hundreds of agents can climb into a large bill fast. Each agent uses the model of the current session unless the script routes a phase to a different one, so check /model before a big run. Switching to Opus 4.8 for a 500-agent audit can change the cost by an order of magnitude compared with a smaller model.

Start with a tightly scoped task to calibrate token use before launching a repository-wide audit or a migration across thousands of files. You can also ask Claude, in the same task description, to use a smaller model for phases that do not need the deepest reasoning.
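
A back-of-the-envelope estimate helps with that calibration. The sketch below uses the Opus 4.8 prices quoted in this article ($5 per million input tokens, $25 per million output tokens). The per-agent token counts are invented for illustration and should be replaced with numbers from a small pilot run, and the estimate ignores prompt caching and any phases routed to a cheaper model.

```python
OPUS_INPUT_PER_MTOK = 5.00    # USD per million input tokens, as quoted in this article
OPUS_OUTPUT_PER_MTOK = 25.00  # USD per million output tokens, as quoted in this article


def estimate_run_cost(
    agents: int,
    input_tokens_per_agent: int,
    output_tokens_per_agent: int,
    input_price: float = OPUS_INPUT_PER_MTOK,
    output_price: float = OPUS_OUTPUT_PER_MTOK,
) -> float:
    """Rough USD cost of a run in which every agent uses the same token budget."""
    total_input = agents * input_tokens_per_agent
    total_output = agents * output_tokens_per_agent
    return total_input / 1_000_000 * input_price + total_output / 1_000_000 * output_price


# Hypothetical per-agent usage: 40,000 input tokens and 4,000 output tokens.
pilot = estimate_run_cost(agents=10, input_tokens_per_agent=40_000, output_tokens_per_agent=4_000)
full = estimate_run_cost(agents=500, input_tokens_per_agent=40_000, output_tokens_per_agent=4_000)
print(f"pilot: ${pilot:.2f}, full run: ${full:.2f}")
```

Under these made-up numbers, a 10-agent pilot costs about $3 and the 500-agent run about $150, so the bill grows linearly with agent count.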

The API change that makes it possible

On the raw Messages API, the enabling piece is mid-conversation system messages. Before Opus 4.8, a system prompt sat at the start of a conversation and stayed fixed. Now you can place a system entry partway through the messages array, updating instructions, permissions, token budgets, or environment context mid-task without breaking the prompt cache or routing the change through a user turn. That ability to grant or change permissions mid-run is what lets an orchestrator launch worker agents after a run has already started, based on what it discovers.

You can build the same orchestration pattern outside Claude Code. Run an orchestrator call at xhigh effort that plans the task, use a mid-conversation system message to grant it permission to dispatch workers, fan out worker calls in parallel scoped to one unit each, then collect the results and feed them back to the orchestrator to merge. Set a generous max_tokens so the model has room to plan and coordinate.


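import anthropic

# Reads ANTHROPIC_API_KEY from the environment.
client = anthropic.Anthropic()

# Orchestrator call: xhigh effort and a large output budget for planning.
# Note: with a max_tokens this large, the Python SDK may require streaming
# (client.messages.stream) instead of a blocking create call.
orchestrator = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=64000,
    output_config={"effort": "xhigh"},
    thinking={"type": "adaptive"},
    messages=[
        {"role": "user", "content": "Plan a refactor of the auth module across all 14 services."},
    ],
)

# The plan comes back as content blocks; print the text ones.
for block in orchestrator.content:
    if block.type == "text":
        print(block.text)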

Orchestrator call on the Messages API

Run each worker as a separate Messages call, often at a lower effort level since its job is narrow. Cap max_tokens per worker so a runaway agent cannot drain your budget, and cache shared context so the repeated system prompt is not billed at full rate on every worker.
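
A worker fan-out along those lines might look like the sketch below. It only builds request arguments and dispatches them through an injected `send` function, so it runs without network access. In real use `send` could be `lambda kwargs: client.messages.create(**kwargs)`. The helper names, the shared context string, the medium effort level, and the 4,000-token cap are assumptions, and the `output_config` shape is copied from the orchestrator example, not verified independently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable

# Hypothetical shared instructions that every worker receives.
SHARED_CONTEXT = "You are refactoring the auth module. Follow the agreed plan exactly."


def build_worker_request(unit: str, max_tokens: int = 4000) -> dict[str, Any]:
    """Keyword arguments for one worker call, scoped to a single unit of work."""
    return {
        "model": "claude-opus-4-8",             # model id as used in the article's example
        "max_tokens": max_tokens,                # per-worker cap so one agent cannot drain the budget
        "output_config": {"effort": "medium"},  # lower effort; shape copied from the article
        "system": [
            {
                "type": "text",
                "text": SHARED_CONTEXT,
                "cache_control": {"type": "ephemeral"},  # mark the shared prefix as cacheable
            }
        ],
        "messages": [{"role": "user", "content": f"Refactor auth usage in service: {unit}"}],
    }


def fan_out(
    units: list[str], send: Callable[[dict[str, Any]], Any], max_workers: int = 4
) -> list[Any]:
    """Send one request per unit in parallel; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda unit: send(build_worker_request(unit)), units))


def fake_send(request: dict[str, Any]) -> str:
    """Stand-in for a real API call so the sketch runs offline."""
    return "done: " + request["messages"][0]["content"]


print(fan_out(["billing", "search"], fake_send))
```

Prompt caching generally takes effect only once the cached prefix exceeds a minimum length, so a shared context as short as this placeholder would not benefit in practice.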


How you know it worked, and a few constraints

You can tell a workflow completed when Claude returns a single consolidated report instead of a running transcript, and the /workflows dashboard shows each phase finished with its agent count and token total. For the Bun-style migration pattern, the bar is the existing test suite, so a clean build and passing tests are the signal that the job converged.

Two constraints are worth remembering. First, no user input is accepted in the middle of a run; the only thing that pauses a workflow is a permission prompt from an agent for shell, web fetch, or an MCP tool that is not on the allowlist. Second, ultracode is session-scoped, so it resets on every new session, and it is only available on models that support xhigh effort.

Opus 4.8 keeps the standard Opus pricing of $5 per million input tokens and $25 per million output tokens, with effort levels of low, medium, high (the default), xhigh, and max, so your spend scales with the number of agents you let loose.