AI Agent Orchestration: Routing and Handoffs in Support

TL;DR

AI agent orchestration is the process of routing a customer request through the right steps:

Classify
Retrieve
Call tools
Use a specialist agent
Request approval
Answer
Hand off

The orchestrator should be explicit, observable, and constrained.

For support teams, orchestration is about choosing the right next action, not making the agent look autonomous.

What is AI agent orchestration?

Definition

AI agent orchestration coordinates several specialized AI agents so they operate as one system working toward a single goal. Instead of asking one general-purpose model to handle everything, it gives each agent a narrow job and adds a control layer that decides which agent or tool takes each step.

In a customer support context, the orchestration layer is the part of the system that decides whether this message should be answered from the knowledge base, sent to a billing agent, looked up in an order tool, or escalated to a human. The agents are the specialists who do the work. The orchestrator decides who works.

The distinction matters because agents provide capability, while orchestration provides control. As Snowflake’s engineering team frames it, multi-agent systems differ from traditional AI pipelines: rather than a linear flow from input to output, agents operate iteratively and in parallel, revise plans, and act on partial information. Orchestration is what keeps that from becoming chaos.

This article explains AI agent orchestration for customer support. It covers:

Customer support orchestration architecture
OpenAI Agents SDK: how orchestration works in practice
Should you use multiple AI agents?
Top 5 AI agent framework comparison
How to use tools?
Manage approval gates and handoffs.
Industry-specific AI orchestration patterns
Conclusion
FAQs

Customer support orchestration architecture

In customer support, orchestration is usually not about making many agents debate each other. It is about selecting the next safe workflow:

Answer from knowledge
Call a tool
Collect a missing field
Route to a queue
Hand off.

In customer support, the chosen path must stay attached to the conversation. The AI acts as a classifier that detects intents, and Kommunicate uses that signal to manage conversations and data. The component that makes this choice is the router.

Router pattern

The router decides:

Answer from knowledge
Call order tool
Route to billing
Escalate the complaint
Ask for clarification

The router should return a structured decision. It can be shaped as follows:

{
  "path": "tool_lookup",
  "intent": "order_status",
  "tool": "get_order_status",
  "requiresApproval": false,
  "fallbackPath": "handoff",
  "reason": "Customer asks for current delivery status and provided an order ID."
}

This keeps orchestration debuggable. When the wrong path is chosen, you can inspect the decision instead of guessing why the model answered the way it did.

A good router also separates intent from action. A refund_request intent may still become clarify if the order ID is missing, handoff if the item is damaged, or answer if the customer only asks about the return window.
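The split between intent and action described above can be sketched as a small function. This is a minimal illustration, not a definitive implementation: the `RouterDecision` class, the `route_refund_request` function, its parameters, and the `get_refund_eligibility` tool name are all assumptions made for the example, and the field names use snake_case rather than the camelCase keys in the JSON above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RouterDecision:
    path: str                # "answer", "clarify", "tool_lookup", or "handoff"
    intent: str
    tool: Optional[str]
    requires_approval: bool
    fallback_path: str
    reason: str


def route_refund_request(order_id: Optional[str], item_damaged: bool,
                         asks_policy_only: bool) -> RouterDecision:
    """Map one intent (refund_request) to different paths based on context."""
    intent = "refund_request"
    if asks_policy_only:
        # A question about the return window can be answered from knowledge.
        return RouterDecision("answer", intent, None, False, "handoff",
                              "Customer only asks about the return window.")
    if order_id is None:
        # A required field is missing, so ask for it before doing anything else.
        return RouterDecision("clarify", intent, None, False, "handoff",
                              "Order ID is missing.")
    if item_damaged:
        # Damaged items go to a human.
        return RouterDecision("handoff", intent, None, False, "handoff",
                              "Item is reported as damaged.")
    # Hypothetical default: look the order up, and require approval for the refund.
    return RouterDecision("tool_lookup", intent, "get_refund_eligibility", True,
                          "handoff", "Order ID is present and no damage is reported.")


decision = route_refund_request(order_id=None, item_damaged=False, asks_policy_only=False)
print(decision.path, "-", decision.reason)
```

Because the decision is a plain data object, it can be logged and inspected when the wrong path is chosen.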

To see how this works in practice, we can look at the OpenAI Agents SDK.

OpenAI Agents SDK: how orchestration works in practice

The OpenAI Agents SDK, released in March 2025 as the production-ready successor to OpenAI’s experimental Swarm project, is the most direct way to implement the patterns described above using OpenAI models. It has accumulated over 26,900 GitHub stars and 10.3 million monthly downloads.

Its design philosophy is to provide the minimum set of primitives needed for agent development and let developers compose them without imposing heavy abstraction layers.

The four core primitives

Figure: OpenAI Agents SDK core primitives. Four cards show Agents (a narrow job and clear instructions), Handoffs (transfer the full conversation), Guardrails (validate input and output), and Tracing (log every step), with a triage agent handing off to a specialist agent below.

The SDK is built around four concepts:

Agents: An agent is an LLM configured with instructions, tools, and optional runtime behavior such as handoffs, guardrails, and structured outputs. Each agent has a narrow job.
Handoffs: A handoff transfers the entire conversation to a specialist agent. The receiving agent takes over the interaction with access to the complete conversation history. Unlike agent-as-tool (covered below), a handoff means the parent agent steps aside entirely.
Guardrails: Input and output validation that runs alongside the agents. Input guardrails check the user's incoming message, and output guardrails check the final response. A failed check stops the run, so a bad request can be rejected or a bad response blocked before it reaches the user.
Tracing: Built-in observability that logs each step in the OpenAI dashboard, showing which agent handled which turn, which tools were called, and where handoffs occurred.

An agent can be composed into a workflow in two ways: as a tool called by another agent, or as the target of a handoff.

Agent-as-tool vs handoff

These two patterns solve different problems, and the distinction is worth understanding before you design your routing architecture.

	Agent-as-tool	Handoff
Who owns the reply?	Calling (parent) agent	Receiving (specialist) agent
Conversation history	Not transferred	Fully transferred
Best for support	Subtasks: data lookup, summarise	Full routing: billing, refund, technical
Reversibility	The parent incorporates the result	Transfer is final for that turn

Use agent-as-tool when the main agent should stay responsible for the final answer, and call a specialist as a helper. Use a handoff when the specialist should own the next responsibility, and the main agent should step aside.

How to build an agent for support triage?

The canonical support orchestration pattern in the SDK is a triage agent that classifies incoming messages and routes them to the appropriate specialist via handoff.

The handoffs are declared in the triage agent’s configuration upfront, not discovered dynamically at runtime.

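from agents import Agent, handoff

# Specialist agents. In a real setup, each would also have its own instructions and tools.
billing_agent = Agent(name="Billing agent")
refund_agent = Agent(name="Refund agent")

# The triage agent declares its routing options upfront via `handoffs`.
# A target can be an Agent directly or a handoff(...) object, which allows customization.
triage_agent = Agent(
    name="Triage agent",
    handoffs=[billing_agent, handoff(refund_agent)],
    instructions=(
        "Route billing questions to the billing agent. "
        "Route refund requests to the refund agent."
    ),
)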

When a handoff occurs, the delegated agent receives the conversation history and takes over the conversation. The triage agent does not answer the user directly. It only decides who should.

The handoff() function also accepts an on_handoff callback, which fires as soon as the handoff is invoked. This is useful for kicking off a data fetch or logging a routing decision the moment the triage agent commits to a path, before the specialist agent even starts.

Some SDK limitations to factor in

The OpenAI Agents SDK is most effective when used with OpenAI models, especially via the Responses API, which is the recommended path for OpenAI-only applications. It can work with non-OpenAI providers through built-in provider integration points and third-party adapters such as LiteLLM. Still, some provider-specific capabilities, including tool calling, structured outputs, usage reporting, and routing behavior, should be validated before production use.

Runtime context is passed per run, so application-specific context still needs to be managed deliberately. However, the SDK includes built-in session memory to maintain conversation history across runs, with options such as:

SQLite
Redis
SQLAlchemy
MongoDB
Dapr
Encrypted sessions
OpenAI Conversations API sessions

For production-grade state, teams still need to choose and operate the right backing store, but they do not have to build all persistence from scratch.

The SDK is not a graph-based orchestration framework. It supports manager-style agents and explicit handoffs, where one agent can transfer control to another. For workflows that require conditional edges, durable execution, complex branching, human-in-the-loop checkpoints, or long-running stateful flows, LangGraph is a better fit.

Before building an AI agent workflow for production, you also need to decide whether to use one agent or several.

Should you use multiple AI agents?

Figure: When to split AI agents. Three levels of increasing complexity: a single agent with tools for most support workflows, a router plus specialists for different domains and different tools, and a multi-agent system for separate ownership at enterprise scale.

Most teams should start with one orchestrated agent. Use multiple agents only when ownership is genuinely different.

Setup	Best For
Single agent with tools	Most support workflows
Router plus specialists	Different domains with different tools
Multi-agent system	Complex enterprise workflows with separate ownership

Do not split agents because it sounds advanced. Split them when it makes permissions, prompts, tools, and evaluation clearer.

As OpenAI’s own orchestration guide notes, start with one agent whenever you can. Add specialists only when they materially improve capability isolation, policy isolation, prompt clarity, or trace legibility.

Now that you have a method to operationalize AI agents, let’s talk about which framework you should use.

Top 5 AI agent framework comparison

The right framework depends on what the workflow actually requires. Below is how the five main frameworks compare on the dimensions that matter for support orchestration.

Framework	Orchestration Model	State Persistence	Model Dependency	Best For
OpenAI Agents SDK	Explicit handoffs	Per-run by default; optional built-in sessions	Optimized for OpenAI models	Teams already on the OpenAI API
LangGraph	Directed graph, conditional edges	Built-in checkpointing	Model-agnostic	Complex branching, auditability
CrewAI	Role-based crews	Task outputs passed sequentially	Model-agnostic	Rapid prototyping, role-mapped teams
AutoGen / AG2	Conversational GroupChat	In-memory by default	Model-agnostic	Human-supervised review workflows
Google ADK	Hierarchical agent tree	Session state, pluggable backends	Optimized for Gemini	Google Cloud workloads

A few data points on relative adoption:

LangGraph leads in enterprise usage at 34.5 million monthly downloads.
CrewAI has grown to over 44,500 GitHub stars and is the fastest path to a working multi-agent prototype, though it runs agents sequentially by default, which limits its use in high-throughput production deployments.
AutoGen achieves higher reasoning accuracy on complex tasks but incurs significantly higher token costs than LangGraph due to its conversational overhead.

For most support orchestration, the choice comes down to two things: whether you need durable state across sessions (if yes, LangGraph), and whether your team is already invested in OpenAI’s API (if yes, start with the Agents SDK).

You do not need to pick one permanently. Many production teams use LangGraph for tool management and retrieval while using the Agents SDK for the agent layer.

One thing no framework gives you out of the box: the channel layer. Kommunicate provides support routing, human handoff, conversation history, and analytics that sit beneath any orchestration framework. Use the Kommunicate docs to connect orchestration to live support channels.

How to use tools?

Tools should be scoped by risk level, and that scoping should be explicit before any tool is deployed.

Figure: Tool risk tiers for AI agent orchestration. Five numbered tools (search docs: low, read order: low, create ticket: low, update account: medium, issue refund: high), each connected to a panel showing typed inputs and expected outputs.

Tool	Risk
Search docs	Low
Read order	Low
Create ticket	Low
Update account	Medium
Issue refund	High

Tool calls should have typed inputs and expected outputs.

{
  "tool": "get_order_status",
  "input": {
    "orderId": "A18291"
  },
  "expectedOutput": {
    "status": "string",
    "estimatedDelivery": "string",
    "requiresHumanReview": "boolean"
  }
}

The model can choose the tool. The backend should validate the tool input. To learn more about tool use, you can see our function-calling tutorial.
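The backend validation described above might look like the following sketch. It assumes a hypothetical `TOOL_SCHEMAS` registry and a hypothetical order ID format (one uppercase letter followed by digits, inferred from the sample value `A18291`). Neither comes from a real API, so treat both as placeholders for your own rules.

```python
import re
from typing import Any

# Hypothetical registry: tool name -> required input fields and their types.
TOOL_SCHEMAS: dict[str, dict[str, type]] = {
    "get_order_status": {"orderId": str},
}

# Assumed order ID format, inferred from the sample value "A18291".
ORDER_ID_PATTERN = re.compile(r"[A-Z]\d+")


def validate_tool_call(call: dict[str, Any]) -> list[str]:
    """Return a list of validation errors; an empty list means the call is acceptable."""
    schema = TOOL_SCHEMAS.get(call.get("tool"))
    if schema is None:
        return [f"unknown tool: {call.get('tool')!r}"]
    tool_input = call.get("input", {})
    if not isinstance(tool_input, dict):
        return ["input must be an object"]

    errors = []
    # Every declared field must be present and have the declared type.
    for field, expected_type in schema.items():
        if field not in tool_input:
            errors.append(f"missing field: {field}")
        elif not isinstance(tool_input[field], expected_type):
            errors.append(f"{field} must be {expected_type.__name__}")
    # Reject fields the model invented.
    for field in tool_input:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    # Field-level format check for order IDs.
    order_id = tool_input.get("orderId")
    if isinstance(order_id, str) and not ORDER_ID_PATTERN.fullmatch(order_id):
        errors.append("orderId has an invalid format")
    return errors


print(validate_tool_call({"tool": "get_order_status", "input": {"orderId": "A18291"}}))
print(validate_tool_call({"tool": "get_order_status", "input": {}}))
```

The validator runs in the backend, outside the model, so a malformed or invented tool call is rejected before it touches an order system.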

Manage approval gates and handoffs.

Approval gates

Approval gates require sign-off before the agent performs an action that significantly affects billing or the customer. In customer support, use approvals for:

Refunds
Cancellations
Account changes
Sensitive decisions
Low-confidence actions

Approval gates are not a workaround for a weak agent. They are a design requirement for any action that affects money, identity, or the state of an account. Approval gates and audit logging should be in place before go-live, not after the first incident.
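One way to express an approval gate in code is a small policy function that runs before any tool call. The sketch below reuses the risk tiers from the tool table earlier in this article. The snake_case tool names, the `CONFIDENCE_FLOOR` value, and the choice to treat unknown tools as high risk are assumptions for illustration.

```python
from enum import Enum


class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Risk tiers follow the tool table; the tool identifiers themselves are hypothetical.
TOOL_RISK = {
    "search_docs": Risk.LOW,
    "read_order": Risk.LOW,
    "create_ticket": Risk.LOW,
    "update_account": Risk.MEDIUM,
    "issue_refund": Risk.HIGH,
}

# Assumed threshold below which even a low-risk action needs approval.
CONFIDENCE_FLOOR = 0.7


def requires_approval(tool: str, confidence: float) -> bool:
    """Decide whether an action must wait for approval before it runs."""
    # Unknown tools are treated as high risk rather than silently allowed.
    risk = TOOL_RISK.get(tool, Risk.HIGH)
    if risk is not Risk.LOW:
        return True
    # Low-risk tools still need approval when the router is unsure.
    return confidence < CONFIDENCE_FLOOR


for tool, confidence in [("search_docs", 0.95), ("search_docs", 0.4), ("issue_refund", 0.99)]:
    print(tool, confidence, requires_approval(tool, confidence))
```

Keeping the policy in one function makes it easy to audit and to change without touching prompts.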

Handoff

Handoff is an orchestration outcome. It should include a summary and a reason. The summary gives the receiving human agent context without requiring them to read the full transcript. The reason explains why the AI could not or should not continue.
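A handoff record that enforces both fields could be sketched as follows. The `HandoffNote` class and its field names, including the `queue` field, are hypothetical and not part of any SDK. The point is only that a handoff without a summary or a reason is rejected at construction time.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class HandoffNote:
    conversation_id: str
    queue: str      # hypothetical destination queue, e.g. "billing"
    summary: str    # short context so the human agent can skip the full transcript
    reason: str     # why the AI could not or should not continue

    def __post_init__(self) -> None:
        # Refuse to create a handoff that is missing either required field.
        if not self.summary.strip():
            raise ValueError("handoff summary must not be empty")
        if not self.reason.strip():
            raise ValueError("handoff reason must not be empty")

    def to_json(self) -> str:
        """Serialize the note so it can be attached to the conversation."""
        return json.dumps(asdict(self))


note = HandoffNote(
    conversation_id="conv_123",
    queue="billing",
    summary="Customer reports a damaged item and wants a refund.",
    reason="Damaged-item refunds need human review.",
)
print(note.to_json())
```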

Log each orchestration step for every conversation, including those that end in a handoff:

Incoming channel
Detected intent
Selected path
Retrieved sources
Tool calls
Approval decisions
Final action
Handoff reason
Outcome

Without this trace, orchestration becomes impossible to debug. A wrong answer might come from the router, the retrieval, the tool, the prompt, or the handoff rule.

A useful orchestration trace is compact but complete:

{
  "conversationId": "conv_123",
  "intent": "order_status",
  "selectedPath": "tool_lookup",
  "retrievedSources": ["shipping_policy"],
  "toolsCalled": ["get_order_status"],
  "approvalRequired": false,
  "finalAction": "answer",
  "fallbackPath": "handoff",
  "reason": "Order ID was present, and lookup succeeded."
}

This trace gives support operations a practical debugging surface without exposing the full prompt or sensitive customer data.
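A minimal way to produce such a trace is to build it from an allowlist of fields, so that prompts and customer data cannot leak into the log by accident. In the sketch below, the `build_trace` and `log_trace` helpers and the logger name are assumptions. The field names match the trace example above.

```python
import json
import logging
from typing import Any

# Only these fields are ever written to the trace log.
TRACE_FIELDS = (
    "conversationId", "intent", "selectedPath", "retrievedSources",
    "toolsCalled", "approvalRequired", "finalAction", "fallbackPath", "reason",
)

logger = logging.getLogger("orchestration.trace")  # hypothetical logger name


def build_trace(step: dict[str, Any]) -> dict[str, Any]:
    """Copy allowlisted fields only; fail loudly if a required field is missing."""
    missing = [field for field in TRACE_FIELDS if field not in step]
    if missing:
        raise ValueError(f"trace is missing fields: {missing}")
    return {field: step[field] for field in TRACE_FIELDS}


def log_trace(step: dict[str, Any]) -> str:
    """Serialize the trace as one JSON line, log it, and return the line."""
    line = json.dumps(build_trace(step), sort_keys=True)
    logger.info(line)
    return line


step = {
    "conversationId": "conv_123",
    "intent": "order_status",
    "selectedPath": "tool_lookup",
    "retrievedSources": ["shipping_policy"],
    "toolsCalled": ["get_order_status"],
    "approvalRequired": False,
    "finalAction": "answer",
    "fallbackPath": "handoff",
    "reason": "Order ID was present, and lookup succeeded.",
    "prompt": "full system prompt",           # dropped by the allowlist
    "customerEmail": "customer@example.com",  # dropped by the allowlist
}
print(log_trace(step))
```

One JSON object per line keeps the trace easy to search and to load into an analytics tool later.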

For a practical rollout, follow these steps:

Start with one orchestrated agent and a few tools.
Add specialist agents only after the routing, logging, fallback, and handoff paths are stable.

This becomes more important as enterprise AI agents increase in volume. Gartner projects that 40% of enterprise applications will feature task-specific AI agents by the end of 2026, up from under 5% in 2025. Most of those deployments will start simple, but they need the above safeguards built in to function as intended.

Industry-specific AI orchestration patterns

Customer workflows should determine how much orchestration is actually needed.

Pattern	Case-study Signal	Orchestration Lesson
Department routing	BTVI routed student questions to departments and operators.	Keep queue selection explicit.
Multi-agent campus support	CSUSB used different agents and human groups.	Split paths by domain ownership.
Document-heavy expert support	TaxBuddy combined agent collection with CA review.	Separate collection, validation, and expert handoff.
High-volume gaming support	BlueStacks processed millions of messages.	Optimize for scale, fallback, and analytics.

For BFSI, the orchestrator might route suspected fraud to a security path, loan questions to a policy path, and branch appointments to a scheduling path.
For FinTech, failed payments, KYC, account lockouts, and chargebacks should each have separate tools and review rules.
For healthcare, administrative scheduling can be automated more safely than clinical advice.

The orchestrator should log why it chose a path. That gives the team a way to debug routing issues, improve prompts, and identify missing knowledge.

Conclusion

Make orchestration explicit. Use structured decisions. Log every path.

Salesforce reports that 66% of service organizations are now running AI agents in 2026, up from 39% in 2025.

Most of that growth is happening in support workflows, and the support teams seeing the best results tend to be the ones whose AI agents have the most observable orchestration. Good orchestration makes AI less mysterious and supports more predictable outcomes.

If you want a customer support AI agent preconfigured with handoff and escalation rules, book a demo.

FAQs

What is AI agent orchestration?

It is the process of coordinating multiple specialized AI agents within a unified system, so they work together toward a shared objective. In support, that means routing requests across tools, agents, workflows, and handoff paths based on intent and context.

What is a triage agent?

A triage agent is the entry-point agent in a multi-agent system. It inspects the incoming message, classifies the intent, and routes to the appropriate specialist agent via handoff. In the OpenAI Agents SDK, triage agents declare their routing options as a list of handoffs in their configuration. The triage agent does not answer the user directly. It decides who should.

How does the OpenAI Agents SDK handle orchestration?

Through two primitives: handoffs and agent-as-tool. Handoffs pass the full conversation to a specialist agent, who then owns the interaction. Agent-as-tool allows a parent agent to call a specialist as a helper while retaining control over the final response. For support workflows, handoffs are typically used for routing to departments (billing, refunds, technical support), and agent-as-tool is used for subtasks such as order lookups or data retrieval.

What is the difference between LangGraph and the OpenAI Agents SDK?

LangGraph uses a directed graph with conditional edges and built-in state checkpointing. It is suited for complex branching workflows that require auditability and durable state across sessions. The OpenAI Agents SDK uses explicit handoffs with ephemeral (per-run) state by default. LangGraph is model-agnostic; the SDK is optimized for OpenAI models. For most support orchestration, either works. LangGraph is the better choice if workflow auditability and cross-session state are hard requirements.

Do I need a framework to build an orchestrated agent?

No. A single well-prompted agent with good access to tools handles the majority of use cases that teams reach for frameworks to solve. Frameworks add value when the workflow requires stateful multi-agent coordination, complex branching, built-in tracing, or reusable agent primitives. Start without one. Add a framework when a specific capability gap becomes clear.

Should I use multiple agents?

Only when different responsibilities genuinely require different tools, prompts, or controls. Splitting too early creates more prompts, more traces, and more approval surfaces without necessarily improving the workflow. Split agents when it makes permissions, prompt clarity, or evaluation cleaner, not because it sounds more sophisticated.

What should be in an orchestration governance model?

At minimum: approval gates for high-risk actions (refunds, cancellations, account changes), structured logging of every routing decision, defined escalation paths to human agents, and regular review of fallback rates. Deloitte’s State of AI 2026 found that only 21% of companies have a mature governance model for agents. Approval gates and logging should be in place before go-live.

What should be logged?

Intent, selected path, tools called, approval decisions, handoff reason, and outcome. The trace should be compact enough to scan but complete enough to determine whether an incorrect answer came from the router, retrieval, the tool, the prompt, or the handoff rule.

Is handoff part of orchestration?

Yes, and it is one of the most important actions. A clean handoff includes a conversation summary and a reason the AI is stepping aside. It is not a fallback or a failure state. It is the correct outcome when the request exceeds what automation should handle.

Where does Kommunicate fit?

Kommunicate provides support routing, human handoff, conversation history, and analytics. Use the Kommunicate developer docs when orchestration needs to connect with real customer channels and support teams.

Adarsh

Adarsh Kumar is the CTO & Co-Founder at Kommunicate. As a seasoned technologist, he brings over 14 years of experience in software development, artificial intelligence, and machine learning to his role. His expertise in building scalable and robust tech solutions has been instrumental in the company’s growth and success.
