Before going further, it's worth understanding how Yao Agent actually executes a request. This page explains the execution model, the full project structure, and the three modes you'll choose between.
## The Execution Model
Every agent request flows through this pipeline:
```
User message
│
▼
┌─────────────┐
│ Create Hook │ ← optional TypeScript function, runs before the executor
└──────┬──────┘
│
▼
┌─────────────┐
│ Executor │ ← LLM / CLI Agent (OpenCode, Claude Code, Codex, etc.) / your own code
└──────┬──────┘
│
▼
┌─────────────┐
│ Next Hook │ ← optional TypeScript function, runs after the executor
└──────┬──────┘
│
▼
Response
```
**What goes in the Executor slot determines the agent type.** The Hooks on either side use the same interface regardless of what's in the middle.
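To make the two hook points concrete, here is a minimal TypeScript sketch. The function names `Create` and `Next` come from this page; the parameter shapes (`Context`, `Message`) and signatures are illustrative assumptions, not the real Yao API.

```typescript
// Hypothetical shapes -- the real Yao hook signatures may differ.
interface Context { Send(text: string): void; }
interface Message { role: string; content: string; }

// Create runs before the executor: enrich or rewrite the incoming messages.
function Create(ctx: Context, input: Message[]): Message[] {
  // Example: inject a constraint the executor must respect
  return [{ role: "system", content: "Answer in under 100 words." }, ...input];
}

// Next runs after the executor: inspect or post-process its raw output.
function Next(ctx: Context, output: Message): Message {
  return { ...output, content: output.content.trim() };
}
```

Whatever sits in the executor slot, these two functions bracket it the same way.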
## Three Execution Modes
### Mode 1: LLM (OpenAI / Anthropic API, etc.)
The default mode. Configure a `connector` and `prompts.yml` — the LLM handles all reasoning and response generation.
```
Create Hook → [ LLM (OpenAI / Anthropic API, etc.) ] → Next Hook
```
Best for: conversational assistants, Q&A, content generation, anything where you want the LLM to drive the response.
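A minimal configuration sketch for this mode. The file names `package.yao` and `prompts.yml` and the `connector` field come from this page; the exact field names and schema below are assumptions, so check the Yao reference for the real shape.

```json
{
  "name": "My Agent",
  "connector": "openai"
}
```

```yaml
# assistants/my-agent/prompts.yml -- hypothetical structure
- role: system
  content: You are a concise assistant for product FAQs.
```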
### Mode 2: CLI Agent (OpenCode, Claude Code, Codex, etc.)
Add a `sandbox.yao` file to put a CLI-based coding agent in the executor slot. It runs in an isolated container with full file system access.
```
Create Hook → [ CLI Agent (OpenCode, Claude Code, Codex, etc.) ] → Next Hook
```
Best for: code generation, file operations, multi-step technical tasks. This is the most capable mode and the one most real-world projects use.
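For orientation only, a sketch of what `sandbox.yao` might contain. Every field below is a guess at the shape, not the real schema; per this page, it is the file's presence, not its contents, that switches the executor to a CLI Agent.

```json
{
  "image": "yao/sandbox:latest",
  "command": "opencode"
}
```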
### Mode 3: Pure Hook (Your TypeScript Code)
Write a `Create Hook` that handles the request entirely in TypeScript — no LLM involved. Use `ctx.Send()` to stream a response directly, then return to skip the executor.
```
Create Hook → [ Your TypeScript Code ] → (no executor, no Next Hook)
```
Best for: deterministic logic, menu routing, form flows, or any response that doesn't need AI generation.
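A sketch of a Pure Hook in TypeScript. `ctx.Send()` and "return to skip the executor" come from this page; the `Context` shape, the input shape, and the return value are stand-in assumptions.

```typescript
// Pure Hook sketch: the request is handled entirely in TypeScript, no LLM call.
interface Context { Send(text: string): void; } // hypothetical context shape
const MENU: Record<string, string> = {
  "1": "Billing: see your invoices under Settings.",
  "2": "Support: reply here and a human will follow up.",
};

function Create(ctx: Context, input: { content: string }): { done: boolean } {
  const reply = MENU[input.content.trim()] ?? "Reply 1 for billing or 2 for support.";
  ctx.Send(reply);       // stream the deterministic answer directly
  return { done: true }; // returning here skips the executor entirely
}
```

Because nothing non-deterministic runs, the response is identical for identical input, which is exactly what menu routing and form flows need.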
---
All three modes share the same Hook interface. You can freely mix them — for example, route some requests through the LLM and handle others with pure code, all inside a single `Create Hook`.
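Mixing modes inside a single `Create Hook` might look like this sketch: known intents are answered deterministically, everything else falls through to the executor. The branching logic is illustrative; the signatures and the `null`-to-skip convention are assumptions, not the documented API.

```typescript
// One Create hook, two paths: Pure Hook for known intents, LLM for the rest.
interface Context { Send(text: string): void; } // hypothetical context shape
interface Input { content: string; }

function Create(ctx: Context, input: Input): Input | null {
  if (/^(ping|status)$/i.test(input.content.trim())) {
    ctx.Send("All systems operational."); // Pure Hook path: answer directly
    return null;                          // skip the executor
  }
  // LLM path: enrich the message, then let the executor handle it
  return { content: `[customer context attached]\n${input.content}` };
}
```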
## Assistant Directory Structure
Here's what a complete assistant directory can contain. Everything except `package.yao` is optional — start with what you need and add more as your agent grows.
```
assistants/my-agent/
├── package.yao # Required: name, connector, permissions, MCP config
├── prompts.yml # Default system prompt
├── prompts/ # Multiple prompt presets (create.yml, edit.yml, etc.)
│ └── edit.yml
├── src/
│ └── index.ts # Hooks: export function Create(...) and Next(...)
├── mcps/ # MCP tool servers
│ ├── tools.mcp.yao # Tool server definition
│ └── mapping/ # Tool parameter schemas
├── models/ # Agent-scoped database models (.mod.yao files)
├── pages/ # SUI sidebar pages shown during conversation
├── locales/ # i18n strings
│ ├── en-us.yml
│ └── zh-cn.yml
├── tests/ # Test cases
│ └── inputs.jsonl
│
└── sandbox.yao # ← CLI Agent only. This is the only file that
# distinguishes a CLI Agent from a Yao Agent.
```
**Yao Agent and CLI Agent have identical directory structures.** The only difference is `sandbox.yao`.
## Which Mode Should You Use?
| | Pure Hook | LLM | CLI Agent |
|---|---|---|---|
| **Executor** | Your TypeScript code | LLM (OpenAI / Anthropic API, etc.) | OpenCode, Claude Code, Codex, etc. |
| **Good for** | Routing, deterministic logic | Conversation, Q&A | Code, files, SKILL ecosystem |
| **Needs sandbox** | No | No | Yes |
| **Typical usage** | Mixed into other modes | Lightweight scenarios | **Most real projects** |
**Recommended learning path:**
1. Start with **LLM** mode — the mechanism is transparent and the code is minimal
2. Move to **CLI Agent** when you need file operations or the SKILL ecosystem
3. Add **Pure Hook** logic anywhere you want deterministic control
## Why Design It This Way?
Yao Agent is built on a simple premise: in the AI era, everything runs as an agent.
Every feature, every workflow, every business process — triggered by a schedule, an event, or a user, operating autonomously, reporting back when done. The question isn't whether to adopt this model; it's how to do it without losing control.
That's where the three-mode design comes from.
**AI outputs are non-deterministic by nature.** LLMs drift. CLI agents make unexpected decisions. Left unchecked, they produce outputs that are hard to validate, audit, or integrate into existing systems. The Hook layer — `Create` before the executor, `Next` after — gives you a consistent intervention point: inject context, enforce constraints, validate output, trigger downstream actions. The AI does the heavy lifting; you define the boundaries.
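As one sketch of the "validate output" role a `Next` hook can play, consider a simple leak check before anything reaches the user. The `Next` name comes from this page; the shapes and patterns below are illustrative assumptions.

```typescript
// Next-hook sketch: reject executor output that looks like a credential leak.
interface Output { content: string; }

const FORBIDDEN = [/api[_-]?key/i, /password/i]; // toy patterns for illustration

function Next(output: Output): Output {
  for (const pattern of FORBIDDEN) {
    if (pattern.test(output.content)) {
      // Replace risky output with a safe, auditable fallback
      return { content: "Response withheld: possible credential leak detected." };
    }
  }
  return output;
}
```

The executor stays free to generate whatever it wants; the boundary is enforced after the fact, in code you control.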
**Real applications aren't one-mode.** A single conversation might need an LLM to answer a question, a CLI Agent to write and test code, and a Pure Hook to check permissions or format the response — all in sequence. Because all three modes share the same Hook interface, you can compose them freely: route by intent, delegate to specialists, fall back to deterministic code when certainty matters.
This is the core philosophy: **the cage, not the animal.** You decide what runs, when it runs, and what comes out. The AI is powerful — Yao keeps it useful.
## What's Next
You now understand the full execution model. Time to go deeper:
- **[Yao Agent →](/tutorials/agent/yao-agent)** — Learn prompts, MCP tools, and Hooks, then build a real assistant
- **[CLI Agent →](/tutorials/agent/cli-agent)** — Skip ahead if you're ready to work with sandboxes and the SKILL ecosystem