Building CLI for Agents

This document outlines guidelines and best practices for building CLIs for coding agents.

Why CLI over MCP

Keeping both MCP and a CLI is perfectly fine, but a CLI has the following advantages:

Token control: with a CLI, agents can decide ad hoc to use extra tools to shape, search, or filter the results the CLI returns. Using grep, jq, head, tail, etc. is second nature for agents. With MCP, you have to manage the response yourself to keep it from bloating the context window.
Reliability: MCP servers often crash or disconnect, and they add latency overhead. A CLI is always available, stateless, and fast.
Steerability: you get more direct control over a CLI through AGENTS.md and Skills.md. MCP offers limited steerability.

NDJSON (newline-delimited JSON) is the preferred streaming format for CLI output. Each line is a complete JSON object. Some tools and docs also call this JSON Lines (JSONL), but this guide uses NDJSON consistently.

The Eight Golden Rules of Ugo Enyioha

Based on "Writing CLI Tools That AI Agents Actually Want to Use."

1. Structured Output Is Not Optional

The single most important thing you can do is support --json or --output json. Agents are good at parsing text, but “good” isn’t “reliable.” A table that wraps differently depending on terminal width, or a status field that sometimes says "Running" and sometimes "running" — these cause silent failures in agent workflows.

Rules

JSON to stdout, everything else to stderr. Progress messages, warnings, spinners — all stderr. Stdout is your API contract.
Flat over nested. {"pod_name": "web-1", "pod_status": "Running"} is easier for an agent to work with than {"pod": {"metadata": {"name": "web-1"}, "status": {"phase": "Running"}}}.
Consistent types. If age is a number in one command, don’t make it a string like "3 days" in another. Use seconds or ISO 8601 timestamps.
NDJSON for streaming. If the command produces incremental output, emit one JSON object per line. Agents handle NDJSON well.

Examples

❌ Bad: agent has to parse this

$ myctl list pods
NAME       STATUS   AGE
web-1      Running  3d
worker-2   Failed   1h

✅ Good: agent gets clean data

$ myctl list pods --json
[
  {"name": "web-1", "status": "Running", "age_seconds": 259200},
  {"name": "worker-2", "status": "Failed", "age_seconds": 3600}
]
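The rules above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `flatten_pod` and `emit_pods` and the shape of the nested input record are assumptions made up for the example. It shows data going to stdout, diagnostics going to stderr, and a nested record being flattened into consistently typed fields.

```python
import json
import sys


def flatten_pod(pod: dict) -> dict:
    """Flatten a nested pod record (hypothetical shape) into flat, typed fields."""
    return {
        "name": pod["metadata"]["name"],
        "status": pod["status"]["phase"],
        "age_seconds": int(pod["age_seconds"]),  # always a number, never "3d"
    }


def emit_pods(pods: list, as_json: bool, out=sys.stdout, err=sys.stderr) -> None:
    """Write pod data to `out` and everything else to `err`."""
    # Diagnostics go to stderr so stdout stays a clean, parseable contract.
    print(f"fetched {len(pods)} pods", file=err)
    rows = [flatten_pod(p) for p in pods]
    if as_json:
        json.dump(rows, out)
        out.write("\n")
    else:
        for row in rows:
            print(f'{row["name"]}\t{row["status"]}\t{row["age_seconds"]}s', file=out)
```

Passing the streams in as parameters keeps the stdout/stderr split easy to verify in tests.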

2. Exit Codes Are the Agent’s Control Flow

Agents check $? to decide what to do next. A tool that returns 0 on failure breaks every agent workflow that depends on it.

Rules

Use a stable exit-code contract. For example:

0 = success
1 = general failure
2 = usage error (bad arguments)
3 = resource not found
4 = permission denied
5 = conflict (resource already exists)

Document your exit codes. An agent that gets exit code 5 can decide to skip creation and move to the next step. An agent that gets exit code 1 for everything has to parse stderr to figure out what happened — and it will sometimes get it wrong.
Combine exit codes with structured error output.

Examples

✅ Good: combine exit codes with structured error output

$ myctl create thing --name duplicate-name
# stderr: Error: resource "duplicate-name" already exists
# stdout (with --json): {"error": "conflict", "message": "resource 'duplicate-name' already exists"}
# exit code: 5
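One way to keep such a contract stable is to define the codes once and route every failure through a single helper. The sketch below assumes the codes are numbered 0 through 5 in the order listed above, which agrees with exit code 5 being used for conflicts. The `ExitCode` and `fail` names are illustrative, not part of any real tool.

```python
import json
import sys
from enum import IntEnum


class ExitCode(IntEnum):
    # Assumed numbering: 0..5 in the order the contract lists them.
    SUCCESS = 0
    GENERAL_FAILURE = 1
    USAGE_ERROR = 2
    NOT_FOUND = 3
    PERMISSION_DENIED = 4
    CONFLICT = 5


def fail(code: ExitCode, error: str, message: str, as_json: bool,
         out=sys.stdout, err=sys.stderr) -> int:
    """Report an error on both channels and return the exit code to use."""
    print(f"Error: {message}", file=err)  # human-readable line on stderr
    if as_json:
        # Machine-readable error object on stdout, only when JSON was requested.
        json.dump({"error": error, "message": message}, out)
        out.write("\n")
    return int(code)
```

A command would then end with something like `sys.exit(fail(ExitCode.CONFLICT, "conflict", "resource 'duplicate-name' already exists", as_json=True))`.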

3. Make Commands Idempotent

Agents retry. Networks fail. Commands get interrupted. If your create command fails on the second run because the resource already exists, the agent has to write special-case retry logic. The kubectl model is a good reference: kubectl apply is idempotent by design. Declarative commands (ensure, apply, sync) are inherently safer for agents than imperative ones (create, delete).

Rules

Make conflicts detectable. If you can’t make a command idempotent, return a distinct exit code (like 5 for “already exists”) so the agent can handle it programmatically.

Examples

❌ Fragile: fails on retry

$ myctl create namespace prod
Error: namespace "prod" already exists

✅ Robust: idempotent

$ myctl ensure namespace prod
namespace "prod" already exists (no changes)

✅ Or use a flag

$ myctl create namespace prod --if-not-exists
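The core of an idempotent `ensure` is a check before the write, plus a result that says whether anything changed. The sketch below uses an in-memory set as a stand-in for the real backend. `ensure_namespace` and the `changed` field are assumptions for illustration.

```python
def ensure_namespace(existing: set, name: str) -> dict:
    """Create the namespace only if it is missing; safe to call repeatedly."""
    if name in existing:
        # Already present: report success without modifying anything.
        return {"namespace": name, "changed": False}
    existing.add(name)
    return {"namespace": name, "changed": True}
```

Running it twice with the same name succeeds both times, and the second call reports `"changed": False`, so a retrying agent needs no special-case logic.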

4. Self-Documenting Beats External Docs

When an agent encounters an unfamiliar CLI, the first thing it does is run --help. That help text is your tool description, your parameter spec, and your usage guide all in one.

Rules

Show required vs optional clearly. Agents will not guess which flags are required.
Include realistic examples. Agents learn patterns from examples faster than from flag descriptions.
Document the --json flag. If the agent doesn’t know it exists, it won’t use it.
Use subcommand discovery. myctl --help should list all subcommands. myctl deploy --help should give full detail.

Examples

❌ Bad: minimal help

$ myctl deploy --help
Usage: myctl deploy [flags]

✅ Good: the agent can learn from this

$ myctl deploy --help
Deploy a service to the target environment.

Usage:
  myctl deploy <service-name> --env <environment> [flags]

Arguments:
  service-name          Name of the service to deploy (required)

Flags:
  --env string          Target environment: dev, staging, prod (required)
  --image string        Container image override (default: from config)
  --dry-run             Preview changes without applying
  --wait                Wait for deployment to complete (default: true)
  --timeout duration    Maximum wait time (default: 5m)
  --json                Output result as JSON

Examples:
  myctl deploy web-api --env staging
  myctl deploy web-api --env prod --image myregistry/web:v2.1.0 --json
  myctl deploy web-api --env dev --dry-run
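Most argument-parsing libraries can produce help text of this shape. As a rough sketch using Python's standard `argparse` (the `build_deploy_parser` name and the reduced flag set are assumptions chosen to keep the example short), required flags are declared as required and the examples go into a raw-formatted epilog:

```python
import argparse

EXAMPLES = """\
Examples:
  myctl deploy web-api --env staging
  myctl deploy web-api --env dev --dry-run
"""


def build_deploy_parser() -> argparse.ArgumentParser:
    """Build a parser whose --help text carries description, flags, and examples."""
    parser = argparse.ArgumentParser(
        prog="myctl deploy",
        description="Deploy a service to the target environment.",
        epilog=EXAMPLES,
        # Keep the epilog's line breaks so the examples stay copy-pasteable.
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument("service_name", metavar="service-name",
                        help="Name of the service to deploy (required)")
    parser.add_argument("--env", required=True, choices=["dev", "staging", "prod"],
                        help="Target environment (required)")
    parser.add_argument("--dry-run", action="store_true",
                        help="Preview changes without applying")
    parser.add_argument("--json", action="store_true",
                        help="Output result as JSON")
    return parser
```

Because `--env` is marked `required=True`, a missing value produces a usage error rather than a silent default, which is what an agent needs in order to correct its call.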

5. Design for Composability

Unix philosophy applies doubly for agents. Agents already think in pipelines — they chain commands naturally.

Rules

--quiet / -q for bare, pipeable values. One value per line, no headers, no decoration. Agents use this for piping into xargs or while read.
Stdin acceptance is explicit. If a command can read from stdin, document it: myctl apply -f - reads from stdin. Don’t make the agent guess.
Batch operations. If an agent needs to delete 50 resources, myctl delete --selector app=old is one call instead of 50.

Examples

⚠️ Acceptable: an agent will naturally compose these

myctl list pods --json | jq '.[] | select(.status == "Failed") | .name'

✅ Better: build filtering in

myctl list pods --status failed --json --field name

✅ Best: support both approaches

myctl list pods --status failed --json   # filtered JSON
myctl list pods --status failed --quiet  # just names, one per line
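Supporting both approaches usually comes down to one filter step and two renderers. The following is a minimal sketch under the assumption that pods are already available as flat dictionaries. The `render` function and its parameters are hypothetical.

```python
import json
from typing import Optional


def render(pods: list, status: Optional[str] = None, quiet: bool = False) -> str:
    """Filter by status, then render as bare names (quiet) or as JSON."""
    if status is not None:
        # Case-insensitive match so "failed" selects "Failed".
        pods = [p for p in pods if p["status"].lower() == status.lower()]
    if quiet:
        # One value per line, no headers: ready for xargs or `while read`.
        return "\n".join(p["name"] for p in pods)
    return json.dumps(pods)
```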

6. Provide Dry-Run and Confirmation Bypass

Agents need two things that conflict with interactive CLI design: they need to preview destructive actions, and they need to execute without human confirmation prompts.

Rules

--dry-run should produce structured output. Not "would deploy web-api to prod" but a JSON diff of what would change.
--yes / --no-confirm / --force bypasses prompts. An agent cannot type "y" at a confirmation prompt. If your CLI hangs waiting for input, the agent’s workflow is dead.
Detect non-interactive terminals. If stdin is not a TTY, either skip prompts automatically or fail with a clear error telling the user to pass --yes.

Examples

✅ Preview what would happen

$ myctl deploy web-api --env prod --dry-run --json
{
  "action": "deploy",
  "changes": [
    {"type": "update", "resource": "deployment/web-api", "diff": "image: v1.2.0 -> v2.1.0"}
  ],
  "warnings": ["This will restart 3 running pods"]
}

✅ Execute without interactive prompt

$ myctl deploy web-api --env prod --yes

✅ Or use force

$ myctl deploy web-api --env prod --force
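The non-interactive check can be a small guard in front of every prompt. This sketch takes the "fail with a clear error" option rather than skipping prompts automatically. The `confirm` helper and its parameters are assumptions for illustration.

```python
import sys


def confirm(prompt: str, assume_yes: bool, stdin=sys.stdin, err=sys.stderr) -> bool:
    """Return True if the action may proceed; never block on a non-TTY stdin."""
    if assume_yes:
        return True  # --yes was passed: skip the prompt entirely
    if not stdin.isatty():
        # No human is attached, so fail fast instead of hanging on input.
        print("Error: refusing to prompt on non-interactive stdin; pass --yes to proceed",
              file=err)
        return False
    print(f"{prompt} [y/N] ", end="", file=err, flush=True)
    return stdin.readline().strip().lower() in ("y", "yes")
```

The prompt itself is written to stderr so that stdout stays reserved for data.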

7. Errors Should Be Actionable

When a command fails, the agent needs to decide: retry, try something else, or give up. The error message determines which.

Rules

Parseable error codes/types in structured output. A string like "image_not_found" is parseable. "Error occurred" is not.
Include the failing input. If the image name is wrong, echo it back. The agent needs this to construct a fix.
Suggest next steps when possible. "suggestion": "run myctl images list to see available tags" gives the agent a concrete recovery path.
Separate transient from permanent errors. A timeout is worth retrying. A permission denied is not. If your exit codes or error types distinguish these, the agent can build appropriate retry logic.

Examples

❌ Bad: the agent has no idea what to do

$ myctl deploy web-api --env prod
Error: deployment failed

✅ Good: the agent can reason about this

$ myctl deploy web-api --env prod --json
# exit code: 1
# stderr: Error: image "myregistry/web:v2.1.0" not found in registry
# stdout: {"error": "image_not_found", "image": "myregistry/web:v2.1.0", "registry": "myregistry", "suggestion": "check image tag exists"}
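A structured error of this kind can be assembled by a single helper that also marks whether a retry is worthwhile. In the sketch below, the `retryable` field, the set of transient error types, and the `error_payload` name are all assumptions added for illustration. The source only asks that transient and permanent errors be distinguishable.

```python
import json

# Hypothetical error types treated as transient, and therefore worth retrying.
TRANSIENT = {"timeout", "rate_limited", "connection_reset"}


def error_payload(error: str, suggestion: str = "", **inputs) -> str:
    """Build a JSON error object that echoes the failing inputs back."""
    payload = {"error": error, "retryable": error in TRANSIENT, **inputs}
    if suggestion:
        payload["suggestion"] = suggestion  # concrete recovery path, when known
    return json.dumps(payload)
```

For example, `error_payload("image_not_found", suggestion="check image tag exists", image="myregistry/web:v2.1.0")` yields an object with `"retryable": false` and the offending image name included.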

8. Use Consistent Noun-Verb Grammar

When designing a CLI with many subcommands, order matters. Human users might memorize random command names, but agents rely on predictable patterns to discover what a tool can do.

Rules

Prefer noun → verb (resource → action). The noun verb pattern (e.g., docker container ls, gh pr create) is exceptionally agent-friendly because it naturally groups related actions in the --help output.
Layer --help like a tree. When an agent runs myctl --help, it sees a list of resources (nouns). When it runs myctl user --help, it sees all possible actions (verbs) for that resource. This hierarchical structure turns exploration into a deterministic tree search, rather than a guessing game.

Examples

❌ Bad: mixed grammar is hard to guess

$ myctl create-user
$ myctl delete_user
$ myctl user-group add

✅ Good: Noun -> Verb hierarchy

$ myctl user create
$ myctl user delete
$ myctl user group add
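With Python's standard `argparse`, the noun → verb tree maps directly onto nested subparsers. This is a small sketch covering only `user create` and `user delete`. The `build_parser` name and the `name` argument are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a two-level parser: resource (noun) first, then action (verb)."""
    parser = argparse.ArgumentParser(prog="myctl")
    nouns = parser.add_subparsers(dest="noun", required=True, metavar="<resource>")

    # `myctl user --help` lists only the verbs that apply to users.
    user = nouns.add_parser("user", help="Manage users")
    verbs = user.add_subparsers(dest="verb", required=True, metavar="<action>")

    create = verbs.add_parser("create", help="Create a user")
    create.add_argument("name")
    delete = verbs.add_parser("delete", help="Delete a user")
    delete.add_argument("name")
    return parser
```

Each level of the tree gets its own `--help`, so an agent can walk from `myctl --help` to `myctl user --help` to `myctl user create --help` without guessing.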

Context Window Discipline

CLI tools must minimize the amount of data returned to the agent. Large responses increase token usage, reduce reasoning capacity, and slow down workflows.

Rules

Return only what is strictly necessary for the call the agent is making.
Support field selection (projection) so agents can avoid downloading full objects when they only need a few fields.
Avoid full datasets by default; prefer small, bounded responses as the normal path.
Require explicit flags (e.g. --page-all, --all) before returning unbounded or complete result sets.
Support pagination with flags such as --limit and --cursor (or your stack’s equivalent).
Offer NDJSON streaming output when results can be large so the agent can consume rows incrementally.
Avoid deeply nested or verbose structures unless the caller explicitly asks for that shape.
Do not include unused metadata fields by default; add richness only when requested.
Expose cardinality up front — count or total — so agents can estimate size before committing to a heavy fetch.
Document how to shrink output in --help and examples: fields, filters, limits, NDJSON streaming, and “fetch all” flags.

Examples

✅ Field selection

gws drive files list --params '{"fields": "files(id,name,mimeType)"}'

✅ Pagination

hiro tasks list --limit 20
hiro tasks list --cursor abc123

✅ Count / cardinality check

hiro tasks count --status open

{ "count": 523 }

✅ Bounded list with total

hiro tasks list --limit 20

{
  "tasks": [...],
  "total": 523,
  "next_cursor": "abc123"
}

✅ Fetch all (explicit)

hiro tasks list --page-all

✅ Streaming output (NDJSON)

hiro tasks list --page-all
# The default machine format is line-oriented JSON (in hirotm, the global --format ndjson option).

{"id": "t1", "title": "Fix bug"}
{"id": "t2", "title": "Add feature"}

✅ Filter + projection (preferred pattern)

hiro tasks list --status open --fields id,title --limit 10
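Server-side, the bounded-list, projection, and NDJSON patterns above can be combined in a few lines. The sketch below treats the cursor as a stringified offset purely for simplicity; a real service would more likely use an opaque token. `list_page` and `to_ndjson` are hypothetical helpers, not part of any real CLI.

```python
import json
from typing import Optional


def list_page(items: list, limit: int = 20, cursor: Optional[str] = None,
              fields: Optional[list] = None) -> dict:
    """Return one bounded page, with the total and the cursor for the next page."""
    start = int(cursor) if cursor else 0  # assumption: cursor is an offset
    page = items[start:start + limit]
    if fields:
        # Projection: keep only the requested fields in each row.
        page = [{k: item[k] for k in fields if k in item} for item in page]
    end = start + len(page)
    return {
        "tasks": page,
        "total": len(items),  # cardinality up front
        "next_cursor": str(end) if end < len(items) else None,
    }


def to_ndjson(rows: list) -> str:
    """Serialize rows as NDJSON: one complete JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows)
```

Returning `total` alongside a small page lets the agent decide whether fetching the rest is worth the tokens.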

