This document outlines guidelines and best practices for building CLIs for coding agents.

Why CLI over MCP

While it is perfectly ok to keep both MCP and CLI, CLI has the following advantages:
  • Token control: with a CLI, agents can decide ad hoc to use extra tools to shape, search, or filter the results the CLI returns. Piping through grep, jq, head, tail, etc. is second nature for agents. With MCP, you have to manage the response yourself to keep it from bloating the context window.
  • Reliability: MCP servers often crash, disconnect, and add latency overhead. A CLI is always available, stateless, and fast.
  • Steerability: You get more direct control over CLI with AGENTS.md and Skills.md. MCP has limited steerability.
NDJSON (newline-delimited JSON) is the preferred streaming format for CLI output. Each line is a complete JSON object. Some tools and docs also call this JSON Lines (JSONL), but this guide uses NDJSON consistently.
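
As a rough illustration of the "one complete JSON object per line" property, here is a minimal Python sketch of an NDJSON writer and reader. The function names `write_ndjson` and `read_ndjson` are hypothetical helpers invented for this example, not part of any library.

```python
import json
from typing import Iterable, Iterator, TextIO


def write_ndjson(records: Iterable[dict], out: TextIO) -> None:
    """Write one compact JSON object per line (hypothetical helper)."""
    for record in records:
        # json.dumps escapes newlines inside string values, so every
        # record is guaranteed to occupy exactly one physical line.
        out.write(json.dumps(record, separators=(",", ":")) + "\n")


def read_ndjson(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed object per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)
```

Because each line parses on its own, a consumer can stop reading after the first few records instead of buffering the whole response.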

The Eight Golden Rules of Ugo Enyioha

Based on “Writing CLI Tools That AI Agents Actually Want to Use.”

1. Structured Output Is Not Optional

The single most important thing you can do is support --json or --output json. Agents are good at parsing text, but “good” isn’t “reliable.” A table that wraps differently depending on terminal width, or a status field that sometimes says "Running" and sometimes "running" — these cause silent failures in agent workflows.

Rules

  • JSON to stdout, everything else to stderr. Progress messages, warnings, spinners — all stderr. Stdout is your API contract.
  • Flat over nested. {"pod_name": "web-1", "pod_status": "Running"} is easier for an agent to work with than {"pod": {"metadata": {"name": "web-1"}, "status": {"phase": "Running"}}}.
  • Consistent types. If age is a number in one command, don’t make it a string like "3 days" in another. Use seconds or ISO 8601 timestamps.
  • NDJSON for streaming. If the command produces incremental output, emit one JSON object per line. Agents handle NDJSON well.

Examples

Bad: agent has to parse this
$ myctl list pods
NAME       STATUS   AGE
web-1      Running  3d
worker-2   Failed   1h
Good: agent gets clean data
$ myctl list pods --json
[
  {"name": "web-1", "status": "Running", "age_seconds": 259200},
  {"name": "worker-2", "status": "Failed", "age_seconds": 3600}
]
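
On the implementation side, the stdout/stderr split above might look like the following Python sketch. The `list_pods` function and its hard-coded data are assumptions made for illustration; a real tool would fetch the pods from its backend.

```python
import json
import sys
from typing import TextIO


def list_pods(as_json: bool, out: TextIO = sys.stdout, err: TextIO = sys.stderr) -> None:
    # Hypothetical in-memory data standing in for a real backend call.
    pods = [
        {"name": "web-1", "status": "Running", "age_seconds": 259200},
        {"name": "worker-2", "status": "Failed", "age_seconds": 3600},
    ]
    # Progress and diagnostics go to stderr so stdout stays parseable.
    print("fetching pods...", file=err)
    if as_json:
        # Flat objects with consistent types: age is always a number of seconds.
        json.dump(pods, out)
        out.write("\n")
    else:
        out.write(f"{'NAME':<10} {'STATUS':<8} AGE\n")
        for pod in pods:
            out.write(f"{pod['name']:<10} {pod['status']:<8} {pod['age_seconds']}s\n")
```

Passing the streams as parameters is a design choice for this sketch: it keeps the function easy to test without capturing the process-wide stdout.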

2. Exit Codes Are the Agent’s Control Flow

Agents check $? to decide what to do next. A tool that returns 0 on failure breaks every agent workflow that depends on it.

Rules

  • Use a stable exit-code contract. For example:
      0 = success
      1 = general failure
      2 = usage error (bad arguments)
      3 = resource not found
      4 = permission denied
      5 = conflict (resource already exists)
  • Document your exit codes. An agent that gets exit code 5 can decide to skip creation and move to the next step. An agent that gets exit code 1 for everything has to parse stderr to figure out what happened — and it will sometimes get it wrong.
  • Combine exit codes with structured error output.

Examples

Good: combine exit codes with structured error output
$ myctl create thing --name duplicate-name
# stderr: Error: resource "duplicate-name" already exists
# stdout (with --json): {"error": "conflict", "message": "resource 'duplicate-name' already exists"}
# exit code: 5
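
One way to keep such a contract stable is to define the codes once and route every failure through a single helper. The sketch below assumes the example contract from the rules above; `ExitCode` and `report_error` are hypothetical names.

```python
import json
import sys
from enum import IntEnum
from typing import TextIO


class ExitCode(IntEnum):
    # Mirrors the example contract in this section; adapt to your own tool.
    SUCCESS = 0
    GENERAL_FAILURE = 1
    USAGE_ERROR = 2
    NOT_FOUND = 3
    PERMISSION_DENIED = 4
    CONFLICT = 5


def report_error(code: ExitCode, error: str, message: str, as_json: bool,
                 out: TextIO = sys.stdout, err: TextIO = sys.stderr) -> int:
    """Emit a human-readable error on stderr and, with --json, a structured one on stdout."""
    print(f"Error: {message}", file=err)
    if as_json:
        json.dump({"error": error, "message": message}, out)
        out.write("\n")
    # The caller is expected to pass this value to sys.exit().
    return int(code)
```

A command handler would then end with something like `sys.exit(report_error(ExitCode.CONFLICT, "conflict", "...", args.json))`, so the exit code and the structured error cannot drift apart.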

3. Make Commands Idempotent

Agents retry. Networks fail. Commands get interrupted. If your create command fails on the second run because the resource already exists, the agent has to write special-case retry logic. The kubectl model is a good reference: kubectl apply is idempotent by design. Declarative commands (ensure, apply, sync) are inherently safer for agents than imperative ones (create, delete).

Rules

  • Make conflicts detectable. If you can’t make a command idempotent, return a distinct exit code (like 5 for “already exists”) so the agent can handle it programmatically.

Examples

Fragile: fails on retry
$ myctl create namespace prod
Error: namespace "prod" already exists
Robust: idempotent
$ myctl ensure namespace prod
namespace "prod" already exists (no changes)
Or use a flag
$ myctl create namespace prod --if-not-exists
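
The core of an idempotent "ensure" command is a check-then-create step that reports whether anything changed. The Python sketch below uses an in-memory set as a stand-in for the real resource store; `ensure_namespace` and the `changed` field are assumptions made for this example.

```python
from typing import Set


def ensure_namespace(existing: Set[str], name: str) -> dict:
    """Create the namespace only if it is missing; safe to call repeatedly."""
    if name in existing:
        # Already in the desired state: succeed without side effects.
        return {"namespace": name, "changed": False}
    existing.add(name)
    return {"namespace": name, "changed": True}
```

Running it twice with the same name succeeds both times, and the second result reports `"changed": False`, so a retrying agent needs no special-case logic.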

4. Self-Documenting Beats External Docs

When an agent encounters an unfamiliar CLI, the first thing it does is run --help. That help text is your tool description, your parameter spec, and your usage guide all in one.

Rules

  • Show required vs optional clearly. Agents will not guess which flags are required.
  • Include realistic examples. Agents learn patterns from examples faster than from flag descriptions.
  • Document the --json flag. If the agent doesn’t know it exists, it won’t use it.
  • Use subcommand discovery. myctl --help should list all subcommands. myctl deploy --help should give full detail.

Examples

Bad: minimal help
$ myctl deploy --help
Usage: myctl deploy [flags]
Good: the agent can learn from this
$ myctl deploy --help
Deploy a service to the target environment.

Usage:
  myctl deploy <service-name> --env <environment> [flags]

Arguments:
  service-name          Name of the service to deploy (required)

Flags:
  --env string          Target environment: dev, staging, prod (required)
  --image string        Container image override (default: from config)
  --dry-run             Preview changes without applying
  --wait                Wait for deployment to complete (default: true)
  --timeout duration    Maximum wait time (default: 5m)
  --json                Output result as JSON

Examples:
  myctl deploy web-api --env staging
  myctl deploy web-api --env prod --image myregistry/web:v2.1.0 --json
  myctl deploy web-api --env dev --dry-run
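
Help text along these lines can be produced with Python's standard `argparse` module. The sketch below covers only a subset of the flags shown above; `build_deploy_parser` is a hypothetical function name.

```python
import argparse


def build_deploy_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="myctl deploy",
        description="Deploy a service to the target environment.",
        # Keep the line breaks of the examples in the epilog as written.
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog=(
            "Examples:\n"
            "  myctl deploy web-api --env staging\n"
            "  myctl deploy web-api --env prod --json\n"
            "  myctl deploy web-api --env dev --dry-run\n"
        ),
    )
    parser.add_argument("service_name", metavar="service-name",
                        help="Name of the service to deploy (required)")
    parser.add_argument("--env", required=True, choices=["dev", "staging", "prod"],
                        help="Target environment (required)")
    parser.add_argument("--dry-run", action="store_true",
                        help="Preview changes without applying")
    parser.add_argument("--json", action="store_true",
                        help="Output result as JSON")
    return parser
```

Marking `--env` as `required=True` also means a missing flag is rejected by the parser with a usage error rather than failing later.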

5. Design for Composability

Unix philosophy applies doubly for agents. Agents already think in pipelines — they chain commands naturally.

Rules

  • --quiet / -q for bare, pipeable values. One value per line, no headers, no decoration. Agents use this for piping into xargs or while read.
  • Stdin acceptance is explicit. If a command can read from stdin, document it: myctl apply -f - reads from stdin. Don’t make the agent guess.
  • Batch operations. If an agent needs to delete 50 resources, myctl delete --selector app=old is one call instead of 50.

Examples

⚠️ Acceptable: an agent will naturally compose these
myctl list pods --json | jq '.[] | select(.status == "Failed") | .name'
Better: build filtering in
myctl list pods --status failed --json --field name
Best: support both approaches
myctl list pods --status failed --json   # filtered JSON
myctl list pods --status failed --quiet  # just names, one per line
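
Supporting both approaches can be as simple as one filter step followed by two renderers. This is a sketch under assumptions: `render_pods` is a hypothetical helper, and the case-insensitive status match is a choice made here so that `--status failed` matches a stored value of "Failed".

```python
import json
from typing import List, Optional


def render_pods(pods: List[dict], status: Optional[str] = None, quiet: bool = False) -> str:
    """Filter by status, then render as bare names (--quiet) or JSON (--json)."""
    selected = [
        pod for pod in pods
        if status is None or pod["status"].lower() == status.lower()
    ]
    if quiet:
        # One value per line, no headers: ready for xargs or `while read`.
        return "\n".join(pod["name"] for pod in selected)
    return json.dumps(selected)
```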

6. Provide Dry-Run and Confirmation Bypass

Agents need two things that conflict with interactive CLI design: they need to preview destructive actions, and they need to execute without human confirmation prompts.

Rules

  • --dry-run should produce structured output. Not "would deploy web-api to prod" but a JSON diff of what changes.
  • --yes / --no-confirm / --force bypasses prompts. An agent cannot type "y" at a confirmation prompt. If your CLI hangs waiting for input, the agent’s workflow is dead.
  • Detect non-interactive terminals. If stdin is not a TTY, either skip prompts automatically or fail with a clear error telling the user to pass --yes.

Examples

Preview what would happen
$ myctl deploy web-api --env prod --dry-run --json
{
  "action": "deploy",
  "changes": [
    {"type": "update", "resource": "deployment/web-api", "diff": "image: v1.2.0 -> v2.1.0"}
  ],
  "warnings": ["This will restart 3 running pods"]
}
Execute without interactive prompt
$ myctl deploy web-api --env prod --yes
Or use force
$ myctl deploy web-api --env prod --force
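
The non-interactive detection rule might be implemented roughly as follows in Python. This sketch takes the "fail with a clear error" option rather than skipping prompts silently; `confirm` and `ConfirmationRequired` are hypothetical names.

```python
import sys
from typing import TextIO


class ConfirmationRequired(Exception):
    """Raised when a prompt is needed but nobody can answer it."""


def confirm(prompt: str, assume_yes: bool, stdin: TextIO = sys.stdin) -> bool:
    if assume_yes:
        # --yes was passed: never prompt.
        return True
    if not stdin.isatty():
        # No human on the other end: fail fast instead of hanging.
        raise ConfirmationRequired(
            "stdin is not a TTY; pass --yes to confirm non-interactively"
        )
    # Prompt on stderr so stdout stays reserved for machine-readable output.
    print(f"{prompt} [y/N] ", end="", file=sys.stderr, flush=True)
    return stdin.readline().strip().lower() in ("y", "yes")
```

The caller would catch `ConfirmationRequired`, print its message to stderr, and exit with a usage-error code.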

7. Errors Should Be Actionable

When a command fails, the agent needs to decide: retry, try something else, or give up. The error message determines which.

Rules

  • Parseable error codes/types in structured output. A string like "image_not_found" is parseable. "Error occurred" is not.
  • Include the failing input. If the image name is wrong, echo it back. The agent needs this to construct a fix.
  • Suggest next steps when possible. "suggestion": "run myctl images list to see available tags" gives the agent a concrete recovery path.
  • Separate transient from permanent errors. A timeout is worth retrying. A permission denied is not. If your exit codes or error types distinguish these, the agent can build appropriate retry logic.

Examples

Bad: the agent has no idea what to do
$ myctl deploy web-api --env prod
Error: deployment failed
Good: the agent can reason about this
$ myctl deploy web-api --env prod --json
# exit code: 3 (resource not found, following the example contract in rule 2)
# stderr: Error: image "myregistry/web:v2.1.0" not found in registry
# stdout: {"error": "image_not_found", "image": "myregistry/web:v2.1.0", "registry": "myregistry", "suggestion": "check image tag exists"}
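
To make the retry / try-something-else / give-up decision concrete, the Python sketch below builds a structured error and shows how a caller might act on it. The `retryable` field, the `TRANSIENT_ERRORS` set, and both function names are assumptions introduced for this example; they do not appear in the error format above.

```python
from typing import Any, Optional

# Hypothetical set of error types that are worth retrying.
TRANSIENT_ERRORS = {"timeout", "rate_limited"}


def build_error(error: str, message: str, suggestion: Optional[str] = None,
                **context: Any) -> dict:
    """Build a structured error that echoes the failing input back to the caller."""
    payload = {
        "error": error,
        "message": message,
        "retryable": error in TRANSIENT_ERRORS,
    }
    payload.update(context)  # e.g. image="myregistry/web:v2.1.0"
    if suggestion:
        payload["suggestion"] = suggestion
    return payload


def next_action(payload: dict) -> str:
    """How an agent-side wrapper might branch on the error."""
    if payload.get("retryable"):
        return "retry"
    if "suggestion" in payload:
        return "follow_suggestion"
    return "give_up"
```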

8. Use Consistent Noun-Verb Grammar

When designing a CLI with many subcommands, order matters. Human users might memorize random command names, but agents rely on predictable patterns to discover what a tool can do.

Rules

  • Prefer noun → verb (resource → action). The noun-verb pattern (e.g., docker container ls, gh pr create) is exceptionally agent-friendly because it naturally groups related actions in the --help output.
  • Layer --help like a tree. When an agent runs myctl --help, it sees a list of resources (nouns). When it runs myctl user --help, it sees all possible actions (verbs) for that resource. This hierarchical structure turns exploration into a deterministic tree search, rather than a guessing game.

Examples

Bad: mixed grammar is hard to guess
$ myctl create-user
$ myctl delete_user
$ myctl user-group add
Good: Noun -> Verb hierarchy
$ myctl user create
$ myctl user delete
$ myctl user group add
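
With Python's `argparse`, a noun → verb hierarchy maps naturally onto nested subparsers. The sketch below wires up only the `user create` and `user delete` commands; `build_parser` and the `name` argument are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="myctl")
    # First level: resources (nouns).
    nouns = parser.add_subparsers(dest="resource", required=True, metavar="<resource>")
    user = nouns.add_parser("user", help="Manage users")

    # Second level: actions (verbs) for the chosen resource.
    verbs = user.add_subparsers(dest="action", required=True, metavar="<action>")
    create = verbs.add_parser("create", help="Create a user")
    create.add_argument("name", help="Name of the user to create")
    delete = verbs.add_parser("delete", help="Delete a user")
    delete.add_argument("name", help="Name of the user to delete")
    return parser
```

Each level gets its own --help for free, which gives the tree-shaped discovery described in the rules.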

Context Window Discipline

CLI tools must minimize the amount of data returned to the agent. Large responses increase token usage, reduce reasoning capacity, and slow down workflows.

Rules

  • Return only what is strictly necessary for the call the agent is making.
  • Support field selection (projection) so agents can avoid downloading full objects when they only need a few fields.
  • Avoid full datasets by default; prefer small, bounded responses as the normal path.
  • Require explicit flags (e.g. --page-all, --all) before returning unbounded or complete result sets.
  • Support pagination with flags such as --limit and --cursor (or your stack’s equivalent).
  • Offer NDJSON streaming output when results can be large so the agent can consume rows incrementally.
  • Avoid deeply nested or verbose structures unless the caller explicitly asks for that shape.
  • Do not include unused metadata fields by default; add richness only when requested.
  • Expose cardinality up front — count or total — so agents can estimate size before committing to a heavy fetch.
  • Document how to shrink output in --help and examples: fields, filters, limits, NDJSON streaming, and “fetch all” flags.

Examples

Field selection
gws drive files list --params '{"fields": "files(id,name,mimeType)"}'
Pagination
hiro tasks list --limit 20
hiro tasks list --cursor abc123
Count / cardinality check
hiro tasks count --status open
{ "count": 523 }
Bounded list with total
hiro tasks list --limit 20
{
  "tasks": [...],
  "total": 523,
  "next_cursor": "abc123"
}
Fetch all (explicit)
hiro tasks list --page-all
Streaming output (NDJSON)
hiro tasks list --page-all
# Default machine format is line-oriented JSON (e.g. hirotm: global --format ndjson).
{"id": "t1", "title": "Fix bug"}
{"id": "t2", "title": "Add feature"}
Filter + projection (preferred pattern)
hiro tasks list --status open --fields id,title --limit 10
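
Bounded pages, a total count, and field projection can be combined in one small function. In this Python sketch the cursor is simply a stringified offset, which is an assumption made for brevity (real tools often use opaque cursors); `list_page` is a hypothetical name.

```python
from typing import List, Optional, Sequence


def list_page(items: List[dict], limit: int = 20, cursor: Optional[str] = None,
              fields: Optional[Sequence[str]] = None) -> dict:
    """Return one bounded page plus the total and a cursor for the next page."""
    start = int(cursor) if cursor else 0
    page = items[start:start + limit]
    if fields:
        # Projection: keep only the requested keys.
        page = [{key: item[key] for key in fields if key in item} for item in page]
    end = start + len(page)
    result = {"tasks": page, "total": len(items)}
    if end < len(items):
        # Only advertise a cursor when more data actually exists.
        result["next_cursor"] = str(end)
    return result
```

Returning `total` with every page lets the agent judge the size of the full result set before deciding whether to keep paging.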
Key principle — Never return more data than the agent needs.

TBD sections

AGENTS.md and Skills.md

Audience and operating modes

Versioning policy

CLI documentation

Authentication and authorization

Long running operations

Testing and validation strategy

Resources