Why CLI over MCP
While it is perfectly fine to keep both MCP and CLI, CLI has the following advantages:
- Token control: with a CLI, agents can decide ad hoc to use extra tools to shape, search, or filter the results the CLI returns. Using grep, jq, head, tail, and similar tools is second nature for agents. With MCP, you have to manage the response yourself to keep the context window from bloating.
- Reliability: MCP Servers often crash, disconnect and have latency overheads. CLI is always available, stateless and fast.
- Steerability: You get more direct control over CLI with AGENTS.md and Skills.md. MCP has limited steerability.
The Eight Golden Rules of Ugo Enyioha
Based on Writing CLI Tools That AI Agents Actually Want to Use.
1. Structured Output Is Not Optional
The single most important thing you can do is support --json or --output json.
Agents are good at parsing text, but “good” isn’t “reliable.” A table that wraps differently depending on terminal width, or a status field that sometimes says "Running" and sometimes "running" — these cause silent failures in agent workflows.
Rules
- JSON to stdout, everything else to stderr. Progress messages, warnings, spinners — all stderr. Stdout is your API contract.
- Flat over nested. {"pod_name": "web-1", "pod_status": "Running"} is easier for an agent to work with than {"pod": {"metadata": {"name": "web-1"}, "status": {"phase": "Running"}}}.
- Consistent types. If age is a number in one command, don’t make it a string like "3 days" in another. Use seconds or ISO 8601 timestamps.
- NDJSON for streaming. If the command produces incremental output, emit one JSON object per line. Agents handle NDJSON well.
Examples
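As an illustration of the rules above (a minimal sketch, not the original article's example), the following Python shows flat NDJSON records going to stdout while progress text goes to stderr. The function names flatten_pod and emit_pods are hypothetical; the record shape is taken from the pod example in the rules.

```python
import json
import sys


def flatten_pod(pod: dict) -> dict:
    """Flatten the nested pod record into the flat shape from the rules above."""
    return {
        "pod_name": pod["pod"]["metadata"]["name"],
        "pod_status": pod["pod"]["status"]["phase"],
    }


def emit_pods(pods, out=sys.stdout, err=sys.stderr) -> None:
    """Write one flat JSON object per line (NDJSON) to out; diagnostics go to err."""
    # Progress messages never touch stdout, so stdout stays machine-parseable.
    print(f"listing {len(pods)} pods", file=err)
    for pod in pods:
        print(json.dumps(flatten_pod(pod), sort_keys=True), file=out)
```

Because the streams are passed in as parameters, the same function can be exercised with in-memory buffers, which keeps the stdout contract easy to verify.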
❌ Bad: agent has to parse this
2. Exit Codes Are the Agent’s Control Flow
Agents check $? to decide what to do next. A tool that returns 0 on failure breaks every agent workflow that depends on it.
Rules
- Use a stable exit-code contract. For example:
- Document your exit codes. An agent that gets exit code 5 can decide to skip creation and move to the next step. An agent that gets exit code 1 for everything has to parse stderr to figure out what happened, and it will sometimes get it wrong.
- Combine exit codes with structured error output.
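One way to pin such a contract down is an enumeration shared by every subcommand. The sketch below is an assumption-heavy illustration: only code 5 for "already exists" and code 1 as the generic failure come from the text above; the other values, the ExitCode name, and the fail helper are hypothetical.

```python
import json
from enum import IntEnum


class ExitCode(IntEnum):
    # Hypothetical contract. Only 1 (generic) and 5 (already exists)
    # are taken from the guide; the remaining values are illustrative.
    OK = 0
    GENERAL_ERROR = 1
    USAGE_ERROR = 2
    NOT_FOUND = 3
    PERMISSION_DENIED = 4
    ALREADY_EXISTS = 5


def fail(code: ExitCode, message: str) -> tuple[int, str]:
    """Return the exit status and a structured error line meant for stderr."""
    payload = {"error": code.name.lower(), "exit_code": int(code), "message": message}
    return int(code), json.dumps(payload)
```

A command's entry point would print the returned line to stderr and pass the status to sys.exit, so the exit code and the structured error always agree.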
Examples
✅ Good: combine exit codes with structured error output
3. Make Commands Idempotent
Agents retry. Networks fail. Commands get interrupted. If your create command fails on the second run because the resource already exists, the agent has to write special-case retry logic.
The kubectl model is a good reference: kubectl apply is idempotent by design. Declarative commands (ensure, apply, sync) are inherently safer for agents than imperative ones (create, delete).
Rules
- Make conflicts detectable. If you can’t make a command idempotent, return a distinct exit code (like 5 for “already exists”) so the agent can handle it programmatically.
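The difference between the two styles can be sketched with an in-memory dictionary standing in for the real backend. The function names and the store are hypothetical; the value 5 follows the "already exists" code mentioned above.

```python
ALREADY_EXISTS = 5  # exit code the guide suggests for "already exists"


def ensure_resource(store: dict, name: str, spec: dict) -> str:
    """Declarative 'ensure': converge store[name] to spec; safe to run repeatedly."""
    current = store.get(name)
    if current == spec:
        return "unchanged"
    store[name] = dict(spec)
    return "created" if current is None else "updated"


def create_resource(store: dict, name: str, spec: dict) -> int:
    """Imperative 'create': not idempotent, but the conflict is detectable."""
    if name in store:
        return ALREADY_EXISTS
    store[name] = dict(spec)
    return 0
```

Running ensure_resource twice with the same spec reports "unchanged" the second time, so a retrying agent needs no special-case logic. The create variant at least gives the agent a distinct code to branch on.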
Examples
❌ Fragile: fails on retry
4. Self-Documenting Beats External Docs
When an agent encounters an unfamiliar CLI, the first thing it does is run --help. That help text is your tool description, your parameter spec, and your usage guide all in one.
Rules
- Show required vs optional clearly. Agents will not guess which flags are required.
- Include realistic examples. Agents learn patterns from examples faster than from flag descriptions.
- Document the --json flag. If the agent doesn’t know it exists, it won’t use it.
- Use subcommand discovery. myctl --help should list all subcommands. myctl deploy --help should give full detail.
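With Python's argparse, most of these rules come down to how the parser is declared. The following is a sketch under assumptions: myctl and its deploy subcommand are the guide's placeholder names, while the --image and --env flags and the example image tag are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose --help output doubles as the tool's documentation."""
    parser = argparse.ArgumentParser(
        prog="myctl", description="Manage deployments (hypothetical tool)."
    )
    # Top-level --help lists every subcommand with a one-line summary.
    sub = parser.add_subparsers(dest="command", metavar="COMMAND")
    deploy = sub.add_parser(
        "deploy",
        help="deploy an image to an environment",
        description="Deploy an image to an environment.",
        # A realistic example, kept verbatim by RawDescriptionHelpFormatter.
        epilog="example:\n  myctl deploy --image web-api:1.4.2 --env staging --json",
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    deploy.add_argument("--image", required=True, help="image reference (required)")
    deploy.add_argument("--env", default="staging", help="target environment (default: staging)")
    deploy.add_argument("--json", action="store_true", help="emit machine-readable JSON on stdout")
    return parser
```

Marking --image as required makes argparse reject an incomplete call with a usage error instead of leaving the agent to guess, and the help string for --json makes the structured mode discoverable.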
Examples
❌ Bad: minimal help
5. Design for Composability
Unix philosophy applies doubly for agents. Agents already think in pipelines: they chain commands naturally.
Rules
- --quiet / -q for bare, pipeable values. One value per line, no headers, no decoration. Agents use this for piping into xargs or while read.
- Stdin acceptance is explicit. If a command can read from stdin, document it: myctl apply -f - reads from stdin. Don’t make the agent guess.
- Batch operations. If an agent needs to delete 50 resources, myctl delete --selector app=old is one call instead of 50.
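The three rules above map onto three small helpers. This is a sketch only: the function names, the record fields, and the key=value selector parsing are assumptions made for illustration, not the behavior of any real tool.

```python
import sys


def print_resources(resources, quiet=False, out=sys.stdout) -> None:
    """With quiet=True, print bare names only: one per line, no header."""
    if quiet:
        for res in resources:
            print(res["name"], file=out)
        return
    print(f"{'NAME':<12}STATUS", file=out)
    for res in resources:
        print(f"{res['name']:<12}{res['status']}", file=out)


def read_manifest(path, stdin=sys.stdin) -> str:
    """Treat '-' as stdin, mirroring the myctl apply -f - convention."""
    if path == "-":
        return stdin.read()
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def select(resources, selector):
    """Batch selection for a key=value selector such as app=old."""
    key, _, value = selector.partition("=")
    return [res for res in resources if res.get("labels", {}).get(key) == value]
```

The quiet branch is what makes the output safe to feed into xargs, and select lets one invocation act on every matching resource instead of one call per item.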
Examples
⚠️ Acceptable: an agent will naturally compose these
6. Provide Dry-Run and Confirmation Bypass
Agents need two things that conflict with interactive CLI design: they need to preview destructive actions, and they need to execute without human confirmation prompts.
Rules
- --dry-run should produce structured output. Not "would deploy web-api to prod" but a JSON diff of what changes.
- --yes / --no-confirm / --force bypasses prompts. An agent cannot type "y" at a confirmation prompt. If your CLI hangs waiting for input, the agent’s workflow is dead.
- Detect non-interactive terminals. If stdin is not a TTY, either skip prompts automatically or fail with a clear error telling the user to pass --yes.
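Both halves of this section can be sketched in a few lines of Python. The names plan_changes and may_proceed are hypothetical, and the add/change/remove diff shape is one plausible format rather than a standard. The sketch takes the "fail with a clear error" option for non-interactive stdin.

```python
import sys


def plan_changes(current: dict, desired: dict) -> dict:
    """Structured dry-run output: what would be added, changed, or removed."""
    shared = current.keys() & desired.keys()
    return {
        "add": {k: desired[k] for k in desired.keys() - current.keys()},
        "change": {
            k: {"from": current[k], "to": desired[k]}
            for k in shared
            if current[k] != desired[k]
        },
        "remove": sorted(current.keys() - desired.keys()),
    }


def may_proceed(assume_yes: bool, stdin=sys.stdin) -> bool:
    """Decide whether a destructive action may run without hanging on a prompt."""
    if assume_yes:  # the caller passed --yes
        return True
    if not stdin.isatty():
        # Never block on a prompt nobody can answer; fail loudly instead.
        raise SystemExit("refusing to prompt on non-interactive stdin; pass --yes")
    print("Proceed? [y/N] ", end="", file=sys.stderr, flush=True)
    return stdin.readline().strip().lower() == "y"
```

Serializing the result of plan_changes with json.dumps gives the agent a diff it can inspect before re-running the command with --yes.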
Examples
✅ Preview what would happen
7. Errors Should Be Actionable
When a command fails, the agent needs to decide: retry, try something else, or give up. The error message determines which.
Rules
- Parseable error codes/types in structured output. A string like "image_not_found" is parseable. "Error occurred" is not.
- Include the failing input. If the image name is wrong, echo it back. The agent needs this to construct a fix.
- Suggest next steps when possible. "suggestion": "run myctl images list to see available tags" gives the agent a concrete recovery path.
- Separate transient from permanent errors. A timeout is worth retrying. A permission denied is not. If your exit codes or error types distinguish these, the agent can build appropriate retry logic.
Examples
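As an illustration of the rules above (a sketch, not the original article's example), an error payload can carry the type, the failing input, a retry hint, and a suggestion in one JSON object. The error_payload helper, the "input" and "retryable" keys, and the TRANSIENT set are assumptions; "image_not_found" and the suggestion text come from the rules.

```python
import json

# Hypothetical classification of error types that are worth retrying.
TRANSIENT = {"timeout", "rate_limited"}


def error_payload(error_type: str, failing_input: str, suggestion: str = "") -> str:
    """Build a structured error line: type, echoed input, retry hint, next step."""
    payload = {
        "error": error_type,
        "input": failing_input,  # echo the failing value back to the caller
        "retryable": error_type in TRANSIENT,
    }
    if suggestion:
        payload["suggestion"] = suggestion
    return json.dumps(payload)
```

An agent reading this object can branch on "retryable" without pattern-matching prose, and can follow the suggestion when the failure is permanent.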
❌ Bad: the agent has no idea what to do
8. Use Consistent Noun-Verb Grammar
When designing a CLI with many subcommands, order matters. Human users might memorize random command names, but agents rely on predictable patterns to discover what a tool can do.
Rules
- Prefer noun → verb (resource → action). The noun-verb pattern (e.g., docker container ls, gh pr create) is exceptionally agent-friendly because it naturally groups related actions in the --help output.
- Layer --help like a tree. When an agent runs myctl --help, it sees a list of resources (nouns). When it runs myctl user --help, it sees all possible actions (verbs) for that resource. This hierarchical structure turns exploration into a deterministic tree search rather than a guessing game.
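Nested argparse subparsers give this tree shape almost for free. In the sketch below, the COMMANDS table (including the project resource and the specific verbs) is hypothetical; myctl user is the guide's own placeholder.

```python
import argparse

# Hypothetical resource -> actions table; a real tool would define its own.
COMMANDS = {"user": ["create", "list", "delete"], "project": ["create", "list"]}


def build_parser() -> argparse.ArgumentParser:
    """Build a two-level noun -> verb command tree."""
    parser = argparse.ArgumentParser(prog="myctl")
    nouns = parser.add_subparsers(dest="resource", metavar="RESOURCE", required=True)
    for noun, verbs in COMMANDS.items():
        # Each noun gets its own parser, so "myctl user --help" lists only user actions.
        noun_parser = nouns.add_parser(noun, help=f"manage {noun} resources")
        actions = noun_parser.add_subparsers(dest="action", metavar="ACTION", required=True)
        for verb in verbs:
            actions.add_parser(verb, help=f"{verb} {noun}")
    return parser
```

Because every resource exposes its verbs under the same second level, an agent that has learned myctl user list can reasonably expect myctl project list to exist and can confirm it with one --help call.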
Examples
❌ Bad: mixed grammar is hard to guess
Context Window Discipline
CLI tools must minimize the amount of data returned to the agent. Large responses increase token usage, reduce reasoning capacity, and slow down workflows.
Rules
- Return only what is strictly necessary for the call the agent is making.
- Support field selection (projection) so agents can avoid downloading full objects when they only need a few fields.
- Avoid full datasets by default; prefer small, bounded responses as the normal path.
- Require explicit flags (e.g. --page-all, --all) before returning unbounded or complete result sets.
- Support pagination with flags such as --limit and --cursor (or your stack’s equivalent).
- Offer NDJSON streaming output when results can be large so the agent can consume rows incrementally.
- Avoid deeply nested or verbose structures unless the caller explicitly asks for that shape.
- Do not include unused metadata fields by default; add richness only when requested.
- Expose cardinality up front — count or total — so agents can estimate size before committing to a heavy fetch.
- Document how to shrink output in --help and examples: fields, filters, limits, NDJSON streaming, and “fetch all” flags.
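Several of these rules (projection, bounded pages, cardinality up front) can be combined in a single response builder. This is a sketch under assumptions: the page function, the response keys, and the use of an integer offset as the cursor are all hypothetical choices, and the default limit of 20 is arbitrary.

```python
def page(rows: list, fields=None, limit: int = 20, cursor: int = 0) -> dict:
    """Return a bounded, projected slice plus the total and the next cursor."""
    window = rows[cursor : cursor + limit]
    if fields:
        # Field selection: keep only the keys the caller asked for.
        window = [{f: row[f] for f in fields if f in row} for row in window]
    end = cursor + len(window)
    return {
        "total": len(rows),  # cardinality up front
        "items": window,
        "next_cursor": end if end < len(rows) else None,
    }
```

A caller that only needs names would request fields=["name"] with a small limit, read "total" to judge whether a full fetch is worthwhile, and follow "next_cursor" until it is None.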
Examples
✅ Field selection
TBD sections
AGENTS.md and Skills.md
Audience and operating modes
Versioning policy
CLI documentation
Authentication and authorization
Long running operations
Testing and validation strategy
Resources
- The Eight Golden Rules of Ugo Enyioha — dev.to
- Engineering Agent-Friendly CLIs (Speakeasy)
- Rewriting your CLI for AI Agents
- Command Line Interface Guidelines
