Indent Is All You Need
Posted | stdin
There's an interesting debate around whether "Bash is all you need" for AI agents. Claude Code's Thariq Shihipar argues that LLMs may never be particularly good at Bash. The reasoning is both practical and theoretical: complex Bash commands often break on nested quotes, parentheses, and escapes. Even GPT-5.4 struggles with deeply nested inline Bash calls, and some engineers have resorted to wrapping binaries into microcommands so the model only outputs the inner command, achieving near-perfect reliability.
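The quoting fragility is easy to demonstrate without invoking a model at all. The sketch below (plain Python, using the standard-library `shlex.quote`) shows how escape sequences multiply each time a command is wrapped in another shell layer; the `ssh host` wrapper is a hypothetical example of such a layer:

```python
import shlex

# A command containing both quote types -- already tricky to escape by hand.
inner = "echo \"it's done\""

# Each wrapping layer (bash -c, ssh, docker exec, ...) must re-quote the
# entire string, so the escape sequences multiply at every level.
wrapped_once = "bash -c " + shlex.quote(inner)
wrapped_twice = "ssh host " + shlex.quote(wrapped_once)

print(wrapped_once)
print(wrapped_twice)
```

Every added layer roughly triples the number of quote characters, and a model emitting the string token by token must keep all of them balanced at once.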
The theory behind this is rooted in formal language classes. Bash's quoting and parenthesis matching form a Dyck-k language problem, a type of task that requires maintaining a stack of arbitrary depth. Standard Transformers sit in the TC0 complexity class, which makes deep nesting and parity tracking inherently challenging. Python, by contrast, is almost Transformer-friendly by design: each line's indentation implicitly encodes block depth. This "outsources" state tracking to the syntax itself, effectively converting a potentially hard nesting problem into something the model can handle token by token. That may explain why LLMs have excelled at Python generation since their earliest versions, despite struggling with even basic arithmetic.
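The "indentation encodes depth" claim can be made concrete: in Python, a line's block depth is readable from its own leading whitespace, with no need to track a stack over the whole prefix. A minimal sketch (assuming conventional 4-space indents):

```python
# Each line's block depth can be read off locally from its leading
# whitespace -- no stack over the preceding lines is required.
code = """\
def absolute(x):
    if x < 0:
        return -x
    return x
"""

depths = []
for line in code.splitlines():
    indent = len(line) - len(line.lstrip(" "))
    depths.append(indent // 4)  # assuming 4-space indentation
    print(depths[-1], line)
```

Contrast this with a brace language, where the depth of a line can only be known by having counted every `{` and `}` that came before it.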

Practically, this explains the patterns people see: nested Bash commands are error-prone, while properly indented Python functions work reliably. YAML, Markdown, and other indentation-heavy formats behave similarly. Many users report that math formulas in Markdown and JSON/XML output often break because of brace/bracket mismatches and escapes. Bash mistakes, on the other hand, can be catastrophic, especially in agent frameworks that let the AI invoke commands directly.
If we accept that LLMs are "state-tracking challenged," our choice of formats must evolve toward "line-local" state:
- JSON/XML: High-risk. Every `{` is a debt that must be paid with a `}` 50 tokens later.
- TOML: Superior for AI because it is flat. A section header `[header.subheader]` anchors the state for the following lines, requiring zero long-distance nesting memory.
- Markdown/LaTeX: This explains why even the best models still hallucinate unrenderable LaTeX. The moment a formula requires deeply nested curly braces, the Dyck-k problem strikes and the model "forgets" to close a bracket.
To verify this, one could run a simple "Indentation Test" experiment: ask a SOTA model to generate C++ code in two scenarios and compare accuracy:
- Standard C++ with mandatory indentation and newlines.
- Minified C++ on a single line where indentation is forbidden.
The divergence in error rates as nesting depth increases would likely show that, for AI, the "indentation" is the logic.
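Scoring such an experiment only requires the Dyck-style bookkeeping itself. A minimal checker (the function name is my own) that measures how deep the brace nesting goes and flags unbalanced output, which is exactly the state a model must track in its head when indentation is forbidden:

```python
def max_brace_depth(code: str) -> int:
    """Return the deepest {...} nesting, or -1 if braces are unbalanced.

    This is the Dyck-style stack the model must simulate token by token
    when the code is minified onto a single line.
    """
    depth = deepest = 0
    for ch in code:
        if ch == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return -1  # a closer with no matching opener
    return deepest if depth == 0 else -1

print(max_brace_depth("int main(){if(x){while(y){}}}"))  # → 3
print(max_brace_depth("int main(){if(x){}"))             # → -1
```

Bucketing generated samples by this depth would make the predicted divergence directly visible.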
Ultimately, while Bash is a powerful glue, it is a treacherous foundation for autonomous agents. If we want reliable agents, we should favor languages and formats that offload state into the context.
Indent is all you need.
Translated from Zhihu 胡一鸣 & edited by ChatGPT