A developer created "Caveman," a Claude Code skill that makes the model
respond in terse, caveman-like language, with a claimed reduction in
output tokens of roughly 75%. The author clarifies this is mostly a joke and
targets visible output (removing preambles and filler text), not the
hidden "thinking" tokens that improve performance. They acknowledge
the ~75% claim needs proper benchmarking and note that the skill
doesn't affect code quality itself.

The Hacker News community is highly skeptical, with many arguing that
tokens are "units of thinking" for LLMs, so reducing them could make the
model dumber by limiting its reasoning capacity. Critics point out
that chain-of-thought reasoning requires verbose output, and forcing
concise responses may degrade performance. However, some find the
concept useful for cutting through verbose AI responses, comparing it
to telegram-style communication or noting that similar concise
prompting can work without harming quality for simple tasks. The
debate centers on whether the skill saves enough cost to matter and
whether any such saving comes at the expense of accuracy.
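
The cost side of that debate comes down to simple arithmetic, which the sketch below illustrates. It is not taken from the skill or the discussion: the function name, the token counts, and the relative prices are all hypothetical, and it assumes hidden thinking tokens are billed at the same rate as visible output. It shows only that a 75% cut in visible output turns into a much smaller cut in the total bill when input and thinking tokens dominate.

```python
def overall_savings(input_tokens, thinking_tokens, visible_tokens,
                    input_price, output_price, visible_reduction=0.75):
    """Fraction of total cost saved when only the visible output shrinks.

    Assumption (hypothetical billing model): thinking tokens and visible
    tokens are both charged at output_price; input tokens at input_price.
    """
    input_cost = input_tokens * input_price
    before = input_cost + (thinking_tokens + visible_tokens) * output_price
    # Only the visible part of the output is shortened; thinking is untouched.
    reduced_visible = visible_tokens * (1 - visible_reduction)
    after = input_cost + (thinking_tokens + reduced_visible) * output_price
    return (before - after) / before


# Made-up request: large context, moderate thinking, short visible answer.
# Prices are relative units (output priced at 5x input), not real rates.
saving = overall_savings(
    input_tokens=20_000,
    thinking_tokens=2_000,
    visible_tokens=500,
    input_price=1.0,
    output_price=5.0,
)
print(f"Overall cost reduction: {saving:.1%}")
```

With these made-up numbers the overall reduction is about 6%, far below the headline 75%. The figure climbs back toward 75% only when visible output makes up most of the bill, which is why proper benchmarking on real workloads matters.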