caveman
- github: JuliusBrussee/caveman
- HN: Caveman: Why use many token when few token do trick | Hacker News
- YouTube: No way this actually works - YouTube - ThePrimeTime
Why use many token when few token do trick — Claude Code skill that cuts 65% of tokens by talking like caveman.
I tried it on <2026-04-11 Sat> on a Mac mini (2024) and on a MacBook Pro M1.
Install:
claude plugin marketplace add JuliusBrussee/caveman && claude plugin install caveman@caveman
Uninstall:
claude plugin uninstall caveman@caveman && claude plugin marketplace remove caveman
Uninstalled after reading HN comments. Installed again recently.
Caveman in gptel
Add caveman as a gptel directive:
(setq gptel-directives
      (cons '(caveman . "CAVEMAN MODE: Drop articles/filler/pleasantries/hedging. Fragments OK. Short synonyms. Pattern: [thing] [action] [reason]. [next step]. Keep full technical accuracy. Code unchanged.")
            gptel-directives))
Select with C-c C-d in gptel buffer.
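Evaluating the cons form more than once (for example, each time the init file is reloaded) prepends another caveman entry. A minimal sketch of an idempotent alternative, assuming gptel is installed and gptel-directives is the usual alist of (name . prompt) pairs; the prompt text is unchanged:

```elisp
;; Sketch, not the plugin author's method: register the directive once
;; gptel has loaded. `setf' on `alist-get' replaces an existing `caveman'
;; entry or adds one if it is missing, so re-evaluating this form does
;; not create duplicates.
(with-eval-after-load 'gptel
  (setf (alist-get 'caveman gptel-directives)
        "CAVEMAN MODE: Drop articles/filler/pleasantries/hedging. Fragments OK. Short synonyms. Pattern: [thing] [action] [reason]. [next step]. Keep full technical accuracy. Code unchanged."))
```

Wrapping the form in with-eval-after-load also avoids a void-variable error when the init file runs before gptel is loaded.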
HN comments summary
A developer created "Caveman," a Claude Code skill that forces the AI to respond in simplified, caveman-like language to reduce token usage by approximately 75%. The author clarifies this is mostly a joke and targets visible output (removing preambles and filler text), not the hidden "thinking" tokens that improve performance. They acknowledge the ~75% claim needs proper benchmarking and note that the skill doesn't affect code quality itself.
The Hacker News community is highly skeptical. Many commenters argue that tokens are "units of thinking" for LLMs, so reducing them could make the model dumber by limiting its reasoning capacity. Critics point out that chain-of-thought reasoning requires verbose output, and that forcing concise responses may degrade performance. However, some find the concept useful for cutting through verbose AI responses, comparing it to telegram-style communication or noting that similar concise prompting can work without harming quality for simple tasks. The debate centers on whether the token savings are meaningful enough to justify a possible loss of accuracy.