Free download · MIT license

Five Claude Code skills that automatically cut your LLM bill

Drop them into ~/.claude/skills/, restart Claude Code, and your assistant suddenly knows how to audit your codebase, optimize prompts for caching, generate evals, pick cheaper models, and pull a real spend report from OpenAI / Anthropic. Stdlib Python. MIT. No signup.

install.sh
$curl -fsSL https://tokenwisehq.com/skills/install.sh | bash
Or download as .zip if you’d rather inspect first ↓
stdlib Python · MIT · no signup required
Skill · cost-auditor

LLM cost auditor

Scan your TypeScript / Python codebase, find every hard-coded model call, classify it (classification / extraction / chat / RAG / reasoning), and produce a markdown report mapping each to a cheaper-but-equivalent model with dollar-saving math. Stdlib Python — no API key required.

Skill · prompt-cache

Prompt cache optimizer

Paste a system prompt; the skill splits it into stable prefix + volatile suffix, reorders for max prompt-cache hit rate (90% off on Anthropic, 50% off on OpenAI), and shows you the steady-state savings with before/after diff.

Skill · eval-bootstrap

Eval bootstrap

Hand it 3 representative (prompt, response) pairs; it generates an LLM-as-judge rubric (3-5 criteria), runs the first eval pass, and writes per-pair scores + per-criterion pass rate. Works with OpenAI or Anthropic as judge.

bootstrap_evals.pyDownload .zip →
Skill · model-picker

Model picker

Describe a task in one sentence; get the cheapest model that handles it across all major providers (OpenAI, Anthropic, Google, Groq, DeepSeek, xAI, Mistral) with per-1k-token cost math. Offline by default, can calibrate with an LLM call.

Skill · live-audit

Tokenwise live-audit

The power tool. Paste a read-only admin API key (never persisted, never logged) and pull your last 30 days of OpenAI / Anthropic spend straight from the provider's reporting API. Produces a markdown report with cheaper-model swaps + prompt-cache enablement candidates ranked by monthly $ saved.

audit_live.pyDownload .zip →

What “Claude Code skill” means

A Claude Code skill is a folder under ~/.claude/skills/<name>/ containing a SKILL.md file with frontmatter (name, description, tags) plus instructions, and any helper scripts the skill needs. Claude loads it dynamically and runs it when the conversation matches the trigger phrases in the description.

The five in this bundle are self-contained — each has its own README, a working Python script (stdlib only — nopip install), and the SKILL.md tells Claude when to fire and how to walk you through the output.

Install

Fastest path — paste this into your terminal and the bundle drops into ~/.claude/skills/:

curl -fsSL https://tokenwisehq.com/skills/install.sh | bash

Prefer to inspect first? Grab the zip and unpack by hand:

unzip tokenwise-claude-skills.zip
cp -r llm-cost-auditor       ~/.claude/skills/
cp -r prompt-cache-optimizer ~/.claude/skills/
cp -r eval-bootstrap         ~/.claude/skills/
cp -r model-picker           ~/.claude/skills/
cp -r tokenwise-audit        ~/.claude/skills/

Then just talk to Claude — ask “audit my LLM costs” or “what’s the cheapest model for X?” and the right skill fires automatically.

What gets sent to where

Three of the five skills (cost-auditor, prompt-cache-optimizer, model-picker) run fully offline — no network calls. The pricing data is baked in.

eval-bootstrap needs an OpenAI or Anthropic API key (as judge) — you supply it via environment variable.

tokenwise-auditprompts you for an admin API key, calls each provider’s usage API, and writes a local markdown report. Keys are never written to disk, never logged, never included in the output. No prompt content is ever sent anywhere — only aggregate usage stats from the provider.

Want this continuously?

These five skills run once, in your terminal, when you remember. Tokenwise runs them on every request through a Cloudflare-edge proxy — cheaper-model A/B tests with quality checks, automatic prompt caching, weekly insights email with the dollar amount you saved. One line in your SDK.

Plug Tokenwise in →