Tokenwise Public API
OpenAPI 3.1 spec for read-only access to your workspace data. Use any OpenAPI client (Postman, Insomnia, openapi-typescript) to generate types or test calls.
Get the spec
The canonical JSON spec is at https://tokenwisehq.com/openapi.json. It’s served with CORS open and a 1-hour CDN cache. Drop it into any tool that speaks OpenAPI 3.1.
Quick start with openapi-typescript:
npx openapi-typescript https://tokenwisehq.com/openapi.json -o ./tokenwise.d.ts
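If you would rather inspect the spec from a script than generate types, the sketch below walks the standard OpenAPI `paths` object and lists each operation. The `SAMPLE_SPEC` dictionary is a hand-written stand-in built from the endpoint table on this page, not the real spec; `list_operations` is a hypothetical helper name.

```python
# Standard HTTP method keys allowed inside an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

def list_operations(spec):
    """Return (METHOD, path, summary) tuples for every operation in a spec dict."""
    ops = []
    for path, item in sorted(spec.get("paths", {}).items()):
        for method, op in item.items():
            # Path items can also hold non-operation keys such as "parameters".
            if method in HTTP_METHODS:
                ops.append((method.upper(), path, op.get("summary", "")))
    return ops

# Stand-in spec for illustration only; the real one lives at
# https://tokenwisehq.com/openapi.json and can be loaded with json.load().
SAMPLE_SPEC = {
    "openapi": "3.1.0",
    "paths": {
        "/api/v1/public/requests": {"get": {"summary": "List logged LLM requests"}},
        "/api/v1/public/metrics": {"get": {"summary": "Aggregate cost / latency / error metrics"}},
    },
}

for method, path, summary in list_operations(SAMPLE_SPEC):
    print(method, path, "-", summary)
```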
Authentication
Every request needs an Authorization: Bearer tw_api_* header. Mint keys in Settings → API Keys → Public REST. We hash your API key on receipt and only store the hash; the secret stays in your possession.
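As a rough illustration of the header format, this sketch builds an authenticated GET request with the Python standard library without sending it. `build_request` and the placeholder key `tw_api_example` are hypothetical; the prefix check is a local convenience based on the `tw_api_*` pattern above, not something the API is known to require client-side.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://tokenwisehq.com"

def build_request(path, api_key, params=None):
    """Build (but do not send) a GET request carrying the bearer key."""
    # Local sanity check only; assumes keys always start with tw_api_.
    if not api_key.startswith("tw_api_"):
        raise ValueError("expected a key with the tw_api_ prefix")
    url = BASE_URL + path
    if params:
        # Query parameter names come from the JSON spec; none are assumed here.
        url += "?" + urllib.parse.urlencode(params)
    req = urllib.request.Request(url, method="GET")
    req.add_header("Authorization", f"Bearer {api_key}")
    return req

req = build_request("/api/v1/public/requests", "tw_api_example")
# With a real key in place, send it via urllib.request.urlopen(req).
```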
Rate limits
Limits are per key, counted hourly, and tiered by plan. Every response carries X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset. 429 responses also include Retry-After (in seconds).
| Plan | Requests/hour |
|---|---|
| Trial / Indie | 1,000 |
| Pro | 10,000 |
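A client can use these headers to back off politely. The sketch below shows one way to read them from a plain header mapping; `seconds_to_wait` and `remaining_budget` are hypothetical helpers, and the 1-second fallback when Retry-After is absent is an assumption. X-RateLimit-Reset is left uninterpreted because this page does not state its format.

```python
def _lower_keys(headers):
    # HTTP header names are case-insensitive, so normalise before lookup.
    return {k.lower(): v for k, v in headers.items()}

def seconds_to_wait(status, headers):
    """Return how long to pause before retrying, in seconds."""
    h = _lower_keys(headers)
    if status == 429:
        # Retry-After is documented as seconds; the fallback of 1 is an assumption.
        return float(h.get("retry-after", 1))
    return 0.0

def remaining_budget(headers):
    """Return (remaining, limit) for the current hourly window."""
    h = _lower_keys(headers)
    return int(h["x-ratelimit-remaining"]), int(h["x-ratelimit-limit"])

print(seconds_to_wait(429, {"Retry-After": "30"}))
print(remaining_budget({"X-RateLimit-Limit": "1000", "X-RateLimit-Remaining": "997"}))
```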
Endpoints
| Method | Path | Summary |
|---|---|---|
| GET | /api/v1/public/requests | List logged LLM requests |
| GET | /api/v1/public/metrics | Aggregate cost / latency / error metrics |
| GET | /api/v1/public/evals | List recent eval-run scores |
Full request/response schemas and parameters live in the JSON spec; we keep this page short on purpose. If you want a richer browser, paste the spec URL into editor.swagger.io.
Proxy rule types
The proxy-rule write endpoints (POST /api/proxy-rules and PATCH /api/proxy-rules/{id}) take a rule type and a config JSON blob whose shape is discriminated by type. The available types:
| Type | config shape | What it does |
|---|---|---|
| model_override | { fromModel, toModel } | Swap one model for another at the edge. |
| max_tokens_cap | { maxTokens } | Clamp max_tokens on outbound requests. |
| context_trim | { maxMessages } | Trim the message history to the last N turns. |
| cache_enable | { ttlSeconds, mode?, similarityThreshold? } | Turn on edge cache lookups for matching requests. |
| fallback_chain | { candidates[], onStatus?, onTimeoutMs?, maxAttempts? } | Ordered (provider, model) candidates tried on upstream failure. |
| compress | { strength: "light" \| "medium" \| "aggressive", profile: 1, tags?: string[] } | In-house compressor that slims system prompts + tool output before the upstream call. Cache-safe (runs upstream of cache lookup; never reads the user’s last message). Watchdog auto-pauses on a >10% quality drop. |
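Because the config shape depends on the rule type, it can help to check a payload locally before sending it. The following is a minimal sketch that mirrors the table above, assuming keys without a `?` are required and keys with a `?` are optional; the server may apply defaults or accept more than this, so treat it as a pre-flight check rather than the API's own validation. `check_rule_config` is a hypothetical helper.

```python
# Keys shown without "?" in the table; assumed required for this sketch.
REQUIRED_KEYS = {
    "model_override": {"fromModel", "toModel"},
    "max_tokens_cap": {"maxTokens"},
    "context_trim": {"maxMessages"},
    "cache_enable": {"ttlSeconds"},
    "fallback_chain": {"candidates"},
    "compress": {"strength", "profile"},
}
# Keys shown with "?" in the table.
OPTIONAL_KEYS = {
    "cache_enable": {"mode", "similarityThreshold"},
    "fallback_chain": {"onStatus", "onTimeoutMs", "maxAttempts"},
    "compress": {"tags"},
}
COMPRESS_STRENGTHS = {"light", "medium", "aggressive"}

def check_rule_config(rule_type, config):
    """Return a list of problems; an empty list means the shape matches the table."""
    if rule_type not in REQUIRED_KEYS:
        return [f"unknown rule type: {rule_type}"]
    required = REQUIRED_KEYS[rule_type]
    allowed = required | OPTIONAL_KEYS.get(rule_type, set())
    keys = set(config)
    problems = [f"missing key: {k}" for k in sorted(required - keys)]
    problems += [f"unexpected key: {k}" for k in sorted(keys - allowed)]
    if rule_type == "compress" and "strength" in config:
        if config["strength"] not in COMPRESS_STRENGTHS:
            problems.append(f"invalid strength: {config['strength']}")
    return problems

print(check_rule_config("compress", {"strength": "medium", "profile": 1}))
print(check_rule_config("max_tokens_cap", {}))
```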
Creating a compress rule
Minimal payload — defaults to medium strength with the stable in-house profile (profile: 1):
curl https://tokenwisehq.com/api/proxy-rules \
-H "Authorization: Bearer $TW_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"type": "compress",
"name": "Trim system prompt — chat-support",
"tag": "chat-support",
"enabled": true,
"config": {
"strength": "medium",
"profile": 1,
"tags": ["chat-support"]
}
}'
Workspace-level defaults (default strength + kill switch) live on the workspace record and are managed in Settings → General → Compression defaults. Per-tag overrides (off / light / medium / aggressive) live on the workspace_tags row and are managed in Tags.
Versioning policy
We won’t remove or rename a field without a 90-day deprecation warning. New fields appear without notice; clients should tolerate unknown keys. Breaking changes ship a new path prefix (/api/v2/public/…); /v1 stays alive indefinitely.
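One way to tolerate unknown keys is to copy only the fields your client knows about and ignore the rest. The sketch below does this with a dataclass; `RequestRow` and its field names (`id`, `model`) are hypothetical placeholders, since the actual response fields are defined in the JSON spec.

```python
from dataclasses import dataclass, fields

@dataclass
class RequestRow:
    # Hypothetical fields for illustration; take real names from the spec.
    id: str
    model: str

def from_payload(cls, payload):
    """Build a dataclass from a dict, silently dropping keys it does not declare."""
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in payload.items() if k in known})

# A field added to the API later ("newField") does not break parsing.
row = from_payload(RequestRow, {"id": "r1", "model": "m1", "newField": 123})
print(row)
```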
Need something this API doesn’t cover?
Email [email protected] with the use case. We’re a small team and we read every message.