condense.chat compresses payloads by up to 70% without losing meaning, in under 6 seconds per 100,000 tokens.
claude through the proxy — no signup, no key swap.
Run condense on the three payload shapes that bloat modern agent stacks. Same input, same downstream model — the difference is everything between them.
Drop-in proxy for the OpenAI SDK and Claude Code: just point your existing client at api.condense.chat with a cxk_ key. Or call POST /v1/condense directly for fine-grained control. Your model, your tools, your evals; Condense just makes them cheaper.
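A minimal sketch of what a direct call to POST /v1/condense might look like, using only the Python standard library. The endpoint path, the api.condense.chat host, the cxk_ key prefix, and the x-condense-ratio / x-condense-target-tokens header names come from this page. Everything else is an assumption made for illustration: the HTTPS scheme, the Bearer authorization header, the JSON body with a "text" field, and the value formats for the two headers. The helper name build_condense_request is hypothetical. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Assumption: the API is served over HTTPS at the host named on this page.
BASE_URL = "https://api.condense.chat"


def build_condense_request(text, api_key, ratio=None, target_tokens=None):
    """Build (but do not send) a POST /v1/condense request.

    Hypothetical helper: the body shape and auth scheme are assumptions.
    """
    headers = {
        "Content-Type": "application/json",
        # Assumption: the cxk_ key is passed as a Bearer token.
        "Authorization": f"Bearer {api_key}",
    }
    # Header names are from this page; their value formats are assumed.
    if ratio is not None:
        headers["x-condense-ratio"] = str(ratio)
    if target_tokens is not None:
        headers["x-condense-target-tokens"] = str(target_tokens)

    # Assumption: the payload to compress is sent as {"text": ...}.
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/condense", data=body, headers=headers, method="POST"
    )


req = build_condense_request("long tool output ...", "cxk_example", ratio=0.5)
print(req.get_method(), req.full_url)
```

To actually send it, pass the request to urllib.request.urlopen(req) and check the real API documentation for the response shape first.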
x-condense-ratio / x-condense-target-tokens
Tool outputs, file reads, test runs: the stuff that eats your window. Condense rewrites it on the edge, every turn, so sessions don't collapse into compact-and-lose-everything.
Pack 3× the chunks into the same window without re-ranking or dropping recall. Faithfulness holds at 90 on LongMemEval even with long, citation-heavy payloads.
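As a rough check on the 3× figure: if each chunk shrinks by a fraction r, the same window holds about 1/(1 - r) times as many chunks, which is roughly 3.3× at the 70% upper bound quoted on this page. The sketch below works that through with integer arithmetic. The window size and chunk size are hypothetical numbers picked for illustration, and it assumes every chunk compresses by the same amount, which real payloads will not do.

```python
def chunks_that_fit(window_tokens, chunk_tokens, compression_pct=0):
    """Number of whole chunks that fit in a window.

    compression_pct is the percentage of tokens removed per chunk
    (0 = uncompressed). Integer math avoids float rounding surprises.
    """
    remaining_pct = 100 - compression_pct
    return (window_tokens * 100) // (chunk_tokens * remaining_pct)


# Hypothetical sizes: a 30,000-token retrieval budget, 500-token chunks.
baseline = chunks_that_fit(30_000, 500)
condensed = chunks_that_fit(30_000, 500, compression_pct=70)
print(baseline, condensed)  # 60 200
```

Because 70% is stated as an upper bound, actual gains on a given corpus may be lower than this best case.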
System prompt + history + tool schemas add up fast when you're serving millions of turns. Condense runs once per request, transparent to your SDK.
Sign up to claim a key, then drop one command into your terminal.