Who we are
Imir Lab is a Lithuanian research group founded in 2026 by a team with backgrounds in frontier AI research and large-scale infrastructure. We build fast-inference models for tasks where latency is the dominant cost. condense.chat is our first release.
Why Claude Code context compaction is slow
Claude Code context compaction is built into the SDK. When the context window fills, Claude Code runs a sequential LLM call that summarises the conversation and replaces it inline. The Claude Agent SDK uses the same approach. It works, but the compaction step blocks your next turn for several seconds, and the rewrite drops detail you usually need.
condense.chat sits between your client and Anthropic. It compresses history in the background, in parallel, before the request goes upstream. Same SDK (Claude Code or Claude Agent SDK), same model, same evals. The result shows up in three places: a smaller upstream bill, around 5x less wall-clock time on the compaction step itself, and a rewrite that scores higher than the two compaction services we benchmark against. Numbers on the last two ride with the next post. The first you can watch happen in real time on your own helm dashboard.
The bill, with and without compaction
The bill is the easiest part to show, so we start there. Same Claude Opus pricing on both axes ($5 per million input tokens). The X axis is the underlying conversation length your agent would have produced on its own. The blue line is what you pay today. Drag the slider to see the same session priced at four condense compaction levels.
What you see after you sign up
Once you are approved, you land in your helm dashboard at helm.condense.chat. The numbers above the fold are the four we care about: dollars saved, compression ratio, cache hit rate, and request count. The dashboard polls every couple of seconds, so the first few requests through the proxy show up within a turn.
Try it
The whole setup is one line in your terminal once your account is approved. The installer mints a key, writes a small ~/.claude config that points the Claude SDK at api.condense.chat, and starts Claude Code on your account.
The numbers we are happy to put in writing
Three for now. The harder ones (full latency curves, accuracy on each benchmark) ride along with the next post.
Up next
New posts arrive here every Saturday from now on. Next Saturday we ship Adeline, our v1 compaction model.
Run Claude Code on the cheaper bill.
Sign up, wait for approval, paste the curl line.