TokenTrim compresses your prompts before they hit OpenAI or Anthropic. Same responses. 30–50% fewer input tokens. No infra to manage.
console.log(response.tokentrim); // → { tokens_in: 658, tokens_in_compressed: 397, tokens_saved: 261, cost_saved_usd: "0.0013" }
Typical savings: 30–50% on input tokens. Same quality, same responses.
Replace OpenAI with OptimizedOpenAI. Your call stays the same — response.tokentrim shows what you saved.
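A minimal sketch of the swap, assuming the TokenTrim SDK exports an `OptimizedOpenAI` class with the same `chat.completions.create` signature as the official `openai` package (the import path and class name here are assumptions drawn from the copy above, not a published API):

```typescript
// Before: import OpenAI from "openai";
// After (hypothetical import path for the TokenTrim SDK):
// import { OptimizedOpenAI } from "tokentrim";

// A structural type keeps this sketch self-contained; the real client
// would satisfy it, so the call site below is identical either way.
type ChatClient = {
  chat: { completions: { create: (req: object) => Promise<unknown> } };
};

async function ask(client: ChatClient, prompt: string) {
  // Identical to a stock OpenAI call; only the client class differs.
  return client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: prompt }],
  });
}
```

Because the request shape is unchanged, swapping the client back out is a one-line revert.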
Real-time dashboard showing compression ratio, tokens saved, and exact $ back in your pocket.
Your app calls OptimizedOpenAI.chat.completions.create() — same API you already use.
The TokenTrim gateway compresses the prompt via LLMLingua-2, a Microsoft Research model for task-agnostic prompt compression.
The compressed prompt hits OpenAI. You get the response plus response.tokentrim metadata showing exactly what you saved.
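The dollar figure in response.tokentrim can be reproduced from the raw token counts. A sketch of the arithmetic, assuming a flat input price of $5 per million tokens (the rate that reproduces the example figures above; savings() is an illustrative helper, not part of the SDK):

```typescript
// Illustrative helper (not part of the TokenTrim SDK): derive the fields
// that response.tokentrim reports from before/after token counts.
function savings(tokensIn: number, tokensInCompressed: number, usdPerMillion: number) {
  const tokensSaved = tokensIn - tokensInCompressed;
  return {
    tokens_saved: tokensSaved,
    // Percent of input tokens removed by compression.
    percent_saved: +((tokensSaved / tokensIn) * 100).toFixed(1),
    // Dollar value of the removed tokens at the given input price.
    cost_saved_usd: (tokensSaved * usdPerMillion / 1e6).toFixed(4),
  };
}

// Using the example figures from above (658 -> 397 tokens at $5/M input):
console.log(savings(658, 397, 5));
// → { tokens_saved: 261, percent_saved: 39.7, cost_saved_usd: "0.0013" }
```

The 39.7% figure also shows how the "30–50% fewer input tokens" headline claim is computed: tokens saved divided by original input tokens.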
Get your API key within 24 hours. Free during beta. Cancel anytime — there's nothing to cancel.