Token Compression: Compresses data before it reaches the LLM, cutting tokens by up to 87.6% on JSON tool results and 53% on tool-heavy requests, which lowers cost and speeds up responses.
Smart Tool Selection: Classifies each request and strips the tool schemas irrelevant to it, reducing token overhead for specific tasks.
Semantic Cache: Serves repeated prompts from cache at zero token cost (171ms cache hits), using local embeddings for efficiency.
Complexity Tier Routing: Scores each request on 15 dimensions of complexity and routes it to an appropriate model (local for simple requests, cloud for complex ones), balancing cost and performance.
MCP Code Mode Integration: Replaces numerous tool schemas with four meta-tools, cutting the token overhead of integrating with coding tools.
Broad Provider Support: Offers a single endpoint for over 13 LLM providers, including local options like Ollama and cloud services like Azure OpenAI and AWS Bedrock.
Zero Code Changes: Works with existing AI coding tools such as Claude Code, Cursor, and Codex; setup is a single environment variable configuration.
Self-Hosted: Apache 2.0 licensed and designed for self-hosting with no usage tracking, so your data stays private and under your control.
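To make the semantic cache idea from the feature list concrete, here is a minimal sketch in Python. It is not Lynkr's implementation: the `SemanticCache` class, the `embed` function, and the 0.9 similarity threshold are all hypothetical, and a toy bag-of-words vector stands in for a real local embedding model. It only illustrates the general technique of answering a prompt from cache when it is close enough to one seen before.

```python
import math
from collections import Counter


def embed(text):
    # Toy stand-in for a local embedding model (assumption): word counts.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SemanticCache:
    """Hypothetical cache: returns a stored response for similar prompts."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed cutoff, not a Lynkr setting
        self.entries = []  # list of (embedding, response) pairs

    def put(self, prompt, response):
        self.entries.append((embed(prompt), response))

    def get(self, prompt):
        query = embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        # A hit costs no LLM tokens; a miss returns None so the caller
        # can forward the request to the model.
        return best_response if best_score >= self.threshold else None
```

A caller would try `cache.get(prompt)` first and only forward the request to the LLM (then `cache.put` the answer) on a miss.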
Lynkr is a gateway between your AI coding tools and any LLM provider. Before forwarding a request to the LLM, it compresses data, selects the relevant tools, and checks a semantic cache. This reduces token usage and latency without any modifications to your existing tools. Lynkr supports a wide range of LLM providers and installs via npm, helping developers cut costs and improve the efficiency of their AI-powered applications.
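The request-preparation steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Lynkr's actual code: `compact_tool_result`, `select_tools`, `prepare_request`, and the per-tool `keywords` field are hypothetical names, and the compression shown (dropping whitespace and null fields from JSON) is just one simple way to shrink a tool result.

```python
import json


def compact_tool_result(raw_json):
    """Re-serialize a JSON tool result without whitespace or null fields."""

    def prune(value):
        if isinstance(value, dict):
            return {k: prune(v) for k, v in value.items() if v is not None}
        if isinstance(value, list):
            return [prune(v) for v in value]
        return value

    return json.dumps(prune(json.loads(raw_json)), separators=(",", ":"))


def select_tools(prompt, tools):
    """Keep only tools whose (hypothetical) keywords appear in the prompt."""
    words = set(prompt.lower().split())
    return [tool for tool in tools if words & set(tool.get("keywords", []))]


def prepare_request(prompt, tools, tool_results):
    """Build the smaller request a gateway would forward to the LLM."""
    return {
        "prompt": prompt,
        "tools": select_tools(prompt, tools),
        "tool_results": [compact_tool_result(r) for r in tool_results],
    }
```

For example, a pretty-printed result such as `{"path": "a.py", "error": null, "lines": [1, 2]}` spread over several lines becomes the single compact string `{"path":"a.py","lines":[1,2]}`, and a prompt about reading a file keeps only the file-reading tool schema.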
Built with