Glacier By Dopove

GLACIER: Mamba with infinite memory

Open Source

AI • DevTool

GLACIER: The Open-Source Memory Engine for Mamba and Next-Gen AI Agents

GLACIER is an open-source project by Dopove Private Limited, architected by Saran S, that targets two problems: the limited long-range memory of State Space Models (SSMs) such as Mamba, and the scaling cost of traditional Transformer-based RAG systems. It is designed to help developers build persistent, efficient AI applications that can sustain long, coherent conversations and complex agentic workflows.

The Problem GLACIER Solves:

1. Mamba's Amnesia & Context Rot: Mamba models run inference over a fixed-size hidden state, so the cost per generated token is constant (O(1)) regardless of sequence length. The trade-off is "context rot": the hidden state is continuously overwritten as new tokens arrive, so the model gradually "forgets" early conversation details in long sessions. GLACIER gives Mamba an external hippocampus, a persistent store from which crucial facts can be retrieved instead of being lost.

2. Transformer Scaling Limitations: A Transformer-based chat system, even when combined with RAG, typically replays the entire conversation history in the prompt and holds it in the KV-cache. Prompt tokens and KV-cache memory therefore grow linearly with conversation length, while the self-attention cost of processing a prompt of N tokens grows quadratically (O(N^2)), making long conversations slow and expensive.
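To make the scaling argument concrete, here is a minimal arithmetic sketch. The figure of 50 tokens per turn is a hypothetical average chosen for illustration, and the function names are not part of GLACIER; the sketch only counts prompt tokens and causal attention scores when the full history is replayed each turn.

```python
# Assumption: every turn adds a fixed number of tokens to the history.
TOKENS_PER_TURN = 50  # hypothetical average, not a measured value


def full_history_prompt_tokens(turn: int) -> int:
    """Prompt size when the whole conversation is replayed at a given turn."""
    return turn * TOKENS_PER_TURN


def attention_pairs(prompt_tokens: int) -> int:
    """Causal attention scores needed to process the prompt from scratch.

    Each token attends to itself and to every earlier token, so the total
    is N * (N + 1) / 2, which grows as O(N^2).
    """
    return prompt_tokens * (prompt_tokens + 1) // 2


if __name__ == "__main__":
    for turn in (10, 50, 100):
        n = full_history_prompt_tokens(turn)
        print(f"turn {turn:3d}: {n:5d} prompt tokens, {attention_pairs(n):,} attention pairs")
```

Under these assumptions, a 10x longer conversation means a 10x longer prompt but roughly 100x more attention work to process it.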

How GLACIER Works: A Two-Pillar Architecture

GLACIER introduces two core innovations:

* ICE-Lite (Infinite Context Engine): This lightweight virtual memory management layer acts as an "external hippocampus" for Mamba. It transparently intercepts model prompts, performs rapid semantic retrieval from a persistent episodic ledger, and "pages in" only the most relevant context into Mamba's active window. This decouples long-term memory from the model's transient state.

* Temporal-RAG: A time-aware reranking layer built directly into ICE-Lite. Beyond semantic similarity, Temporal-RAG classifies documents by validity (VALID, TEMPORAL, EXPIRED) and kind (STATIC, VERSIONED, EVENT). It applies time-decay scoring, recency weighting, and semantic relevance thresholds so that Mamba receives the freshest, most pertinent information, while stale or outdated facts are actively demoted.
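The two pillars can be sketched together as follows. This is an illustrative sketch only, not GLACIER's actual code or API: the `MemoryEntry` class, the scoring formula, the constants, and the whitespace token counter are all assumptions. It shows one plausible way to combine a relevance threshold, exponential time decay, and validity demotion, and then page the best entries into a fixed token budget.

```python
from dataclasses import dataclass


@dataclass
class MemoryEntry:  # hypothetical record in the episodic ledger
    text: str
    similarity: float  # semantic similarity to the query, in [0, 1]
    age_days: float    # time since the entry was written
    validity: str      # "VALID", "TEMPORAL", or "EXPIRED"
    kind: str          # "STATIC", "VERSIONED", or "EVENT"


# All constants below are illustrative assumptions.
HALF_LIFE_DAYS = 30.0   # score halves every 30 days for non-static entries
MIN_SIMILARITY = 0.3    # entries below this relevance are never paged in
EXPIRED_PENALTY = 0.1   # expired facts are demoted, not deleted


def temporal_score(entry: MemoryEntry) -> float:
    """Combine relevance, time decay, and validity into one score."""
    if entry.similarity < MIN_SIMILARITY:
        return 0.0
    # STATIC facts do not decay; everything else decays exponentially with age.
    decay = 1.0 if entry.kind == "STATIC" else 0.5 ** (entry.age_days / HALF_LIFE_DAYS)
    penalty = EXPIRED_PENALTY if entry.validity == "EXPIRED" else 1.0
    return entry.similarity * decay * penalty


def count_tokens(text: str) -> int:
    """Crude whitespace count standing in for a real tokenizer."""
    return len(text.split())


def page_in(entries: list[MemoryEntry], token_budget: int) -> list[MemoryEntry]:
    """Select the highest-scoring entries that fit in the token budget."""
    selected, used = [], 0
    for entry in sorted(entries, key=temporal_score, reverse=True):
        if temporal_score(entry) <= 0.0:
            break  # everything after this point is below the threshold
        cost = count_tokens(entry.text)
        if used + cost <= token_budget:
            selected.append(entry)
            used += cost
    return selected


entries = [
    MemoryEntry("User's name is Ada", 0.6, 90, "VALID", "STATIC"),
    MemoryEntry("API version is 1.0", 0.8, 60, "EXPIRED", "VERSIONED"),
    MemoryEntry("API version is 2.0", 0.8, 1, "VALID", "VERSIONED"),
    MemoryEntry("Small talk about weather", 0.1, 0, "VALID", "EVENT"),
]
selected = page_in(entries, token_budget=8)
print([e.text for e in selected])
```

With an 8-token budget, the fresh version fact and the static name fact are paged in, while the expired version fact and the irrelevant small talk stay in the ledger. The active prompt stays bounded no matter how many entries the ledger holds.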

Key Features & Benefits for Developers:

* Persistent & Infinite Memory: Mamba agents can now remember facts, instructions, and entire conversation histories across hundreds of turns and even application restarts. No more "Agentic Amnesia."

* Massive Token Efficiency: GLACIER keeps the active prompt small. The project's benchmarks report a 16.7x smaller token footprint than a Transformer baseline: the active prompt stays at a roughly constant ~300 tokens, compared to ~4900 tokens for the Transformer at Turn 100.

* Constant O(1) Inference Latency: By keeping Mamba's active context small, relevant, and bounded, GLACIER preserves Mamba's constant per-token inference cost, delivering predictable, flat latency per turn regardless of conversation length.

* Advanced Agentic Capabilities: Native support for tool-calling (with "JSON pinning" to remember tool outputs), multi-step "Turbo-Stitching" for seamless long-form generation, and one-line ingestion of files and codebases.

* Reproducible Setup with Docker: Includes a Dockerfile that provides a preconfigured CUDA environment, avoiding the Mamba compilation issues that come up on diverse machines.

* Empirical Benchmarks: Backed by benchmark data and visualizations comparing its performance against both vanilla Mamba and Transformer baselines.
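The token-efficiency figures quoted above can be checked with a little arithmetic. The sketch below uses only the two rounded token counts from the listing (~300 and ~4900 at Turn 100); the linear growth of the baseline between turns and the function names are assumptions made for illustration.

```python
BOUNDED_PROMPT_TOKENS = 300         # approximate active prompt reported for GLACIER
BASELINE_TOKENS_AT_TURN_100 = 4900  # approximate Transformer prompt at Turn 100


def baseline_prompt_tokens(turn: int) -> int:
    """Assumed linear growth passing through the reported Turn 100 figure."""
    return BASELINE_TOKENS_AT_TURN_100 * turn // 100


def footprint_ratio(turn: int) -> float:
    """How many times larger the baseline prompt is than the bounded prompt."""
    return baseline_prompt_tokens(turn) / BOUNDED_PROMPT_TOKENS


if __name__ == "__main__":
    for turn in (25, 50, 100):
        print(f"turn {turn:3d}: baseline {baseline_prompt_tokens(turn):4d} tokens, "
              f"ratio {footprint_ratio(turn):.1f}x")
```

With these rounded inputs the ratio at Turn 100 comes out near 16.3x, in line with the reported 16.7x; the small gap presumably reflects rounding in the quoted token counts.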

GLACIER empowers you to:

* Build Mamba-powered agents that are truly coherent, consistent, and context-aware over extended interactions.

* Drastically reduce operational costs by minimizing token usage and maintaining predictable latency.

* Innovate with next-generation AI applications that were previously impossible due to memory limitations.

Join the movement! GLACIER is open-source and ready for the community to build the future of memory-augmented SSMs. Explore the code, run the benchmarks, and contribute to pushing the boundaries of AI capabilities.

Built with

Python

C++

CUDA

C (Programming Language)