spranab/contextcache
A persistent key-value cache using content-hash addressing, designed for tool-augmented LLMs. Deduplicates identical tool call results across sessions, reducing redundant API calls and latency. Supports TTL-based expiry and pluggable storage backends.
What's novel
Tool-augmented LLMs repeatedly call the same tools with identical inputs across conversations, wasting tokens and API quota. ContextCache solves this with content-addressed storage — the same input always maps to the same cache key regardless of session. This is a purpose-built caching layer for agentic workflows, not a generic Redis wrapper.
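The content-addressing idea can be sketched in a few lines: hash a canonical serialization of the tool call, so identical inputs always yield the same key in any session. This is a minimal illustration, not ContextCache's actual API; the `cache_key` helper and its field names are hypothetical.

```python
import hashlib
import json

def cache_key(tool_name: str, args: dict) -> str:
    """Derive a deterministic cache key from a tool call.

    Serializing with sorted keys makes the hash independent of
    argument order, so the same call maps to the same key
    regardless of session. (Illustrative helper, not the real API.)
    """
    payload = json.dumps({"tool": tool_name, "args": args}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Identical inputs produce identical keys, even with different dict ordering.
k1 = cache_key("weather", {"city": "Oslo", "units": "metric"})
k2 = cache_key("weather", {"units": "metric", "city": "Oslo"})
assert k1 == k2
```

Because the key depends only on content, a result cached in one conversation is a warm hit in the next, which is what lets deduplication work across sessions.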
Code Analysis
A high-performance agentic router that uses a small local LLM with a NoPE (No Positional Embedding) architecture and disk-based KV caching to route tool calls in under 200ms without positional bias errors.
Strengths
Innovative use of NoPE architecture to solve positional embedding issues in multi-tool routing; highly optimized performance via SHA-256 content-addressed disk caching for instant warm starts; clean separation between routing logic and synthesis LLM.
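The content-addressed disk caching pattern credited above can be sketched as follows, combined with the TTL-based expiry the project advertises. This is a hedged illustration under assumed semantics; the `DiskCache` class, its layout, and its method names are hypothetical, not the project's implementation.

```python
import hashlib
import json
import time
from pathlib import Path

class DiskCache:
    """Minimal sketch of a content-addressed disk cache with TTL expiry.

    Each entry is stored in a file named by the SHA-256 of its key,
    so a warm start is a single file lookup with no index to rebuild.
    (Illustrative only, not ContextCache's real storage backend.)
    """

    def __init__(self, root: str, ttl_seconds: float = 3600.0):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.ttl = ttl_seconds

    def _path(self, key: str) -> Path:
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.root / digest

    def set(self, key: str, value: dict) -> None:
        # Store the write timestamp alongside the value for TTL checks.
        record = {"ts": time.time(), "value": value}
        self._path(key).write_text(json.dumps(record))

    def get(self, key: str):
        path = self._path(key)
        if not path.exists():
            return None
        record = json.loads(path.read_text())
        if time.time() - record["ts"] > self.ttl:
            path.unlink()  # expired entry: evict and report a miss
            return None
        return record["value"]
```

A pluggable backend would expose the same `get`/`set` surface over Redis, SQLite, or S3 instead of the filesystem.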
Weaknesses
Limited test coverage (mostly integration tests, no unit tests for core router logic); error handling relies on generic exceptions without detailed logging or recovery strategies; dependency on external GGUF model files which complicates deployment.
Score Breakdown
Signal axes: Innovation · Craft · Traction · Scope · Evidence (individual scores not captured)
Commits: 17 · Contributors: 1 · Files: 109 · Active weeks: 3
Repository
Language: Python · Stars: 18 · Forks: 0 · License: NOASSERTION