Remnis
Local Work Memory for macOS Developers
Overview
Remnis is a local macOS memory engine for developers. I am building it to solve the context fragmentation problem that shows up across terminals, editors, browsers, docs, and chat tools, where useful work context disappears as soon as you switch tasks.
The product strategy is explicitly local-only and capture-first. Remnis should make it possible to search by intent, not exact keywords, but that only works if the capture pipeline is reliable, low-noise, and lightweight enough to run in the background without getting in the way.
What It Does Right Now
Desktop + Sidecar Split
The Tauri desktop shell starts reliably, talks to the local sidecar, and can fetch `/health` with readiness flags.
Active Window Observer
Observer v1 captures active-window context on macOS as the initial high-signal source for work memory.
Local Ingest + Search APIs
`/health`, `/ingest`, `/observer/stats`, and `/search` are wired for readiness checks, diagnostics, ingest, and keyword-search fallback.
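As an illustration of the keyword-search fallback, here is a minimal sketch of the kind of term-overlap ranking `/search` could use before the semantic layer lands. The function name and event fields (`app`, `text`) are assumptions for the example, not Remnis's actual API.

```python
# Hypothetical keyword-search fallback: rank stored events by how many
# query terms appear in their captured text. Field names are illustrative.

def keyword_search(events, query, limit=5):
    """Return up to `limit` events that match the most query terms."""
    terms = query.lower().split()
    scored = []
    for event in events:
        text = event.get("text", "").lower()
        score = sum(1 for term in terms if term in text)
        if score > 0:
            scored.append((score, event))
    # Highest term overlap first; ties keep ingest order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [event for _, event in scored[:limit]]

events = [
    {"app": "Terminal", "text": "git rebase onto main failed"},
    {"app": "Safari", "text": "LanceDB vector search docs"},
    {"app": "VS Code", "text": "fix rebase conflict in main.rs"},
]
print(keyword_search(events, "rebase main"))
```

This is deliberately dumb on purpose: a fallback like this only needs to be predictable, since semantic matching is the planned end state.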
Deduped Local Storage
Events are normalized, hashed, debounced, deduplicated, and persisted locally in JSONL before future indexing.
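The normalize-hash-dedupe-persist steps can be sketched as follows. This is a minimal stand-in, assuming hypothetical event fields (`app`, `title`, `text`); the real pipeline's schema and hashing choices may differ.

```python
import hashlib
import json

def normalize(event):
    """Keep a stable, minimal subset of fields so hashing is deterministic."""
    return {
        "app": event.get("app", "").strip(),
        "title": event.get("title", "").strip(),
        "text": event.get("text", "").strip(),
    }

def event_hash(event):
    """Content hash over the normalized payload, used as the dedupe key."""
    payload = json.dumps(normalize(event), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def persist(events, path):
    """Append only previously unseen events to a JSONL file; return count kept."""
    seen = set()
    kept = 0
    with open(path, "a", encoding="utf-8") as fh:
        for event in events:
            digest = event_hash(event)
            if digest in seen:
                continue
            seen.add(digest)
            fh.write(json.dumps(normalize(event)) + "\n")
            kept += 1
    return kept
```

JSONL is a good fit here precisely because it is append-only and trivially re-readable when the future indexing pass arrives.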
Architecture
Desktop Shell
Tauri + React + TypeScript handle the UI, the global hotkey, the menu bar presence, and local orchestration between the desktop shell and sidecar.
Local Sidecar
The Python FastAPI sidecar owns the observer, ingest pipeline, health surface, and search endpoints so the capture and retrieval logic can evolve independently from the desktop UI.
Capture Pipeline
The pipeline is observe, normalize, hash, debounce, dedupe, store, then eventually index. The product depends more on high-signal capture than on model sophistication.
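The debounce step in particular can be sketched as a quiet-window filter. This is one plausible semantics (a repeat event extends the quiet window), assuming a per-key scheme the source does not specify; the clock is injectable so the behavior is testable.

```python
import time

class Debouncer:
    """Drop repeat events for the same key that arrive within a quiet window.

    Illustrative sketch only; the real pipeline's key and window are unknown.
    A rejected repeat still refreshes the window, so a rapidly flickering
    window title stays suppressed until it settles.
    """

    def __init__(self, window_seconds=5.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self.last_seen = {}

    def accept(self, key):
        """Return True if the event for `key` should pass through."""
        now = self.clock()
        last = self.last_seen.get(key)
        self.last_seen[key] = now
        return last is None or (now - last) >= self.window
```

A pipeline stage like this is cheap enough to run on every observer tick, which matters for the low-overhead, background-first constraint.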
Semantic Layer
The intended storage layer is LanceDB with vectors plus metadata persistence, but the local embedding/indexing pipeline and heavier query-time reasoning model are not integrated into runtime yet.
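To make "vectors plus metadata" concrete, here is a hypothetical sketch of the row shape such a table might hold. Every field name here is an assumption; only the vector-plus-metadata split comes from the design.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryRecord:
    """Hypothetical row shape for the planned vector + metadata store.

    Field names are illustrative assumptions, not the actual schema.
    """
    event_id: str    # content hash produced by the ingest pipeline
    app: str         # source application
    title: str       # window or document title
    snippet: str     # captured text context
    timestamp: float # capture time, epoch seconds
    vector: list = field(default_factory=list)  # embedding, empty until indexing lands
```

Keeping the embedding optional on the record mirrors the current reality: JSONL rows exist today, and vectors get backfilled once the semantic pipeline is integrated.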
Workflow
1. Capture
The macOS observer tracks the active window and packages context into lightweight local events.
2. Normalize
The ingest layer validates the payload, hashes it, removes noisy duplicates, and stores only the useful context.
3. Store and Index
Accepted events are persisted locally and later indexed with the lightweight embedding model once the semantic pipeline is integrated.
4. Query and Render
Queries will be embedded, matched semantically, optionally reranked or synthesized by a heavier local model, and surfaced in a HUD with app identity, snippet, and time context.
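The semantic-match step reduces to nearest-neighbor search over stored vectors. A minimal cosine-similarity sketch, assuming toy vectors and a hypothetical `vector` field on each record (a real deployment would delegate this to the vector store):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def semantic_rank(query_vec, records, top_k=3):
    """Return the records whose stored vectors best match the query vector."""
    ranked = sorted(records, key=lambda r: cosine(query_vec, r["vector"]), reverse=True)
    return ranked[:top_k]

records = [
    {"id": "a", "vector": [1.0, 0.0]},
    {"id": "b", "vector": [0.0, 1.0]},
    {"id": "c", "vector": [0.9, 0.1]},
]
print(semantic_rank([1.0, 0.0], records, top_k=2))
```

In the intended flow, the top-k results from a pass like this are what the heavier query-time model would rerank or synthesize over.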
Two-Model End State
Model 1: Background Indexing
Remnis is intended to ship with an always-on local embedding and indexing model for event and query embeddings. The baseline planned model is `all-MiniLM-L6-v2`, used in the background for semantic indexing with low overhead.
Model 2: Query-Time Reasoning
A second, heavier local model is reserved for query-time use only when the user invokes Remnis and wants reranking, synthesis, reminders, related-context explanations, or a better final answer.
Strategy & Challenges
Capture Quality Comes First
The hardest part of the product is reliable, high-signal capture across sources. Better embeddings or reasoning cannot compensate for weak or noisy capture, so the architecture is intentionally capture-first.
Local-Only Constraints
Everything is designed to stay local by default, with explicit OS permission handling, low CPU and memory overhead, and graceful failure when dependencies or permissions are unavailable.
Fast Retrieval Later
Once both local model tiers are integrated, the product will still need to balance background indexing cost, query-time reasoning cost, and the boundary between the desktop shell and local sidecar so search feels immediate.
Next Expansion
Browser Adapter
The next high-value source is browser capture, especially URL, title, and snippet context.
Clipboard + Notifications
Clipboard and notification capture follow after browser integration to make cross-app recall more robust.
HUD + Hotkey
The long-term UX is a Spotlight-style HUD with fast semantic retrieval, relative time, app identity, and optional synthesized answers.
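The "relative time" part of a HUD row is straightforward to sketch. A minimal bucketing helper, with label wording that is purely illustrative:

```python
def relative_time(seconds_ago):
    """Render an age in seconds as the short relative label a HUD row might show."""
    if seconds_ago < 60:
        return "just now"
    if seconds_ago < 3600:
        return f"{int(seconds_ago // 60)}m ago"
    if seconds_ago < 86400:
        return f"{int(seconds_ago // 3600)}h ago"
    return f"{int(seconds_ago // 86400)}d ago"
```

Coarse buckets like these tend to read better in a Spotlight-style overlay than exact timestamps, since the user is recalling "that thing from a couple hours ago," not a clock time.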
Current Status
Local API routes already running
Observer loop shipping in v1 today
Local model tiers not yet integrated at runtime
Today the baseline observer, ingest, JSONL persistence, and keyword-search fallback are in place. The app is not considered complete until the two local model tiers, LanceDB-backed retrieval, and the final query-time reasoning flow are all running locally.