Remnis
Local Work Memory for macOS Developers
Overview
Remnis is a local macOS memory engine for developers. I am building it to solve the context fragmentation problem that shows up across terminals, editors, browsers, docs, and chat tools, where useful work context disappears as soon as you switch tasks.
The product strategy is explicitly local-only and capture-first. Remnis should make it possible to search by intent, not exact keywords, but that only works if the capture pipeline is reliable, low-noise, and lightweight enough to run in the background without getting in the way.
What It Does Right Now
Desktop + Sidecar Split
The Tauri desktop shell starts reliably, talks to the local sidecar, and can fetch `/health` with readiness flags.
Active Window Observer
Observer v1 captures active-window context on macOS as the initial high-signal source for work memory.
Local Ingest + Search APIs
`/health`, `/ingest`, `/observer/stats`, and `/search` are wired for readiness checks, diagnostics, ingest, and keyword-search fallback.
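As an illustration of the keyword-search fallback, here is a minimal sketch of the kind of term-overlap ranking `/search` could use before the semantic layer lands. The function name and event fields (`app`, `text`) are assumptions for the example, not Remnis's actual API.

```python
# Hypothetical keyword-search fallback: rank stored events by how many
# query terms appear in their captured text. Field names are illustrative.

def keyword_search(events, query, limit=5):
    """Return up to `limit` events that match the most query terms."""
    terms = query.lower().split()
    scored = []
    for event in events:
        text = event.get("text", "").lower()
        score = sum(1 for term in terms if term in text)
        if score > 0:
            scored.append((score, event))
    # Highest term overlap first; ties keep ingest order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [event for _, event in scored[:limit]]

events = [
    {"app": "Terminal", "text": "git rebase onto main failed"},
    {"app": "Safari", "text": "LanceDB vector search docs"},
    {"app": "VS Code", "text": "fix rebase conflict in main.rs"},
]
print(keyword_search(events, "rebase main"))
```

This is deliberately dumb on purpose: a fallback like this only needs to be predictable, since semantic matching is the planned end state.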
Deduped Local Storage
Events are normalized, hashed, debounced, deduplicated, and persisted locally in JSONL before future indexing.
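The normalize-hash-dedupe-persist steps can be sketched as follows. This is a minimal stand-in, assuming hypothetical event fields (`app`, `title`, `text`); the real pipeline's schema and hashing choices may differ.

```python
import hashlib
import json

def normalize(event):
    """Keep a stable, minimal subset of fields so hashing is deterministic."""
    return {
        "app": event.get("app", "").strip(),
        "title": event.get("title", "").strip(),
        "text": event.get("text", "").strip(),
    }

def event_hash(event):
    """Content hash over the normalized payload, used as the dedupe key."""
    payload = json.dumps(normalize(event), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def persist(events, path):
    """Append only previously unseen events to a JSONL file; return count kept."""
    seen = set()
    kept = 0
    with open(path, "a", encoding="utf-8") as fh:
        for event in events:
            digest = event_hash(event)
            if digest in seen:
                continue
            seen.add(digest)
            fh.write(json.dumps(normalize(event)) + "\n")
            kept += 1
    return kept
```

JSONL is a good fit here precisely because it is append-only and trivially re-readable when the future indexing pass arrives.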
Architecture
Desktop Shell
Tauri + React + TypeScript handle the UI, the global hotkey, the menu bar presence, and local orchestration between the desktop shell and sidecar.
Local Sidecar
The Python FastAPI sidecar owns the observer, ingest pipeline, health surface, and search endpoints so the capture and retrieval logic can evolve independently from the desktop UI.
Capture Pipeline
The pipeline is observe, normalize, hash, debounce, dedupe, store, then eventually index. The product depends more on high-signal capture than on model sophistication.
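The debounce step in particular can be sketched as a quiet-window filter. This is one plausible semantics (a repeat event extends the quiet window), assuming a per-key scheme the source does not specify; the clock is injectable so the behavior is testable.

```python
import time

class Debouncer:
    """Drop repeat events for the same key that arrive within a quiet window.

    Illustrative sketch only; the real pipeline's key and window are unknown.
    A rejected repeat still refreshes the window, so a rapidly flickering
    window title stays suppressed until it settles.
    """

    def __init__(self, window_seconds=5.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self.last_seen = {}

    def accept(self, key):
        """Return True if the event for `key` should pass through."""
        now = self.clock()
        last = self.last_seen.get(key)
        self.last_seen[key] = now
        return last is None or (now - last) >= self.window
```

A pipeline stage like this is cheap enough to run on every observer tick, which matters for the low-overhead, background-first constraint.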
Semantic Layer
The intended storage layer is LanceDB with vectors plus metadata persistence, but the local embedding/indexing pipeline and heavier query-time reasoning model are not integrated into runtime yet.
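To make "vectors plus metadata" concrete, here is a hypothetical sketch of the row shape such a table might hold. Every field name here is an assumption; only the vector-plus-metadata split comes from the design.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryRecord:
    """Hypothetical row shape for the planned vector + metadata store.

    Field names are illustrative assumptions, not the actual schema.
    """
    event_id: str    # content hash produced by the ingest pipeline
    app: str         # source application
    title: str       # window or document title
    snippet: str     # captured text context
    timestamp: float # capture time, epoch seconds
    vector: list = field(default_factory=list)  # embedding, empty until indexing lands
```

Keeping the embedding optional on the record mirrors the current reality: JSONL rows exist today, and vectors get backfilled once the semantic pipeline is integrated.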
Workflow
1. Capture
The macOS observer tracks the active window and packages context into lightweight local events.
2. Normalize
The ingest layer validates the payload, hashes it, removes noisy duplicates, and stores only the useful context.
3. Store and Index
Accepted events are persisted locally and later indexed with the lightweight embedding model once the semantic pipeline is integrated.
4. Query and Render
Queries will be embedded, matched semantically, optionally reranked or synthesized by a heavier local model, and surfaced in a HUD with app identity, snippet, and time context.
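The semantic-match step reduces to nearest-neighbor search over stored vectors. A minimal cosine-similarity sketch, assuming toy vectors and a hypothetical `vector` field on each record (a real deployment would delegate this to the vector store):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def semantic_rank(query_vec, records, top_k=3):
    """Return the records whose stored vectors best match the query vector."""
    ranked = sorted(records, key=lambda r: cosine(query_vec, r["vector"]), reverse=True)
    return ranked[:top_k]

records = [
    {"id": "a", "vector": [1.0, 0.0]},
    {"id": "b", "vector": [0.0, 1.0]},
    {"id": "c", "vector": [0.9, 0.1]},
]
print(semantic_rank([1.0, 0.0], records, top_k=2))
```

In the intended flow, the top-k results from a pass like this are what the heavier query-time model would rerank or synthesize over.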
Two-Model End State
Model 1: Background Indexing
Remnis is intended to ship with an always-on local embedding and indexing model for event and query embeddings. The baseline planned model is `all-MiniLM-L6-v2`, used in the background for semantic indexing with low overhead.
Model 2: Query-Time Reasoning
A second, heavier local model is reserved for query-time use only when the user invokes Remnis and wants reranking, synthesis, reminders, related-context explanations, or a better final answer.
Strategy & Challenges
Capture Quality Comes First
The hardest part of the product is reliable, high-signal capture across sources. Better embeddings or reasoning cannot compensate for weak or noisy capture, so the architecture is intentionally capture-first.
Local-Only Constraints
Everything is designed to stay local by default, with explicit OS permission handling, low CPU and memory overhead, and graceful failure when dependencies or permissions are unavailable.
Fast Retrieval Later
Once both local model tiers are integrated, the product will still need to balance background indexing cost, query-time reasoning cost, and the boundary between the desktop shell and local sidecar so search feels immediate.
Next Expansion
Browser Adapter
The next high-value source is browser capture, especially URL, title, and snippet context.
Clipboard + Notifications
Clipboard and notification capture follow after browser integration to make cross-app recall more robust.
HUD + Hotkey
The long-term UX is a Spotlight-style HUD with fast semantic retrieval, relative time, app identity, and optional synthesized answers.
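The "relative time" part of a HUD row is straightforward to sketch. A minimal bucketing helper, with label wording that is purely illustrative:

```python
def relative_time(seconds_ago):
    """Render an age in seconds as the short relative label a HUD row might show."""
    if seconds_ago < 60:
        return "just now"
    if seconds_ago < 3600:
        return f"{int(seconds_ago // 60)}m ago"
    if seconds_ago < 86400:
        return f"{int(seconds_ago // 3600)}h ago"
    return f"{int(seconds_ago // 86400)}d ago"
```

Coarse buckets like these tend to read better in a Spotlight-style overlay than exact timestamps, since the user is recalling "that thing from a couple hours ago," not a clock time.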
Current Status
Local API routes already running
Observer loop shipping in v1 today
Local model tiers not yet integrated at runtime
Today the baseline observer, ingest, JSONL persistence, and keyword-search fallback are in place. The app is not considered complete until the two local model tiers, LanceDB-backed retrieval, and the final query-time reasoning flow are all running locally.