Architecture
The inner workings of instant-grep — from regex to result in under a millisecond.
Query Pipeline
The complete query pipeline — six stages from regex to results.
╔══════════════════════════════════════════════════════════════════╗
║ QUERY PIPELINE — SEARCH ARCHITECTURE ║
╚══════════════════════════════════════════════════════════════════╝
User Input
│
▼
┌─────────────────────────────────────────┐
│ Stage 1 · regex-syntax Extractor │ query/extract.rs
│ "async fn" → literal fragments │
└─────────────────┬───────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ Stage 2 · Sparse N-gram Generation │ index/ngram.rs
│ fragments → variable-length n-grams │
└─────────────────┬───────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ Stage 3 · Covering Set Selection │ index/ngram.rs
│ all n-grams → minimal covering set │ build_covering_ngrams()
└─────────────────┬───────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ Stage 4 · Hash Table Lookup │ index/reader.rs
│ n-grams → posting list offsets (mmap) │
└─────────────────┬───────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ Stage 5 · Posting List Intersection │ index/reader.rs
│ sorted file ID lists → candidates │
└─────────────────┬───────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ Stage 6 · Parallel Regex Verification │ Rayon threadpool
│ candidates → verified matches │
└─────────────────────────────────────────┘
│
▼
✓ Results (p50: 0.9ms)
Module Structure
src/index/ngram.rs Core Algorithm
Sparse n-gram generation and covering set selection
src/index/writer.rs Index Build Pipeline
Walks filesystem, drives the SPIMI pipeline, serializes the index
src/index/spimi.rs SPIMI Index Build
Single-Pass In-Memory Indexing with 128MB RAM budget — segments flushed to disk when budget is hit
src/index/merge.rs Segment Merge
Merges SPIMI segments into the final index with VByte-encoded postings
src/index/vbyte.rs VByte Compression
Variable-byte delta encoding for posting lists — ~50–60% smaller postings.bin
src/index/overlay.rs Overlay Incremental Index
For <100 changed files: builds a small overlay index, merges on read. Avoids full rebuild.
src/index/reader.rs Index Query
mmap-backed hash table + posting list intersection. Decodes VByte postings on the fly.
src/query/extract.rs Regex → N-gram Conversion
Parses regex AST to extract searchable literal fragments
src/daemon.rs Background Daemon
Unix socket server with full lifecycle: start/stop/status/install/uninstall + launchd auto-restart
src/read.rs Smart File Reading
Full file reading or signatures-only mode — extracts function/struct definitions to reduce agent token cost
src/smart.rs 2-line File Summaries
Heuristic 2-line summaries for every project file — high-level map before full reads
src/pack.rs Project Context Generator
Generates .ig/context.md: tree, symbols, dependencies, Next.js API routes, .env.example keys
src/ls.rs Compact Directory Listing
Groups files by directory with counts — token-efficient alternative to raw find output
src/rewrite.rs Command Rewrite Engine
Translates grep/find/rg invocations to ig equivalents — used by the Claude Code PreToolUse hook
src/tracking.rs Token Savings Tracking
Appends rewrite events to a JSONL log; read by ig gain to compute cumulative savings
src/gain.rs Savings Dashboard
Reads the JSONL tracking log and renders a token savings + cost summary table
src/main.rs CLI Entry Point
Command dispatch via clap — routes to index, search, status, watch, daemon, query, ls, read, smart, pack, gain
v1.0.0 — SPIMI Build Pipeline & Overlay Index
v1.0.0 uses a Single-Pass In-Memory Indexing (SPIMI) pipeline with a 128MB RAM budget. When the budget is hit, a segment is flushed to disk. After all files are processed, segments are merge-sorted into the final index.
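The budget-driven flush loop can be sketched as follows. This is an illustrative stand-in, not the real spimi.rs: the types are hypothetical and flushed segments are kept in memory here instead of being written to disk.

```rust
use std::collections::HashMap;

/// SPIMI-style accumulator sketch: postings are gathered in memory and a
/// segment is flushed whenever the RAM budget is exceeded. The real writer
/// would sort terms and serialize each segment to disk on flush.
struct SpimiBuilder {
    budget_bytes: usize,
    used_bytes: usize,
    postings: HashMap<u64, Vec<u32>>,          // n-gram key -> file IDs
    segments: Vec<HashMap<u64, Vec<u32>>>,     // in-memory stand-in for segment files
}

impl SpimiBuilder {
    fn new(budget_bytes: usize) -> Self {
        Self { budget_bytes, used_bytes: 0, postings: HashMap::new(), segments: Vec::new() }
    }

    fn add(&mut self, ngram_key: u64, file_id: u32) {
        self.postings.entry(ngram_key).or_default().push(file_id);
        // Crude memory estimate: key + one posting per add.
        self.used_bytes += std::mem::size_of::<u64>() + std::mem::size_of::<u32>();
        if self.used_bytes >= self.budget_bytes {
            self.flush();
        }
    }

    fn flush(&mut self) {
        self.segments.push(std::mem::take(&mut self.postings));
        self.used_bytes = 0;
    }
}
```

With the real 128MB budget, flushes are rare; the tiny budget in a test forces one immediately.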
When fewer than 100 files have changed since the last build, ig builds a small overlay index for the changed files only. Queries merge the base index and overlay on read — no full rebuild required.
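The merge-on-read step can be pictured like this. The function name and changed-file set are hypothetical; the real overlay.rs logic is more involved:

```rust
use std::collections::HashSet;

/// Merge-on-read sketch: the overlay carries results for recently changed
/// files, so base-index hits for those files are stale and get dropped,
/// while overlay hits are authoritative.
fn merge_candidates(
    base: Vec<u32>,
    overlay: Vec<u32>,
    changed: &HashSet<u32>,
) -> Vec<u32> {
    let mut out: Vec<u32> = base.into_iter().filter(|id| !changed.contains(id)).collect();
    out.extend(overlay); // overlay results win for changed files
    out.sort_unstable();
    out
}
```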
On-disk Format (.ig/ directory) — INDEX_VERSION 13
Five binary files, all memory-mapped for zero-copy access.
INDEX_VERSION must be bumped when the format changes.
v13: 18-byte lexicon entries keep aggregate bloom + locMask metadata, while postings.bin stores tagged per-document posting entries with bloom, loc and position-zone masks. Dense posting lists add skip-block metadata for selective intersections.
.ig/ Size (1,552 files)
├── metadata.bin — bincode: paths, mtimes, git SHA, IDF flag 112 KB
├── lexicon.bin — Hash table: [key:u64, offset:u32, len:u32, 35 MB
│ bloom:u8, loc_mask:u8] (18 bytes/entry)
├── postings.bin — Tagged VByte PostingEntry stream + skip blocks 7.8 MB
├── bigram_df.bin — bincode: corpus bigram document frequencies 95 KB
├── filedata.bin — bincode: line offsets, symbols, block boundaries 1.8 MB
├── tree.txt — Compact directory tree snapshot (auto-generated)
└── context.md — Structured AI agent snapshot: tree, symbols, deps, routes, env
Generated by pack.rs. Sections: directory tree, symbol definitions, Cargo.toml / package.json dependencies, Next.js API routes, .env.example keys.
Sparse N-gram Algorithm
Traditional trigram search uses every contiguous 3-character substring — expensive and produces many false positives. The sparse n-gram algorithm uses variable-length n-grams that adapt to the string being indexed, concentrating selectivity where it matters.
How hash_bigram() works:
Each n-gram is hashed by combining its characters using a polynomial rolling hash with a prime base.
The hash fits in a u64 — enabling the open-addressing hash table lookup
to be a single memory access when the index is warm in the OS page cache (or the daemon's mmap).
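A minimal sketch of such a polynomial rolling hash; the constant here is illustrative, and the real hash_bigram() may use different parameters:

```rust
/// Polynomial rolling hash over an n-gram's bytes, folded into a u64.
/// Wrapping arithmetic keeps the result in u64 without overflow panics.
fn hash_ngram(bytes: &[u8]) -> u64 {
    const PRIME: u64 = 1_099_511_628_211; // a large prime base (illustrative)
    bytes
        .iter()
        .fold(0u64, |h, &b| h.wrapping_mul(PRIME).wrapping_add(b as u64))
}
```

Because the hash is order-sensitive, "fn" and "nf" map to different keys, which is what makes the single-probe table lookup selective.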
Covering Algorithm
From all possible n-grams for a query, we need only the minimal covering set — the smallest collection of n-grams that together guarantee we find all matching files. Enough to cover every position, no redundancy.
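Written out as runnable Rust, the greedy selection might look like the sketch below; the `Ngram` type is a hypothetical stand-in for the real one in index/ngram.rs:

```rust
/// Hypothetical n-gram covering a byte range [pos, pos + len) of the query.
#[derive(Clone, Debug, PartialEq)]
struct Ngram {
    pos: usize,
    len: usize,
}

/// Greedy covering-set selection: walk left to right and, at each uncovered
/// position, take the longest (most selective) candidate starting there.
fn covering_set(mut candidates: Vec<Ngram>) -> Vec<Ngram> {
    // Sort by position, longest first at each position.
    candidates.sort_by_key(|n| (n.pos, std::cmp::Reverse(n.len)));
    let mut chosen = Vec::new();
    let mut covered_to = 0;
    for n in candidates {
        if n.pos >= covered_to {
            covered_to = n.pos + n.len;
            chosen.push(n);
        }
    }
    chosen
}
```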
fn build_covering_ngrams(ngrams: Vec<Ngram>) -> Vec<Ngram> {
    sort ngrams by (position, length desc)
    greedily select ngrams that cover uncovered positions
    return minimal subset that covers the full query string
}
Daemon Architecture
The daemon keeps the index mmap'd in process memory, eliminating OS page cache warmup on every query.
Without the daemon:
- Cold start: open files, read mmap headers
- OS page faults on first access
- ~5-15ms for a cold index
- ~0.9ms when the OS cache is warm

With the daemon:
- Index already in process memory
- Unix socket: no TCP overhead
- Sub-0.5ms consistent latency
- Perfect for repeated AI agent queries
# Start daemon (background)
ig daemon start . # socket at .ig/daemon.sock
# Check status
ig daemon status .
# Stop daemon
ig daemon stop .
# Install as launchd service (auto-restart on macOS)
ig daemon install .
ig daemon uninstall .
# Query via socket (sub-ms)
ig query "async fn" .
Cursor-Inspired 3.5-gram Metadata
The v13 index stores cheap masks per posting entry. Query execution uses sparse n-gram lookup, next-byte bloom filtering, modulo-8 location compatibility, exact small-position compatibility, posting-list intersection, and final regex verification. The masks only narrow candidates; regex remains authoritative.
Variable-length n-grams from the danlark1 covering algorithm. Each n-gram covers a run of literal characters with maximal selectivity — no wasted index entries on fixed 3-char windows.
An 8-bit mask on each posting entry encodes the byte that follows an indexed n-gram occurrence. Query-time masked n-grams can reject files whose following byte cannot match.
An 8-bit mask encodes the position modulo 8 of each n-gram occurrence. The reader checks relative alignment across multiple n-grams before touching file contents.
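The two mask checks can be sketched as pure functions. The exact bit layouts and the trivial next-byte "hash" here are assumptions for illustration; the v13 encoding may differ:

```rust
/// Next-byte bloom check: bit i of the posting mask is set if some occurrence
/// of the n-gram in this file is followed by a byte hashing to i (mod 8).
fn next_byte_compatible(posting_mask: u8, wanted_next_byte: u8) -> bool {
    posting_mask & (1 << (wanted_next_byte % 8)) != 0
}

/// locMask alignment: if two query n-grams sit `gap` bytes apart, some pair
/// of their file occurrences must differ by `gap` modulo 8. Rotating mask_a
/// by the gap and intersecting with mask_b tests exactly that.
fn loc_compatible(mask_a: u8, mask_b: u8, gap: usize) -> bool {
    mask_a.rotate_left((gap % 8) as u32) & mask_b != 0
}
```

Both checks are single bitwise operations per posting, which is why they can run before any file content is touched.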
regex pattern
│
▼
regex-syntax::Extractor → extract literal sequences
│
▼
IDF-weighted covering ngrams (corpus-tuned boundaries)
│
▼
FNV-1a hash → NgramKey (u64) → lookup in mmap'd hash table
│
▼
Next-byte bloom mask → reject incompatible per-file postings
│
▼
locMask alignment → keep compatible modulo-8 positions
│
▼
Small-position mask → reject incompatible exact early-file positions
│
▼
Intersect posting entries (VByte decoded or skip-block advanced)
│
▼
Parallel regex verification (rayon, per-thread regex clone)
│
▼
✓ Results (colored / JSON / compact)
IDF-Weighted Ngram Boundaries
Not all bigrams are equal. In a Rust codebase, fn appears in almost every file —
it's a terrible n-gram boundary. The IDF system measures this automatically and steers
the covering algorithm away from common bigrams toward rare, selective ones.
- During ig index, bigram document frequencies are tallied across all files
- Stored compactly in bigram_df.bin (~95 KB for 1,552 files)
- First build uses heuristics; second build uses real IDF from the first
- Convergence in 2 passes — subsequent rebuilds are self-reinforcing
- Common bigrams (>50% of files): weight 0.25× → boundary pushed away
- Typical bigrams: weight 1.0× → unchanged
- Rare bigrams (<5% of files): weight 2.0× → boundary attracted
- Result: more selective n-grams, fewer candidates to regex-verify
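The weighting thresholds above can be sketched as a small function; the real tuning in index/ngram.rs may be more granular than three buckets:

```rust
/// IDF-style boundary weight for a bigram, using the thresholds from the
/// docs: very common bigrams are penalized as boundaries, rare ones attract
/// the covering algorithm. (Illustrative sketch.)
fn bigram_weight(doc_freq: u32, total_docs: u32) -> f64 {
    let ratio = doc_freq as f64 / total_docs as f64;
    if ratio > 0.50 {
        0.25 // e.g. "fn" in a Rust codebase: a poor boundary
    } else if ratio < 0.05 {
        2.0 // rare and selective: attract the boundary
    } else {
        1.0
    }
}
```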
Example — "async fn handle_request"
FileData Cache — Pre-computed Metadata
Every ig index run pre-computes per-file metadata and serializes it to
filedata.bin. Commands that previously required a full file read now
return instantly from the cache — with graceful fallback when the cache is absent.
ig context does an O(log n) binary search over cached block boundaries
to find the enclosing block — no file read required.
ig read -s returns cached symbols in ~0.1ms
vs 5–10ms for a live regex scan.
ig smart renders the project map without opening a single file.
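The enclosing-block lookup can be sketched as a binary search over sorted, pre-computed block ranges. The (start_line, end_line) layout is a hypothetical stand-in for what filedata.bin actually stores:

```rust
/// Find the block enclosing `line` by binary search over blocks sorted by
/// start line — the no-file-read pattern the FileData cache enables.
fn enclosing_block(blocks: &[(u32, u32)], line: u32) -> Option<(u32, u32)> {
    // First block starting strictly after `line`; the candidate is the one
    // just before that partition point.
    let idx = blocks.partition_point(|&(start, _)| start <= line);
    let (start, end) = *blocks.get(idx.checked_sub(1)?)?;
    (line >= start && line <= end).then_some((start, end))
}
```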
Posting Entry Decode — Masked Candidate Intersection
The v13 reader decodes PostingEntry records from the mmap'd
postings.bin slice. Each record contains a delta-encoded file ID,
a next-byte bloom mask, a modulo-8 location mask, and an exact small-position mask with overflow. Dense lists include skip blocks so
selective intersections can advance directly toward candidate file IDs before regex verification.
- Posting list slice is borrowed directly from mmap — no copy
- Each record decodes one VByte doc-ID delta plus bloom, loc and zone masks
- Decoder tracks cumulative delta sum for absolute doc IDs
- One-byte VByte deltas stay compact for adjacent file IDs
- Next-byte bloom removes files whose following byte cannot match
- locMask keeps only relative modulo-8 alignments that can coexist
- Rarest posting list is consumed first — fewest bytes decoded overall
- Regex verification still decides the final result set
struct PostingEntry {
doc_id: u32, // delta-encoded in postings.bin
next_mask: u8, // hashed byte after the n-gram
loc_mask: u8, // occurrence positions modulo 8
}
fn resolve_and_masked(query_ngrams) {
filter postings by next_mask
intersect doc_id sets
check loc_mask compatibility
verify candidates with regex
}
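The delta decode described above can be sketched with one common VByte convention, where the high bit marks the terminating byte of a value; the real vbyte.rs may flag continuation the other way around:

```rust
/// Decode one VByte-encoded u32: 7 payload bits per byte, least-significant
/// group first, high bit set on the final byte. Returns (value, bytes used).
fn vbyte_decode(buf: &[u8]) -> (u32, usize) {
    let (mut value, mut shift, mut i) = (0u32, 0u32, 0usize);
    loop {
        let b = buf[i];
        value |= ((b & 0x7f) as u32) << shift;
        i += 1;
        if b & 0x80 != 0 {
            return (value, i);
        }
        shift += 7;
    }
}

/// Turn a delta-encoded posting stream into absolute file IDs by keeping a
/// running sum, as the reader does while intersecting lists.
fn decode_postings(mut buf: &[u8]) -> Vec<u32> {
    let (mut ids, mut current) = (Vec::new(), 0u32);
    while !buf.is_empty() {
        let (delta, used) = vbyte_decode(buf);
        current += delta;
        ids.push(current);
        buf = &buf[used..];
    }
    ids
}
```

Adjacent file IDs produce small deltas that fit in one byte each, which is where the ~50-60% size reduction quoted for postings.bin comes from.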