Performance

From 18ms to 0.9ms — the complete optimization history across five versions.

- 29/30: wins vs ripgrep
- 93.5%: token savings (v10)
- 4.1x: more compact than RTK
- 0.71ms: daemon mode p50

Benchmark Table — v1.1.0

Measured across 5 real codebases (915-11,350 files). Hardware: Apple M4 Max. Best of 3 runs. ig v1.1.0 vs ripgrep 15.1.

| Query | Candidates | ig | rg | Speedup |
|---|---|---|---|---|
| "function" (11K files) | 744 / 1148 | 33ms | 39ms | 1.2x |
| "class\s+\w+" (11K files) | 836 / 1148 | 29ms | 34ms | 1.2x |
| "deprecated" (11K files) | 5 / 1148 | 21ms | 31ms | 1.5x |
| no-match (11K files) | 0 / 1148 | 20ms | 30ms | 1.5x |
| "import" (11K files) | — | 24ms | 32ms | 1.3x |
| "function" (3K files) | — | 40ms | 43ms | 1.1x |
| "class\s+\w+" (3K files) | — | 33ms | 44ms | 1.3x |
| "deprecated" (3K files) | — | 22ms | 37ms | 1.7x |
| "function" (2.5K files) | — | 40ms | 47ms | 1.2x |
| "todo" -i (2.5K files) | — | 21ms | 36ms | 1.7x |

* v1.1.0 wins on all patterns including complex regex. Lazy line_starts + raised escape hatch (85%) keep the indexed path efficient.

Daemon Mode — Server-side Search Times

Pure search time inside the daemon process (no process startup, no socket overhead). Laravel app, 1,552 files.

| Query | Candidates | Search time |
|---|---|---|
| "DistributionController" | 2 / 1552 | 0.08ms |
| "function" | 974 / 1552 | 4.43ms |
| "exception" | 39 / 1552 | 0.75ms |
| "ZZZNOTFOUND" (zero results) | 0 / 1552 | 0.01ms |
| "middleware" | 30 / 1552 | 0.74ms |
| "Route::" | 4 / 1552 | 0.20ms |
| "class " | 897 / 1552 | 3.88ms |

The daemon keeps the index mmap'd in process memory — rare/selective patterns like "DistributionController" resolve in 0.08ms. Even zero-result queries short-circuit in 0.01ms.
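The request path can be sketched in a few lines. Everything here is illustrative: a `socketpair` stands in for ig's Unix socket, and a plain dict stands in for the mmap'd lexicon and postings; the real wire protocol is not shown in this document.

```python
import socket
import threading

# Hypothetical in-memory index (pattern -> matching files), standing in
# for the mmap'd lexicon + postings the real daemon keeps resident.
INDEX = {"Route::": ["routes/web.php", "routes/api.php"]}

def serve_one(conn: socket.socket) -> None:
    """Answer a single query from memory: no index load, no startup cost."""
    with conn:
        query = conn.recv(4096).decode().strip()
        conn.sendall("\n".join(INDEX.get(query, [])).encode())

# socketpair stands in for the AF_UNIX listener a real daemon would own.
server, client = socket.socketpair()
threading.Thread(target=serve_one, args=(server,)).start()

client.sendall(b"Route::\n")
client.shutdown(socket.SHUT_WR)                         # done sending
reply = b"".join(iter(lambda: client.recv(4096), b""))  # read until EOF
print(reply.decode().splitlines())  # ['routes/web.php', 'routes/api.php']
```

Because the answer comes straight from resident memory, the only real costs are the two socket hops and the lookup itself, which is why rare patterns resolve in a fraction of a millisecond.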

Index Build — SPIMI Pipeline

v1.0.0 uses a SPIMI (Single-Pass In-Memory Indexing) pipeline with a 128MB RAM budget. VByte delta-encoded postings reduce postings.bin by ~50-60%.
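The size win from VByte delta coding is easy to see in a sketch (illustrative Python, not ig's on-disk layout): sorted doc IDs are stored as gaps, and each gap costs one byte per 7 bits.

```python
def vbyte_encode(doc_ids: list[int]) -> bytes:
    """Delta-encode a sorted posting list, then varint-encode each gap."""
    out, prev = bytearray(), 0
    for doc_id in doc_ids:
        gap = doc_id - prev
        prev = doc_id
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)  # continuation bit set
            gap >>= 7
        out.append(gap)                      # final byte, high bit clear
    return bytes(out)

def vbyte_decode(data: bytes) -> list[int]:
    ids, cur, shift, prev = [], 0, 0, 0
    for b in data:
        cur |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7
        else:
            prev += cur                      # undo the delta
            ids.append(prev)
            cur, shift = 0, 0
    return ids

postings = [3, 7, 1000, 1001, 50000]
encoded = vbyte_encode(postings)
assert vbyte_decode(encoded) == postings
print(len(encoded))  # 8 bytes instead of 5 fixed-width integers
```

Small gaps dominate real posting lists, so most entries fit in a single byte; that is where savings on the order of the quoted ~50-60% come from.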

- 480ms: fresh build (1,552 files), 2 SPIMI segments
- 28ms: incremental no-op (overlay index)
- 36MB: total index size (lexicon 31MB + postings 7.1MB + metadata 111KB)
- 440MB: peak RSS (streaming pipeline)

Ecosystem Comparison

Where ig fits in the indexed-search landscape.

ig (you are here)

Sparse n-grams (Blackbird algorithm). Local CLI, open source, standalone. Built for agent and editor-adjacent workflows.

Cursor

Same algorithm (sparse n-grams). Integrated in editor, closed source. ig brings this power to the CLI.

Reflex

Trigrams + Tree-sitter for semantic awareness. Rust, open source. AST-aware queries at the cost of indexing complexity.

Zoekt

Trigrams, server-based (Sourcegraph). Open source. Designed for large-scale monorepo search over HTTP.

ripgrep

Brute-force regex, no index. Fastest unindexed tool. Great for small repos; ig outpaces it as repo size grows.

Trigrep

Trigrams (Moderne). Cloud-based, closed source. Enterprise SaaS — ig is the self-hosted, open alternative.

Optimization Timeline

V1 — Naive trigrams 18ms

Baseline implementation. Full trigram scan on every query.

V2 — Index + verify 8ms

Index-based lookup reduces candidates before regex verification.

V3 — Sparse n-grams 3ms

Sparse n-gram algorithm replaces fixed-width trigrams. Variable-length n-grams dramatically improve selectivity.

V4 — Covering + mmap 1.4ms

Covering set selection picks minimal n-gram set. Memory-mapped index avoids loading into RAM.

V5 — Parallel verify 0.9ms

Rayon-powered parallel regex verification across candidate files.

Daemon mode 0.3ms

Index held permanently in process memory. Unix socket queries bypass all startup.

Why Sparse N-grams Beat Trigrams

Selectivity

Longer n-grams appear in fewer files. A 7-gram like "async f" eliminates 99% of files immediately. Trigrams are too common.

Index Size

Fewer, more selective n-grams per file — smaller posting lists — smaller index. ig's index is typically 2-3MB for a 1000+ file project.

Intersection

Fewer candidates to intersect means less time in the merge step. Covering set selection picks the most selective n-grams automatically.

Selectivity comparison for "async fn":
Trigrams (weak):   "asy"(3000 files) ∩ "syn"(2100 files) ∩ "ync"(1800 files) = 47 candidates
Sparse n-grams:   "async fn"(3 files) = 3 candidates  → 15x fewer candidates to verify
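ig's exact covering-set heuristic is not documented here; a greedy sketch under the usual assumption (repeatedly take the rarest indexed n-gram that still covers an uncovered position of the literal), with hypothetical document frequencies:

```python
# Hypothetical document frequencies for n-grams of the query "async fn".
DF = {"asy": 3000, "syn": 2100, "ync": 1800, "nc ": 900,
      "c f": 400, " fn": 120, "async": 40, "sync f": 11, "async fn": 3}

def covering_set(literal: str, df: dict[str, int]) -> list[str]:
    """Greedy cover: take the rarest indexed gram that covers some
    still-uncovered position of the literal, until it is fully covered."""
    uncovered = set(range(len(literal)))
    chosen = []
    # All indexed substrings of the literal, rarest first.
    grams = sorted((g for g in df if g in literal), key=df.get)
    for g in grams:
        start = literal.find(g)
        span = set(range(start, start + len(g)))
        if span & uncovered:
            chosen.append(g)
            uncovered -= span
        if not uncovered:
            break
    return chosen

print(covering_set("async fn", DF))  # ['async fn']
```

A real implementation also bounds the gram length and needs a fallback when no selective gram exists, which is plausibly what the escape hatch mentioned elsewhere on this page handles.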

Daemon Latency Breakdown

Six stages from query to result. Total p50 latency: 0.71ms.

| Stage | Time |
|---|---|
| Unix socket read (query) | ~0.02ms |
| N-gram extraction from regex | ~0.05ms |
| Hash table lookup (lexicon in memory) | ~0.03ms |
| Posting list intersection | ~0.05ms |
| Parallel regex verification | ~0.10ms |
| Unix socket write (results) | ~0.05ms |
| Total (p50) | 0.71ms |

Scaling Curve

ig's advantage compounds as project size grows. On small repos it roughly matches rg, and at mid-size the indexed path can even trail rg, as the 1,552-file row shows; on large projects the index dominates.

| Project | Files | ig search | rg search | Speedup |
|---|---|---|---|---|
| laravel-app | 49 | 19ms | 21ms | 1.1x |
| distribution-app | 1,552 | 70ms | 33ms | 0.5x |
| Next.js | 24,760 | 627ms | 1,490ms | 2.4x |
| Linux kernel | 92,585 | 1,290ms | 5,119ms | 4.0x |

On the Linux kernel (92K files), a zero-result search: 28ms ig vs 5,279ms rg = 189x speedup.

Daemon Latency Percentiles

1,001 queries across all patterns. Overall: p50 = 0.71ms, p95 = 4.51ms, p99 = 4.69ms, avg = 1.51ms.

| Pattern | p50 |
|---|---|
| "ZZZNOTFOUND" | 0.01ms |
| "DistributionController" | 0.08ms |
| "Route::" | 0.20ms |
| "middleware" | 0.74ms |
| "exception" | 0.75ms |
| "class " | 3.88ms |
| "function" | 4.43ms |

Throughput

End-to-end QPS includes process startup; server-side QPS measures pure daemon throughput.

- 312 QPS: effective, with process startup (CLI round-trip throughput)
- 2,695 QPS: server-side theoretical (pure daemon query throughput)

v1.1.0 — What Changed

v1.1.0 focuses on search performance and correctness. ig now beats ripgrep on 29/30 benchmark tests.

Lazy line_starts

Regex match check runs before building the line index. False-positive candidates (no matches) bail immediately with zero byte scanning. This is the single biggest win for complex regex patterns like class\s+\w+.
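The idea can be sketched as follows (illustrative Python, not ig's Rust implementation): run the regex over the raw bytes first, and only build the line-offset table on the match path.

```python
import bisect
import re

def search_file(content: bytes, pattern: bytes) -> list[tuple[int, bytes]]:
    """Run the regex first; build line_starts only if something matched."""
    matches = list(re.finditer(pattern, content))
    if not matches:
        # False-positive candidate: bail with zero newline scanning.
        return []
    # Byte offset where each line begins (paid only on the match path).
    line_starts = [0] + [i + 1 for i, b in enumerate(content) if b == 0x0A]
    hits = []
    for m in matches:
        line_no = bisect.bisect_right(line_starts, m.start())  # 1-based
        start = line_starts[line_no - 1]
        end = content.find(b"\n", start)
        hits.append((line_no, content[start:end if end != -1 else len(content)]))
    return hits

src = b"fn main() {}\nclass Foo {}\npub class Bar {}\n"
print(search_file(src, rb"class\s+\w+"))
print(search_file(src, rb"deprecated"))  # []
```

The ordering matters precisely because most candidates from an n-gram index that fail verification contain the grams but not the full pattern; for those files the function returns before any per-line work happens.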

Early exit modes

-l (files_only) returns after the first match — no iteration. -c (count_only) counts without building line_starts. Both are now significantly faster on files with many matches.
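Both modes can be sketched together (illustrative, not ig's code): `-l` needs only a boolean per file, and `-c` needs only match counts, so neither ever touches line offsets.

```python
import re

def files_only(files: dict[str, bytes], pattern: bytes) -> list[str]:
    """-l sketch: one re.search per file, stopping at the first hit."""
    return [name for name, data in files.items() if re.search(pattern, data)]

def count_only(data: bytes, pattern: bytes) -> int:
    """-c sketch: count matches without computing any line index."""
    return sum(1 for _ in re.finditer(pattern, data))

files = {
    "a.rs": b"fn one() {}\nfn two() {}\n",
    "b.rs": b"// no such pattern here\n",
}
print(files_only(files, rb"fn \w+"))          # ['a.rs']
print(count_only(files["a.rs"], rb"fn \w+"))  # 2
```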

Single-file search

ig "pattern" file.rs now scopes to that exact file instead of searching the entire project. Works with special characters like $.tsx.

Parallel fallback + madvise

Brute-force fallback now uses rayon (was sequential). Binary check reads 8KB instead of full file. madvise(MADV_RANDOM) on postings prevents useless readahead. Escape hatch raised to 85%.

Overlay vs Full Rebuild

Incremental indexing via overlay layers keeps re-index cost near-constant for typical edit sessions.

| Changed files | Time | Mode |
|---|---|---|
| 0 | 28ms | no-op |
| 1 | 28ms | no-op |
| 10 | 76ms | overlay |
| 50 | 88ms | overlay |
| 100 | 91ms | overlay |
| 1,552 | 568ms | full rebuild |
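The overlay mechanics are not specified here; one common design, sketched below, keeps a small second index layer that shadows the base index for any file re-indexed since the last full build. `OverlayIndex`, its gram granularity, and the postings shapes are all illustrative.

```python
class OverlayIndex:
    """Toy overlay: base maps gram -> {file: offsets}; edits go into a
    small overlay layer; lookups merge the two, with the overlay
    shadowing the base for any file it has re-indexed."""
    def __init__(self, base: dict[str, dict[str, list[int]]]):
        self.base = base
        self.overlay: dict[str, dict[str, list[int]]] = {}
        self.dirty: set[str] = set()   # files whose base entries are stale

    def reindex_file(self, path: str, grams: dict[str, list[int]]) -> None:
        self.dirty.add(path)
        for gram, offsets in grams.items():
            self.overlay.setdefault(gram, {})[path] = offsets

    def lookup(self, gram: str) -> dict[str, list[int]]:
        merged = {p: off for p, off in self.base.get(gram, {}).items()
                  if p not in self.dirty}          # drop shadowed entries
        merged.update(self.overlay.get(gram, {}))
        return merged

base = {
    "fn ":  {"main.rs": [0, 40], "lib.rs": [12]},
    "old_": {"main.rs": [7]},          # gram that the edit removed
}
idx = OverlayIndex(base)
idx.reindex_file("main.rs", {"fn ": [0]})
print(idx.lookup("fn "))   # {'lib.rs': [12], 'main.rs': [0]}
print(idx.lookup("old_"))  # {} -- stale base entry is shadowed away
```

Because only changed files pay indexing cost, re-index time grows with the edit set, not the repo, which matches the near-constant overlay rows above.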

Token Cost

ig's index pre-filters candidates before returning results, dramatically reducing the tokens an AI agent must process.

| Pattern | ig tokens | rg tokens | Ratio |
|---|---|---|---|
| "function" | 231K | 691K | 3.0x |
| "middleware" | 3.4K | 12.7K | 3.8x |
| "class " | 30.8K | 189K | 6.1x |

v1.3 — Token Optimization Benchmarks

v1.3 adds RTK integration and an enriched context.md. Measured on a real Next.js project: time-to-first-result and total API cost for a typical agent session.

| Setup | Session time | API cost | Notes |
|---|---|---|---|
| ig v1.3 + RTK | 18s | $0.24 | Rewrites + context.md + ig search |
| Baseline (grep/find) | 67s | $0.32 | No rewriting, naive find/grep |
| RTK only | 94s | $0.37 | Command rewriting without ig index |

context.md compression

The v1.3 enriched context.md (dependencies, API routes, env keys) gives agents more signal with fewer tokens than a naive directory listing.

1,517 lines -> 263 lines (83% smaller)

Binary size (v1.3)

strip=true + panic=abort in the release profile halved the binary. Signed releases ship for 4 targets via GitHub Actions.

~8 MB -> 4.1 MB (linux/macOS × x64/arm64)

v1.4 — Token Savings

v1.4 introduces the git proxy, session discovery, and deep hook coverage. Real numbers from production Claude Code sessions — measured end-to-end.

- 76%: overall savings ratio
- 1.4M: tokens saved
- -94%: git status compression
- 63/65: tests passing

Git Proxy Compression

ig git rewrites common git commands through RTK — filtering noise, squashing verbose output, and tracking savings. Benchmarked on a real monorepo.

| Command | Native | ig git | Savings |
|---|---|---|---|
| git status | 732B | 127B | -83% |
| git log -10 | 8,861B | 997B | -89% |
| git show HEAD | 11,920B | 5,812B | -51% |
| git diff | 26,288B | 6,906B | -74% |
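The exact rewrites are internal to ig git; as an illustration of the kind of compression involved, here is a sketch that collapses porcelain-style `git status` output into a per-state summary. `squash_status` is a hypothetical helper, not an ig function.

```python
def squash_status(porcelain: str) -> str:
    """Collapse `git status --porcelain` lines into a one-line summary
    (a stand-in for the kind of rewrite a git proxy can apply)."""
    counts: dict[str, int] = {}
    for line in porcelain.splitlines():
        code = line[:2].strip() or "??"       # two-char XY status field
        counts[code] = counts.get(code, 0) + 1
    return " ".join(f"{code}:{n}" for code, n in sorted(counts.items()))

raw = " M src/main.rs\n M src/lib.rs\nA  src/new.rs\n?? notes.txt\n"
print(squash_status(raw))  # ??:1 A:1 M:2
```

The agent usually only needs "what changed and how much", so a summary like this preserves the signal while dropping most of the bytes.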

Real Projects Benchmark

Index build time, search latency, and git status savings across four real-world codebases. Symbols count reflects full API surface extracted by ig symbols.

| Project | Files | Index | Search | git status | Symbols |
|---|---|---|---|---|---|
| Laravel app | 1,609 | 226ms | 23ms | -95% | 4,834 |
| Monorepo | 3,084 | 483ms | 50ms | -51% | 7,702 |
| Rust CLI | 87 | 95ms | 9ms | -84% | 541 |
| TypeScript CLI | 35 | 30ms | 6ms | -83% | 150 |

Explorer Optimization

Comparing exploration strategies on a real codebase. The ig-optimized approach reads 121 files in 10 requests at 170ms — the agent+ig v3 combo adds cross-file reasoning at still-minimal cost.

| Approach | Files | Requests | Time |
|---|---|---|---|
| Manual ig | 6 | 4 | ~5s |
| Agent sequential | ~35 | 69 | ~120s |
| ig optimized | 121 | 10 | 170ms |
| Agent + ig v3 | 121 | 14 | ~60s |

Opus 4.6 Session Impact

Measured across a full Claude Opus 4.6 development session — code search, file reads, git operations. ig cuts total token consumption by 70%.

- Without ig: ~240,500 tokens / session
- With ig: ~72,200 tokens / session (-70%)

The savings compound: ig search returns only matching lines, the git proxy strips verbose output, and the SubagentStart hook pre-injects project context — so the agent never reads files it doesn't need.

Concurrent Queries

The daemon handles concurrent clients without contention — the index is read-only and mmap-shared.

- 10 concurrent clients (parallel connections)
- 200 total queries (evenly distributed)
- 0 errors (zero dropped queries)
- 266 QPS throughput (effective under load)

Latency under concurrency: p50 = 0.97ms, p95 = 5.87ms.

v1.6.23 — 100-Command Benchmark

Full-spectrum benchmark across 100 real commands — search, file listing, smart reading, git, and directory listing. Every command category contributes to the aggregate savings total.

- -93.5%: total savings (100 cmds)
- 3.7 MB: raw output
- 241K: ig output
- x300: best (ig files --compact)

| Category | Raw | ig | Savings |
|---|---|---|---|
| Search --compact (19 patterns) | 2.3 MB | 108K | -95% |
| Files --compact (14 listings) | 597K | 2.2K | -99.6% |
| Read -s (10 files) | 259K | 28K | -89% |
| Read -a (10 files) | 259K | 39K | -85% |
| Read -b500 (10 files) | 259K | 32K | -88% |
| Git (13 commands) | 60K | 32K | -47% |
| ls (5 listings) | 4.3K | 758B | -83% |
| Total (100 commands) | 3.7 MB | 241K | -93.5% |

Files --compact

Full directory listings compressed to just file paths — 269K of raw output distilled into 896 bytes.

269K -> 896B (x300)
Search --compact

19 search patterns across a real codebase — the indexed path eliminates candidates before they even reach the output buffer.

2.3 MB -> 108K (avg 95% savings)
Budget mode (-b 500)

Read with a byte budget — the agent gets the most relevant lines first, never exceeding the token window.

259K -> 32K (avg 88% compression)
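A byte-budget read can be sketched as: take at most N bytes, then truncate back to the last complete line so the agent never receives a torn line. This is illustrative only; ig's actual `-b` semantics may differ.

```python
def read_with_budget(content: bytes, budget: int) -> bytes:
    """Return at most `budget` bytes, cut at the last complete line."""
    if len(content) <= budget:
        return content
    cut = content.rfind(b"\n", 0, budget + 1)   # last newline within budget
    return content[:cut + 1] if cut != -1 else b""

doc = b"line one\nline two\nline three\n"
print(read_with_budget(doc, 20))   # b'line one\nline two\n'
print(read_with_budget(doc, 100))  # whole file fits the budget
```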

v10 — 100-Command Benchmark

Full-spectrum benchmark across 100 real commands — search, file listing, smart reading, git, and misc. Compared side-by-side with RTK (raw terminal output).

- 93.5%: total savings (100 cmds)
- 4.1x: more compact than RTK
- 2.0x: overall compression vs RTK
- 320: tests passing

| Category | Raw | ig | RTK | ig savings |
|---|---|---|---|---|
| Search (25) | 2.3 MB | 122K | 254K | -95% |
| Files (15) | 597K | 2.2K | 2.5K | -99.6% |
| Read -s (10) | 259K | 28K | 248K | -89% |
| Read -a (10) | 259K | 39K | 11K | -85% |
| Read -b100 (10) | 259K | 5K | 11K | -98% |
| Git (15) | 60K | 32K | 32K | -47% |
| Misc (15) | varies | 80K | 94K | varies |
| TOTAL (100 commands) | 5.0 MB | 329K | 658K | -93.5% |

Real Session Simulation

End-to-end simulation of a realistic AI agent session — 30 tool calls across a real Laravel codebase. Measures total bytes sent to the LLM context window.

- Task: Add a Stripe webhook for failed payments
- Project: distribution-app (1,552 files, Laravel PHP)
- Tool calls: 30

| Metric | Raw | ig | RTK |
|---|---|---|---|
| Bytes | 612K | 73K | 298K |
| Tokens | 157,000 | 19,000 | 76,000 |
| Savings | — | 88.1% | 51.4% |

Category Breakdown

| Category | ig | RTK | Ratio |
|---|---|---|---|
| Search | 30K | 41K | 1.4x (ig wins) |
| Read signatures | 25K | 221K | 8.8x (ig wins) |
| Read budget | 2K | 6K | 3x (ig wins) |
| Git + ls | 2K | 2K | tie |

Linux Kernel Stress Test

92,585 files indexed. 127M n-grams. 3.4GB index on disk. The ultimate ig stress test.

- 92,585: files indexed
- 127M: n-grams
- 6,820 MB: peak RSS

| Query | ig | rg | Speedup |
|---|---|---|---|
| "printk" | 332ms | 5,426ms | 16.3x |
| "static inline" | 876ms | 5,106ms | 5.8x |
| "mutex_lock" | 317ms | 5,467ms | 17.2x |
| "EXPORT_SYMBOL" | 398ms | 5,342ms | 13.4x |
| "ZZNOTFOUND" (zero results) | 28ms | 5,279ms | 188.5x |