QMD - Query Markup Documents

Website: https://github.com/tobi/qmd
CLI Tool: qmd
Package: @tobilu/qmd

Description

QMD is an on-device search engine for everything you need to remember. Index your markdown notes, meeting transcripts, documentation, and knowledge bases, then search them with keywords or natural language. QMD combines BM25 full-text search, vector semantic search, and LLM re-ranking, all running locally via node-llama-cpp with GGUF models. Ideal for agentic flows.

Installation

# Install globally
npm install -g @tobilu/qmd
# or
bun install -g @tobilu/qmd

# Or run directly
npx @tobilu/qmd ...
bunx @tobilu/qmd ...

Requires Node.js >= 22 or Bun >= 1.0.0.

Commands

Collection Management

Add Collection

qmd collection add . --name myproject
qmd collection add ~/Documents/notes --name notes
qmd collection add ~/work/docs --name docs --mask "**/*.md"

Create a collection from a directory with optional glob mask.
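
When several directories need registering, the add command can be wrapped in a loop. The following is a minimal sketch, not part of QMD: the helper name add_collections, the choice of naming each collection after its directory's basename, and the QMD override variable (set QMD=echo to preview the commands) are all assumptions made for illustration.

```shell
# Hypothetical helper: register each directory as a collection named after
# its basename, indexing only markdown files.
# Set QMD=echo to print the commands instead of running qmd.
add_collections() {
  for dir in "$@"; do
    name=$(basename "$dir")
    "${QMD:-qmd}" collection add "$dir" --name "$name" --mask "**/*.md"
  done
}

# Example (dry run):
#   QMD=echo; add_collections ~/notes ~/work/docs
```

Deriving the name from the basename keeps the call short, but two directories with the same final component would collide, so pass explicit names with --name in that case.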

List Collections

qmd collection list

List all collections.

Remove Collection

qmd collection remove myproject

Remove a collection.

Rename Collection

qmd collection rename myproject my-project

Rename a collection.

List Files

qmd ls notes
qmd ls notes/subfolder

List files in a collection.

Context Management

Add Context

qmd context add qmd://notes "Personal notes and ideas"
qmd context add qmd://meetings "Meeting transcripts and notes"
qmd context add qmd://docs "Work documentation"
qmd context add qmd://docs/api "API documentation"
qmd context add / "Knowledge base for my projects"

Add descriptive metadata to collections and paths. Context is returned with search results.

List Contexts

qmd context list

List all contexts.

Remove Context

qmd context rm qmd://notes/old

Remove a context.

Embedding

Generate Embeddings

qmd embed
qmd embed -f
qmd embed --chunk-strategy auto

Generate vector embeddings for semantic search. Use -f to force re-embedding and --chunk-strategy auto for AST-aware code chunking.
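
The Notes section gives the chunking parameters as roughly 900 tokens with 15% overlap. To get a feel for how many chunks (and therefore embeddings) a document produces, here is a rough sketch of the arithmetic for a plain sliding window. It is an approximation only: QMD also respects markdown structure, so real chunk boundaries will differ, and the function name chunk_count is hypothetical.

```shell
# chunk_count TOTAL_TOKENS [CHUNK_SIZE] [OVERLAP_PERCENT]
# Estimates the number of chunks for a fixed-size sliding window.
# With the defaults, overlap is 135 tokens and each new chunk starts
# 765 tokens after the previous one.
chunk_count() {
  total=$1
  size=${2:-900}
  pct=${3:-15}
  overlap=$(( size * pct / 100 ))
  stride=$(( size - overlap ))
  if [ "$total" -le "$size" ]; then
    echo 1
  else
    # One full chunk, plus enough strides to cover the remainder (rounded up).
    echo $(( 1 + (total - size + stride - 1) / stride ))
  fi
}

# Example: chunk_count 3000   (estimate for a 3000-token document)
```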

Search Commands

Full-Text Search (BM25)

qmd search "authentication flow"
qmd search "API" -c notes
qmd search "error handling" -n 10
qmd search "API" --all --files --min-score 0.3

Fast keyword-based BM25 full-text search.

Vector Search

qmd vsearch "how to login"
qmd vsearch "how to deploy"

Semantic similarity search using embeddings.

Hybrid Query (Best Quality)

qmd query "user authentication"
qmd query "quarterly planning process"
qmd query -n 10 --min-score 0.3 "API design patterns"
qmd query --json --explain "quarterly reports"

Hybrid search: FTS + Vector + Query Expansion + LLM Re-ranking.
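
The Notes section says the keyword and vector result lists are merged with Reciprocal Rank Fusion (RRF). As an illustration of the basic technique, the sketch below scores each document as the sum of 1 / (k + rank) over every list it appears in. This is plain RRF only: the constant k=60 is a commonly used default and an assumption here, the function name rrf_fuse is hypothetical, and QMD's position-aware blending is not reproduced.

```shell
# rrf_fuse FILE...
# Each FILE is one ranked result list: one document id per line, best first.
# Prints "score<TAB>id", highest fused score first.
rrf_fuse() {
  tab=$(printf '\t')
  LC_ALL=C awk -v k="${RRF_K:-60}" '
    { score[$0] += 1 / (k + FNR) }   # FNR is the rank within the current list
    END { for (id in score) printf "%.5f\t%s\n", score[id], id }
  ' "$@" | LC_ALL=C sort -t "$tab" -k1,1nr -k2,2
}

# Example: rrf_fuse bm25.txt vector.txt
```

A document that ranks reasonably well in both lists can outscore one that ranks first in only a single list, which is the behavior that makes rank fusion useful for combining keyword and semantic results.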

Document Retrieval

Get Document

qmd get "meetings/2024-01-15.md"
qmd get "#abc123"
qmd get notes/meeting.md:50 -l 100
qmd get <file> --full
qmd get <file> --line-numbers

Get a document by path or docid. Supports line ranges.

Multi-Get Documents

qmd multi-get "journals/2025-05*.md"
qmd multi-get "doc1.md, doc2.md, #abc123"
qmd multi-get "docs/*.md" --max-bytes 20480
qmd multi-get "docs/*.md" --json

Batch retrieve by glob pattern or comma-separated list.

Index Maintenance

Update Index

qmd update
qmd update --pull

Re-index all collections. Use --pull to git pull first.

Status

qmd status

Show index status and collections with contexts.

Cleanup

qmd cleanup

Clean up cache and orphaned data.

MCP Server

Start MCP Server

qmd mcp
qmd mcp --http
qmd mcp --http --port 8080
qmd mcp --http --daemon
qmd mcp stop

Start MCP server for AI agent integration. Supports stdio (default) and HTTP transport.

Output Formats

--files            # Output: docid,score,filepath,context
--json             # JSON output with snippets
--csv              # CSV output
--md               # Markdown output
--xml              # XML output
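
The --files format is convenient for scripting because each result is one line. A small sketch of post-filtering such lines follows. It assumes the four fields appear in the order shown above, that fields are not quoted, and that only the trailing context field may contain commas; the function name files_above is hypothetical, and the sample line in the comment is made up.

```shell
# files_above MIN_SCORE
# Reads lines shaped like "docid,score,filepath,context" on stdin and prints
# the filepath of every line whose score is at least MIN_SCORE.
files_above() {
  LC_ALL=C awk -F, -v min="$1" '$2 + 0 >= min + 0 { print $3 }'
}

# Example with a made-up line:
#   printf '%s\n' '#abc123,0.82,docs/api.md,API documentation' | files_above 0.5
# Example with qmd:
#   qmd search "API" --all --files | files_above 0.5
```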

Examples

Quick Start

# Create collections for your notes, docs, and meeting transcripts
qmd collection add ~/notes --name notes
qmd collection add ~/Documents/meetings --name meetings
qmd collection add ~/work/docs --name docs

# Add context to help with search results
qmd context add qmd://notes "Personal notes and ideas"
qmd context add qmd://meetings "Meeting transcripts and notes"
qmd context add qmd://docs "Work documentation"

# Generate embeddings for semantic search
qmd embed

# Search across everything
qmd search "project timeline"           # Fast keyword search
qmd vsearch "how to deploy"             # Semantic search
qmd query "quarterly planning process"  # Hybrid + reranking (best quality)

Agent Integration

# Get structured results for an LLM
qmd search "authentication" --json -n 10

# List all relevant files above a threshold
qmd query "error handling" --all --files --min-score 0.4

# Retrieve full document content
qmd get "docs/api-reference.md" --full

# Output as markdown for LLM context
qmd search --md --full "error handling"

MCP Configuration

Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "qmd": {
      "command": "qmd",
      "args": ["mcp"]
    }
  }
}

Claude Code plugin:

claude plugin marketplace add tobi/qmd
claude plugin install qmd@qmd

HTTP MCP Server

# Start HTTP server (models stay loaded in VRAM)
qmd mcp --http                    # localhost:8181
qmd mcp --http --daemon           # background daemon
qmd mcp stop                      # stop daemon

# Endpoints:
# POST /mcp — MCP Streamable HTTP
# GET /health — liveness check
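
# After starting the daemon, a script may want to wait until the server
# answers before sending MCP requests. A minimal sketch, assuming GET /health
# returns a success status once the server is up (the response body is not
# inspected). The function name and the CURL and WAIT_SECONDS override
# variables are hypothetical.

```shell
# wait_for_qmd [URL] [TRIES]
# Polls the health endpoint until it succeeds or TRIES attempts have failed.
wait_for_qmd() {
  url=${1:-http://localhost:8181/health}
  tries=${2:-10}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl exit non-zero on HTTP error statuses.
    if ${CURL:-curl} -fsS -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    i=$(( i + 1 ))
    sleep "${WAIT_SECONDS:-1}"
  done
  return 1
}

# Example:
#   qmd mcp --http --daemon && wait_for_qmd && echo "qmd is ready"
```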

Custom Embedding Model

# Use Qwen3-Embedding for better multilingual (CJK) support
export QMD_EMBED_MODEL="hf:Qwen/Qwen3-Embedding-0.6B-GGUF/Qwen3-Embedding-0.6B-Q8_0.gguf"

# Re-embed after changing model
qmd embed -f
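
Because a forced re-embed is needed whenever the model changes, a wrapper can remember which model was used last and only re-embed on a change. This is a sketch under stated assumptions: the stamp file and its default location, the QMD_MODEL_STAMP and QMD override variables, and the function name are all hypothetical and not part of QMD.

```shell
# reembed_if_model_changed
# Runs "qmd embed -f" only when QMD_EMBED_MODEL differs from the value
# recorded in a stamp file by the previous successful run.
reembed_if_model_changed() {
  stamp=${QMD_MODEL_STAMP:-$HOME/.qmd-embed-model}   # hypothetical location
  current=${QMD_EMBED_MODEL:-default}
  previous=$(cat "$stamp" 2>/dev/null || true)
  if [ "$current" != "$previous" ]; then
    # Record the model only if the re-embed succeeded.
    "${QMD:-qmd}" embed -f && printf '%s\n' "$current" > "$stamp"
  fi
}
```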

Editor Link Configuration

# VS Code (default)
export QMD_EDITOR_URI="vscode://file/{path}:{line}:{col}"

# Cursor
export QMD_EDITOR_URI="cursor://file/{path}:{line}:{col}"

# Zed
export QMD_EDITOR_URI="zed://file/{path}:{line}:{col}"
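
To check what a template will produce before relying on it, the placeholders can be expanded by hand. The sketch below performs a literal substitution of {path}, {line}, and {col}; it is an illustration of the template shape only. How QMD itself normalizes paths or fills in missing line and column values is not specified here, so the defaults of 1 and the function names are assumptions.

```shell
# replace_once STRING PLACEHOLDER VALUE
# Replaces the first occurrence of PLACEHOLDER using only parameter
# expansion, so characters such as "&" or "|" in VALUE are copied verbatim.
replace_once() {
  case $1 in
    *"$2"*) printf '%s%s%s\n' "${1%%"$2"*}" "$3" "${1#*"$2"}" ;;
    *)      printf '%s\n' "$1" ;;
  esac
}

# editor_uri PATH [LINE] [COL]
# Expands QMD_EDITOR_URI, falling back to the VS Code template.
editor_uri() {
  template=${QMD_EDITOR_URI:-}
  [ -n "$template" ] || template='vscode://file/{path}:{line}:{col}'
  uri=$(replace_once "$template" '{path}' "$1")
  uri=$(replace_once "$uri" '{line}' "${2:-1}")
  uri=$(replace_once "$uri" '{col}' "${3:-1}")
  printf '%s\n' "$uri"
}

# Example: editor_uri notes/todo.md 12 3
```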

Data Storage

Index stored in: ~/.cache/qmd/index.sqlite

Models cached in: ~/.cache/qmd/models/

Notes

Local-first: All processing runs locally via node-llama-cpp
Hybrid Search: Combines BM25 (keyword) + Vector (semantic) + LLM re-ranking
Query Expansion: Fine-tuned model generates query variations
RRF Fusion: Reciprocal Rank Fusion with position-aware blending
Smart Chunking: ~900 tokens with 15% overlap, respects markdown structure
AST-Aware: Tree-sitter chunking for code files (TS, JS, Python, Go, Rust)
Context System: Add descriptions to collections/paths for better results
Docid: 6-char hash for quick document retrieval (#abc123)
MCP Server: Exposes query, get, multi_get, status tools
HTTP Transport: Long-lived server keeps models loaded in VRAM
Models: embeddinggemma-300M (embed), qwen3-reranker-0.6b (rerank), qmd-query-expansion-1.7B (expand)
Multilingual: Use Qwen3-Embedding for multilingual content, including CJK (119 languages)
Output Formats: JSON, CSV, Markdown, XML, files list
TTY Links: Clickable paths open in configured editor
License: MIT
