RAG Document Search

Add documents and search using natural language - everything stays in your browser

Drop files here or click to browse

Supported formats: PDF, Word, Excel, PowerPoint, EPUB, ODT, ZIP, and 200+ text formats

Frequently Asked Questions

Is my data sent to a server?

No, absolutely not. Everything happens entirely in your browser. Your files are processed locally using WebAssembly and JavaScript. No data is transmitted to any server - your documents never leave your device.

This means you can safely use this tool with confidential or sensitive documents. Once the page and the AI model have loaded, the search keeps working even if you disconnect from the internet.

What file types are supported?

This tool supports a wide range of file formats:

  • Office Documents: PDF, Word (.docx), Excel (.xlsx), PowerPoint (.pptx), OpenDocument (.odt)
  • E-Books: EPUB
  • Archives: ZIP (all supported files inside are automatically extracted)
  • Text Documents: Plain text (.txt), Markdown (.md), HTML, XML
  • Code: JavaScript, TypeScript, Python, Java, C/C++, Rust, Go, Ruby, PHP, and 200+ more
  • Data: JSON, YAML, CSV, TOML, INI configuration files

Media files such as images, audio, and video are not supported. All document processing happens entirely in your browser - no server required.

What is the maximum file size?

There is no strict file size limit, but practical limits depend on your device:

  • Browser memory: Very large files may cause your browser to slow down or crash
  • Recommended: Individual files under 10 MB work best
  • Total size: Keep the combined size of all documents under about 100 MB for smooth performance

If you experience slowdowns, try closing other browser tabs or using fewer/smaller files.
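
The recommendations above can be expressed as a simple pre-flight check. This is only a sketch in Python for illustration, not the tool's own code (which runs as JavaScript and WebAssembly in the browser); the function name size_warnings and the choice of 1 MB = 1,048,576 bytes are assumptions.

```python
MB = 1024 * 1024            # assumption: 1 MB = 1,048,576 bytes
MAX_FILE_BYTES = 10 * MB    # recommended per-file limit from the FAQ
MAX_TOTAL_BYTES = 100 * MB  # recommended combined limit from the FAQ


def size_warnings(file_sizes: dict[str, int]) -> list[str]:
    """Return a warning for each recommendation that the given files exceed."""
    warnings = [
        f"{name} is larger than 10 MB"
        for name, size in file_sizes.items()
        if size > MAX_FILE_BYTES
    ]
    if sum(file_sizes.values()) > MAX_TOTAL_BYTES:
        warnings.append("combined size is larger than 100 MB")
    return warnings


# Hypothetical file names and sizes, for illustration only.
example = size_warnings({"report.pdf": 12 * MB, "notes.md": 20_000})
```

Here example contains a single warning for report.pdf, because only that file exceeds the per-file recommendation.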

What is RAG and how does it work?

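RAG stands for Retrieval-Augmented Generation, a technique in which relevant passages are first retrieved from your documents and then used to ground an answer. This tool focuses on the retrieval step: it helps you find relevant information in your documents using natural language questions.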

Here's how it works:

  1. Processing: When you add documents, they are split into smaller chunks and converted into mathematical representations called "embeddings"
  2. Searching: When you type a question, it's also converted into an embedding
  3. Matching: The system finds document chunks whose embeddings are most similar to your question

This allows you to search by meaning rather than exact keywords. For example, searching for "how to handle errors" will find content about "exception handling" or "error management" even if those passages never use the exact words of your query.
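
The matching step can be sketched as follows. The page does not say which similarity measure the tool uses; cosine similarity is assumed here because it is a common choice for sentence embeddings. The sketch is in Python for illustration only, and the three-dimensional vectors are made-up stand-ins for real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec: list[float], chunk_vecs: dict[str, list[float]]):
    """Return (chunk_id, similarity) pairs, best match first."""
    scored = [
        (chunk_id, cosine_similarity(query_vec, vec))
        for chunk_id, vec in chunk_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors, invented for illustration; real embeddings have far more dimensions.
chunks = {
    "exception handling": [0.9, 0.1, 0.0],
    "installation guide": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]  # pretend embedding of "how to handle errors"
ranking = rank_chunks(query, chunks)
```

With these toy vectors, the "exception handling" chunk ranks first because its vector points in nearly the same direction as the query vector.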

Why does adding documents require a download?

When you first add documents, the page downloads an AI model (approximately 130 MB) that runs entirely in your browser. The model used is paraphrase-multilingual-MiniLM-L12-v2, a sentence transformer that supports over 50 languages including English and German.

This model converts text into mathematical representations (embeddings) that capture semantic meaning, enabling you to search by concept rather than just keywords.

Good news: The model is cached in your browser, so future visits won't require another download.

How does the relevance scoring work?

The search uses a hybrid approach that combines two techniques:

  • Semantic Search (60%): Understands the meaning of your query. Finds content that is conceptually similar, even if the exact words don't match.
  • Keyword Search / BM25 (40%): Looks for exact word matches. Ensures that documents containing your specific search terms are boosted.
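
For readers curious about the keyword side, here is a minimal sketch of the standard Okapi BM25 formula in Python. It is not taken from the tool: the whitespace tokenizer, the parameter values k1 = 1.5 and b = 0.75, and the function name bm25_scores are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]  # naive whitespace tokenizer
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs if n_docs else 0.0

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    query_terms = set(query.lower().split())
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue  # term absent from this document
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Longer-than-average documents are penalized through b.
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


docs = [
    "error handling with exceptions",
    "installing the package",
    "error codes and error messages",
]
scores = bm25_scores("error", docs)
```

The second document scores 0 because it does not contain the query term, while the third scores highest because the term occurs twice. Raw BM25 scores are unbounded, so a hybrid search would typically normalize them before combining them with a similarity score.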

The relevance percentage shown for each result is this combined score. A document with a specific keyword match might appear high in results even if the semantic similarity alone would be low.
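
The 60/40 weighting can be illustrated with a few lines of Python. This is a sketch, not the tool's code: it assumes both input scores have already been normalized to the range 0 to 1, and the function names are hypothetical.

```python
SEMANTIC_WEIGHT = 0.6  # weight stated in the FAQ
KEYWORD_WEIGHT = 0.4   # weight stated in the FAQ


def hybrid_score(semantic: float, keyword: float) -> float:
    """Blend two scores, each assumed to be normalized to the range 0..1."""
    return SEMANTIC_WEIGHT * semantic + KEYWORD_WEIGHT * keyword


def as_percentage(score: float) -> int:
    """Convert a 0..1 score into a whole-number percentage."""
    return round(score * 100)


# A chunk with a strong keyword match but weak semantic similarity...
keyword_hit = as_percentage(hybrid_score(semantic=0.2, keyword=1.0))
# ...versus a chunk that is conceptually close but contains no query terms.
semantic_only = as_percentage(hybrid_score(semantic=0.7, keyword=0.0))
```

In this example keyword_hit is 52 and semantic_only is 42, which shows how an exact keyword match can lift a chunk above one that is semantically closer.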

What do 'Precise' and 'Context' mean?

Documents are automatically split into chunks at two granularity levels:

  • Precise: Smaller chunks (~400 characters) that are better for finding specific terms or short phrases. Ideal for keyword-focused searches.
  • Context: Larger chunks (~1200 characters) that capture more surrounding context. Better for conceptual or thematic searches.

The search automatically considers both granularities and adjusts scoring based on your query length: short queries favor precise chunks, longer questions favor context chunks.
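
Two-level chunking can be sketched like this. The page gives only the approximate chunk sizes, so the splitting strategy below (greedily packing whole words) is an assumption, as are the function names; the sketch is in Python for illustration only.

```python
PRECISE_SIZE = 400   # approximate size from the FAQ, in characters
CONTEXT_SIZE = 1200  # approximate size from the FAQ, in characters


def chunk_text(text: str, target_size: int) -> list[str]:
    """Greedily pack whole words into chunks of at most target_size characters.

    A single word longer than target_size becomes its own oversized chunk.
    """
    chunks = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) > target_size and current:
            chunks.append(current)  # current chunk is full, start a new one
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def chunk_document(text: str) -> dict[str, list[str]]:
    """Split the same text at both granularity levels."""
    return {
        "precise": chunk_text(text, PRECISE_SIZE),
        "context": chunk_text(text, CONTEXT_SIZE),
    }


sample = "word " * 600  # 600 four-letter words as stand-in content
levels = chunk_document(sample)
```

For this sample the text is split into 8 precise chunks and 3 context chunks, so every passage is searchable at both levels.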

Why does my search return no results?

Results only appear if they exceed a minimum relevance threshold. Here are some tips:

  • Use specific keywords: Include terms that are likely to appear in your documents.
  • Try different phrasings: The semantic search understands synonyms, but exact terms get an additional boost.
  • Keep queries focused: Very long or vague queries may dilute the relevance score.
  • Check the Debug Panel: Click "Show debug information" to see all chunks with their scores, including those below the threshold.
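
The effect of the threshold can be sketched as follows. The page does not state the actual threshold, so the value 0.30 below is purely hypothetical, as is the function name; the sketch is in Python for illustration only.

```python
MIN_RELEVANCE = 0.30  # hypothetical value; the real threshold is not documented here


def split_by_threshold(scored_chunks, min_relevance=MIN_RELEVANCE):
    """Separate chunks that exceed the threshold from those that do not.

    scored_chunks is a list of (chunk_id, score) pairs with scores in 0..1.
    Both returned lists are sorted with the best score first.
    """
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    shown = [pair for pair in ranked if pair[1] > min_relevance]
    hidden = [pair for pair in ranked if pair[1] <= min_relevance]
    return shown, hidden


# Made-up chunk ids and scores, for illustration only.
shown, hidden = split_by_threshold([("a", 0.52), ("b", 0.12), ("c", 0.31)])
```

Here chunks a and c would be displayed as results, while b would only be visible in a debug view. If every chunk scores at or below the threshold, shown is empty, which corresponds to a search with no results.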

Is this a production-ready solution?

This tool is a proof of concept demonstrating that sophisticated semantic search can run entirely in the browser without any server infrastructure. It showcases the potential of client-side AI for privacy-preserving document search.

For enterprise-grade implementations, multimodal textualization would be a typical enhancement - using OCR and vision models to extract and describe content from images, diagrams, charts, and scanned documents, making visual information searchable alongside text.

That said, this implementation already demonstrates the core principles and can handle real-world document collections effectively within browser constraints.