Local Doc Scanner

AI-Powered Document Q&A System

A completely offline, privacy-preserving document search and question-answering system using vector embeddings, local LLMs, and Microsoft Agent Framework.

Features

Completely Offline - All processing runs locally, no internet required after initial setup
Microsoft Agent Framework - Built on Microsoft.Agents.AI.Foundry for AI orchestration
Fast Search - Vector indexing for quick similarity search
AI-Powered - Local LLM integration for accurate, context-aware answers
Multi-Format - Support for .txt, .json, and .pdf files
Context-Aware Search - Uses document context for better answers
Audit Logs - Track all queries with detailed metrics
Modern UI - Clean, responsive interface built with Bootstrap 5

Screenshots



Landing Page
Upload Documents
Query Documents
Search Results - Success
Search Results - Failed
Audit Logs
Rescan Documents

Quick Start

# 1. Install PostgreSQL with pgvector extension
# 2. Install Ollama and pull models
ollama pull nomic-embed-text:latest
ollama pull llama3.2:3b

# 3. Run backend
cd LocalDocScannerBE
dotnet run --configuration Release

# 4. In another terminal, run frontend
cd LocalDocScannerUI
dotnet run --configuration Release

# 5. Open https://localhost:7106

Additional Documentation

Document	Description
ARCHITECTURE.md	System design & architecture
IMPLEMENTATION_SUMMARY.md	Technical overview
VECTOR_DEBUGGING.md	Vector features & troubleshooting
DEPLOYMENT_CHECKLIST.md	Production deployment guide

Architecture

graph TD
    UI[User Interface] --> API[ASP.NET Core API]
    API --> Upload[Document Upload]
    API --> Query[Query Processing]
    Upload --> PG[(PostgreSQL)]
    Query --> Search[Vector Search]
    Search --> PG
    PG --> Search
    Search --> Query
    Query --> Ollama[Ollama LLM]
    Ollama --> Query
    Query --> PG

System Requirements

PostgreSQL with pgvector extension
Ollama for local LLM inference
.NET 10 SDK - For running the application
4GB+ RAM - For running Ollama models (8GB+ recommended)
10GB+ Disk - For model storage and documents
Ports: 5432 (DB), 11434 (Ollama), 7106 (Frontend), 7126 (Backend)
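
Before the first run, it can help to confirm that the tools listed above are on the PATH. The following is a minimal sketch and is not part of the repository: the function names have_cmd and missing_prereqs are hypothetical, and it assumes the PostgreSQL client (psql) is installed alongside the server.

```shell
# Hypothetical preflight helper; not shipped with the project.

# Succeeds if the named command can be found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Prints the name of every listed command that is missing, one per line.
missing_prereqs() {
    for cmd in "$@"; do
        have_cmd "$cmd" || printf '%s\n' "$cmd"
    done
}

# Example: missing_prereqs psql ollama dotnet
```

An empty result means every listed command was found. The check says nothing about versions or about whether the services are actually listening on their ports.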

Performance

Operation	Time
Upload 3 documents	15-30s
First query	20-30s
Subsequent query	5-10s
Vector search	<50ms

Times vary based on content length and hardware
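
Part of the gap between the first and subsequent query may come from Ollama loading the model into memory on first use; this is an assumption, not something measured in this project. Ollama's generate endpoint loads a model when it receives a request without a prompt, and the keep_alive field controls how long the model stays loaded. A sketch of a warm-up call, with hypothetical function names and an arbitrary 30-minute default:

```shell
# Hypothetical warm-up helper; the 30m keep-alive value is an arbitrary example.

OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Builds a prompt-less /api/generate request body, which asks Ollama to load
# the model ($1) and keep it in memory for the given duration ($2).
warmup_payload() {
    printf '{"model":"%s","keep_alive":"%s"}' "$1" "$2"
}

# Sends the warm-up request for one model (requires curl and a running Ollama).
warm_model() {
    curl -s "$OLLAMA_URL/api/generate" -d "$(warmup_payload "$1" "${2:-30m}")"
}

# Example: warm_model llama3.2:3b
```

Running this once after starting Ollama should move the model load out of the first user query, at the cost of holding the model in RAM for longer.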

Configuration

Models

Embedding: nomic-embed-text:latest (768-dim, fast)
Generation: llama3.2:3b (3B params, lightweight)
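
To see what the embedding model returns, it can be called directly through Ollama's HTTP API. This is a sketch under assumptions: the helper names are hypothetical, Ollama is assumed to listen on its default port, and the backend's own request code may differ.

```shell
# Sketch of an embedding request against a local Ollama instance.

OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"
EMBED_MODEL="${EMBED_MODEL:-nomic-embed-text:latest}"

# Escapes backslashes and double quotes so the text fits in a JSON string.
# Newlines and control characters are not handled; use jq for arbitrary input.
json_escape() {
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Builds the /api/embed request body for one piece of text.
embed_payload() {
    printf '{"model":"%s","input":"%s"}' "$EMBED_MODEL" "$(json_escape "$1")"
}

# Posts the text to Ollama and prints the raw JSON response.
embed_text() {
    curl -s "$OLLAMA_URL/api/embed" -d "$(embed_payload "$1")"
}

# Example: embed_text 'What is the refund policy?'
```

The response carries the vectors in an embeddings array. If jq is installed, piping the output through jq '.embeddings[0] | length' prints the vector length, which should match the dimension of the pgvector column.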

Database

PostgreSQL with pgvector extension
IVFFlat index for fast vector search
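
For reference, enabling pgvector and creating an IVFFlat index looks roughly like the sketch below. The table name document_chunks, the column name embedding, the database name, and the choice of cosine distance are all assumptions for illustration; the project's actual schema lives in its own SQL scripts.

```shell
# Sketch of pgvector setup; table and column names are hypothetical.

# Prints SQL that enables pgvector and builds an IVFFlat cosine index.
# $1 = table, $2 = vector column, $3 = number of lists.
ivfflat_sql() {
    printf 'CREATE EXTENSION IF NOT EXISTS vector;\n'
    printf 'CREATE INDEX ON %s USING ivfflat (%s vector_cosine_ops) WITH (lists = %s);\n' "$1" "$2" "$3"
}

# Example (database name is a placeholder):
#   ivfflat_sql document_chunks embedding 100 | psql -d localdocscanner
```

The pgvector documentation suggests starting with about rows / 1000 lists for tables of up to one million rows, and building the index after the table already holds data.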

Troubleshooting

No embeddings generated? → VECTOR_DEBUGGING.md
Slow queries? → VECTOR_DEBUGGING.md

Scaling

Current (Single Machine)

Up to 10,000 documents
50 concurrent users
100 queries/min
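
As a rough sanity check, these limits can be related to the query times in the Performance section using Little's law (average requests in flight = arrival rate x time per request). The sketch assumes the figures are sustained averages; the function name in_flight is hypothetical.

```shell
# Little's law: requests in flight = arrival rate x time per request.

# $1 = queries per minute, $2 = seconds per query; prints average in-flight queries.
in_flight() {
    awk -v rate="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", rate / 60 * secs }'
}

in_flight 100 5    # prints 8.3  (5 s per query)
in_flight 100 10   # prints 16.7 (10 s per query)
```

At 100 queries/min, roughly 8 to 17 queries would be in progress at any moment, so the backend and Ollama would need to process or queue about that many overlapping requests.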

Future (Horizontal Scaling)

Multiple backend instances
Managed database service (AWS RDS, Azure Database for PostgreSQL)
Load balancer
Caching layer

See DEPLOYMENT_CHECKLIST.md for details

Security

Local inference (no external API calls)
Runs on localhost by default
No internet required for operation
Full data privacy

For production deployment, see DEPLOYMENT_CHECKLIST.md

Technology Stack

Component	Technology
Backend	ASP.NET Core 10, C# 14
Frontend	ASP.NET Core MVC, Bootstrap 5
AI Framework	Microsoft Agent Framework (Microsoft.Agents.AI.Foundry)
LLM	Ollama (local inference)
Database	PostgreSQL + pgvector
Indexing	IVFFlat

How It Works

Upload Documents - Upload .txt, .json, or .pdf files
Generate Embeddings - Automatically converts documents to semantic vectors
Semantic Search - Finds relevant documents using vector similarity
Answer Questions - Uses context from documents to answer queries
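
The search and answer steps above can be sketched from the command line as follows. This is an illustration, not the backend's implementation: the table and column names, the prompt wording, and the function names are hypothetical, and the last helper assumes curl and jq are installed.

```shell
# Sketch of the query path; table/column names and prompt wording are hypothetical.

# Prints SQL returning the $2 chunks closest (by cosine distance) to vector literal $1.
search_sql() {
    printf "SELECT content FROM document_chunks ORDER BY embedding <=> '%s' LIMIT %s;\n" "$1" "$2"
}

# Combines retrieved context ($1) and the user's question ($2) into one prompt.
build_prompt() {
    printf 'Answer using only the context below.\n\nContext:\n%s\n\nQuestion: %s\n' "$1" "$2"
}

# Sends the prompt to Ollama for a non-streaming answer (requires curl and jq).
ask_llm() {
    jq -n --arg model "llama3.2:3b" --arg prompt "$1" \
        '{model: $model, prompt: $prompt, stream: false}' |
        curl -s http://localhost:11434/api/generate --data-binary @- |
        jq -r '.response'
}
```

In pgvector, the <=> operator is cosine distance, so ordering by it in ascending order returns the most similar rows first. jq builds the request body here so that newlines and quotes in the prompt are escaped correctly.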

Implementation Status

Complete & Tested

Next Steps

Install - Follow quick-start instructions above
Deploy - Follow DEPLOYMENT_CHECKLIST.md
Monitor - Check daily health metrics
Scale - Refer to scaling guide as needed

Support

Vector Problems → VECTOR_DEBUGGING.md
Architecture → ARCHITECTURE.md
Implementation → IMPLEMENTATION_SUMMARY.md

License

MIT License - see LICENSE.txt for details

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Acknowledgments

Ollama - Local LLM inference
pgvector - Vector similarity for PostgreSQL
ASP.NET Core - Web framework
Bootstrap - UI framework

Status: Production Ready
Version: 1.0
Last Updated: April 2026

Get started now: Follow the Quick Start instructions above

Made by Nhilesh Baua

Disclaimer

This documentation is automatically generated and may not reflect the current state of the project; recent changes in the codebase may not yet appear here.
