AI-Powered Document Q&A System
A completely offline, privacy-preserving document search and question-answering system using vector embeddings, local LLMs, and Microsoft Agent Framework.
- Completely Offline - All processing runs locally; no internet connection is required after initial setup
- Microsoft Agent Framework - Built on Microsoft.Agents.AI.Foundry for AI orchestration
- Fast Search - Vector indexing for quick similarity search
- AI-Powered - Local LLM integration for accurate, context-aware answers
- Multi-Format - Support for .txt, .json, and .pdf files
- Context-Aware Search - Uses document context for better answers
- Audit Logs - Track all queries with detailed metrics
- Modern UI - Clean, responsive interface built with Bootstrap 5
Screenshots: Landing Page, Upload Documents, Query Documents, Search Results (Success), Search Results (Failed), Audit Logs, Rescan Documents.
# 1. Install PostgreSQL with pgvector extension
# 2. Install Ollama and pull models
ollama pull nomic-embed-text:latest
ollama pull llama3.2:3b
# 3. Run backend
cd LocalDocScannerBE
dotnet run --configuration Release
# 4. In another terminal, run frontend
cd LocalDocScannerUI
dotnet run --configuration Release
# 5. Open https://localhost:7106

| Document | Description |
|---|---|
| ARCHITECTURE.md | System design & architecture |
| IMPLEMENTATION_SUMMARY.md | Technical overview |
| VECTOR_DEBUGGING.md | Vector features & troubleshooting |
| DEPLOYMENT_CHECKLIST.md | Production deployment guide |
graph TD
UI[User Interface] --> API[ASP.NET Core API]
API --> Upload[Document Upload]
API --> Query[Query Processing]
Upload --> PG[(PostgreSQL)]
Query --> Search[Vector Search]
Search --> PG
PG --> Search
Search --> Query
Query --> Ollama[Ollama LLM]
Ollama --> Query
Query --> PG
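The query path in the diagram above can be sketched as a small orchestration function. This is an illustrative Python sketch, not the project's C# implementation: `build_prompt`, `answer_question`, and the four injected callables are hypothetical names, and treating the final `Query --> PG` edge as an audit-log write is an assumption based on the Audit Logs feature.

```python
from typing import Callable, Sequence


def build_prompt(question: str, passages: Sequence[str]) -> str:
    """Combine the retrieved passages and the user's question into one prompt."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer_question(
    question: str,
    embed: Callable[[str], list],          # text -> embedding vector
    search: Callable[[list, int], list],   # (vector, k) -> list of passages
    generate: Callable[[str], str],        # prompt -> answer text
    log: Callable[[dict], None],           # audit record sink
    top_k: int = 3,
) -> str:
    query_vector = embed(question)                       # embed the question
    passages = search(query_vector, top_k)               # Query -> Vector Search -> PostgreSQL
    answer = generate(build_prompt(question, passages))  # Query -> Ollama LLM
    log({"question": question, "passages": len(passages)})  # Query -> PostgreSQL (assumed audit write)
    return answer
```

Passing the embedding, search, generation, and logging steps in as callables keeps the flow testable without a running database or model.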
- PostgreSQL with pgvector extension
- Ollama for local LLM inference
- .NET 10 SDK - For running the application
- 4GB+ RAM - For running Ollama models (8GB+ recommended)
- 10GB+ Disk - For model storage and documents
- Ports: 5432 (DB), 11434 (Ollama), 7106 (Frontend), 7126 (Backend)
| Operation | Time |
|---|---|
| Upload 3 documents | 15-30s |
| First query | 20-30s |
| Subsequent query | 5-10s |
| Vector search | <50ms |
Times vary based on content length and hardware
- Embedding: nomic-embed-text:latest (768-dimensional embeddings by default, fast)
- Generation: llama3.2:3b (3B params, lightweight)
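To check that both models respond before starting the application, you can call Ollama's HTTP API directly. The sketch below is an assumption-laden illustration in Python: the application itself goes through the Microsoft Agent Framework, and this README does not show how it calls Ollama. The helper names `embed_request`, `generate_request`, and `send` are hypothetical; the `/api/embed` and `/api/generate` endpoints and port 11434 are Ollama's defaults.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default Ollama port
JSON_HEADERS = {"Content-Type": "application/json"}


def embed_request(text: str, model: str = "nomic-embed-text:latest") -> urllib.request.Request:
    """Build (but do not send) a request for an embedding of `text`."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/embed", data=body, headers=JSON_HEADERS, method="POST"
    )


def generate_request(prompt: str, model: str = "llama3.2:3b") -> urllib.request.Request:
    """Build (but do not send) a non-streaming text generation request."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate", data=body, headers=JSON_HEADERS, method="POST"
    )


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request to a running Ollama server and decode the JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With Ollama running, `send(embed_request("hello"))["embeddings"][0]` should return the vector, and `send(generate_request("Say hi"))["response"]` the generated text.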
- PostgreSQL with pgvector extension
- IVFFlat index for fast vector search
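For orientation, here is roughly what a pgvector table with an IVFFlat index and a cosine-distance query look like. The SQL is held in Python strings so it can be passed to any PostgreSQL driver. The table name `documents`, its columns, the index name, and the helper names are hypothetical; the project's actual schema is not shown in this README. The `%s` placeholders follow the style of Python DB-API drivers such as psycopg.

```python
def schema_sql(dim: int, lists: int = 100) -> list:
    """Return DDL statements for a hypothetical documents table with an IVFFlat index.

    `dim` must equal the length of the vectors your embedding model returns.
    """
    return [
        "CREATE EXTENSION IF NOT EXISTS vector;",
        "CREATE TABLE IF NOT EXISTS documents ("
        "id bigserial PRIMARY KEY, content text NOT NULL, "
        f"embedding vector({dim}));",
        "CREATE INDEX IF NOT EXISTS documents_embedding_idx ON documents "
        f"USING ivfflat (embedding vector_cosine_ops) WITH (lists = {lists});",
    ]


def to_vector_literal(values) -> str:
    """Format a sequence of numbers in pgvector's text form, e.g. [1.0,2.5]."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


# <=> is pgvector's cosine distance operator; a smaller distance means more similar.
SEARCH_SQL = (
    "SELECT id, content, embedding <=> %s::vector AS distance "
    "FROM documents ORDER BY embedding <=> %s::vector LIMIT %s;"
)
```

IVFFlat trades some recall for speed. The pgvector documentation suggests building the index after the table holds data, and raising `ivfflat.probes` if results look incomplete.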
- No embeddings generated? → VECTOR_DEBUGGING.md
- Slow queries? → VECTOR_DEBUGGING.md
- Up to 10,000 documents
- 50 concurrent users
- 100 queries/min
- Multiple backend instances
- Managed database service (AWS RDS for PostgreSQL, Azure Database for PostgreSQL)
- Load balancer
- Caching layer
See DEPLOYMENT_CHECKLIST.md for details
- Local inference (no external API calls)
- Runs on localhost by default
- No internet required for operation
- Full data privacy
For production deployment, see DEPLOYMENT_CHECKLIST.md
| Component | Technology |
|---|---|
| Backend | ASP.NET Core 10, C# 14 |
| Frontend | ASP.NET Core MVC, Bootstrap 5 |
| AI Framework | Microsoft Agent Framework (Microsoft.Agents.AI.Foundry) |
| LLM | Ollama (local inference) |
| Database | PostgreSQL + pgvector |
| Indexing | IVFFlat |
- Upload Documents - Upload .txt, .json, or .pdf files
- Generate Embeddings - Automatically converts documents to semantic vectors
- Semantic Search - Finds relevant documents using vector similarity
- Answer Questions - Uses context from documents to answer queries
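The semantic search step above ranks documents by how closely their embedding vectors point in the same direction as the query's. A minimal pure-Python sketch of that ranking follows. In the real system pgvector does this inside PostgreSQL; the function names and the tiny two-dimensional vectors here are illustrative only.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, documents: dict, k: int = 3) -> list:
    """Return the k (name, score) pairs most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    docs = {"a.txt": [1.0, 0.0], "b.txt": [0.0, 1.0], "c.txt": [1.0, 1.0]}
    for name, score in top_k([1.0, 0.0], docs, k=2):
        print(f"{name}: {score:.3f}")
```

The top-ranked passages are what gets handed to the LLM as context for the answer.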
Complete & Tested
- Microsoft Agent Framework integration
- Ollama integration
- Vector embedding generation
- pgvector semantic search
- Context-aware query processing
- Document processing
- API endpoints
- UI components
- Complete documentation
- Install - Follow quick-start instructions above
- Deploy - Follow DEPLOYMENT_CHECKLIST.md
- Monitor - Check daily health metrics
- Scale - Refer to scaling guide as needed
- Vector Problems → VECTOR_DEBUGGING.md
- Architecture → ARCHITECTURE.md
- Implementation → IMPLEMENTATION_SUMMARY.md
MIT License - see LICENSE file for details
Contributions are welcome! Please feel free to submit a Pull Request.
- Ollama - Local LLM inference
- pgvector - Vector similarity for PostgreSQL
- ASP.NET Core - Web framework
- Bootstrap - UI framework
Status: Production Ready
Version: 1.0
Last Updated: April 2026
Get started now: Follow the Quick Start instructions above
Made by Nhilesh Baua
This documentation is automatically generated and may not reflect the current state of the project; recent changes in the codebase may not appear here immediately.






