GitHub - aiKunalBisht/Transcript-ai: In modern international business, conversations often shift between languages. Transcript AI bridges this gap by utilizing Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) to ensure no context is lost in translation. It doesn't just transcribe; it understands business intent.

title	TranscriptAI
emoji	🧠
colorFrom	pink
colorTo	red
sdk	docker
app_port	7860
pinned	false

Multilingual Meeting Intelligence · Japanese · Hindi · English · Mixed

Turns any meeting transcript into structured business intelligence in ~3 seconds.

The Problem No Generic AI Tool Solves

Generic meeting summarisers extract what was said. They miss what was meant.

What was said	Generic AI	TranscriptAI
検討いたします	"Action: We will consider it"	⚠ Soft rejection — 72% confidence. Follow up explicitly.
難しいかもしれません	"It may be difficult" — neutral	🚨 HIGH rejection signal — 90% confidence. Deal at risk.
前向きに検討	"Positive consideration"	⚠ Uncertain — outcome not guaranteed (55%)
承知いたしました	"Acknowledged"	🏯 High keigo register — senior speaker or formal context
善処します	"Action: We will handle it"	🚨 Classic nemawashi dodge — no real commitment made
パートナーシップは継続しないことを決定しました	"Decision made"	⛔ CRITICAL — Explicit contract termination. Irrevocable.

Japanese enterprise also mandates APPI compliance — raw meeting data cannot be sent to foreign cloud LLMs. Most tools fail this requirement by design. TranscriptAI masks PII locally before any LLM call.

What It Does

Input: Any transcript (JP · HI · EN · Mixed) or Audio file (MP3/MP4/WAV)
Output: Structured business intelligence in ~3 seconds

Trilingual analysis — Japanese / Hindi (Hinglish) / English / Mixed in a single meeting
Keigo detection — MeCab morphological analysis extracts formality register (High / Medium / Low)
20 soft rejection patterns — nemawashi, 難しいですね, ぜひ検討, and more with confidence scores
CRITICAL risk tier — explicit contract termination detection, separate from soft refusals
8 Hindi indirect patterns — देखते हैं, थोड़ा मुश्किल है, कोशिश करेंगे and more
APPI-compliant PII masking — 500+ Japanese surnames, local NER, before any LLM call
Hallucination guard — 100% rule-based token overlap; LLM never validates its own output
Meeting Health Score — 0–100 across sentiment, action clarity, communication risk, AI confidence
議事録 export — structured Japanese business minutes in standard enterprise format
Cultural insights export — 根回し risk, 稟議 approval status, nemawashi pattern breakdown
Groq Whisper — MP4 / MP3 / WAV audio transcription
Vector cache — ChromaDB semantic similarity for instant return on similar transcripts
2-key Groq rotation — auto-failover on 429 rate limits

What was said	Generic AI	TranscriptAI
検討いたします	"Action: We will consider it"	⚠ Soft rejection — 72% confidence
難しいかもしれません	"It may be difficult" — neutral	🚨 HIGH rejection signal — deal at risk
パートナーシップは継続しないことを決定しました	"Decision made"	⛔ CRITICAL — Explicit contract termination

Accuracy History

Version	Key change	Accuracy
v1	Hard exact matching, English-only	22–30%
v2	Fuzzy speaker names, TF-IDF similarity	~45%
v3	MeCab keigo override, bilingual ground truth	~60%
v4	Hallucination guard, nemawashi patterns, APPI masking	75–85%
v5 (live)	2-key rotation, vector cache, MLflow, bypass_cache eval fix	93.8%

Every rebuild was driven by evaluation metric failures traced through the pipeline — not intuition. When action F1 was 0.4 at v2, tracing showed the owner field extracted role titles ("Director") instead of first names. One prompt instruction fixed it.

Evaluation Metrics · v5 Live

Test case	Overall	ROUGE-1	Action F1	Sentiment
Sales call · JA/EN mixed	94.5%	0.694	1.0	1.0
Internal meeting · Japanese	93.8%	0.703	1.0	1.0
Client conflict · EN/JA	93.8%	0.703	1.0	1.0

Architecture — 11-Stage Pipeline

Input transcript
    │
    ▼
 1  Vector cache check       utils/vector_cache.py          ChromaDB cosine similarity
 2  MD5 exact cache          utils/cache.py                 Instant return if identical
 3  PII masking              transcription/pii_masker.py    APPI — BEFORE LLM
 4  LLM analysis             analysis/analyzer.py           Groq key1 → key2 → mock
 5  PII restoration          transcription/pii_masker.py    BEFORE speaker normalization
 6  Speaker normalization    transcription/speaker_normalizer.py
 7  MeCab keigo override     analysis/japanese_tokenizer.py
 8  Code-switch count        utils/evaluator.py             Rule-based Unicode ranges
 9  Hallucination guard      analysis/hallucination_guard.py
10  Soft rejection detection analysis/soft_rejection_detector.py
11  Cache + log              utils/vector_cache.py + utils/logger.py

Critical ordering constraint: PII must be masked before the LLM sees the text. PII must be restored before speaker normalization — otherwise the normalizer receives [NAME_1] and cannot resolve 田中 ↔ Tanaka ↔ Director as the same person.

Technology Decisions

Decision	Chosen	Over	Reason
Vector DB	ChromaDB	Pinecone, Weaviate	Free, local, HF Spaces compatible, APPI compliant
LLM inference	Groq	OpenAI, Anthropic	10–20× faster (LPU vs GPU). Free tier. JSON mode.
Japanese NLP	MeCab + IPADIC	spaCy ja, Fugashi	Morpheme-level auxiliary verb detection — keigo is invisible to word-level tokenizers
Web framework	FastAPI	Flask, Django	Native async with `asyncio.to_thread()`. Auto Swagger.
Eval tracking	MLflow	W&B, Neptune	Free, local SQLite, APPI compliant

Quick Start

git clone https://github.com/aiKunalBisht/Transcript-ai.git
cd Transcript-ai
pip install -r requirements.txt
export GROQ_API_KEY=your_key_here

# FastAPI (docs at localhost:7860/docs)
uvicorn main:app --reload --port 7860

Local inference (zero cost, full APPI data residency):

ollama pull qwen3:8b
# App auto-detects Ollama — no config needed

Accuracy

main.py                           FastAPI server + route handlers
analysis/
  analyzer.py                     LLM orchestration + 2-key Groq rotation
  soft_rejection_detector.py      20 JP nemawashi + 8 HI indirect + CRITICAL termination
  japanese_tokenizer.py           MeCab keigo extraction
  hallucination_guard.py          Rule-based token overlap validation
  english_analyzer.py             English NLP patterns
  hindi_analyzer.py               Hindi/Hinglish indirect signals
agents/
  gijiroku_formatter.py           議事録 Japanese business minutes generator
  cultural_insights_formatter.py  Cultural context layer for JP meetings
  slide_architect.py              PPTX slide plan generation
exporters/
  pptx_builder.py                 PowerPoint export builder
transcription/
  pii_masker.py                   APPI compliance — local PII masking
  audio_processor.py              Groq Whisper audio transcription
utils/
  html_renderer.py                Results renderer — health score, tabs, insights
  vector_cache.py                 ChromaDB semantic cache
  logger.py                       JSONL trends + drift detection
templates/
  base.html                       Sidebar, navigation, session persistence
  index.html                      Main analysis page
  export.html                     Export page — PPTX, 議事録, MD, JSON, TXT
static/
  style.css                       Full responsive CSS — mobile, tablet, desktop
tests/
  test_data.py                    3 bilingual ground truth test cases

REST API

# Analyze a transcript
POST /analyze-text
  transcript: str
  language:   str | null   # auto-detect if null
  mask_pii:   bool         # default true

# Exports
POST /export/pptx
POST /export/gijiroku
POST /export/cultural-insights
POST /export/markdown
POST /export/json
POST /export/txt

# Health
GET /health

Performance

Metric	Score
Lighthouse Performance	94
Lighthouse Accessibility	100
CLS (Cumulative Layout Shift)	0
Speed Index	1.2s

Built by Kunal Bisht AI Engineer · LLM Systems & RAG Pipelines · Multilingual NLP Uttarakhand, India · Open to Remote / Relocation

Name		Name	Last commit message	Last commit date
Latest commit History 157 Commits
.github/workflows		.github/workflows
agents		agents
analysis		analysis
api		api
exporters		exporters
rags		rags
static		static
templates		templates
tests		tests
transcription		transcription
utils		utils
.DOCKERIGNORE		.DOCKERIGNORE
.env.example		.env.example
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
README.md		README.md
eval_25_recordings.json		eval_25_recordings.json
eval_30_recordings.json		eval_30_recordings.json
eval_60_recordings.json		eval_60_recordings.json
health_check.py		health_check.py
main.py		main.py
patch_main.py		patch_main.py
requirements.txt		requirements.txt
robots.txt		robots.txt
setup_migration.py		setup_migration.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

The Problem No Generic AI Tool Solves

What It Does

Accuracy History

Evaluation Metrics · v5 Live

Architecture — 11-Stage Pipeline

Technology Decisions

Quick Start

Accuracy

REST API

Performance

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

The Problem No Generic AI Tool Solves

What It Does

Accuracy History

Evaluation Metrics · v5 Live

Architecture — 11-Stage Pipeline

Technology Decisions

Quick Start

Accuracy

REST API

Performance

About

Topics

Resources

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages