feat(models): in-app model library with Discover browser and Staff Picks #237

Merged: quiet-node merged 89 commits into main from feat/local-inference-phase-3 on Jun 21, 2026

quiet-node (Owner) commented Jun 21, 2026

Overview

Thuki ships and supervises its own llama.cpp engine, so model management belongs inside the app, not in a terminal. This PR builds out the Settings → Models experience end to end: find, download, and switch models without leaving Thuki, alongside the engine-residency and reasoning controls that make local inference feel responsive.

What's new

Models settings, redesigned. A left-sidebar layout with three segmented sections:

  • Library lists installed models with capability pills, RAM-fit, context window, and a themed picker to set the active model.
  • Discover finds and downloads new models (see below).
  • Providers shows the active provider as a hero card and hosts the shared Generation settings (context size, keep-warm residency).

Discover.

  • Staff Picks: a curated catalog grouped by use case (Everyday chat, Compact & fast, Deep reasoning), three vetted models per category. Each entry is pinned to an exact Hugging Face revision with a recorded sha256, size, capability flags, and context window, and downloads through a verified catalog path.
  • Browse all: live Hugging Face GGUF search with pagination, per-quant RAM-fit hints, capability pills, and a caution plus per-download confirm before pulling an arbitrary repo.
  • Parallel downloads: several models can download at once. Each row tracks its own progress; interrupted transfers surface as Paused with Resume and Discard, and resume in place via HTTP Range.
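
The per-row behavior described for parallel downloads can be pictured as a small state machine. The following is a minimal sketch, not Thuki's actual reducer: the `Row` and `Event` types, the `step` and `apply` functions, and the `range_header` helper are all hypothetical names chosen for illustration. It assumes each row is keyed by an id and that a paused row remembers how many bytes it has received, so a resume can ask the server for the remainder.

```rust
use std::collections::HashMap;

/// Hypothetical per-row download state (illustrative names, not Thuki's types).
#[derive(Debug, Clone, PartialEq)]
enum Row {
    Downloading { received: u64, total: u64 },
    Paused { received: u64, total: u64 },
    Done,
}

/// Events a single transfer can emit (assumed shape).
enum Event {
    Progress(u64), // bytes newly received
    Interrupted,   // network drop or app-side stop
    Resume,
    Discard,
}

/// Apply one event to one row. Returns `None` when the row should be removed.
fn step(row: Row, ev: Event) -> Option<Row> {
    match (row, ev) {
        (Row::Downloading { received, total }, Event::Progress(n)) => {
            let received = received.saturating_add(n).min(total);
            if received == total {
                Some(Row::Done)
            } else {
                Some(Row::Downloading { received, total })
            }
        }
        (Row::Downloading { received, total }, Event::Interrupted) => {
            Some(Row::Paused { received, total })
        }
        (Row::Paused { received, total }, Event::Resume) => {
            Some(Row::Downloading { received, total })
        }
        (Row::Paused { .. }, Event::Discard) => None,
        // Any other combination leaves the row unchanged.
        (row, _) => Some(row),
    }
}

/// Route an event to the row with the given id; other rows are untouched,
/// which is what lets several downloads progress independently.
fn apply(rows: &mut HashMap<String, Row>, id: &str, ev: Event) {
    if let Some(row) = rows.remove(id) {
        if let Some(next) = step(row, ev) {
            rows.insert(id.to_string(), next);
        }
    }
}

/// HTTP Range header value that continues a transfer after `received` bytes.
fn range_header(received: u64) -> String {
    format!("bytes={}-", received)
}
```

In this sketch, a row paused at 40 of 100 bytes would resume with `range_header(40)`, an open-ended byte range starting at offset 40, while a second row keeps its own counters throughout.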

Context window everywhere. Each model's trained context length is shown on Staff Picks, Browse-all repo rows, and Library. Missing values are backfilled from the curated registry or from sanitized GGUF metadata.

Engine residency (keep-warm). The built-in engine warm-loads on summon and on the first keystroke and stays resident according to a unified keep-warm inactivity setting; switching away from Ollama evicts its model from VRAM. At most one model is ever resident.
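
One way to picture the residency rule is a tracker that holds at most one model and evicts it after a period of inactivity. This is a sketch under stated assumptions rather than the engine's real supervisor: `Residency`, `touch`, and `tick` are hypothetical names, time is passed in as plain seconds to keep the example deterministic, and "loading" is reduced to recording the model's name.

```rust
/// Hypothetical residency tracker; at most one model is resident because
/// the slot is a single `Option`.
#[derive(Debug, Default)]
struct Residency {
    resident: Option<String>,
    last_activity_secs: u64,
}

impl Residency {
    /// Record activity (summon, first keystroke, or a request) for `model`.
    /// Returns the model that was evicted to make room, if any.
    fn touch(&mut self, model: &str, now_secs: u64) -> Option<String> {
        self.last_activity_secs = now_secs;
        if self.resident.as_deref() == Some(model) {
            return None; // already warm, nothing to evict
        }
        self.resident.replace(model.to_string())
    }

    /// Periodic check: evict the resident model once it has been idle for
    /// at least `keep_warm_secs`. Returns the evicted model, if any.
    fn tick(&mut self, now_secs: u64, keep_warm_secs: u64) -> Option<String> {
        let idle = now_secs.saturating_sub(self.last_activity_secs);
        if idle >= keep_warm_secs {
            self.resident.take()
        } else {
            None
        }
    }
}
```

Keeping the slot as a single `Option<String>` makes the "at most one resident model" invariant structural: switching models necessarily replaces the previous occupant.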

Reasoning control. Thinking is opt-in per request via /think. Models that reason unconditionally are marked with an "Always thinks" pill, derived from a GGUF chat-template classifier; whatever reasoning a model emits is shown in the thinking block, never hidden.

Download experience. Progress renders as an inline hairline with a single continuous bar across a model's weights and vision companion, clear per-kind failure states, and a path back to the picker. Engine load failures (including unsupported model architectures) are surfaced from llama-server rather than failing silently.

How it works

  • Downloads stream from Hugging Face into a content-addressed blob store: bytes land in tmp/<sha>.partial, which is atomically renamed to blobs/<sha>. Transfers resume via HTTP Range and verify sha256 on completion as an integrity check; provenance comes from the pinned repo revisions.
  • The Settings window holds a per-window download registry at its root, so downloads stay alive across Library / Discover / Providers tab switches that unmount the panes.
  • Hugging Face search targets a fixed Hub host with a percent-encoded query, and the IPC payload is validated before it is trusted. The page size is capped, so "Load more" settles at the cap instead of refetching the ceiling.
  • Library and Discover manage the bundled engine's models, so both are gated when a non-built-in provider is active; Providers stays available.
  • Every default and bound lives in config/defaults.rs; new user-facing tunables and baked-in constants are documented in docs/configurations.md.
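
The search-request constraints in the list above (fixed host, percent-encoded query, bounded page size) can be sketched as follows. Everything here is an assumption for illustration: the `HUB_HOST` and `MAX_PAGE_SIZE` constants, the value 50, the function names, and the use of the Hub's `/api/models` endpoint with `search`, `filter`, and `limit` parameters are not taken from the PR, whose real bounds live in config/defaults.rs.

```rust
/// Assumed fixed host; the query can never redirect the request elsewhere
/// because the host is a constant, not user input.
const HUB_HOST: &str = "https://huggingface.co";

/// Hypothetical page-size ceiling (the real bound is defined by the project).
const MAX_PAGE_SIZE: u32 = 50;

/// Percent-encode everything except RFC 3986 unreserved characters.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for b in input.bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(b as char)
            }
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

/// Build a search URL with an encoded query and a clamped page size.
fn search_url(query: &str, requested: u32) -> String {
    let limit = requested.clamp(1, MAX_PAGE_SIZE);
    format!(
        "{}/api/models?search={}&filter=gguf&limit={}",
        HUB_HOST,
        percent_encode(query),
        limit
    )
}
```

Because the query is encoded byte by byte, characters such as `&` and `=` cannot inject extra parameters, and clamping means an oversized request is served at the ceiling rather than rejected.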

Testing

  • Frontend: Vitest with React Testing Library; full coverage across the new panes, hooks, the download reducer, and the Settings download registry.
  • Backend: cargo test with llvm-cov at 100% line coverage, covering GGUF parse bounds, download resume and verify, HF search parsing and pagination, manifest CRUD, and engine residency transitions.
  • validate-build (ESLint, cargo clippy -D warnings, Prettier, tsc --noEmit, release build) passes clean.

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>
quiet-node force-pushed the feat/local-inference-phase-3 branch from 1543b70 to 9030528 on June 21, 2026 20:31
quiet-node changed the title from "feat: local inference phase 3 — model library, Discover, and Staff Picks" to "feat(models): in-app model library with Discover browser and Staff Picks" on Jun 21, 2026
quiet-node marked this pull request as ready for review on June 21, 2026 21:41
quiet-node merged commit 23b32ef into main on Jun 21, 2026
3 checks passed
quiet-node deleted the feat/local-inference-phase-3 branch on June 21, 2026 21:45