feat(models): in-app model library with Discover browser and Staff Picks #237

Merged

quiet-node merged 89 commits into main from feat/local-inference-phase-3

Jun 21, 2026

quiet-node (Owner) commented Jun 21, 2026 • edited

Overview

Thuki ships and supervises its own llama.cpp engine, so model management belongs inside the app, not in a terminal. This PR builds out the Settings → Models experience end to end: find, download, and switch models without leaving Thuki, alongside the engine-residency and reasoning controls that make local inference feel responsive.

What's new

Models settings, redesigned. A left-sidebar layout with three segmented sections:

Library lists installed models with capability pills, RAM-fit, context window, and a themed picker to set the active model.
Discover finds and downloads new models (see below).
Providers shows the active provider as a hero card and hosts the shared Generation settings (context size, keep-warm residency).

Discover.

Staff Picks: a curated catalog grouped by use case (Everyday chat, Compact & fast, Deep reasoning), three vetted models per category. Each entry is pinned to an exact Hugging Face revision with a recorded sha256, size, capability flags, and context window, and downloads through a verified catalog path.
Browse all: live Hugging Face GGUF search with pagination, per-quant RAM-fit hints, capability pills, and a caution plus per-download confirm before pulling an arbitrary repo.
Parallel downloads: several models can download at once. Each row tracks its own progress; interrupted transfers surface as Paused with Resume and Discard, and resume in place via HTTP Range.
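The per-row lifecycle described above (independent progress, Paused with Resume and Discard) can be sketched as a small state machine. This is a minimal illustration, not the PR's reducer: `RowState`, `Event`, `step`, and `Downloads` are hypothetical names, and the real frontend reducer is not shown in this description.

```rust
use std::collections::HashMap;

/// State of one download row. Variant names are hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum RowState {
    Downloading { received: u64, total: u64 },
    Paused { received: u64, total: u64 },
    Done,
}

/// Events a row can receive. Also hypothetical.
#[derive(Debug, Clone, Copy)]
enum Event {
    Progress(u64), // bytes received in the latest chunk
    Interrupted,   // transfer dropped; the row becomes Paused
    Resume,
    Discard,
}

/// Returns the next state, or None when the row should be removed.
fn step(state: RowState, event: Event) -> Option<RowState> {
    use RowState::*;
    match (state, event) {
        (Downloading { received, total }, Event::Progress(n)) => {
            let received = received.saturating_add(n).min(total);
            if received == total {
                Some(Done)
            } else {
                Some(Downloading { received, total })
            }
        }
        (Downloading { received, total }, Event::Interrupted) => {
            Some(Paused { received, total })
        }
        (Paused { received, total }, Event::Resume) => Some(Downloading { received, total }),
        (Paused { .. }, Event::Discard) => None,
        // Any other combination leaves the row unchanged.
        (other, _) => Some(other),
    }
}

/// Rows keyed by model id, so each download advances independently.
#[derive(Default)]
struct Downloads {
    rows: HashMap<String, RowState>,
}

impl Downloads {
    fn start(&mut self, id: &str, total: u64) {
        // A second start for a model that is already tracked is ignored.
        self.rows
            .entry(id.to_string())
            .or_insert(RowState::Downloading { received: 0, total });
    }

    fn apply(&mut self, id: &str, event: Event) {
        if let Some(state) = self.rows.remove(id) {
            if let Some(next) = step(state, event) {
                self.rows.insert(id.to_string(), next);
            }
        }
    }
}
```

Keying rows by model id means pausing or discarding one download never touches another row's state.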

Context window everywhere. Each model's trained context length is shown on Staff Picks, Browse-all repo rows, and Library, backfilled from the curated registry or from sanitized GGUF metadata.

Engine residency (keep-warm). The built-in engine warm-loads on summon and on the first keystroke and stays resident according to a unified keep-warm inactivity setting; switching away from Ollama evicts its model from VRAM. At most one model is ever resident.
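The residency rule above (warm-load on activity, evict after an inactivity window, at most one resident model) can be sketched as follows. This is an illustration under assumptions: `Residency`, `touch`, and `tick` are hypothetical names, and the actual engine supervisor is not shown in this description.

```rust
use std::time::{Duration, Instant};

/// Tracks the single resident model and when it was last used.
struct Residency {
    keep_warm: Duration,
    resident: Option<(String, Instant)>, // (model id, last activity)
}

impl Residency {
    fn new(keep_warm: Duration) -> Self {
        Residency { keep_warm, resident: None }
    }

    /// Called on summon or first keystroke. Makes `model` resident and
    /// returns the id of a different model that had to be evicted, if any.
    fn touch(&mut self, model: &str, now: Instant) -> Option<String> {
        let evicted = match self.resident.take() {
            Some((current, _)) if current.as_str() != model => Some(current),
            _ => None,
        };
        self.resident = Some((model.to_string(), now));
        evicted
    }

    /// Called periodically. Evicts the resident model once it has been
    /// idle for at least `keep_warm`, returning its id.
    fn tick(&mut self, now: Instant) -> Option<String> {
        let expired = matches!(
            &self.resident,
            Some((_, last)) if now.saturating_duration_since(*last) >= self.keep_warm
        );
        if expired {
            self.resident.take().map(|(id, _)| id)
        } else {
            None
        }
    }
}
```

Because `resident` is a single `Option`, the "at most one model" invariant holds by construction rather than by a runtime check.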

Reasoning control. Thinking is opt-in per request via /think. Models that reason unconditionally are marked with an "Always thinks" pill, derived from a GGUF chat-template classifier; whatever reasoning a model emits is shown in the thinking block, never hidden.
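One way a per-request opt-in could be detected is sketched below. It assumes `/think` appears as a leading token of the prompt, which this description does not confirm; `parse_think` is a hypothetical helper, not the PR's parser.

```rust
/// Splits a leading `/think` directive off a prompt.
/// Returns (thinking_enabled, prompt_without_directive).
/// Assumption: the directive must be the first token of the prompt.
fn parse_think(input: &str) -> (bool, &str) {
    let trimmed = input.trim_start();
    match trimmed.strip_prefix("/think") {
        // Require a word boundary so "/thinking" is not mistaken for the directive.
        Some(rest) if rest.is_empty() || rest.starts_with(char::is_whitespace) => {
            (true, rest.trim_start())
        }
        _ => (false, input),
    }
}
```

For example, `parse_think("/think why is the sky blue")` yields `(true, "why is the sky blue")`, while a prompt without the directive is returned unchanged with thinking off.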

Download experience. Progress renders as an inline hairline with a single continuous bar across a model's weights and vision companion, clear per-kind failure states, and a path back to the picker. Engine load failures (including unsupported model architectures) are surfaced from llama-server rather than failing silently.
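The single continuous bar across a model's weights and vision companion amounts to summing bytes over both files before dividing. A minimal sketch, assuming each file reports a `(received, total)` pair; `Part` and `combined_fraction` are hypothetical names:

```rust
/// (received, total) byte counts for one file of a model download.
type Part = (u64, u64);

/// Fraction in [0, 1] across all files, weighted by size, so the bar
/// does not reset when the second file starts.
fn combined_fraction(parts: &[Part]) -> f64 {
    let total: u64 = parts.iter().map(|&(_, t)| t).sum();
    if total == 0 {
        return 0.0; // nothing to download, or sizes not known yet
    }
    // Clamp each file so an over-reported count cannot push the bar past 100%.
    let received: u64 = parts.iter().map(|&(r, t)| r.min(t)).sum();
    received as f64 / total as f64
}
```

Weighting by bytes rather than averaging per-file percentages keeps the bar moving at a steady rate when the two files differ greatly in size.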

How it works

Downloads stream from Hugging Face into a content-addressed blob store (tmp/<sha>.partial then atomic rename to blobs/<sha>), resume via HTTP Range, and verify sha256 on completion as an integrity check; provenance comes from the pinned repo revisions.
The Settings window holds a per-window download registry at its root, so downloads stay alive across Library / Discover / Providers tab switches that unmount the panes.
Hugging Face search targets a fixed Hub host with a percent-encoded query and a bounded page size, and the IPC payload is validated before it is trusted; the page size is capped so "Load more" settles instead of refetching the ceiling.
Library and Discover manage the bundled engine's models, so both are gated when a non-built-in provider is active; Providers stays available.
Every default and bound lives in config/defaults.rs; new user-facing tunables and baked-in constants are documented in docs/configurations.md.
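The blob-store flow in the first item above (write to tmp/<sha>.partial, resume via HTTP Range, then atomically rename to blobs/<sha>) can be sketched with the standard library alone. This is an illustration, not the PR's code: the function names are hypothetical, the HTTP transfer itself is left out, and sha256 verification is assumed to have happened before `promote` is called, since Rust's standard library has no SHA-256.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// In-progress location: <root>/tmp/<sha>.partial
fn partial_path(root: &Path, sha: &str) -> PathBuf {
    root.join("tmp").join(format!("{sha}.partial"))
}

/// Final content-addressed location: <root>/blobs/<sha>
fn blob_path(root: &Path, sha: &str) -> PathBuf {
    root.join("blobs").join(sha)
}

/// Value for an HTTP `Range` request header that resumes an interrupted
/// transfer, or None when the download should start from byte zero.
fn resume_range(root: &Path, sha: &str) -> io::Result<Option<String>> {
    match fs::metadata(partial_path(root, sha)) {
        Ok(meta) if meta.len() > 0 => Ok(Some(format!("bytes={}-", meta.len()))),
        Ok(_) => Ok(None),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}

/// Moves a completed partial into the blob store.
/// Assumption: the caller has already verified the file's sha256.
fn promote(root: &Path, sha: &str) -> io::Result<PathBuf> {
    fs::create_dir_all(root.join("blobs"))?;
    let dest = blob_path(root, sha);
    // A rename within one filesystem is atomic on POSIX systems, so readers
    // never observe a half-written blob.
    fs::rename(partial_path(root, sha), &dest)?;
    Ok(dest)
}
```

Keeping tmp/ and blobs/ under the same root keeps the rename on one filesystem, which is what makes the final step atomic.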

Testing

Frontend: Vitest with React Testing Library; full coverage across the new panes, hooks, the download reducer, and the Settings download registry.
Backend: cargo test with llvm-cov at 100% line coverage, covering GGUF parse bounds, download resume and verify, HF search parsing and pagination, manifest CRUD, and engine residency transitions.
validate-build (ESLint, cargo clippy -D warnings, Prettier, tsc --noEmit, release build) passes clean.

quiet-node added 30 commits

June 21, 2026 15:30


          feat: Hugging Face model search and thinking-capability detection

d92ae6b

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat: restructure Settings to a premium left sidebar with a running-model footer

da15b51

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat: add the Models segmented Library/Discover/Providers control

699d50e

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat: wire the Models segmented control into the Model tab

afad609

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat: build the Models surface with Active-Hero providers, Library, and Discover

0b1b1c0

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          style: reskin the standard settings tabs to the premium tokens

55db7c7

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat: HF search text-gen default, paginated Load-more, and RAM-fit annotations

f58f6f1

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          style: restyle the Models segmented control to icon-above-label tabs

9a85500

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat: redesign the Library pane with quiet rows, a popover menu, and RAM-fit

cce84ee

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat: redesign the Discover pane with RAM-fit, HF links, icon download, and Load more

60ad330

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix: Providers pane row padding, switch confirmation, single prompt header, footer

a1dbcd3

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          style: apply prettier and rustfmt, and align the search hook setter name

96151f9

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          refactor: drop unused est_runtime_gb from search rows and share the RAM-fit label map

f4aa41f

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix: Discover search returns downloadable GGUF chat repos and add reveal-in-Finder

f82b5ed

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix: refresh the provider model dropdown on switch, consistent names, premium switch dialog

7ae917a

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat: RAM-fit tooltips, capability pills, clickable Discover titles, and search caching

ab09401

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix: lift config on Ollama model switch so Running card updates

d29e16a

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          refactor: drop unreliable row-level RAM-fit, keep accurate per-quant fit

ac5adcc

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          polish: shorten RAM-fit hover tooltips to one clean line each

b179d9c

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          polish: calm capability pills, add Text, rename Reasoning to Thinking, drop Active pill

868b6d9

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          polish: restyle Models tabs to match Settings nav, drop the container box

1e08d34

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix: make built-in reasoning opt-in via /think with honest thinking UX

3d831a6

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix: drop reasoning output when thinking is off for a model-agnostic policy

bf14fdc

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix: extend reasoning-off to all kwarg-controllable model families

d4363f1

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix: show model reasoning instead of hiding it when thinking is off

891bf9b

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat: badge models whose reasoning cannot be turned off

58f897c

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat: dynamically classify reasoning capability of downloaded GGUF models

c872929

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat: add family grouping field to the curated starter registry

2a0ff75

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          refactor: relocate the raw Hugging Face browser to BrowseAllPane

ca582bc

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat: add curated Staff picks family accordion for Discover

15d2db1

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

quiet-node added 18 commits

June 21, 2026 15:30


          fix: keep Discover model downloads alive across tab switches and the Settings webview

9bac5f0

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix: flip the Library model menu above its trigger when space is tight

f5c94c8

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix: name the built-in engine's actual resident model in keep-warm status

dcea71c

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat: download multiple models in parallel from Settings Discover

7f47e2f

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat(settings): gate Library and Discover for non-built-in providers

8163e59

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix(models): hide the download control for installed quants in Browse-all

af052c2

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat: unify Models info, link names to Hugging Face, fix menu clipping

6dab3c7

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix: truncate long resident-model name in keep-warm status

150ea81

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat: capability pills in Browse-all, shared capability derivation, and active-model preserved on install

43f64f9

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          refactor(discover): remove Browse-all result count label

a79e359

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat(discover): show per-family download status pills on Browse-all rows

837cab3

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix: broadcast active-model changes so the Settings panel and overlay picker stay in sync

dd1e571

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix: dedup built-in warm-up primes and surface a warming status

839fd04

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix: allow discarding a paused download while another download runs

0a4af79

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat(settings): themed model picker popover for Providers

c30d69a

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix(engine): surface llama-server load failures and flag unsupported models

3bb13b0

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          feat(settings): caution notice and per-download confirm in Browse all

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          chore(onboarding): note more models live in Settings

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

quiet-node force-pushed the feat/local-inference-phase-3 branch from 1543b70 to 9030528 Compare

June 21, 2026 20:31

quiet-node added 2 commits

June 21, 2026 15:59


          docs(openai): correct reasoning note; off-mode thinking is shown, not dropped

2f180d4

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix(discover): stop HF search Load more refetching the capped page forever

750cf8b

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

quiet-node changed the title ~~feat: local inference phase 3 — model library, Discover, and Staff Picks~~ feat(models): in-app model library with Discover browser and Staff Picks

quiet-node added 4 commits

June 21, 2026 16:29


          fix(engine): reset built-in warm-up dedup when the engine unloads

f9b1d88

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix(engine): prefer the actionable stderr line in engine-start errors

cda5025

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix(discover): ignore re-entrant download starts for an in-flight model

9187d9a

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>


          fix(discover): cap the Hugging Face search input at the backend query length

14d81ae

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

quiet-node marked this pull request as ready for review

June 21, 2026 21:41

quiet-node merged commit 23b32ef into main

3 checks passed

quiet-node deleted the feat/local-inference-phase-3 branch

June 21, 2026 21:45

github-actions Bot mentioned this pull request

chore(main): release 0.15.0 #221

Open

