docs: cover 0.4.44 — quantize, federation, models redesign, secrets#1

Open
webdevtodayjason wants to merge 1 commit into main from codex/docs-0.4.44

@webdevtodayjason

Contributor

Brings the docs site up to the current product surface (code is 0.4.44; the site was last aligned around 0.4.4). Closes the largest gaps found in a README/CHANGELOG/docs gap analysis.

What changed

  • NEW guides/quantize.mdx — in-browser AWQ (W4A16) / NVFP4 quantization: scheme choice, calibration samples, the idle-node guardrail (409 while a model is loaded; force:true override), push-to-HF, and Qwen3.5 multimodal/Gated-DeltaNet handling. AWQ is presented as proven; NVFP4-on-Qwen3.5 is explicitly marked experimental/unverified.
  • NEW guides/secrets.mdx — the local Secrets store and the read vs write HF token distinction (write required to push; read-only tokens rejected up front), plus NGC / W&B / OpenAI slots.
  • NEW guides/federation.mdx — federated master router (routes /v1/* by model name), load/unload any model on any node from the UI, model stacking + restart persistence, and the per-node memory-utilization knob.
  • REWRITE guides/models.mdx — the two-list Installed / Browse redesign and Unload (the old text described the single "Downloads" tab + "Launch Model"); adds serve-from-on-disk-weights.
  • UPDATE introduction.mdx — new capability cards (Quantization, Federated serving, Model stacking) and federation + quantization added to the architecture box.
  • UPDATE docs.json — the three new pages added to the Guides nav.
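
For reviewers, two of the behaviors listed above (the idle-node guardrail in the quantize guide and model-name routing in the federation guide) can be sketched roughly as follows. This is an illustration only, not the product's code: the function names, the `registry` mapping, the node URL, and the `202` success status are all assumptions. Only the `409` response, the `force:true` override, and routing `/v1/*` by model name come from this PR.

```python
from typing import Dict, Optional


def quantize_precheck(loaded_model: Optional[str], force: bool = False) -> int:
    """Return the HTTP status a quantize request would receive (hypothetical helper).

    Mirrors the idle-node guardrail: refuse with 409 while a model is loaded,
    unless the caller explicitly passes force=True.
    """
    if loaded_model is not None and not force:
        return 409  # node is busy serving a model
    return 202  # assumed "accepted" status; the PR only specifies the 409 case


def route(path: str, model: str, registry: Dict[str, str]) -> Optional[str]:
    """Resolve a /v1/* request to a node URL by model name (hypothetical helper).

    `registry` maps model name -> base URL of the node serving it.
    Returns None when the path is outside /v1/ or the model is unknown.
    """
    if not path.startswith("/v1/"):
        return None
    node = registry.get(model)
    if node is None:
        return None
    return node + path
```

With a registry such as `{"my-model": "http://node-a:8000"}`, a request for `/v1/chat/completions` naming `my-model` would resolve to that node, and a quantize call against a node with a loaded model would be refused unless `force` is set.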

Deliberately omitted

  • NVFP4 on Qwen3.5 — flagged experimental, not documented as supported.
  • DS4 (not built) and TRITON_ATTN (currently a no-op hedge) — not documented.

Validation

  • docs.json parses as valid JSON; every nav-referenced guide page exists on disk.
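
A check along these lines can reproduce the validation step. It is a minimal sketch, assuming that nav entries are plain strings inside `pages` lists under a top-level `navigation` key, and that each entry maps to a `<entry>.mdx` file relative to the docs root. The helper names are hypothetical, and the actual script used for this PR is not shown here.

```python
import json
from pathlib import Path
from typing import Any, Iterator, List


def iter_pages(node: Any) -> Iterator[str]:
    """Yield every string found in a "pages" list, recursing into nested groups."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "pages" and isinstance(value, list):
                for item in value:
                    if isinstance(item, str):
                        yield item
                    else:
                        # Nested group object: keep walking.
                        yield from iter_pages(item)
            else:
                yield from iter_pages(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_pages(item)


def missing_pages(docs_root: Path) -> List[str]:
    """Return nav entries that have no matching .mdx file under docs_root.

    json.loads raises ValueError if docs.json is not valid JSON,
    which covers the "parses as valid JSON" half of the check.
    """
    config = json.loads((docs_root / "docs.json").read_text(encoding="utf-8"))
    return [
        page
        for page in iter_pages(config.get("navigation", {}))
        if not (docs_root / f"{page}.mdx").exists()
    ]
```

Running `missing_pages(Path("."))` from the docs root should return an empty list when every nav-referenced page exists on disk.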

Do not merge yet — opening for review.

🤖 Generated with Claude Code

https://claude.ai/code/session_01JfM4xyZR4DdC3W74ea99Mi

Bring the docs site up to the 0.4.44 product surface.

- add guides/quantize.mdx — in-browser AWQ / NVFP4 quantization, idle-node
  guardrail, HF push, Qwen3.5 multimodal/hybrid handling (AWQ verified;
  NVFP4-on-Qwen3.5 marked experimental)
- add guides/secrets.mdx — local Secrets store; HF read vs write tokens
  (write required for pushes, read-only rejected up front); NGC/W&B/OpenAI
- add guides/federation.mdx — master router (route /v1/* by model name),
  load/unload any node from the UI, model stacking + persistence, mem-util knob
- rewrite guides/models.mdx — two-list Installed/Browse redesign, Unload
  (was DELETE), serve-from-on-disk-weights
- introduction.mdx — add Quantization / Federated serving / Model stacking
  cards; add federation + quantization to the architecture box
- docs.json — add the three new pages to the Guides nav

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01JfM4xyZR4DdC3W74ea99Mi
@mintlify

mintlify Bot commented Jun 27, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| :-- | :-- | :-- | :-- |
| justme-8834e675 | 🟢 Ready | View Preview | Jun 27, 2026, 4:55 PM |
