docs(schema): document HNSW int8 quantization option#508

Closed
kriszyp wants to merge 0 commits into main from kris/hnsw-int8-quantization-docs

Conversation

@kriszyp kriszyp commented Jun 1, 2026

Member

Summary

Documents the new `quantization: "int8"` option for HNSW vector indexes: adds a row to the HNSW parameter table and a short example.

Context

Pairs with HarperFast/harper#894 (optional int8 vector quantization for the HNSW index): roughly 5× smaller index, substantially faster search, and about a 1% recall cost. The option is opt-in, and the record's full-precision vector is unchanged. Making it the default is tracked separately (HarperFast/harper#932).

Generated with the assistance of an LLM (Claude Opus 4.8).

@kriszyp kriszyp requested a review from a team as a code owner June 1, 2026 12:28

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the database schema documentation to include details and an example for the new quantization parameter (specifically "int8") for HNSW indexes. As there are no review comments, I have no feedback to provide.

@github-actions github-actions Bot temporarily deployed to pr-508 June 1, 2026 12:32 Inactive

github-actions Bot commented Jun 1, 2026

🚀 Preview Deployment

Your preview deployment is ready!

🔗 Preview URL: https://preview.harper-documentation.harperfabric.com/pr-508

This preview will update automatically when you push new commits.

Comment thread reference/database/schema.md Outdated
| `optimizeRouting` | `0.5` | Heuristic aggressiveness for omitting redundant connections (0 = off, 1 = most aggressive) |
| `mL` | computed from `M` | Normalization factor for level generation |
| `efSearchConstruction` | `50` | Max nodes explored during search |
| `quantization` | _(full precision)_ | `"int8"` stores each indexed vector as 8-bit scalar-quantized values plus a per-vector scale instead of float32 — roughly a 5× smaller index and substantially faster search, at a small recall cost (~1%). Omit for full-precision float32. Only the index is quantized; the full-precision vector on the record is unchanged. |
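
For readers unfamiliar with the technique, the `quantization` row above describes per-vector scalar quantization. The following is a minimal Python sketch of that idea, assuming symmetric quantization to the range [-127, 127] with one scale per vector. The function names `quantize_int8` and `dequantize_int8` are hypothetical, and the actual implementation in HarperFast/harper#894 may differ.

```python
from typing import List, Tuple


def quantize_int8(vector: List[float]) -> Tuple[List[int], float]:
    """Map each float to an integer code in [-127, 127] using one scale per vector.

    Illustrative only: symmetric quantization is an assumption, not a
    description of Harper's internals.
    """
    max_abs = max((abs(x) for x in vector), default=0.0)
    if max_abs == 0.0:
        # All-zero (or empty) vector: any non-zero scale works.
        return [0] * len(vector), 1.0
    scale = max_abs / 127.0
    # Round to the nearest code and clamp to guard against float error.
    codes = [max(-127, min(127, round(x / scale))) for x in vector]
    return codes, scale


def dequantize_int8(codes: List[int], scale: float) -> List[float]:
    """Recover an approximation of the original vector."""
    return [c * scale for c in codes]


vector = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(vector)
approx = dequantize_int8(codes, scale)

# Each component is off by at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(vector, approx))

# Raw component storage: 4 bytes per float32 value versus
# 1 byte per int8 code plus one 4-byte scale for the whole vector.
float32_bytes = 4 * len(vector)
int8_bytes = 1 * len(codes) + 4
```

The rounding step is where the small recall cost comes from: each stored component can differ from the original by up to half of `scale`. This sketch only accounts for raw component storage and does not attempt to reproduce the index-level size or speed figures quoted in the row.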

Member

I think the default should be `"float32"` for this option to make more sense; `_(full precision)_` is not a value. I believe this option currently supports only two values, but will there be more in the future? If not, and this is truly a binary configuration, shouldn't the option reflect that better, for example as a toggleable `int8Quantization: boolean`?

@kriszyp kriszyp closed this Jun 13, 2026
@kriszyp kriszyp force-pushed the kris/hnsw-int8-quantization-docs branch from 7da20a0 to dcd5dc2 Compare June 13, 2026 00:29
kriszyp added a commit that referenced this pull request Jun 13, 2026
* docs(v5.1): release notes, deployment tracking ops, deploy_component updates

- Add 5.1.md release notes covering: models/AI, @embed directive, MCP server,
  deployment tracking, HNSW int8 quantization, and replication improvements
- Update deploy_component docs: urlPath, install_allow_scripts params, deployment_id response
- Document new deployment operations: list_deployments, get_deployment,
  get_deployment_payload, delete_deployment_payload
- Document hdb_deployment record schema (fields, phases, peer_results)

Note: models/AI detail, MCP reference, and HNSW quantization have separate PRs
(#523, #507/#516, #508) — this PR adds the release notes overview and the
deployment tracking operations which had no coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: run prettier on changed files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove cross-plugin MCP link that breaks Docusaurus build

The release-notes and reference doc plugins are separate; relative .md
links between them resolve incorrectly. Removing until PR #507 (MCP
reference section) merges and can be linked with an absolute path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(5.1): expand release notes — middleware/routing, caching, LOCAL_ONLY, HARPER_CONFIG, RocksDB, migrateOnStart, upgrade improvements

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2 participants