acari-git/MLXServerManager

MLX Server Manager

MLX Server Manager is a lightweight macOS SwiftUI GUI for operating local OpenAI-compatible MLX endpoints in Direct Mode.

It is primarily a control surface for app-managed mlx_lm.server, with support for detecting and adopting an already-running external OpenAI-compatible server as connection context.

The app keeps Direct Mode:

OpenAI-compatible client -> mlx_lm.server or adopted external server -> MLX model

MLX Server Manager controls and observes app-managed local server processes, but it does not enter the inference request path. OpenAI-compatible clients connect directly to the server endpoint.
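As a rough illustration of Direct Mode from the client side, the sketch below builds a chat completion request addressed straight to the server endpoint, with nothing in between. The base URL, port, and model ID are assumptions for illustration only; use the values shown in Connection Settings. The helper names `build_chat_request` and `send` are hypothetical and not part of MLX Server Manager.

```python
import json
import urllib.request

# Assumed values for illustration; copy the real ones from Connection Settings.
BASE_URL = "http://127.0.0.1:8080/v1"
MODEL_ID = "local-model"  # hypothetical placeholder


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST aimed directly at the server's /chat/completions route."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request to the server and decode the JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_chat_request(BASE_URL, MODEL_ID, "Hello"))` requires a running server; the request never passes through the app.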

Screenshots

Main Dashboard

[Screenshot: MLX Server Manager main dashboard]

The main dashboard shows the core MLX Server Manager workflow in one place: model profiles, app settings, diagnostics, server status, selected model details, OpenAI-compatible connection settings, and logs.

MLX Server Manager remains a Direct Mode control surface for mlx_lm.server. It does not proxy inference requests.

Connection Settings / Current Target

[Screenshot: Connection Settings Current Target summary]

Connection Settings shows the current OpenAI-compatible target clearly.

It displays:

  • target type
  • Base URL
  • Model ID
  • API key placeholder
  • readiness endpoint
  • ownership note
  • copy actions for client setup

The copy actions help configure OpenAI-compatible clients, including Hermes Agent, without routing inference through MLX Server Manager.
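To show how copied connection values might be handed to a client, here is a minimal sketch that turns a target into environment-style lines. The `ConnectionTarget` type, the placeholder key, and the `MODEL_ID` variable name are hypothetical; `OPENAI_BASE_URL` and `OPENAI_API_KEY` are the variables read by the official OpenAI Python SDK, and other clients may expect different names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ConnectionTarget:
    """Hypothetical container for the values shown in Connection Settings."""
    base_url: str
    model_id: str
    api_key: str = "not-needed"  # placeholder; assumed that the local server ignores it


def as_env_lines(target: ConnectionTarget) -> List[str]:
    """Format the target as KEY=value lines for a client's environment."""
    return [
        f"OPENAI_BASE_URL={target.base_url}",
        f"OPENAI_API_KEY={target.api_key}",
        f"MODEL_ID={target.model_id}",  # hypothetical variable name
    ]
```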

Adopted External Server

[Screenshot: Adopted External Server state]

Adopted External Server mode lets you treat a detected external OpenAI-compatible server as connection context.

Adopt does not mean process ownership. MLX Server Manager does not stop, restart, kill, monitor memory for, or collect logs from adopted external servers.
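The ownership boundary can be pictured as a small guard: only an app-managed server is eligible for lifecycle actions. This is a sketch of the rule as described here, not the app's implementation; the `TargetType` enum and `may_control_process` function are hypothetical names.

```python
from enum import Enum


class TargetType(Enum):
    """Hypothetical names for the connection states described in this README."""
    MANAGED = "managed"
    EXTERNAL_DETECTED = "external_detected"
    ADOPTED = "adopted"
    NOT_RUNNING = "not_running"


def may_control_process(target: TargetType) -> bool:
    """Stop, restart, kill, memory display, and log capture apply to managed servers only."""
    return target is TargetType.MANAGED
```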

The Direct Mode path remains:

OpenAI-compatible client -> mlx_lm.server or adopted external server -> MLX model

First-run Onboarding Guidance

[Screenshot: First-run Onboarding Guidance panel]

The onboarding guidance panel gives short, state-aware setup hints for first-time users.

It helps users understand what to configure next, such as the mlx_lm.server executable path, selected model profile, server state, and connection settings.

The guidance is informational only. It does not install dependencies, download models, start external processes, proxy inference, or change process ownership.

Why This Project Exists

mlx_lm.server is fast and simple, but day-to-day local use benefits from a small GUI around process management, diagnostics, model profiles, logs, memory display, external endpoint visibility, and connection settings.

MLX Server Manager exists to provide that management layer without becoming the inference layer. The goal is to make pure mlx_lm.server easier to operate for local OpenAI-compatible clients, especially agent tools that need a stable local endpoint.

Project Principles

MLX Server Manager follows three product principles:

  1. Preserve mlx-lm runtime performance as the top priority.
  2. Make mlx-lm usable for users who are not comfortable with CLI workflows.
  3. Adopt useful features from other local LLM tools when they do not conflict with mlx-lm performance, safety, or Direct Mode boundaries.

See docs/product_direction.md for the full project direction, including current non-goals and future candidate features.

What This Is

  • A local macOS app for starting, stopping, and restarting an app-managed mlx_lm.server.
  • A status and diagnostics surface for readiness checks via GET /v1/models.
  • A model profile editor for local OpenAI-compatible endpoint settings.
  • A managed-process log and memory display.
  • A Direct Mode connection settings copier for OpenAI-compatible clients.
  • A conservative external server detector for selected host/port endpoints.
  • An Adopt External Server flow for connection context only, not process ownership.

What This Is Not

  • Not a chat UI.
  • Not an inference proxy.
  • Not a full Hugging Face browser: no model-card browsing, HF token storage, model deletion, or cache cleanup.
  • Not a multi-backend wrapper.
  • Not a replacement for mlx-lm or model setup.

Install

Download the latest app-code release asset from GitHub Releases:

MLXServerManager-v20.2.0-unsigned.zip

On the GitHub Release page, use the file listed under Assets with this exact name. Do not use Source code (zip) or Source code (tar.gz) when you want the app.

v20.2.0 is the latest app-code release. v20.1.0 was a docs-only current-state alignment release.

Verify the checksum before opening the app:

SHA-256: 8f36dde1514fb52e702b00e1926e6443ad4a2ee00c8dd24fd78d253906435afc

From the folder containing the downloaded zip, you can check it with:

shasum -a 256 MLXServerManager-v20.2.0-unsigned.zip
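If you prefer a scripted comparison over reading the hash by eye, the following is a minimal sketch using Python's standard `hashlib`. The expected value is the SHA-256 published above; the helper names are hypothetical.

```python
import hashlib
from typing import Iterable, Iterator

EXPECTED_SHA256 = "8f36dde1514fb52e702b00e1926e6443ad4a2ee00c8dd24fd78d253906435afc"


def read_chunks(path: str, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield the file in chunks so large zips are not loaded into memory at once."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return
            yield chunk


def sha256_hex(chunks: Iterable[bytes]) -> str:
    """Return the lowercase hex SHA-256 of the concatenated chunks."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare the file's digest with the published checksum."""
    return sha256_hex(read_chunks(path)) == expected.strip().lower()
```

For example, `matches_expected("MLXServerManager-v20.2.0-unsigned.zip")` should be true before you open the app.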

Extract the zip and confirm it contains MLXServerManager.app. Move the app to your preferred local location, such as /Applications or a user-owned apps folder.
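The zip contents can also be checked without extracting. This sketch uses the standard `zipfile` module and simply looks for a path component named MLXServerManager.app; the archive's exact internal layout is not documented here, so the check is deliberately loose. The function names are hypothetical.

```python
import zipfile
from pathlib import PurePosixPath
from typing import Iterable

APP_NAME = "MLXServerManager.app"


def contains_app_bundle(names: Iterable[str], app_name: str = APP_NAME) -> bool:
    """True if any archive entry has the app bundle as one of its path components."""
    return any(app_name in PurePosixPath(name).parts for name in names)


def zip_contains_app(zip_path: str, app_name: str = APP_NAME) -> bool:
    """List the archive entries (zip paths use forward slashes) and look for the bundle."""
    with zipfile.ZipFile(zip_path) as archive:
        return contains_app_bundle(archive.namelist(), app_name)
```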

This is an unsigned, non-notarized local-use build. macOS may show a Gatekeeper warning. If you trust the Release asset, verify the zip contents and checksum before removing quarantine:

xattr -dr com.apple.quarantine /path/to/MLXServerManager.app
open -n /path/to/MLXServerManager.app

Do not download the source archive if you want the app binary. Use the named MLXServerManager-v20.2.0-unsigned.zip release asset.

Quick Start

  1. Open MLXServerManager.app.
  2. Review the Dashboard first-launch checklist and fix missing items from the displayed next actions.
  3. Use the executable picker or Settings to set the mlx_lm.server executable path.
  4. Add a model using one of the Dashboard paths:
    • search Hugging Face, choose a result, review the preview, then download it;
    • paste a Hugging Face model ID / URL and download it;
    • register an existing local model folder with the folder picker;
    • or use the advanced profile editor.
  5. Confirm the model appears in the model list, has the expected source badge, and is selected. Use the source filter when you have many profiles.
  6. Use runtime diagnostics and Start preflight before launching mlx_lm.server.
  7. After the server is ready, run the explicit Speed Test to measure /v1/models latency.
  8. Review benchmark history, best/average latency, and failure guidance in the Dashboard runtime panel.
  9. If a download fails, use the queue restore, URL copy, or Retry actions.
  10. Confirm the Current Target summary in Connection Settings or the Dashboard connection card.
  11. Copy the Hermes Agent, generic OpenAI-compatible, curl preset, or benchmark summary.
  12. Paste connection values into an OpenAI-compatible client.
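For a sense of what the Speed Test and benchmark summary in steps 7 and 8 measure, here is a hedged sketch that times GET /v1/models and reports best and average latency. It is not the app's implementation; the function names and the summary keys are hypothetical.

```python
import time
import urllib.request
from statistics import mean
from typing import Dict, List, Optional


def time_models_request(base_url: str, timeout: float = 5.0) -> Optional[float]:
    """Return the GET /models round-trip time in milliseconds, or None on failure."""
    url = base_url.rstrip("/") + "/models"
    started = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
    except OSError:  # covers URLError, refused connections, and timeouts
        return None
    return (time.perf_counter() - started) * 1000.0


def summarize(samples_ms: List[Optional[float]]) -> Dict[str, Optional[float]]:
    """Summarize a run: failed samples are counted but excluded from the timings."""
    succeeded = [sample for sample in samples_ms if sample is not None]
    return {
        "count": len(samples_ms),
        "failures": len(samples_ms) - len(succeeded),
        "best_ms": min(succeeded) if succeeded else None,
        "average_ms": mean(succeeded) if succeeded else None,
    }
```

Note that this measures the model-listing route only, not token generation speed.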

Current GUI Direction

v20.2.0 focuses on integrated workspace operations polish. The model list now uses status-driven Load / Unload controls, sortable columns, model size display, reasoning toggles, functional auto-unload, Activity Monitor-style memory and CPU graphs, and a SYSTEM hardware summary while preserving Direct Mode.

You must provide your own mlx-lm environment and mlx_lm.server executable. You can register existing model files, search Hugging Face and choose a result, or use the Dashboard Hugging Face card to fetch a model by ID / URL and auto-add it to the model list. The app keeps Direct Mode: the client connects directly to mlx_lm.server; MLX Server Manager does not proxy inference traffic or run chat completions.
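Accepting either a model ID or a URL implies some normalization step. The sketch below shows one plausible way to reduce both forms to an `owner/name` ID; it is an assumption about the general idea, not the app's parser, and it does not handle every Hugging Face URL shape (for example dataset or space URLs). The function name and the example model ID are hypothetical.

```python
from urllib.parse import urlparse

HUGGING_FACE_HOSTS = {"huggingface.co", "www.huggingface.co"}


def normalize_model_ref(text: str) -> str:
    """Reduce 'owner/name' or a huggingface.co model URL to 'owner/name'."""
    value = text.strip()
    if "://" in value:
        parsed = urlparse(value)
        if parsed.netloc.lower() not in HUGGING_FACE_HOSTS:
            raise ValueError(f"not a huggingface.co URL: {value}")
        # Extra segments such as /tree/main are dropped.
        parts = [part for part in parsed.path.split("/") if part][:2]
    else:
        parts = [part for part in value.split("/") if part]
    if len(parts) != 2:
        raise ValueError(f"expected owner/name, got: {value}")
    return "/".join(parts)
```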

See docs/distribution.md for release asset and Gatekeeper details, docs/known_limitations.md for the full stable-scope boundary, and docs/v20_2_integrated_workspace_operations_polish.md for the v20.2.0 integrated workspace operations note.

See docs/benchmark_findings.md for benchmark-informed notes on Direct Mode, long-context workloads, streaming TTFT, and future optional Advanced Launch Options.

Advanced Launch Options are optional, per-profile user-tunable settings. They are empty by default and omitted from launch arguments unless explicitly set. See docs/advanced_launch_options.md for design notes and safety boundaries.
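The "omitted unless explicitly set" rule can be sketched as follows. `--model`, `--host`, and `--port` are standard mlx_lm.server options; any advanced flag names are supplied by the caller, so check `mlx_lm.server --help` for what your installed version accepts. The function name and dictionary shape are hypothetical.

```python
from typing import Dict, List, Optional


def build_launch_args(
    executable: str,
    model: str,
    host: str,
    port: int,
    advanced: Optional[Dict[str, Optional[str]]] = None,
) -> List[str]:
    """Build the argument list; unset or blank advanced options are left out."""
    args = [executable, "--model", model, "--host", host, "--port", str(port)]
    for flag, value in (advanced or {}).items():
        if value is None or str(value).strip() == "":
            continue  # empty by default: do not pass the flag at all
        args.extend([flag, str(value)])
    return args
```

With no advanced options set, the result contains only the executable and the three base options, which matches the default behavior described above.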

External server detection is documented in docs/external_server_detection.md. It detects existing OpenAI-compatible servers on the selected host/port without taking ownership of external processes.
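Conceptually, detection needs no process access at all: a TCP probe plus the readiness route is enough to tell whether something OpenAI-compatible is listening. The sketch below illustrates that idea under the assumption that no app-managed server is bound to the same port; the function names and state labels are hypothetical, and the real rules are in docs/external_server_detection.md.

```python
import socket
from typing import List, Optional


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def classify_endpoint(port_open: bool, model_ids: Optional[List[str]]) -> str:
    """Classify a host/port from the probe result and the /v1/models answer.

    model_ids is None when the listener did not answer like an
    OpenAI-compatible server.
    """
    if not port_open:
        return "not_running"
    if model_ids is None:
        return "unknown_listener"
    return "external_detected"
```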

Adopt External Server behavior is documented in docs/adopt_external_server.md. v1.7.0 adds the initial implementation for explicitly adopting a detected external server as connection context only, without taking process ownership.

Connection Settings polish is documented in docs/connection_settings_polish.md. v1.9.0 implements the initial Current Target summary and expanded copy actions for Managed, External Detected, Adopted, and Not Running connection states. Direct Mode remains unchanged.

Dashboard UI Refresh v1 is documented in docs/dashboard_ui_refresh.md. v4.2.0 through v4.9.0 built the dashboard in small display-oriented steps. v5.0.0 finalizes Dashboard v1 as the current stable overview for Next Steps, Current Target, Server State, Client Setup, Diagnostics & Logs, and Profiles / Import Export while preserving lifecycle behavior, Direct Mode, external server ownership boundaries, and Import / Export behavior. v5.1.0 keeps that surface stable and documents larger layout ideas as future work rather than part of Dashboard v1.

Future Full App Layout Refresh planning is documented in docs/full_app_layout_refresh.md. v5.2.0 is a docs-only planning release for possible future v6.x sidebar, profiles, server, logs, client setup, and inspector surfaces. It does not implement a new layout or change Dashboard v1 behavior.

App Shell / Sidebar Foundation is documented in docs/app_shell_sidebar_foundation.md. v5.3.0 introduced the docs-only design, and v6.0.0 implements the initial native sidebar shell with Dashboard as the default and only active top-level section. It does not change runtime behavior, Direct Mode, lifecycle controls, Import / Export behavior, network behavior, or persistence.

Profiles / Model List Surface design is documented in docs/profiles_model_list_surface.md. v5.4.0 is a docs-only detailed design release for a possible future v6.1.0 Profiles section; it does not implement a model list table, model download, model deletion, or profile behavior changes.

Detail Inspector Foundation design is documented in docs/detail_inspector_foundation.md. v5.5.0 is a docs-only detailed design release for a possible future v6.2.0 inspector area; it does not implement inspector UI, endpoint testing, model file management, or behavior changes.

Logs Panel Refresh design is documented in docs/logs_panel_refresh.md. v6.3.0 implements the first Logs Panel Refresh as a top-level sidebar destination for app-managed lifecycle and log context. v6.3.1 polishes the shared log view with entry count display and stable identifiers. It does not implement external log capture, telemetry, background monitoring, or behavior changes.

Client Setup Surface design is documented in docs/client_setup_surface.md. v6.4.0 implements the first Client Setup surface as a top-level sidebar destination for copy-safe OpenAI-compatible setup values. v6.4.1 polishes that surface with a copy scope card. It does not implement generated client configs, API key storage, endpoint testing, proxying, or behavior changes.

Metrics / System Context design is documented in docs/metrics_system_context.md. v6.5.0 implements the first Metrics / System Context surface as a read-only top-level sidebar destination. v6.5.1 polishes that surface with a context scope card. It does not implement active system monitoring, telemetry, request inspection, or behavior changes.

v6 Implementation Readiness Review is documented in docs/v6_implementation_readiness.md. v5.9.0 is a docs-only readiness release that consolidates v5.2.0 through v5.8.0 planning; v6.0.0 starts app-code work with the narrow App Shell / Sidebar Foundation only.

v6 App Layout Stabilization Review is documented in docs/v6_app_layout_stabilization_review.md. v6.6.0 is a docs-only stabilization review after the v6.0.0 through v6.5.1 App Shell and top-level surface work. v6.6.1 polishes that review with next-phase entry criteria and a manual verification checklist. Neither release produced a new app binary; at that point the downloadable binary remained v6.5.1.

Distribution / Packaging Readiness Review is documented in docs/distribution_packaging_readiness.md. v6.7.0 is a docs-only readiness release for future signed, notarized, DMG, and automated release workflow decisions. v6.7.1 polishes that review with install documentation planning and manual packaging verification notes. v6.8.0 refreshes the README Install and Quick Start sections around the then-current unsigned app asset. v6.8.1 polishes the install section with clearer GitHub Assets guidance and a checksum command. v6.16.0 adds post-closeout packaging checklist polish for the case where Developer ID signing is not yet ready. v6.16.1 adds binary release go/no-go guidance, asset naming failure cases, and release body verification notes. v6.27.0 adds current-download handoff wording so docs-only releases do not point users to GitHub source archives as the app download. None of these releases produced a new app binary; at that point the downloadable binary remained v6.5.1.

Signed Distribution Design is documented in docs/signed_distribution_design.md. v6.9.0 is a docs-only design release for a future signed zip distribution path. v6.9.1 polishes that design with signing implementation entry criteria, manual verification notes, and asset coexistence decision criteria. It does not sign the app, run notarization, create a DMG, create an installer, add release automation, or produce a new app binary.

Notarization Workflow Design is documented in docs/notarization_workflow_design.md. v6.10.0 is a docs-only design release for future notarized distribution. v6.10.1 polishes that design with implementation entry criteria, conservative status wording rules, and fallback decision criteria. It defines credential boundaries, conceptual notarization flow, result handling, asset naming, release notes requirements, verification notes, and fallback policy without running signing, notarization, stapling, DMG, installer, release automation, or producing a new app binary.

Signed Zip Implementation Readiness is documented in docs/signed_zip_implementation_readiness.md. v6.11.0 is a docs-only readiness review for a future signed zip implementation. v6.11.1 polishes that readiness review with Go / No-Go criteria, a release artifact matrix, and a manual verification log template. It defines readiness gates, local signing preconditions, candidate manual flow, required checks, forbidden entries, release notes requirements, README install requirements, and fallback policy without signing the app, producing a new app binary, or changing runtime behavior.

Local Signing Command Draft is documented in docs/local_signing_command_draft.md. v6.12.0 is a docs-only command draft for future local signed zip creation. v6.12.1 polishes that draft with release preflight notes, dry-run expectations, and safer signing command caveats. It documents placeholders, draft build/sign/verify/zip/checksum flow, safety boundaries, verification log template, and fallback rules without executing signing or changing the current release artifact.

Signed Zip Dry-Run Checklist is documented in docs/signed_zip_dry_run_checklist.md. v6.13.0 is a docs-only checklist release for future signed zip dry-run work. v6.13.1 polishes that checklist with pass/fail criteria, evidence expectations, and stop conditions. It defines dry-run scope, repository state checks, release scope checks, build path planning, signing placeholder planning, zip naming, forbidden entries, release notes structure, fallback decisions, and a local dry-run log template without uploading a binary asset or changing runtime behavior.

Signed Zip Local Dry-Run Execution Notes are documented in docs/signed_zip_local_dry_run_execution_notes.md. v6.14.0 is a docs-only execution-notes release for future local-only signed zip dry runs. v6.14.1 polishes those notes with scrub checklist guidance, conservative public summary wording, and handoff criteria before real signed zip implementation. It defines execution boundaries, suggested local dry-run records, scope checks, public asset checks, placeholder status wording, stop conditions, evidence handling, public documentation criteria, and fallback policy without signing, notarization, binary upload, or runtime behavior changes.

Signed Zip Implementation Readiness Closeout is documented in docs/signed_zip_implementation_readiness_closeout.md. v6.15.0 is a docs-only closeout release for the signed zip readiness phase. It maps the readiness documents, defines implementation entry and no-go criteria, records conservative public wording and asset naming policy, and sets the next-step branch between real signed zip distribution and non-signing packaging polish.

Model Download Design is documented in docs/model_download_design.md. v6.17.0 is a docs-only design release for possible future model download support. v6.17.1 polishes that design with failure states, disk-space preflight, token redaction, and partial download policy. It defines download boundaries, destination path policy, credential handling boundaries, download task boundaries, compatibility wording, profile integration boundaries, safety checklist, and verification expectations without downloading models, deleting models, scanning model directories, cleaning caches, persisting tokens, or changing runtime behavior.
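Two of the policies named above, token redaction and disk-space preflight, are easy to picture in code. The sketch below is an illustration of the ideas only, not the design in docs/model_download_design.md: the pattern assumes tokens with the `hf_` prefix, and the reserve size and function names are hypothetical.

```python
import re
import shutil

# Assumes Hugging Face user access tokens, which start with "hf_".
HF_TOKEN_PATTERN = re.compile(r"hf_[A-Za-z0-9]+")


def redact_tokens(text: str) -> str:
    """Replace anything that looks like an hf_ token before logging or copying."""
    return HF_TOKEN_PATTERN.sub("hf_***", text)


def free_bytes_at(path: str) -> int:
    """Free space on the volume that holds the destination path."""
    return shutil.disk_usage(path).free


def has_enough_space(required_bytes: int, free_bytes: int, reserve_bytes: int = 1 << 30) -> bool:
    """Preflight: require the download size plus a safety reserve (1 GiB assumed)."""
    return free_bytes - reserve_bytes >= required_bytes
```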

Model Availability Documentation is documented in docs/model_availability_documentation.md. v6.18.0 is a docs-only documentation release for future model availability surfaces. v6.18.1 polishes that documentation with staleness wording, explicit check scope, external identifier caveats, and UI copy rules. It defines availability terms, local profile availability, external server availability, compatibility boundary, path checking policy, privacy and safety wording, and verification expectations without checking paths automatically, scanning folders, downloading models, deleting models, or changing runtime behavior.

Model Availability Surface Design is documented in docs/model_availability_surface_design.md. v6.25.0 is a docs-only design release for a future model availability surface. It defines conservative placement options, user questions, state display, explicit check behavior, copy-safe path display, profile and diagnostics boundaries, external server boundaries, accessibility identifiers, empty/error states, manual verification, and implementation entry criteria without adding Swift source, tests, path checks, directory scans, downloads, model deletion, diagnostics execution, endpoint calls, inference requests, background monitoring, telemetry, or runtime behavior changes.

Deeper Diagnostics Design is documented in docs/deeper_diagnostics_design.md. v6.19.0 is a docs-only design release for future deeper diagnostics surfaces. v6.19.1 polishes that design with run trigger guardrails, severity policy, timeout/cancellation notes, and export redaction checklist. It defines diagnostic categories, result terms, explicit check scope, endpoint policy, local path policy, external server policy, privacy and redaction rules, copy summary boundaries, repair action boundaries, and verification expectations without running diagnostics, adding endpoint testing, sending inference requests, background monitoring, traffic inspection, telemetry, or runtime behavior changes.

Diagnostics Result Model Design is documented in docs/diagnostics_result_model_design.md. v6.20.0 is a docs-only design release for a future diagnostics result model. v6.20.1 polishes that design with stable IDs, copy/export eligibility, aggregation precedence, and fixture expectations. It defines candidate result shape, status and severity values, scope and category policy, redaction model, message fields, timestamp policy, cancellation and timeout results, aggregation rules, copy summary format, and UI surface boundaries without running diagnostics, adding endpoint testing, persisting diagnostics history, or changing runtime behavior.

Diagnostics Result Fixture Design is documented in docs/diagnostics_result_fixture_design.md. v6.21.0 is a docs-only design release for future diagnostics result fixtures. v6.21.1 polishes that design with fixture matrix, negative fixtures, deterministic ordering rules, and snapshot safety. It defines fixture naming policy, required status, severity, scope, category, redaction, copy summary, timeout/cancellation, external server ownership, selected profile mutation, aggregation, and fixture data safety expectations without adding tests, running diagnostics, or changing runtime behavior.

Diagnostics Result Fixture File Layout Design is documented in docs/diagnostics_result_fixture_file_layout_design.md. v6.22.0 is a docs-only design release for future diagnostics fixture file layout. The design compares JSON, Swift static fixtures, and Markdown examples, and defines candidate test fixture directories, naming rules, schema expectations, snapshot safety, negative fixture layout, redaction fixture layout, test target boundaries, and a review checklist. v6.22.1 polishes that design with a schema validation boundary, file inclusion rules, a fixture review checklist, and implementation entry criteria. v6.23.0 adds initial diagnostics fixture data files under MLXServerManagerTests/Fixtures/Diagnostics/ without adding Swift source, tests, diagnostics execution, or runtime behavior changes. v6.23.1 polishes those fixture files with a fixture index README, a blocking-severity fixture, a home path compaction fixture, a raw-command-output negative fixture, and a copied-summary coverage update. v6.24.0 adds diagnostics fixture loading tests for result, redaction, negative, aggregation, and copied-summary fixtures without adding production diagnostics execution or runtime behavior changes.

Per-version summary, v6.0.1 through v7.2.0:

  • v6.0.1 is a small App Shell / Sidebar polish release. It keeps Dashboard as the only active top-level section and adds macOS sidebar list styling plus sidebar accessibility polish.
  • v6.0.2 follows up with App Shell release hygiene before v6.1.0 work begins.
  • v6.0.3 adds stable App Shell accessibility identifiers for the sidebar, detail area, and section rows without changing runtime behavior.
  • v6.0.4 adds focused AppSection metadata tests to lock the Dashboard-only v6.0.x shell boundary before v6.1.0 work begins.
  • v6.0.5 closes the v6.0.x App Shell foundation series with a docs-only v6.1 implementation handoff.
  • v6.1.0 adds the first staged Profiles / Model List Surface as a top-level sidebar destination while keeping Dashboard as the default Direct Mode control surface.
  • v6.1.1 polishes the Profiles surface with summary cards and clearer list identifiers while preserving runtime behavior.
  • v6.2.0 adds the first read-only Detail Inspector Foundation as a top-level sidebar destination for selected profile and connection target details.
  • v6.2.1 polishes the Inspector with summary cards and clearer target status identifiers while preserving runtime behavior.
  • v6.3.0 adds the first Logs Panel Refresh as a top-level sidebar destination for app-managed lifecycle and log context while preserving runtime behavior.
  • v6.3.1 polishes the shared LogView with entry count display and stable identifiers while preserving runtime behavior.
  • v6.4.0 adds the first Client Setup surface as a top-level sidebar destination for copy-safe OpenAI-compatible setup values while preserving Direct Mode and runtime behavior.
  • v6.4.1 polishes Client Setup with a copy scope card while preserving existing copy actions and runtime behavior.
  • v6.5.0 adds the first read-only Metrics / System Context surface while preserving Direct Mode and runtime behavior.
  • v6.5.1 polishes Metrics with a context scope card while preserving read-only behavior.
  • v6.6.0 records a docs-only App Layout Stabilization Review and does not produce a new app binary.
  • v6.6.1 polishes the stabilization review with next-phase entry criteria and manual verification notes.
  • v6.7.0 adds a docs-only Distribution / Packaging Readiness Review and does not produce a new app binary.
  • v6.7.1 polishes that review with install documentation planning and packaging verification notes.
  • v6.8.0 refreshes the README Install and Quick Start sections while preserving the current v6.5.1 app binary.
  • v6.8.1 polishes the install guidance with GitHub Assets and checksum command details.
  • v6.9.0 adds a docs-only Signed Distribution Design while preserving the current v6.5.1 unsigned app binary.
  • v6.9.1 polishes the signed distribution design with entry criteria and verification notes.
  • v6.10.0 adds a docs-only Notarization Workflow Design while preserving the current v6.5.1 unsigned app binary.
  • v6.10.1 polishes the notarization workflow design with entry criteria, status wording, and fallback decision notes.
  • v6.11.0 adds a docs-only Signed Zip Implementation Readiness review while preserving the current v6.5.1 unsigned app binary.
  • v6.11.1 polishes that readiness review with Go / No-Go criteria, artifact states, and verification log guidance.
  • v6.12.0 adds a docs-only Local Signing Command Draft.
  • v6.12.1 polishes that draft with preflight, dry-run, and signing caveat notes.
  • v6.13.0 adds a docs-only Signed Zip Dry-Run Checklist.
  • v6.13.1 polishes that checklist with pass/fail criteria, evidence expectations, and stop conditions.
  • v6.14.0 adds docs-only Signed Zip Local Dry-Run Execution Notes.
  • v6.14.1 polishes those notes with scrub checklist, public summary wording, and handoff criteria.
  • v6.15.0 closes the signed zip readiness phase and defines the next implementation decision point.
  • v6.16.0 adds docs-only packaging checklist polish instead of starting signed zip distribution without Developer ID readiness.
  • v6.16.1 polishes packaging checks with binary release go/no-go guidance, asset naming failure cases, and release body verification notes.
  • v6.17.0 adds docs-only Model Download Design with strict user-control, credential, and Direct Mode boundaries.
  • v6.17.1 polishes that design with failure states, disk-space preflight, token redaction, and partial download policy.
  • v6.18.0 adds docs-only Model Availability Documentation for conservative configured/present/missing/external/unknown wording.
  • v6.18.1 polishes that documentation with stale-state wording, explicit check scope, external identifier caveats, and UI copy rules.
  • v6.19.0 adds docs-only Deeper Diagnostics Design while keeping diagnostics explicit, local, and non-inferential.
  • v6.19.1 polishes that design with run trigger guardrails, severity policy, timeout/cancellation notes, and export redaction checklist.
  • v6.20.0 adds docs-only Diagnostics Result Model Design for future copy-safe and scoped diagnostics output.
  • v6.20.1 polishes that design with stable IDs, copy/export eligibility, aggregation precedence, and fixture expectations.
  • v6.21.0 adds docs-only Diagnostics Result Fixture Design for future deterministic and redaction-safe diagnostics tests.
  • v6.21.1 polishes that design with fixture matrix, negative fixtures, deterministic ordering rules, and snapshot safety.
  • v6.22.0 adds docs-only Diagnostics Result Fixture File Layout Design for future test fixture storage and format decisions.
  • v6.22.1 polishes that design with schema validation boundary, file inclusion rules, fixture review checklist, and implementation entry criteria.
  • v6.23.0 adds initial diagnostics fixture data files without adding tests or runtime behavior.
  • v6.23.1 polishes those fixtures with index, blocking, redaction, negative, and copied-summary updates.
  • v6.24.0 adds diagnostics fixture loading tests for result, redaction, negative, aggregation, and copied-summary fixtures while preserving runtime behavior.
  • v6.25.0 adds docs-only Model Availability Surface Design for future selected-profile availability UI placement, explicit check behavior, copy-safe path display, and stale-state handling while preserving runtime behavior.
  • v6.26.0 adds docs-only README screenshot refresh readiness for future v6 layout screenshots while preserving existing screenshot links and runtime behavior.
  • v6.27.0 adds docs-only packaging polish for current-download handoff wording while preserving runtime behavior.
  • v6.28.0 adds docs-only LAN Web UI Design for a possible future local-network status surface while preserving Direct Mode and runtime behavior.
  • v6.29.0 adds docs-only Automatic Unload Policy Design for future app-managed server idle policy work while preserving runtime behavior.
  • v6.30.0 adds docs-only v7 implementation readiness review and recommends Model Availability Surface as the first v7 app-code release.
  • v6.30.1 polishes v7 readiness with implementation file boundaries, test order, and first v7 asset handoff checks.
  • v7.0.0 adds the first Model Availability Surface to Detail Inspector with selected-profile explicit checks, conservative availability states, copy-safe path display, and focused tests while preserving Direct Mode.
  • v7.1.0 adds Unified Dashboard GUI Foundation, making Dashboard the primary one-screen surface for model list, lifecycle controls, logs, selected model settings, model availability, runtime status, and Hermes connection values.
  • v7.2.0 adds explicit Hugging Face download by model ID or huggingface.co URL, saving to a local folder and adding the downloaded local path as a model profile.

LAN Web UI Design is documented in docs/lan_web_ui_design.md. v6.28.0 defines disabled-by-default behavior, binding policy, access-control candidates, read-only initial scope, adapter boundary, privacy and redaction requirements, and verification expectations without adding a web server, network listener, routes, browser UI, authentication implementation, inference proxy, or runtime behavior change.

Automatic Unload Policy Design is documented in docs/automatic_unload_policy_design.md. v6.29.0 defines default-off behavior, app-managed-only scope, app-observed idle wording, external server boundaries, persistence boundaries, log safety, and verification expectations without adding timers, background monitors, request observers, network hooks, lifecycle changes, or runtime behavior changes.

v7 Implementation Readiness Review is documented in docs/v7_implementation_readiness.md. v6.30.0 closes the v6 design and boundary-setting phase with a docs-only handoff for the first v7 app-code implementation. It recommends v7.0.0 as a Model Availability Surface release with selected-profile, read-only, explicit-check scope while preserving Direct Mode. v6.30.1 polishes that handoff with minimal file boundaries, test order, and the expected first v7 unsigned app asset name. v7.0.0 implements that first surface in Detail Inspector without model downloads, deletion, scanning, endpoint probing, or inference proxying. v7.1.0 shifts the product back toward the original beginner-friendly GUI goal by consolidating the main workflow into a unified Dashboard without adding inference proxying or background automation. v7.2.0 adds an explicit Hugging Face download panel for users who want to download by model ID or model URL before starting mlx_lm.server.

Screenshot refresh planning is documented in docs/screenshot_refresh.md. Future screenshots should cover the v1.9+ Current Target summary and Adopted External Server states without exposing private paths or secrets. v6.26.0 adds a docs-only README screenshot refresh readiness pass for future v6 layout screenshots without adding image files, changing README screenshot links, or changing runtime behavior.

First-run guidance is documented in docs/onboarding_first_run.md. v2.4.0 adds a small in-app guidance panel that points first-time users toward executable path setup, model profile selection, diagnostics, Start, and Connection Settings while preserving Direct Mode.

Model Profile export and import are documented in docs/model_profile_import_export.md. v4.0.0 treats Import / Export as a stable metadata-only feature set: Export Profiles, Import Preview, Import Selected Profiles, Rename for profile-name conflicts, explicit Replace for one unambiguous existing profile target, and deterministic regression tests. Import does not include model weights, caches, API keys, tokens, executable paths, or automatic server start.
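The metadata-only boundary and the Rename conflict rule can be illustrated with a short sketch. The field names, the excluded-key set, and the "name (2)" suffix scheme are all hypothetical; the actual format and rules are described in docs/model_profile_import_export.md.

```python
from typing import Dict, Set

# Hypothetical field names for values that must never be exported.
EXCLUDED_KEYS = {"api_key", "token", "executable_path"}


def export_profile(profile: Dict[str, str]) -> Dict[str, str]:
    """Keep metadata only: drop secrets and machine-specific executable paths."""
    return {key: value for key, value in profile.items() if key not in EXCLUDED_KEYS}


def unique_name(name: str, existing: Set[str]) -> str:
    """Rename on conflict by appending the first free ' (n)' suffix, starting at 2."""
    if name not in existing:
        return name
    counter = 2
    while f"{name} ({counter})" in existing:
        counter += 1
    return f"{name} ({counter})"
```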

Current Binary Asset

The current downloadable app binary asset ships with the latest app-code release:

  • MLXServerManager-v20.2.0-unsigned.zip

v20.2.0 is the latest app-code release and includes a new unsigned app zip.

Release types by version:

  • v4.0.0 and v4.1.0: docs-only preparation releases.
  • v4.2.0 through v5.0.0: app-code dashboard polish releases with unsigned app zip assets.
  • v5.1.0 through v5.9.0: documentation releases.
  • v6.0.0: app-code shell foundation release.
  • v6.0.1: app-code sidebar polish release.
  • v6.0.2: app-code release hygiene follow-up.
  • v6.0.3: app-code App Shell identifier follow-up.
  • v6.0.4: app-code AppSection metadata test follow-up with an unsigned app zip asset.
  • v6.0.5: docs-only, no new app zip.
  • v6.1.0: app-code Profiles / Model List Surface release.
  • v6.1.1: app-code Profiles Surface polish release.
  • v6.2.0: app-code Detail Inspector Foundation release.
  • v6.2.1: app-code Detail Inspector polish release.
  • v6.3.0: app-code Logs Panel Refresh release.
  • v6.3.1: app-code Logs Surface polish release.
  • v6.4.0: app-code Client Setup Surface release.
  • v6.4.1: app-code Client Setup Surface polish release.
  • v6.5.0: app-code Metrics / System Context release.
  • v6.5.1: app-code Metrics Surface polish release with an unsigned app zip asset.
  • v6.6.0 through v6.14.1 (each .0 and .1 release), v6.15.0, and v6.16.0 through v6.22.1 (each .0 and .1 release): docs-only, no new app zip.
  • v6.23.0: fixture data files only, no new app zip.
  • v6.23.1: fixture data polish only, no new app zip.
  • v6.24.0: test-only diagnostics fixture loading coverage, no new app zip.
  • v6.25.0, v6.26.0, v6.27.0, v6.28.0, v6.29.0, v6.30.0, and v6.30.1: docs-only, no new app zip.
  • v7.0.0: app-code Model Availability Surface release with an unsigned app zip asset.
  • v7.1.0: app-code Unified Dashboard GUI Foundation release with an unsigned app zip asset.
  • v7.2.0: app-code Hugging Face Download by ID / URL release with an unsigned app zip asset.

Target Users

  • macOS users running local MLX / mlx-lm.
  • Users who want a GUI for mlx_lm.server Start, Stop, Restart, diagnostics, logs, model profiles, and connection settings.
  • Users of OpenAI-compatible clients such as Hermes Agent, Open WebUI, LibreChat, AnythingLLM, or custom scripts.

Supported Client Context

MLX Server Manager presents connection information for OpenAI-compatible clients. Typical clients use:

  • Base URL: http://127.0.0.1:8080/v1
  • Model ID: the selected Model Profile's modelID
  • API key placeholder: not-required-local

The client sends inference requests directly to the selected server endpoint. MLX Server Manager starts, stops, monitors, diagnoses, and copies connection settings for app-managed mlx_lm.server; for adopted external servers it provides connection context only.
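
As a rough illustration of how the three values fit together, the Python sketch below derives a client configuration from a profile's host, port, and model ID. The function name `client_config` and its parameters are hypothetical and are not part of MLX Server Manager; the output keys follow the JSON config example in this README.

```python
import json


def client_config(host: str, port: int, model_id: str) -> dict:
    """Build OpenAI-compatible client settings for a local endpoint.

    Hypothetical helper for illustration only; not part of MLX Server Manager.
    """
    return {
        # Placeholder value used in this README where a client insists on some key.
        "api_key": "not-required-local",
        # OpenAI-compatible clients expect the base URL to end in /v1.
        "base_url": f"http://{host}:{port}/v1",
        "model": model_id,
    }


config = client_config("127.0.0.1", 8080, "unsloth/Qwen3.6-35B-A3B-UD-MLX-4bit")
print(json.dumps(config, indent=2))
```

The client then talks to `base_url` directly; nothing in this sketch routes traffic through the app.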

For Hermes Agent and similar clients, see docs/hermes_agent_connection.md. Hermes Agent is treated as an OpenAI-compatible client; MLX Server Manager still stays outside the inference request path.

Current Feature Set

As of v20.2.0, MLX Server Manager includes:

  • Start, Stop, and Restart for the mlx_lm.server process started by this app.
  • Managed-process-only Stop and Restart behavior.
  • Port availability check.
  • Ready check via GET /v1/models.
  • Settings save and restore.
  • Model profile add, edit, delete, and selection.
  • Export Profiles for model profile metadata backup.
  • Import selected valid model profiles from JSON metadata.
  • Import Preview validation for schema v1 profile export documents.
  • Rename for imported profile-name conflicts.
  • Explicit Replace for one unambiguous existing profile target.
  • Import / Export fixtures and XCTest regression coverage.
  • Model switching with Restart required state.
  • Advanced Launch Options per model profile.
  • External Server Detection for selected host/port endpoints.
  • Adopt External Server and Forget External Server for connection context only.
  • Current Target summary in Connection Settings:
    • Managed Server
    • External Server Detected
    • Adopted External Server
    • Not Running / Not Connected
  • Dashboard foundation cards for Current Target and Server State.
  • Polished Current Target wording for no target, managed server, external server, adopted external server, unavailable endpoint, and readiness states.
  • Polished Server State wording for managed process ownership, external context, readiness, lifecycle, stopped, unavailable, and failed states.
  • Display-only Dashboard guidance for logs, diagnostics, readiness failures, port busy states, unavailable targets, and external server log boundaries.
  • Display-only Dashboard guidance for selected profile metadata, profile endpoint, current target relationship, Export Profiles, Import Preview, Rename, Replace, and metadata-only Import / Export safety.
  • Display-only Dashboard Next Steps guidance for first-run setup, managed Start, external adoption, readiness expectations, Direct Mode, and manual troubleshooting boundaries.
  • Dashboard grouping headings and scan order that clarify Next Steps, Current Target, Server State, Diagnostics & Logs, and Profiles / Import Export responsibilities.
  • Display-only Dashboard Client Setup guidance for active endpoint, selected profile model ID, profile endpoint relationship, readiness before client use, and Direct Mode copy context.
  • Dashboard UI Refresh v1 as the stable display-oriented overview for operational state and connection guidance.
  • Lightweight Onboarding Guidance panel for first-run setup and connection state hints.
  • Menu bar quick actions.
  • Logs readability improvements.
  • Copy Logs.
  • Setup Diagnostics summary.
  • Copy Diagnostics Summary.
  • OpenAI-compatible connection setting copy actions:
    • Copy Base URL
    • Copy Model ID
    • Copy API key placeholder
    • Copy JSON config
    • Copy Hermes Agent config
    • Copy all connection settings
    • Copy curl /v1/models readiness check
    • Copy OpenAI-compatible chat example text
  • Hugging Face search as a lightweight model discovery surface.
  • Explicit Hugging Face download by model ID or URL through local tooling.
  • Download queue status, progress parsing, retry, cancel, restore form, and URL copy actions.
  • Optional auto-add of downloaded models to the model profile list.
  • Integrated Dashboard / workspace for model list, lifecycle actions, logs, details, recovery, settings, and connection copy actions.
  • Start guardrails and integrated Recovery actions for common runtime and download failures.
  • Unsigned .app zip distribution documentation.

The copied curl /v1/chat/completions text is only a client-side convenience example. The app itself uses /v1/models for readiness and diagnostics and does not send inference requests.
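
For scripts that want a similar readiness signal, a minimal sketch of a GET /v1/models probe might look like the following. The function names are hypothetical, and the criteria used here (HTTP 200 plus a parseable JSON body in the usual OpenAI list shape) are an assumption; the app's own readiness logic is not documented in this section.

```python
import http.client
import json
import urllib.error
import urllib.request


def parse_model_ids(body: str) -> list[str]:
    """Extract model IDs from an OpenAI-style /v1/models response body."""
    payload = json.loads(body)
    if not isinstance(payload, dict):
        return []
    entries = payload.get("data", [])
    return [e["id"] for e in entries if isinstance(e, dict) and "id" in e]


def is_ready(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/models answers 200 with parseable JSON.

    Hypothetical helper; it never sends an inference request.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
            parse_model_ids(resp.read().decode("utf-8"))
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, non-2xx status, or a malformed body.
        return False
```

Calling `is_ready("http://127.0.0.1:8080/v1")` before pointing a client at the endpoint keeps the check on /v1/models only, in line with the boundary described above.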

Non-Goals

  • Chat UI.
  • Proxy mode.
  • LAN Web UI.
  • App Intents.
  • Auto unload.
  • Full Hugging Face model-card browsing.
  • HF token storage or credential management.
  • Silent or background model downloads.
  • Model deletion.
  • Hugging Face cache deletion.
  • Multiple concurrent server management.
  • Simultaneous launch of multiple models.
  • RAG.
  • Embedding manager.
  • Tool-call translation.
  • Telemetry, analytics, crash reporting, external log sending, or cloud logging.
  • Persistent file logging.
  • Notarization, Developer ID signing, DMG, App Store distribution, Homebrew cask, auto updater, or CI/CD release automation.

Model download is available only as an explicit, user-initiated workflow. It must not silently download files, start servers automatically, store credentials, delete models, or clean caches.

First-Run Workflow

  1. Prepare a working local mlx-lm environment yourself.
  2. Launch MLX Server Manager.
  3. Open Settings and set the mlx_lm.server executable path.
  4. Add or configure a Model Profile:
    • search Hugging Face and choose a result;
    • download by Hugging Face model ID / URL;
    • register an existing local model folder;
    • or open the advanced profile editor for Display name, Model ID, Host, Port, thinking option, and Notes.
  5. Run Setup Diagnostics.
  6. Start the managed server.
  7. Confirm Ready status via /v1/models.
  8. Copy Base URL, Model ID, JSON config, or curl examples from Connection Settings.
  9. Paste those values into your OpenAI-compatible client.
  10. Use Stop or Restart when needed.

For local use, 127.0.0.1 is recommended:

  • Host: 127.0.0.1
  • Port: 8080
  • Base URL: http://127.0.0.1:8080/v1
  • API key placeholder: not-required-local

Do not expose mlx_lm.server directly to the internet.
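
Before starting a server on the host and port above, it can help to confirm that nothing else is already bound there. The sketch below is one simple way to do that from a script; `port_is_free` is a hypothetical helper, and this is not necessarily how the app's own port availability check works.

```python
import socket


def port_is_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can be bound to host:port right now.

    Hypothetical helper. The result is only a snapshot: another process
    could still take the port between this check and the server start.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permission error.
            return False
    return True
```

For example, `port_is_free("127.0.0.1", 8080)` returning False suggests another process already holds the default port.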

OpenAI-Compatible Client Example

JSON config:

{
  "api_key": "not-required-local",
  "base_url": "http://127.0.0.1:8080/v1",
  "model": "unsloth/Qwen3.6-35B-A3B-UD-MLX-4bit"
}

List models:

curl http://127.0.0.1:8080/v1/models

Minimal chat-completions request for an external client:

curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer not-required-local" \
  -d '{
    "model": "unsloth/Qwen3.6-35B-A3B-UD-MLX-4bit",
    "messages": [
      {"role": "user", "content": "Hello"}
    ],
    "max_tokens": 128,
    "chat_template_kwargs": {
      "enable_thinking": false
    }
  }'

Qwen thinking behavior is controlled by the client request and the model's chat template. MLX Server Manager only copies helper text; it does not run this request.
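
The same request can be sketched in Python using only the standard library. The helper names `build_chat_request` and `send_chat` are hypothetical, and the response parsing assumes the usual OpenAI chat-completions shape (`choices[0].message.content`); like the curl text, this runs in the client, not in MLX Server Manager.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request equivalent to the curl example (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
        # Passed through to the chat template; whether it has an effect depends on the model.
        "chat_template_kwargs": {"enable_thinking": False},
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer not-required-local",
        },
        method="POST",
    )


def send_chat(request: urllib.request.Request, timeout: float = 120.0) -> str:
    """Send the request and return the assistant text (assumes OpenAI response shape)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]
```

With a server running, `send_chat(build_chat_request("http://127.0.0.1:8080/v1", "unsloth/Qwen3.6-35B-A3B-UD-MLX-4bit", "Hello"))` would return the reply text.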

Known Limitations

  • The documented release asset is an unsigned local-use .app zip.
  • The app is not notarized and is not signed with Developer ID.
  • macOS Gatekeeper may warn when opening the app.
  • Browser-downloaded unsigned builds may show "MLXServerManager is damaged and can't be opened"; this can be Gatekeeper quarantine, not necessarily a broken zip or app. Verify the Release asset and checksum before removing quarantine.
  • The app does not bundle mlx-lm.
  • The app does not bundle models.
  • You must provide model files, download them explicitly, or use an existing Hugging Face cache.
  • Hugging Face downloads are explicit user actions and depend on local tooling such as the Hugging Face CLI.
  • The app does not optimize inference.
  • The app does not alter the MLX performance path.
  • Ready Check uses /v1/models only.
  • The app does not test chat completions.
  • Stop and Restart affect only the process started and held by this app.
  • External mlx_lm.server processes are not stopped.
  • There is no automatic updater, DMG, installer, or CI/CD release pipeline.

See docs/known_limitations.md for the full list.

If macOS blocks the unsigned app after download, see docs/distribution.md before running it.
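
One way to verify a downloaded zip against a published checksum before touching quarantine is sketched below. It assumes the checksum is a SHA-256 hex digest, which this README does not state; check docs/distribution.md or the Release notes for the algorithm actually used. The helper names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published value, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

If `matches_checksum` returns False for the downloaded asset, treat the file as suspect rather than removing quarantine.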

Configuration and Repository Hygiene

The app stores runtime configuration under the user's Application Support directory:

  • settings.json
  • models.json

These files are local runtime state and should not be committed. Model directories, model artifacts, logs, virtual environments, .env, HF_TOKEN, .app, .zip, .dSYM, and build artifacts must also stay out of Git.
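
As an illustration of this rule, the sketch below flags candidate paths whose components match names that should stay out of Git. The pattern list is an assumption derived from the paragraph above (for example, `*.log` stands in for "logs"), not a file shipped with the project, and `blocked_paths` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Assumed patterns, derived from the hygiene rule above; adjust to the real repository.
BLOCKED_PATTERNS = [
    "settings.json", "models.json", ".env",
    "*.app", "*.zip", "*.dSYM", "*.log",
]


def blocked_paths(paths: list[str]) -> list[str]:
    """Return the paths that have any component matching a blocked pattern."""
    flagged = []
    for path in paths:
        parts = PurePosixPath(path).parts
        if any(fnmatchcase(part, pat) for part in parts for pat in BLOCKED_PATTERNS):
            flagged.append(path)
    return flagged
```

Feeding it a list of staged paths before a commit gives a quick review list; it is a convenience check, not a substitute for a .gitignore or human review.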

Do not hardcode user-specific absolute paths in source code or committed documentation.

AI-Assisted Maintenance

This project is maintained with human-reviewed AI assistance for planning, documentation, implementation, and release preparation. AI-generated changes should remain small, reviewable, and consistent with the Direct Mode product boundary.

All changes should be reviewed for:

  • No secrets.
  • No local personal paths.
  • No model files or runtime settings.
  • No app bundles or build artifacts.
  • No expansion into Chat UI, inference proxy behavior, or multi-backend wrapper behavior.
