browser-cli is the agent-facing command-line interface and Codex Skill starter
for operating Lexmount remote browser sessions from local shells, Codex, and
other agents.
It owns command parsing, JSON output, installation docs, and agent-facing
ergonomics. Browser lifecycle and page action behavior stay in
lex-browser-runtime, so this project does not maintain a second copy of the
runtime implementation.
If you need a usable version now, start with
docs/quickstart.md. It covers default-branch install,
local credential setup, doctor verification, the first browser task,
persistent context reuse, and the agent command discovery flow.
Use docs/usable-version.md for the current mainline
trial status, readiness checks, and browser.lexmount.cn integration boundaries.
Use browser-cli when Codex or another agent needs a Lexmount remote browser
instead of a local tab, with machine-readable JSON, deterministic commands,
structured repair hints, explicit cleanup, and safe secret handling.
| Use browser-cli when | Use something else when |
|---|---|
| The task needs an isolated Lexmount remote browser session. | The task is about a local desktop app or an already-open local browser tab. |
| The agent must log in, reuse cookies/storage, fill forms, inspect page state, capture evidence, or diagnose page failures. | The task only needs static documentation lookup or HTTP fetch/search without browser state. |
| The work should be repeatable through JSON/YAML case files or auditable JSON output. | The user only needs advice and no live browser operation. |
| Credentials, API connectivity, session cleanup, or secret handling need structured checks. | Lexmount credentials are unavailable and the user does not want to configure them. |
Start here when the user asks an agent to:
- Create, list, inspect, keep alive, and close remote sessions.
- Reuse persistent login state through contexts and metadata filters.
- Open pages, wait for readiness, navigate history, and capture screenshots or page snapshots.
- Inspect page structure before acting: text, links, forms, tables, lists, dialogs, frames, accessibility trees, interactive elements, console logs, and network activity.
- Click, type, fill, select, check, hover, press keys, and scroll, targeting elements by selector, role, label, text, or index.
- Read and mutate browser state through storage and cookies.
- Run repeatable JSON/YAML browser case files instead of ad hoc scripts.
- Diagnose setup, credential, API, selector, page, console, and network failures with structured repair hints.
Do not use this Skill for a local desktop app, an already-open local browser tab, or a task that only needs static documentation. If Lexmount credentials are missing or unclear, stay in setup/auth/doctor commands before creating sessions.
Use the setup and auth commands first when credentials are missing or unclear:
browser-cli reference get --id usable_status, browser-cli auth status,
browser-cli auth login, browser-cli auth export-env, and
browser-cli doctor --json. Use browser-cli action guide --task <task> before
custom JavaScript. Write custom Playwright only when the action guide and
catalog cannot express the browser task.
| Task | Start With | Use When |
|---|---|---|
| Install and readiness | browser-cli doctor --json | Confirm the CLI, Python runtime, environment variables, packaged references, examples, and API connectivity are usable. |
| First browser task | browser-cli commands --workflow first_browser_task | Verify readiness, open a page, inspect targets, act once, collect evidence, then close the temporary session. |
| Agent primitives | browser-cli commands --workflow agent_browser_primitives | Cover the observe, act, extract, and verify loop: use action observe before targets, action act for deterministic click/fill/select/check/press/hover/scroll plans, and action extract for bounded page content. |
| Persistent login | browser-cli commands --workflow persistent_login_state | Reuse cookies/storage across runs, avoid mutating busy contexts, and understand available/locked/unavailable. |
| Navigation and readiness | browser-cli commands --workflow navigation_flow | Open URLs, reload, move through history, and wait for URL, title, load state, or network idle before acting. |
| Forms | browser-cli action guide --task form_interaction | Fill labeled fields, select options, check boxes, submit, and verify values without selector guessing. |
| Interactive targets | browser-cli action guide --task interactive_targeting | Choose between role, label, text, selector, and index targeting for buttons, links, menus, and repeated controls. |
| Content extraction | browser-cli action extract --session-id <session_id> --surface text --surface links --selector main | Extract bounded text, links, tables, lists, outline, and accessibility surfaces before custom JS. |
| Visual evidence | browser-cli action guide --task visual_capture | Set a viewport, capture a full page, selector, or role screenshot, and gather bounded text as supporting evidence. |
| Dialogs, frames, uploads | browser-cli action guide --task dialog_frame_handling | Detect modals, prompts, embedded frames, cookie banners, and file upload controls before custom workarounds. |
| State and credentials | browser-cli action guide --task browser_state_management | Inspect or update local/session storage and cookies for setup, cleanup, or assertions. |
| Diagnostics | browser-cli commands --workflow page_diagnostics | Capture console and network evidence around page failures, fetch/XHR issues, and runtime errors. |
| Repeatable cases | browser-cli commands --workflow case_file_task | Validate, scaffold, and run browser tasks as JSON/YAML artifacts with cleanup and generated evidence. |
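
The observe, act, and extract steps in the table can be driven from a script. The following is a minimal Python sketch, not part of browser-cli: the helper names are hypothetical, and the functions only assemble argument lists from flags that already appear in this README (`--session-id`, `--surface`, `--kind`, `--role`, `--name`, `--selector`).

```python
from typing import List, Optional


def observe_argv(session_id: str, surfaces: List[str]) -> List[str]:
    """Build arguments for `browser-cli action observe` (hypothetical helper)."""
    argv = ["browser-cli", "action", "observe", "--session-id", session_id]
    for surface in surfaces:
        # --surface is repeated once per surface, as in the README examples.
        argv += ["--surface", surface]
    return argv


def click_argv(session_id: str, role: str, name: str) -> List[str]:
    """Build arguments for a role/name targeted click via `action act`."""
    return [
        "browser-cli", "action", "act",
        "--session-id", session_id,
        "--kind", "click",
        "--role", role,
        "--name", name,
    ]


def extract_argv(session_id: str, surfaces: List[str],
                 selector: Optional[str] = None) -> List[str]:
    """Build arguments for `browser-cli action extract`, optionally scoped."""
    argv = ["browser-cli", "action", "extract", "--session-id", session_id]
    for surface in surfaces:
        argv += ["--surface", surface]
    if selector is not None:
        argv += ["--selector", selector]
    return argv
```

Passing the lists to `subprocess.run` without a shell avoids quoting problems in button names and selectors.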
For the current comparison against another cloud-browser agent interface, see docs/skill-positioning.md.
Copy this prompt into Codex when you want Codex to install and configure the CLI for you:
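Please help me install and configure Lexmount browser-cli so that I can operate Lexmount remote browsers from Codex.
Constraints:
1. Do not ask me to paste my API Key or Project ID into the chat.
2. Only guide me to set environment variables in my local shell, or to write them into my local shell configuration file.
3. Do not output the API Key in chat replies, logs, the README, tests, commit history, or PR descriptions.
4. If command output contains the masked/revealed/contains_secrets/usable fields, use those fields to decide whether the output can be copied into the chat; never copy a revealed secret into the chat.
5. Do not repeat the real value of any secret.
Steps:
1. Check whether uv is already installed locally: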
uv --version
2. If uv is missing, prompt me to install it first:
curl -LsSf https://astral.sh/uv/install.sh | sh
3. Install or upgrade browser-cli:
uv tool install --force git+https://github.com/lexmount/browser-cli.git
4. Verify that the CLI version output is JSON:
browser-cli --version
browser-cli version
5. First read the agent-facing workflow contract provided by the installed version, then follow workflow.steps in the JSON; do not parse --help text:
browser-cli skill status
If status is not current, check stale_files/missing_files first; once you have confirmed that the local Codex Skill should be refreshed, run:
browser-cli skill install --force
browser-cli commands --workflows-only
browser-cli commands --workflow setup_and_verify
browser-cli commands --workflow connect_from_codex_site_requirements
browser-cli commands --workflow connect_from_codex_auth
browser-cli commands --workflow device_code_auth
browser-cli commands --workflow scoped_token_lifecycle
browser-cli commands --workflow agent_browser_primitives
6. Read the action guide and the packaged agent reference catalog; before choosing a browser action, read the machine-readable guide and action_playbook first instead of writing custom Playwright/JS:
browser-cli action guide --names-only
browser-cli action observe --session-id <session_id> --surface interactive --surface text
browser-cli action act --session-id <session_id> --kind click --role button --name "<name>"
browser-cli action act --session-id <session_id> --kind fill --label "<label>" --value "<value>"
browser-cli action extract --session-id <session_id> --surface text --surface links --selector main
browser-cli action guide --task interactive_targeting
browser-cli action guide --task content_extraction
browser-cli action guide --task browser_state_management
browser-cli action guide --task file_upload
browser-cli action guide --task dialog_frame_handling
browser-cli action guide --task navigation_flow
browser-cli action guide --task link_navigation
browser-cli action guide --task visual_capture
browser-cli action guide --task semantic_waits
browser-cli action guide --task menu_keyboard_flow
browser-cli action guide --task mouse_interaction
browser-cli action guide --task state_waits
browser-cli reference list
browser-cli reference get --id quickstart --metadata-only
browser-cli reference get --id quickstart
browser-cli reference get --id connect_from_codex --metadata-only
browser-cli reference get --id connect_from_codex
browser-cli reference get --id skill_positioning --metadata-only
browser-cli reference get --id skill_positioning
browser-cli reference get --id usable_status --metadata-only
browser-cli reference get --id usable_status
browser-cli reference get --id action_playbook --metadata-only
browser-cli reference get --id action_playbook
7. Read the packaged examples; for repeatable tasks or case files, use these examples as the first reference:
browser-cli example list
browser-cli example get --id agent_playbook --metadata-only
browser-cli example get --id setup_verification_playbook --metadata-only
browser-cli example get --id auth_lifecycle_playbook --metadata-only
browser-cli example get --id persistent_context_playbook --metadata-only
browser-cli example get --id page_inspection_case
browser-cli example get --id agent_primitives_case
browser-cli example get --id form_fill_case
browser-cli example get --id content_extraction_case
browser-cli example get --id browser_state_case
browser-cli example get --id navigation_flow_case
browser-cli example get --id file_upload_case
browser-cli example get --id checkout_flow_case
browser-cli example get --id interactive_targeting_case
browser-cli example get --id page_diagnostics_case
browser-cli case schema
browser-cli case schema --action observe
browser-cli case schema --action act
browser-cli case schema --action extract
browser-cli case scaffold --template page-inspection --url https://example.com --output case.yaml
browser-cli case scaffold --template agent-primitives --output agent-primitives-case.yaml
browser-cli case scaffold --template form-fill --output form-case.yaml
browser-cli case scaffold --template content-extraction --output content-extraction-case.yaml
browser-cli case scaffold --template browser-state --output browser-state-case.yaml
browser-cli case scaffold --template navigation-flow --output navigation-case.yaml
browser-cli case scaffold --template file-upload --output upload-case.yaml
browser-cli case scaffold --template checkout-flow --output checkout-case.yaml
browser-cli case scaffold --template interactive-targeting --output interactive-case.yaml
browser-cli case scaffold --template page-diagnostics --output diagnostics-case.yaml
8. Run the following command to check whether credentials are already configured locally:
browser-cli auth status
9. If you need to confirm what the browser.lexmount.cn Connect from Codex page/API still lacks, read the permission catalog and site contract first:
browser-cli reference get --id connect_from_codex
browser-cli auth scopes --include-site-contract
browser-cli auth connect-requirements
browser-cli auth connect-requirements --checklist
Read browser_site_contract.required_runtime_auth, required_runtime_auth, required_token_lifecycle, and setup_blocks; if the scoped token/device-code runtime auth still lacks SDK/API/gateway support, do not treat it as a usable login method.
10. If credentials are not configured, guide me to run:
browser-cli auth login
11. If I want to open the local browser directly, have me run:
browser-cli auth login --open
If I explicitly ask for device-code/OAuth authorization, run this first:
browser-cli auth login --device-code
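Read available, reason, device_code, polling, credentials, connect_from_codex.required_runtime_auth, and fallback_handoff; when available=false, use the manual env fallback. Only after the endpoint is configured, the runtime auth blockers are resolved, and the authorization instructions have been displayed, have me run: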
browser-cli auth login --device-code --wait
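12. Read connect_from_codex.url or handoff.login_url from the auth login JSON; prefer guiding me to open https://browser.lexmount.cn/connect/codex and sign in.
13. Guide me to select the correct project in the browser.lexmount.cn console, confirm the current Project ID, and create or copy an agent-facing scoped API Key.
14. Guide me to run the following command to generate a local shell export template, and fill in the real values only in my local terminal: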
browser-cli auth export-env
export LEXMOUNT_API_KEY="<API Key from browser.lexmount.cn>"
export LEXMOUNT_PROJECT_ID="<Project ID from browser.lexmount.cn>"
15. Only when directly executable export lines are needed in a trusted local shell, have me run these myself:
browser-cli auth export-env --from-current --reveal-secrets
browser-cli auth export-env --reveal-secrets
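Also remind me not to paste that output into the chat.
16. Tell me that the China region uses https://api.lexmount.cn by default, so LEXMOUNT_BASE_URL usually does not need to be set.
17. If I want to keep the configuration long term, guide me to write these exports into my current shell configuration file, such as ~/.zshrc or ~/.bashrc.
18. Run the following commands to verify: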
browser-cli --help
browser-cli doctor --json
browser-cli doctor --smoke-session
browser-cli session list
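The doctor success criteria are ok=true, failed=0, and ready_for_browser_actions=true; if smoke-session was run, browser_smoke_session.status should be pass, with created=true and closed=true.
19. Before starting a browser task, read the more specific workflow contract for the task type; when choosing a specific action, check the action guide, action_playbook, packaged examples, and commands catalog first, and write custom Playwright/JS only when the CLI cannot express the task: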
browser-cli commands --workflow session_recovery
browser-cli commands --workflow first_browser_task
browser-cli commands --workflow agent_browser_primitives
browser-cli commands --workflow one_off_page_task
browser-cli commands --workflow case_file_task
browser-cli commands --workflow persistent_login_state
browser-cli commands --workflow form_interaction
browser-cli commands --workflow interactive_targeting
browser-cli commands --workflow content_extraction
browser-cli commands --workflow browser_state_management
browser-cli commands --workflow file_upload
browser-cli commands --workflow dialog_frame_handling
browser-cli commands --workflow navigation_flow
browser-cli commands --workflow link_navigation
browser-cli commands --workflow visual_capture
browser-cli commands --workflow semantic_waits
browser-cli commands --workflow menu_keyboard_flow
browser-cli commands --workflow mouse_interaction
browser-cli commands --workflow state_waits
browser-cli commands --workflow page_diagnostics
browser-cli action observe --session-id <session_id> --surface interactive --surface text
browser-cli action act --session-id <session_id> --kind click --role button --name "<name>"
browser-cli action extract --session-id <session_id> --surface text --surface links --selector main
browser-cli action guide --task form_interaction
browser-cli action guide --task interactive_targeting
browser-cli action guide --task content_extraction
browser-cli action guide --task browser_state_management
browser-cli action guide --task file_upload
browser-cli action guide --task dialog_frame_handling
browser-cli action guide --task navigation_flow
browser-cli action guide --task link_navigation
browser-cli action guide --task visual_capture
browser-cli action guide --task semantic_waits
browser-cli action guide --task menu_keyboard_flow
browser-cli action guide --task state_waits
browser-cli action guide --task page_diagnostics
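20. If verification fails, troubleshoot in this order:
- Whether uv is available
- Whether browser-cli is on PATH
- Whether browser-cli auth status shows configured as true
- Which item in the browser-cli doctor checks is fail or warn
- Whether browser_smoke_session in browser-cli doctor --smoke-session failed to create or close; if created=true and closed=false, close the temporary session manually by following fix.commands
- Whether LEXMOUNT_API_KEY is set
- Whether LEXMOUNT_PROJECT_ID is set
- If LEXMOUNT_BASE_URL is set, whether it is the correct API endpoint
- Whether the Project selected in browser.lexmount.cn matches LEXMOUNT_PROJECT_ID
- Whether the API Key has expired, been revoked, or lacks browser session/context/action permissions
When finished, tell me:
- The browser-cli install path
- Whether the verification commands passed
- What I still need to do manually
- Do not repeat the real value of any secret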
uv tool install git+https://github.com/lexmount/browser-cli.git
browser-cli --help
browser-cli --version
browser-cli commands --names-only
browser-cli commands --workflows-only
browser-cli reference list
browser-cli example list

For local development:
uv sync --all-groups
uv run browser-cli --help
uv run pytest tests -q
uv run ruff format --check .
uv run ruff check .

browser-cli reads the same environment variables as lex-browser-runtime:
export LEXMOUNT_API_KEY="<api-key>"
export LEXMOUNT_PROJECT_ID="<project-id>"

Optional:
export LEXMOUNT_BASE_URL="https://api.lexmount.cn"
export LEXMOUNT_REGION="<region>"

Treat API keys and direct browser URLs as secrets. The CLI masks api_key in
generated direct browser URLs, and doctor hides full direct URLs by default
so internal hosts are not exposed unless you pass an explicit reveal flag.
Use these local auth helpers:
browser-cli auth status
browser-cli auth status --credentials-file ~/.config/lexmount/browser-cli/credentials.json
browser-cli auth scopes
browser-cli auth scopes --scope browser:actions --include-site-contract
browser-cli auth token-info --required-scope browser:actions
browser-cli auth refresh --credentials-file ~/.config/lexmount/browser-cli/credentials.json
browser-cli auth logout --credentials-file ~/.config/lexmount/browser-cli/credentials.json
browser-cli auth clear-credentials --credentials-file ~/.config/lexmount/browser-cli/credentials.json
browser-cli auth login
browser-cli auth login --open
browser-cli auth login --device-code
browser-cli auth login --project-id <project-id> --scope browser:actions --expires-in 24h
browser-cli auth export-env
browser-cli auth export-env --from-current

auth export-env prints placeholder shell commands by default. With
--from-current, it reuses current environment values but still masks
LEXMOUNT_API_KEY unless --reveal-secrets is explicitly passed in a trusted
local terminal. Read top-level usable and unusable_exports before treating
the returned commands as directly runnable. Also read
safe_to_paste_in_chat, local_shell_only, contains_secret_values,
contains_secret_placeholders, safety, setup_block, and verification so
agents know whether the output belongs only in a local shell and which
auth status/doctor commands prove the setup worked.
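
An agent can turn the safety fields above into a conservative gate before echoing anything. This is a minimal Python sketch under stated assumptions: the field names come from the paragraph above, but their exact types are not documented here, so the code assumes booleans for the flags and a list for `unusable_exports`. The function names are hypothetical.

```python
from typing import Any, Dict


def may_paste_in_chat(export_env: Dict[str, Any]) -> bool:
    """Allow pasting only when every safety signal agrees (assumed booleans)."""
    return (
        export_env.get("safe_to_paste_in_chat") is True
        and export_env.get("contains_secret_values") is not True
        and export_env.get("local_shell_only") is not True
    )


def is_directly_runnable(export_env: Dict[str, Any]) -> bool:
    """Treat exports as runnable only if usable and nothing is flagged unusable."""
    return export_env.get("usable") is True and not export_env.get("unusable_exports")
```

Missing fields default to the restrictive answer, so an older or truncated payload never unlocks a paste.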
auth status and auth token-info report local device-token metadata from
~/.config/lexmount/browser-cli/credentials.json,
LEXMOUNT_BROWSER_CREDENTIALS_FILE, or --credentials-file without printing
access or refresh token values. Until bearer-token runtime support lands,
runtime_auth_usable is true only when env API-key credentials are configured.
Read runtime_auth.usable, runtime_auth.source,
runtime_auth.fallback_missing_env, and
runtime_auth.bearer_runtime.required_support before deciding whether browser
actions can use the current credential source. Device tokens remain local
metadata until the SDK, API, and browser gateway all accept bearer tokens.
When env credentials are incomplete, auth status also reports missing_env
and a fix object that first points agents at
browser-cli reference get --id usable_status, then safe
browser-cli auth login / Connect from Codex setup commands.
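
The runtime_auth fields can be reduced to a single go/no-go decision before any session is created. The sketch below is illustrative only: it assumes `runtime_auth` is a JSON object with a boolean `usable`, a string `source`, and a list `fallback_missing_env`, which this README names but does not fully specify. `runtime_auth_gate` is a hypothetical helper.

```python
from typing import Any, Dict


def runtime_auth_gate(status: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize whether browser actions may proceed with current credentials."""
    runtime_auth = status.get("runtime_auth") or {}
    usable = runtime_auth.get("usable") is True
    return {
        "proceed": usable,
        "source": runtime_auth.get("source"),
        # Only report missing variables when auth is not usable.
        "missing_env": [] if usable else list(runtime_auth.get("fallback_missing_env") or []),
    }
```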
Persistent context metadata created by this CLI is also cached locally at
~/.config/lexmount/browser-cli/context-registry.json; set
LEXMOUNT_BROWSER_CONTEXT_REGISTRY_FILE to override that path for tests or
isolated workspaces. Use metadata for labels such as purpose; do not put API
keys, passwords, or session secrets in context metadata.
browser-cli doctor --json includes a context_registry check with the path,
path source, creatability/writability, context counts, scope-matched counts, and
redacted metadata diagnostics so agents can repair local persistent-login reuse
before creating sessions.
Use auth token-info --required-scope <scope> to check scoped-token coverage.
Use auth refresh --credentials-file <path> to inspect whether local
device-token metadata needs refresh. Without a token lifecycle endpoint it
reports refresh_available: false and refreshed: false; with
--token-base-url <url>, LEXMOUNT_BROWSER_TOKEN_BASE_URL, or
LEXMOUNT_BROWSER_DEVICE_CODE_BASE_URL, it calls
POST /api/auth/token/refresh, saves refreshed local metadata on success, and
never prints access or refresh token values. The refresh request includes
grant_type=refresh_token, credential_kind, project_id, token_id, and
requested_scopes; the response may return a token payload at the top level or
under token, device_token, credential, or credentials. camelCase fields
such as accessToken, refreshToken, expiresIn, projectId, and tokenId
are normalized before saving. remote_refresh reports only safe response
metadata such as response_payload_source and response_summary.
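
The payload lookup and camelCase normalization described above can be pictured with a short Python sketch. It is not the CLI's implementation: the wrapper lookup order, the regex-based conversion, and the snake_case target names are assumptions made for illustration.

```python
import re
from typing import Any, Dict

# Wrapper keys named in this README; the lookup order here is an assumption.
WRAPPER_KEYS = ("token", "device_token", "credential", "credentials")


def to_snake(name: str) -> str:
    """Convert camelCase to snake_case; snake_case input is left unchanged."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def extract_token_payload(response: Dict[str, Any]) -> Dict[str, Any]:
    """Return the token payload with normalized keys, nested or top level."""
    for key in WRAPPER_KEYS:
        nested = response.get(key)
        if isinstance(nested, dict):
            return {to_snake(k): v for k, v in nested.items()}
    return {to_snake(k): v for k, v in response.items()}
```

A caller would persist the normalized payload locally and log only key names, never the token values.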
Use auth clear-credentials --credentials-file <path> to remove the local
browser-cli credentials file, whether it contains API-key credentials or
device-token metadata. It cannot mutate the parent shell, so it reports
env_unchanged: true plus unset_env_commands when LEXMOUNT_API_KEY,
LEXMOUNT_PROJECT_ID, or LEXMOUNT_BASE_URL are still present. Use
auth logout --credentials-file <path> when you specifically want the
device-token lifecycle flow; auth logout --revoke calls
POST /api/auth/token/revoke when a token lifecycle base URL is configured.
The revoke response may omit revoked, but explicit revoked:false is treated
as not confirmed. Without a token lifecycle base URL it reports
revoke_available: false and reminds you to revoke from browser.lexmount.cn.
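
The revoke and cleanup rules above are easy to get backwards, so here is a small Python sketch of one reading of them. It assumes the revoke call itself succeeded, that `revoked` is a boolean when present, and that `unset_env_commands` is a list of strings. Both helper names are hypothetical.

```python
from typing import Any, Dict, List


def revoke_confirmed(revoke_response: Dict[str, Any]) -> bool:
    """An omitted `revoked` counts as confirmed; an explicit False does not."""
    return revoke_response.get("revoked", True) is not False


def pending_unset_commands(clear_result: Dict[str, Any]) -> List[str]:
    """Commands the user still has to run, since the CLI cannot edit the parent shell."""
    if clear_result.get("env_unchanged") is True:
        return list(clear_result.get("unset_env_commands") or [])
    return []
```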
After credentials are configured, run the self-check:
browser-cli doctor
browser-cli doctor --json
browser-cli doctor --smoke-session

browser-cli output is always JSON; --json is accepted as an agent
compatibility no-op at the top level and after subcommands. Use
browser-cli --version or browser-cli version to read the installed
browser-cli version, lex-browser-runtime version, Python version, and executable
path as JSON. Use browser-cli doctor --skip-api only when the live API should
not be called.
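
Because output is always JSON, a wrapper can parse stdout directly instead of scraping text. This is a minimal Python sketch with a hypothetical `run_json` helper; it assumes the JSON document is the only content on stdout and deliberately ignores the exit code, because exit-code semantics are not described in this README.

```python
import json
import subprocess
from typing import Any, Dict, List


def run_json(argv: List[str]) -> Dict[str, Any]:
    """Run a command without a shell and parse its stdout as a JSON object."""
    completed = subprocess.run(argv, capture_output=True, text=True, check=False)
    return json.loads(completed.stdout)
```

For example, `run_json(["browser-cli", "version"])` would return the version payload as a dictionary on a machine where the CLI is installed.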
Use browser-cli doctor --smoke-session when you need stronger proof that the
credentials can create and close a temporary browser session, not just reach the
API.
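
The Codex prompt earlier in this README lists the doctor success criteria: ok=true, failed=0, ready_for_browser_actions=true, and for a smoke session, status pass with created=true and closed=true. The following Python sketch encodes those checks. The helper names are hypothetical, and the payload shape beyond those fields is an assumption.

```python
from typing import Any, Dict


def doctor_ready(report: Dict[str, Any]) -> bool:
    """True only when all three top-level readiness fields match."""
    return (
        report.get("ok") is True
        and report.get("failed") == 0
        and report.get("ready_for_browser_actions") is True
    )


def smoke_session_passed(report: Dict[str, Any]) -> bool:
    """True when the temporary session was created, closed, and marked pass."""
    smoke = report.get("browser_smoke_session") or {}
    return (
        smoke.get("status") == "pass"
        and smoke.get("created") is True
        and smoke.get("closed") is True
    )
```

A report with created=true and closed=false fails the second check, which is the case where the temporary session needs manual cleanup.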
If LEXMOUNT_BASE_URL points at an internal Kubernetes host such as *.svc or
*.cluster.local, doctor treats it as invalid, redacts the value, and asks you
to unset it or rerun browser-cli auth login --open.
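
To pre-check a base URL before calling doctor, an agent could apply a similar hostname test locally. The sketch below is an approximation, not doctor's actual rule: it assumes that matching the `.svc` and `.cluster.local` suffixes on the parsed hostname is sufficient. `looks_internal` is a hypothetical name.

```python
from urllib.parse import urlsplit

# Suffixes taken from the paragraph above; the real check may cover more cases.
INTERNAL_SUFFIXES = (".svc", ".cluster.local")


def looks_internal(base_url: str) -> bool:
    """Return True when the URL's hostname ends with an internal cluster suffix."""
    host = (urlsplit(base_url).hostname or "").lower().rstrip(".")
    return host.endswith(INTERNAL_SUFFIXES)
```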
For machine-readable command discovery, run:
browser-cli commands
browser-cli commands --names-only
browser-cli commands --group action
browser-cli action guide --names-only
browser-cli action guide --task interactive_targeting
browser-cli action guide --task content_extraction
browser-cli action guide --task browser_state_management
browser-cli action guide --task file_upload
browser-cli action guide --task dialog_frame_handling
browser-cli action guide --task navigation_flow
browser-cli action guide --task link_navigation
browser-cli action guide --task visual_capture
browser-cli action guide --task semantic_waits
browser-cli action guide --task menu_keyboard_flow
browser-cli action guide --task mouse_interaction
browser-cli action guide --task state_waits
browser-cli commands --workflows-only
browser-cli commands --workflow setup_and_verify
browser-cli commands --workflow connect_from_codex_site_requirements
browser-cli commands --workflow connect_from_codex_auth
browser-cli commands --workflow device_code_auth
browser-cli commands --workflow scoped_token_lifecycle
browser-cli commands --workflow session_recovery
browser-cli commands --workflow first_browser_task
browser-cli commands --workflow one_off_page_task
browser-cli commands --workflow case_file_task
browser-cli commands --workflow persistent_login_state
browser-cli commands --workflow form_interaction
browser-cli commands --workflow interactive_targeting
browser-cli commands --workflow content_extraction
browser-cli commands --workflow browser_state_management
browser-cli commands --workflow file_upload
browser-cli commands --workflow dialog_frame_handling
browser-cli commands --workflow navigation_flow
browser-cli commands --workflow link_navigation
browser-cli commands --workflow visual_capture
browser-cli commands --workflow semantic_waits
browser-cli commands --workflow menu_keyboard_flow
browser-cli commands --workflow mouse_interaction
browser-cli commands --workflow state_waits
browser-cli commands --workflow page_diagnostics
browser-cli reference list
browser-cli reference get --id action_playbook --metadata-only
browser-cli example list
browser-cli example get --id auth_lifecycle_playbook --metadata-only
browser-cli example get --id persistent_context_playbook --metadata-only
browser-cli example get --id page_inspection_case --metadata-only
browser-cli example get --id agent_primitives_case --metadata-only
browser-cli example get --id content_extraction_case --metadata-only
browser-cli example get --id browser_state_case --metadata-only
browser-cli example get --id navigation_flow_case --metadata-only
browser-cli example get --id file_upload_case --metadata-only
browser-cli example get --id checkout_flow_case --metadata-only
browser-cli example get --id interactive_targeting_case --metadata-only
browser-cli example get --id page_diagnostics_case --metadata-only
browser-cli case schema
browser-cli case scaffold --template page-inspection --url https://example.com --output case.yaml
browser-cli case scaffold --template agent-primitives --output agent-primitives-case.yaml
browser-cli case scaffold --template content-extraction --output content-extraction-case.yaml
browser-cli case scaffold --template browser-state --output browser-state-case.yaml
browser-cli case scaffold --template navigation-flow --output navigation-case.yaml
browser-cli case scaffold --template file-upload --output upload-case.yaml
browser-cli case scaffold --template checkout-flow --output checkout-case.yaml
browser-cli case scaffold --template interactive-targeting --output interactive-case.yaml
browser-cli case scaffold --template page-diagnostics --output diagnostics-case.yaml

commands returns the current parser-backed command catalog, option metadata,
browser target requirements, JSON/secret policies, and agent entrypoint recipes.
Agents should use --workflows-only for compact setup/task flow discovery,
--workflow <id> for one concrete task path, and the command catalog when
deciding whether a first-class action exists before writing custom JavaScript.
Use browser-cli action guide --task <task> for a compact task-specific action
route with inspect_commands, preferred_commands, verify_commands, and the
custom_js_boundary.
Unknown groups return JSON with error=unknown_group, available_groups, and a
fix object so agents can repair typos instead of treating an empty command
list as capability absence. Unknown workflows similarly return
error=unknown_workflow, available_workflows, and a workflow-discovery fix.
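
An agent can use the unknown_group response to repair a typo on its own. This Python sketch assumes `available_groups` is a list of strings and uses the standard library's fuzzy matcher as a stand-in for whatever the CLI's own fix object recommends. `suggest_group` is a hypothetical helper.

```python
import difflib
from typing import Any, Dict, Optional


def suggest_group(response: Dict[str, Any], requested: str) -> Optional[str]:
    """Return the closest known group for a mistyped one, or None."""
    if response.get("error") != "unknown_group":
        return None
    candidates = list(response.get("available_groups") or [])
    matches = difflib.get_close_matches(requested, candidates, n=1)
    return matches[0] if matches else None
```

When the CLI's fix object is present, prefer it over this local guess.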
Authentication:
browser-cli auth status
browser-cli auth status --credentials-file ~/.config/lexmount/browser-cli/credentials.json
browser-cli auth scopes
browser-cli auth scopes --scope browser:actions --include-site-contract
browser-cli auth token-info --required-scope browser:sessions --required-scope browser:actions
browser-cli auth refresh --credentials-file ~/.config/lexmount/browser-cli/credentials.json
browser-cli auth logout --credentials-file ~/.config/lexmount/browser-cli/credentials.json
browser-cli auth clear-credentials --credentials-file ~/.config/lexmount/browser-cli/credentials.json
browser-cli auth connect-requirements
browser-cli auth login
browser-cli auth login --open
browser-cli auth login --device-code
browser-cli auth login --project-id <project-id> --scope browser:sessions --scope browser:actions --expires-in 24h
browser-cli auth export-env
browser-cli auth export-env --from-current --include-base-url

auth scopes returns the stable Connect from Codex scope catalog without
secrets. It reports known_scopes, default_scopes, scopes,
permission_count, risk, destructive, unknown_scopes, and the repeatable
scope query parameter. With --include-site-contract, it also returns
browser_site_contract.url, device_code_url, scope_ui_fields,
required_query_parameters, site_capability_status, and token lifecycle
requirements so browser.lexmount.cn can render the permission picker without
scraping auth login.
auth refresh reports local refresh state, including refresh_needed,
has_refresh_token, refresh_available, refreshed, reason,
refresh_endpoint, and remote_refresh. Without a configured token lifecycle
base URL it remains local/pending; with --token-base-url <url>,
LEXMOUNT_BROWSER_TOKEN_BASE_URL, or LEXMOUNT_BROWSER_DEVICE_CODE_BASE_URL,
it calls POST /api/auth/token/refresh, saves refreshed metadata on success,
and keeps token values out of JSON output. Agents should still use
runtime_auth_usable and next_steps before relying on bearer-token runtime
auth for browser actions.
auth status and doctor also include runtime_auth, whose
bearer_runtime.required_support lists the SDK/API/browser-gateway changes
needed before device tokens can replace env API-key credentials for browser
actions.
auth connect-requirements returns the browser.lexmount.cn /connect/codex
implementation contract without requiring credentials or opening a browser. It
includes connect_from_codex.url, connect_from_codex.device_code_url,
site_capabilities/site_capability_status, setup_blocks,
required_device_code_endpoints, required_api_contract,
required_token_lifecycle, required_runtime_auth,
browser_site_acceptance_tests, and verification commands for browser-cli auth status, browser-cli auth login, device-code fallback, and
browser-cli doctor --json.
Use browser-cli auth connect-requirements --checklist when coordinating
browser.lexmount.cn implementation work; it returns an
implementation_checklist grouped by Project ID display, scoped credential
creation, safe local env copy, doctor verification, credential lifecycle,
device-code/OAuth, and runtime bearer-token launch gates.
auth login --open is the preferred Connect from Codex flow. It starts a local
127.0.0.1:{random_port} callback server, opens
https://browser.lexmount.cn/connect/codex with redirect_uri, state,
code_challenge, and code_challenge_method=S256, validates the callback
state, and exchanges the one-time code over HTTPS at
POST /api/connect/codex/exchange. The callback URL never contains an API key.
On success, browser-cli saves local API-key credentials metadata to the
credentials file and auth status/doctor can use it without asking the user to
paste secrets into chat.
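
For readers unfamiliar with code_challenge_method=S256, the challenge is the unpadded base64url encoding of the SHA-256 hash of a random verifier, as defined by the PKCE specification (RFC 7636). The Python sketch below shows that derivation only. It is not browser-cli's code, and the function names are hypothetical.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Generate a random URL-safe verifier (43 characters from 32 random bytes)."""
    return secrets.token_urlsafe(32)


def code_challenge_s256(verifier: str) -> str:
    """Derive the S256 challenge: base64url(sha256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The verifier stays on the local machine and is only revealed during the one-time code exchange, which is why an intercepted callback alone is not enough to obtain credentials.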
auth login without --open returns top-level flow, selected_flow,
available, manual_env_available, and device_code_available, plus a
machine-readable handoff and connect_from_codex contract for agents that need
non-blocking guidance. connect_from_codex uses a single space-delimited
scope query parameter, requested expires_in, structured setup_blocks,
requested_scope_details, site_capabilities/site_capability_status,
browser_site_acceptance_tests, and the browser site requirements for the
loopback + PKCE authorization flow.
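
The single space-delimited scope parameter can be illustrated with a short Python sketch. The parameter names are the ones listed in this README. Combining them all in one URL, the `connect_url` helper, and the example values are assumptions for illustration.

```python
from typing import List
from urllib.parse import urlencode


def connect_url(base: str, redirect_uri: str, state: str,
                code_challenge: str, scopes: List[str], expires_in: str) -> str:
    """Build a Connect URL with one space-delimited `scope` parameter."""
    query = urlencode({
        "redirect_uri": redirect_uri,
        "state": state,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
        # All scopes travel in a single parameter, joined by spaces.
        "scope": " ".join(scopes),
        "expires_in": expires_in,
    })
    return f"{base}?{query}"
```

`urlencode` encodes the joining space as `+`, so the receiving site sees one scope value and splits it on whitespace.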
The Connect from Codex site workflow also reads
connect_from_codex.browser_site_acceptance_tests from manual and device-code
handoff responses so agents can use the same browser.lexmount.cn acceptance
checklist on either auth path.
setup_blocks groups install, Connect, local env, and verification commands
with secret placeholder and chat-safety metadata so browser.lexmount.cn can
render copy buttons without guessing which commands are local-shell-only.
requested_scope_details gives browser.lexmount.cn labels, descriptions,
permission names, risk levels, and destructive markers for known scopes, while
unknown future scopes are marked with known: false. Capability ids currently include
project_id_display, scoped_api_key, copy_install_and_env,
doctor_verification, scoped_key_lifecycle, and device_code_oauth.
Use --connect-base-url <origin> or LEXMOUNT_BROWSER_CONNECT_BASE_URL when
testing a non-production browser.lexmount.cn host. JSON output includes
open_result, loopback_callback, exchange, safe credentials metadata, and
reason so agents can continue or fall back when browser open, callback, or
exchange fails.
auth login --device-code returns the structured device-code contract. By
default, when no endpoint is configured, it reports available: false,
reason: "browser_site_endpoint_missing", required browser/API endpoints, and a
fallback_handoff for the manual env flow. With
--device-code-base-url <url> or LEXMOUNT_BROWSER_DEVICE_CODE_BASE_URL, it
POSTs /api/auth/device/code; add --wait to poll /api/auth/device/token and
write approved scoped-token metadata to the local credentials file without
printing access, refresh, or raw device-code values.
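The fallback decision described above can be sketched as a small helper. This is a minimal example that assumes the device-code payload has already been parsed into a dict; the function name and the shape of its return value are hypothetical, while available, reason, and fallback_handoff are the fields named above.

```python
def next_auth_step(response: dict) -> dict:
    """Pick the next auth path from an `auth login --device-code` payload."""
    if response.get("available"):
        return {"path": "device_code", "reason": None, "handoff": None}
    # available: false means no usable device-code endpoint, so fall back
    # to the manual env flow described by fallback_handoff.
    return {
        "path": "manual_env",
        "reason": response.get("reason"),
        "handoff": response.get("fallback_handoff"),
    }


# Illustrative payload; the fallback_handoff body is a placeholder.
unconfigured = {
    "available": False,
    "reason": "browser_site_endpoint_missing",
    "fallback_handoff": {"placeholder": True},
}
step = next_auth_step(unconfigured)
```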
Use browser-cli commands --workflow connect_from_codex_site_requirements when
an agent or browser.lexmount.cn implementer needs the machine-readable site
requirements, and browser-cli commands --workflow device_code_auth when an
agent needs the machine-readable device-code contract and fallback sequence.
Diagnostics:
browser-cli --version
browser-cli version
browser-cli commands
browser-cli commands --names-only
browser-cli commands --group action
browser-cli commands --workflows-only
browser-cli commands --workflow setup_and_verify
browser-cli commands --workflow connect_from_codex_site_requirements
browser-cli commands --workflow connect_from_codex_auth
browser-cli commands --workflow device_code_auth
browser-cli commands --workflow scoped_token_lifecycle
browser-cli commands --workflow session_recovery
browser-cli commands --workflow first_browser_task
browser-cli commands --workflow one_off_page_task
browser-cli commands --workflow case_file_task
browser-cli commands --workflow persistent_login_state
browser-cli commands --workflow form_interaction
browser-cli commands --workflow interactive_targeting
browser-cli commands --workflow content_extraction
browser-cli commands --workflow browser_state_management
browser-cli commands --workflow file_upload
browser-cli commands --workflow dialog_frame_handling
browser-cli commands --workflow navigation_flow
browser-cli commands --workflow link_navigation
browser-cli commands --workflow visual_capture
browser-cli commands --workflow semantic_waits
browser-cli commands --workflow menu_keyboard_flow
browser-cli commands --workflow mouse_interaction
browser-cli commands --workflow state_waits
browser-cli commands --workflow page_diagnostics
browser-cli doctor
browser-cli doctor --json
browser-cli doctor --smoke-session
browser-cli doctor --skip-api
browser-cli doctor --credentials-file ~/.config/lexmount/browser-cli/credentials.json

Session management:
browser-cli session create
browser-cli session create --create-context
browser-cli session create --context-id <context_id> --context-mode read_write
browser-cli session create --context-metadata-json '{"purpose":"codex-login"}' --context-selection newest --create-context-if-missing
browser-cli session list --status active
browser-cli session get --session-id <session_id>
browser-cli session close --session-id <session_id>
browser-cli session keepalive --session-id <session_id> --duration 60

Context management:
browser-cli context create
browser-cli context create --metadata-json '{"purpose":"codex"}'
browser-cli context list --metadata-json '{"purpose":"codex-login"}' --selection newest --include-reuse-state
browser-cli context get --context-id <context_id>
browser-cli context status --context-id <context_id>
browser-cli context pick --metadata-json '{"purpose":"codex-login"}'
browser-cli context pick --metadata-json '{"purpose":"codex-login"}' --selection newest --create-if-missing --dry-run
browser-cli context pick --metadata-json '{"purpose":"codex-login"}' --selection newest --create-if-missing
browser-cli context delete --context-id <context_id>

Browser actions:
browser-cli action guide --names-only
browser-cli action guide --task form_interaction
browser-cli action guide --task interactive_targeting
browser-cli action guide --task content_extraction
browser-cli action guide --task browser_state_management
browser-cli action guide --task file_upload
browser-cli action guide --task dialog_frame_handling
browser-cli action guide --task menu_keyboard_flow
browser-cli action guide --task mouse_interaction
browser-cli action guide --task state_waits
browser-cli action guide --task page_diagnostics
browser-cli action open-url --session-id <session_id> --url https://example.com
browser-cli action wait-selector --session-id <session_id> --selector "main"
browser-cli action click --session-id <session_id> --selector "button[type=submit]"
browser-cli action type --session-id <session_id> --selector "input[name=q]" --text "hello"
browser-cli action screenshot --session-id <session_id> --output /tmp/page.png
browser-cli action screenshot-selector --session-id <session_id> --selector "main" --output /tmp/main.png
browser-cli action screenshot-role --session-id <session_id> --role button --name "Submit" --output /tmp/submit.png
browser-cli action eval --session-id <session_id> --script "() => document.title"
browser-cli action snapshot --session-id <session_id> --max-chars 8000
browser-cli action page-info --session-id <session_id>
browser-cli action set-viewport --session-id <session_id> --width 1280 --height 720
browser-cli action reload --session-id <session_id>
browser-cli action go-back --session-id <session_id>
browser-cli action go-forward --session-id <session_id>
browser-cli action wait-url --session-id <session_id> --url /dashboard
browser-cli action wait-title --session-id <session_id> --title Dashboard --match contains
browser-cli action wait-load-state --session-id <session_id> --state complete
browser-cli action wait-network-idle --session-id <session_id> --idle-ms 500
browser-cli action get-text --session-id <session_id> --selector "main"
browser-cli action get-text-role --session-id <session_id> --role heading --name "Welcome"
browser-cli action exists --session-id <session_id> --selector "button[type=submit]"
browser-cli action exists-role --session-id <session_id> --role button --name "Submit"
browser-cli action count --session-id <session_id> --selector ".item"
browser-cli action wait-count --session-id <session_id> --selector ".item" --count 3 --comparison gte
browser-cli action wait-state --session-id <session_id> --selector "button[type=submit]" --state enabled
browser-cli action wait-state-role --session-id <session_id> --role button --name "Submit" --state enabled
browser-cli action query --session-id <session_id> --selector ".item" --max-nodes 20
browser-cli action get-attribute --session-id <session_id> --selector "a" --name href
browser-cli action get-attribute-role --session-id <session_id> --role button --name "Menu" --attribute aria-expanded
browser-cli action wait-attribute --session-id <session_id> --selector "button" --name aria-busy --state absent
browser-cli action wait-attribute-role --session-id <session_id> --role button --name "Menu" --attribute aria-expanded --value true --match exact
browser-cli action wait-text --session-id <session_id> --text "Ready" --selector "main"
browser-cli action wait-text --session-id <session_id> --text "Loading" --state absent
browser-cli action wait-role --session-id <session_id> --role button --name "Submit"
browser-cli action focus --session-id <session_id> --selector "input[name=q]"
browser-cli action focus-role --session-id <session_id> --role textbox --name "Search"
browser-cli action get-value --session-id <session_id> --selector "input[name=q]"
browser-cli action get-value-role --session-id <session_id> --role textbox --name "Search"
browser-cli action wait-value --session-id <session_id> --selector "input[name=q]" --value "hello"
browser-cli action wait-value-role --session-id <session_id> --role textbox --name "Search" --value "hello"
browser-cli action blur --session-id <session_id> --selector "input[name=q]"
browser-cli action blur-role --session-id <session_id> --role textbox --name "Search"
browser-cli action storage-get --session-id <session_id> --area local --key featureFlag
browser-cli action storage-set --session-id <session_id> --area local --key seenIntro --value true
browser-cli action storage-remove --session-id <session_id> --area session --key draft
browser-cli action storage-clear --session-id <session_id> --area session --prefix temp:
browser-cli action wait-storage --session-id <session_id> --area local --key authToken
browser-cli action cookie-get --session-id <session_id> --name consent
browser-cli action cookie-set --session-id <session_id> --name consent --value yes --path /
browser-cli action cookie-delete --session-id <session_id> --name consent --path /
browser-cli action cookie-clear --session-id <session_id> --prefix tmp: --path /
browser-cli action wait-cookie --session-id <session_id> --name consent --value yes
browser-cli action clear --session-id <session_id> --selector "input[name=q]"
browser-cli action clear-role --session-id <session_id> --role textbox --name "Search"
browser-cli action set-value --session-id <session_id> --selector "input[name=q]" --value "hello"
browser-cli action set-file-input --session-id <session_id> --selector "input[type=file]" --file ./avatar.png
browser-cli action dispatch-event --session-id <session_id> --selector "input[name=q]" --event input --event change
browser-cli action submit --session-id <session_id> --selector "form"
browser-cli action scroll --session-id <session_id> --y 600
browser-cli action scroll --session-id <session_id> --selector ".pane" --y 300
browser-cli action scroll-into-view --session-id <session_id> --selector "button[type=submit]"
browser-cli action scroll-into-view-role --session-id <session_id> --role button --name "Submit"
browser-cli action bounding-box --session-id <session_id> --selector "button[type=submit]"
browser-cli action bounding-box-role --session-id <session_id> --role button --name "Submit"
browser-cli action inspect --session-id <session_id> --selector "button[type=submit]"
browser-cli action select-option --session-id <session_id> --selector "select" --value pro
browser-cli action select-label --session-id <session_id> --label "Plan" --option-label "Pro"
browser-cli action select-role --session-id <session_id> --role combobox --name "Plan" --option-label "Pro"
browser-cli action check --session-id <session_id> --selector "input[type=checkbox]"
browser-cli action uncheck --session-id <session_id> --selector "input[type=checkbox]"
browser-cli action check-label --session-id <session_id> --label "Remember me"
browser-cli action uncheck-label --session-id <session_id> --label "Remember me"
browser-cli action check-role --session-id <session_id> --role checkbox --name "Remember me"
browser-cli action uncheck-role --session-id <session_id> --role checkbox --name "Remember me"
browser-cli action hover --session-id <session_id> --selector ".menu"
browser-cli action hover-role --session-id <session_id> --role button --name "Menu"
browser-cli action press --session-id <session_id> --selector "input[name=q]" --key Enter
browser-cli action press-role --session-id <session_id> --role textbox --name "Search" --key Enter
browser-cli action press-key --session-id <session_id> --key Escape
browser-cli action click-label --session-id <session_id> --label "Remember me"
browser-cli action click-text --session-id <session_id> --text "Submit"
browser-cli action click-role --session-id <session_id> --role button --name "Submit"
browser-cli action click-index --session-id <session_id> --selector ".item button" --index 2
browser-cli action drag-role-to-role --session-id <session_id> --source-role listitem --source-name "Todo" --target-role list --target-name "Done"
browser-cli action fill --session-id <session_id> --selector "input[name=email]" --text "me@example.com"
browser-cli action fill-label --session-id <session_id> --label "Email" --text "me@example.com"
browser-cli action fill-role --session-id <session_id> --role textbox --name "Email" --text "me@example.com"
browser-cli action link-snapshot --session-id <session_id> --selector "main" --max-nodes 50
browser-cli action table-snapshot --session-id <session_id> --selector ".report" --max-rows 50 --max-cells 20
browser-cli action list-snapshot --session-id <session_id> --selector ".results" --max-items 50
browser-cli action text-snapshot --session-id <session_id> --selector "main" --max-nodes 50 --max-chars 500
browser-cli action dialog-snapshot --session-id <session_id> --max-nodes 20 --max-controls 30
browser-cli action wait-dialog --session-id <session_id> --text "Confirm" --modal-only
browser-cli action frame-snapshot --session-id <session_id> --selector "main" --max-nodes 20 --max-chars 500
browser-cli action wait-frame --session-id <session_id> --url "/checkout" --readable-only
browser-cli action performance-snapshot --session-id <session_id> --max-resources 50 --min-duration-ms 0
browser-cli action network-snapshot --session-id <session_id> --max-entries 50
browser-cli action wait-network --session-id <session_id> --url /api/save --method POST --status 201
browser-cli action console-snapshot --session-id <session_id> --max-entries 50
browser-cli action wait-console --session-id <session_id> --source pageerror --level error --timeout-ms 5000
browser-cli action outline-snapshot --session-id <session_id> --selector "main" --max-nodes 50
browser-cli action form-snapshot --session-id <session_id> --selector "form" --max-nodes 50
browser-cli action accessibility-snapshot --session-id <session_id> --max-nodes 100
browser-cli action interactive-snapshot --session-id <session_id>
browser-cli action interactive-only-snapshot --session-id <session_id>

action guide returns machine-readable task routes for form_interaction,
interactive_targeting, content_extraction, browser_state_management,
file_upload, dialog_frame_handling, navigation_flow, link_navigation,
visual_capture, semantic_waits, menu_keyboard_flow, mouse_interaction, page_diagnostics, and state_waits, including
selection order, inspect/preferred/fallback/verify commands, read fields, and
the boundary for custom JavaScript.
page-info, set-viewport, screenshot-selector, screenshot-role, reload, go-back, go-forward, wait-url, wait-title,
wait-load-state, wait-network-idle, get-text, get-text-role, exists, exists-role, count, query,
get-attribute, get-attribute-role, wait-count, wait-state, wait-state-role, wait-attribute, wait-attribute-role, wait-text, wait-role, focus, focus-role,
get-value, get-value-role, wait-value, wait-value-role, blur, blur-role, storage-get, storage-set, storage-remove,
storage-clear, wait-storage, cookie-get, cookie-set, cookie-delete,
cookie-clear, wait-cookie, clear, clear-role, set-value, set-file-input,
dispatch-event, submit, scroll, scroll-into-view, scroll-into-view-role, bounding-box, bounding-box-role, inspect,
select-option, select-label, select-role, check, uncheck, check-label,
check-role, uncheck-label, uncheck-role, hover, hover-role, press, press-role, press-key, click-label, click-text, click-role,
double-click, double-click-role, drag-role-to-role, drag-to, right-click, right-click-role,
click-index, fill, fill-label, fill-role,
link-snapshot, table-snapshot, list-snapshot, text-snapshot, dialog-snapshot, wait-dialog, frame-snapshot, wait-frame, performance-snapshot, network-snapshot, wait-network, console-snapshot, wait-console, outline-snapshot, form-snapshot, accessibility-snapshot,
interactive-snapshot, and its interactive-only-snapshot alias are implemented as eval-backed DOM actions while the
runtime action surface catches up. They are intended to reduce agent-written
JavaScript for common page work. For missing matches, parse structured fields
such as found, exists, checkable, checked, selectable, selected, clicked, filled,
focused, value, readable, blurred, set, removed, clearable, cleared,
deleted, items, cleared_count, requested_count, state,
matched, role_found, state_values, attribute_found, requested_value, network_idle,
quiet_ms, submitted, dispatched, dispatched_events, fields,
value_masked, file_input, file_count, requested_files, bounding_box,
in_viewport, index, attributes, html_truncated, candidate_count,
candidates, writable, requested_option_label,
option_found, option_label, requested_checked, previous_checked,
changed, links, link_count, href, href_masked, absolute_url,
absolute_url_masked, same_origin, external, download,
tables, table_count, headers, rows, cells, row_count, cell_count,
lists, list_count, items, item_count, selected, checked, expanded,
texts, text_count, text_length, text_truncated, aria_live,
dialogs, dialog_count, total_dialog_count, requested_text, modal_only,
controls, control_count, controls_truncated, modal,
frames, frame_count, total_frame_count, src, src_masked,
frame_url, frame_url_masked, readable, readable_only,
same_origin_only, text_match, read_error,
navigation, resources, resource_count, initiator_type,
initiator_types, duration, transfer_size, response_status,
entries, entry_count, matched_count, buffered_count, source, level,
method, requested_method, status, ok, failed, failed_only,
request_has_body, duration_ms, text_masked, filename_masked,
url_masked, timed_out, requested_url, url_match, requested_source,
requested_status, requested_level, after_index,
headings, landmarks, outline_count, heading_count, landmark_count,
node_type, level,
total_candidate_count, ready_state, visibility_state,
viewport, scroll, body_text_length, html_length, language,
referrer, requested_title, case_sensitive, code, target,
target_info, modifiers, events, double_clicked, right_clicked,
context_menu, default_prevented, client_x, client_y,
keydown_accepted, or
navigation_requested from result.
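One way to act on these fields is to check the relevant flag under result before moving to the next step. The sketch below is illustrative: the helper name, its three return labels, and the sample payload are assumptions, and it relies only on the top-level ok field and a boolean flag such as found inside result.

```python
def match_status(payload: dict, field: str = "found") -> str:
    """Classify an action payload as error, unknown, missing, or matched."""
    if not payload.get("ok"):
        return "error"
    result = payload.get("result") or {}
    if field not in result:
        # This action does not report the requested flag.
        return "unknown"
    return "matched" if result[field] else "missing"


# Illustrative payload for a selector that matched nothing.
sample_payload = {
    "ok": True,
    "command": "action.get-text",
    "result": {"found": False},
}
status = match_status(sample_payload)
```

Returning a distinct "unknown" label avoids treating an action that never reports the flag as a missing match.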
For DOM/form actions, values from fields that look like password, token,
credential, secret, authorization, or API-key controls are masked by default.
When value, previous_value, requested_value, or text is ***, inspect
value_masked, previous_value_masked, requested_value_masked,
text_masked, and related *_length fields before deciding whether the page
state is correct.
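A masked value can still be verified indirectly. The sketch below assumes the length field for value is named value_length, which is a guess at one of the related *_length fields; the helper name and its return labels are hypothetical.

```python
MASK = "***"


def masked_value_check(result: dict, expected_length: int) -> str:
    """Compare a masked value against an expected length without the secret."""
    if result.get("value") != MASK or not result.get("value_masked"):
        # The value was not masked, so it can be compared directly.
        return "visible"
    # value_length is an assumed name for the related *_length field.
    length = result.get("value_length")
    if length is None:
        return "unknown"
    return "length_ok" if length == expected_length else "length_mismatch"


# Illustrative result for a password field after a fill action.
sample_result = {"value": "***", "value_masked": True, "value_length": 12}
outcome = masked_value_check(sample_result, expected_length=12)
```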
For link-snapshot, URL query parameters that look like API keys, access
tokens, authorization codes, passwords, or secrets are masked by default. Use
href_masked and absolute_url_masked before copying or reporting URLs.
table-snapshot, list-snapshot, dialog-snapshot, wait-dialog, frame-snapshot, wait-frame, and
performance-snapshot use the same URL masking for links, frame URLs, and
performance resource URLs found inside table cells, list items, dialog controls,
frame metadata, or timing entries.
network-snapshot and wait-network mask fetch/XHR URLs and do not capture
request or response bodies; use request_has_body only as a boolean hint.
console-snapshot and wait-console mask token-like key/value text in captured
console/page error entries and the reported page URL.
Each action must receive exactly one browser target:
--session-id <session_id>
--connect-url <cdp_websocket_url>
--direct-url

By default, action output masks api_key inside resolved direct connect URLs.
Use --reveal-connect-url only for local debugging.
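A caller that assembles action commands programmatically can enforce the exactly-one-target rule before invoking the CLI. The helper below is a hypothetical sketch: it uses only the three flags listed above and treats --direct-url as a flag without a value, as the listing suggests.

```python
def browser_target_args(session_id=None, connect_url=None, direct_url=False) -> list:
    """Return the argv fragment for exactly one browser target."""
    candidates = []
    if session_id:
        candidates.append(["--session-id", session_id])
    if connect_url:
        candidates.append(["--connect-url", connect_url])
    if direct_url:
        candidates.append(["--direct-url"])
    if len(candidates) != 1:
        raise ValueError(
            "expected exactly one browser target, got %d" % len(candidates)
        )
    return candidates[0]


# Build an argv list without running it; <session_id> is a placeholder.
argv = ["browser-cli", "action", "page-info"]
argv += browser_target_args(session_id="<session_id>")
```

Failing locally avoids a round trip through the CLI for a mistake the caller can detect itself.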
Diagnostics, case files, and compatibility aliases:
browser-cli auth status
browser-cli auth export-env
browser-cli commands
browser-cli commands --workflow case_file_task
browser-cli case schema
browser-cli case scaffold --template page-inspection --url https://example.com --output case.yaml
browser-cli case scaffold --template agent-primitives --output agent-primitives-case.yaml
browser-cli case scaffold --template navigation-flow --output navigation-case.yaml
browser-cli case scaffold --template file-upload --output upload-case.yaml
browser-cli case scaffold --template checkout-flow --output checkout-case.yaml
browser-cli case scaffold --template interactive-targeting --output interactive-case.yaml
browser-cli case validate --file case.yaml
browser-cli case run --file case.yaml
browser-cli doctor
browser-cli doctor --smoke-session
browser-cli direct-url
browser-cli prepare
browser-cli list-contexts
browser-cli close-session --session-id <session_id>

All command output is JSON. --json is accepted as a no-op compatibility flag
for agents that habitually request machine-readable output; it can appear before
the command group or after subcommands, for example browser-cli --json auth status, browser-cli auth status --json, or browser-cli action snapshot --session-id <session_id> --json.
Successful commands include:
{
  "ok": true,
  "command": "session.create"
}

Failed commands include:
{
  "ok": false,
  "command": "session.create",
  "error": "configuration_error",
  "message": "..."
}

Agents should parse ok, command, and error first, then use
command-specific fields. Failure messages and payload fields are sanitized before
printing: api_key, token-like query parameters, and the current
LEXMOUNT_API_KEY value are masked unless a successful command explicitly uses a
local reveal flag.
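The parse order above can be sketched as a helper that reads the common envelope before touching command-specific fields. The function name and the shape of its return value are illustrative; the sample input is the failure payload shown earlier.

```python
import json


def read_envelope(stdout: str) -> dict:
    """Extract the common envelope fields from a browser-cli JSON payload."""
    payload = json.loads(stdout)
    return {
        "ok": payload.get("ok") is True,
        "command": payload.get("command"),
        "error": payload.get("error"),
        "message": payload.get("message"),
        # Keep the full payload for command-specific fields.
        "payload": payload,
    }


failure_stdout = (
    '{"ok": false, "command": "session.create", '
    '"error": "configuration_error", "message": "..."}'
)
envelope = read_envelope(failure_stdout)
```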
browser-cli commands returns a parser-backed command catalog with
schema_version, groups, command_count, commands, json_output,
secret_policy, agent_references, agent_examples, agent_entrypoints, and
agent_workflows.
Use --names-only for compact command discovery and --group action when
choosing a browser action. Use browser-cli action guide --task <task> for
compact task-specific action selection before reading larger references. Use
agent_references to load detailed Skill references such as
references/connect-from-codex.md when coordinating browser.lexmount.cn site
work, references/skill-positioning.md when deciding when to use this Skill or
comparing cloud-browser agent gaps, references/usable-status.md when checking
the current usable baseline, and references/action-playbook.md when action
selection, structured result parsing, masking, or browser-target details are needed.
agent_references.connect_from_codex.content_command points to
browser-cli reference get --id connect_from_codex, which returns the packaged
browser.lexmount.cn implementation guide.
agent_references.skill_positioning.content_command points to
browser-cli reference get --id skill_positioning, which returns the packaged
Skill positioning, supported operations, Browserbase Skills comparison, and product
gap notes.
agent_references.usable_status.content_command points to
browser-cli reference get --id usable_status, which returns the installed
setup/readiness boundary reference.
browser-cli skill status compares the local Codex Skill directory with the
packaged Skill resources; use browser-cli skill install --force only after
reviewing stale_files or missing_files.
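A review step before a forced install might look like the sketch below. It assumes stale_files and missing_files are lists of relative paths in the skill status payload; the helper name and the sample file names are illustrative.

```python
def files_to_review(status: dict) -> list:
    """Collect the paths to inspect before `skill install --force`."""
    stale = status.get("stale_files") or []
    missing = status.get("missing_files") or []
    # De-duplicate and sort so the review list is stable.
    return sorted(set(stale) | set(missing))


# Illustrative payload, not captured CLI output.
sample_status = {
    "stale_files": ["SKILL.md"],
    "missing_files": ["references/action-playbook.md"],
}
review = files_to_review(sample_status)
```

An empty list suggests the local Skill directory already matches the packaged resources, so a forced install is likely unnecessary.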
agent_references.action_playbook.content_command points to
browser-cli reference get --id action_playbook, which returns the packaged
markdown content from an installed CLI. agent_examples points to packaged
playbook and case-file examples, readable with browser-cli example list and
browser-cli example get --id setup_verification_playbook,
browser-cli example get --id auth_lifecycle_playbook,
browser-cli example get --id persistent_context_playbook,
browser-cli example get --id page_inspection_case,
browser-cli example get --id form_fill_case,
browser-cli example get --id content_extraction_case,
browser-cli example get --id browser_state_case,
browser-cli example get --id navigation_flow_case,
browser-cli example get --id file_upload_case,
browser-cli example get --id checkout_flow_case,
browser-cli example get --id interactive_targeting_case, or
browser-cli example get --id page_diagnostics_case. Use --workflows-only when
you only need the structured setup,
Connect from Codex auth, device-code auth, scoped token lifecycle, one-off page
task, persistent login state, session recovery, case file task, form interaction,
interactive targeting, content extraction, browser state management, state waits, and page diagnostics workflows, or
--workflow <id> to fetch a single workflow. agent_workflows gives ordered
steps with fields to read, success conditions, failure hints, and cleanup
commands. The read arrays include auth flow availability, export usability,
and context reuse availability fields when those values decide the next step.
Action catalog entries include browser_target.exactly_one_of so
agents can supply exactly one of --session-id, --connect-url, or
--direct-url.
Argument parsing errors also return JSON on stdout with exit code 2:
{
  "ok": false,
  "command": "action.open-url",
  "error": "argument_error",
  "message": "the following arguments are required: --url",
  "usage": "usage: browser-cli action open-url ..."
}

Agents should use the usage field to repair malformed commands instead of
parsing stderr.
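A minimal sketch of that repair path, assuming the caller has captured the exit code and stdout of a browser-cli invocation. The helper name is hypothetical and the sample input mirrors the payload above.

```python
import json


def repair_hint(returncode: int, stdout: str):
    """Return the usage string for an argument error, else None."""
    payload = json.loads(stdout)
    if returncode == 2 and payload.get("error") == "argument_error":
        return payload.get("usage")
    return None


argument_error_stdout = json.dumps(
    {
        "ok": False,
        "command": "action.open-url",
        "error": "argument_error",
        "message": "the following arguments are required: --url",
        "usage": "usage: browser-cli action open-url ...",
    }
)
hint = repair_hint(2, argument_error_stdout)
```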
browser-cli doctor returns top-level ok, failed, warnings, checked,
ready_for_browser_actions, check-name arrays, and a repair_plan that
aggregates fix commands/env/guidance. Its checks array uses pass, warn,
fail, or skipped statuses for Python/runtime, install path, version,
command catalog, case schema, packaged references/examples, environment,
direct URL, API connectivity, and optional browser smoke-session checks.
doctor reports whether the direct URL can be built, but hides the full URL by
default; use --reveal-connect-url only in a trusted local shell. The browser_cli check
reports version_source so agents can distinguish installed package metadata
from the package fallback version.
The agent_references check verifies packaged Skill reference docs such as
skill_positioning, usable_status, and action_playbook are readable from
the installed CLI and reports missing_required_references,
invalid_references, and checked_references
with content_command/package_resource metadata. The agent_examples check
verifies packaged playbooks and case examples are readable, validates YAML case
examples, and reports missing_required_examples, invalid_examples,
checked_examples, case_valid, and case_errors. The command_catalog check
verifies the installed CLI has the commands and agent_workflows expected by
the Codex Skill and reports
missing_required_commands, missing_required_workflows, or
missing_required_workflow_steps with upgrade guidance when the action or
workflow surface is too old or missing critical steps such as cleanup. That
required surface includes selector actions, role-based text/existence/geometry
checks, press/hover/scroll, select/check/uncheck, role/text/label actions,
accessibility snapshot, interactive-only snapshot, and diagnostic commands. It
also reports invalid_workflow_command_references when a workflow step points
at a command missing from the parser-backed catalog, and
invalid_agent_entrypoint_command_references when a quick-start entrypoint does
the same. The
action_guides check verifies task-specific action guides such as
interactive_targeting, form_interaction, content_extraction,
browser_state_management, and page_diagnostics; it reports
missing_required_action_guides, required_guide_fields, and
invalid_action_guides, plus invalid_guide_command_references when a guide
points at a command not present in the parser-backed catalog. This tells agents
when an installed CLI is too old to guide them away from custom
Playwright/JavaScript.
The case_schema check verifies that repeatable case files can use the Skill's
expected semantic, state, content, storage/cookie, and diagnostic actions; it
reports required_case_actions,
required_case_scaffold_templates, missing_required_case_actions,
missing_supported_actions, missing_action_schemas,
missing_case_scaffold_templates, checked_case_scaffold_templates,
invalid_case_scaffold_templates, and invalid_action_schemas with upgrade
guidance when the installed CLI is too old or its packaged starter cases no
longer validate for case-based smoke tests.
The auth_login_contract check verifies that auth login still exposes the
handoff fields agents need for safe setup, including setup_blocks,
copyable_commands, local_env, verification, secret_policy,
connect_from_codex_url, and runtime-auth blockers.
The device_code_contract check verifies that auth login --device-code
still exposes the pending approval contract, required device-code endpoints,
required browser-site support, fallback setup blocks, and runtime-auth blockers.
The connect_from_codex_contract check verifies that browser.lexmount.cn
handoff fields such as capabilities, browser_site_acceptance_tests, token
lifecycle, runtime auth, and device-code API contracts are still present.
doctor masks api_key in direct URLs and diagnostic error messages by default.
doctor --smoke-session creates and closes a temporary session after API
connectivity passes, then reports the browser_smoke_session check with
created, closed, session_id, and actionable close guidance if cleanup
fails. Failed, warning, or skipped checks may include a fix object with a
stable code, recommended commands, relevant env names, and concise
guidance; agents should prefer repair_plan when telling the user how to
repair setup. doctor also verifies packaged agent prompt metadata with an
agent_prompt check so install-time Codex guidance stays aligned with doctor,
workflow discovery, examples, and secret-handling rules. Credential fixes also include
repair_plan.connect_from_codex.required_runtime_auth,
required_token_lifecycle, and site_capability_status, so agents can explain
browser.lexmount.cn, SDK, API, and gateway blockers from doctor output.
doctor --json is a no-op compatibility form because JSON is already the only
output format.
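The doctor fields described above can be condensed into a short readiness summary. In this sketch, ready_for_browser_actions, checks, and repair_plan come from the description; the name and status keys on each checks entry are assumptions, as are the helper name and the sample report.

```python
def summarize_doctor(report: dict) -> dict:
    """Group doctor checks by status and surface the repair plan."""
    by_status = {}
    for check in report.get("checks") or []:
        # "name" and "status" are assumed keys of each checks entry.
        by_status.setdefault(check.get("status"), []).append(check.get("name"))
    return {
        "ready": report.get("ready_for_browser_actions") is True,
        "by_status": by_status,
        "repair_plan": report.get("repair_plan"),
    }


# Illustrative report, not captured doctor output.
sample_report = {
    "ok": False,
    "ready_for_browser_actions": False,
    "checks": [
        {"name": "browser_cli", "status": "pass"},
        {"name": "environment", "status": "fail"},
    ],
    "repair_plan": {"placeholder": True},
}
summary = summarize_doctor(sample_report)
```

When ready is false, the repair_plan object is the part to relay to the user, in line with the guidance above to prefer it over individual fix objects.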
browser-cli auth status reports local credential presence without revealing
the API key. browser-cli auth export-env returns commands and script
fields; generated commands are placeholders or masked unless
--from-current --reveal-secrets is explicitly used locally.
For a new browser task, agents should prefer this sequence:
browser-cli commands --workflow session_recovery
browser-cli commands --workflow first_browser_task
browser-cli commands --workflow case_file_task
browser-cli case schema
browser-cli case schema --action observe
browser-cli case schema --action act
browser-cli case schema --action extract
browser-cli case schema --action fill-label
browser-cli example get --id auth_lifecycle_playbook --metadata-only
browser-cli example get --id persistent_context_playbook --metadata-only
browser-cli example get --id agent_primitives_case --metadata-only
browser-cli example get --id form_fill_case --metadata-only
browser-cli example get --id content_extraction_case --metadata-only
browser-cli example get --id browser_state_case --metadata-only
browser-cli example get --id navigation_flow_case --metadata-only
browser-cli example get --id file_upload_case --metadata-only
browser-cli example get --id checkout_flow_case --metadata-only
browser-cli example get --id interactive_targeting_case --metadata-only
browser-cli example get --id page_diagnostics_case --metadata-only
browser-cli case scaffold --template agent-primitives --output agent-primitives-case.yaml
browser-cli case scaffold --template form-fill --output form-case.yaml
browser-cli case scaffold --template content-extraction --output content-extraction-case.yaml
browser-cli case scaffold --template browser-state --output browser-state-case.yaml
browser-cli case scaffold --template navigation-flow --output navigation-case.yaml
browser-cli case scaffold --template file-upload --output upload-case.yaml
browser-cli case scaffold --template checkout-flow --output checkout-case.yaml
browser-cli case scaffold --template interactive-targeting --output interactive-case.yaml
browser-cli case scaffold --template page-diagnostics --output diagnostics-case.yaml
browser-cli session create
browser-cli action open-url --session-id <session_id> --url <url>
browser-cli action wait-url --session-id <session_id> --url <url-or-fragment>
browser-cli action wait-title --session-id <session_id> --title <title-or-fragment>
browser-cli action wait-load-state --session-id <session_id> --state complete
browser-cli action page-info --session-id <session_id>
browser-cli action snapshot --session-id <session_id>
browser-cli action exists --session-id <session_id> --selector <selector>
browser-cli action click --session-id <session_id> --selector <selector>
browser-cli action wait-network-idle --session-id <session_id> --idle-ms 500
browser-cli action wait-text --session-id <session_id> --text <text>
browser-cli action wait-count --session-id <session_id> --selector <selector> --count <n> --comparison gte
browser-cli action wait-attribute --session-id <session_id> --selector <selector> --name <name>
browser-cli action type --session-id <session_id> --selector <selector> --text <text>
browser-cli action get-text --session-id <session_id> --selector <selector>
browser-cli action get-value --session-id <session_id> --selector <selector>
browser-cli action storage-get --session-id <session_id> --area local --key <key>
browser-cli action wait-storage --session-id <session_id> --area local --key <key>
browser-cli action cookie-get --session-id <session_id> --name <name>
browser-cli action wait-cookie --session-id <session_id> --name <name>
browser-cli action query --session-id <session_id> --selector <selector>
browser-cli action screenshot --session-id <session_id> --output /tmp/final.png
browser-cli session close --session-id <session_id>

The case schema supports the following kinds of repeatable steps, so agents can encode common smoke tests without dropping into custom browser scripts. Each group lists representative steps:

- Agent primitives and semantic form/targeting steps: observe, act, extract, fill, fill-label, fill-role, click-label, click-role, click-text, wait-text, get-value-role, get-text-role, exists-role, select-label, select-role, check-role, uncheck-role, hover-role, press-role, press-key, scroll-into-view-role, click-index, form-snapshot, interactive-snapshot, and accessibility-snapshot.
- Selector state/value checks: query, inspect, count, wait-count, wait-state, wait-attribute, get-attribute, get-value, wait-value, bounding-box, clear, set-value, set-file-input, dispatch-event, and submit.
- Navigation/status checks: page-info, wait-url, wait-title, wait-load-state, wait-network-idle, wait-network, and wait-console.
- Browser state checks: storage-get, storage-set, storage-remove, storage-clear, wait-storage, cookie-get, cookie-set, cookie-delete, cookie-clear, and wait-cookie.
- Content extraction snapshots: text-snapshot, link-snapshot, table-snapshot, and list-snapshot.
- Diagnostic surfaces: dialog-snapshot, wait-dialog, frame-snapshot, wait-frame, wait-role, performance-snapshot, network-snapshot, and console-snapshot.
Add expect to any case step when a structured result must make the run fail
instead of merely being reported. For example:
- action: wait-text
  text: Saved
  expect:
    found: true
- action: wait-storage
  key: seenIntro
  value: "true"
  match: exact
  expect:
    found: true

Common agent recipes:
- Form submit: interactive-snapshot or form-snapshot -> fill-label, fill-role, or fill, set-value, set-file-input, clear-role, or clear -> wait-value-role, get-value-role, wait-value, or get-value -> blur-role or blur if validation is focus-driven -> select-label, select-role, or select-option, check-label, check-role, uncheck-role, or check -> wait-state-role --state enabled, wait-state --state enabled, or wait-role for async submit buttons -> dispatch-event if explicit input/change is needed -> submit --selector <form-or-field>, click-label --label <label>, click-role --role button --name <text>, or click-text -> wait-url or wait-text.
- Visible button/link: run browser-cli commands --workflow interactive_targeting, use interactive-snapshot or accessibility-snapshot to choose the target, then wait-role when the control appears asynchronously, then use exists-role, get-text-role, or bounding-box-role to confirm semantic existence, text, or geometry before click-label, click-role, or click-text; run link-snapshot when the task is to choose, inspect, or report navigation URLs, then use scroll-into-view and selector click after exists, inspect, or bounding-box confirms a stable selector.
- Link navigation: run browser-cli commands --workflow link_navigation and browser-cli action guide --task link_navigation; use link-snapshot to inspect visible text, href, same-origin, external, download, and masked URL fields, activate with click-role, click-text, or open-url, then verify with wait-url, wait-title, wait-load-state, and page-info.
- Repeated list item: list-snapshot for menus, search results, listboxes, and task lists -> read items, links, checked, selected, and expanded; use --selector and --max-items to keep output bounded. Fall back to query -> choose a zero-based candidate -> click-index when the list is not semantic.
- Page content/data: run browser-cli commands --workflow content_extraction, then start with action extract --surface text --surface links --selector main; use --surface all when one bounded result should include outline, tables, lists, and accessibility nodes. Choose narrower outline-snapshot, text-snapshot, link-snapshot, table-snapshot, list-snapshot, or accessibility-snapshot when the bundled extract result is too broad or truncated before falling back to snapshot or custom JavaScript.
- Manage browser state: run browser-cli commands --workflow browser_state_management, then choose storage-get, storage-set, cookie-get, cookie-set, wait-storage, or wait-cookie before using custom JavaScript. Use these for local/session storage and document.cookie-visible cookies only.
- Upload files: run
browser-cli commands --workflow file_upload, inspect upload controls with form-snapshot or query, then use set-file-input and verify file_count, requested_files, and files before submitting.
- Dialogs and frames: run browser-cli commands --workflow dialog_frame_handling and browser-cli action guide --task dialog_frame_handling, then use wait-dialog, dialog-snapshot, wait-frame, or frame-snapshot before custom JavaScript.
- Menus and keyboard: run browser-cli commands --workflow menu_keyboard_flow and browser-cli action guide --task menu_keyboard_flow, then use hover-role, focus-role, press-role, wait-attribute-role, list-snapshot, or press-key before custom JavaScript.
- Mouse gestures: run browser-cli commands --workflow mouse_interaction and browser-cli action guide --task mouse_interaction, then prefer double-click-role, right-click-role, or drag-role-to-role, then fall back to selector drag-to, double-click, or right-click; verify with page-info, wait-text, interactive-snapshot, or wait-url.
- Navigation: run browser-cli commands --workflow navigation_flow and browser-cli action guide --task navigation_flow, then use open-url, reload, go-back, go-forward, wait-url, wait-title, and wait-load-state before custom JavaScript.
- Visual evidence: run browser-cli commands --workflow visual_capture and browser-cli action guide --task visual_capture, set viewport when needed, then use screenshot-role, screenshot-selector, full-page screenshot, or bounded text-snapshot before custom JavaScript.
- Semantic waits: run browser-cli commands --workflow semantic_waits and browser-cli action guide --task semantic_waits, then use wait-role, wait-text, wait-state-role, wait-attribute-role, or wait-count before sleeps or polling JavaScript.
- Deterministic wait: run
browser-cli commands --workflow state_waits, then choose the narrowest wait-* command such as wait-load-state, wait-url, wait-state-role, wait-attribute-role, wait-network, wait-storage, or wait-cookie before using sleeps or custom JavaScript.
- Table or report data: table-snapshot -> read headers, rows, and cells; use --selector, --max-rows, and --max-cells to keep output bounded.
- Text, alerts, or status messages: text-snapshot -> read texts, kind, aria_live, text_length, and text_truncated; use --selector, --max-nodes, and --max-chars before falling back to full snapshot.
- Modal or blocking prompt: run browser-cli commands --workflow dialog_frame_handling first; use wait-dialog --text <text> --modal-only when the prompt appears asynchronously, otherwise dialog-snapshot; read dialogs, title, description, text, controls, control_count, and link masks; then use click-label, click-role, click-text, or click-index for the chosen control.
- Embedded frame: run browser-cli action guide --task dialog_frame_handling, then frame-snapshot -> read frames, src, readable, frame_url, body_text, read_error, and bounding_box; same-origin frames can expose bounded text, while cross-origin frames usually require using the frame's URL or reporting that direct DOM inspection is unavailable.
- Page loading or slow resource diagnosis: performance-snapshot -> read navigation, resources, initiator_types, duration, transfer_size, and response_status; use --initiator-type and --min-duration-ms to keep output focused.
- Fetch/XHR diagnosis: run network-snapshot --install-only before the action, trigger the page behavior, then run network-snapshot to read masked request URLs, method, status, ok, failed, duration_ms, and request_has_body; use --source, --method, or --failed-only to narrow entries and --clear after collecting. To wait for a future request, run wait-network --url <path> --method <method>; add --status, --failed-only, or --after-index when stale buffered entries should be ignored.
- Console or page error diagnosis: run
console-snapshot --install-only before the suspicious action, trigger the page behavior, then run console-snapshot to read entries, source, level, method, and masked text. Use --clear after collecting entries. To wait for a future error, run wait-console --source pageerror --level error; pass --after-index from a prior console-snapshot entry when stale buffered entries should be ignored.
- Page structure: outline-snapshot -> read headings and landmarks before deciding where to inspect, click, or scroll.
- Stuck selector: inspect to check state.disabled, state.readonly, visible, in_viewport, attributes, masked value, and optional sanitized HTML before trying another action.
- Navigation or async refresh: run browser-cli commands --workflow navigation_flow; use open-url, reload, go-back, or go-forward, then confirm with page-info, wait-url, wait-title, wait-load-state, wait-network-idle, performance-snapshot, wait-text, or snapshot.
- Visual capture: run browser-cli commands --workflow visual_capture; use page-info, set-viewport, screenshot-role, screenshot-selector, full-page screenshot, and bounded text-snapshot before custom JavaScript.
- Semantic target readiness: run browser-cli commands --workflow semantic_waits; use wait-role, wait-text, wait-state-role, wait-attribute-role, exists-role, get-text-role, and bounding-box-role before custom polling JavaScript.
- Runtime errors: install console-snapshot --install-only, trigger the suspected action, read console-snapshot or wait with wait-console, then use text-snapshot, wait-dialog, dialog-snapshot, wait-frame, or inspect to correlate visible state with JS errors.
- Menu or keyboard flow: run
browser-cli commands --workflow menu_keyboard_flow; then use focus-role, hover-role, press-role, scroll-into-view-role, selector-scoped focus, hover, or press, active/global press-key, wait-attribute-role for aria-expanded or aria-selected, or dispatch-event, then inspect again with interactive-snapshot.
- Dialog flow: use wait-dialog when the dialog appears asynchronously, otherwise dialog-snapshot for modal dialogs, alert dialogs, cookie banners, and confirmation prompts; choose a control from controls, then click semantically and confirm with wait-text, wait-role, or text-snapshot.
- Frame flow: use wait-frame when the iframe or embedded app appears asynchronously, otherwise frame-snapshot before writing frame-related JavaScript; use readable, same_origin, frame_url, and read_error to decide whether the agent can inspect the embedded page or needs a different browser workflow.
- Read results: page-info for URL/title/readyState/viewport checks, set-viewport before responsive screenshots or layout-sensitive checks, wait-title for async title changes, wait-count for dynamic lists, list-snapshot for menu/listbox/search-result/task-list content, text-snapshot for visible paragraphs, alerts, status messages, and bounded readable page text, get-attribute-role and wait-attribute-role for semantic DOM attributes, wait-attribute for selector DOM attributes, wait-state-role for semantic enabled/visible/checked/focused states, wait-state for selector states, get-text-role for semantic text checks, get-text for known selectors, or snapshot when the selector is unknown. Use wait-text --state absent when loading, toast, or error text should disappear.
- Browser state: use storage-get to inspect local/session storage, storage-set to adjust feature flags or onboarding state, and storage-remove or storage-clear --prefix <prefix> for targeted cleanup. Use wait-storage after actions that should create/remove keys. Use cookie-get, cookie-set, cookie-delete, or cookie-clear for document.cookie-visible cookies, and wait-cookie after consent/login flows; HttpOnly cookies are not visible through this action surface.
- Debug candidate selectors: use count for cardinality, query for node metadata, inspect for state, get-attribute for href/value/aria checks, then wait-count, wait-state, or wait-attribute for async DOM changes.
- Final evidence: set-viewport when evidence needs a stable browser size, screenshot-role for a semantic target, screenshot-selector for a known panel/control, then screenshot for full viewport/page evidence before closing the session unless it should stay open.
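
The final-evidence recipe ends by closing the session, and a script that drives the CLI can make that cleanup unconditional. The following is a minimal Python sketch, not part of browser-cli itself. It assumes that each command prints a single JSON object on stdout and that the session create result exposes the id under a session_id key; both are assumptions to check against real output. The names run_cli and capture_evidence are hypothetical helpers.

```python
import json
import subprocess
from typing import Callable, Sequence

# A runner takes browser-cli arguments and returns the parsed JSON result.
Runner = Callable[[Sequence[str]], dict]


def run_cli(argv: Sequence[str]) -> dict:
    """Run one browser-cli command and parse stdout as JSON.

    Assumption: the command prints exactly one JSON object on stdout.
    """
    proc = subprocess.run(
        ["browser-cli", *argv], capture_output=True, text=True, check=True
    )
    return json.loads(proc.stdout)


def capture_evidence(url: str, output: str, run: Runner = run_cli) -> dict:
    """Open a page, wait for it to load, take a screenshot, always close."""
    created = run(["session", "create"])
    # Assumption: the create result carries the id under "session_id".
    session_id = created["session_id"]
    try:
        run(["action", "open-url", "--session-id", session_id, "--url", url])
        run(["action", "wait-load-state", "--session-id", session_id,
             "--state", "complete"])
        return run(["action", "screenshot", "--session-id", session_id,
                    "--output", output])
    finally:
        # Explicit cleanup runs even when a wait or action step fails.
        run(["session", "close", "--session-id", session_id])
```

Passing the runner as a parameter keeps the flow testable without a live Lexmount session: a fake runner can record the argument lists and simulate a failing step.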
Use session create --context-metadata-json '{"purpose":"codex-login"}' --context-selection newest --create-context-if-missing --context-mode read_write
when login state or cookies should survive between sessions. The command picks
a reusable matching context, creates one if requested, then returns context_reuse with
candidate contexts, created, selected, normalized_status, availability,
top-level reusable, locked, reuse_reason, selection_strategy,
selection_summary, and locked/reusable details. Treat
availability: "available" as reusable, availability: "locked" as busy, and
availability: "unavailable" as a state that needs a different context. Use
context list --metadata-json '{"purpose":"codex-login"}' --selection newest --include-reuse-state
to inspect reusable, locked, and metadata-mismatched candidates without mutation; read
reuse_candidates, recommended_context_id, and selection_summary. Use
context status --context-id <context_id> before reusing a known context id. Use
context pick --metadata-json '{"purpose":"codex-login"}' --selection newest --dry-run
when you need to inspect or report candidates before creating a session; read
selection_strategy, selection_summary.recommended_next_action, decision_reason,
locked_matches, metadata_mismatches, reusable_matches, and would_create
before deciding whether to reuse, create, wait, or adjust filters. Candidate
metadata_diagnostics reports matched, missing, and different metadata keys
with values redacted. metadata_diagnostics.metadata_source can be
local_registry when browser-cli is using metadata it recorded locally after
creating a context.
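
The availability rules above can be sketched as a small decision helper. This is an illustrative Python sketch only: it assumes the context_reuse object is available as a parsed dict with availability as a direct key, and the returned labels (reuse, wait, choose-another-context, inspect) are hypothetical names, not browser-cli output.

```python
def next_step(context_reuse: dict) -> str:
    """Map context_reuse availability to a suggested agent decision.

    Assumption: "availability" is a direct key of the parsed object.
    The returned labels are hypothetical and exist only in this sketch.
    """
    availability = context_reuse.get("availability")
    if availability == "available":
        return "reuse"  # reusable context
    if availability == "locked":
        return "wait"  # context is busy
    if availability == "unavailable":
        return "choose-another-context"  # needs a different context
    # Unknown or missing value: inspect candidates before deciding.
    return "inspect"
```

An unrecognized value falls through to inspect so the agent reads the candidate details, for example with context pick --dry-run, instead of guessing.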
This repository includes a starter SKILL.md so the project can
evolve into a Codex skill. The skill stays a thin wrapper around this CLI:
- SKILL.md should teach agents when to use browser sessions, contexts, and actions.
- The skill should install or verify browser-cli.
- The skill should never store API keys in the skill directory.
- The skill should keep using JSON command output instead of importing Python internals directly.
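
One way a skill could verify the CLI while relying only on JSON output is sketched below in Python. It assumes browser-cli doctor --json prints a JSON object with a top-level boolean ready_for_browser_actions; the exact report shape is an assumption to confirm against real doctor output. The helpers cli_installed and ready_for_actions are hypothetical names.

```python
import json
import shutil
from typing import Callable, Optional


def cli_installed(which: Callable[[str], Optional[str]] = shutil.which) -> bool:
    """Return True when a browser-cli executable is found on PATH."""
    return which("browser-cli") is not None


def ready_for_actions(doctor_stdout: str) -> bool:
    """Interpret the stdout of `browser-cli doctor --json`.

    Assumption: the report is a JSON object with a top-level boolean
    "ready_for_browser_actions". Anything else is treated as not ready.
    """
    try:
        report = json.loads(doctor_stdout)
    except json.JSONDecodeError:
        return False
    if not isinstance(report, dict):
        return False
    return report.get("ready_for_browser_actions") is True
```

Treating unparseable or unexpected output as not ready keeps the skill from starting browser actions on a half-configured install, and nothing here reads or stores credentials.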
The smoothest onboarding path would be a dedicated "Connect from Codex" flow:
- Add /connect/codex and accept optional project_id, repeated scope, and expires_in query parameters generated by browser-cli auth login or browser-cli auth scopes --include-site-contract.
- Add a scoped API key wizard for agent use, with clear permissions, optional expiration, and one-click revoke.
- Provide a copyable install block: uv tool install git+https://github.com/lexmount/browser-cli.git.
- Add a "Verify CLI" section that tells users to run browser-cli doctor --json and browser-cli doctor --smoke-session after setting env vars, then explains ready_for_browser_actions and browser_smoke_session.
- Show the selected Project ID, scoped credential status, copyable browser-cli auth export-env/export ... commands, and revoke/expiration details for the issued credential.
- Show browser-cli auth login, auth status, and auth export-env as the local setup path until device-code is available.
- Longer term, support device-code or OAuth-style authorization so Codex can ask the user to approve access in the browser and then receive a local, short-lived token without the user manually copying API keys.