Skip to content

fix: secure login flow with copy-paste API key exchange#337

Open
KAJdev wants to merge 7 commits into
mainfrom
zeke/ae-3128-fix-login-method-to-be-more-secure
Open

fix: secure login flow with copy-paste API key exchange#337
KAJdev wants to merge 7 commits into
mainfrom
zeke/ae-3128-fix-login-method-to-be-more-secure

Conversation

@KAJdev
Copy link
Copy Markdown
Contributor

@KAJdev KAJdev commented May 21, 2026

the old login flow had the CLI poll for the API key after browser approval. any process that knew the request ID could intercept the key via the same unauthenticated query.

replaces polling with a copy-paste flow: the browser displays the generated API key after approval, and the user pastes it into the CLI prompt. the CLI never fetches the key over the network.

also removes the get_flash_auth_request_status polling method from the GraphQL client since it is no longer needed.

AE-3128

@KAJdev KAJdev marked this pull request as ready for review May 21, 2026 17:42
@promptless
Copy link
Copy Markdown

promptless Bot commented May 21, 2026

Promptless prepared a documentation update related to this change.

Triggered by runpod/flash#337

Updated the flash login command documentation to describe the new copy-paste authentication flow where users copy the API key from the browser and paste it into the CLI. Removed the --timeout flag documentation since it's no longer used. Also updated Python version requirements from "Python 3.10, 3.11, or 3.12" to "Python 3.10 or later" across 6 tutorial and reference pages.

Review: Update flash login docs for copy-paste auth flow

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the flash CLI login flow to avoid fetching newly-issued API keys over an unauthenticated polling endpoint by switching to a user copy/paste exchange, and removes the no-longer-needed GraphQL polling method.

Changes:

  • Replace the CLI’s polling-based login with a copy/paste API key prompt; remove --timeout and related polling/deadline logic.
  • Remove get_flash_auth_request_status from the GraphQL client and update unit tests accordingly.
  • Introduce new SSE log parsing/streaming helpers in request_logs.py and broaden the supported Python version range in packaging metadata.

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
tests/unit/test_login.py Updates unit tests to match the copy/paste login flow and removes polling-related coverage.
tests/unit/test_login_extended.py Updates extended login/GraphQL tests to remove polling status checks and align with paste-based login.
src/runpod_flash/core/resources/request_logs.py Adds SSE event/log parsing and a pod log streaming generator.
src/runpod_flash/core/api/runpod.py Removes the polling query method get_flash_auth_request_status.
src/runpod_flash/cli/commands/login.py Implements copy/paste API key login flow and removes timeout/polling logic.
pyproject.toml Relaxes requires-python from >=3.10,<3.13 to >=3.10.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/runpod_flash/cli/commands/login.py
Comment thread src/runpod_flash/core/resources/request_logs.py
Comment thread src/runpod_flash/core/resources/request_logs.py
Comment thread src/runpod_flash/core/resources/request_logs.py
Comment thread src/runpod_flash/core/resources/request_logs.py
Comment thread src/runpod_flash/core/resources/request_logs.py
Copy link
Copy Markdown
Member

@deanq deanq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AE-3128 review — login flow security. A few suggestions on the CLI paste UX.

Comment thread src/runpod_flash/cli/commands/login.py
Comment thread src/runpod_flash/cli/commands/login.py
Comment thread src/runpod_flash/cli/commands/login.py
@KAJdev KAJdev requested a review from deanq June 2, 2026 18:09
Copy link
Copy Markdown
Contributor

@runpod-Henrik runpod-Henrik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Henrik's AI-Powered Bug Finder — PR #337 Review

Verdict: NEEDS WORK — login change is solid, but the PR also ships an unrelated, untested, and runtime-broken pod-log SSE streaming feature in request_logs.py. Pull that out, then this is good.

CI is green across all 4 Python versions — but the SSE additions have no tests, so green CI doesn't tell us much about whether they actually work.


1. Login flow — secure copy/paste exchange

Correct shape, addresses the stated vulnerability — with a coordinated server-side caveat

User scenario (before): a malicious process on the user's machine knows the request_id (e.g. via /proc tracing or shell history) and can poll flashAuthRequestStatus over an unauthenticated endpoint to exfiltrate the API key right when the user approves in the browser.

User scenario (after): browser shows the key after approval; user copies it from the browser and pastes it into the CLI. CLI never fetches the key over the network.

The CLI side does the right thing: get_flash_auth_request_status is removed entirely from the GraphQL client, polling loop deleted, no timeout knob needed.

Question: is the server-side endpoint also changing?

Removing the client-side call closes the legitimate polling path, but it doesn't address the root cause if the server-side flashAuthRequestStatus query still returns apiKey to any unauthenticated caller with a valid request_id. A malicious process can still call the query directly.

For the vulnerability to be fully closed, the server should either:

  • Stop returning apiKey from the unauthenticated status query, or
  • Remove the query endpoint entirely.

Is there a coordinated server-side change in flight? If yes, mention it in the PR description so reviewers know the threat model is fully addressed across client + server. If no, this PR is a defense-in-depth measure but not a complete fix — worth saying so explicitly.

Question: browser-side change shipped?

The PR description says "the browser displays the generated API key after approval." That requires a change in the auth-approval web page (the page at the auth_url the CLI prints). If that browser-side change hasn't shipped yet — or hasn't shipped to the same environment the CLI talks to — a user upgrading to this CLI will be prompted to paste a key they can't see in the browser.

Worth confirming the browser flow is live in prod (and dev) before this CLI merges. A short note in the PR description ("browser change deployed in ") would close this.


2. Unrelated SSE log streaming snuck into request_logs.py — pull it out

The PR adds 62 lines to src/runpod_flash/core/resources/request_logs.py: SSEEvent, LogEvent, parse_sse_event, parse_log_event, stream_pod_logs. None of this is mentioned in the PR title, description, or commit messages. None of it has tests.

This is a separate feature (pod-log streaming via SSE). It should be its own PR, with its own tests, its own ticket. Merging it under a login-security title means it won't get the review it needs, and it'll show up in the v1.x.x changelog as part of "secure login flow" — confusing for users tracking what changed.

While the additions are in the diff, I have to review them. Two bugs in the SSE code that would not survive a focused review:

Bug: `stream_pod_logs` uses `client.get(url)` as an async context manager — it isn't

```python
async with get_authenticated_httpx_client() as client:
async with client.get(url) as response: # <-- httpx.AsyncClient.get is a coroutine, NOT an async ctx manager
async for line in response.aiter_lines():
```

User scenario: any caller of `stream_pod_logs(pod_id, tail=...)` gets `TypeError: 'coroutine' object does not support the asynchronous context manager protocol` (or similar) the first time it runs against a real endpoint.

Compare to the existing pattern in the same file at `request_logs.py:295` which uses `client.get(url)` and `await`s the response — not as a context manager. For streaming, the correct httpx idiom is `client.stream("GET", url)`, which IS an async context manager. As written, this function does not work.

Bug: `parse_sse_event` cannot succeed on a single line of SSE input

`stream_pod_logs` does `async for line in response.aiter_lines(): event = parse_sse_event(line)` — feeding one line at a time.

But `parse_sse_event` expects a multi-line event:

```python
event_id_line, data_line = filter(bool, data.split("\n")) # <-- unpack exactly 2
event_id = event_id_line.split(":", 1)[1].strip()
data_json = data_line.split(":", 1)[1].strip()
```

A single line passed to this function fails to unpack ("not enough values to unpack"). Even multi-line input has fragility: real SSE events can include `event:`, `retry:`, multiple `data:` lines, comments, and use `\r\n` line endings per the SSE spec.

The function also catches all exceptions and logs at `log.error`, so failures are silent except for log spam. End result: even if `client.get` were fixed to `client.stream`, every parse call would fail and `stream_pod_logs` would yield zero log events forever.

Test coverage: none

There's no test exercising `parse_sse_event`, `parse_log_event`, or `stream_pod_logs`. CI green tells us the module imports cleanly. It does not tell us anything about correctness.

Recommendation: revert the 62-line addition to `request_logs.py` from this PR. Land it as a separate PR with tests and an `httpx.stream` usage, ideally with a parser based on a real SSE state machine (e.g. read until blank line, accumulate `data:` lines, parse JSON once).


3. UX: pasted API key is echoed in the terminal

`console.input("Paste the API key shown after authorization: ")` echoes the input. For a key that the user is meant to paste (not type) into the prompt, the value ends up in:

  • Terminal scrollback buffer
  • tmux / screen scrollback if applicable
  • Terminal recording tools (`asciinema`, IDE terminal sessions)
  • Shell history hook scripts that snapshot output

`getpass.getpass()` or Rich's `Prompt.ask(..., password=True)` would prevent the echo. Rich already imported; one-line fix:

```python
from rich.prompt import Prompt
api_key = Prompt.ask("Paste the API key shown after authorization", password=True).strip()
```

This is a real exposure for users in environments where terminal output is logged or shared.


4. Issue: KeyboardInterrupt during paste prompt surfaces as unhandled traceback

The user is now blocked at a prompt waiting for paste. If they Ctrl+C (because they realised they pasted into the wrong window, or want to abort), `console.input()` raises `KeyboardInterrupt`, which is not caught by `login_command`'s `except RuntimeError` handler. User sees a Python traceback instead of clean exit.

```python
except (RuntimeError, KeyboardInterrupt) as exc:
```

Or catch `KeyboardInterrupt` separately and print "login cancelled" with `typer.Exit(code=130)` (conventional SIGINT exit code).


5. Nit: empty-key error message could be friendlier

Current: `RuntimeError("no api key provided")` → user sees `Error: no api key provided`.

A user who just pressed Enter (or pasted whitespace) on the prompt may not realise they need to paste the API key from the browser. Better message:

```
no api key provided. copy the key shown in the browser after approval and paste it here.
```

Minor. The current text isn't wrong, just terse.


Nits

  • Test cleanup is good — `_make_mock_client(**status_return)` simplified to `_make_mock_client()`, all the polling-status mocks removed. Tests now reflect the new flow accurately.
  • Removed `--timeout` flag — this is a CLI option breaking change. Anyone scripting `flash login --timeout 30` will now fail with an unknown-flag error. Probably zero real-world impact (login is interactive), but worth a CHANGELOG callout.
  • `test_login_extended.py` imports — `import datetime as dt` is gone, good. The docstring at top still mentions "Covers gaps in test_login.py" — should be updated to reflect that those gap categories (CONSUMED, expiresAt, timeout) no longer exist.

🤖 Reviewed by Henrik's AI-Powered Bug Finder

Copy link
Copy Markdown
Member

@deanq deanq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review from /pr-review-toolkit:review-pr (login security + request_logs.py SSE additions). Critical items: stream_pod_logs is runtime-broken and parse_sse_event drops every event; the SSE block also appears unrelated to AE-3128. The login flow still echoes the pasted key and persists it unvalidated.


if status in {"DENIED", "EXPIRED", "CONSUMED"}:
raise RuntimeError(f"login failed: {status.lower()}")
api_key = console.input("Paste the API key shown after authorization: ").strip()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

console.input(...) echoes the pasted key to the terminal, so the credential lands in scrollback, screen recordings, and shoulder-surfing range. This is the exact exposure this PR set out to close, so it shouldn't ship echoed. Use hidden input:

api_key = console.input("Paste the API key shown after authorization: ", password=True).strip()

(or getpass.getpass). This was raised earlier in the review and the thread was resolved, but the code is unchanged at HEAD.

Comment on lines +36 to +41
if not api_key:
raise RuntimeError("no api key provided")

await asyncio.sleep(POLL_INTERVAL_SECONDS)
check_and_migrate_legacy_credentials()
path = save_api_key(api_key)
console.print(f"[green]Logged in.[/green] Credentials saved to [dim]{path}[/dim]")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any non-empty string is written to the credentials file and the user sees "Logged in." — a typo or stray paste becomes a deferred silent failure that only surfaces as a 401 on the next command, far from the cause. At minimum check the format (e.g. rpa_ prefix) before persisting; ideally fire one cheap authenticated call (e.g. myself/whoami) and only print "Logged in." once the key is confirmed usable. Fail loudly with an actionable message otherwise.

url = f"{RUNPOD_HAPI_URL}/v1/pod/{pod_id}/logs?stream=true&tail={tail}"

async with get_authenticated_httpx_client() as client:
async with client.get(url) as response:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_authenticated_httpx_client() returns an httpx.AsyncClient, whose .get(url) returns a coroutine resolving to a buffered Response — not an async context manager. async with client.get(url) as response: raises TypeError/AttributeError on first call, and aiter_lines() requires a stream-opened response anyway. Use the streaming API and surface HTTP errors:

async with client.stream("GET", url) as response:
    response.raise_for_status()
    async for line in response.aiter_lines():
        ...

Without raise_for_status(), a 401/404/500 (e.g. expired key — directly relevant to this PR) silently yields an empty stream. Every other call site in this file correctly uses await client.get(...).

Comment on lines +51 to +66
def parse_sse_event(data: str) -> Optional[SSEEvent]:
"""
Parses an SSE line into a dictionary
"""
if not data:
return None

try:
event_id_line, data_line = filter(bool, data.split("\n"))
event_id = event_id_line.split(":", 1)[1].strip()
data_json = data_line.split(":", 1)[1].strip()
data = json.loads(data_json)
return SSEEvent(id=event_id, data=data)
except Exception as e:
log.error("Failed to parse SSE event: %s", e)
return None
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two problems in this function:

  1. Contract mismatch — drops 100% of events. event_id_line, data_line = filter(bool, data.split("\n")) assumes a multi-line id:\ndata: frame, but the only caller (stream_pod_logs) feeds it one line at a time via response.aiter_lines(). A single line can't unpack into two targets, so this raises ValueError on essentially every line; the broad except swallows it and returns None, so the stream yields nothing. SSE frames must be accumulated across the blank-line delimiter before parsing.

  2. Docstring is wrong. "Parses an SSE line into a dictionary" — it returns an SSEEvent dataclass (not a dict) from a multi-line block (not a line). Please correct it and document the None-on-failure behavior per the project docstring standard (args/returns/raises).

return None


async def stream_pod_logs(pod_id: str, tail: int = 0):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SSEEvent, LogEvent, parse_sse_event, parse_log_event, and stream_pod_logs are unrelated to the login security fix this PR describes. Verified across the whole branch: nothing outside this file references any of these symbols, and stream_pod_logs itself has no caller — so this is dead code, with no tests, and (per the other comments) non-functional as written. Recommend pulling the entire block out into its own PR behind tests and a real caller. Bundling it here also enlarges the security review surface for no benefit.

Comment on lines +64 to +66
except Exception as e:
log.error("Failed to parse SSE event: %s", e)
return None
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except Exception here (and in parse_log_event at line 75) swallows structurally different failures — the always-present ValueError from the unpack bug is indistinguishable in the logs from a genuine json.JSONDecodeError or a one-off malformed line. Per the project convention (no broad except, never swallow silently, errors must be actionable), narrow to the expected types and log the offending payload:

except (ValueError, IndexError, json.JSONDecodeError) as e:
    log.error("Failed to parse SSE event %r: %s", data, e)
    return None

Comment on lines +84 to +85
if tail < 0:
raise ValueError("tail must be greater than 0")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The guard rejects tail < 0 but the message says "tail must be greater than 0" — yet tail=0 is the default and is allowed. The message should read "tail must be greater than or equal to 0" (or "must not be negative") to match the predicate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants