Skip to content

Do not send FTP commands with embedded CRs or LFs#2444

Closed
somecookie wants to merge 45 commits into
squid-cache:masterfrom
measurement-factory:osd-7-sanitize-ftp-command
Closed

Do not send FTP commands with embedded CRs or LFs#2444
somecookie wants to merge 45 commits into
squid-cache:masterfrom
measurement-factory:osd-7-sanitize-ftp-command

Conversation

@somecookie

Copy link
Copy Markdown
Contributor

FTP command syntax prohibits bare CRs and LFs except for command
termination. FTP servers treat embedded CRs inconsistently. FTP does not
have a generally accepted escape mechanism for safely encoding those
message-framing characters. Thus, Squid must not send FTP commands with
parameters containing embedded CRs or LFs.

Squid assembles FTP commands from a variety of information sources that
depend, in part, on whether the master transaction originated on an
http(s)_port or ftp_port. One can check for (and reject) embedded
CRs and LFs in many code locations. Our attempts to reject invalid
inputs earlier, before the assembly, resulted in significant
functionality changes going beyond this fix scope but still missed an
important injection point.

The approach used here covers all known code paths, has a decent chance
of remaining effective during significant code refactoring, and does not
alter Squid functionality beyond killing transactions that resulted in
problematic FTP command parameters (an in-scope change). This approach
also allows request adaptation services to "fix" problematic requests in
some cases.

Rejected transactions usually result in HTTP 502 ERR_FTP_FAILURE
responses logged as TCP_MISS_ABORTED. More work is needed to provide
better %err_detail for these cases.

This is a Measurement Factory project.

rousskov added 15 commits June 15, 2026 10:04
The new variable is used beyond the `if` statement below its definition.
The `if` statement order does not matter match, but this order is faster
if both components are problematic and this order helps clarify intended
the urlpathDecoded usage scope.
Words like "info", "data", and "state" are popular in legacy Squid code,
but they usually do not add anything to the meaning of the named entity
and, hence, should be avoided.

Highlighting that we are copying something is actually useful.
AnyP::Uri::parse() has a try-catch wrapper around its code, making it
feasible to throw on parsing errors inside [the affected part of] that
legacy code. In general, throwing on parsing errors simplifies and makes
safer parsing code, so we usually want to throw in that context rather
than rely on boolean return values.

Also use "///" for Doxygen comments because they often consume fewer
lines and simplify consistent formatting.
This might be useful in triaging lower-level code and for detecting
CharactersSet reuse/duplication.
Also disclosed that we are going "too far" with these checks.
    % git grep 'if[(]' '*.cc' | wc -l
        21
    % git grep 'if [(]' '*.cc' | wc -l
        14016
... to better tolerate Ftp::Relay::sendCommand() and similar exceptions.

    ERROR: Squid BUG: assurance failed: ...
        exception location: FtpClient.cc(835) writeCommand
    AsyncJob.cc(145) callException: assurance failed ...
    AsyncJob.cc(100) mustStop: Ftp::Relay will stop, reason: exception
    ...
    FATAL: check failed: !request->pinnedConnection()
        exception location: FwdState.cc(1119) connectStart

When branch-added Ftp::Client::writeCommand() assertions throw, the
worker dies instead of just killing the affected ftp_port transaction.

`JobDialer::dial()` catches the BUG exception, as expected, but the
worker still dies because `Client::swanSong()` cleanup code calls
`FwdState::handleUnregisteredServerEnd()` which attempts to retry the
failed FTP transaction despite true `request->pinnedConnection()`, which
triggers another, this time uncaught, exception in `connectStart()`.

Also documented suspicions that `FwdState::pinnedCanRetry()` should
_always_ return false and, hence, should be removed (instead of adding
more and more code/conditions when that method returns false).

----

I hope this out-of-scope change can be reverted (and merged separately)
after we add CR protections to command parser and/or replace
writeCommand() assertions with (moved) embedded new line handling code.
FTP command syntax treats CR and LF characters as command terminators.
All RFC 959 sightings of CR and LF either refer to the CRLF terminator
or treat both characters the same way. For example, RFC 959 defines an
FTP command parameter as a sequence of chars, where:

    <char> ::= any of the 128 ASCII characters except <CR> and <LF>

Squid should (and now does) restrict CR use to the command terminator
sequence, especially since some FTP servers treat CRs as command
delimiters -- if we continue to allow embedded CRs in FTP command
parameters than our FtpClient::writeCommand() will assert when trying to
forward those commands to the FTP server. Moreover, we already use that
CR treatment when _parsing_ FTP responses (see
Ftp::Client::parseControlReply()).

When it comes to command termination, CRs are still optional.

A new ban on CRs in FTP command parameter values means that Squid starts
treating some FTP commands as syntactically invalid, using
EarlyErrorKind::MalformedCommand for the first time since its inception
in 2014 commit eacfca8. For example, `PWD\rQUIT` input is now invalid.

N.B. This change removes "RFC 959 Section 3.1.1.5.2" reference. That
reference was wrong: RFC Section 3.1 is about "data representations"
used for file transfers rather than for command syntax. I probably
missed this problem when the addition of that reference was requested at
http://www.squid-cache.org/mail-archive/squid-dev/201408/0023.html
This commit effectively reverts branch commit 5b6e34d changes. In that
commit, we chose to reject problematic `ftp://` URLs in Uri::parse()
rather than wait until they trigger a creation of a problematic FTP
command in Ftp::Gateway because detecting (and rejecting) a problematic
request ASAP is often the safest option. For example, an allowed request
could cause problems in adaptation services or other parts of Squid
ecosystem that carelessly converts URLs to FTP commands.

However, the above choice comes with significant drawbacks:

* It broke previously working cases where problematic `ftp://` URLs did
  not reach Ftp::Gateway (e.g., where they were rewritten by a REQMOD
  service or sent to a cache_peer).

* Squid started rejecting `ftp://` URL paths with broken
  percent-encoding. For example, an HTTP GET request for
  `ftp://example.com/double%%` resulted in an HTTP 400 (Bad Request)
  ERR_INVALID_URL response instead of leading to Squid sending FTP
  commands with a different filename (i.e. `RETR double%`). That is a
  significant change that may break some "working" use cases as well.

* Finally, URI-based controls missed another CRLF injection point (now
  addressed in the previous branch commit 21581bd), proving that they
  are not sufficient. Squid did not handle the resulting exceptions well
  enough, necessitating more out of scope changes (see branch commit
  becd5b4 that this change also reverts).

Some of these problems were known in advance, and we hesitated moving
the checks into `Uri.cc`. Now, given the number, the weight, and the
scope of these drawbacks, we believe that we made the wrong decision.
This reverts branch commit 21581bd
because official code can now survive the corresponding CR rejection in
Ftp::Client::writeCommand() because that rejection code is no longer
throwing.

The reverted changes are valuable and should be merged separately.

@rousskov rousskov left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description: Squid must not send FTP commands with parameters containing embedded CRs or LFs. ... Squid assembles FTP commands from a variety of information sources ... One can check for (and reject) embedded CRs and LFs in many code locations. Our attempts to reject invalid inputs earlier, before the assembly, resulted in significant functionality changes going beyond this fix scope

While we reverted those earlier changes, some of them are quite valuable, and we plan to polish and submit them as dedicated pull requests after this surgical (and long-awaited) PR is merged.

Comment thread src/clients/FtpClient.cc
if (crlfCharPosition != bufLen-2) {
const auto invalidCharName = buf[crlfCharPosition] == '\r' ? "CR" : "LF";
debugs(9, 2, "ERROR: Caller assembled a malformed FTP command. Found " << invalidCharName << " at position " << crlfCharPosition);
failed(ERR_FTP_FAILURE, 0);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, we should throw here. We tried, but other/out-of-scope problems in current code forced us to table that idea (for now). FWIW, I am uncomfortable with the proposed failed() call in the middle of transaction processing stack, but it matches how we handle similar problems in FTP code, and it passes our tests. In the absence of specific evidence to the contrary, it is the right step forward here despite my concerns, until other improvements allow us to replace such calls with exceptions.

@rousskov rousskov added the S-could-use-an-approval An approval may speed this PR merger (but is not required) label Jun 18, 2026
Comment thread src/clients/FtpClient.cc
void
Ftp::Client::writeCommand(const char *buf)
{
// The caller must supply a non-empty command followed by CRLF.

@yadij yadij Jun 19, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, the check being added is not validating "non-empty", it is validating length. The string may still be whitespace characters, which make non-zero length empty.

(I am not certain this is more than a theoretical case, thus the non-blocker review)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, the check being added is not validating "non-empty", it is validating length. The string may still be whitespace characters, which make non-zero length empty.

// The caller must supply a non-empty command followed by CRLF.
Assure(bufLen > 2);

This source code comment uses the words "non-empty" to mean "has at least one character". IMO, that is a reasonable usage. "Has at least one non-space character" could also be a reasonable definition in this context, but that is not the definition being used here. The actual assertion code (quoted above) clearly distinguishes the two cases, so I do not think there is an ambiguity problem here.

A command consisting entirely of space characters would be invalid and should not be generated, but our new assertions do not validate that invariant (because it is unrelated to this PR scope and correctly detecting excessive spaces in the assembled command may not be trivial and could be controversial). I do not recommend adding such an assertion in this PR.

Removing "non-empty" from this C++ comment would make this code worse because many readers would suspect an off-by one error without the comment -- the assertions further below only require a bufLen >= 2 condition. Using that weaker condition instead is not an improvement either -- Squid should not generate a \r\n command, of course.

If you can suggest a better wording for this comment, please do so.

squid-anubis pushed a commit that referenced this pull request Jun 20, 2026
FTP command syntax prohibits bare CRs and LFs except for command
termination. FTP servers treat embedded CRs inconsistently. FTP does not
have a generally accepted escape mechanism for safely encoding those
message-framing characters. Thus, Squid must not send FTP commands with
parameters containing embedded CRs or LFs.

Squid assembles FTP commands from a variety of information sources that
depend, in part, on whether the master transaction originated on an
`http(s)_port` or `ftp_port`. One can check for (and reject) embedded
CRs and LFs in many code locations. Our attempts to reject invalid
inputs earlier, before the assembly, resulted in significant
functionality changes going beyond this fix scope but still missed an
important injection point.

The approach used here covers all known code paths, has a decent chance
of remaining effective during significant code refactoring, and does not
alter Squid functionality beyond killing transactions that resulted in
problematic FTP command parameters (an in-scope change). This approach
also allows request adaptation services to "fix" problematic requests in
some cases.

Rejected transactions usually result in HTTP 502 ERR_FTP_FAILURE
responses logged as TCP_MISS_ABORTED. More work is needed to provide
better %err_detail for these cases.

This is a Measurement Factory project.
@squid-anubis squid-anubis added M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-passed-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels and removed M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels labels Jun 20, 2026
@rousskov rousskov added M-cleared-for-merge https://github.com/measurement-factory/anubis#pull-request-labels and removed S-could-use-an-approval An approval may speed this PR merger (but is not required) labels Jun 20, 2026
@squid-anubis squid-anubis added M-merged https://github.com/measurement-factory/anubis#pull-request-labels and removed M-passed-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-cleared-for-merge https://github.com/measurement-factory/anubis#pull-request-labels labels Jun 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

M-merged https://github.com/measurement-factory/anubis#pull-request-labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants