Do not send FTP commands with embedded CRs or LFs#2444
Conversation
The branch-added "XXX" comment is not really an XXX: We are not doing anything wrong here. We are avoiding an arguably expensive assertion. Also be more specific regarding why foundHost cannot have CRs or LFs.
The new variable is used beyond the `if` statement below its definition.
The `if` statement order does not matter match, but this order is faster if both components are problematic and this order helps clarify intended the urlpathDecoded usage scope.
Words like "info", "data", and "state" are popular in legacy Squid code, but they usually do not add anything to the meaning of the named entity and, hence, should be avoided. Highlighting that we are copying something is actually useful.
AnyP::Uri::parse() has a try-catch wrapper around its code, making it feasible to throw on parsing errors inside [the affected part of] that legacy code. In general, throwing on parsing errors simplifies and makes safer parsing code, so we usually want to throw in that context rather than rely on boolean return values. Also use "///" for Doxygen comments because they often consume fewer lines and simplify consistent formatting.
This might be useful in triaging lower-level code and for detecting CharactersSet reuse/duplication.
Also disclosed that we are going "too far" with these checks.
% git grep 'if[(]' '*.cc' | wc -l
21
% git grep 'if [(]' '*.cc' | wc -l
14016
... to better tolerate Ftp::Relay::sendCommand() and similar exceptions.
ERROR: Squid BUG: assurance failed: ...
exception location: FtpClient.cc(835) writeCommand
AsyncJob.cc(145) callException: assurance failed ...
AsyncJob.cc(100) mustStop: Ftp::Relay will stop, reason: exception
...
FATAL: check failed: !request->pinnedConnection()
exception location: FwdState.cc(1119) connectStart
When branch-added Ftp::Client::writeCommand() assertions throw, the
worker dies instead of just killing the affected ftp_port transaction.
`JobDialer::dial()` catches the BUG exception, as expected, but the
worker still dies because `Client::swanSong()` cleanup code calls
`FwdState::handleUnregisteredServerEnd()` which attempts to retry the
failed FTP transaction despite true `request->pinnedConnection()`, which
triggers another, this time uncaught, exception in `connectStart()`.
Also documented suspicions that `FwdState::pinnedCanRetry()` should
_always_ return false and, hence, should be removed (instead of adding
more and more code/conditions when that method returns false).
----
I hope this out-of-scope change can be reverted (and merged separately)
after we add CR protections to command parser and/or replace
writeCommand() assertions with (moved) embedded new line handling code.
FTP command syntax treats CR and LF characters as command terminators.
All RFC 959 sightings of CR and LF either refer to the CRLF terminator
or treat both characters the same way. For example, RFC 959 defines an
FTP command parameter as a sequence of chars, where:
<char> ::= any of the 128 ASCII characters except <CR> and <LF>
Squid should (and now does) restrict CR use to the command terminator
sequence, especially since some FTP servers treat CRs as command
delimiters -- if we continue to allow embedded CRs in FTP command
parameters than our FtpClient::writeCommand() will assert when trying to
forward those commands to the FTP server. Moreover, we already use that
CR treatment when _parsing_ FTP responses (see
Ftp::Client::parseControlReply()).
When it comes to command termination, CRs are still optional.
A new ban on CRs in FTP command parameter values means that Squid starts
treating some FTP commands as syntactically invalid, using
EarlyErrorKind::MalformedCommand for the first time since its inception
in 2014 commit eacfca8. For example, `PWD\rQUIT` input is now invalid.
N.B. This change removes "RFC 959 Section 3.1.1.5.2" reference. That
reference was wrong: RFC Section 3.1 is about "data representations"
used for file transfers rather than for command syntax. I probably
missed this problem when the addition of that reference was requested at
http://www.squid-cache.org/mail-archive/squid-dev/201408/0023.html
This commit effectively reverts branch commit 5b6e34d changes. In that commit, we chose to reject problematic `ftp://` URLs in Uri::parse() rather than wait until they trigger a creation of a problematic FTP command in Ftp::Gateway because detecting (and rejecting) a problematic request ASAP is often the safest option. For example, an allowed request could cause problems in adaptation services or other parts of Squid ecosystem that carelessly converts URLs to FTP commands. However, the above choice comes with significant drawbacks: * It broke previously working cases where problematic `ftp://` URLs did not reach Ftp::Gateway (e.g., where they were rewritten by a REQMOD service or sent to a cache_peer). * Squid started rejecting `ftp://` URL paths with broken percent-encoding. For example, an HTTP GET request for `ftp://example.com/double%%` resulted in an HTTP 400 (Bad Request) ERR_INVALID_URL response instead of leading to Squid sending FTP commands with a different filename (i.e. `RETR double%`). That is a significant change that may break some "working" use cases as well. * Finally, URI-based controls missed another CRLF injection point (now addressed in the previous branch commit 21581bd), proving that they are not sufficient. Squid did not handle the resulting exceptions well enough, necessitating more out of scope changes (see branch commit becd5b4 that this change also reverts). Some of these problems were known in advance, and we hesitated moving the checks into `Uri.cc`. Now, given the number, the weight, and the scope of these drawbacks, we believe that we made the wrong decision.
This reverts branch commit 21581bd because official code can now survive the corresponding CR rejection in Ftp::Client::writeCommand() because that rejection code is no longer throwing. The reverted changes are valuable and should be merged separately.
rousskov
left a comment
There was a problem hiding this comment.
PR description: Squid must not send FTP commands with parameters containing embedded CRs or LFs. ... Squid assembles FTP commands from a variety of information sources ... One can check for (and reject) embedded CRs and LFs in many code locations. Our attempts to reject invalid inputs earlier, before the assembly, resulted in significant functionality changes going beyond this fix scope
While we reverted those earlier changes, some of them are quite valuable, and we plan to polish and submit them as dedicated pull requests after this surgical (and long-awaited) PR is merged.
| if (crlfCharPosition != bufLen-2) { | ||
| const auto invalidCharName = buf[crlfCharPosition] == '\r' ? "CR" : "LF"; | ||
| debugs(9, 2, "ERROR: Caller assembled a malformed FTP command. Found " << invalidCharName << " at position " << crlfCharPosition); | ||
| failed(ERR_FTP_FAILURE, 0); |
There was a problem hiding this comment.
Ideally, we should throw here. We tried, but other/out-of-scope problems in current code forced us to table that idea (for now). FWIW, I am uncomfortable with the proposed failed() call in the middle of transaction processing stack, but it matches how we handle similar problems in FTP code, and it passes our tests. In the absence of specific evidence to the contrary, it is the right step forward here despite my concerns, until other improvements allow us to replace such calls with exceptions.
| void | ||
| Ftp::Client::writeCommand(const char *buf) | ||
| { | ||
| // The caller must supply a non-empty command followed by CRLF. |
There was a problem hiding this comment.
FWIW, the check being added is not validating "non-empty", it is validating length. The string may still be whitespace characters, which make non-zero length empty.
(I am not certain this is more than a theoretical case, thus the non-blocker review)
There was a problem hiding this comment.
FWIW, the check being added is not validating "non-empty", it is validating length. The string may still be whitespace characters, which make non-zero length empty.
// The caller must supply a non-empty command followed by CRLF.
Assure(bufLen > 2);This source code comment uses the words "non-empty" to mean "has at least one character". IMO, that is a reasonable usage. "Has at least one non-space character" could also be a reasonable definition in this context, but that is not the definition being used here. The actual assertion code (quoted above) clearly distinguishes the two cases, so I do not think there is an ambiguity problem here.
A command consisting entirely of space characters would be invalid and should not be generated, but our new assertions do not validate that invariant (because it is unrelated to this PR scope and correctly detecting excessive spaces in the assembled command may not be trivial and could be controversial). I do not recommend adding such an assertion in this PR.
Removing "non-empty" from this C++ comment would make this code worse because many readers would suspect an off-by one error without the comment -- the assertions further below only require a bufLen >= 2 condition. Using that weaker condition instead is not an improvement either -- Squid should not generate a \r\n command, of course.
If you can suggest a better wording for this comment, please do so.
FTP command syntax prohibits bare CRs and LFs except for command termination. FTP servers treat embedded CRs inconsistently. FTP does not have a generally accepted escape mechanism for safely encoding those message-framing characters. Thus, Squid must not send FTP commands with parameters containing embedded CRs or LFs. Squid assembles FTP commands from a variety of information sources that depend, in part, on whether the master transaction originated on an `http(s)_port` or `ftp_port`. One can check for (and reject) embedded CRs and LFs in many code locations. Our attempts to reject invalid inputs earlier, before the assembly, resulted in significant functionality changes going beyond this fix scope but still missed an important injection point. The approach used here covers all known code paths, has a decent chance of remaining effective during significant code refactoring, and does not alter Squid functionality beyond killing transactions that resulted in problematic FTP command parameters (an in-scope change). This approach also allows request adaptation services to "fix" problematic requests in some cases. Rejected transactions usually result in HTTP 502 ERR_FTP_FAILURE responses logged as TCP_MISS_ABORTED. More work is needed to provide better %err_detail for these cases. This is a Measurement Factory project.
FTP command syntax prohibits bare CRs and LFs except for command
termination. FTP servers treat embedded CRs inconsistently. FTP does not
have a generally accepted escape mechanism for safely encoding those
message-framing characters. Thus, Squid must not send FTP commands with
parameters containing embedded CRs or LFs.
Squid assembles FTP commands from a variety of information sources that
depend, in part, on whether the master transaction originated on an
http(s)_portorftp_port. One can check for (and reject) embeddedCRs and LFs in many code locations. Our attempts to reject invalid
inputs earlier, before the assembly, resulted in significant
functionality changes going beyond this fix scope but still missed an
important injection point.
The approach used here covers all known code paths, has a decent chance
of remaining effective during significant code refactoring, and does not
alter Squid functionality beyond killing transactions that resulted in
problematic FTP command parameters (an in-scope change). This approach
also allows request adaptation services to "fix" problematic requests in
some cases.
Rejected transactions usually result in HTTP 502 ERR_FTP_FAILURE
responses logged as TCP_MISS_ABORTED. More work is needed to provide
better %err_detail for these cases.
This is a Measurement Factory project.