RFC Compliance Is Not an Implementation Strategy

Using Caddy’s malformed Host header discussion as a case study, this piece argues that RFC compliance alone does not settle implementation behaviour, and that real engineering decisions still require judgement about boundaries, defaults, compatibility and failure handling.

I revised this piece to strengthen the opening thesis, improve readability and better account for protocol-version and translation-boundary details, particularly HTTP/2 and HTTP/3 authority handling. The argument remains the same, but it should now be easier to skim and better grounded in how these decisions become more complex once :authority, intermediary behaviour and defaults enter the picture.

Standards can tell you that input is invalid. They cannot, by themselves, tell you what a server should do next. That is the core point of this piece, and Caddy’s malformed Host discussion is a good case study for it. At first glance, the issue looks simple: some Host values are malformed, the RFCs appear to say they should be rejected and the implementation decision seems obvious. But once you move past the syntax, the real question is no longer whether the input is invalid. It is whether stricter handling should be default behaviour, opt-in strict mode or part of a broader malformed-request model, and how those failures should surface in practice.

That is why “RFC compliance” is not enough on its own. It is necessary, but it is not a strategy. A server still has to decide where validation belongs, which defaults it wants to expose, how much compatibility risk it is willing to accept and what happens when requests cross protocol and intermediary boundaries.

The Caddy issue chain makes that clear. In issue #7459, malformed Host values such as [], [12345] and [123g::1] were accepted and, in the demonstrated case, the request still returned 200 OK instead of 400 Bad Request. PR #7551 then proposed stricter validation aligned with RFC 3986 so malformed values would be rejected early. Later I raised issue #7628 to reframe the discussion properly: not just whether the input is invalid, but whether stricter Host handling should be default behaviour, opt-in strict mode or part of a broader malformed-request model in Caddy.

That shift in framing is the point. Spotting malformed syntax is the easy part. Deciding what a server should do with it is harder.

The reason it is harder is that the RFCs involved are doing different jobs. RFC 3986 defines generic URI syntax. RFC 9110 defines HTTP semantics, including the role of Host and :authority in request handling. RFC 9112 defines HTTP/1.1 messaging rules, including when a server must reject a request with 400 Bad Request. RFC 9111 defines HTTP caching, where the target URI, and therefore the authority, feeds into the cache key. RFC 9113 changes the authority story again for HTTP/2, and RFC 9114 continues that pattern for HTTP/3. These documents fit together, but they are not interchangeable. If they are flattened into one line of reasoning, the result may sound rigorous while skipping the actual engineering work.

The Concrete Bug Is Easy, the Decision Is Not

The initial bug report is not controversial. Caddy accepted malformed Host values that appear invalid under the URI and HTTP grammar and served the request normally. That is enough to establish a real behavioural question. This is not a speculative standards debate. There is an actual server doing something surprising at the edge.

From there, the obvious response is the one most engineers would reach for first. Validate the field earlier, reject malformed values, return 400 and stop the bad input before it reaches later layers. That is what makes PR #7551 attractive. It is narrow, testable and easy to defend in terms of direct correctness. As a patch, it makes immediate sense.

But narrow correctness is not the same thing as a complete implementation decision. The moment you ask whether this should be default behaviour in Caddy, whether it belongs in core, whether similar malformed-input rules should be handled consistently elsewhere and how failures should surface to users, you have left the comfort of a single patch and entered system design. That is why issue #7628 is the most important part of the chain. It turns a bug and a patch into a question about responsibility and boundary.

What RFC 3986 Actually Tells You

RFC 3986 is where the syntax argument usually begins. It defines the host production as IP-literal / IPv4address / reg-name and makes clear that square brackets are used for IP-literals. It also says this is the only place where square brackets are allowed in URI syntax. So if someone sends Host: [], or another malformed bracketed value that cannot be an IP-literal, it is entirely fair to say the value is invalid at the URI grammar layer.
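As a rough illustration of that grammar boundary, here is a minimal Go sketch that checks only the bracketed case. The function name validBracketedHost is hypothetical and this is not Caddy's code. It assumes the port has already been split off, treats only plain IPv6 addresses as acceptable inside brackets, and deliberately ignores the IPvFuture form that RFC 3986 also allows.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// validBracketedHost reports whether a host that uses square brackets holds
// a plain IPv6 address between them. Hosts without brackets return true
// because reg-name and IPv4address checks are out of scope for this sketch.
func validBracketedHost(host string) bool {
	if !strings.ContainsAny(host, "[]") {
		return true
	}
	if len(host) < 2 || host[0] != '[' || host[len(host)-1] != ']' {
		return false
	}
	addr, err := netip.ParseAddr(host[1 : len(host)-1])
	// Assumption: reject zone identifiers and bracketed IPv4; IPvFuture
	// literals, which RFC 3986 also permits, are not handled here.
	return err == nil && addr.Is6() && addr.Zone() == ""
}

func main() {
	for _, h := range []string{"[]", "[12345]", "[123g::1]", "[::1]", "example.com"} {
		fmt.Printf("%-12s valid=%v\n", h, validBracketedHost(h))
	}
}
```

The check itself is a few lines. Nothing in it says where it should run, whether it should be on by default or what the caller should do with a false result.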

That part is not the problem. The problem is what people do next. They move from “this value is invalid under the grammar” to “therefore the implementation choice is obvious”. That second step is not contained in RFC 3986. The RFC gives you a syntax boundary. It does not tell you where a production server should validate, whether strictness should be configurable, what compatibility cost is acceptable or how the failure should be surfaced to users and operators.

This is the first place where RFCs are easy to misread. Not because people read them too literally, but because they read them too narrowly. They pull a syntax rule out of its layer and quietly promote it into a complete behavioural policy. RFC 3986 can tell you that [] is malformed. It cannot, by itself, tell you what Caddy should do by default, where that validation should live or how that decision should fit into the product as a whole.

What RFC 9110 Adds

RFC 9110 makes the issue more serious because it explains why the Host field matters. It defines Host = uri-host [ ":" port ] and treats host and port information as critical to handling a request. It also notes that authority information is often used as an application-level routing mechanism and warns that it is a common target for cache poisoning or misdirection if handled carelessly.

That shifts the discussion from mere grammar to actual request handling. A malformed or misleading Host is not just an ugly string. It can affect routing, origin interpretation and intermediary behaviour.

Still, RFC 9110 does not finish the job either. It tells you why the field is important. It does not tell you exactly how a server like Caddy should organise its validation model. It does not tell you whether this behaviour belongs in default core handling, in an opt-in strict mode or as part of a broader configurable policy. It does not tell you how the error should surface. It does not tell you how similar malformed-input cases should be handled elsewhere in the stack.

RFC 9110 also strengthens the defaults argument more than it may seem at first glance. It explicitly says implementers need to consider the privacy and security implications of their design and coding decisions, especially the default configuration they provide to operators. That matters here because even if one accepts that malformed authority should often be rejected, the standards still do not settle whether stricter handling should be default Caddy behaviour, opt-in strict mode or part of a broader configurable validation model.

Importance does not settle the underlying architecture questions.

RFC 9112 Is Stronger Than the Easy Summary, and More Nuanced Too

If someone wants the strongest standards argument for rejecting malformed Host values, RFC 9112 is the best place to go. It says a server must respond with 400 Bad Request to an HTTP/1.1 request that lacks a Host header field, contains more than one Host header field line or contains a Host header field with an invalid field value. That is not vague, and any serious implementation discussion has to reckon with it.
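The three conditions in that rule can be written down almost directly. The Go sketch below is an assumption-laden illustration, not Caddy's implementation: hostStatus is a hypothetical name, and what counts as an "invalid field value" is left to a caller-supplied isValid function, because that is exactly the part under discussion.

```go
package main

import (
	"fmt"
	"net/http"
)

// hostStatus returns the status code suggested by the RFC 9112 Host rule for
// the Host field lines of one HTTP/1.1 request. A result of 0 means the rule
// does not object and processing can continue.
func hostStatus(hostLines []string, isValid func(string) bool) int {
	switch {
	case len(hostLines) == 0:
		return http.StatusBadRequest // Host is missing
	case len(hostLines) > 1:
		return http.StatusBadRequest // more than one Host field line
	case !isValid(hostLines[0]):
		return http.StatusBadRequest // invalid field value
	}
	return 0
}

func main() {
	// Placeholder validator for the demo only: it rejects just "[]".
	notEmptyBrackets := func(v string) bool { return v != "[]" }

	fmt.Println(hostStatus(nil, notEmptyBrackets))
	fmt.Println(hostStatus([]string{"a.example", "b.example"}, notEmptyBrackets))
	fmt.Println(hostStatus([]string{"[]"}, notEmptyBrackets))
	fmt.Println(hostStatus([]string{"example.com"}, notEmptyBrackets))
}
```

Passing the validator in as a parameter is a design choice for the sketch. It keeps the normative rule separate from the question of how strict the validity check should be.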

But this is also where the “just follow the RFC” line becomes too shallow. In the same document, the rules for reconstructing the target URI say that when the request-target is in neither absolute-form nor authority-form, and the server has no fixed authority configured, the authority component comes from the Host field. If the Host field is missing, empty or invalid then the authority component is empty. The document then says that if the URI scheme requires a non-empty authority, as http and https do, the server can reject the request or determine whether a configured default applies that is consistent with the incoming connection’s context. It immediately warns that supplying a default name within a secured connection is inherently unsafe if there is any chance the user agent’s intended authority differs from that default.

That does not undo the earlier 400 requirement. What it does show is that even the HTTP/1.1 messaging document is not treating malformed authority as a one-dimensional parser event. It understands that request reconstruction, context and server behaviour are related questions.

HTTP/2 sharpens the point further. RFC 9113 says the :authority pseudo-header carries the authority component of the target URI, and that a recipient must not use the Host header field to determine the target URI if :authority is present. Clients that generate HTTP/2 requests directly must use :authority when they have authority information to convey and must not send a Host value that differs from it. A server, in turn, should treat a request whose Host identifies a different entity from :authority as malformed. RFC 9113 also notes that these values need to be normalised before comparison, which is another reminder that even the apparently simple question of whether two authority values match is not always a trivial string comparison.
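To make the normalisation point concrete, here is a small Go sketch of one way to compare a Host value with :authority. It is a sketch under stated assumptions: the names normalizeAuthority and sameAuthority are hypothetical, only case folding and default-port removal for http and https are applied, and userinfo and percent-encoding are ignored.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultPorts lists the default port for the schemes this sketch knows about.
var defaultPorts = map[string]string{"http": "80", "https": "443"}

// normalizeAuthority lowercases the host and drops an empty or default port.
func normalizeAuthority(authority, scheme string) string {
	host, port := authority, ""
	// Split off a trailing ":port" without splitting inside an IPv6 literal.
	if i := strings.LastIndex(authority, ":"); i > strings.LastIndex(authority, "]") {
		host, port = authority[:i], authority[i+1:]
	}
	host = strings.ToLower(host)
	if port == "" || port == defaultPorts[scheme] {
		return host
	}
	return host + ":" + port
}

// sameAuthority reports whether two authority values match after normalisation.
func sameAuthority(a, b, scheme string) bool {
	return normalizeAuthority(a, scheme) == normalizeAuthority(b, scheme)
}

func main() {
	fmt.Println(sameAuthority("Example.COM:443", "example.com", "https"))
	fmt.Println(sameAuthority("example.com:8443", "example.com", "https"))
	fmt.Println(sameAuthority("[::1]:80", "[::1]", "http"))
}
```

Even this short version has to make choices about scheme, ports and IPv6 literals, which is the practical meaning of "not always a trivial string comparison".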

HTTP/3 continues the same pattern. RFC 9114 says direct HTTP/3 clients should use :authority instead of Host, and that if both are present they must contain the same value. It also treats requests with invalid pseudo-header values as malformed. That does not change the core argument, but it does make it harder to pretend this is only an HTTP/1.1 Host parsing question.

That matters because it exposes the mistake in both simplistic readings. A loose reading says malformed authority is not a big deal, so the server can just carry on. A rigid reading says RFC 9112 contains a MUST, therefore the entire product decision is finished. Both are shallow. A MUST is necessary, but not sufficient. It constrains the decision. It does not remove the need for one.

Translation Boundaries Matter Too

This becomes even more important once intermediaries enter the picture. RFC 9113 says that an intermediary that needs to generate a Host field, as it might when constructing an HTTP/1.1 request, must use the value from :authority unless it also changes the request target, and that this replaces any existing Host field. RFC 9114 has a related requirement for HTTP/3 to HTTP/1.1 conversion: the intermediary must create a Host field from :authority if one is not present. That matters because once requests cross protocol boundaries, the implementation problem is not merely whether a single header is syntactically invalid. It is also how authority is reconstructed, normalised and safely translated.
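The HTTP/2 to HTTP/1.1 case can be sketched in a few lines of Go. This is an illustration only: toHTTP1Fields is a hypothetical helper, fields are modelled as a plain map with lowercase names rather than any real server's types, and it assumes the intermediary is not changing the request target.

```go
package main

import "fmt"

// toHTTP1Fields copies the fields of an HTTP/2 request for forwarding over
// HTTP/1.1. When :authority is present, any existing host field is dropped
// and a single host field is created from :authority instead.
func toHTTP1Fields(authority string, h2 map[string][]string) map[string][]string {
	out := make(map[string][]string, len(h2)+1)
	for name, values := range h2 {
		if authority != "" && name == "host" {
			continue // replaced below rather than forwarded
		}
		out[name] = append([]string(nil), values...)
	}
	if authority != "" {
		out["host"] = []string{authority}
	}
	return out
}

func main() {
	in := map[string][]string{
		"host":   {"attacker.example"},
		"accept": {"text/html"},
	}
	out := toHTTP1Fields("shop.example", in)
	fmt.Println(out["host"], out["accept"])
}
```

The mismatched host value never reaches the HTTP/1.1 side. Whether the request should also have been rejected before translation is a separate decision.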

That point is especially relevant to a server like Caddy because Caddy is not only parsing requests. It often sits at a routing boundary, a reverse proxy boundary and sometimes a protocol translation boundary. That makes the implementation question larger than a single RFC sentence about a single field.

Why RFC 9111 Still Belongs in the Conversation

RFC 9111 is not the centre of the argument, but it helps explain why this is not just an internal parser concern. The caching specification says that a cache key is composed, at a minimum, from the request method and target URI. It also observes that many caches in common use store only GET responses and therefore use just the URI as the key. Once that is read alongside RFC 9110’s warning that host and port are often used as application-level routing inputs and can be abused for cache poisoning or misdirection, the operational significance becomes clear enough. Authority handling can shape how requests are interpreted, routed and potentially cached downstream.
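A deliberately simplified Go sketch shows how authority ends up inside a cache key. The cacheKey function and its key layout are assumptions made for illustration. Real caches build keys in their own ways and usually take more inputs into account.

```go
package main

import (
	"fmt"
	"strings"
)

// cacheKey builds a key from the request method and the target URI, with the
// target URI assembled from scheme, authority and path-and-query.
func cacheKey(method, scheme, authority, pathAndQuery string) string {
	return method + " " + scheme + "://" + strings.ToLower(authority) + pathAndQuery
}

func main() {
	a := cacheKey("GET", "https", "shop.example", "/cart")
	b := cacheKey("GET", "https", "other.example", "/cart")
	fmt.Println(a)
	fmt.Println(b)
	fmt.Println(a == b)
}
```

Whatever value a component accepts as the authority becomes part of the identity of the stored response, so a disagreement between layers about that value can outlive the request that caused it.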

This does not mean every malformed Host value is a dramatic security event. That would be sloppy in the other direction. What it does mean is that malformed authority handling is not merely a cosmetic syntax issue. Once it passes the edge, its effects can travel further than the parser itself.

Where the RFCs Are Easy to Misread

The easiest mistake here is to treat all of these RFCs as though they are saying the same thing in different words. They are not.

RFC 3986 tells you what a valid generic URI host looks like. RFC 9110 tells you why host and authority matter in HTTP semantics and request handling. RFC 9112 tells you how HTTP/1.1 messaging treats missing or invalid Host, including the strong 400 requirement and the reconstruction context around it. RFC 9113 and RFC 9114 tell you that in HTTP/2 and HTTP/3, authority handling is centred on :authority, not just Host, and that mismatches and malformed pseudo-headers have their own consequences. RFC 9111 tells you why target URI and authority matter to caches and intermediaries.

If you collapse those into a single sentence such as “RFC 3986 says the value is invalid, therefore Caddy should reject it by default”, you are skipping the distinction between syntax, semantics, messaging and operational consequences. You may still land on the right answer in the end, but you will have got there with weaker reasoning than the problem deserves.

That is what I mean by misreading the RFCs. Not that the standards are wrong and not that normative language can be ignored. The mistake is to read one layer as though it silently decides all the others.

The Real Caddy Question

This is why Caddy’s own issue chain is more interesting than the bug alone.

Issue #7459 proves the behaviour is real. PR #7551 proves a narrow fix is feasible.

The older strict-mode discussion in PR #4841 shows that Caddy has already had to think about whether malformed or suspicious request handling belongs in default core behaviour at all. The alternative is that some of it remains optional, because stricter edge validation can break legitimate traffic or push the product in a direction it does not want as a default. That is not a minor detail. It is precedent.

Issue #7628 then asks the right questions for the underlying issue. Should stricter Host validation be default behaviour? Should it be opt-in strict mode? Should it be treated as a one-off parser fix or as one instance of a broader malformed-request strategy? How should lower-level request validation failures be surfaced? What user benefit exists beyond saying the input is invalid under the relevant RFCs?

That is a much better level of reasoning than “the RFC says so”. It recognises that the actual decision is not whether the syntax is malformed. That part is comparatively easy. The actual decision is what sort of server Caddy wants to be at this boundary.

My View

I think rejecting obviously malformed Host values is directionally reasonable. A server accepting broken authority information and returning 200 OK is hard to defend. There is real value in rejecting malformed authority earlier rather than letting ambiguity leak into later layers. RFC 9110 makes clear that host and port are critical to request handling and RFC 9112 is not shy about invalid Host in HTTP/1.1.

What I do not think is that “RFC compliance requires it, therefore the implementation decision is obvious” is the strongest argument. That claim is too thin. It treats standards compliance as though it were a substitute for architecture.

The stronger argument is that malformed Host handling should be judged as part of a coherent request-validation model. That means deciding where validation belongs, whether strictness should be default or configurable, how failures should be surfaced and how similar malformed-input cases should be treated elsewhere in the server. A narrow patch may turn out to be correct, but it should be evaluated against that broader model rather than treated as self-justifying because the syntax rule exists.
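One way to picture such a model is as an explicit policy rather than a hard-coded check. The Go sketch below is entirely hypothetical: Mode, Outcome and decide are invented names, they do not correspond to any existing Caddy option, and the three modes are just one possible split between default, observable and strict behaviour.

```go
package main

import (
	"fmt"
	"net/http"
)

// Mode is a hypothetical setting for how malformed Host values are handled.
type Mode int

const (
	Lenient Mode = iota // process the request as before
	Report              // process it, but record the problem for operators
	Strict              // refuse the request with 400 Bad Request
)

// Outcome describes what the server does with one request.
type Outcome struct {
	Status int  // 0 means "continue processing"
	Logged bool // whether the malformed value is surfaced in logs
}

// decide maps a policy mode and a validation result to an outcome.
func decide(mode Mode, malformed bool) Outcome {
	if !malformed {
		return Outcome{}
	}
	switch mode {
	case Strict:
		return Outcome{Status: http.StatusBadRequest, Logged: true}
	case Report:
		return Outcome{Logged: true}
	default:
		return Outcome{}
	}
}

func main() {
	for _, m := range []Mode{Lenient, Report, Strict} {
		fmt.Printf("mode=%d outcome=%+v\n", m, decide(m, true))
	}
}
```

The RFCs constrain which of these outcomes is defensible for a given protocol version. Choosing the default mode, and how the logged case reaches operators, remains a product decision.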

The Broader Lesson

The lesson here is not that RFCs are optional. Quite the opposite. If you are going to rely on standards, you need to read them at the right level and in the right order.

URI grammar is not the same thing as HTTP semantics. HTTP semantics are not the same thing as HTTP/1.1 messaging. HTTP/1.1 messaging is not the same thing as HTTP/2 or HTTP/3 authority handling. Protocol handling is not the same thing as product policy. Product policy is not the same thing as a parser patch.

Good engineering starts where shallow standards arguments stop. It asks what the text requires, what the adjacent layers imply, what the system is actually responsible for and what trade-offs the product is willing to accept. Standards constrain that work. They do not remove the need for it.

In that sense, RFC compliance is necessary. It is just not an implementation strategy.