Automatic decompression should sanitize `Content-Encoding` and `Content-Length` headers from the response
#1729
Labels: needs implementer interest, needs tests, topic: http
What is the issue with the Fetch Standard?
The `fetch()` spec allows browsers to decompress HTTP responses in `fetch()` if an appropriate `Content-Encoding` header is set on the response. In this case, the `Response.prototype.body` stream no longer reflects the raw bytes (modulo protocol framing) received on the wire, but instead a processed version of the bytes after they have passed through a decompression routine.
This decompression is meant to be transparent to users: they do not have to explicitly opt in or enable it. Further, they cannot even disable it (ref #1524).
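To make the mismatch concrete, here is a minimal sketch. It does not call `fetch()`; it only simulates what a caller would see after automatic decompression, and the header values and the `describesBody` helper are made up for illustration.

```javascript
// Hypothetical stand-in for a fetch() result after automatic decompression:
// the body is already decoded, but the headers still describe the
// gzip-encoded bytes that were on the wire.
const decodedBody = new TextEncoder().encode(
  "hello hello hello hello hello hello hello hello",
);
const headers = new Headers({
  "content-encoding": "gzip",
  "content-length": "29", // made-up size of the compressed wire bytes
});

// Illustrative helper: does Content-Length match the bytes we actually hold?
function describesBody(headers, bodyBytes) {
  const declared = Number(headers.get("content-length"));
  return declared === bodyBytes.byteLength;
}

// The decoded body is 47 bytes long, so the declared length does not match.
const consistent = describesBody(headers, decodedBody);
```

Anything that forwards these headers together with the decoded body (a proxy, for example) would be sending a response whose metadata does not describe its payload.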
Unfortunately, the decompression is currently not very transparent: given an arbitrary response object, it is ambiguous whether the `Response`'s body has been decompressed or is still compressed.
This causes real world problems:
`Content-Length` and `Content-Encoding` headers wintercg/fetch#23)
Proposal
I propose we strip out `Content-Length` (because it represents the content length prior to decompression) and `Content-Encoding` (because it represents the encoding prior to decompression) from the `Response` headers when we perform automatic response body decompression in `fetch()`. I am not suggesting this affects responses created with `new Response()` or responses returned from `fetch()` that do not have automatic response body decompression performed.
Compatibility
I don't think this change will break any existing code. It may skew some folks' monitoring tools. I make this assumption based on the following thoughts:
- `Content-Length` before decompression is meaningless if you only have the decompressed body. With either gzip or br, you cannot infer the length of the decompressed response from the `Content-Length`.
- `Content-Encoding` is not useful in combination with a decompressed body. The only use I can think of is monitoring use cases where you want to determine what percentage of your assets were served with compression (and with which compression).
Prior art
In the JavaScript space:
In other programming languages:
- Go: the `http` std lib module has auto decompression enabled by default. It strips out `Content-Length` and `Content-Encoding` when it performs decompression. It has a flag on the response to determine if auto-decompression has taken place. See https://pkg.go.dev/net/http#Response.Uncompressed
- Rust: the `reqwest` crate supports auto decompression and enables it by default for clients if the `gzip` or `brotli` compile time flags are set. It strips out `Content-Length` and `Content-Encoding` when it performs decompression. It has no flag to check whether decompression has been performed. See https://docs.rs/reqwest/latest/reqwest/struct.ClientBuilder.html#method.gzip
- Python `requests`: does auto decompression by default, and sets `Content-Length` to the post-decompression content length. It does not remove the `Content-Encoding` header.
- Ruby `Net::HTTP`: does auto decompression by default, removing `Content-Encoding` and rewriting `Content-Length` to the length after decompression.