I’ll admit I get bitten by the Vary header once every few months. It’s something a lot of CMSs add without much thought, and it has a serious impact on how Varnish caches and serves requests.
For instance, here’s a request I was troubleshooting that showed this hash data in varnishlog:
- VCL_call HASH
- Hash "/images/path/to/file.jpg%00"
- Hash "http%00"
- Hash "www.yoursite.tld%00"
- Hash "/images/path/to/file.jpg.jpg%00"
- Hash "www.yoursite.tld%00"
- VCL_return lookup
- VCL_call MISS
A new request, giving the exact same hashing data, would return a different page from the cache/backend. So why does a request with the same hash return different data?
Let me introduce the Vary header.
In this case, the page I was requesting added the following header:
Vary: Accept-Encoding,User-Agent
This instructs Varnish to keep a separate version of each page for every combination of Accept-Encoding and User-Agent values it sees.
Varying on Accept-Encoding would make sense, since a gzipped and a plain response contain different data, but Varnish already handles that internally. There’s no real point in adding that header for Varnish, but other proxies in between might still benefit from it.
Varying on User-Agent is plain nonsense: why would you serve a different version of a page per browser? A typical User-Agent string contains text like Mozilla/5.0 (Macintosh; Intel Mac OS X...) AppleWebKit/537.xx (KHTML, like Gecko) Chrome/65.x.y.z Safari/xxx, which is practically unique per visitor you have.
So, as a quick hack in this case, I remove the Vary header altogether:
sub vcl_backend_response {
    unset beresp.http.Vary;
    ...
}
No more variations of the cache based on what a random CMS does or says.
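If dropping the header entirely feels too blunt, a gentler variant is to strip only User-Agent from the Vary list and leave everything else (such as Accept-Encoding) for downstream proxies. What follows is a rough sketch rather than something I run in production. It assumes Varnish 4 or later, and the regular expressions are my own guess at what covers the common header layouts, so test it against your own backend responses first.

```vcl
sub vcl_backend_response {
    if (beresp.http.Vary) {
        # Remove the User-Agent token (case-insensitive), together with the
        # comma in front of it. The lookahead keeps longer header names that
        # merely start with "User-Agent" untouched.
        set beresp.http.Vary = regsuball(beresp.http.Vary,
            "(?i)(^|,)\s*User-Agent\s*(?=,|$)", "");

        # If User-Agent was the first entry, a leading comma is left behind.
        set beresp.http.Vary = regsub(beresp.http.Vary, "^\s*,\s*", "");

        # Nothing left to vary on: drop the header completely.
        if (beresp.http.Vary == "") {
            unset beresp.http.Vary;
        }
    }
}
```

With the header from this post, Vary: Accept-Encoding,User-Agent would become Vary: Accept-Encoding, so Varnish stops keeping one cached copy per browser string.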