I hate the idea that this blog might be turning into nothing but a journal of all the things at Grooveshark that have ever broken, but some of the most interesting challenges we face are when things go terribly awry, so I’m not going to avoid talking about it just because it involves something breaking at Grooveshark, again.
What happened was, out of the blue IE8 could no longer run the site. Users were getting a message about making sure they did not have flash block enabled, which means the swf was failing in some way. We determined that the swf was in fact loading, so why was it lying to us? There is one file that swfs need in order to talk to other domains: crossdomain.xml. If that file fails to load, the swf isn’t going to work. I suspected that was happening in this case, so I loaded up http://cowbell.grooveshark.com/crossdomain.xml and IE complained that it wasn’t valid XML. View source showed me that IE was right. It was in fact what looked like binary garbage. Loading the same file in Firefox and Chrome worked perfectly fine, but IE8 on 4 different computers all showed the invalid XML.
Some months ago, we switched from serving pages up directly from Apache, to running Nginx in front of Apache as a reverse proxy with caching. The difference that made on our front end servers in terms of memory usage and CPU load is phenomenal. Although Nginx serves 30% of requests from cache now, the drop in server load was much more than 30%. Nginx is truly a wonderful addition to our http stack…but as you’ve guessed by now, it played a key role in the latest breakage at Grooveshark.
Force clearing the cache in IE8 and in nginx would sometimes fix the file, but not always. I then turned to wget and found the same thing: whenever the file was broken for IE8, it was identical in wget. wget was showing the exact same file size that Firebug was showing, which was the biggest clue: Firefox received the file gzipped because it supports deflate, but wget also received the file gzipped even though it doesn’t support deflate. My theory, which proved correct, was that IE8 was for some reason asking for the non-gzipped version, but receiving the gzipped version and barfing.
Why would that happen? Well, remember that we are using nginx as a reverse proxy cache. It turns out that we just recently added some auto-gzipping for certain file types to apache. What was happening was, nginx would get a request for a file not in its cache, and forward along the request (with all headers intact) to Apache. If this request came from a client that supports deflate, Apache would respond with a gzipped file. Nginx would store that gzipped file in its cache, and the next request that came in asking for that file, with or without deflate support, would get the gzipped version served up.
The fix was relatively simple: add a variable in nginx conf tracking whether or not the current client supports deflate. Append the value of that variable to the proxy key, meaning that gzipped and non-gzipped versions of the files will be cached separately, and served appropriately depending on what the client supports.
What’s not clear to me at this time is why IE8 would refuse to accept gzipped content for that file, and whether that applies to all .xml files in IE8…but at least it helped us catch what would have otherwise been an extremely obscure issue!