
New RPC Server

20 Aug

Among the many backend enhancements coming together for Grooveshark 2.0 is a brand new RPC Server that I wrote. The format it follows is the same as the last incarnation, JSON, but it’s considerably more efficient.

First, a history lesson. Our first stab at RPC (codename tambourine) was SOAP. We chose SOAP because it did what we needed (theoretically), it was XML (so super easy for Flash to use), and it’s also super enterprisey, which appeals to certain people here, myself not included. We used NuSOAP because it supports WSDL generation, which is so annoying we did not ever, ever want to have to do that by hand. Using SOAP turned out to be a huge mistake (anyone familiar with SOAP probably could have told us that). Because it’s XML, it was super easy to break the service by returning data with, say, an unsupported character in it, or by having PHP accidentally output a newline character (say, by following the PEAR standard and having an extra newline at the end of every PHP file). Further, because browsers eat most error codes and don’t pass the codes or the accompanying data on to plugins, we had to hack NuSOAP to violate SOAP standards and return a 200 OK when generating SOAP Faults. On top of all that, and most importantly, it turns out that generating all of that XML is very expensive. So expensive that I could not test code locally on my machine before putting it on a server. So expensive that we actually got a huge performance boost by caching the generated XML in memcache.

After the SOAP fiasco, we wanted to try using JSON, because it’s much more terse and far less expensive to generate than XML, while still being human-readable, which we value for debugging purposes. We looked at official JSON-RPC stuff, but we were spoiled by SOAP in one regard: support for headers. We got used to using headers to pass around status information that didn’t exactly belong to the method calls, and we didn’t want to give that up, so I wrote my own JSON-RPC server (codenamed cowbell) modeled largely after how SOAP worked, but more efficient. My custom JSON server class was only 275 lines long, compared to NuSOAP’s 7,994 lines for just the core file. We gave up the elaborate data type declarations (we just verbally communicated what type everything would be), but beyond that, method registration worked the same way as in SOAP and every other RPC implementation I’ve seen: you need a service file which registers all available methods with the RPC server and then invokes the RPC class, passing through the raw POST data. Although I did not save benchmark information, the speed difference between my custom RPC class and NuSOAP was astronomical, so we left the RPC optimizations at that.
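To make the registration model concrete, here is a minimal sketch in Python. The real server is PHP and its source isn't shown in this post, so everything here is an assumption for illustration: the class name `RegisteringRpcServer`, the envelope fields `header`, `method`, `parameters`, `result`, and `fault`, and the choice to echo the request header back in the response.

```python
import json


class RegisteringRpcServer:
    """Toy registration-style JSON RPC server (hypothetical, not cowbell itself)."""

    def __init__(self):
        self.methods = {}

    def register(self, name, func, param_names):
        # Every available method is registered up front, before any request is read.
        self.methods[name] = (func, param_names)

    def handle(self, raw_post):
        # Always return a well-formed JSON body; faults travel inside it.
        try:
            request = json.loads(raw_post)
            header = request.get("header", {})
            func, param_names = self.methods[request["method"]]
            params = request.get("parameters", {})
            args = [params[name] for name in param_names]
            body = {"header": header, "result": func(*args)}
        except (ValueError, KeyError, TypeError, AttributeError) as exc:
            body = {"header": {}, "fault": {"message": str(exc)}}
        return json.dumps(body)


def build_service():
    # Plays the role of the "service file": register everything, then hand
    # the raw POST data to the server.
    server = RegisteringRpcServer()
    server.register("addNumbers", lambda a, b: a + b, ["a", "b"])
    server.register("echo", lambda text: text, ["text"])
    return server
```

Calling `build_service().handle(raw_post)` once per request mirrors the flow described above. Note that the cost of `build_service()` grows with the number of methods, even though each request only uses one of them.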

I recently came back and did some profiling out of curiosity to see how much overhead our current RPC implementation was creating, and it turned out to be about 15%, plus roughly an extra 1.5MB of memory. I thought about how things were working with the current implementation and realized that it’s a bit silly to register hundreds of methods every time the client makes a request, when you really only need information about the one method being requested. All the other information is superfluous. So in our new JSON-RPC implementation (codenamed more, because it’s still cowbell in a way), which is 162 lines long, there is no ‘registration’ of methods. Instead, the service class that would normally do the registration simply has a collection of “meta methods” which, when called, return information about the methods that make up the service. So the RPC server parses the request, does a little bit of reflection magic to determine which meta method to call, and calls that method to get all the information it needs in order to handle the request. What’s the overhead of this approach? Well, it hardly even registers in the profiling I’ve done. :)
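Here is a rough sketch of the meta-method idea in Python. The real implementation is PHP and isn't shown in this post, so the `meta_` naming convention, the classes `ExampleService` and `LazyRpcServer`, and the envelope fields (`header`, `method`, `parameters`, `result`, `fault`) are all hypothetical. `getattr` stands in for the reflection step.

```python
import json


class ExampleService:
    """Hypothetical service: each RPC method has a companion meta method."""

    def add_numbers(self, a, b):
        return a + b

    def meta_add_numbers(self):
        # Describes only this one method; nothing is registered up front.
        return {"parameters": ["a", "b"]}


class LazyRpcServer:
    def __init__(self, service):
        self.service = service

    def handle(self, raw_post):
        try:
            request = json.loads(raw_post)
            name = request["method"]
            # Reflection step: look up the meta method for this request only.
            meta = getattr(self.service, "meta_" + name, None)
            if not callable(meta):
                raise KeyError("unknown method: " + name)
            info = meta()
            params = request.get("parameters", {})
            args = [params[p] for p in info["parameters"]]
            result = getattr(self.service, name)(*args)
            body = {"header": request.get("header", {}), "result": result}
        except (ValueError, KeyError, TypeError, AttributeError) as exc:
            body = {"header": {}, "fault": {"message": str(exc)}}
        return json.dumps(body)
```

One side effect of this design, at least in the sketch: a method is only callable if a matching meta method exists, so the meta methods double as a whitelist of what the service exposes.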

In summary, Grooveshark 2.0 method calls should be nearly 15% faster on average than Grooveshark 1.0 calls. :)

P.S. On the topic of cowbell, there’s an old easter egg that we added and never got around to telling anyone about. Before Grooveshark 2.0 is lost and it’s forgotten forever, you can get to it by typing about:cowbell into the search bar.

 
 
  1. James Hartig

    August 20, 2009 at 10:37 am

    Haha I love the about:cowbell! I almost forgot about it! Ben showed it to me forever ago. Btw, nice job on the RPC server. I never use XML anymore, since almost all APIs can use JSON nowadays anyway.

     
  2. James Hartig

    August 20, 2009 at 10:41 am

    What did you make the RPC server in? I’m making similar stuff for my site, deVolf, but I’m trying to do everything in PHP ;) So far I have a pretty stable PHP server that forks new connections; I just need to work on loading functions on-demand and not just when the server is started. http://github.com/fastest963/PHP-Simple-Daemon/tree/master

     
  3. Jay

    August 20, 2009 at 2:40 pm

    That’s interesting, I’ll have to look at what you’re doing. We’re using the whole LAMP stack for this, so the RPC server is pure PHP but we don’t have to worry about forking or anything like that…

     
  4. James Hartig

    August 21, 2009 at 8:28 am

    Well, I just do the forking, so I can process many connections at once, normally with PHP (without a lot of hacking) you can only do one connection at once, but with forking, I can do as many as the server can handle.

     
  5. jon

    November 30, 2009 at 9:47 pm

    Ever think about moving from JSON to RTMP? Major difference being you can write the entire back-end in Java.

     
  6. Jay

    March 25, 2010 at 11:47 am

    Sorry for the super delayed reply. I just upgraded WordPress and managing comments is a lot easier now. ;)

    We actually used RTMP a long time ago for streaming (Red5), but we’re able to get significantly better concurrency with our current setup of PHP+lightweight web servers.