Archive for May, 2009

500,000 Users and scaling

29 May

Grooveshark surpassed the 500,000 registered user mark today.
Ignoring the fact that many of our users never bother to register (it’s not necessary in order to use the site), 500k is an absolutely phenomenal number, especially compared to where we were just a year ago: 33k. The scary thing is that at our current growth rate, we will have over a million registered users in roughly 3 months.
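As a quick sanity check on that "roughly 3 months" figure, here is a minimal sketch that derives a doubling time from the two numbers above. It assumes growth was smoothly exponential over the year, which is a simplification; the function name is just for illustration.

```python
import math

def doubling_time_months(start_users, end_users, months_elapsed):
    """Doubling time implied by two user counts, assuming steady exponential growth."""
    growth_factor = end_users / start_users
    return months_elapsed * math.log(2) / math.log(growth_factor)

# 33k registered users a year ago, 500k today
months = doubling_time_months(33_000, 500_000, 12)
print(f"Implied doubling time: {months:.1f} months")
```

Under that assumption the implied doubling time comes out at a little over 3 months, which lines up with the estimate of a million registered users in roughly 3 months.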

Can we double our capacity in just 3 months? History suggests that it’s possible; we’ve already done much more than that. In fact, with little change in infrastructure and much of the same server capacity, we’ve managed to make Grooveshark faster while scaling at the same time.

On the other hand, much of the low-hanging scalability fruit has been picked now. We use memcached extensively, use a master/slave DB configuration with a data warehouse for logging or writes that don’t need to be processed in real time, and have begun doing some rudimentary sharding for stream-related activities.
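To make those pieces concrete, here is a rough sketch of the general patterns, not our actual code. Plain dicts stand in for memcached, the master and a read slave, and every name here (read, write, shard_for, the key format) is made up for illustration.

```python
import hashlib

cache = {}                       # stand-in for a memcached client
master = {"song:1": "Title A"}   # stand-in for the master DB
replica = dict(master)           # stand-in for a read slave

def read(key):
    """Cache-aside read: try the cache first, then fall back to the slave."""
    if key in cache:
        return cache[key]
    value = replica.get(key)
    if value is not None:
        cache[key] = value       # populate the cache for the next reader
    return value

def write(key, value):
    """Writes go to the master, and the stale cached copy is dropped."""
    master[key] = value
    replica[key] = value         # real replication is asynchronous; simplified here
    cache.pop(key, None)

def shard_for(user_id, num_shards=4):
    """Pick a shard by hashing the key (simple modulo scheme, hard to resize later)."""
    digest = hashlib.md5(str(user_id).encode("ascii")).hexdigest()
    return int(digest, 16) % num_shards
```

The modulo scheme in shard_for is the simplest possible choice; changing the number of shards moves most keys, which is one reason sharding is harder to retrofit than caching.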

What’s left? Well, we aren’t yet at the point where we can scale linearly simply by adding more servers, except probably for streaming servers. For that we need more sharding, primarily. There are still some SQL optimizations that can be made, like bringing session ids down to 16 bytes from 32 (32 on disk and 96 in memory, thanks to utf8) and ultimately getting them out of the database altogether. We can also use memcached even more heavily. But really, all of those things only buy us time. Not that there is anything wrong with buying time: we also need time to work on new features like last.fm scrobbling, a super-secret redesign, launching on half a dozen mobile platforms, etc., all with a relatively small dev team. Ultimately, though, there are some fundamental architecture changes coming, and if we’re going to keep doubling our number of users every 3 months, they are going to have to come very soon.
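For the curious, the session id numbers work out something like this. The sketch assumes the ids are 32-character hex strings (an assumption for illustration) and relies on MySQL's utf8 charset reserving up to 3 bytes per character, which is where 96 comes from when a fixed-width in-memory row is used. The helper names are hypothetical.

```python
MYSQL_UTF8_MAX_BYTES_PER_CHAR = 3  # MySQL's "utf8" charset stores up to 3 bytes per character

def to_binary(hex_session_id):
    """Pack a 32-character hex session id into 16 raw bytes."""
    return bytes.fromhex(hex_session_id)

def to_hex(raw_session_id):
    """Unpack 16 raw bytes back into the 32-character hex form."""
    return raw_session_id.hex()

def in_memory_bytes(char_length):
    """Worst-case bytes reserved for a fixed-width utf8 column of this length."""
    return char_length * MYSQL_UTF8_MAX_BYTES_PER_CHAR

hex_id = "00112233445566778899aabbccddeeff"
raw = to_binary(hex_id)
print(len(hex_id), len(raw), in_memory_bytes(len(hex_id)))
```

Storing the packed form in a binary column avoids the character-set overhead entirely, at the cost of ids that are less readable when poking around in the database by hand.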

Update: A lot of people have been looking at this post as evidence that we are working on scrobbling support. I should point out that scrobbling support now exists for VIP users: Enable it here.

 
 

Microsoft + SeeqPod?

09 May

There’s a rumor floating around that Microsoft has bought SeeqPod, mainly fueled, it seems, by the fact that they have a link to Microsoft live search on their home page.

I may regret saying this, but I think that link is a red herring. Microsoft is the last company I would expect to have an interest in SeeqPod, unless their search technology is incredibly impressive and Microsoft intends to apply it to other forms of search. A possibility, but it seems pretty slim. Besides the bad fit in terms of corporate culture, SeeqPod would probably be under an NDA and would most likely be in big trouble for leaking that sort of information early.

If Microsoft is buying SeeqPod for their search technology, don’t expect to see the free streaming service re-launched after the acquisition, at least not until Microsoft has signed deals with the majors, which as we know is a lengthy and expensive process. Of course Microsoft can afford it, but can they profit from it?

In the meantime, Grooveshark is still running, still growing, and we have an API as well, for all those developers left out in the cold after SeeqPod shut down.

 
 

Grooveshark has outgrown Gainesville

04 May

We have just about completed moving all of our servers from Gainesville to Colo5 in Jacksonville. Why did we have to move? Bandwidth! We simply could not get a fat enough tube to handle all of your music streaming demands into Gainesville. Why Jacksonville? Because it’s cheap, and close! We were considering moving to a similar facility in Atlanta, but Colo5 came in cheaper, while offering the same sort of bandwidth that we could expect to get in Atlanta. How much bandwidth? We should have room to grow up to about 20Gbps before we will have to consider expanding to other facilities. We’re going to need a lot more servers before that can happen.

Although we now have plenty of bandwidth for the near to mid-term future, there are still plenty of other bottlenecks that are starting to pinch us, so don’t be surprised if playback is still laggy sometimes or resorts to buffering. The next big improvement we need to make is the “bandwidth” we get from our disks. No point having 20Gbps of headroom in the tube if we can’t actually get that much off of our servers. We have several strategies we are applying to this end, and I may post more about them later if I have time.
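To put the disk problem in perspective, here is a back-of-the-envelope sketch. The per-disk throughput used below is purely a made-up illustrative figure, not a measurement of our hardware, and the function names are hypothetical.

```python
import math

def gbps_to_mb_per_s(gbps):
    """Convert network gigabits per second to megabytes per second (decimal units)."""
    return gbps * 1000 / 8

def disks_needed(link_gbps, disk_mb_per_s):
    """Minimum number of disks whose combined read rate can fill the link."""
    return math.ceil(gbps_to_mb_per_s(link_gbps) / disk_mb_per_s)

# 20 Gbps of headroom, with an assumed 50 MB/s of effective read throughput per disk
print(gbps_to_mb_per_s(20), "MB/s to fill the pipe")
print(disks_needed(20, 50), "disks at the assumed per-disk rate")
```

The point is only the shape of the calculation: whatever the real per-disk number is under a streaming workload, the aggregate has to reach about 2,500 MB/s before the network becomes the limit again.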

The transition from Gainesville to Jacksonville was, of course, not as smooth as we had hoped. Everything was going along swimmingly until we went to install our crappy “downtime server” in Gainesville to let users know that we were down and why, while allowing us to take the real web servers with us. The server and router simply would not acknowledge each other’s presence. Our laptops could connect directly to either one just fine, but both thought they had no connection when interfacing directly. The solution? A crappy old Linksys hub, which both the router and server were able to see.

Loading up all of our servers containing all of our data into a UHaul was more than a little scary, but we packed everything very carefully using lots of cardboard, furniture blankets and rope. Not a single server was harmed in the moving process!

Ben got a lot of pictures of the whole event, if you’d like to see more. Big thanks to Ed, Skyler, Joe, Colin and Nate for all working so hard to make the transition as quick and painless as possible, and thanks to Paloma for driving us home and letting us sleep on the way back; we were definitely in no shape to drive after all that.