MySQL Atomic Rename + FK Constraints + Oops!

Categories: Uncategorized

Here’s a fun issue I ran into a little while ago, while doing DB maintenance at Grooveshark.

MySQL has a handy atomic rename function, where you can do something like this:
RENAME TABLE CurrentTable TO OldTable, NewTable TO CurrentTable
Both renames are done as a single atomic action, so that the table CurrentTable is always there, and Nothing Has To Break*

A few days ago, I took advantage of this feature. I needed to make some schema changes to one table that was being read from but not written to.

CREATE TABLE NewTable LIKE CurrentTable;
ALTER TABLE NewTable DROP COLUMN UnnecessaryColumn;
...
INSERT INTO NewTable SELECT (...) FROM CurrentTable;
RENAME TABLE CurrentTable TO OldTable, NewTable TO CurrentTable;

That way we have a live, selectable backup of the table in case I really screwed something up, not to mention that moving the tables this way meant we could finally get that huge table into file per table, as that table had been around since before we had turned on the file per table setting in MySQL.

Several other tables have foreign key constraints pointing to CurrentTable, and unfortunately MySQL is smart enough to notice that CurrentTable is being renamed, so it “helpfully” updates all the foreign key constraints in those tables so that they are now pointing at OldTable. Without mentioning it. The problem didn’t become apparent until CurrentTable was being written to again, and then the other tables with the foreign keys were being written to pointing to the new data in CurrentTable, except the updated constraints meant MySQL tried to find the rows in OldTable, causing the inserts to fail.

This is definitely one of those cases where being smart and helpful can sure lead to a lot of misery. I had to hunt down every table that had referenced CurrentTable to change them back by hand. If MySQL had been just a little bit smarter it would have seen that CurrentTable wasn’t *really* going away and it wouldn’t have made those changes for me. If it was a little bit dumber, it would have done nothing and things would have just kept working, or it would have blocked me from doing my rename, forcing me to deal with the constraints before they broke anything.

* except when it does

Sorry about the ads

Categories: Uncategorized

Sorry about the new ads on Grooveshark Lite, folks. They certainly don’t make me very happy either, but in these tough economic times, we gotta pay the bills somehow!

The good news is that these ads should free up some resources so we can spend more time making Grooveshark Lite even more awesome.

If you have thoughts on the new ads, Ben’s post on the community page has more information about the ads and a couple of ways to get in touch with us. Feedback is always appreciated.

Soltero

Categories: music

One of my long-standing favorite bands is Soltero, and this song is a great representation of why.

Grooveshark releases our first widget

Categories: Uncategorized

It’s finally here: the Grooveshark widget! To show it off, here’s a collection of songs by The Bird and the Bee; one of my new favorite bands.

Stocks

Categories: life

It’s almost as if Tuesday never happened…

Leaky Abstractions

Categories: Coding grooveshark software engineering

As with nearly every issue in Software Engineering worth thinking about, Joel Spolsky has written an article about leaky abstractions that is very relevant to some problems I ran into tonight.

Abstractions do not really simplify our lives as much as they were meant to. [...] all abstractions leak, and the only way to deal with the leaks competently is to learn about how the abstractions work and what they are abstracting. So the abstractions save us time working, but they don’t save us time learning [...] and all this means that paradoxically, even as we have higher and higher level programming tools with better and better abstractions, becoming a proficient programmer is getting harder and harder

In my case, I was not working with programming tools per se, more of an abstraction of an abstraction built into the Grooveshark framework that is meant to make life as a programmer easier. And normally it does. But in this case, that abstraction was wrapped in a couple more layers of abstraction away from where my code needed the information, and somewhere in there the information was, for lack of a better description, being mangled. The particular form of mangling is actually due to a lower level abstraction at the DB layer, but is not handled in the higher level of abstraction because for most uses it doesn’t matter.

From the level of abstraction where my code was sitting, the information needed in order to un-mangle the data was simply not available. I went through the various layers to find a convenient place to put the un-mangling, but by the time I found a place, everything was so abstract that I couldn’t confidently modify that code and see all potential ramifications, so I just did an end-run around many of the layers of abstraction. This is certainly not ideal, but it works. The point is, an abstraction intended to make life as a programmer slightly easier most of the time, can easily make life as a programmer significantly more difficult on edge cases. Unfortunately, once a code base has been established, most cases are edge cases: feature additions, new products built on top of the old infrastructure, etc., all or most unforeseen, and therefore unaccounted for during the original design process.

Quick searching in Google Chrome

Categories: grooveshark

Google released their Chrome Browser today. I’ll refrain from giving it a full review as many others have already manged to do so.

I want to comment on the ease of adding search engines and some minor tweaks you can do to make the search functionality more powerful, specifically in regards to Grooveshark Lite and TinySong.

To add a search engine to Chrome, just visit the site. If they have properly implemented their search extensions by embedding information about them in the page (and we have), then all you have to do is visit the site and Google will add it to its list of search engines. Each search engine has a “keyword” that you type before your search term to indicate which search engine you wish to use. The default keyword is the domain of the search engine, but you can change it by clicking on the wrench icon, and then Options | Basics | Default Search | Manage

I changed mine so that I can search lite by typing in “listen artistname” so if I want to listen to Soltero I type “listen Soltero.”
Likewise I modified the keyword for TinySong to be ts. So if I want to search for Camera Obscura, I type “ts Camera Obscura”

Another thing about Google Chrome that is really cool as it relates to Grooveshark Lite is the built in support for making any website an application. I installed Grooveshark Lite on my start menu, and now it runs as an independent application with the lite Favicon showing up in the taskbar. Very cool.

When human error isn’t

Categories: design

One of today’s headlines is Human error causes most Predator crashes.

Without reading the article, it should be obvious that this statement is simply not true. If there is a common recurring problem attributed to human error, it is most likely actually a design error.

A caption in that article reads

Research shows that most Predator mishaps are the result of inadequate skills, lack of teamwork and lack of situational awareness.

This could just as easily be rewritten:

Research shows that most Predator mishaps are the result of poor design. The controls as designed to not promote situational awareness, require extensive teamwork and higher than normal levels of skill to operate.

Rather than trying to adapt people to your designs, adapt your designs to the people who will be using them. If there’s a lack of teamwork, don’t require it. If situational awareness is a necessity, make sure your design promotes it, if the people using your design do not possess the necessary skills, don’t require those skills.

I’m not saying that’s easy, but those mishaps cost $1 million to $4 million according to the article, so it’s obviously well worth the effort.

Who will monitor the monitors?

Categories: Coding software engineering

This is the second time that an error in a monitoring/testing tool that we use has caused me to waste a good deal of time trying to solve a non-problem.

The first was when we were using a tool to test the scaling performance of a new feature. According to this tool the performance was absolutely abysmal with only 50 virtual clients, and I spent a couple of days writing alternative algorithms and trying to test them with this tool. The scalability was horrible no matter what I did, so I added some logging to see if something was going on that wasn’t supposed to be. It was. The testing tool was sending completely wrong information, creating completely unrealistic scenarios. The testing implementation was done by another developer who was sure that he had set it up right, and I was not familiar with the tool so I assumed that it was a problem with my code. Having another developer write the tests was supposed to save me time.

The latest monitoring snag was caused by a monitoring tool that, for our convenience, tails the php error log and sends the last couple hundred lines to us whenever there is an error. We kept getting the same error periodically and could not track down the source; everything seemed to be working perfectly, but we still got the error. Usage of the site every day is breaking new records so we assumed that it had something to do with the unprecedented load on servers, but could never pinpoint the source of the errors. Today I looked a bit more closely at the error log only to discover that the log was from 3 days ago!

So the question is, when your testing or monitoring tools report an error, how do you ensure that the error is real and not just in the tool itself?

On being a DJ

Categories: Coding grooveshark life music

When I was in college (oh so long ago…) I was a DJ for our radio station, and then I was a music director. I loved being a DJ: having lots of new, interesting and unreleased music on tap, from Smashing Pumpkins to Underwater Boxer; having a channel to share that music with other people; being able to make a small band’s day by playing their stuff and reporting it to CMJ. Well, there was one part I didn’t care for so much: talking on the radio. I’m a bit shy, which is why although I loved being a DJ and music director at Eckerd College, I knew it wasn’t ever going to be a career path for me.

It’s interesting, then, that I work at Grooveshark where much of that dream is being fulfilled by participating in this movement. The one piece that is missing is having a channel to share music with other people and subsequently helping small bands by making them more discoverable. Well, now with the release of Autoplay in Grooveshark Lite, it’s kind of like I get to be everybody’s DJ. Of course a computer scientist would write a DJing program rather than doing the manual labor of DJing.

As Professor Fishman, the best professor who ever lived, was fond of saying in our classes, a computer scientist isn’t satisfied with just using computers to put other people out of a job, they won’t settle until they manage to put themselves out of a job too. To be fair, he usually talked about that in the context of AI and specifically programming languages such as LISP, where the program can rewrite itself, but I think it applies here as well.

Now I get to be everyone’s DJ, but with everyone’s help too. If the system is currently a bad DJ, keep giving it feedback and it will learn. Imagine if you got to call up your local radio station and yell at them every time they played something you didn’t like, and congratulate them every time they played something you liked. If they didn’t block your phone number, you’d end up with the ultimate radio station for you, and that’s what Grooveshark aims to be, although we admit it will take some time to get there.

Check out Autoplay, and let me know what you think.