Wednesday, February 22, 2006

Planning for failure

I love my TiVo. We have two in our house and they are constantly recording stuff. I never thought I'd see the day when my kids preferred "educational" programs like Junkyard Wars and Mythbusters to cartoons but TiVo did it for us.


Recently, they have been adding new features to our box by way of automatic upgrades. Given that the TiVo really is an "embedded system", this is a bit novel. Now we have the ability to transfer movies to our PC, to a video iPod, to a DVD (by way of the PC). We can pull music from our iTunes libraries on our machines. Last week, we got a new set of features that provide access to the Internet. It allows us to look up movie times, buy movie tickets, see traffic conditions, and get the weather forecast. All of these options require connecting to the Internet but our TiVo is hooked to our wireless network at home so that's no problem.


Now, I don't know if these features are worthwhile. Really, the bottom line is that they save us the 35-foot walk from our couch to one of the computers to look this information up. But weighing the merits of these features isn't the interesting part. When we found out about these new features, we poked through them to see how they worked. Hmm, nice, the weather report seemed accurate. We did look up movie times and descriptions. Then we went to the traffic report and the screen didn't update. I can only assume the TiVo was waiting for something on the network and it wasn't responding. If you or I were looking up this data in a web browser and this happened, we'd mumble something about the site being down, close the browser window, and go on with our lives. Maybe, if it were some evil Flash-laden site and closing the browser didn't happen instantaneously, we might have to get out the task manager and kill the browser process. It's painful but you can still be done in 20 seconds and back on to other things.


Unfortunately, these paradigms don't exist in TiVo world. There's no "cancel" button, no "quit" button, no "task manager". In fact, the TiVo doesn't even have a power button. It's always on, humming along, recording things that it thinks you might like. So what happens when an embedded system designed for unfailing reliability (you have to admit, not having a power button is quite a statement of reliability) meets something unreliable like the Internet? We had to pull the power plug on the TiVo. It took about 6 minutes for the box to come back up, since powering it off is so far from normal use. Fortunately, it wasn't recording anything in the background or that recording would have failed too. But probably the most damaging problem was to our confidence in that feature. Maybe traffic reports are a cool thing, but neither my wife nor I will likely ever click on that button again. The "benefit" of not walking 35 feet has been dramatically outweighed by the "risk" of losing 5-10 minutes of that very favorite show the TiVo might have been recording. The feature is now "too dangerous" to attempt and that's a pity.


So what's the moral here? If I knew the actual failure mode, there would probably be engineering parables about error checking, robustness of coding, or something like that. But to the user, it's really a different story. This was not a coding error; it was a design error. The design fell over when something inherently fallible failed. So, I guess my best advice for those amazing people at TiVo is, "plan for failure". Who says pessimism is bad?
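For the programmers in the audience, here is a minimal sketch of what "plan for failure" might look like in code. I have no idea how the TiVo software is actually written, so everything here is hypothetical: `fetch_with_deadline` and `stalled_traffic_report` are names I made up, and Python is just a convenient way to show the idea. The point is only that the screen never waits on the network for longer than it has decided it is willing to wait.

```python
import queue
import threading
import time


def fetch_with_deadline(fetch, timeout_s, fallback):
    """Run fetch() in a background thread; give up after timeout_s seconds.

    Returns fetch()'s result if it arrives in time. Returns fallback if the
    call is too slow or raises an exception.
    """
    result = queue.Queue(maxsize=1)

    def worker():
        try:
            result.put(("ok", fetch()))
        except Exception as exc:  # any failure means "show the fallback"
            result.put(("error", exc))

    # Daemon thread: a stuck fetch can never keep the program alive.
    threading.Thread(target=worker, daemon=True).start()

    try:
        status, value = result.get(timeout=timeout_s)
    except queue.Empty:
        # Deadline passed. The worker is abandoned, not cancelled.
        return fallback
    return value if status == "ok" else fallback


def stalled_traffic_report():
    """Hypothetical stand-in for a server that is slow to answer."""
    time.sleep(1.0)
    return "I-25: clear"


if __name__ == "__main__":
    # The screen gets an answer in about 0.1 s either way.
    print(fetch_with_deadline(stalled_traffic_report, 0.1, "Traffic unavailable"))
```

Note what this does not do: the stuck request is never cancelled, it is simply left behind while the user gets a "Traffic unavailable" message and moves on. That is a design decision more than a coding trick, which is really the whole moral.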


Technorati tags: TiVo, Embedded Programming, User Interface