Show Me the Code

A while ago, Mark Pilgrim wrote about being prompted with a license agreement that looked like this.

Adobe Reader 8 license agreement showing HTML code.

If, like most people, you have trouble parsing the agreement, that’s because it’s not the text of the license agreement that’s being shown but the “marked up” XHTML code. Of course, users are only supposed to see the processed output of the code and not the code itself. Something went wrong here and Mark was shown everything. The result is useless.
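The gap between the markup and what a reader is meant to see can be sketched in a few lines of Python. This is only an illustration: the license sentence below is made up, and the `TextOnly` class and `render_text` helper are hypothetical names, not anything taken from Adobe's software.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect only the text between tags and discard the markup itself."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called by the parser for every run of text outside a tag.
        self.chunks.append(data)


def render_text(markup):
    """Return roughly what a reader is meant to see: the text, no tags."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    # Join the pieces and normalize runs of whitespace.
    return " ".join("".join(parser.chunks).split())


# Invented example sentence, not the real license text.
markup = "<p>You may <b>not</b> copy this software.</p>"

print(markup)               # what Mark was shown: the raw markup
print(render_text(markup))  # what he should have seen: the text alone
```

When the rendering step is skipped, the first line is all the user gets.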

Conceptually, computer science can be boiled down to a process of abstraction. In an introductory undergraduate computer science course, students are first taught syntax or the mechanics of writing code that computers can understand. After that, they are taught abstraction. They’ll continue to be taught abstraction, in one way or another, until they graduate. In this sense, programming is just a process of taking complex tasks and then hiding — abstracting — that complexity behind a simplified set of interfaces. Then, programmers build increasingly complex tools on top of these interfaces and the whole cycle repeats. Through this process of abstracting abstractions, programmers build up systems of almost unfathomable complexity. The work of any individual programmer becomes like a tiny cog in a massive, intricate machine.

Mark’s error is interesting because it shows a ruptured black box: an acute failure of abstraction. Of course, many errors, like the dialog shown below, show us very little about the software we’re using.

Unknown Error dialog

With errors like Mark’s, however, users are quite literally presented with a view of parts of the system that the programmer was trying to hide.

Here’s another photo I’ve been showing in my talks. It shows a crashed ATM displaying bits of the source code of the application running on the ATM: a bit of unintentional “open sourcing.”

Crashed ATM displaying source code

These examples are embarrassing for authors of the software that caused them but are reasonably harmless. Sometimes, however, the window we get into a broken black box can be shocking.

In talks, I’ve mentioned a configuration error on Facebook that resulted in the accidental publication of the Facebook source code. Apparently, people looking at the code found little pieces like these (comments, written by Facebook’s authors, are bolded):

$monitor = array( '42107457' => 1, '9359890' => 1);
// Put baddies (hotties?) in here

/* Monitoring these people's profile viewage.
Stored in central db on profile_views.
Helpful for law enforcement to monitor stalkers and stalkees.
*/

The first block describes a list of “baddies” and “hotties” represented by user ID numbers that Facebook’s authors have singled out for monitoring. The second stanza should be self-explanatory.

Facebook has since taken steps to avoid future errors like this. As a result, we’re much less likely to get further views into their code. We have every reason to believe that this code, or other code like it, still runs on Facebook. But as long as Facebook’s black box works better than it has in the past, we may never again know exactly what’s going on.

Like Facebook’s authors, many technologists don’t want us knowing what our technology is doing. Sometimes, as in Facebook’s case, they have good reason: the technology we use is doing things that we would be shocked and unhappy to hear about. Errors like these provide a view into some of what we might be missing and reasons to be discomforted by the fact that technologists work so hard to keep us in the dark.

Google Miscalculator

This post on a search engine blog pointed out a series of very strange and incorrect search results returned by Google’s search engine. The search engine is a very complicated “black box,” and many of the errors described reveal some aspect of Google’s search technology.

My favorite was this error from Google Calculator:

Error showing 1.14 as a result for eight days a week

The error, which has been fixed, occurred when users searched for the phrase “eight days a week,” the name of a Beatles song, a film, and a sitcom.

Google Calculator is a feature of Google’s search engine that looks at search strings and, if it thinks you are asking a math question or requesting a unit conversion, gives you the answer. You can, for example, search for 5000 times 23 or 10 furlongs per fortnight in kph or 30 miles per gallon in inverse square millimeters, and Google Calculator will give you the right answers. While it would be obvious to any human that “eight days a week” was a figure of speech, Google thought it was a math problem! It happily converted 1 week to 7 days and then divided 8 by 7: roughly 1.14.
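The arithmetic behind that answer is easy to reproduce. The sketch below is not Google's code (which has never been published); it is a deliberately naive reading of the phrase as "quantity per quantity," with a hypothetical lookup table and a hypothetical function name `naive_ratio`, just to show how a literal-minded unit converter could arrive at the same figure.

```python
from fractions import Fraction

# Length of each unit in days. A made-up table that covers only this example.
DAYS_PER_UNIT = {
    "day": Fraction(1),
    "days": Fraction(1),
    "week": Fraction(7),
    "weeks": Fraction(7),
}

# Words the sketch is willing to read as numbers. Treating "a" as 1 is the
# literal-minded step that turns a figure of speech into a division.
NUMBER_WORDS = {"a": 1, "one": 1, "eight": 8}


def naive_ratio(phrase):
    """Read '<number> <unit> <number> <unit>' as one quantity divided by another."""
    count_word, unit_top, per_word, unit_bottom = phrase.lower().split()
    top = NUMBER_WORDS[count_word] * DAYS_PER_UNIT[unit_top]
    bottom = NUMBER_WORDS[per_word] * DAYS_PER_UNIT[unit_bottom]
    return top / bottom


result = naive_ratio("eight days a week")
print(result)                   # the exact ratio, 8/7
print(f"{float(result):.7f}")   # the same ratio as a decimal
```

Nothing in the function asks whether the phrase is a song title; it sees a number, a unit, another number, and another unit, and divides.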

Clearly, the error reveals the absence of human judgment, but we knew that about Google’s search engine already. More intriguing is what this, combined with a series of other Google Calculator errors, might reveal about Google’s black box software.

When Google launched its Calculator feature, it reminded me of GNU Units, a piece of free/open source software written by volunteers and distributed with an expectation that those who modify it will share with the community. After playing with Google Calculator for a little while, I tried a few “bugs” that had always bothered me in Units. In particular, I tried converting between Fahrenheit and Celsius. Units converts a number of degrees as an interval (for example, a change in temperature). It does not take into account the fact that the two scales have different zero points, so it often gives people an unexpected (and apparently incorrect) answer. Sure enough, Google Calculator had the same bug.
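The difference between the two kinds of conversion can be made concrete with a small sketch. The function names are made up for illustration, and neither function is taken from Units or from Google; they only contrast scaling a temperature difference with converting a thermometer reading.

```python
def fahrenheit_interval_to_celsius(delta_f):
    """Convert a temperature *difference*: only the scale factor applies."""
    return delta_f * 5 / 9


def fahrenheit_reading_to_celsius(temp_f):
    """Convert a thermometer *reading*: shift the zero point, then scale."""
    return (temp_f - 32) * 5 / 9


# Water boils at 212 degrees Fahrenheit.
print(fahrenheit_interval_to_celsius(212))  # about 117.8: right for a change of 212 degrees
print(fahrenheit_reading_to_celsius(212))   # 100.0: what most people expect
```

Both answers are correct for the question each function asks. The surprise comes when a tool silently answers the first question for someone who meant the second.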

Now it’s possible that Google implemented their system similarly and ran into similar bugs. But it’s also quite likely that Google just took GNU Units and, without telling anyone, plugged it into their system. Google might look bad for using Units without credit and without assisting the community, but how would anyone ever find out? Google’s Calculator software ran on Google’s private servers!

If Google had released a perfect calculator, nobody would have had any reason to suspect that Google might have borrowed from Units. One expects unit conversion by different pieces of software to be similar, even identical, when it’s working. Identical bugs and idiosyncratic behaviors, however, are much less likely and much more suspicious.

Given the phrase “eight days a week”, Units says “1.1428571.”