Lost in Machine Translation

Posted Mon, 21 Jul 2008

While I've been traveling over the last week or so, loads of people sent me a link to this wonderful image of a sign in China reading "Translate Server Error" which has been written up all over the place. Thanks everyone!

[Image: billboard saying "Translate Server Error"]

It's pretty easy to imagine the chain of events that led to this revealing error. The sign is describing a restaurant (the Chinese text, 餐厅, means "dining hall"). In the process of making the sign, the producers tried to translate the Chinese text into English with a machine translation system. The translation software did not work and produced the error message "Translate Server Error." Unfortunately, because the software's user didn't know English, they took the error message for the translation, and the error text went onto the sign.

This class of error is extremely widespread. When users employ machine translation systems, it's because they want to communicate with people with whom they do not have a language in common. That means the users of these systems are often in no position to understand the output (or input, depending on which way the translation is going) and have to trust the translation technology and its designers to get things right.

Here's another one of my favorite examples that shows a Chinese menu selling stir-fried Wikipedia.

[Image: Chinese menu listing a dish translated as "Wikipedia"]

It's not entirely clear how this error came about, but it seems likely that someone searched for the Chinese word for a type of edible fungus and its translation into English. The most relevant and accurate page might very well have been the Wikipedia article on the fungus. Unfamiliar with Wikipedia, the user then confused the name of the website with the name of the article. There have been several distinct sightings of "wikipedia" on Chinese menus.

There are a few errors revealed in these examples. Of course, there are the errors in the use of language and the broken translation server itself. But machine translation tools are also powerful intermediaries that determine (often with very little accountability) the content of one's messages. The authors of the translation software might design their tool to favor certain terminology and word choices over others or to silently censor certain messages. When the software is generating reasonable-sounding translations, the authors and readers of machine-translated texts are usually unaware of the ways in which messages are being changed. By revealing the presence of a translation system or process, these errors hint at that power.

Of course, one might be able to recognize a machine translation system simply by the roughness and nature of a translation. In this particular case, the server itself came explicitly into view; it was mentioned by name! In that sense, the most serious failure was not that the translation server broke or that Wikipedia was used incorrectly, but rather that each system failed to communicate the basic fact that there was an error in the first place.
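
For readers who write software, here is a minimal sketch of one way a program could keep a failure from being mistaken for a result. Nothing here is taken from the software behind the sign: `translate`, `TranslationError`, `sign_text`, and `broken_backend` are hypothetical names, and the backend is just a stand-in function. The idea is that a failure travels as an exception rather than as a string in the same slot where the translation would go.

```python
from dataclasses import dataclass


class TranslationError(Exception):
    """Raised when the (hypothetical) translation backend fails."""


@dataclass
class Translation:
    source_text: str
    translated_text: str


def translate(text, backend):
    """Return a Translation, or raise TranslationError.

    `backend` is any callable that takes the source text and returns
    the translated text, raising an exception on failure.
    """
    try:
        result = backend(text)
    except Exception as exc:
        # The failure is reported out of band; it is never returned
        # as if it were translated text.
        raise TranslationError(f"could not translate {text!r}") from exc
    return Translation(source_text=text, translated_text=result)


def broken_backend(text):
    # Stand-in for a translation server that is down.
    raise ConnectionError("Translate Server Error")


def sign_text(text, backend):
    """Return text to print on the sign, or None if there is no translation."""
    try:
        return translate(text, backend).translated_text
    except TranslationError:
        return None
```

With this shape, `sign_text("餐厅", broken_backend)` gives `None` rather than the words of the error message, so there is simply nothing to paint onto the sign.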

Responses to This Post

This is just a special case of the all too widespread error made by computer programmers of providing meaningless error messages. The average user doesn't speak SQL so there's no point in displaying the raw database error message to him. And if you're providing a program to translate Chinese into English (presumably because the user doesn't speak English), which language do you think you ought to display the error message in?
Dan - how does that not ever occur to anyone? Displaying the error message in the original language shouldn't be that hard to do and it would help so much.
Does it not occur to anyone to ask a native English speaker to proofread your copy before printing?  How hard is it to scan it to someone who speaks English?
A local Chinese restaurant where I live has a humorous typographical and spellcheck error on its menu resulting in a dish called "beef in empirical sauce".  What did it taste like you might ask?  Well, you'll just have to try it and find out.
The answer is not to omit the SQL or whatever tech-speak is relevant. It's valuable information the developers can use to help support the user who runs into problems, even if all the user can do is copy and paste it into email sent to the developers. What's important is providing reasonable natural language errors IN ADDITION to the tech-speak.

The problem with providing error messages in the "original" language is the volume of error messages which must be translated. A boggling variety of failures can occur while any non-trivial computer program is running. Often, there are so many possibilities that errors aren't reported well internally within the program, much less to any human, much less to any human other than the programmer, much less in seventeen languages. Oftentimes, nobody even knows all the errors which might occur, or if it's possible to know what they all are, it's impractical to create circumstances under which they all occur so that messages for them can be designed well in one natural language.

Software is harder than you think and easy to complain about.

One thing people could do better easily is make sure error messages are clearly designated in the "original" language. Then at least the user would know THAT the operation failed even if the nature of the failure is incomprehensible.
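
A rough sketch of that last suggestion, under stated assumptions: `FAILURE_NOTICE` and `format_failure` are hypothetical names, and the catalogue holds only one generic notice per interface language, so the many possible technical errors never need translating. The raw technical detail is kept so the user can still copy it into an email to the developers.

```python
# Hypothetical catalogue: a single generic "it failed" notice per
# interface language, instead of a translation of every possible error.
FAILURE_NOTICE = {
    "zh": "翻译失败，以下内容不是译文：",
    "en": "Translation failed. The text below is not a translation:",
}


def format_failure(user_lang, technical_detail):
    """Prefix the raw technical detail with a notice in the user's language.

    Falls back to English when the language is not in the catalogue.
    """
    notice = FAILURE_NOTICE.get(user_lang, FAILURE_NOTICE["en"])
    return f"{notice}\n[{technical_detail}]"
```

A call such as `format_failure("zh", "Translate Server Error")` puts the Chinese notice on the first line and the untranslated server message in brackets below it, so the user knows THAT it failed even without understanding why.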
I am from China, and I am so sorry to see that.

Why did they use an online dictionary? And they didn't even know they'd lost their internet connection? Stupid guys.
