Journal tags: l10n

1

Ni Hao, Monde: Connecting Communities Across Cultural and Linguistic Boundaries

Simon Batistoni is responsible for Flickr’s internationalisation and he’s going to share his knowledge here at XTech. Flickr is in a lucky position; its core content is pictures. Pictures of cute kittens are relatively universal.

We, especially the people at this conference, are becoming hyperconnected with lots of different ways of communicating. But we tend to forget that there is this brick wall that many of us never run into; we are divided.

In the beginning was the Babelfish. When some people think of translation, this is what they think of. We’ve all played the round-trip translation game, right? Oh my, that’s a tasty salad becomes that’s my OH — this one is insalata of tasty pleasure. It’s funny but you can actually trace the moment where tasty becomes of tasty pleasure (it’s de beun gusto in Spanish). Language is subtle.

It cannot really be encoded into rules. It evolves over time. Even 20 years ago if you came into the office and said I had a good weekend surfing it may have meant something different. Human beings can parse and disambiguate very well but machines can’t.

Apocraphyl story alert. In 1945, the terms for Japanese surrender were drawn up using a word which was intended to convey no comment. But the Japanese news agency interpreted this as we ignore and reported it as such. When this was picked up by the Allies, they interpreted this as a rejection of the terms of surrender and so an atomic bomb was dropped on Hiroshima.

Simon plugs The Language Instinct, that excellent Steven Pinker book. Pinker nails the idea of ungrammatically but it’s essentially a gut instinct. This is why reading machine translations is uncomfortable. Luckily we have access to language processors that are far better than machines …human brains.

Here’s an example from Flickr’s groups feature. The goal was to provide a simple interface for group members to translate their own content: titles and descriptions. A group about abandoned trains and railways was originally Spanish but a week after internationalisation, the group exploded in size.

Here’s another example: 43 Things. The units of content are nice and succinct; visit Paris, fall in love, etc. So when you provide an interface for people to translate these granular bits, the whole thing snowballs.

Dopplr is another example. They have a “tips” feature. That unit of content is nice and small and so it’s relatively easy to internationalise. Because Dopplr is location-based, you could bubble up local knowledge.

So look out for some discrete chunks of content that you can allow the community to translate. But there’s no magic recipe because each site is different.

Google Translate is the great white hope of translation — a mixture of machine analysis on human translations. The interface allows you to see the original text and offers you the opportunity to correct translations. So it’s self-correcting by encouraging human intervention. If it actually works, it will be great.

Wait, they don’t love you like I love you… Maaa-aa-a-aa-aa-a-aa-aaps.

Maps are awesome, says Simon. Flickr places, created by Kellan who is sitting in front of me, is a great example of exposing the size and variation of the world. It’s kind of like the Dopplr Raumzeitgeist map. Both give you an exciting sense of the larger, international community that you are a part of. They open our minds. Twittervision is much the same; just look at this amazing multicultural world we live in.

Maps are one form of international communication. Gestures are similar. We can order beers in a foreign country by pointing. Careful about what assumptions you make about gestures though. The thumbs up gesture means something different in Corsica. There are perhaps six universal facial expressions. The game Phantasy Star Online allowed users to communicate using a limited range of facial expressions. You could also construct very basic sentences by using drop downs of verbs and nouns.

Simon says he just wants to provide a toolbox of things that we can think about.

Road signs are quite universal. The roots of this communication stretches back years. In a way, they have rudimentary verbs: yellow triangles (“be careful of”), red circles (“don’t”).

Star ratings have become quite ubiquitous. Music is universal so why does Apple segment the star rating portion of reviews between different nationality stores? People they come together, people they fall apart, no one can stop us now ‘cause we are all made of stars.

To summarise:

  • We don’t have phasers and transporters and we certainly don’t have universal translators. It’s AI hard.
  • Think about the little bits of textual content that you can break down and translate.

Grab the slides of this talk at hitherto.net/talks.

It’s question time and I ask whether there’s a danger in internationalisation of thinking about language in a binary way. Most people don’t have a single language, they have a hierarchy of languages that they speak to a greater or lesser degree of fluency. Why not allow people to set a preference of language hierarchy? Simon says that Flickr don’t allow that kind of preference setting but they do something simpler; so if you are on a group page and it isn’t available in your language of choice, it will default to the language of that group. Also, Kellan points out, there’s a link at the bottom of each page to take you to different language versions. Crucially, that link will take you to a different version of the current page you’re on, not take you back to the front of the site. Some sites get this wrong and it really pisses Jessica off.

Someone asks about the percentage of users who are from a non-English speaking country but who speak English. I jump in to warn of thinking about speaking English in such a binary way — there are different levels of fluency. Simon also warns about taking a culturally imperialist attitude to developing applications.

There are more questions but I’m too busy getting involved with the discussion to write everything down here. Great talk; great discussion.