Using The Wisdom Of Crowds To Translate Language

With the help of a translator (right), an aid worker in Haiti enters the medical data of a woman injured during January's earthquake. In the days after the quake, translators who spoke the local dialect, Kreyol, were in short supply. And that prompted linguists to use crowd-sourcing techniques. AP/Global Relief Technologies

Anyone who's turned to a website to translate a foreign language knows that perfect computer translation remains elusive. That's especially true for hundreds of lesser-known languages.

So linguists are trying to harness the wisdom of crowds to do what machines can't. It's known as crowd-sourcing, and researchers think it could help them get closer to something they've been pursuing for decades: the perfect translation machine.

"There are aspects to the translation problem that are undeniably, unavoidably human," says Philip Resnik, who teaches linguistics at the University of Maryland. Resnik says computer translators like Babelfish and Google Translate work best when they have lots of translation data to learn from. And we only have that data for a handful of languages, like French and Chinese.

"There's an awful lot more than six languages in the world," Resnik says. "And an awful lot of people in the world who have a need for something that provides more reliability than you're going to get from Google Translate."

So Resnik and a handful of his colleagues are looking for ways to make human and computer translators work together. Earlier this month, they gathered at a conference in Maryland to trade ideas and real-world examples.

A Problem Illuminated In Haiti

Stanford graduate student Rob Munro cites the earthquake that struck Haiti in January.

In that country, text messages were still getting through — but most of them were in the local Creole dialect, Kreyol. And the U.S. military, which was in charge of relief, doesn't speak Kreyol. Fortunately, there are thousands of Haitians around the world who do.

"If you lived anywhere in the world and spoke Haitian Kreyol and you wanted to help, then you could come online, translate a message," Munro says.

On average, it took those volunteers less than 10 minutes to translate each text message. Not as fast as a computer, but far more accurate. This is what tech geeks like Munro call crowd-sourcing.

"By crowd-sourcing, we could bring in the knowledge of people who could translate from Kreyol into English," he says, "and then those who could identify all of the locations in those messages."

The Challenges Of A Global Economy

Linguists hope that crowd-sourcing holds the key to translating hundreds of relatively obscure languages, such as Urdu, Pashto and Farsi. But this is not a purely academic question — it also touches on both security and business concerns, says computer science professor Judith Klavans, of the University of Maryland.

Klavans also works with the Office of the Director of National Intelligence.

"In the Cold War era, we had Spanish and Russian. If you could handle Spanish and Russian, you could do about anything that needed to be done," Klavans says.

"But now, we've got all kinds of other languages. We live in a much more global economy. If you can't figure them out quickly, then we don't know what's going on anywhere."

The early experiments with crowd-sourcing and translation are not close to being ready for public consumption — it's still a big challenge to figure out which human translators do a good job and which don't. But conference organizer Philip Resnik is optimistic.

"It's possible that crowd-sourcing will not get us all the way to fully automatic, high-quality translation," Resnik says. "But it can get us a lot closer, by bringing humans and machines closer together in a way that hasn't happened before."

Linguists don't know where the next disaster will be. But they do know that they'll be ready to translate.
