The Last Frontier of Machine Translation

Don’t ask a bot to translate a book.

Pixelated illustration of a robot reading a book, its head obscured in a cloud of dust. Illustration by The Atlantic. Source: Getty.

When Google Translate was released, in 2006, I was an eighth grader stumbling through introductory Spanish, and my teacher had little reason to worry about her students using it to cheat. It’s almost hard to remember now, but early machine-translation systems were laughably poor. They could give you the general thrust of, say, a Portuguese website, but they often failed at even basic tasks. In one case from 2010, a Google-translated summons reportedly instructed a defendant to avoid court instead of showing up there.

Machine translation didn’t become the juggernaut we know until 2015, when Baidu released its large-scale neural machine-translation system, built on deep neural networks, the same underlying technology that powers chatbots such as ChatGPT today. Google started switching from a statistical model to a neural system not long after, as did peers such as Systran and Microsoft Translator. It was a major leap forward: Tourists can order coffee and haggle for knickknacks thanks to the magic of Google Translate; I’ve occasionally used Reverso Context, an AI tool, in my own published translations. But still, one area of translation has proved remarkably impervious: literature, which many researchers call the “last bastion” of human translation.

Most studies find that neural machine-translation models can translate only about 30 percent of novel excerpts—usually simple passages—with acceptable quality, as determined by native speakers. They struggle because, at its core, literary translation is an act of approximation. The best option is sometimes not the correct one, but the least bad one. Translators often have to sacrifice literal meaning for the greater good of the piece. But AI is less adept at making such compromises and at landing on creative solutions that, although technically less correct, preserve aspects of a book that are hard to quantify: voice, spirit, sensibility. “You’re weighing different losses and different gains against one another,” Heather Cleary, a literary translator from Spanish to English, told me. A translator has to ask herself: What am I going to really prioritize?

Daniel Hahn’s recent book, Catching Fire: A Translation Diary, is full of these types of dilemmas. In the book, he walks through his process of translating Jamás el fuego nunca, a novel by the Chilean writer Diamela Eltit. One chapter, for example, begins with the following four words: “Frentista, estalinista, asesina loca.” Let’s focus on frentista as a case study. The most literal translation (and the one offered by some AI translators) would be “frontist,” which is basically meaningless in English. Hahn suspects that frentista is meant to be a term for a Chilean leftist, and with a fellow translator’s help, he establishes that it is likely a derogatory term referring to a specific anti-Pinochet guerrilla group.

Hahn must ask himself what’s more important in this case: specificity, or maintaining readability and capturing the writer’s voice. He throws around a few options—“paramilitary,” “commie thugs”—before settling on “extremist.” He also switches the order to foreground “Stalinist” (estalinista), giving the reader a sense of what kind of extremist they’re dealing with. Then there’s the problem that Spanish is a gendered language; it’s clear in the original that the speaker is addressing a woman. As a result, Hahn renders asesina loca as “crazy killer bitch.” The final version reads “Stalinist. Extremist. Crazy killer bitch.” It’s imperfect, but it’s also great.

Google Translate, by contrast, suggests “Frontist, Stalinist, crazy murderer.” The sentence is correct, sure, but clumsy, and all but unintelligible to non-Chilean readers. A specialized model like the kind used in most studies of neural machine translation—perhaps one trained specifically on Chilean literature—would certainly fare better. But it’s still hard to imagine one coming up with something close to Hahn’s solution.

When you compare human translations with edited machine translations, however, things suddenly get a lot more interesting. In the production of commercial texts—an instruction manual for a printer or a kitchen gadget, say, or even a news article—it’s standard for humans to edit a raw machine translation and then send it to press. This process, which is called post-editing (PE), has been around since long before neural networks started being used for translation. Studies vary, but most conclude that it’s faster and cheaper than translating from scratch.

Since the release of neural models such as those used by Baidu and Google Translate, a body of research has investigated whether the PE process can be applied to literature too. In some studies, readers rate PE translations comparably to fully human ones. (Most of the research to date has compared European languages, which limits the conclusions that can be drawn from it.)

How well PE fares depends on several factors, but in studies, the method tends to fare worse with challenging literary works and better with plot-driven novels. Ana Guerberof Arenas, an associate professor in translation studies at the University of Groningen, in the Netherlands, told me that machines are more likely to trip over works with more “units of creative potential”—metaphors, imagery, idioms, and the like. Hahn’s frentista dilemma is a prime example—the more creativity required, the wider the gap between a human solution and a machine one.

Of course, the post-editor can touch up a poor rendition of a challenging passage. But some studies suggest that PE versions are different from fully human ones in subtle, vitally important ways. Antonio Toral, an associate professor at the University of Groningen who frequently collaborates with Guerberof Arenas, explained one example to me: “In translation from scratch, the translator decides where the translation goes from the start. If a sentence can be translated in three main ways, the translator is going to decide.” But in post-editing, “the machine is going to make that decision, and then you just fix whichever of the three the [machine-translation] system has picked.” This reduces the translator’s voice and could result in more homogeneous translations across the literary market.

It could also lead to inconsistent voice within a single translation: Toral told me that in research he has collaborated on, post-editors deviated from the raw machine translation less and less often as they progressed through a work. Recent research led by Guerberof Arenas found that compared with entirely human translations, PE translations are consistently less creative, meaning they depart from literal translations less often and perform less well with those units of creative potential. The differences here are subtle, a question of inches rather than miles. But these subtleties—voice, rhythm, style—are precisely what can separate a functional translation from a great one.

Despite these drawbacks, some European publishers are actively releasing PE titles. Nuanxed, an agency that produces PE translations for publishers, has completed more than 250 books, most of them commercial fiction, since launching two years ago. When I spoke with Robert Casten Carlberg, Nuanxed’s CEO and one of its co-founders, in October, it sounded like Nuanxed was doing well. “The publishers we work with, once they have worked with us, they come back and they want to do more,” he told me. Perhaps that’s because Nuanxed has really nailed human-machine translation; Carlberg described his company’s version as “broader” and “more holistic” than the PE norm, though he was unwilling to discuss specifics. But more likely, I think, is that the quality gap between PE and human translation doesn’t bother the average reader of action-driven commercial fiction. If the customers are happy, it’s easy to see why Nuanxed might not be so concerned about the recent academic research suggesting that PE isn’t optimal.

The changes in the industry aren’t going unnoticed. “Colleagues are starting to be offered post-editing jobs from the publishing houses that would normally offer them translation jobs,” Morten Visby, a Danish literary translator and the former president of the European Council of Literary Translators’ Associations, told me. In the United States, the Authors Guild recently published a sample clause for book contracts that would bar publishers from machine-translating an author’s book without the author’s consent. But so long as the translation “substantially comprises human creation” and a translator “has control over, and reviews and approves, each word in the translation,” the publisher would not need to secure consent to use AI “as a tool.” I asked several of the experts I spoke with whether they thought PE fits this definition, and unsurprisingly, there was no consensus. (Mary Rasenberger, the CEO of the Authors Guild, told me that according to her understanding, a publisher would have to obtain the author’s consent for PE translation.)

Although some European publishers fear that releasing PE titles would damage their brand, Visby said, most of the experts I spoke with think that the industry will continue to move in that direction. Likewise, although Nuanxed isn’t currently pursuing more literary work, Carlberg said the company would if a publisher asked and it felt up to the task.

The timing of all this is somewhat ironic. In English-speaking markets, there has been a real push in recent years to put translators’ names on covers, and for greater translator visibility in general. If PE jobs proliferate, the place of translators will likely become even less central. Translation, already an incredibly precarious profession, may become even less secure: Visby said that in his work on behalf of translators, he’s seen that post-editing gigs, unlike translation contracts, generally don’t grant human translators copyright, and offer fewer benefits.

And yet, many translators share a sense that all of this recent upheaval has only further cemented literary translation’s status as an indispensable art. AI can predict how proteins fold. It can outperform medical students and pass the bar. It can be used to create a plausible version of “Barbie Girl” sung by Johnny Cash. The fact that it remains woefully inadequate at literary translation—at least on its own—is a testament to the difficulty and value of the profession.



Jeremy Klemin is a writer and translator from Southern California.