Executive Summary
Using machine translation (MT) correctly requires insight, expertise and careful planning. We do not believe that fully automated localisation processes are likely to happen anytime soon.
The quality of machine translation output depends on many factors: the way the source text is written, how creative it is, whether the MT engine has been trained with appropriate data, etc.
In audiovisual (AV) content, the meaning of words is affected by the sound and image. MT engines do not understand context and merely reproduce existing translations. This makes machine translation less suitable for audiovisual translation.
Translators are rarely invited to participate in the design of MT solutions, and it is assumed that one size fits all, although translators have different styles of work.
A better dialogue between developers and AV translators could help to shift the focus from MT to other technologies which might be of greater benefit, such as some CAT (computer-assisted translation) features, voice-input technologies and quality assurance tools.
Although proponents of MT claim that efficiency gains are guaranteed, fixing a poor translation can take longer than translating the same text from scratch. Some translators produce translations of worse quality when using MT. This means that the effort is merely shifted from the translator to the reviser.
Unscrupulous use of MT will increasingly lead to brain drain and talent crunch. This undermines the long-term sustainability of the industry.
To reinforce sustainability, translators’ conditions need to be improved. This includes adequate remuneration both for their work and for the use of that work to train MT engines, guaranteeing their status as authors, adjusting the expected output both for translators and post-editors, and paying attention to quality.
Ethical aspects are no less important: MT is sometimes passed off as human translation, and clients are not made aware that it has been used. Translators are often not aware that their work is used to train MT engines, nor are they remunerated for this. MT makes translators lose their unique translation style, and the language becomes more bland and homogeneous. The ecological cost of MT should be taken into account, too.
We are in favour of the concept of the augmented translator that puts the human front and centre and uses technology to enhance their capabilities.
Introduction
Nothing has ever changed the localisation industry quite like the quantum leap in machine translation (MT) technology made in the last decade. While some parties are already celebrating the end of human translation and ushering in the era of “post-localisation”, we should keep in mind that artificial intelligence (AI) is merely using large amounts of reference data to often produce correct results, while the human translator knows how words and concepts relate to each other in the world, which is the more robust basis for quality output.
As Aljoscha Burchardt, principal researcher at the German Research Center for Artificial Intelligence, put it, “MT is a clever parrot, but still a parrot” (Augmented Translation, Dr. Arle Lommel, CSA Research, 34:37).
Unfortunately, working conditions for human translators have historically been poor, and the recent hype around machine translation has only made things worse.
Since MT is here to stay and certainly has potential, we would like to give an overview of the current state of affairs and recommend guidelines for AI-powered human translation as well as best practices for the sustainability of the industry in general.
Capability of MT
Current experience
Contrary to the belief that AI will now effortlessly serve as our personal interpreter, using the technology correctly requires insight, expertise and careful planning. We are far from being able to copy-and-paste any piece of text into the graphical interface of an MT engine and get a polished result. Humans are still an integral part of the localisation process, and often human translation is the only way of producing a fit-for-purpose result. This is due to the fact that the quality of machine translation output varies wildly, as it depends on many factors:
Is the source text written in a way that facilitates translation by the machine? Does it use many abbreviations or very specific lingo, as would be the case with a video-game developer talking about the latest game update, or does it adhere to common linguistic standards, and was it written by a professional writer (e.g. a journalist)?
Does the content require creative translation (e.g. a high-fantasy drama), or is it repetitive, following recognisable patterns (e.g. a recipe)?
How well is the MT engine trained in the specific field, and what is the quality of the corpora (databases filled with existing phrases in source and target languages) used for training? Or is the MT engine an all-purpose one that favours quantity over quality in its underlying data?
Are there extensive corpora available for the language pair in question, or do you have to use a pivot language (third language, e.g.: source > English > target) for translation?
All these factors and more influence the quality of the MT output, and as a consequence it is difficult to make generalised statements about it. If used correctly and with the individual scenario in mind, the technology can assist the human translator. We envisage that in the future AI will not replace human translators but will rather augment their skills. The machine’s ability to quickly parse large amounts of data and present relevant results to the human in the driver’s seat could prove invaluable to efficiency and quality levels moving forward.
Similarly, it is important to keep in mind that AI merely reproduces existing translations, which means that it cannot come up with creative solutions. It looks for words and phrases in the available training data and constructs translated sentences based on that. The way it reaches those results is unpredictable, which means that catastrophic errors (completely unusable translations) sometimes occur. At the same time, AI does not understand context (or only in a limited way) and, due to the way the algorithms are designed, will not be able to improve on this in the near future. Experts say that the context a machine can take into account will be limited to a couple of pages of a book.
When localising audiovisual (AV) content, however, context is key. The image on the screen, the action, the tone of voice and gestures all affect the meaning of the words, and as long as machines are not capable of taking all these elements into account, MT will be lacking in AV translation and in all (creative) translations that heavily rely on context, for that matter.
Features currently in development
While machines might never read and understand entire books, industry experts can give us a glimpse of incremental updates to the current technology. Both Evgeny Matusov and Yota Georgakopoulou have published articles talking about improvements which are likely to appear very soon in our daily workflows.
In terms of pre- and post-processing of text, metadata could help MT to take factors such as register, text-length restrictions and gender into account, and audiovisual translation (AVT) specialists could also fine-tune this information on a per-segment basis. At the same time, intelligent text segmenters could reduce the amount of editing necessary for suitably formatted subtitles.
Automatic grammar correction could remove grammatical errors, glossary implementation could help improve consistency, and indicators for the estimated quality of the MT could automatically filter out suggestions that do not meet a certain standard.
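To illustrate the last point, here is a minimal sketch in Python of how a tool might hide suggestions below a chosen quality threshold. It is an assumption-laden example, not a description of any existing product: the Suggestion class, the quality field (an estimate between 0 and 1) and the filter_suggestions function are hypothetical names, and the scores are made up.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    source: str     # source-language segment
    target: str     # MT output for that segment
    quality: float  # hypothetical quality estimate between 0 and 1


def filter_suggestions(suggestions, threshold=0.8):
    """Keep only the suggestions whose estimated quality meets the threshold."""
    return [s for s in suggestions if s.quality >= threshold]


# Illustrative data only; the scores are invented for this example.
suggestions = [
    Suggestion("Hello.", "Bonjour.", 0.95),
    Suggestion("Break a leg!", "Casse une jambe !", 0.30),
]

# Only the first suggestion would be shown to the translator.
shown = filter_suggestions(suggestions, threshold=0.8)
```

In such a setup the threshold would ideally be chosen by the translator rather than fixed by the tool, since the acceptable level depends on the content and on the individual's style of work.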
Newly trained engines for specific language pairs, which eliminate the need for pivot languages, are also expected to appear in the coming years.
While these features will make working with MT more comfortable, we do not think that fully automated localisation processes without human intervention are likely to happen anytime soon.
Implementation of MT
Current experience
Translators are rarely invited to participate in the design of MT solutions, nor is adequate feedback solicited from them, although they could offer valuable suggestions. Consequently, their needs and user experience aren’t taken into account. The assumption is that one size fits all, although translators have different styles when it comes to the use of MT – while some translators want to display the MT suggestion only occasionally and don’t want to allow the MT output to affect their style, others prefer rewriting whatever is provided by the machine. However, MT suggestions are usually prefilled in the translator’s working area (boxes) and aren’t always easy to hide.
A better dialogue between developers and translators could help to shift the focus from MT to other technologies which seem to be overlooked, although they might have a greater practical benefit both for the translators and the overall quality. Such technologies include a number of CAT features, voice-input technologies and other solutions. Although they are used as a matter of course in other domains of translation, they are almost non-existent or unsupported in audiovisual translation.
It is usually agencies or end clients that decide whether or not and how MT should be used on a given project. This decision is often made by a project manager with no training or experience in translation, while the translator, an expert in the area, has no say in the matter. Sometimes, translators are instructed to keep as much of the MT output as possible. In fact, this is an approach called light post-editing. Unlike full post-editing, which is used to “obtain a product comparable to [that] obtained by human translation”, light post-editing merely allows the user to get a sense of the meaning and doesn’t “attempt to [create] a product comparable to [that] obtained by human translation”. While this may be suitable for internal communication or media monitoring, it isn’t appropriate for most audiovisual content.
Although enthusiastic statements by proponents of MT may suggest that efficiency gains are guaranteed, most translators disagree: fixing a poor translation, whether it has been produced by a machine or a human, can take longer than translating the same text from scratch. Indeed, revisers (known as QCers, proofreaders, adaptors, etc. in different AVT fields) complain that some translators produce translations of worse quality when using MT. As a result, some of the effort is merely shifted from the translation stage to the quality-control stage, while the revisers’ remuneration remains the same.
What we want
A user-friendly, intuitive interface that makes it easy to concentrate on the task at hand. This can best be achieved if translators are involved in MT design and setup.
MT should be an optional tool which stimulates human creativity instead of inhibiting it. In practical terms, we would like to have an MT input toggle to display/hide and insert MT suggestions easily by means of a customisable shortcut. Ideally, it should be possible to display only MT suggestions exceeding a selected quality threshold. We would like to be able to choose the type of MT output presentation to accommodate different styles of work: overtyping the MT suggestion in the box vs. typing in an empty box while the suggestion is displayed in a separate field (if activated by a toggle).
In addition to MT, we’d like to have access to other, more traditional tools as well. Translators should be put in full control, which also means that all the tools can be turned on and off as needed and can be accessed with customisable shortcuts. In particular, they should include the following functionalities, some known from computer-assisted translation (CAT) applications:
Translation memory search, also known as concordancing, giving easy access not only to existing translations of related content, e.g. other episodes/seasons of the same series, but to other bilingual content as well. When both MT and translation memories (TMs) are used to leverage suggestions, it should be clear at first sight which source each suggestion comes from. MT could be used to enhance TM technology and introduce subsegment matching;
Termbase lookup, giving on-the-fly access to glossaries of key names and phrases, terms, etc. and enhancing the data with the help of AI (Automated Content Enrichment);
Predictive typing (autocomplete) that anticipates what words you will need to type when translating a specific sentence;
Voice input (dictation) functionality customised for colloquial spoken language and supporting voice commands for operations such as timing and segmentation;
Up-to-date spell checkers customised for colloquial/spoken language rather than just widely available open-source ones that haven’t been updated for years and, for some languages, have inadequately sized dictionaries with countless errors;
Quality assurance tools that focus not only on technical aspects such as reading speed and gaps between subtitles but on linguistic aspects as well. Examples of items that may be checked include trailing spaces, double spaces, repeated words, grammatical errors, words that shouldn’t appear at the end of lines and glossary conformance. Support for regular expressions in find-and-replace operations is another important feature.
Applications facilitating collaboration among translators on larger projects either within a language (e.g., several people working on a series) or across languages (e.g., several translators, each working into their target language) and across functions (translator, QCer, template maker, script authors, etc.).
New innovative tools that could be developed in collaboration with translators who use language technologies such as language corpora. One such example is having access to lemma-based translation equivalents (word alignments) or simple access to existing dictionaries.
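As an illustration of the linguistic quality assurance checks mentioned in the list above, the following minimal sketch in Python uses regular expressions to flag trailing spaces, double spaces and repeated words in a subtitle. The CHECKS table and the check_subtitle function are hypothetical names chosen for this example, and the patterns are deliberately simple; a real tool would need language-specific rules.

```python
import re

# Each check maps a readable name to a regular expression that flags the issue.
CHECKS = {
    # Spaces or tabs at the end of any line of the subtitle.
    "trailing space": re.compile(r"[ \t]+$", re.MULTILINE),
    # Two or more spaces between non-space characters.
    "double space": re.compile(r"(?<=\S) {2,}(?=\S)"),
    # The same word twice in a row, ignoring case (e.g. "the the").
    "repeated word": re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE),
}


def check_subtitle(text):
    """Return the sorted names of all checks that the subtitle text fails."""
    return sorted(name for name, pattern in CHECKS.items() if pattern.search(text))


issues = check_subtitle("I saw the the dog. ")
```

A repeated word is not always an error (for instance "had had" in English), so results like these are best treated as flags for the translator to review rather than as automatic corrections.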
Fair Payment, Sustainability and Quality
Current experience
In the field of audiovisual translation, practitioners have come to associate big technological shifts with pay cuts, as historically many such shifts, e.g. the introduction of fuzzy matches in game localisation and the use of automatic speech recognition in captioning, have led to exactly that.
Another example of this comes from the subtitling industry. In an attempt to maximise efficiency, companies have fragmented what used to be a job for one specialist, the subtitler, into smaller parts performed by different people (the template maker, the template proofreader, the translator, the translation proofreader, etc.), each paid considerably less due to “having to do less work”. The pay reduction was disproportionately large, and that’s one of the reasons why very few subtitling professionals nowadays earn a decent wage commensurate with their qualifications, skills and experience.
Rates were already low in the pre-MT era, not least because they’d been stagnant for more than a decade and hadn’t been adjusted for inflation. Now that the use of MT has exploded in popularity among companies, there’s been even more fragmentation in the workflow (via post-editing and metadata management) and even more pay cuts, as some companies have aggressively slashed their rates on the basis of the “huge efficiency gains” that MT supposedly brings, a claim that audiovisual translators have contested all along.
The shift to post-editing has not only led to lower pay and decreased work satisfaction but has also put the translators’ authorship status in jeopardy. With the raw translation now being provided by the machine, unless policymaking goes in the right direction, AVT specialists might lose their authors’ rights, including the right to receive royalties for the use of their work, which would decrease their income by up to 50% in some cases.
All this has exacerbated the so-called brain drain — a phenomenon which was acutely felt even before the advent of MT — with professionals quitting in search of better working conditions and university graduates failing to find their footing in our trade. This, in turn, has contributed to the “talent crunch” — one of the biggest talking points in the industry today.
Translation quality has also suffered due to MT and will likely continue to be eroded unless proper measures are taken. This process may even accelerate as more and more machine-translated content is fed back into MT engines. These days, industry-led discussions surrounding machine translation seem to focus solely on maximising efficiency, while the effect of MT on quality is mentioned only briefly or not at all. Indeed, quality doesn’t seem to be a priority: as mentioned above, post-editors are often instructed to keep as much of the machine translation output as possible, and even when that’s not the case, lower rates force them to adopt the “this will do” approach, i.e. accepting barely passable translations in order to work as fast as they can to try and earn a decent wage. Post-editing is already a demanding and stressful job, and doing it in such a hectic manner is bound to affect not only translation quality but also one’s mental health.
With rates, working conditions and quality deteriorating and with veteran translators leaving the industry and being replaced by students, amateurs and part-timers, we fear for the long-term sustainability of our profession and lament the role of the exploitative use of MT in this decline.
What we want
Translation rates, which will be affected by MT, should be adjusted for the inflation of the past 10+ years and reflect the average salary of similarly skilled employees in other fields and the employment benefits they get, such as paid leave, paid vacation, healthcare and pension benefits, maternity leave, etc.
The efficiency gain via MT should primarily be used to scale with the increasing demand, which would justify the investment in MT training and implementation by itself. MT could be used to support less demanding projects and free up highly skilled AVT specialists for more demanding ones.
Remuneration for machine translation post-editing should be commensurate with the high level of expertise required to perform it.
Translators should be made aware of and remunerated for the use of their work for training an MT engine.
Translators’ author status should be retained even with the use of MT.
Mental health should be taken into account when adjusting post-editors’ expected output, since it is not possible to proofread MT as quickly as human translation.
Quality should be brought to the forefront.
Measures that can prevent its erosion include adequate recruitment, regular training, continuous quality assessment, etc.
Quality assessment (evaluation of quality) and revision (improvement of quality) are different parts of the workflow. Revisers shouldn’t be routinely asked to provide quality assessment at the same time: when revisers have to assign an error type and severity to each change they make, it makes their task much more time-consuming and may negatively affect their choices.
Quality assessment shouldn’t be based exclusively on counting errors, as this weakness-focused approach disregards strengths such as creative solutions and idiomatic expressions.
Quality assessment shouldn’t be confined to assessment of translators and post-editors but should cover revision as well.
It should be kept in mind that reliable revisers are more difficult to recruit than reliable translators.
The Ethical Aspect of MT
Like many other powerful technologies, machine translation doesn’t come without ethical issues, which mostly stem from the unscrupulous practices surrounding its use. An example of one such practice is that of companies trying to pass MT off as human translation or not telling their clients whether MT was involved in the localisation process. Most filmmakers, producers and content creators would likely object to having their work translated by a machine, even if post-edited afterwards, but more often than not they simply aren’t made aware of the fact.
Translators, in turn, aren’t made aware (or asked for consent) when their work is used to train an MT engine, which could be seen as infringement of intellectual property in some countries. More generally, practitioners’ opinions of MT and post-editing remain largely ignored, even though many have openly stated that they don’t want to be correcting what a machine wrote or to see their creative profession, an art form even, turn into a more mechanistic, routine job.
An unwelcome result of translators’ shift to post-editing is the loss of their unique translation style and linguistic signature. While a fair amount of localisable content, e.g. corporate materials, conference presentations and e-learning courses, can potentially be suitable for the use of MT and not benefit much from what’s known as the “translator’s voice”, what happens when the human touch and creativity are crucial, as in the case of films? According to research, when post-editing MT, translators lose about one-third of the stylistic features that make up their individual voice.
Another potential issue is that language becomes more bland and homogeneous, and less rich and diverse.
Beyond this, as previously stated, authors’ rights in the era of MT have not been sufficiently discussed yet, and the jury is still out in terms of appropriate legislation. As it is currently not clear who owns text created by the collaboration of machine and post-editor, audiovisual translators might end up losing the right to be credited for their work and becoming nameless, voiceless cogs in the localisation machine.
And finally, as is the case with all types of technologies in our digital day and age, MT and AI in general come at an ecological cost, since training MT engines as well as using them requires a tremendous amount of energy.
Looking Ahead
While machines have come a long way since their rule-based days, human parity under real-world circumstances has not yet been achieved and will not be achieved anytime soon. Because of this, despite technological advances, humans will remain the bottleneck and therefore a key element in sustaining our industry. And so, their needs must be met.
In his recent talk about augmented translation, Dr. Arle Lommel from CSA Research provided a valuable insight into a future of MT that centres around the human. In that future, AI is used not to replace translators or make them post-editors but rather to enhance their capabilities and boost the speed, comfort and ease with which they do their work. Surrounded by cutting-edge, next-generation technology, translators are not forced into using any of the available tools. Instead, they’re free to choose whatever tools they think will work best for each particular project, be it automated content enrichment for research-heavy tasks, enhanced translation memory and adaptive neural machine translation for assignments with a fair bit of terminology and repetition, or perhaps even nothing at all for highly creative jobs.
This is a vision we share. By using MT to empower translators and improve their working conditions, we can secure a sustainable future for the field of AVT and continue to bridge linguistic divides across different countries and cultures.
For that, a constant dialogue between all stakeholders is crucial to ensure that the needs of all parties are being met. We also encourage more research into such aspects of MT as quality, cognitive effort, mental health and work satisfaction, with a special focus on creative translation (so far, research has primarily focused on legal and technical texts) in order to build a healthy, productive and sustainable working environment. Cutting costs, putting pressure on collaborators and phasing out human translators based on a surface-level understanding of MT technology are not acceptable to us.
We hope that this document will serve as a basis for the best practices in the field of augmented translation and inspire fruitful relationships between everyone involved. After all, our industry has always been about helping to foster understanding and bringing the world closer together.