To err is not just human: common types of errors to look out for when post-editing machine translations
With machine translation and post-editing continuing to gain relevance in the translation industry, many professional translators are turning their hand to providing this new service. While experience with editing and proofreading makes it relatively simple to add post-editing to the list of language services you provide, it does come with its own specific challenges.
Full and light post-editing – what’s the difference?
We always advise our clients on what we think is the best option for their projects. In doing so, we take into account a number of variables, such as the type and purpose of the text, its length, the deadline and how strict it is, and the client’s budget. Depending on the outcome of our assessment of those factors, we may suggest using post-editing.
There are two types of post-editing:
Light post-editing involves proofreading a machine translation and correcting only what is necessary to make the text accurate and true to the source. The aim of light post-editing is to produce a translation that is usable in a very short amount of time. In order to achieve this, only the most serious errors are corrected and the bare minimum is changed when it comes to style, just enough to ensure the text is understandable for the reader. This is why this type of translation is only suggested for certain types of text, for example internal notices or emails, i.e. texts that are not going to be published and do not need to be high quality.
Full post-editing involves making more changes and being more thorough when going through the machine translation. Not only are any errors corrected, but the terminology used is checked, the style and readability of the text are improved, and logical connections in the text are considered. The end result should be as close to a traditional translation as possible. This is often a good balance between receiving a high-quality translation and having a short turnaround time, and is therefore sometimes an option for certain external translations like presentations and user manuals.
Common types of errors in machine translations
Both machines and humans are susceptible to making errors when it comes to translation. However, the common types of errors seen in each case can vary, for obvious reasons. While many translators are very familiar with what they need to keep an eye out for in translations written by other humans, it is becoming increasingly important to know where to look for possible errors in machine translations.
A machine ‘thinks’ very differently to a human. A post-editor therefore needs to treat the proofreading of a machine-produced text differently to one written by a human translator. A balanced approach is needed where the post-editor is not afraid to step in to create a text that reads naturally but also knows when a good enough level has been reached.
Each language has its own grammar and syntax, and a machine often translates certain structures in a particular way that does not sound natural to the human ear. The post-editor needs to decide whether changing such a structure is necessary or purely stylistic (stylistic changes are only made when the assignment is full post-editing). It is also important not to get so used to reading sub-par texts that some unnatural phrasing is left in unintentionally.
A machine cannot write creatively; it can only draw on the input it has received, so it falls to the post-editor to inject that creativity into the machine output. For example, adding an idiom or expression may enhance the translation. Conversely, idioms that come from the source text may have been translated too literally, or may need a different solution in this context to the one the machine suggests.
As machine translation looks at each segment individually, it is unable to take the wider context of the text or the subject area into account. This may mean that some logical connections are missing that a human translator would have emphasised, for example by adding connectors.
A machine is unable to add or remove sections to make the target text more understandable or natural for the reader, so the post-editor must make these additions and omissions where appropriate.
A machine cannot judge differing degrees of emphasis (e.g. is ‘wonderful’ too over the top in context?) or decide if a text needs a more formal or informal tone.
Any mistakes or oddities in the source text are replicated in the target text, so it is important to check that everything makes sense in context. These can also be quite subtle: for example, if the source suddenly changes from using ‘we’ to using the name of the company, this may be something that a human translator would change to make more consistent.
- Localisation and transcreation
There are sometimes texts that, while broadly fulfilling the criteria for machine translation, still have sections that require a freer approach. Machine translation cannot add in or adapt ideas that are not there in the original text as it works on a set of rules rather than thinking about what the reader needs. It is therefore important to consider aspects like cultural references and whether they need to be explained further or even changed completely.
A machine will approach creatively written sections in a more literal way than a human translator. Here it is important to look beyond the words and work on reproducing the same imagery and/or feel of the source text.
Machine translation uses data and input from translation memories and termbases. This can help improve consistency as it means that if a term has been used before, it will also be used in the new text. However, this does not necessarily mean that this term is right for the new text or correct in every single instance. For example, a term may be used a lot in a translation memory, but a different term is needed for the current text because there is a specific request from the client, a new target audience for this particular text, etc. Moreover, as there are often terms that have multiple different translations, there may not just be one term used consistently in the translation memory.
A machine is susceptible to using terms that are unsuitable or unnatural as it cannot sense if something sounds a bit off. There may be two terms used in the source text that can be translated as the same term in the target text, but in context there is actually a slight difference that needs to be maintained. Or there may be two different terms used in the target text when it would be better to repeat the same one. Furthermore, as it translates segment by segment, a machine may put forward different translations for the same term in different segments.
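For post-editors who are comfortable with a little scripting, the last point (different translations for the same term in different segments) can be roughly checked automatically. The following is only a minimal sketch, not a feature of any particular CAT tool: the function name find_inconsistent_terms, the German-to-English example segments and the list of candidate renderings are all hypothetical, and the matching is a naive case-insensitive substring search that ignores inflection and compound words.

```python
from collections import defaultdict


def find_inconsistent_terms(segments, candidates):
    """Report source terms that were rendered with more than one target term.

    segments:   list of (source_text, target_text) pairs, one per segment
    candidates: dict mapping a source term to the target renderings to look for
                (hypothetical input, e.g. exported from a termbase)
    """
    seen = defaultdict(set)
    for source, target in segments:
        source_lc, target_lc = source.lower(), target.lower()
        for term, renderings in candidates.items():
            # Only consider segments whose source actually contains the term.
            if term.lower() not in source_lc:
                continue
            for rendering in renderings:
                # Naive substring match; no handling of inflected forms.
                if rendering.lower() in target_lc:
                    seen[term].add(rendering)
    # Keep only the terms for which more than one rendering was found.
    return {term: sorted(found) for term, found in seen.items() if len(found) > 1}


# Hypothetical example data for illustration only.
segments = [
    ("Der Mitarbeiter erhält eine Schulung.", "The employee receives training."),
    ("Jeder Mitarbeiter hat ein Konto.", "Each staff member has an account."),
    ("Das Konto ist kostenlos.", "The account is free."),
]
candidates = {
    "Mitarbeiter": ["employee", "staff member"],
    "Konto": ["account"],
}

print(find_inconsistent_terms(segments, candidates))
```

A check like this only flags candidates for review. Whether the two renderings should be harmonised, or whether the variation is justified by context, remains a decision for the post-editor.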
While machine translation may be less likely to fall into the trap of mistranslating a false friend, it may run into trouble if there is slang, jargon or made-up words. These are therefore things to pay close attention to when post-editing.
The punctuation used in the source text is often reproduced in the target text. While this can be done in some cases, it does not always work, and another solution might be more natural or appropriate. After all, it’s the little changes that really make the difference between a machine and a human translation.
Machine translation, while able to benefit from the research conducted previously by translators in the form of translation memory matches or a termbase, is not able to conduct research when translating a text and cannot look up new or specific terms and phrases. For example, it cannot look up acronyms, official job titles, names or places. Similarly, it is not an expert in the field and may therefore lack the exact terminology or context needed.
This is by no means an exhaustive list of the types of errors you may come across. However, it should highlight some points to be aware of while post-editing. These may differ from what you would normally look out for when proofreading a translation written by a human translator, and may therefore alter how you approach the task.
Are there any common errors you have come across while post-editing that haven’t made it onto our list? Let us know!