Science or Fiction: Machine Translation Explained

Juraj Močilac, 10 months ago

Once, when I was a kid, a friend and I were passing a car wash with a big sign that read “Machine washing and polishing”. My friend asked me, all amazed, “Wow, they have machines to wash the cars?!” A man who worked there overheard him and replied, mildly disappointed, “Do I look like a machine to you?” We did not expect that, but he was, indeed, still a human being. The same goes for machine translation (MT).

Science or Fiction: Machine Translation Explained | Blog | Ciklopea

Many people believe that computers do all the work nowadays, and translators are often asked whether they are afraid of losing their jobs. The answer could be the same as the one the car wash worker gave. Even though we’re nearing the end of 2017, machine translation tools are still not advanced enough to replace humans. We designed them to make our jobs easier and to make us more efficient. The tools are here to help us, not vice versa.

When it comes to machine translation, there is still a lot of confusion, particularly among people from other industries. Naturally, they have a whole bunch of questions that need answering, such as:

What does MT actually mean?

How does it work?

What do MT and CAT stand for?

Machine translation (MT) is a subfield of computational linguistics that investigates the use of software to translate text or speech from one language to another. At its most basic level, it works by simply substituting each word in one language with a word in another. As you can imagine, that alone cannot produce an adequate translation, because the art of translating is much more complex. Every language has its own rules and usage, which makes it difficult for machine translation tools to do better than they do now.
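To see why word-by-word substitution falls short, here is a minimal Python sketch. The tiny English-to-Spanish lexicon and the function name `word_for_word` are made up for illustration and do not come from any real MT tool.

```python
# Hypothetical toy lexicon (English -> Spanish), for illustration only.
LEXICON = {
    "the": "el",
    "red": "rojo",
    "car": "coche",
    "is": "es",
    "big": "grande",
}

def word_for_word(sentence, lexicon):
    """Replace each word with its dictionary entry, keeping the source word order."""
    words = sentence.lower().split()
    # Words missing from the lexicon are passed through unchanged.
    return " ".join(lexicon.get(word, word) for word in words)

print(word_for_word("The red car is big", LEXICON))  # prints: el rojo coche es grande
```

Every word is translated, yet the result keeps the English word order. Spanish would normally put the adjective after the noun ("el coche rojo es grande"), which is exactly the kind of rule a simple substitution knows nothing about.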

That’s why human intervention is inevitable. We’re still the ones doing most of the work, making all the necessary corrections and ensuring that the translation output is effective, correct and ready to use.

When we’re talking about machine translation, there are several types worth mentioning:

Rule-Based Machine Translation (RBMT)

This type of machine translation requires detailed information about the structure of the source and target languages. Morphological and syntactic rules and semantic analysis of both languages define the framework of rule-based machine translation. The process involves linking the structures of input and output sentences using a parser, a generator and a transfer lexicon. The problem with this method is that everything needs to be defined explicitly, which can be time-consuming. If the goal is to speed up the whole process, this is hardly the method to choose.
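The parser, transfer and generator steps described above can be sketched in a few lines of Python. This is a toy under stated assumptions: the lexicon, the part-of-speech tags and the single reordering rule are all hypothetical, and a real rule-based engine would need thousands of such rules.

```python
# Hypothetical transfer lexicon: source word -> (target word, part of speech).
TRANSFER_LEXICON = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "car": ("coche", "NOUN"),
    "is": ("es", "VERB"),
    "big": ("grande", "ADJ"),
}

def parse(sentence):
    """'Parser': tag each source word; unknown words get the tag 'UNK'."""
    words = sentence.lower().split()
    return [(w, TRANSFER_LEXICON.get(w, (w, "UNK"))[1]) for w in words]

def transfer(tagged):
    """Explicit rule: an adjective directly before a noun moves after it."""
    out = list(tagged)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "ADJ" and out[i + 1][1] == "NOUN":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2
        else:
            i += 1
    return out

def generate(tagged):
    """'Generator': emit the target word for each source word."""
    return " ".join(TRANSFER_LEXICON.get(w, (w, "UNK"))[0] for w, _ in tagged)

def translate_rbmt(sentence):
    return generate(transfer(parse(sentence)))

print(translate_rbmt("The red car is big"))  # prints: el coche rojo es grande
```

The output is now in the right order, but only because someone wrote that one rule by hand. Gender agreement, verb conjugation and every exception would each need their own explicit rule, which is where the time goes.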

Statistical Machine Translation (SMT)

This is a paradigm of machine translation in which the translation output is generated on the basis of statistical models whose parameters are derived from the analysis of bilingual text corpora. The idea is to store as many similar documents as possible in the same place so that the tools can detect patterns in documents previously translated by professional human translators and make guesses based on those findings. Google Translate was probably the most popular machine translation service using this method, but it has recently switched to neural MT models.
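As a rough illustration of "parameters derived from bilingual corpora", the sketch below estimates word translation probabilities from three sentence pairs using expectation-maximisation, in the spirit of the classic word-alignment models. It is a deliberate simplification (no phrase model, no language model, no empty-word alignment), the three-sentence corpus is invented, and it is not a description of how any particular service worked.

```python
from collections import defaultdict

# Hypothetical miniature parallel corpus (German -> English), for illustration only.
CORPUS = [
    ("das haus", "the house"),
    ("das buch", "the book"),
    ("ein buch", "a book"),
]

def train_word_model(corpus, iterations=10):
    """Estimate t[(target, source)] = P(target word | source word) with EM."""
    pairs = [(src.split(), tgt.split()) for src, tgt in corpus]
    target_vocab = {w for _, tgt in pairs for w in tgt}
    # Start with every target word equally likely for every source word.
    t = defaultdict(lambda: 1.0 / len(target_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) pairings
        total = defaultdict(float)  # expected count per source word
        for src, tgt in pairs:
            for e in tgt:
                norm = sum(t[(e, f)] for f in src)
                for f in src:
                    share = t[(e, f)] / norm
                    count[(e, f)] += share
                    total[f] += share
        # Re-estimate the probabilities from the expected counts.
        t = defaultdict(float, {(e, f): c / total[f] for (e, f), c in count.items()})
    return t

def best_translation(t, source_word):
    """Pick the target word with the highest estimated probability."""
    candidates = {e: p for (e, f), p in t.items() if f == source_word}
    return max(candidates, key=candidates.get)

model = train_word_model(CORPUS)
for word in ["das", "haus", "buch", "ein"]:
    print(word, "->", best_translation(model, word))
```

Nobody told the program that "haus" means "house". It works that out because "das" also appears with "book", so "the" is better explained by "das", which leaves "house" for "haus". With only three sentences the guesses are fragile, which is why real systems depend on very large corpora.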

Example-Based Machine Translation (EBMT)

This method also relies on a corpus of previously translated documents. When we enter a sentence we want to translate, sentences that contain similar sub-sentential components are selected from the corpus. Those sentences are then used to translate the sub-sentential components of the original sentence into the target language. You can already see that this simply screams for additional human intervention.
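Here is a minimal sketch of that retrieve-and-patch idea, using Python's standard `difflib` module to find the closest stored example. The example base, the fragment table and the function name `translate_ebmt` are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical example base of previously translated sentences (English -> Spanish).
EXAMPLES = [
    ("i want a beer", "quiero una cerveza"),
    ("we sell books", "vendemos libros"),
]

# Hypothetical sub-sentential fragment pairs taken from earlier translations.
FRAGMENTS = {"beer": "cerveza", "salad": "ensalada"}

def translate_ebmt(sentence):
    words = sentence.lower().split()
    # 1. Retrieve the stored example whose source side is most similar to the input.
    src, tgt = max(
        EXAMPLES,
        key=lambda ex: SequenceMatcher(None, ex[0].split(), words).ratio(),
    )
    src_words = src.split()
    # 2. Locate the parts where the input differs from the retrieved example.
    matcher = SequenceMatcher(None, src_words, words)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            old = " ".join(src_words[i1:i2])
            new = " ".join(words[j1:j2])
            # 3. Patch the stored translation when both fragments are known.
            if old in FRAGMENTS and new in FRAGMENTS:
                tgt = tgt.replace(FRAGMENTS[old], FRAGMENTS[new])
    return tgt

print(translate_ebmt("I want a salad"))  # prints: quiero una ensalada
```

The patch works here only because "cerveza" and "ensalada" happen to share the same gender. Swap in a masculine noun and the article "una" would be wrong, and when a fragment is missing from the table the function simply returns the old translation unchanged. Both cases need a human to catch them.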

Hybrid Machine Translation (HMT)

Just as its name suggests, this approach combines some of the previously mentioned techniques. Hybrid machine translation ties together rule-based and statistical machine translation: translations are performed by a rule-based engine, followed by a statistical step that attempts to adjust and/or correct the rule engine’s output.
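The two-stage pipeline can be sketched as follows. It assumes a hypothetical rule engine that only does dictionary lookup and a hypothetical log of corrections that human post-editors made to its output; the "statistical" step simply applies the correction seen most often.

```python
from collections import Counter

# Hypothetical rule engine data: dictionary lookup that keeps the source word order.
RULE_LEXICON = {"the": "el", "red": "rojo", "car": "coche"}

# Hypothetical post-editing log: (phrase the rule engine produced, human correction).
POST_EDITS = [
    ("rojo coche", "coche rojo"),
    ("rojo coche", "coche rojo"),
    ("rojo coche", "coche colorado"),
]

def rule_based_draft(sentence):
    """Stage 1: produce a draft with the rule engine."""
    return " ".join(RULE_LEXICON.get(w, w) for w in sentence.lower().split())

def learn_corrections(post_edits):
    """For each draft phrase, keep the correction that was observed most often."""
    counts = {}
    for draft, fixed in post_edits:
        counts.setdefault(draft, Counter())[fixed] += 1
    return {draft: c.most_common(1)[0][0] for draft, c in counts.items()}

def translate_hybrid(sentence):
    """Stage 2: apply the learned corrections to the rule engine's draft."""
    draft = rule_based_draft(sentence)
    for phrase, fixed in learn_corrections(POST_EDITS).items():
        draft = draft.replace(phrase, fixed)
    return draft

print(rule_based_draft("The red car"))   # prints: el rojo coche
print(translate_hybrid("The red car"))   # prints: el coche rojo
```

The rules supply a predictable draft and the statistics smooth over what the rules miss, but the correction data still has to come from human translators in the first place.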

Neural Machine Translation (NMT)

Obviously, it has something to do with those “neural networks” we’re always hearing about, but we’re not going to torture you with unnecessary details. Not today, at least.

This type of machine translation is based on deep learning (artificial intelligence) and it has made rapid progress in recent years. As we said above, Google has announced that its translation services now use this technology, abandoning the previously used statistical approach.

Stay tuned for an in-depth overview of Computer-Assisted Translation (CAT) coming very soon.
