Something I have not blogged much about to date is machine translation and its use within a subtitling context. Having read about a project titled SUMAT, I was lucky enough to be able to ask Yota Georgakopoulou some questions on this topic:
Q1: What does SUMAT stand for? (Is it an acronym?)
Yes, it stands for SUbtitling by MAchine Translation.
Q2: How is SUMAT funded and what industries/companies are involved?
SUMAT is funded by the European Commission through Grant Agreement nº 270919 of the funding scheme ICT CIP-PSP – Theme 6, Multilingual Online Services.
There are a total of nine legal entities involved in the project. Four of them are subtitling companies, four are technical centres in charge of building the MT systems we are using in the project, and the ninth is responsible for integrating all systems in an online interface through which the service will be offered.
Q3: Can you give us a little bit of information on your background and what your involvement in SUMAT has been to date?
I have been working in translation and subtitling ever since I was a BA student in the early 90’s. I was working in the UK as a translator/subtitler, teaching and studying for a PhD in subtitling at the time of the DVD ‘revolution’, with all the changes it brought to the subtitling industry. This was when I was asked to join the European Captioning Institute (ECI), to set up the company’s translation department that would handle multi-language subtitling in approximately 40 languages for the DVD releases of major Hollywood studios. That’s how my career in the industry began. It was a very exciting time, as the industry was undergoing major changes, much like what is happening today.
Due to my background in translation, I was always interested in machine translation and was closely following all attempts to bring it to the subtitling world. At the same time, I was looking for a cost-effective way to make use of ECI’s valuable archive of parallel subtitle files in 40+ languages, and the opportunity came up with the SUMAT consortium. ECI has since been acquired by Deluxe, who saw the value of the SUMAT project and brought further resources to it. Our involvement in the project has been that of data providers, evaluators and end users.
Q4: Machine Translation (MT) already has some history of being used to translate traditional text. Why has machine translation not been put to use for translating subtitles?
Actually, it has. There have been at least two other European projects which have attempted to use machine translation as part of a workflow that was meant to automate the subtitling process: MUSA (2002-2004) and eTITLE (2004-2006). Unfortunately, these projects were not commercialized in the end. Part of the reason for this is likely to be that the MT output was not of good enough quality for a commercial setting. This is not surprising: the quality of MT output depends heavily on the data the systems are trained with, and professional quality parallel subtitle data are typically the property of subtitling companies and their clients. The SUMAT consortium invested a large amount of effort at the beginning of the project harvesting millions of professional parallel subtitles from the archives of partner subtitling companies, then cleaning and otherwise processing them for the training of the Statistical Machine Translation (SMT) systems our Research and Technical Development (RTD) partners have built as part of the project.
Q5: Some readers might be concerned that a machine could never replace the accuracy of a human subtitler translating material. What is your response to that concern?
Well, actually, I also believe that a machine will never replace a subtitler – at least not in my lifetime. MT is not meant to replace humans, it is simply meant to be another tool at their disposal. Even if machines were so smart that they could translate between natural languages perfectly, the source text in the case of film is the video as a whole, not just the dialogue. The machine will only ‘see’ the dialogue as source file input, with no contextual information, and will translate just that. Would a human be able to produce great subtitles simply by translating from script without ever watching the film? Of course not. Subtitling is a lot more complex than that. So why would anyone expect that an MT system could be able to do this? I haven’t heard anyone claiming this, so I am continuously surprised to see this coming up as a topic for discussion. I think some translators are so afraid of technology, because they think it will take their jobs away or make their lives hard because they will have to learn how to use it, that they are missing the point altogether: MT is not there to do their job, it is there to help them do their job faster!
Q6: Is the technology behind SUMAT similar to that used by YouTube for its ‘automated subtitles’?
Yes, in a way. YouTube also uses SMT technology to translate subtitles. However, the data YouTube’s SMT engines have been trained with is different. It is not professional quality subtitle data, but vast amounts of amateur quality subtitle data found on the internet, coupled with even larger amounts of any type of parallel text data found on the web and utilized by Google Translate. Also, one should bear in mind that many ‘issues’ found in YouTube subtitles, such as poor subtitle segmentation, are a result of the input text, which in some cases is an automatic transcription of the source audio. Thus, errors in these transcriptions (including segmentation of text in subtitle format) are propagated in the ‘automatic subtitles’ provided by YouTube.
SUMAT also uses SMT engines built with the Moses toolkit. This is an open source toolkit that has been developed as part of another EU-funded project. In SUMAT, the SMT engines have been trained with professional quality subtitle data in the 14 language pairs we deal with in the project, and supplemented with other freely available data. Various techniques have been used to improve the core SMT systems (e.g. refined data selection, translation model combination, etc.), with the aim of ironing out translation problems and improving the quality of the MT output. Furthermore, the MT output of SUMAT has been evaluated by professional subtitlers. Human evaluation is the most costly and time-consuming part of any MT project, and this is why SUMAT is so special: we are dedicating almost an entire year to such human evaluation. We have already completed the 1st round of this evaluation, where we focused on the quality output of the system, and we have now moved on to the 2nd round which focuses on measuring the productivity gain that the system helps subtitlers achieve.
Q7: Why do you think machine translation is needed in the field of subtitling?
I work in the entertainment market, and there alone the work volumes in recent years have skyrocketed, while at the same time clients require subtitle service providers to deliver continuous improvement on turnaround times and cost reduction. The only way I see to meet current client needs is by introducing automation to speed up the work of subtitlers.
Aside from entertainment material, there is a huge amount of other audiovisual material that needs to be made accessible to speakers of other languages. We have witnessed the rise of crowdsourcing platforms for subtitling purposes in recent years specifically as a result of this. Alternative workflows involving MT could also be used in order to make such material accessible to all. In fact, there are other EU-funded projects, such as transLectures and EU-Bridge, which are trying to achieve this level of automation for material such as academic videolectures, meetings, telephone conversations, etc.
Q8: How do you control quality of the output if it is translated by a machine?
The answer is quite simple. The output is not meant to be published as is. It is meant to be post-edited by an experienced translator/subtitler (a post editor) in order for it to reach publishable quality. So nothing changes here: it is still a human who quality-checks the output.
However, we did go through an extensive evaluation round measuring MT quality in order to finalise the SMT systems to be used in the SUMAT online service, as explained below. The point of this evaluation was to measure MT quality, pinpoint recurrent and time-consuming errors and dedicate time and resources to improving the final system output quality-wise. Retraining cycles of MT systems and other measures to improve system accuracy should also be part of MT system maintenance after system deployment, so that new post-edited data can be used to benefit the system and to ensure that the quality of the system output continues to improve.
Q9: How do you intend to measure the quality/accuracy of SUMAT?
We have designed a lengthy evaluation process specifically to measure the quality and accuracy of SUMAT. The first round of this evaluation was focused on quality: we asked the professional translator/subtitlers who participated to rank MT output on a 1-5 scale (1 being incomprehensible MT output that cannot be used, and 5 being near perfect MT output that requires little to no post-editing effort), as well as annotate recurrent MT errors according to a typology we provided, and give us their opinion on the MT output and the post-editing experience itself. The results of this evaluation showed that over 50% of the MT subtitles were ranked as 4 or 5, meaning little post-editing effort is required for the translations to reach publishable quality.
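As a rough illustration of how a headline figure like the one above can be derived from 1-5 rankings, here is a minimal Python sketch. The function name, the threshold parameter and the sample ratings are all invented for this example; it is not the SUMAT evaluation tooling, and the numbers are not project data.

```python
def share_rated_at_least(ratings, threshold=4):
    """Return the fraction of subtitles ranked at or above `threshold`.

    `ratings` is a list of integer scores on the 1-5 scale, where 1 means
    unusable MT output and 5 means near perfect MT output.
    """
    if not ratings:
        raise ValueError("no ratings supplied")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    good = sum(1 for r in ratings if r >= threshold)
    return good / len(ratings)


# Invented sample: ten subtitles ranked by one evaluator (not SUMAT data).
sample_ratings = [5, 4, 2, 4, 3, 5, 1, 4, 3, 5]
share = share_rated_at_least(sample_ratings)
print(f"{share:.0%} of subtitles were ranked 4 or 5")
```

In practice one would probably compute this per language pair and per evaluator, since an overall average can hide large differences between them.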
At the second and final stage of evaluation that is currently under way, we are measuring the benefits of MT in a professional use case scenario, i.e. checking the quality of MT output indirectly, by assessing its usefulness. We will thus measure the productivity gain (or loss) achieved through post-editing MT output as opposed to translating subtitles from a template. We have also planned for a third scenario, whereby the MT output is filtered automatically to remove poor MT output, so that translators’ work is a combination of post-editing and translation from source. One of the recurrent comments translators made during the first round of evaluation was that it was frustrating to have to deal with poor MT output and that there was significant cognitive effort involved in deciding how to treat such output before actually proceeding with post-editing it. We concluded it was important to deal with such translator frustrations as they may have a negative impact on productivity and have designed our second round of experiments accordingly.
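The two ideas in this answer, productivity gain and automatic filtering of poor MT output, can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the interview does not say how SUMAT measures throughput or how its filter scores the output, so the subtitles-per-hour rates, the 0-1 quality score and the 0.5 cut-off are hypothetical.

```python
def productivity_gain(baseline_rate, post_editing_rate):
    """Relative gain (or loss, if negative) of post-editing over translating
    from a template. Both rates are in subtitles per hour (assumed unit)."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (post_editing_rate - baseline_rate) / baseline_rate


def split_by_quality(mt_subtitles, min_score=0.5):
    """Route each subtitle to post-editing or to translation from source.

    `mt_subtitles` is a list of (mt_text, score) pairs, where `score` is a
    hypothetical automatic quality estimate between 0 and 1.
    Returns two lists of subtitle indices: (to_post_edit, to_translate).
    """
    to_post_edit, to_translate = [], []
    for index, (_text, score) in enumerate(mt_subtitles):
        if score >= min_score:
            to_post_edit.append(index)
        else:
            to_translate.append(index)
    return to_post_edit, to_translate


# Invented figures: 100 subtitles/hour from a template, 125 when post-editing.
print(f"gain: {productivity_gain(100, 125):+.0%}")

# Invented MT output with made-up quality scores.
sample = [("Where are you going?", 0.9), ("He the door open did.", 0.2)]
post_edit, translate = split_by_quality(sample)
print("post-edit:", post_edit, "translate from source:", translate)
```

The point of the filter is the one translators raised in the first round: a subtitle that falls below the cut-off is never shown as MT, so nobody has to spend effort deciding whether it is worth fixing.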
Q10: Are there any examples of translated subtitles created by SUMAT?
Yes, the SUMAT demo is live and can be found on the project website (www.sumat-project.eu). Users can upload subtitle files in various subtitle formats and they will be able to download a machine translated version of their file in the language(s) they have selected. We have decided to limit the number of subtitles that can be translated through the demo, so that people do not abuse it and try to use it for commercial purposes.
Q11: Does SUMAT have a role to play in Same Language Subtitles for Access? (Subtitles for the Deaf and HOH)
No. SUMAT is a service that offers automation when one needs to translate existing subtitles from one language to another and presupposes the existence of a source subtitle file as input.
Q12: You recently gave a workshop for SUMAT at the Media for All conference. Can you tell us a little bit about the results of the workshop?
The workshop at Media for All was the culmination of our dissemination efforts and the first time the SUMAT demo was shown to professionals (other than staff of the subtitling companies that are partners in this project). These professionals had the chance to upload their own subtitle files and download machine-translated versions thereof. There were approximately 30 participants at the workshop, who were first briefed on the background of the project, the way the MT systems were built and automatically evaluated, as well as on the progress of our current evaluation with professional translators.
In general, participants seemed impressed with the demo and the quality of the MT output. Representatives of European universities teaching subtitling to their students acknowledged that post-editing will have an important role to play in the future of the industry and were very interested in hearing our thoughts on it. We were also invited to give presentations on post-editing to their students, some of which have already been scheduled.
Q13: Where can readers go to find out more about this project?
The best source of information on the project is the project website: http://www.sumat-project.eu. We have recently re-designed it, making it easier to navigate. One can also access our live demo through it and will eventually be able to access the online service itself.
Q14: Is there anything readers can do if they wish to get involved in the project?
Although the project is almost complete, with less than half a year to go, contributions are more than welcome both until project end and beyond.
Once people have started using the live demo (or, later on, the service itself), any type of feedback would be beneficial to us, especially if specific examples of files, translations, etc. are mentioned. We plan to continue improving our systems’ output after the end of the project, as well as add more language pairs, depending on the data and resources we will have available. As we all know, professional human evaluation is time-consuming and costly, so we would love to hear from all translators that end up using the service – both about the good and the bad, but especially about the bad, so we can act on it!
Q15: If you could translate any subtitles of your choice using SUMAT, what would they be?
Obviously MT output is most useful to the translator when its accuracy is at its highest. From our evaluation of the SUMAT systems so far, we have noticed trends that indicate that scripted material is translated with higher accuracy than unscripted material. This is something that we are looking at in detail during the second round of evaluation that is now under way, but it is not surprising. MT fares better with shorter textual units that have a fairly straightforward syntax. If there are a great many disfluencies, as one typically finds in free speech, the machine may struggle with these, so I’m expecting our experiments to confirm this. I suppose we will need to wait until March 2014, when our SUMAT evaluation will be completed, before I can give you a definite answer to this question.
Thanks again to Yota for agreeing to the Q&A and for providing such informative answers.