Recent Updates Page 2

  • iheartsubtitles 11:57 am on September 19, 2014 Permalink | Reply

    MOOCs, Learning and Education 

    Please don’t think a lack of blog posts over the summer means a lack of interest in the subject of all things captioning and subtitling, far from it. In fact, in an attempt to improve my skills and knowledge, one of the things that I’ve been busy with is learning. I took my first steps into the world of MOOCs. In case you are unfamiliar with the term, it stands for Massive Open Online Courses. They are courses that exist online, and the majority consist of a combination of reading material and video lectures.

    So you can probably guess what I am going to comment on next. As a hard of hearing person, just how accessible was the video content? It goes without saying that a key factor in me choosing a MOOC was not just the subject matter but whether the video and audio content was subtitled or captioned in English. The two MOOCs I took were from FutureLearn and Coursera.*

    A screenshot of Coursera's Course At A Glance details hours of study, length of the course, language of the course, and language of subtitles that are available.

    Coursera – At A Glance section of the page detailing subtitle availability

    A screenshot of FutureLearn's FAQ webpage noting that subtitles are available

    FutureLearn’s FAQ includes information on the availability of subtitles


    I am happy to say that it was relatively easy for me to find out if content on their courses was subtitled. I particularly like Coursera’s clear layout and course summary on a course’s main page, which tells you if subtitles are available. You have to dig a little deeper to find the answer on FutureLearn’s website, but it is there in a detailed FAQ – Technology and Accessibility page. All of FutureLearn’s courses are subtitled in English; I am unsure if that is the case for Coursera.

    But…having established that the video content of the course itself is subtitled, why oh why, on both websites, is the introductory video not also subtitled? I have to rely only on the text description of the course to decide if it is the right one for me. This is the only opportunity you have to make me a ‘customer’ and commit to joining your course, so why are you leaving this video out? It’s clear time and effort has been put into recording and editing these videos – so for goodness’ sake make them accessible and add subtitles!

    So what was the quality of the subtitling of the course content like, I hear you ask? Varied, to be honest. Starting with the good: the errors that did occur in the subtitles for both MOOC courses were not frequent enough to stop me from understanding and completing assignments. The gravest example, where a word error actually changed the meaning of a sentence, came from Coursera: the phrase “Dublin Core” was subtitled as “Double Encore”, which was a horrible distraction when trying to understand a new topic that I had not studied before. When I pointed this out in the course forums, the staff explained it was likely due to an auto-captioning error and apologised for the mistake. They also fixed the error relatively quickly, allowing me to watch the video again two days later with much less confusion. Whilst it would have been better if the error was not there at all, the speed of the response meant I didn’t get left behind in my studies. On the FutureLearn course one video used an incorrect word. I have to admit that if it wasn’t for my own lip-reading skills I may not have realised this. When I posted a comment about it, it wasn’t the staff that responded but a very helpful fellow learner, who clarified the correct word for me.

    Now for the not so good. Anyone who is a professional subtitler or captioner will know the importance of chunking, character limits per line, and reading speeds. Assuming the guidelines for subtitling pre-recorded broadcast TV content also apply to pre-recorded educational MOOC videos (I don’t see why not, but please comment if you disagree), these rules were not adhered to. The question is, did it stop me learning? Honestly, no, it didn’t (online I can at least pause and rewind), but it did make retention and understanding harder. The user experience was not as good as it could have been, and not what I am used to. I would prefer that the level of quality I am used to seeing on broadcast TV and DVD be replicated for MOOC videos.
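To make the guideline talk a little more concrete, here is a minimal sketch of the kind of automated check a subtitling QA tool might run on a single cue. The limits used (37 characters per line, 17 characters per second) are illustrative values only, chosen for this example; actual broadcast guidelines vary by broadcaster and country.

```python
# Example limits only -- real subtitling guidelines differ by broadcaster.
MAX_CHARS_PER_LINE = 37
MAX_CHARS_PER_SECOND = 17.0

def check_cue(text, start_s, end_s):
    """Return a list of guideline violations for one subtitle cue."""
    problems = []
    # Check each displayed line against the per-line character limit.
    for line in text.splitlines():
        if len(line) > MAX_CHARS_PER_LINE:
            problems.append(f"line too long ({len(line)} chars): {line!r}")
    # Check reading speed: characters shown per second of display time.
    duration = end_s - start_s
    if duration <= 0:
        problems.append("non-positive duration")
    else:
        cps = len(text.replace("\n", " ")) / duration
        if cps > MAX_CHARS_PER_SECOND:
            problems.append(f"reading speed too high ({cps:.1f} chars/sec)")
    return problems

# Two short lines shown for two seconds pass; a long line flashed up
# for half a second fails both checks.
print(check_cue("Hello there.\nHow are you?", 0.0, 2.0))   # []
print(check_cue("This line is far too long to read comfortably", 0.0, 0.5))
```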

    Another issue, on both courses, is that the teacher would sometimes direct you to an external resource such as another website or a video not hosted by the MOOC platform itself. And here’s where the access falls down. On both FutureLearn and Coursera the external content contained videos that were not subtitled or captioned, so I was unable to benefit from it. It would be nice if the platforms only allowed external links when the content has been made accessible, but the decision to include such content is probably at the discretion of the teacher, not the MOOC platform. It’s exactly the same issue we currently see with VOD (Video on Demand) platforms: they may host the video, but it is generally accepted that the responsibility to provide captioning or subtitling lies with the content provider. Did this prevent me from learning and passing tests and assignments? Thankfully no, because on both courses the external content was an optional extra, but it remains the case that this situation does not equate to equal access to content. And that is most certainly a bad thing.

    Both MOOC courses that I took allowed students to download a transcript of all videos (Coursera also lets you download the subtitle file itself). This is a nice tool that everyone on the course can benefit from. And this brings me to one of the reasons I set up this blog – the belief that subtitles and closed captioning are not just a resource for deaf and hard of hearing communities; they are for everyone. Numerous studies over the last 20-30 years suggest that subtitles and closed captioning can help improve reading skills, literacy, and the retention of information. A few websites highlight this research; the most comprehensive are Captions For Literacy and Zane Education.
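Since the subtitle file itself can be downloaded, turning it into a plain transcript is straightforward. Here is a minimal sketch for the common SRT format (assuming well-formed cues; real-world files can be messier, and the format Coursera actually supplies may differ):

```python
import re

def srt_to_transcript(srt_text):
    """Collapse an SRT subtitle file into a plain-text transcript.

    Drops cue numbers and timecode lines, keeps the spoken text in order.
    """
    lines = []
    # SRT cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        for line in block.splitlines():
            line = line.strip()
            if not line:
                continue
            if line.isdigit():      # cue index line
                continue
            if "-->" in line:       # timecode line
                continue
            lines.append(line)
    return " ".join(lines)

sample = """\
1
00:00:01,000 --> 00:00:03,000
Welcome to the course.

2
00:00:03,500 --> 00:00:06,000
This week we introduce
Dublin Core metadata.
"""
print(srt_to_transcript(sample))
# Welcome to the course. This week we introduce Dublin Core metadata.
```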

    A photo of a captioned TV, the front cover of the National Captioning Institute - Guide for Using Captioned Television in the Teaching of Reading

    SOURCE: National Captioning Institute – Guide for Using Captioned Television in the Teaching of Reading (1987)

    Some of this research has been recognised, and there are resources for teachers in Australia via Cap That!, and in the USA via Read Captions Across America and Reading Rockets. In fact, as far back as 1987 the USA realised the benefits, and the National Captioning Institute published a guide for teachers.

    Does anyone know if there are, or have been, similar publications or resources for teachers in the UK? I have been unable to find anything, and given the level of subtitled coverage we now have on TV, it seems a missed opportunity for teachers not to use subtitles as a learning tool and encourage their use.

    Going back to MOOCs, the global nature of the internet means it is recognised that subtitles are needed: a course can be taken anywhere in the world, and a pupil might need to read subtitles in their own language, or use same-language subtitles to aid their understanding. Everyone stands to benefit from this. I really enjoyed the experience overall and will absolutely consider taking more subtitled MOOC courses in the future.

    I haven’t even mentioned the services of CART (Communication Access Realtime Translation) or STT/STTR (Speech To Text Reporting) as an educational tool yet. These services were not available to me as a student, but where they have been made available at talks, meetings, or events I have absolutely benefited: I am better at retaining the information being spoken simply because I can read every word. I look forward to more research and evidence in the area of real-time live subtitling/captioning access because, again, I think all learners could benefit from this, not just those who struggle to hear what is being said.

    What has your experience of using captioning or subtitling as an educational tool been?

    *other accessible MOOCs are available.

    • Claude Almansi 12:17 am on September 23, 2014 Permalink | Reply

      Great post, Dawn: thank you.

      About the “Double Encore” for “Dublin Core” error in a Coursera lecture that you mention: I think the instructor was mistaken in saying it was likely due to an auto-captioning error. Coursera used to inflict appalling voice-recognition-generated original subs (1) on volunteer translators when it was using a team, but at least volunteers were able to fix them – in the course videos as well – before translating them.

      But with their new crowdtranslating initiative called the Global Translator Community (GTC), they said, in a hangout for GTC volunteers:

      “…When they [Coursera’s university partners] request captioning, that goes to a company that we work with, that does human language-captioning of videos. So then people listen to the videos and actually, humans write out the words that are being spoken on the screens.
      Now, the people who are doing these captions, they are not subject-matter experts, so, for instance in the course on Machine Learning, you know, they’re probably going to get some words wrong, there are going to be grammatical mistakes and, you know, one of the challenges that I realize, that we certainly realize is a challenge, is that English transcripts are not perfect. We think that they’ve improved a lot, we’ve worked with this provider that we use to improve that. I don’t know if any, if actually some of you had been on the platform for a couple of years and saw the transcripts back in 2012, and maybe you can tell that they have gotten better — I hope so.” (2)

      Actually they haven’t, by a long shot: there might be fewer transcription errors than with the former auto-captions, though that’s arguable, but now, as the GTC uses Transifex, which is NOT a subtitling app, for translating the original subtitles, volunteers have no way to fix them anymore: hence the staple absurd splitting, frequent bad syncing, sometimes long unsubtitled parts, not to mention inane mentions of non verbal audio, like just [music] without describing it. So on June 6, Coursera staff started a Google spreadsheet, , where volunteers are meant to report these original subtitles issues via a form, so staff can respond to them. Problem: staff hasn’t responded to a single entry after June 16.

      About captioning for literacy: not UK but Indian: . Pity the video on the home page is uncaptioned, but the site offers many resources, theoretical and practical.

      As to my use of captioning in education: in a couple of really open online courses for Italian teachers organized by Andreas Formiconi (3), I deviously started captioning some videos then asked if other participants would like to join. Only a few did, but they got really interested, and some posted about it in their blogs.

      (1) See

      (2) From the transcript generated by the captions in

      (3) See his blog


      • iheartsubtitles 10:22 am on September 23, 2014 Permalink | Reply

        Hi Claude, thanks for commenting. Some very interesting background and links with regards to Coursera’s subtitling and captioning methods.


    • Arlene Mayerson 7:57 pm on September 29, 2014 Permalink | Reply

      I am a lawyer with the Disability Rights Education and Defense Fund who litigated the Netflix case. If anyone has trouble accessing MOOCs because of a lack of captions, please contact me at Thanks.


  • iheartsubtitles 12:25 pm on July 24, 2014 Permalink | Reply

    Invisible Subtitled Live Theatre – Trial in the UK 

    Giojax, the company using 3D technology to create invisible subtitles for use by cinemas have just announced that the same technology is to be trialled in the theatre.

    Originally set up as a crowd-funded business, the now private company with private investors is running a trial of the invisible subtitles technology to subtitle a musical in October this year.

    The principle is the same as in the cinema: audience members who wish to see the captions during the live performance can wear 3D glasses and view the subtitles via a box situated on the theatre stage. The subtitles will be in English and are aimed at providing access for the deaf and hard of hearing, so they should not be confused with the translation subtitles or surtitles you may have seen at opera performances.

    If you are interested in trying this technology out, the trial will take place on Saturday October 4th at the matinée performance at the Harlow Theatre for the Barry Manilow musical Copacabana:

    Her name was Lola, she was a showgirl… So begins this tale of romance and stardom that has captivated audiences in the West End, Atlantic City and on-screen across the US. With sensational original songs by Barry Manilow, dazzling costumes and fabulous choreography, it is a show that will leave you breathless. Featuring hits such as Dancin’ Fool, Who Needs To Dream, Aye Caramba, and of course the Grammy award-winning Copacabana, this is a show sure to have you humming the tunes all the way home. Harlow Playhouse is proud to present the premiere of Barry Manilow’s revised version of the original show for 2014.

    For more information on the musical and to purchase tickets visit the Harlow Playhouse website.

    For more information on 3D subtitles technology please visit the Giojax web page.

    And if anyone is wondering, the 3D Invisible Subtitles for cinemas project is still under way, testing took place earlier this year in Milton Keynes and the next stage is to finalise the software for the cinemas.

    • Mamtha 11:04 am on December 6, 2014 Permalink | Reply

      We are experienced in Video/Audio Transcription and subtitling, kindly give us opportunity to work as a vendor for your company.


  • iheartsubtitles 12:19 pm on June 27, 2014 Permalink | Reply

    CSI TV Accessibility Conference 2014 – Live subtitling, VOD key themes 

    Photo of CSI TV Accessibility Conference 2014 brochure

    CSI TV Accessibility Conference 2014

    Earlier this month the CSI TV Accessibility Conference 2014 took place in London. I had hoped to give a more detailed write-up with the help of the transcript from the live captioning that covered the event, but I’m afraid my own notes are all I have, so I will summarise the points I think will be of most interest to readers here. This does not cover all of the presentations, but it does cover the majority.

    i2 Media Research gave some statistics surrounding UK TV viewing and the opportunities that exist in TV accessibility. Firstly, TV viewing is higher in the older and disabled population. And with an ageing UK population the audience requiring accessibility features for TV is only going to increase.

    Andrew Lambourne, Business Director for Screen Subtitling Systems, had an interesting title for his presentation: “What if subtitles were part of the programme?” Drawing on his years of working in the subtitling industry, he questioned why we are still asking the same questions year after year. Those questions surround the measurement of subtitling quality, and whether there is any incentive to provide great subtitling coverage for children. He pointed out that, in his opinion, funding issues are still not addressed: subtitling is still not part of the production process and is not often budgeted for. Broadcasters are required to pay subtitling companies, and subtitling costs are under continued pressure (presumably to provide more for less money). It is a sad fact that subtitling is not ascribed the value it deserves. With regards to live subtitling, there is a need to educate the public as to why errors occur. This was a repeated theme in a later presentation from Deluxe Media, and it is one of the reasons I wrote the #subtitlefail! TV page on this blog.

    Peter Bourton, head of TV Content Policy at Ofcom gave an update and summary of the subtitling quality report which was recently published at the end of April. This is a continuing process and I’m looking forward to comparing the next report to this first one to see what changes and comparisons can be made. The presentation slides are available online.

    Senior BBC R&D Engineer Mike Armstrong gave a presentation on his research into measuring live subtitling quality. (This differs from the quantitative approach developed by Pablo Romero and adopted by Ofcom for its reports.) What I found most interesting about this research is that a subtitle user’s perception of quality is quite different depending on whether the audio is switched on whilst watching the subtitled content. Ultimately nearly everyone watches TV with the audio switched on, and this research found that delay has a bigger impact on the perception of quality than errors do. The BBC R&D white paper is available online.

    Live subtitling continued to be a talking point at the conference with a panel discussion titled: Improving subtitling. On the panel were Gareth Ford-Williams (BBC Future Media), Vanessa Furey (Action On Hearing Loss), Andrew Lambourne (Screen Subtitling Systems), and David Padmore (Red Bee Media). All panelists were encouraged that all parties – regulators, broadcasters, and technology researchers – are working together to continually address subtitling issues. Developments in the speech recognition technology used to produce live subtitles have moved towards language modelling to better understand context. The next generation of speech recognition tools such as Dragon has moved to phrase-by-phrase rather than word-by-word recognition (the hope being that this should reduce error rates). There was also positivity that there is now greater interest in speech technology, which should lead to greater advancements over the coming years compared to the speed of improvements in the past.

    With regards to accessibility and Video on Demand (VOD) services, it was the turn of the UK’s Authority for Television On Demand (ATVOD) regulatory body to present. For those that are unaware, ATVOD regulates all VOD services operating in the UK except for BBC iPlayer, which is regulated by Ofcom. In addition, because iTunes and Netflix operate from Luxembourg, their services, although available in the UK, are outside ATVOD’s jurisdiction. There are no UK regulatory rules that say VOD providers must provide access services, but ATVOD has an access services working party group that encourages providers to do so, as well as drafting best practice guidelines. I cannot find anywhere on their website the results of a December 2013 survey, mentioned in the presentation, looking at the statistics of how much VOD content is subtitled, signed, or audio described. If anyone else finds it please comment below. In the meantime, some of the statistics from this report can be found in Pete Johnson’s presentation slides online. What has changed since 2012 is that this survey is now compulsory for providers to complete, to ensure the statistics accurately reflect the provision. Another repeated theme, first mentioned in this presentation, is the complexity of the VOD distribution chain. It is very different for different companies, and the increasing number of devices on which we can access our content adds to the complexity. One of the key differences between VOD providers is end-to-end control: few companies control the entire process, from purchasing and/or creating content right through to delivering it to the device it is watched on. Who, then, is responsible for changing or adapting a workflow to support accessible features, and who is going to pay for it?

    I should also mention the success of a recent campaign by hard of hearing subtitling advocates in getting Amazon to finally respond and say that they will start subtitling content, which was mentioned positively during this presentation. You may have read my previous blog post discussing my disappointment at the lack of response. Since then, with the help of comedian Mark Thomas, who set up a stunt that involved putting posters up on the windows of Amazon UK’s headquarters to drive the message home, Amazon have committed to adding subtitles to their VOD service later this year. See the video below for the stunt. It is not subtitled, but there is no dialogue, just a music track.

    You can read more about this successful advocacy work on Limping Chicken’s blog.

    Susie Buckridge, Director of Product for YouView gave a presentation on the accessibility features of the product which are pretty impressive. Much of the focus was on access features for the visually impaired. She reminded the audience that creating an accessible platform actually creates a better user experience for everyone. You can view the presentation slides online.

    Deluxe Media Europe gave a presentation that I think would be really useful for audiences outside of those working in the industry. Stuart Campbell, Senior Live Operations Manager, and Margaret Lazenby, Head of Media Access Services, presented clear examples and explanations of the workflow involved in creating live subtitles via respeaking for live television. Given the lack of understanding of, or coverage in, mainstream media, this kind of information is greatly needed – a point also highlighted by the presenters. The presentation is not currently available online, but you can find information about live subtitling processes on this blog’s #SubtitleFail TV page.

    A later panel discussed VOD accessibility. The panelists acknowledged that consumers’ expectations are increasing, as are the volume of content and the scale of complexity. It is hoped that the agreed common subtitle file format, EBU-TT, will resolve a lot of these issues. This format was still being worked on when it was discussed at the 2012 conference, which you can read about on this blog. Earlier this year the UK DPP also published updated common standard subtitle guidelines.
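For the curious, EBU-TT is an XML format built on the W3C TTML (Timed Text Markup Language) vocabulary. As an illustration only – this is not a conformant EBU-TT document, which carries additional EBU-defined metadata, styling and timing attributes from the specification – a minimal timed-text fragment can be generated like this:

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only: the TTML namespace that EBU-TT builds on.
TT_NS = "http://www.w3.org/ns/ttml"
ET.register_namespace("tt", TT_NS)

# One subtitle paragraph with begin/end display times.
tt = ET.Element(f"{{{TT_NS}}}tt", {"xml:lang": "en"})
body = ET.SubElement(tt, f"{{{TT_NS}}}body")
div = ET.SubElement(body, f"{{{TT_NS}}}div")
p = ET.SubElement(div, f"{{{TT_NS}}}p",
                  {"begin": "00:00:01.000", "end": "00:00:03.000"})
p.text = "Welcome to the programme."

xml_doc = ET.tostring(tt, encoding="unicode")
print(xml_doc)
```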

    Were any of my readers at the conference? What did you think? And please do comment if you think I have missed anything important to highlight.

    • peterprovins 4:48 pm on July 21, 2014 Permalink | Reply

      Interesting blog. No excuse for TV, film, websites or even theatre not to be captioned… we do it all. Currently captioning university lectures and looking at doctors’ surgeries, which are currently limited to BSL only. Keep up the good work.


  • iheartsubtitles 2:24 pm on April 28, 2014 Permalink | Reply

    Subtitling and Captioning Campaigns 

    Spring has arrived, and with it seems to be some new campaigns on the horizon relating to subtitling advocacy. It seems more people are getting frustrated at the lack of captioning or subtitling that is available to them or at least have become more vocal about it online in an attempt to create change.

    Just before Christmas last year, @sjmcdermid and @lovesubtitles spearheaded a petition aimed at getting Amazon to provide subtitling information and subtitles on their video on demand services. Formerly called LOVEFiLM, the company has since rebranded to Amazon Prime Instant Video, presumably to match the brand in other countries. The petition gained a huge number of signatures, but what is truly disappointing is the lack of response from Amazon themselves.

    Another campaigner has set up a different petition, this one aimed in the direction of BSKYB requesting subtitles be added to their video on demand offerings.

    UPDATE: And another subtitling campaigner based in the UK is @whatshesay76 who has just launched a website.

    It seems TV viewers in Ireland are not happy with the levels of subtitling offered to them by RTE, TV3, Setanta, TnaG and TG4. @SubtitleIreland have set up a Facebook page publishing responses to their enquiries as to why more content is not subtitled.

    UPDATE: Robyn got in touch to alert me to a captioning campaign in New Zealand aimed at increasing the volume of broadcast output that is subtitled. The campaign, called Caption It NZ, has a blog and Facebook page. You can also follow them on Twitter @captionitNZ

    Not to be outdone by individuals’ efforts to advocate, there has also been some activity with corporate-backed campaigns.

    Firstly, the crowd-sourced subtitled content platform @Viki has teamed up with @MarleeMatlin to launch the #billionwords campaign, advocating for more subtitling globally in more languages.

    And @121Captions is behind the @CaptionEverything campaign which has just recently launched.

    Have I missed any? Are there others you know about? Comment below and I can update this blog post to include them.

    Finally, it’s worth noting that wherever you are based in the world, the CCAC has for several years helped run and/or contributed to advocacy campaigns to get captioning in all sorts of scenarios – in schools, at work, online, at church, and so on. Their most recent campaign surrounds the US election and the captioning of election campaigns, but their members and participants consist of both users of captions and providers of captioning services spanning the globe.

    • MM 6:13 pm on April 28, 2014 Permalink | Reply

      We cannot mount an honest captioning campaign whilst the sign user has a legal opt-out to them. We have to show we are willing too.


    • messagesfromouterspace 1:11 am on April 30, 2014 Permalink | Reply

      Yes – you’ve missed the one in NZ: #CaptionitNZ. We only have 23% captioning on our television, none on the internet, and only a small percentage of DVDs. No wonder people turn to pirating.

      Our blog is We have a Facebook page

      Please join in and help – we need all we can get 🙂


    • Tina 9:46 am on May 3, 2014 Permalink | Reply

      Hi Dawn, Thanks for the mention! Would you like to help support the Caption Everything campaign? I believe an email came in from you but I can’t find it. Please contact me or help yourself to the logo.


  • iheartsubtitles 12:59 pm on January 2, 2014 Permalink | Reply

    2013 in review: i heart subtitles blog 

    The stats helper monkeys prepared a 2013 annual report for this blog.

    Here’s an excerpt:

    The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 19,000 times in 2013. If it were a concert at Sydney Opera House, it would take about 7 sold-out performances for that many people to see it.

    Click here to see the complete report.

  • iheartsubtitles 5:30 pm on December 13, 2013 Permalink | Reply

    The Killing – Christmas Jumper Day (with subtitles) #xmasjumperday 

    Today is Christmas Jumper Day in the UK, a charity campaign from Save the Children. Check out this clip of the popular subtitled Danish series The Killing, here with its own ‘translation’ subtitles:

    Hope it made you smile! Maybe not if you understand the original language, but I love the idea for the campaign.

    • Funny Video 8:00 pm on January 5, 2014 Permalink | Reply

      I tried to stop but this stuff makes me laugh over and over again. Its really very funny


      • Ripley Trout 7:07 pm on March 1, 2014 Permalink | Reply

        Nice one. Particularly liked the line “I look like a loser while you ponce about like a Christmas tree”. All so much better for the seriousness of the original show.


  • iheartsubtitles 7:31 pm on December 1, 2013 Permalink | Reply

    Information Safety Video – Virgin America #VXsafetydance 

    Thanks to @mxincredible for alerting me to this. Think safety videos for air flights are boring? Take a look at this one from Virgin America with some great visuals used in the subtitles:

  • iheartsubtitles 4:19 pm on November 8, 2013 Permalink | Reply

    Machine Translation & Subtitles – Q&A with Yota Georgakopoulou 

    Something I have not blogged much about to date is machine translation and its use in a subtitling context. Having read about a project titled SUMAT, I was lucky enough to put some questions on this topic to Yota Georgakopoulou:

    Q1: What does SUMAT stand for? (is it an Acronym?)

    Yes, it stands for SUbtitling by MAchine Translation.

    Q2: How is SUMAT funded and what industries/companies are involved?

    SUMAT is funded by the European Commission through Grant Agreement nº 270919 of the funding scheme ICT CIP-PSP – Theme 6, Multilingual Online Services.

    There are a total of nine legal entities involved in the project. Four of them are subtitling companies, four are technical centres in charge of building the MT systems we are using in the project, and the ninth is responsible for integrating all systems in an online interface through which the service will be offered.

    Q3: Can you give us a little bit of information on your background and what your involvement in SUMAT has been to date?

    I have been working in translation and subtitling ever since I was a BA student in the early 90’s. I was working in the UK as a translator/subtitler, teaching and studying for a PhD in subtitling at the time of the DVD ‘revolution’, with all the changes it brought to the subtitling industry. This was when I was asked to join the European Captioning Institute (ECI), to set up the company’s translation department that would handle multi-language subtitling in approximately 40 languages for the DVD releases of major Hollywood studios. That’s how my career in the industry began. It was a very exciting time, as the industry was undergoing major changes, much like what is happening today.

    Due to my background in translation, I was always interested in machine translation and was closely following all attempts to bring it to the subtitling world. At the same time, I was looking for a cost-effective way to make use of ECI’s valuable archive of parallel subtitle files in 40+ languages, and the opportunity came up with the SUMAT consortium. ECI has since been acquired by Deluxe, who saw the value of the SUMAT project and brought further resources to it. Our involvement in the project has been that of data providers, evaluators and end users.

    Q4: Machine Translation (MT) already has some history of being used to translate traditional text. Why has machine translation not been put to use for translating subtitles?

    Actually, it has. There have been at least two other European projects which have attempted to use machine translation as part of a workflow that was meant to automate the subtitling process: MUSA (2002-2004) and eTITLE (2004-2006). Unfortunately, these projects were not commercialized in the end. Part of the reason for this is likely to be that the MT output was not of good enough quality for a commercial setting. As professional quality parallel subtitle data are typically the property of subtitling companies and their clients, this is not surprising. The SUMAT consortium invested a large amount of effort at the beginning of the project harvesting millions of professional parallel subtitles from the archives of partner subtitling companies, then cleaning and otherwise processing them for the training of the Statistical Machine Translation (SMT) systems our Research and Technical Development (RTD) partners have built as part of the project.
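The harvesting step described above implies aligning cues across parallel subtitle files to produce sentence pairs for SMT training. As a toy illustration only – the real project's cleaning and alignment are far more involved, and this simple time-overlap heuristic is my own assumption, not SUMAT's method – the core idea might be sketched like this:

```python
def overlap(a, b):
    """Seconds of overlap between two (start, end) intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def align_cues(src, tgt, min_overlap=0.5):
    """Pair source/target cues whose display times mostly coincide.

    src, tgt: lists of (start_s, end_s, text) cues for the same video
    in two languages. Returns (src_text, tgt_text) training pairs.
    """
    pairs = []
    for s_start, s_end, s_text in src:
        # Pick the target cue with the greatest time overlap.
        best = max(tgt, key=lambda t: overlap((s_start, s_end), (t[0], t[1])))
        if overlap((s_start, s_end), (best[0], best[1])) >= min_overlap:
            pairs.append((s_text, best[2]))
    return pairs

english = [(1.0, 3.0, "Good morning."), (4.0, 6.5, "How are you?")]
french  = [(1.1, 3.1, "Bonjour."), (4.0, 6.4, "Comment allez-vous ?")]
print(align_cues(english, french))
# [('Good morning.', 'Bonjour.'), ('How are you?', 'Comment allez-vous ?')]
```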

    Q5: Some readers might be concerned that a machine could never replace the accuracy of a human subtitler translating material. What is your response to that concern?

    Well, actually, I also believe that a machine will never replace a subtitler – at least not in my lifetime. MT is not meant to replace humans; it is simply meant to be another tool at their disposal. Even if machines were so smart that they could translate between natural languages perfectly, the source text in the case of film is the video as a whole, not just the dialogue. The machine will only ‘see’ the dialogue as source file input, with no contextual information, and will translate just that. Would a human be able to produce great subtitles simply by translating from a script without ever watching the film? Of course not. Subtitling is a lot more complex than that. So why would anyone expect an MT system to be able to do this? I haven’t heard anyone claiming it can, so I am continuously surprised to see this coming up as a topic for discussion. I think some translators are so afraid of technology – because they think it will take their jobs away, or make their lives hard because they will have to learn how to use it – that they are missing the point altogether: MT is not there to do their job; it is there to help them do their job faster!

    Q6: Is the technology behind SUMAT similar to that used by YouTube for its ‘automated subtitles’?

    Yes, in a way. YouTube also uses SMT technology to translate subtitles. However, the data YouTube’s SMT engines have been trained with is different. It is not professional quality subtitle data, but vast amounts of amateur quality subtitle data found on the internet, coupled with even larger amounts of any type of parallel text data found on the web and utilized by Google Translate. Also, one should bear in mind that many ‘issues’ found in YouTube subtitles, such as poor subtitle segmentation, are a result of the input text, which in some cases is an automatic transcription of the source audio. Thus, errors in these transcriptions (including segmentation of text in subtitle format) are propagated in the ‘automatic subtitles’ provided by YouTube.

    SUMAT also uses SMT engines built with the Moses toolkit. This is an open source toolkit that has been developed as part of another EU-funded project. In SUMAT, the SMT engines have been trained with professional quality subtitle data in the 14 language pairs we deal with in the project, and supplemented with other freely available data. Various techniques have been used to improve the core SMT systems (e.g. refined data selection, translation model combination, etc.), with the aim of ironing out translation problems and improving the quality of the MT output. Furthermore, the MT output of SUMAT has been evaluated by professional subtitlers. Human evaluation is the most costly and time-consuming part of any MT project, and this is why SUMAT is so special: we are dedicating almost an entire year to such human evaluation. We have already completed the first round of this evaluation, where we focused on the quality of the system output, and we have now moved on to the second round, which focuses on measuring the productivity gain that the system helps subtitlers achieve.

    Q7: Why do you think machine translation is needed in the field of subtitling?

    I work in the entertainment market, and there alone the work volumes in recent years have skyrocketed, while at the same time clients require subtitle service providers to deliver continuous improvement on turnaround times and cost reduction. The only way I see to meet current client needs is by introducing automation to speed up the work of subtitlers.

    Aside from entertainment material, there is a huge amount of other audiovisual material that needs to be made accessible to speakers of other languages. We have witnessed the rise of crowdsourcing platforms for subtitling purposes in recent years specifically as a result of this. Alternative workflows involving MT could also be used in order to make such material accessible to all. In fact, there are other EU-funded projects, such as transLectures and EU-Bridge, which are trying to achieve this level of automation for material such as academic videolectures, meetings, telephone conversations, etc.

    Q8: How do you control quality of the output if it is translated by a machine?

    The answer is quite simple. The output is not meant to be published as is. It is meant to be post-edited by an experienced translator/subtitler (a post editor) in order for it to reach publishable quality. So nothing changes here: it is still a human who quality-checks the output.

    However, we did go through an extensive evaluation round measuring MT quality in order to finalise the SMT systems to be used in the SUMAT online service, as explained below. The point of this evaluation was to measure MT quality, pinpoint recurrent and time-consuming errors and dedicate time and resources to improving the final system output quality-wise. Retraining cycles of MT systems and other measures to improve system accuracy should also be part of MT system maintenance after system deployment, so that new post-edited data can be used to benefit the system and to ensure that the quality of the system output continues to improve.

    Q9: How do you intend to measure the quality/accuracy of SUMAT?

    We have designed a lengthy evaluation process specifically to measure the quality and accuracy of SUMAT. The first round of this evaluation was focused on quality: we asked the professional translator/subtitlers who participated to rank MT output on a 1-5 scale (1 being incomprehensible MT output that cannot be used, and 5 being near perfect MT output that requires little to no post-editing effort), as well as annotate recurrent MT errors according to a typology we provided, and give us their opinion on the MT output and the post-editing experience itself. The results of this evaluation showed that over 50% of the MT subtitles were ranked as 4 or 5, meaning little post-editing effort is required for the translations to reach publishable quality.
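    As a hedged illustration of how that headline figure is computed (the 1–5 scale and the 4-or-above threshold follow the description above, but the function name and the ratings themselves are invented, not project data):

    ```python
    def share_little_postediting(ratings, threshold=4):
        """Fraction of MT subtitles ranked at or above `threshold`
        on the 1-5 quality scale used in the evaluation."""
        good = sum(1 for r in ratings if r >= threshold)
        return good / len(ratings)

    # Invented sample ratings for ten evaluated subtitles.
    ratings = [5, 4, 3, 4, 2, 5, 4, 1, 5, 4]
    print(f"{share_little_postediting(ratings):.0%}")  # prints 70%
    ```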

    At the second and final stage of evaluation that is currently under way, we are measuring the benefits of MT in a professional use case scenario, i.e. checking the quality of MT output indirectly, by assessing its usefulness. We will thus measure the productivity gain (or loss) achieved through post-editing MT output as opposed to translating subtitles from a template. We have also planned for a third scenario, whereby the MT output is filtered automatically to remove poor MT output, so that translators’ work is a combination of post-editing and translation from source. One of the recurrent comments translators made during the first round of evaluation was that it was frustrating to have to deal with poor MT output and that there was significant cognitive effort involved in deciding how to treat such output before actually proceeding with post-editing it. We concluded it was important to deal with such translator frustrations as they may have a negative impact on productivity and have designed our second round of experiments accordingly.
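    A minimal sketch of the productivity-gain measure described above (the exact formula and the throughput figures are my own illustration of the idea, not the project’s actual metric):

    ```python
    def productivity_gain(subs_per_hour_post_editing, subs_per_hour_from_template):
        """Relative productivity change of post-editing MT output versus
        translating subtitles from a template; positive means faster."""
        return ((subs_per_hour_post_editing - subs_per_hour_from_template)
                / subs_per_hour_from_template)

    # Invented throughput figures for one translator.
    gain = productivity_gain(150.0, 120.0)
    print(f"{gain:+.0%}")  # prints +25%
    ```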

    Q10: Are there any examples of subtitles translated by SUMAT?

    Yes, the SUMAT demo is live and can be found on the project website. Users can upload subtitle files in various subtitle formats and download a machine-translated version of their file in the language(s) they have selected. We have decided to limit the number of subtitles that can be translated through the demo, so that people do not abuse it or try to use it for commercial purposes.

    Q11: Does SUMAT have a role to play in Same Language Subtitles for Access? (Subtitles for the Deaf and HOH)

    No. SUMAT is a service that offers automation when one needs to translate existing subtitles from one language to another and presupposes the existence of a source subtitle file as input.

    Q12: You recently gave a workshop for SUMAT at the Media For All conference, can you tell us a little bit about the results of the workshop?

    The workshop at Media for All was the culmination of our dissemination efforts and the first time the SUMAT demo was shown to professionals (other than staff of the subtitling companies that are partners in this project). These professionals had the chance to upload their own subtitle files and download machine-translated versions thereof. There were approximately 30 participants at the workshop, who were first briefed on the background of the project, the way the MT systems were built and automatically evaluated, as well as on the progress of our current evaluation with professional translators.

    In general, participants seemed impressed with the demo and the quality of the MT output. Representatives of European universities teaching subtitling to their students acknowledged that post-editing will have an important role to play in the future of the industry and were very interested in hearing our thoughts on it. We were also invited to give presentations on post-editing to their students, some of which have already been scheduled.

    Q13: Where can readers go to find out more about this project?

    The best source of information on the project is the project website. We have recently re-designed it, making it easier to navigate. One can also access our live demo through it and will eventually be able to access the online service itself.

    Q14: Is there anything readers can do if they wish to get involved in the project?

    Although the project is almost complete, with less than half a year to go, contributions are more than welcome both until project end and beyond.

    Once people have started using the live demo (or, later on, the service itself), any type of feedback would be beneficial to us, especially if specific examples of files, translations, etc. are mentioned. We plan to continue improving our systems’ output after the end of the project, as well as add more language pairs, depending on the data and resources we will have available. As we all know, professional human evaluation is time-consuming and costly, so we would love to hear from all translators that end up using the service – both about the good and the bad, but especially about the bad, so we can act on it!

    Q15: If you could translate any subtitles of your choice using SUMAT, what would they be?

    Obviously MT output is most useful to the translator when its accuracy is at its highest. From our evaluation of the SUMAT systems so far, we have noticed trends indicating that scripted material is translated with higher accuracy than unscripted material. This is something we are looking at in detail during the second round of evaluations now under way, but it is not surprising. MT fares better with shorter textual units that have fairly straightforward syntax. If there are a lot of disfluencies, as one typically finds in free speech, the machine may struggle with them, so I am expecting our experiments to confirm this. I suppose we will need to wait until March 2014, when our SUMAT evaluation will be completed, before I can give you a definite answer to this question.

    Thanks again to Yota for agreeing to the Q&A and for providing such informative answers.

    • Patricia Falls 2:00 pm on May 13, 2014 Permalink | Reply

      I train people on a steno machine to do real-time translation. I would like to discuss our product with you and how we can become involved in training.


    • iheartsubtitles 2:10 pm on May 13, 2014 Permalink | Reply

      Hi Patricia, the SUMAT project is about machine translation with post-editing. The system does not work with live/real-time subtitling, so I am not sure the two are compatible. I suggest contacting them via the website listed in the article for further information.


  • iheartsubtitles 10:09 am on September 12, 2013 Permalink | Reply

    SMPTE Internet Captioning Webcast 

    This webcast posted by the Society of Motion Picture and Television Engineers (SMPTE) is a good introduction to current US captioning regulatory requirements and to new requirements due to come into play. All US broadcasters must caption content online that has previously been broadcast on linear TV by the end of this month. This includes pre-recorded content that has been edited for broadcast online. By March 2014, this also applies to live and near-live content. Whilst the webcast is US-centric, the technical problems and solutions it discusses around captioning formats for online and multi-platform broadcast content are relevant to broadcasters worldwide. The webcast covers both pre-recorded/block style captioning as well as live subtitling. It is captioned and you can view it below:

  • iheartsubtitles 10:18 am on August 15, 2013 Permalink | Reply

    Captioned Music – automated vs human skill 

    Here are two fun videos that illustrate two very different results when captioning music.

    The first is a lyric video for One Direction as captioned by YouTube’s auto-captioning system. (You can also view the results for Taylor Swift’s lyrics.)

    Machine translation does have a role to play in providing access and, despite these funny videos, continues to improve, but that is for another blog post.

    Continuing on, compare the above with the fantastic skill of this stenographer and watch them subtitle Eminem’s Lose Yourself in real time (the music starts at 1:35 in).

    Stenography is also used to caption/subtitle live television – see #subtitlefail! TV
