CSI User Experience Conference 2012 Part 5 – Broadcast subtitle and caption formats

CSI User Experience Conference 2012: TV Accessibility

For background info on this conference, read Part 1.

Frans de Jong, a senior engineer at the European Broadcasting Union (EBU), gave a presentation on past and ongoing work to standardise subtitle formats as broadcast technology evolves, while ensuring that legacy formats remain supported and compatible. STL, the subtitle format that grew out of teletext technology, has evolved into a format called EBU-TT Part I. De Jong explained:

We have published this year (2012) EBU-TT part one. This is the follow-up specification for that old format (STL). It takes into account that nowadays we like to define things in XML and not in a binary format, because it's human readable, and because there are many people who can read XML… and of course nowadays [broadcast] is all file-based, networked facilities. Because if you look at the way that subtitles are produced (this is a very generic sketch), typically it comes from somewhere, an external company or internal department, can be based on existing formats, then it goes into some central content management system. Afterwards it's archived and of course it's broadcast at a certain moment, then provided to several of the platforms on the right. This list of platforms is growing: analogue TV, digital TV, now there's HDTV, iPlayer, we have IPTV streaming platforms, and all these platforms have their own specific way of doing subtitling. But on the production side we have for a long time been using STL, and also proprietary formats based on it or newly developed. There are several places where this format is useful, but we felt we had to update that format to make sure we can fulfil the requirements of today: that is HDTV and the different web platforms, mainly. So the new format published was focusing on that, very aware of web formats, but focused in our case on production. Our goal is to really optimise the production, to help the broadcasters get their infrastructure up to date.

The EBU-TT format is not a stand-alone invention: it is based on W3C Timed Text (TTML), but restricts the feature set, makes default values explicit, and adds (legacy STL) metadata; a sketch of what such a document can look like follows after the quote below. Similar work has been done in the US by SMPTE with the captioning format SMPTE-TT. That captioning standard received an honour from the Federal Communications Commission (FCC), a Chairman's Award for Advancement in Accessibility, last month:

The FCC declared the SMPTE Timed Text standard a safe harbor interchange and delivery format in February. As a result, captioned video content distributed via the Internet that uses the standard will comply with the 21st Century Communications and Video Accessibility Act, a recently enacted law designed to ensure the accessibility, usability, and affordability of broadband, wireless, and Internet technologies for people with disabilities.

SOURCE: TV Technology
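To make the relationship with TTML concrete, here is a minimal sketch (in Python, using only the standard library) of the kind of document EBU-TT Part I describes: a TTML file carrying an explicit EBU metadata section and a timed subtitle. The TTML elements and tts: style attributes are standard W3C Timed Text; the ebuttm: namespace URI and the documentOriginalProgrammeTitle element are assumptions based on the published EBU-TT Part 1 specification, not details taken from the presentation.

```python
# A minimal sketch of an EBU-TT-style document, built with the Python
# standard library. The ebuttm: metadata namespace and element name are
# assumptions, shown only for illustration.
import xml.etree.ElementTree as ET

TTML_NS = "http://www.w3.org/ns/ttml"
TTS_NS = "http://www.w3.org/ns/ttml#styling"
EBUTTM_NS = "urn:ebu:tt:metadata"  # assumed EBU metadata namespace

ET.register_namespace("", TTML_NS)
ET.register_namespace("tts", TTS_NS)
ET.register_namespace("ebuttm", EBUTTM_NS)

tt = ET.Element(f"{{{TTML_NS}}}tt", {"xml:lang": "en"})
head = ET.SubElement(tt, f"{{{TTML_NS}}}head")

# Legacy STL metadata travels inside the document instead of a binary header.
metadata = ET.SubElement(head, f"{{{TTML_NS}}}metadata")
doc_meta = ET.SubElement(metadata, f"{{{EBUTTM_NS}}}documentMetadata")
title = ET.SubElement(doc_meta, f"{{{EBUTTM_NS}}}documentOriginalProgrammeTitle")
title.text = "Example programme"

# Defaults that STL left implicit (colours etc.) are spelled out as styles.
styling = ET.SubElement(head, f"{{{TTML_NS}}}styling")
ET.SubElement(styling, f"{{{TTML_NS}}}style", {
    "xml:id": "defaultStyle",
    f"{{{TTS_NS}}}color": "#FFFFFF",
    f"{{{TTS_NS}}}backgroundColor": "#000000",
})

# One timed subtitle referencing the explicit style.
body = ET.SubElement(tt, f"{{{TTML_NS}}}body")
div = ET.SubElement(body, f"{{{TTML_NS}}}div")
p = ET.SubElement(div, f"{{{TTML_NS}}}p",
                  {"begin": "00:00:05.000", "end": "00:00:07.500",
                   "style": "defaultStyle"})
p.text = "Hello, and welcome."

print(ET.tostring(tt, encoding="unicode"))
```

The explicit metadata and styling sections illustrate de Jong's point above: nothing the STL header carried need be lost, and what STL left implicit is written out in human-readable XML.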

The EBU are currently working on EBU-TT Part II, which will include a guide to 'upgrading' legacy STL subtitle files and converting them to EBU-TT files. It is due to be published early this year.
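As a hedged illustration of what such an upgrade involves (this is not the EBU's own tooling, and the function name is mine), the sketch below converts a frame-based STL timecode into the media time expression TTML and EBU-TT use, assuming the 25 fps rate typical of STL files:

```python
# Hypothetical helper: convert an STL-style timecode (HH:MM:SS:FF, where
# FF counts frames) to a TTML/EBU-TT clock time (HH:MM:SS.mmm).
# Assumes 25 fps; a real converter must read the frame rate from the file.
def stl_timecode_to_ttml(tc: str, fps: int = 25) -> str:
    hours, minutes, seconds, frames = (int(part) for part in tc.split(":"))
    millis = round(frames * 1000 / fps)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"

# An STL cue starting at 10:00:05:12 becomes a TTML 'begin' value:
print(stl_timecode_to_ttml("10:00:05:12"))  # -> 10:00:05.480
```

Looking further ahead, de Jong said: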

There is also a third part coming up, that is now in the requirements phase, that's on live subtitling. Several countries, and the UK is certainly leading, are working with live subtitling. The infrastructure for this and the standards used are not very mature, which means there is room also to use this format to come to a live subtitle specification. We will provide a user guide with examples… One word maybe again about live subtitling that's coming up. What we did here is we had a workshop in the summer in Geneva at the EBU. We discussed the requirements with many broadcasters: what would you need from this type of format? There are about 30 requirements. Some of the things that came up, for example: it would be really good if there is a technical solution for routing. If I am subtitling for one channel, maybe 10 minutes later I could be subtitling for another channel – so you want to make sure that the system knows what channel I am working for and that it's not the wrong channel. And you need some data in the format for that. Again the issue of enriching the content you are working on with additional information, description and speaker ID.

To conclude the presentation, de Jong discussed his views on future technology and the next steps for subtitling, including automated subtitles and quality control:

There is an idea that we could be much more abstract in how we author subtitles in the future. We understand that the thought alone can be quite disruptive for a lot of people, because it's far from current practice. Just to say we're thinking about the future after this revision. I think later we'll see more advanced methods for subtitling; there is a lot of talk about automation and semi-automation. I think it was a week ago that YouTube released their automated subtitling with speech recognition, at least in the Dutch language. I am from Holland originally, and I was pretty impressed by the amount of errors! … It's a big paradox. You could argue that Google (owners of YouTube) has the biggest corpus of words and information probably of all of us… if they make so many (automated subtitle/caption) mistakes, how can we ever do better in our world? For the minority languages there is no good automated speech recognition software. If you ask TVP for example, the Polish broadcaster, how they do live subtitling, they say we would love to use speech recognition but we can't find good enough software. In the UK it's a lot better. It's a real issue: you are talking about very well-orchestrated conditions, and even there it doesn't exist. I am really curious how this will develop.