BBC

Research & Development

Posted by Michael Armstrong

Regular visitors to the website may have noticed that an increasing proportion of the video clips on the site are now being accompanied by subtitles. BBC Bitesize now subtitles all new content before it is uploaded, and this is the result of work by R&D and the wider BBC aimed at finding cost-effective ways of increasing the quality and availability of subtitles.

Initially, several hundred older clips were manually subtitled, but at the beginning of May an additional 3,187 clips had subtitles added via an automated process which matched each clip to the section of the television programme it was taken from. The subtitles were then retrieved from the archive and retimed to match the clip. In a further process the files were converted to EBU-TT-D, conformed to the relevant subtitle guidelines, and an extra subtitle was added at the beginning of each clip informing the viewer that the clip contains “Automatically matched subtitles”.
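The retiming step can be sketched as a simple timestamp shift: once the offset of the clip within the source programme is known, each cue is moved from programme time to clip time, and cues falling outside the clip are dropped. This is a minimal illustration, not the actual BBC pipeline; the `(start, end, text)` cue representation is an assumption.

```python
def retime(cues, offset, clip_duration):
    """Shift subtitle cues from programme time to clip time.

    cues: list of (start, end, text) tuples in programme seconds.
    offset: start time of the clip within the programme, in seconds.
    clip_duration: length of the clip in seconds.
    Cues entirely outside the clip are dropped; overlapping cues are clipped.
    """
    out = []
    for start, end, text in cues:
        s, e = start - offset, end - offset
        if e <= 0 or s >= clip_duration:
            continue  # cue lies wholly before or after the clip
        out.append((max(0.0, s), min(clip_duration, e), text))
    return out
```

In practice the offset comes from the audio-fingerprint match, and a final pass would also re-check cues against the clip's own audio.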

Our work at BBC R&D originally focused on using audio fingerprinting to locate subtitles for News clips in the material broadcast on BBC News Channel programmes. Because subtitles for news are created live, and so have issues with accuracy and timing, the system included a user interface where corrections could be made. The work was written up, but no further progress was made at that time.
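Real audio fingerprinting is considerably more involved, but the core idea of locating a clip inside a longer programme can be illustrated by sliding a compact per-frame representation of the clip (here, an energy envelope) along the programme's and picking the offset with the highest score. The envelope representation and scoring below are toy stand-ins, not the system described in this post:

```python
def best_offset(clip_env, prog_env):
    """Locate a clip inside a programme by sliding its per-frame energy
    envelope along the programme's and maximising the dot product.
    A toy stand-in for a real audio fingerprinting system."""
    best_off, best_score = None, float("-inf")
    for off in range(len(prog_env) - len(clip_env) + 1):
        window = prog_env[off:off + len(clip_env)]
        score = sum(a * b for a, b in zip(clip_env, window))
        if score > best_score:
            best_off, best_score = off, score
    return best_off  # frame index where the clip best matches
```

A production system would use robust spectral fingerprints and normalised scores so that loud passages do not dominate the match; the plain dot product here is only for illustration.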

However, the idea of automatically recovering subtitles resurfaced at a meeting in September 2015, and the focus now moved to clips derived from pre-recorded programmes. These have good quality subtitles, and this opened up the possibility of recovering subtitles in an automated process without the need for human oversight. Over the coming months, a series of software scripts were written to interface with a number of different internal and external web resources across the BBC. These scripts locate the source programmes and use audio fingerprinting to locate the clip and check for edits. A speech-to-text stage was then added to enable a text search for the original programme subtitles. The speech-to-text output was also needed to retime the subtitles to match the clips and to verify the result. The scripts were then combined to create a batch process as a proof-of-concept demonstrator of the approach. We also teamed up with BBC Knowledge and Learning (now BBC Learning), who supplied a list of their BBC Bitesize clips, of which 7,509 were available as video files. This set of clips became our target corpus for the process.
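The verification step at the end of that pipeline can be sketched as a word-overlap check between the speech-to-text output and the candidate subtitles: if enough of the subtitle words were also heard by the recogniser, the match is accepted. The threshold value below is illustrative, not the one used in the actual system:

```python
def verify(stt_words, subtitle_words, threshold=0.8):
    """Crude subtitle verification: accept the match if the fraction of
    subtitle words also found by speech-to-text meets a threshold.
    The 0.8 default is illustrative only."""
    if not subtitle_words:
        return False
    heard = set(stt_words)
    hits = sum(1 for w in subtitle_words if w in heard)
    return hits / len(subtitle_words) >= threshold
```

A real verifier would also compare word order and timing, since live-subtitled or edited material can share vocabulary while still being a wrong match.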

As my work on subtitles drew to a close in April 2016, a total of 3,508 subtitle files had been created for the clips, a 46% success rate. The clips that could not be matched included all those that were specially made for the website and any that had been edited together. All the non-English clips from language-learning programmes confused the speech-to-text system, so these could not be verified. The results were written up as a paper for IBC2016 and the subtitle files were handed over to colleagues with the ambition of making the subtitles available on the Bitesize website.

One of the key issues with the recovered subtitles was that they totalled around 250 hours of clips. This meant it was not possible to manually check every single clip, so there was a very small chance of an incorrect subtitle file getting through. Also, the verification thresholds had been set to let through minor issues, such as an extra line or subtitle at the beginning or end of the clip, as this significantly increased the success rate of the system. In the following example the audio in the clip starts with the words “The Noble Gases are…”.

As a result, the decision was made to include an additional subtitle for 5 seconds at the beginning of each clip informing the viewer that these are automatically matched subtitles, which helps set the viewer’s expectations. It also transpired that 381 of the files had already been manually subtitled, so these files were discarded as duplicates. Software scripts were written to add the warning text, convert the files into fully compliant EBU-TT-D format and enable the batch uploading of the subtitle files to the BBC’s content management system. This was completed at the beginning of May 2017.
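Prepending the warning subtitle amounts to inserting one extra cue at the start of the document. The sketch below uses generic TTML element names (EBU-TT-D is a constrained profile of TTML); the exact attributes, styling and timing conventions of the real files are not shown here:

```python
import xml.etree.ElementTree as ET

TT = "http://www.w3.org/ns/ttml"  # TTML namespace used by EBU-TT-D

def add_warning(ttml_text, text="Automatically matched subtitles", seconds=5):
    """Insert a warning cue at the start of the first <div> of a TTML document.
    A simplified sketch: real EBU-TT-D files carry styles and regions too."""
    ET.register_namespace("", TT)  # serialise without an "ns0:" prefix
    root = ET.fromstring(ttml_text)
    div = root.find(f".//{{{TT}}}div")
    p = ET.Element(
        f"{{{TT}}}p",
        {"begin": "00:00:00.000", "end": f"00:00:{seconds:02d}.000"},
    )
    p.text = text
    div.insert(0, p)  # warning cue becomes the first subtitle
    return ET.tostring(root, encoding="unicode")
```

A batch run over the recovered files would apply this to each document before the conformance and upload steps described above.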

This work has proved the original hypothesis that it is possible to create subtitles for video clips by automatically recovering them from the programmes in which they were broadcast, and these files are now out on the BBC website where the audience can benefit. This is in addition to those which were uploaded at the end of last year.

The R&D subtitle matching software is a proof of concept of the approach to subtitling clips. Whilst it is well suited to recovering subtitles for clips taken from archive content, it relies on the source programme already having subtitles, and this is not the case for preview clips published before broadcast or clips made for the web. However, the EBU-TT-D conformance and upload software has been designed to be used operationally, and other approaches are being piloted to create subtitles where no match is available.


This post is part of the Future Experience Technologies section
