Speech-to-Text
Using our subtitle archive to create more accurate speech-to-text
Improvements in machine learning have allowed us to train our own speech-to-text system. It's found a myriad of uses, from archive search to improving social media shareability.
Project from - present
What we're doing
Speech-to-text is a process for automatically converting spoken audio to text. It has recently moved out of the lab to become a useful new tool for broadcasters and journalists. Breakthroughs in automatic analysis and improvements in affordability mean that running it at scale over hundreds of thousands of hours of content is now feasible. Increases in accuracy mean that users have a realistic chance of finding what they want in minutes rather than hours, especially in genres such as news or factual content.
Why it matters
The BBC has one of the largest archives of broadcast material in the world, but only a fraction of it is truly searchable. We know there are hidden gems throughout the hundreds of thousands of hours of TV and radio we've digitised, but there's currently no easy way to find them. Speech-to-text is the first step towards finding them, as it produces a semi-accurate transcript of what's said, which can then be made searchable.
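As a rough illustration of how a timed transcript makes audio searchable, the sketch below builds a simple word index over a transcript. It is a minimal example under stated assumptions, not a description of our archive search: the `build_index` function, the `(start time, word)` transcript format and the sample data are all hypothetical.

```python
from collections import defaultdict


def build_index(transcript):
    """Map each lowercased word to the start times (in seconds) where it is spoken."""
    index = defaultdict(list)
    for start_seconds, word in transcript:
        # Normalise case and strip trailing punctuation so "Election" matches "election".
        index[word.lower().strip(".,!?")].append(start_seconds)
    return index


# Hypothetical timed transcript: (start time in seconds, recognised word).
transcript = [
    (0.0, "Good"), (0.4, "evening"), (1.2, "the"), (1.4, "election"),
    (2.1, "results"), (2.8, "are"), (3.0, "in."), (65.5, "Election"), (66.2, "night"),
]

index = build_index(transcript)
print(index["election"])  # [1.4, 65.5]
```

Looking up a word returns the points in the programme where it was recognised, so a user can jump straight to those moments rather than listening through the whole recording.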
How it works
Our recent work has focused on building speech-to-text systems for both live and offline use. We used our large archive of subtitled programmes to train language and acoustic models specifically for broadcast output, which we've found to be more accurate than generalised models. We're also researching new types of recurrent neural network, which promise better accuracy when trained on very large datasets.
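To give a feel for what training a language model on subtitle text involves, here is a toy sketch: a bigram model with add-one smoothing, which estimates how likely one word is to follow another. It is only an illustration under stated assumptions and not the model we built; the function names and the three sample subtitle lines are hypothetical, and the acoustic model is not covered at all.

```python
from collections import Counter


def train_bigram_model(lines):
    """Count word pairs in subtitle lines, with sentence start/end markers."""
    context_counts, bigram_counts, vocab = Counter(), Counter(), set()
    for line in lines:
        words = ["<s>"] + line.lower().split() + ["</s>"]
        vocab.update(words)
        context_counts.update(words[:-1])            # every word that has a successor
        bigram_counts.update(zip(words, words[1:]))  # adjacent word pairs
    return context_counts, bigram_counts, vocab


def bigram_probability(model, previous, word):
    """Estimate P(word | previous) with add-one (Laplace) smoothing."""
    context_counts, bigram_counts, vocab = model
    return (bigram_counts[(previous, word)] + 1) / (context_counts[previous] + len(vocab))


# Hypothetical subtitle lines standing in for a real subtitle archive.
subtitles = ["the weather tonight", "the news tonight", "the weather tomorrow"]
model = train_bigram_model(subtitles)

p_weather = bigram_probability(model, "the", "weather")
p_news = bigram_probability(model, "the", "news")
print(p_weather > p_news)  # True: "weather" follows "the" more often in this sample
```

A model trained on broadcast subtitles learns the word sequences that actually occur in broadcast output, which is one plausible reason a domain-specific model can outperform a generalised one on the same material.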
Outcomes
The engine we've built has been used in half a dozen different tools across the BBC. The biggest user has run it across almost a million hours of historic content. One of the more unexpected use cases is a tool that uses speech-to-text to rapidly subtitle short-form video for social media platforms. We have also presented a technical overview of the system we built and the uses we put it to.
This project is part of the Internet Research and Future Services section
This project is part of the Content Analysis Toolkit work stream
Topics
People & Partners
Project Team
-
Matt Haynes
Principal Web Developer
-
Rob Cooper
Producer
-
Chrissy Pocock-Nugent
Software Engineer
-
Andrew McParland
Principal Engineer
-
Alex Norton
Software Engineer
-
Jonty Usborne