Research & Development

Posted by BBC Research and Development

The ever-growing amount of media content published each day makes it extremely challenging for human editors to consistently segment and annotate it. However, segmentation and labelling of media content are necessary to make short-form content available. For example, if someone is looking for a particular piece of news from a programme aired one month ago, some pre-segmentation and/or annotation of the news story in the show would be a massive help!

So how can we segment and annotate media content without direct human effort? It is probably no surprise that the answer is artificial intelligence (AI). PhD student Iacopo Ghinassi has been working with BBC R&D on ways to solve this problem as part of our Data Science Research Partnership.

I have been working on a fascinating project that uses AI to segment and annotate TV and radio programmes automatically. The project is part of my PhD, which the BBC sponsors. The BBC has provided valuable data and continuous support that allowed me and my supervisors (Dr Huy Phan and Prof Matthew Purver) to investigate new ways of automatically understanding the content of media.

'Understanding' is, in fact, crucial to solving the problem of segmenting an otherwise undivided piece of content, such as a news show or a podcast. We aim to segment content by topic, meaning that an automatic system needs to 'understand' when the topic changes. To achieve this, we turn to the branch of AI that is concerned with understanding human language. Sounds and acoustic elements are also explored, but understanding language is crucial if we want to isolate a self-contained section of the programme on one topic and, eventually, label the segment with the topic itself.
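To make the idea of spotting a topic change concrete, here is a minimal Python sketch. It is not the system described in this post: it assumes that simple word overlap between neighbouring sentences can stand in for real language understanding, and it starts a new segment whenever that overlap drops below a threshold. The stopword list, the `segment` function, the threshold value and the example transcript are all hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "in", "is", "was", "were", "will",
             "by", "this", "than", "of", "and", "to", "across"}


def bag_of_words(sentence: str) -> Counter:
    """Lowercase the sentence and count its non-stopword tokens."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def segment(sentences: list[str], threshold: float = 0.1) -> list[list[str]]:
    """Start a new segment wherever adjacent sentences share too few words."""
    if not sentences:
        return []
    segments = [[sentences[0]]]
    for prev, curr in zip(sentences, sentences[1:]):
        if cosine(bag_of_words(prev), bag_of_words(curr)) < threshold:
            segments.append([curr])      # similarity dropped: assume a topic change
        else:
            segments[-1].append(curr)    # still the same topic
    return segments


# Made-up transcript with two topics (election, then weather).
transcript = [
    "The election results were announced this morning.",
    "Turnout in the election was higher than expected.",
    "Heavy rain is forecast across the north tonight.",
    "The rain will clear by tomorrow afternoon.",
]
segments = segment(transcript)
for i, seg in enumerate(segments, start=1):
    print(f"Segment {i}: {seg}")
```

Word overlap like this breaks down as soon as two sentences on the same topic share no vocabulary, which is exactly why a model with a richer understanding of language is needed for real programmes.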

Graphic showing elements of topic segmentation: Analyse - Extract - Build

In a sense, this is not too different from what a search engine does when trying to return results relevant to your query. That's why our research takes a different direction from previous research on the topic, investigating models and techniques from AI that are closely connected to this kind of search. A general understanding of language like this could be a unique way to segment and label content, recognising different topics and the way they appear within the programme. If our algorithm has a good understanding of the content, we can then potentially adapt it for things like automatic summarisation at little or no cost!

This work was presented at an important academic workshop about the broadcasting industry’s use of data science, which led to a published paper. Another paper is on its way documenting the latest system built with this approach, which correctly segmented a set of 270 news programmes from the BBC News Channel more than 90% of the time. This system has been adopted by R&D in a prototype news segmentation system called Yuzu, which will be used to explore potential applications for automatic segmentation.

Much more is yet to come, though! The potential that AI and data science have to help shape processes and media consumption is, if not limitless, very far-reaching. I’m glad to have had an opportunity to lay a (small) tile on that path.


