BBC

Research & Development

Posted by BBC Research and Development

These are weekly notes from the Internet Research & Future Services team in BBC R&D where we share what we do. We work in the open, using technology and design to make new things on the internet. You can follow us on Twitter at .

This week we have new joiners, entity extraction from Comma on different datasets, a visit to Lancaster for an event to invite local small businesses to use the technologies being made available from the FI-Content and FI-WARE projects, and World Service archive speaker identification released.

New joiners

  • Jiri Jerabek: the newest addition to the UX team.
  • Michael Barroco: on attachment from the EBU for six months.
  • Sophie Powling: joined the speakerthon team as a researcher for 10 days.

Welcome!

 

Comma

For James, Yves and MattH, this has been another week filled with COMMA work. Over the weekend and Monday we processed about 35 hours of content from the Cheltenham literary festival with COMMA/Kiwi. James says: "much to everyone's surprise it Just Worked". Yves adds: "It works!!".

James spent the rest of the week running things through COMMA (including a day of the Radio 4 ground truth dataset) and writing tools to interact with the COMMA API (such as one that downloads all the result files from a batch run). He's also starting experiments that use the speaker segmentation data to generate segmented audio around speaker identity transitions.
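As a rough illustration of that last idea, here is a minimal sketch of how clip boundaries around speaker transitions could be derived from segmentation data. It is not James's tool or the COMMA output format: the `(start, end, speaker)` tuples, the `transition_windows` name and the `padding` parameter are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical segment shape: (start_seconds, end_seconds, speaker_id).
Segment = Tuple[float, float, str]


def transition_windows(segments: List[Segment], padding: float = 2.0) -> List[Tuple[float, float]]:
    """Return (start, end) clip windows centred on each change of speaker."""
    ordered = sorted(segments)
    if not ordered:
        return []
    # Clamp windows so they never run past the end of the audio.
    audio_end = max(end for _, end, _ in ordered)
    windows = []
    for (_, _, previous_speaker), (start, _, speaker) in zip(ordered, ordered[1:]):
        # Consecutive segments with the same speaker are not a transition.
        if speaker != previous_speaker:
            windows.append((max(0.0, start - padding), min(audio_end, start + padding)))
    return windows
```

Each window could then be handed to whatever audio tool does the actual cutting; that step is left out here.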

Relatedly, Jana has been working with Rob on male/female voice representation in radio, evaluating the accuracy of Yves' LIUM-based speaker segmentation algorithm against the ground truth labels produced by Rob's work experience intern. Accuracy ranges from near perfect (95%) to around 65% (though we have some problems with the labels for that one, so it might improve). On average it's looking pretty good.
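For readers wondering what "accuracy" can mean for segmentation, one common option is the share of labelled airtime on which the prediction and the ground truth agree. The sketch below shows that measure only as an illustration; it is an assumption, not necessarily the metric Jana used, and the segment format and function names are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical segment shape: (start_seconds, end_seconds, label), e.g. "male" / "female".
Segment = Tuple[float, float, str]


def label_at(segments: List[Segment], t: float) -> Optional[str]:
    """Return the label covering time t, or None if no segment covers it."""
    for start, end, label in segments:
        if start <= t < end:
            return label
    return None


def time_weighted_accuracy(truth: List[Segment], predicted: List[Segment]) -> float:
    """Fraction of ground-truth-labelled time where the predicted label matches."""
    # Between two neighbouring boundaries neither labelling changes,
    # so checking the midpoint of each interval is enough.
    boundaries = sorted({t for segment in truth + predicted for t in segment[:2]})
    agreed = 0.0
    total = 0.0
    for left, right in zip(boundaries, boundaries[1:]):
        midpoint = (left + right) / 2
        expected = label_at(truth, midpoint)
        if expected is None:
            continue  # unlabelled time is ignored rather than counted as wrong
        total += right - left
        if label_at(predicted, midpoint) == expected:
            agreed += right - left
    return agreed / total if total else 0.0
```

Weighting by time rather than by segment count means a long, correctly labelled stretch counts for more than a short misjudged one.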

World Service Archive


From Tom: "this week the new speaker identification stuff we've been working on for the World Service Archive went live. Almost all programmes now have segmentation and speaker identification enabled, and we've introduced 'identity' pages, which tend to give a more accurate view of the programmes a person is in by taking user-generated identifications and corrections into account. Expect a blog post soon!"

Tristan's been organising meetings geared towards getting the Comma and WS work into production at the BBC: how best to take the data we have generated (and will generate) in our projects and put it into live production systems.

 

MediaScape


I ran a workshop with Theo's help and lots of good comments from ChrisG. We had visitors from and and Sean and Chris Needham also attended. Together we worked on a process for creating scenarios for the project. We'll use this process in Heidelberg this week at the MediaScape face-to-face meeting. Theo and I have been preparing for that by analysing the existing scenarios proposed by the partners, while Chris Needham and Sean have been drafting a 'concepts' document to share.

 

Radiodan


AndrewN, Dan and I have been preparing Radiodan for a first tentative release. Dan's released the core library Radiodan v1 to rubygems, AndrewN has improved the example application, and I've added in programme avoiding. We've also been thinking about the next version: Dan's been building an independent BBC Radio Services API for Radiodan clients, which gives us access to streaming URLs and live data. AndrewW has done some lovely responsive designs for the v2 webapp. I've been writing up some personas to think about for the next release. Andrew, Dan and I have also submitted a Radiodan proposal to Solidcon.

 

EgBox


Chris Needham has been porting the egBox VLC code to node.js.

FI2

Lots of admin before Christmas and a useful meeting: Chris Needham and ChrisG visited Lancaster University, where FI2 hosted an event to invite local small businesses to use the technologies being made available from the FI-Content and FI-WARE projects as part of the . ChrisG presented an introduction to the BBC and the R&D department, and ChrisN presented about the FI-WARE platform and the BBC's that has been made available.

 

Speakerthon


Zillah has been doing detailed planning for Speakerthon with Michael and various others. Tim has been considering the workflow for the event. On Wednesday he spent some time observing Sophie and Andy take a clip through the process from start to finish. The next stage will be turning that workflow into a simple guide for attendees on the day.

New Forms of Content


The Content Team have been working together to look at different approaches to surfacing and capturing the 'interesting bits' of a programme. Barbara says: "We manually and automatically processed (with Denise's algorithms) a series episode (EastEnders - 1 hour special) and Newsnight. We aimed at capturing interestingness with Tweets but through our research we found some limitations. Still a useful learning exercise in discovering what's possible and what's not. Positive results from the overlapping of the manual and automatic annotation as there are clearly similar patterns."

Links


From Tristan:

(what we're hoping to do with COMMA):

- curations of the best things to watch on iPlayer etc

(cf Mythology Engine). They've got proper scale to play with here.

From Jana:

An engineering toy aimed at girls, from a Kickstarter company

From Olivier: