IRFS Weeknotes #235

Posted by Olivier Thereaux on 13 Dec 2016, last updated 7 Feb 2017

Latest from the IRFS team - Tellybox prototypes, Newsbeat Explains, and lots of Machine Learning.

Tellybox Making Sprint

At the end of 10 working days we have 9(!) prototypes in development.

They are part-working prototype versions of the non-functional design fictions we showed at the V&A and Mozfest, including a device that picks you a random programme to watch using a large red button, a giant, interactive map of all the programmes available on iPlayer, an ambient display for your favourite programmes, a talking teddy, and lots more, including iPlayer on a Raspberry Pi 3.

These prototypes are designed to reflect the results of the ethnographic research we've undertaken into why people watch TV. They also show the breadth and usefulness of tools created in the department and further afield, in a way that will help us have useful conversations about where TV should go next. We've got three roadshows next week, taking all the prototypes to each of the three R&D sites.

Atomised Media

Thomas, Barbara, Sacha and Tristan had an interesting meeting with Product Managers and the Creative Director for Sport regarding their current and upcoming challenges and to discuss our Atomised Media work. Thomas presented his work on S-M-L (small, medium, large) summaries for Sport.

Barbara presented the Newsbeat Explains (NBE) prototype at our All-Staffer. She helped Tim, Lei, and Joanne set up the NBE user study, which starts next week. She commissioned a company to transcribe the 1-2-1 interviews, liaised with the recruitment agency and wrote the copy for the study. The three of them did a run-through of the study and it is almost ready to go - just some bits to finish off.

Talking With Machines

Andrew did some user testing of our Alexa skill in the usability testing labs in Salford and London. Testing is also being done remotely by the Blue Rooms in Birmingham and Salford.

Machine Learning Classifiers

David and Fionntán discussed ideas about classification and started thinking about use cases with different types of content - Sport, football, Music, News.

Fionntán came up with a framework for using machine learning classifiers within Freebird. After reading up on ML in production and trying Python libraries for AWS Lambda, he found the Zappa library to work well. Using this, we can separate out the classifier training, feature extraction, classifier running and creative work processing into different Lambda functions. He started implementing this for the Salience feature processor. If this works well, we can apply this framework and easily scale it up for the many classifiers used in Editorial Algorithm smart streams.
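To illustrate the idea of splitting the steps into separate functions, here is a minimal sketch and not the Freebird code. It assumes only the `(event, context)` signature that AWS Lambda uses for Python handlers. The `MODEL` weights, the function names and the payload fields are all hypothetical, and how the functions would be deployed with Zappa is not shown.

```python
import json

# Hypothetical weights; in a real setup a separate training function
# would produce these and store them somewhere the classifier can read.
MODEL = {"goal": 1.0, "match": 0.5, "election": -1.0}


def extract_features(event, context=None):
    """Feature extraction step: turn raw text into bag-of-words counts."""
    counts = {}
    for word in event["text"].lower().split():
        counts[word] = counts.get(word, 0) + 1
    return {"id": event["id"], "features": counts}


def run_classifier(event, context=None):
    """Classifier step: score the extracted features against MODEL."""
    score = sum(MODEL.get(word, 0.0) * n for word, n in event["features"].items())
    label = "sport" if score > 0 else "other"
    return {"id": event["id"], "label": label, "score": score}


def process_locally(event):
    """Chain the two steps locally, passing JSON between them as Lambda would."""
    features = json.loads(json.dumps(extract_features(event)))
    return run_classifier(features)


if __name__ == "__main__":
    print(process_locally({"id": "a1", "text": "Late goal wins the match"}))
```

Because each step only takes and returns JSON-serialisable dictionaries, any one of them could be swapped out or scaled independently of the others.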

Fionntán also tried two different machine learning experiments, both of which unfortunately didn't work. One was using FastText for tag prediction and the second was dictionary learning over article vectors.

Ground Truth Generation

Ben and Lara have been working with the codebase to create a tool for annotating videos. We hope to use this to generate new ground truth data for the face recognition and scene detection work.

Scene Detection

Craig has been benchmarking multiple shot boundary detection algorithms which will form the basis of his scene recognition work. He’s testing several implementations including OpenCV, ffprobe and some software developed in-house. He’s been using an annotated EastEnders video for testing, something which we have previously provided to TRECVID.
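For readers unfamiliar with this kind of benchmark, here is a minimal sketch of how detected cuts can be scored against annotated ones. It is an assumption about the general approach, not Craig's actual harness: the frame numbers, the detector names and the two-frame tolerance are made up for illustration.

```python
def score_boundaries(detected, truth, tolerance=2):
    """Greedily match detected cut frames to annotated ones within +/- tolerance."""
    unmatched = sorted(truth)
    true_pos = 0
    for frame in sorted(detected):
        # First still-unmatched annotation that lies within the tolerance window.
        match = next((t for t in unmatched if abs(t - frame) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)
            true_pos += 1
    precision = true_pos / len(detected) if detected else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def rank_detectors(results, truth, tolerance=2):
    """Return (name, scores) pairs sorted by F1, best first."""
    scored = [(name, score_boundaries(cuts, truth, tolerance))
              for name, cuts in results.items()]
    return sorted(scored, key=lambda item: item[1]["f1"], reverse=True)


if __name__ == "__main__":
    annotated = [100, 250, 400, 620]           # hypothetical ground-truth cut frames
    outputs = {
        "detector_a": [101, 249, 300, 621],    # hypothetical detector output
        "detector_b": [100, 400],
    }
    for name, scores in rank_detectors(outputs, annotated):
        print(name, scores)
```

The tolerance matters because different tools may report a cut a frame or two apart; without it, near-misses would count as both a false positive and a missed boundary.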

Editorial Algorithms

On the Editorial Algorithms front, we worked both on our content discovery tools, and the analysis pipeline powering them.

Manish switched off a number of our content analysis processors after a team review to identify the most useful types of metadata needed. This will help towards cost reduction and has improved the rate at which we can analyse content.

We have been busy building a “Stream Builder” tool to help users define and then follow specific topics. Matt has been working on the autocomplete UI for it. Meanwhile, Kate spent a weekend with Sport Live and learned a lot about how they work and where Freebird could help them. She has started combining all of our Live research (Sport, EdFest and Autumnwatch) into a single report, which she’ll present either this coming sprint or the next one.

David reviewed our stream features (currently activated via feature flags) and decided on the default set. With Katie he also did some QA on a “bookmark” feature for our content stream tool.

ELMer - External Linking trial

This is our trial with newsroom journalists of a tool that suggests related articles they could add to their story.

Olivier and Katie spent some of the sprint reviewing the first phase of the trial and planning the next steps with our News Labs colleagues, and talking to people around the BBC to see if they would be interested in trying something similar in their editorial area.

Tagging Systems

Chris Newell developed and deployed a tag suggestion server for news and sport articles based on a scikit-learn classifier. The classifier has been trained using nearly 3 years of articles which have been manually tagged by BBC journalists. He’s now exploring how the classifier model could evolve incrementally in response to new articles and tags.
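As a rough illustration of what "evolving incrementally" can mean, here is a toy sketch in plain Python. It is not Chris's server and does not use scikit-learn (some scikit-learn estimators offer a `partial_fit` method for this kind of update). Instead it swaps in a simple naive-Bayes-style word count model so the example stays self-contained. The class name, the articles and the tags are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class IncrementalTagSuggester:
    """Toy online multi-label model: per-tag word counts, smoothed scoring."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # tag -> word -> count
        self.tag_counts = Counter()              # tag -> number of tagged articles
        self.vocabulary = set()

    def update(self, text, tags):
        """Fold one newly tagged article into the model without retraining."""
        words = text.lower().split()
        self.vocabulary.update(words)
        for tag in tags:
            self.tag_counts[tag] += 1
            self.word_counts[tag].update(words)

    def suggest(self, text, top_n=3):
        """Return up to top_n tags ranked by add-one-smoothed log-likelihood."""
        words = text.lower().split()
        total_taggings = sum(self.tag_counts.values())
        vocab_size = len(self.vocabulary) or 1
        scores = {}
        for tag, n_articles in self.tag_counts.items():
            counts = self.word_counts[tag]
            total_words = sum(counts.values())
            score = math.log(n_articles / total_taggings)
            for word in words:
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            scores[tag] = score
        return sorted(scores, key=scores.get, reverse=True)[:top_n]


if __name__ == "__main__":
    model = IncrementalTagSuggester()
    model.update("cup final goal striker", ["football"])
    model.update("election vote parliament", ["politics"])
    print(model.suggest("late goal in the cup final", top_n=1))
    # A journalist introduces a new tag; the model picks it up immediately.
    model.update("goal penalty referee", ["football", "referees"])
    print(model.suggest("referee penalty decision", top_n=2))
```

The point of the sketch is the `update` method: new articles and previously unseen tags change the model straight away, with no need to retrain on the full archive.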

Training

Tim went on the 3-day new broadcast technologies course.

Chris and Tim, together with trainees Kristian and Craig, attended an internal R&D training course on Digital Signal Processing, starting with a refresher on engineering maths.

Conferences / Events

Chris and Nick went to . You can read event host David Lloyd's write-up .

Tristan went to the .

Olivier attended and presented our work at a big internal workshop on uses and production of Linked Data across the BBC. Our proposed approach of using for automated tagging and domain-specific engines trained on human tagging for tag suggestions was well received so he spent part of the sprint following up with Chris Newell and various colleagues.

Katie started a conversation about Hack Days with Experiences and Data. We're planning one with Sport, and another with News Labs and Vivo.

Joanne organised a visit by MSc students from UCL, with presentations of the projects they undertook with us in the past few months.

Barbara has been working on a draft of the changes SXSW requested to our panel description.

Other

This sprint we spent a lot of time working on knowledge-transfer tasks with Thomas P as he’s leaving us very very soon! We've carried out a Rumsfeld-inspired Known-Knowns/Known-Unknowns/Unknown-Knowns/Unknown-Unknowns exercise. I think it’s been quite successful, and we’ve improved some of our software deployment practices along the way.

Denise has been continuing work on live processing of MPEG-DASH streams, transcoding from the AVC3 codec into something usable by her algorithms. Matt, Chrissy and Ben have continued work on the COMMA platform, and are now planning work for IP Studio integration to start in January.

Chris Needham has fixed some more issues with our , and has been discussing other changes and new features with colleagues Alan and Tom.
