BBC

A Voice Classification System for Younger Children with Applications to Content Navigation

White Paper WHP 249

Published: 1 October 2013

Abstract

We propose a speech classification system with applications to making content accessible to younger children. Typical interfaces such as search engines or hierarchical navigation are inappropriate for a young child, so to allow such a child to access online content we propose a voice classification system trained to recognise a range of sounds and vocabulary typical of younger children. As an example, we design a system for classifying animal noises. Acoustic features are extracted from a corpus of animal noises made by a class of young children, and a Support Vector Machine is trained to classify each sound as one of 12 corresponding animals. We investigate the precision and recall of the classifier for various classification parameters. We also investigate an appropriate choice of features to extract from the audio, comparing performance when using mean Mel-frequency Cepstral Coefficients (MFCCs), a single-Gaussian model fitted to the MFCCs, and a range of temporal features. To investigate the real-world applicability of the system, we pay particular attention to the difference between a generic classifier trained on a collected corpus of examples and one trained on a particular voice.

This work was presented at the 132nd Audio Engineering Society Convention in Budapest on April 26th 2012.
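
The abstract reports precision and recall for the animal-noise classifier. As a minimal sketch of how those two figures can be computed per class for a multi-class classifier, the Python below counts true positives against the number of predictions and the number of true examples for each label. The function name `precision_recall` and the example labels are hypothetical, chosen for illustration only; they are not taken from the paper, and the paper's own evaluation procedure may differ.

```python
from collections import Counter


def precision_recall(true_labels, predicted_labels):
    """Return {label: (precision, recall)} for a multi-class classifier."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    predicted_count = Counter(predicted_labels)  # times each label was predicted
    actual_count = Counter(true_labels)          # times each label truly occurred
    true_pos = Counter()                         # correct predictions per label
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == guess:
            true_pos[truth] += 1

    scores = {}
    for label in set(true_labels) | set(predicted_labels):
        tp = true_pos[label]
        # Guard against division by zero for labels never predicted or never seen.
        precision = tp / predicted_count[label] if predicted_count[label] else 0.0
        recall = tp / actual_count[label] if actual_count[label] else 0.0
        scores[label] = (precision, recall)
    return scores


# Hypothetical labels for illustration only; not data from the paper.
truth = ["cow", "cow", "dog", "dog", "cat", "cat"]
guess = ["cow", "dog", "dog", "dog", "cat", "cow"]
scores = precision_recall(truth, guess)
```

In this made-up example "dog" is never missed but is sometimes predicted wrongly, so its recall is higher than its precision; "cat" shows the opposite pattern.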

White Paper copyright

© BBC. All rights reserved. Except as provided below, no part of a White Paper may be reproduced in any material form (including photocopying or storing it in any medium by electronic means) without the prior written permission of BBC Research except in accordance with the provisions of the (UK) Copyright, Designs and Patents Act 1988.

The BBC grants permission to individuals and organisations to make copies of any White Paper as a complete document (including the copyright notice) for their own internal use. No copies may be published, distributed or made available to third parties whether by paper, electronic or other means without the BBC's prior written permission.

Authors

  • Chris Lowis (PhD), Senior Research Engineer
  • Chris Pike (MEng PhD), Lead R&D Engineer - Audio
  • Yves Raimond (PhD), Senior R&D Engineer
