

Ambisonics and Periphony [part 1]


Anthony Churnside | 16:47 UK time, Thursday, 11 March 2010

I'm currently working for R&D in the North Lab, which I've blogged a bit about here. I'm working in Production Magic, Graham Thomas' section, on the future of surround sound, and I thought it might be interesting to write a little about the project. In the media sector audio is often seen as the underdog to video. This is even true in some parts of the BBC, where we produce much more audio than video content (we don't make silent TV!). R&D has a strong audio team, led by Andrew Mason, who's talked a bit about it in a video here.


Two of my colleagues, Chris Pike and Chris Baume, and I decided to propose some areas of audio research into which we felt BBC R&D should be investing more resources. One of our proposals was research into Ambisonics and periphony.

The majority of the audio that the BBC creates is stereo. The two exceptions to this are Radio 5 Live, which, for now, is broadcast in mono, and BBC HD, which has a mixture of stereo and 5.1 surround.

5.1 surround is one of the current multichannel surround standards. As a result of extensive testing in the 1990s, the ITU recommends a set of speaker positions for 5-channel surround, with three at the front and two at the back, at specific angles.
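
For reference, here is my own summary of the loudspeaker azimuths usually quoted from ITU-R BS.775 (not part of the original post, and worth checking against the recommendation itself):

```python
# Commonly cited loudspeaker azimuths from ITU-R BS.775 for 5.1 surround,
# in degrees from straight ahead (positive angles to the listener's left).
# The LFE (".1") channel has no prescribed position.
ITU_5_1_AZIMUTHS_DEG = {
    "centre": 0,
    "front left": 30,
    "front right": -30,
    "surround left": 110,    # the recommendation allows roughly 100-120 degrees
    "surround right": -110,
}
```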

There are a number of disadvantages to this way of recording surround sound. One of the major issues is compatibility with formats that have a different number of channels. The sound engineer has to check compatibility with mono, stereo and 5.1. In the future the engineer may also have to check against 7.1, 22.2 and whatever other discrete-channel surround system comes next. That would require a lot of time and a room with enough speakers to cover every possible set-up.

Another issue faced by an organisation like the BBC is how we archive our material. Theoretically, if we archived the stereo, 5.1 and 7.1 mixes of a piece of audio, it would take eight times as much space as the stereo recording alone (two channels for stereo plus six for 5.1 and eight for 7.1 makes sixteen channels, against stereo's two). These ITU standards were born out of a lot of research into which angles gave the best sound, and are essential when setting up a studio or listening room. However, I would be surprised if many of our audience had their own ITU 5.1 set-up, and the talks I've had with friends in the computer games industry suggest most of their customers who listen in 5.1 don't follow the ITU's recommendation, preferring to use a square, perhaps because that layout fits best around their furniture. While games users may not be representative of the BBC's audiences, we shouldn't assume our 5.1 listeners are using an ITU-recommended set-up.

A possible alternative to these discrete-channel formats is a system called Ambisonics. This system was developed in the 1970s and has had a cult following since, but it has yet to break into the mainstream, remaining of interest mainly to academics and select audio engineers. The fundamental idea behind Ambisonics is to represent the sound-field at a single point in space.

Without going into too much detail, it is an extension of coincident stereo microphone techniques, capturing audio from three perpendicular figure-of-eight microphones all positioned at the same point in space. When combined with an omnidirectional microphone, these four signals are known as B-format. This signal represents the three-dimensional sound-field.
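
To make that concrete, here is a minimal sketch of my own (not code from the project) showing how a mono signal arriving from a given direction is encoded into traditional first-order B-format, using the common textbook convention in which W carries the omnidirectional component scaled by 1/√2:

```python
import numpy as np

def encode_b_format(mono, azimuth_deg, elevation_deg):
    """Encode a mono signal into traditional first-order B-format (W, X, Y, Z).

    W is the omnidirectional component (scaled by 1/sqrt(2) by convention);
    X, Y and Z are the front-back, left-right and up-down figure-of-eight
    components respectively.
    """
    az = np.radians(azimuth_deg)
    el = np.radians(elevation_deg)
    w = mono / np.sqrt(2.0)
    x = mono * np.cos(az) * np.cos(el)
    y = mono * np.sin(az) * np.cos(el)
    z = mono * np.sin(el)
    return w, x, y, z

# Example: a source 45 degrees to the left, level with the listener.
w, x, y, z = encode_b_format(np.ones(4), azimuth_deg=45, elevation_deg=0)
```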

Since the 1970s, development of the system has led to Higher Order Ambisonics, which provides higher resolution in the localisation of sources within the sound-field, at the cost of needing more channels to represent the same recording.
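
To give a feel for that cost (my own arithmetic, not from the post): a full-sphere, or periphonic, Ambisonic signal of order N needs (N + 1)² channels, so first order is the familiar 4-channel B-format, second order needs 9 channels and third order 16.

```python
def ambisonic_channel_count(order):
    """Channels needed for a full-sphere (periphonic) Ambisonic signal of the given order."""
    return (order + 1) ** 2

for order in (1, 2, 3):
    print(order, ambisonic_channel_count(order))   # 1 -> 4, 2 -> 9, 3 -> 16
```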

So how might this technology help solve some of the problems described above? A major potential advantage of Ambisonics is its lack of dependency on speaker position. Unlike 5.1, the audio channels carried in an Ambisonic signal do not map directly onto speakers. The number of speakers and the way the listener has set them up matter much less, and the same signal can be decoded to any speaker array. This flexibility would allow one common set of signals to be sent to everyone, and each listener could decode it to suit their listening environment, regardless of the way they've chosen to set up their sound system. This also has obvious advantages from an archival point of view: unlike stereo, 5.1 and 7.1 mixes, keeping Ambisonic recordings could help future-proof the archive. In my next post I'll talk about what we've done so far, and what we might do with Ambisonics in the future.
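
As a deliberately simplified illustration of the "decode to any speaker array" idea (my own sketch, not the project's decoder, and ignoring the normalisation and psychoacoustic weighting a real decoder would apply), here is a basic projection-style decode of horizontal B-format to an arbitrary ring of loudspeakers:

```python
import numpy as np

def decode_horizontal(w, x, y, speaker_azimuths_deg):
    """Naive projection-style decode of horizontal B-format to a ring of speakers.

    Each speaker feed 'samples' the sound-field in that speaker's direction.
    Practical decoders add normalisation and frequency-dependent weighting
    (e.g. shelf filters, max-rE gains), omitted here for clarity.
    """
    feeds = []
    for az_deg in speaker_azimuths_deg:
        az = np.radians(az_deg)
        feeds.append(0.5 * (np.sqrt(2.0) * w + np.cos(az) * x + np.sin(az) * y))
    return feeds

# The same B-format signal (single sample values here, purely for illustration)
# can be decoded to a square array...
square = decode_horizontal(1.0, 0.7, 0.2, [45, 135, -135, -45])
# ...or to a different ring of speakers, without remixing the source material.
ring_of_six = decode_horizontal(1.0, 0.7, 0.2, [0, 60, 120, 180, -120, -60])
```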

Comments

  • Comment number 1.

    As one of those "select audio engineers" actively using an Ambisonics-based audio system in the product that I work with, I am very encouraged to see the BBC following this path.

    It may have been a long time coming (and I'm pretty sure various departments within the BBC have worked on Ambisonics previously), but I can only hope this continues and leads to something mainstream. Ambisonics is really a very neat solution to the problem of delivering surround sound to a variety of playback configurations. Please keep posting your progress!

  • Comment number 2.

    That's good for audio, though we already have surround sound but not surround video yet. Maybe the two together would be better. Also, I think it's a bit unfair on the video side, because audio keeps gaining channels, e.g. 5.1, 7.1 or even higher (e.g. 22.2 for Super Hi-Vision), yet for video we usually have only one view, or soon two (for stereoscopic "3D"), but nothing better (e.g. multiview or true 3D).

    Also, for audio, they are going for uncompressed or losslessly compressed audio, but they never do that for consumer video. And they are using ever higher audio sample rates because they know that produces better, more accurate sound, but the BBC (BBC HD) is going for ever lower video sample rates (e.g. usually shooting video at 25 Hz instead of the 50 Hz we've had for years with SD, and feature films are usually still made at 24 fps, the same as they have been since sound was added to film many years ago), despite the fact that BBC research has said we should be increasing the video frame rate to something much higher (otherwise we lose the advantages of HDTV over SDTV for moving objects). High-quality video needs a high video sample rate (frame rate) for accurate representation of motion (and so it doesn't judder or strobe), just as audio needs a high sample rate for accurate representation of sound.

  • Comment number 3.

    Good to see that this is work in progress at the BBC once again. I see from the video that you appear to be using Nuendo on a Mac, and I'd like to know whose plug-ins you're using to do the transcoding. I'd also be interested to know where you obtained the recordings of Spitfires at Duxford that are mentioned in the video. Are these BBC recordings, or have you obtained them from somewhere else?

    Keep up the good work.

    John

  • Comment number 4.

    > These ITU standards were born out of a lot of research into which angles gave the best sound, ...

    Could you point us to the work which concluded that the ITU-R 5.1 layout gives the best sound? AFAIK, this was a crude attempt to replicate the layouts in cinema. It is very suboptimal for surround sound. I have only seen it in a very small number of studios and research establishments. It is unknown in domestic environments.

    Among domestic listeners who have tried to place speakers properly, a square (or near square) is by far the most common layout with the listener somewhat back from centre.

    Ambisonic decode to a square works very well for these common layouts. Listeners instinctively move to the centre of the square when they encounter good surround material.

    There are a number of papers investigating this.

  • Comment number 5.

    Nice to see the BBC working on Ambisonics again, and that it hopefully isn't impacted by the disappearance of Kingswood Warren.

    My own research on surround layouts found in the wild supports your observations that most people listen in a square, or at least a rectangle to which a square is a reasonable approximation. I do not believe this is limited to gamers, but is quite common in average home theatre environments. In addition, Ambisonics is very robust and can take a bit of speaker positioning error without destroying the illusion.

    I presume you are looking at working and archiving in B-format, in which case you could derive a suitable decode at any time for whatever discrete-channel system and/or speaker layout is the fad at the time, should it not be possible to transmit the B-format content to the end-user to render at the listening end.

    I have a paper on the effectiveness of square decodes and the use of studio-based decoding of Ambisonics here: [Unsuitable/Broken URL removed by Moderator] (PDF) which I hope may be of interest.

  • Comment number 6.

    You can access the paper mentioned above, "Getting Ambisonics Around", from the Articles section on Ambisonic.net.
