Why Are We Still Using Narrow Band Codecs for SIP to SIP Calls?

I really like the SIP Voice over IP implementation of the Wi-Fi enabled Nokia Nseries phones such as the N95 and the N82. After all, they save me a lot of money these days, as SIP to SIP calls are free, even between different operators. At first glance, voice quality seems excellent: I can't tell the difference between a VoIP call and a traditional circuit switched call over a cellular network. But put into a different perspective, that's a bit short sighted. For direct SIP to SIP calls that do not cross a circuit switched interface, why do the two SIP clients still use the backwards compatible G.711 narrowband voice codec released in 1972, or the narrowband AMR (Adaptive Multi Rate) codec? Today, much better codecs such as Wideband AMR (AMR-WB) are available, with audio quality similar to that of Skype to Skype calls. So why are we still stuck with narrowband encoding? It can't be computing power, especially in the case of high end Nseries phones. Maybe license or patent issues? Ideas, anyone?

SIP providers are missing a great opportunity to go beyond the limitations of circuit switched networks and offer subscribers a superior experience for direct VoIP to VoIP calls. I think it would be a good selling point: on some connections you would suddenly get much better audio quality than when talking to someone still stuck in the circuit switched world (or in a SIP network that is not interconnected and thus has to go through a circuit switched bridge). I can very well imagine that at some point conversations would start with "oh, you are still on an old phone line" 🙂

P.S.: Some details: The N95 SIP client supports AMR, G.711 A-law and µ-law, iLBC and G.729. All narrowband…
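To make the point concrete, here is a rough sketch of what a "prefer wideband when both ends can do it" codec selection might look like. It is not how the N95 client or any particular SIP stack works; the function names and the second phone's codec list are made up for illustration, and real SIP offer/answer negotiation has more rules than this. The only hard facts in it are the audio sampling rates: 8 kHz for the narrowband codecs listed above, 16 kHz for G.722 and AMR-WB.

```python
# Audio sampling rate in Hz for each codec (names as used in SDP).
SAMPLE_RATE_HZ = {
    "PCMA": 8000,     # G.711 A-law
    "PCMU": 8000,     # G.711 mu-law
    "AMR": 8000,      # narrowband AMR
    "iLBC": 8000,
    "G729": 8000,
    "G722": 16000,    # wideband
    "AMR-WB": 16000,  # wideband
}


def is_wideband(codec):
    """True if the codec samples audio at 16 kHz or more."""
    return SAMPLE_RATE_HZ.get(codec, 0) >= 16000


def pick_codec(offered, supported):
    """Hypothetical helper: pick a codec both sides share.

    Keeps the caller's preference order, but moves to a wideband
    codec if at least one is common to both ends.
    """
    common = [c for c in offered if c in supported]
    if not common:
        return None
    wideband = [c for c in common if is_wideband(c)]
    return wideband[0] if wideband else common[0]


# Codec list of the N95 SIP client as given in the P.S. above.
n95 = ["AMR", "PCMA", "PCMU", "iLBC", "G729"]
# Assumed codec list of a wideband-capable SIP phone (illustrative only).
wideband_phone = ["G722", "PCMA", "PCMU"]

print(pick_codec(wideband_phone, n95))             # PCMA
print(pick_codec(wideband_phone, wideband_phone))  # G722
```

With the N95 on one end, the call falls back to G.711 because no wideband codec is shared. Two wideband-capable phones would end up on G.722 without ever touching a circuit switched network.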

7 thoughts on “Why Are We Still Using Narrow Band Codecs for SIP to SIP Calls?”

  1. Siemens Gigaset S6xx IP cordless phones use the G.722 codec for CAT-iq connections. It would be great if these phones could make high quality SIP calls to other SIP-enabled phones, though the microphone and speaker have to support the better quality as well.

  2. Hi Christian,

    Good point about the microphone and speaker. I’ve tested the microphone of the N82, for example, with an MP3 voice recording program and the quality is excellent, so it would do for WB-AMR or G.722. As for the speakers, I am sure the external one and the headset are also up to the task, as they are optimized for music playback. I have no information about the internal one used for voice calls.

    Thanks for the comment.

    Cheers,
    Martin

  3. I don’t think it has to do with bandwidth requirements, because even wideband codecs can be squeezed by means of advanced compression techniques. I guess it has to do with DSP requirements.

  4. I think it’s because of battery life on handsets and the CPU load of processing other codecs!

  5. My two cents:

    G711: ~5 MIPS
    AMR-NB: ~16 MIPS
    AMR-WB: ~40 MIPS

    So in order to use a wideband codec such as AMR-WB, which would still use less bandwidth than G.711, one needs a lot of horsepower. Even if smartphones can support this, battery power will likely be the limitation. Plus, these MIPS figures are based on direct DSP/firmware level implementations; if you run AMR-WB in software, the difference will likely be much bigger.

    G.711 uses 64 kbps (not including overheads), which should provide quality equivalent to AMR-WB at 24 kbps. They both have a Mean Opinion Score (MOS) of 4.5.

    In summary, using G.711 rather than AMR-WB can make sense if the trade-off between battery usage and air interface bandwidth is favorable.

  6. Hi Serdar,

    Thanks for commenting. Concerning MOS scores, when comparing NB-AMR (used in mobile networks today) and WB-AMR, it seems there are different opinions out there. Ramo and Toukomaa point out in the conclusion of an article published in the IEEE library ( http://tinyurl.com/6zgcjn ):

    “As can be seen wideband speech coding gives much better quality over narrowband speech coding. The average improvement in the various tests was nearly one MOS score with clean speech. Surprisingly even with quite low bit rates wideband was preferred (e.g. AMRWB 8.8k vs. AMR-NB 10.2k).”

    Also, when I listen to samples of 15 kbit/s AMR-WB recordings, which cover the audible frequency range up to 7 kHz compared to only around 3 kHz for narrowband codecs, the difference is striking, like night and day. That’s why I compared it to Skype to Skype speech quality.

    Here’s a link to how G.722 (64 kbit/s) sounds compared to G.711. O.k. that’s not WB-AMR, which is a different codec, but I find the sound quality comparable. Quite a difference.

    The G.722 codec is already supported by cordless phones such as the Siemens S685, which most likely have much less processing power than current high end 2G/3G/Wi-Fi smartphones. So I am not overly concerned about processing power limitations or high battery consumption.

    Martin

  7. Along with 20-16000 Hz stereo audio we would want 200 kbit/s MPEG-4 video on our VoIP calls. This is not difficult with an N95.
