Every day we get a bit closer to all-IP wireless networks, in which operators will be hard pressed to present a voice over IP solution. Today two approaches are on the horizon. The first is ‘naked SIP’, which is already implemented in some 3G phones such as the Nokia N-Series and E-Series S60 phones. The second is the IP Multimedia Subsystem (IMS), which is based on SIP but wraps a lot of additional specification around it. So what does IMS do that SIP doesn’t? I came up with the following list of things that are lacking in naked SIP today and that are dealt with in IMS:
- General SIP implementations are network-agnostic and cannot signal their quality of service requirements to a wireless access network. Thus, voice over IP data packets cannot be prioritized by the system in times of congestion.
- Handling of transmission errors on the air interface cannot be optimized for SIP calls. While web browsing and similar applications benefit from automatic retransmissions in case of transmission errors, VoIP connections are better served if erroneous packets are dropped rather than repeated at a later time, since retransmitted packets are likely to arrive too late to be played out.
- SIP VoIP calls cannot be handed over to the 2G network when the user roams out of the coverage area of B3G networks.
- SIP does not work in 2G networks.
- Most SIP implementations today use the 64 kbit/s PCM codec for VoIP calls. Compared to optimized GSM and UMTS codecs, which only require about 12 kbit/s, this significantly decreases the number of VoIP calls that can be delivered via a base station. Furthermore, voice codecs optimized for mobile networks have built-in functionality to deal with missing or erroneous data packets. While this is not required for fixed networks due to their lower error rates, it is very beneficial for connections over wireless networks.
- Emergency calls (112, 911) cannot be routed to the correct emergency center since the subscriber could be anywhere in the world.
- No billing flexibility. Since SIP implementations are mostly used for voice sessions, billing is usually built into the SIP proxy and no standardized interfaces exist to collect billing data for online and offline charging.
- Additional applications such as video calls, presence, instant messaging, etc. are usually not integrated in SIP clients and networks.
- It is difficult to add new features and applications since no standardized interfaces exist to add these to a SIP implementation. Thus, adding new features to User Agents and the SIP network such as a video mailbox, picture sharing, adding a video session to an ongoing voice session, push to talk functionality, transferring a session to another device with different properties, etc. is proprietary on both the terminal and the network components. This is costly and the use of these functionalities between subscribers of different SIP networks is not assured.
- Insufficient security: Voice data is usually sent unencrypted from end to end, which makes it easy to eavesdrop on a connection. Signaling can be intercepted since it is not encrypted. Man-in-the-middle attacks are possible. No standards exist for how to securely and confidentially store user data (e.g. username/password) on a mobile device.
- Scalability: Mobile networks today can easily have 50 million subscribers or more. This is very challenging in terms of scalability since a single SIP proxy in a network cannot handle such a high number of subscribers. A SIP network serving that many subscribers must be distributed over many SIP proxies/registrars.
- There is no standardized way to store user profiles in the network today. Also, no standardized means exist to distribute user data over several databases which is required in large networks (see scalability above).
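To put rough numbers on the codec point in the list above, here is a minimal Python sketch of the bandwidth a single call direction needs at the IP layer. It assumes 20 ms packetization and uncompressed IPv4, UDP and RTP headers (20 + 8 + 12 bytes); neither assumption comes from the list itself, and the function name `ip_bitrate_kbps` is made up for illustration.

```python
# Assumptions for illustration only (not stated in the post):
# one voice packet every 20 ms, uncompressed IPv4 + UDP + RTP headers.
PACKET_INTERVAL_MS = 20
HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP


def ip_bitrate_kbps(codec_kbps: float) -> float:
    """Return the IP-level bit rate (kbit/s) for one direction of a call."""
    # Header bits sent per millisecond are numerically equal to kbit/s.
    header_kbps = HEADER_BYTES * 8 / PACKET_INTERVAL_MS
    return codec_kbps + header_kbps


if __name__ == "__main__":
    pcm = ip_bitrate_kbps(64)     # 64 kbit/s PCM codec
    mobile = ip_bitrate_kbps(12)  # "about 12 kbit/s" mobile codec
    print(f"PCM:    {pcm:.0f} kbit/s")
    print(f"Mobile: {mobile:.0f} kbit/s")
    print(f"Ratio:  {pcm / mobile:.2f}")
```

Under these assumptions a 64 kbit/s PCM stream costs 80 kbit/s at the IP layer and a 12 kbit/s stream costs 28 kbit/s. The per-call saving is therefore closer to a factor of 3 than to the factor of 5 the raw codec rates suggest, unless the headers are compressed.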
The list is quite long, I have to admit. But there is one thing the list does not say: while naked SIP is available today, I have yet to see an IMS-capable terminal in the wild. I wonder how long it will still take.
As always, comments are welcome.

