IoT Failures With Network Impact

With hundreds of thousands of IoT devices in mobile networks, it seems that every now and then bad application-level behavior in failure scenarios leads to, well, let’s say, interesting behavior on the network layer. While most network operators keep such incidents to themselves, Annex A of GSMA TS.34 (IoT Connectivity Guidelines, about which I’ve already written here) contains a number of interesting scenarios that mobile network operators have been willing to share with IoT implementors, in the hope that they think a bit before deploying their devices.

Annex A describes, among others, the following scenarios in more detail:

Example 1: 375,000 devices in several countries, all using SIM cards of one network operator, started resetting their GSM modules once a minute because the application on top could no longer reach its server on the Internet, which was down. The result: a massive HLR overload that left all devices handled by that HLR unable to perform location updates and register. In the end, roaming interfaces had to be shut down to reduce the HLR load and stabilize the situation.
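
To make the problem in Example 1 a bit more tangible, here is a minimal sketch of what an application could do instead of resetting the modem every minute: back off exponentially and add random jitter, so that a large fleet does not hit the network in lockstep. The function names, the 60 second base and the 6 hour cap are my own assumptions for illustration and are not taken from TS.34.

```python
import random

# Illustrative values only (assumptions, not from TS.34).
BASE_DELAY_S = 60.0        # first retry window: up to one minute
MAX_DELAY_S = 6 * 3600.0   # never wait longer than six hours


def backoff_ceiling(attempt, base=BASE_DELAY_S, cap=MAX_DELAY_S):
    """Upper bound in seconds for the wait before retry `attempt` (0-based)."""
    # Clamp the exponent so the multiplication cannot overflow.
    exponent = min(max(attempt, 0), 16)
    return min(cap, base * (2 ** exponent))


def backoff_delay(attempt, rng=random):
    """Random wait ("full jitter") between 0 and the current ceiling."""
    return rng.uniform(0.0, backoff_ceiling(attempt))
```

An application would call backoff_delay(n) after the n-th consecutive failure to reach its server and sleep that long before trying again, without power cycling the modem in between. With these numbers the retry window grows from one minute to the six hour cap after nine failed attempts.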

Example 2: (Only!) 59 IoT devices with GSM modules were hacked and misused to make 17,000 fake calls to expensive destination numbers within a few days. The financial loss to the company that deployed the 59 devices was €150k and would probably have been much higher had the network operator not noticed a spike in calls to a number of destinations.

Example 3: An external RADIUS server suffered a massive overload after an SGSN outage, when tens of thousands of IoT devices tried to re-establish their PDP contexts as soon as connectivity was restored. The RADIUS server could only be stabilized after the network operator barred all IoT devices and then let them back into the network in smaller batches.
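
The recovery step in Example 3, letting devices back in a few at a time, can be sketched in a couple of lines. This is only an illustration of the batching idea, assuming a hypothetical list of device identifiers; how the operator actually barred and re-admitted devices is not described in that much detail.

```python
from itertools import islice


def in_batches(device_ids, batch_size):
    """Yield successive lists of at most `batch_size` device IDs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(device_ids)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch
```

The caller would unbar one batch, wait until the load on the authentication server has settled, and only then move on to the next batch.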

TS.34 contains a number of additional scenarios, and I’m pretty sure those are just small harbingers of what will come in the future. Because if we are realistic, for each developer who reads and understands these scenarios and puts countermeasures in place, there will be at least 10 others who don’t care until they are bitten by reality. Unfortunately, that is not only to their own disadvantage but also has an impact on the service of others.