There is a running gag among the engineers I work with: if the website of one of Germany's most popular tech magazines can't be reached, the whole Internet must be down. Whenever one of us wants to check for Internet connectivity, we type in the URL of this website, as it is (almost) always reachable.
So far so good. I was recently reminded of this running gag when I was in Taiwan and wanted to reach one of my servers at home but got no response, neither via the DSL line they are connected to nor via the LTE backup link. A quick query of my GSM power socket via SMS revealed that there was no power outage either. So what could be the reason for this!?
As a next step I ran a traceroute and noticed that everything was working up to the edge of my provider's network in Germany. After that, however, responses stopped coming in. So for about half an hour, the fixed-line and wireless networks of one of Germany's largest network operators were indeed not reachable from the outside. Few probably noticed, as it was 3 am local time. As I was in Taipei, however, it was 9 am for me and I did notice.
I wonder what will happen next time I travel!? So far I've had a DSL outage while I was traveling, a city-wide power outage that interrupted communication last December when I was on vacation, a power outage caused by construction work during another vacation, and now a backbone router outage on yet another trip. And whenever I think that I can't imagine anything else, reality shows me another possibility.
Welcome to the connected world!
Have your server fetch connectivity info from “the outside” every 5 minutes, e.g. from http://www.downforeveryoneorjustme.com/yourserver.de. If that fails (i.e. inbound traffic is blocked or the DSL line is down for good), have it send you an SMS with the reason (“outbound” or “inbound”) included.
Or pay extra for a “business” line with better uptime and an SLA.
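The self-monitoring idea suggested above could be sketched roughly as follows. This is only a minimal illustration under a few assumptions: `can_fetch`, `check_once` and the three callables passed to it are hypothetical names, the inbound probe is left as a parameter because how an external checker reports "up" or "down" is not specified here, and the alert function stands in for whatever actually sends the SMS (for example a GSM modem, since an Internet-based SMS gateway would be useless during an outbound outage).

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def can_fetch(url: str, timeout: float = 10.0) -> bool:
    """Return True if an HTTP GET to url succeeds (usable as an outbound probe)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, timeout, refused connection, HTTP error or bad URL
        return False


def check_once(
    probe_outbound: Callable[[], bool],
    probe_inbound: Callable[[], bool],
    send_alert: Callable[[str], None],
) -> Optional[str]:
    """Run one check and alert with the reason ("outbound" or "inbound").

    probe_outbound: can the server reach the outside at all?
    probe_inbound:  does an external checker see the server? (hypothetical;
                    depends on the response format of the service used)
    send_alert:     hypothetical hook that sends the SMS
    """
    if not probe_outbound():
        # Without outbound connectivity the external checker cannot be asked,
        # so the inbound probe is skipped.
        reason: Optional[str] = "outbound"
    elif not probe_inbound():
        reason = "inbound"
    else:
        reason = None

    if reason is not None:
        send_alert(reason)
    return reason
```

A cron job or systemd timer could call `check_once` every five minutes, for instance with `lambda: can_fetch("https://example.com/")` as the outbound probe, where the URL is a placeholder for any site that is (almost) always reachable.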