When things were new and we used to dial in to a BBS on someone’s computer, what we meant by ‘the internet is down’ was pretty simple. Either our phones were down and we couldn’t dial out, or someone else’s were down and we couldn’t dial in.
On August 12th, a network outage took my server ‘down.’ Now, trying to explain this on Twitter was complicated, since it takes more than 140 characters to explain. The situation was pretty basic. The internet pipe leading to and from my server wasn’t working right. But what did that mean?
As I love to do, let’s step back and think about all the various ways your ‘internet’ might ‘break.’ It’s a fun thought experiment, and this is in no particular order.
- Your home/work internet isn’t working and no one can get anywhere
- Your device’s internet connector isn’t working and you can’t get any signal
- You’re in a place with no signal/wifi
- Your firewall is preventing you from accessing a site
- The server that houses the site you’re trying to visit is offline (or on fire)
- The site you’re trying to visit has a code error and nothing loads
- The DNS is wrong for the site
- The nameserver is wrong/changing (mea culpa)
- The internet connection from the site to the rest of the world is down
- There’s a problem in between you and the site
That list is incomplete. What happened to me on the 12th was the last one, however, and it was caused by something particularly weird that can be summed up as this: We finally hit 512K BGP routes on the internet today and ran out of room.
https://twitter.com/TheProtestBoard/status/499270694702972928
Of course, ‘what’s BGP?’ is the next question. From Reddit:
BGP is a routing protocol that advertises routes externally, each large organization advertises some BGP routes at the edge of their network. Each edge device has a routing table with all the advertised BGP routes from around the internet.
So think of it like a giant phone book, and we ran out of pages. Now before you get scared, not every router needs all the tables. Instead, most routers have the core ones everyone needs, and then they link out to other routers and tables for the rest. These tables act as giant maps for the entire Internet, and those maps are pretty damn big.
A lot of routers, especially Cisco’s, which I think power most of the Internet, simply started dropping routes when they hit the 512k limit. That means you simply could not get from point A to point B or, in this case, from your ISP to your website. I could, for example, get to my site on my phone and from my home internet, but not from my office. Go figure. The routers had no idea how to find my server.
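To make the 'ran out of pages' idea concrete, here's a toy sketch in Python. It is not how any real router works. The `TinyRouteTable` class, its two-route capacity, and the `peer-a`/`peer-b`/`peer-c` names are all made up for illustration, and the prefixes are reserved documentation ranges. The point is just that once the table is full, a new route never gets installed, so lookups for addresses in it come back empty.

```python
import ipaddress


class TinyRouteTable:
    """Toy routing table with a hard cap, standing in for a router's fixed route memory."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.routes = {}   # network -> next hop
        self.dropped = []  # routes that did not fit

    def add(self, prefix, next_hop):
        network = ipaddress.ip_network(prefix)
        if network not in self.routes and len(self.routes) >= self.capacity:
            # Table is full: the new route is simply not installed.
            self.dropped.append(network)
            return False
        self.routes[network] = next_hop
        return True

    def lookup(self, address):
        """Return the next hop for the most specific matching route, or None."""
        ip = ipaddress.ip_address(address)
        matches = [net for net in self.routes if ip in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


table = TinyRouteTable(capacity=2)      # tiny cap so the effect is visible
table.add("192.0.2.0/24", "peer-a")
table.add("198.51.100.0/24", "peer-b")
table.add("203.0.113.0/24", "peer-c")   # does not fit, gets dropped

print(table.lookup("198.51.100.7"))     # a route exists for this address
print(table.lookup("203.0.113.7"))      # no route: point A can't reach point B
```

The second lookup returns `None`, which is the toy version of my office not being able to find my server while my phone could: different routers, different tables, different missing pages.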
This isn’t something ‘new’ by the way. In May, the IPv4 routing table hit 500k routes and the prediction was we’d hit 512k no sooner than August, more likely October. Oops.
As Otto put it:
Everything was affected. See http://downdetector.com/ for example. All those blue graphs should usually be quite flat.
![Downdetector outage graph for AT&T]()
That’s AT&T. It was pretty much the same for everyone, though.
The fix? Well, systems engineers spent their August 12th reconfiguring their routers and in many cases upgrading memory, but it’s a practical limitation of the Internet. That isn’t a long-term fix, either. Nor is IPv6. Oh, I should explain that too.
Internet Protocol version 6 (IPv6) is the latest version of the Internet Protocol (IP), and an IP address is your address on the internet. It’s possible to share them (all my domains have the same one) and you can change them if you need to, but mathematically speaking there’s a limit to how many there can be, on top of the limit on those routing tables. This gets worse when you realize that every single device on the Internet is assigned an IP address for identification and location definition. Your phone. Your iPad. You get the idea.
There are improvements to the mess with IPv6. We’re using IPv4 for about 95% of the net right now, and the IP ‘blocks’ you get take up a lot of room. But with IPv6 the blocks will be larger and store more, so they’ll paradoxically take up less room. But it’s not a full fix. We’re going to have to come up with a better way to store the data for the tables, because things are only getting bigger.
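If you want to see the math behind that, here's a quick sketch using Python's standard `ipaddress` module. The two blocks are reserved documentation prefixes I picked purely as examples, not anyone's real allocation. Each block would be a single entry in a routing table, which is why bigger blocks can mean fewer entries.

```python
import ipaddress

# Total address space for each protocol version.
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses  # 2**32
ipv6_total = ipaddress.ip_network("::/0").num_addresses       # 2**128

# Documentation prefixes, used here only as examples.
v4_block = ipaddress.ip_network("192.0.2.0/24")
v6_block = ipaddress.ip_network("2001:db8::/32")

print(f"IPv4 addresses: {ipv4_total:,}")
print(f"IPv6 addresses: {ipv6_total:,}")
print(f"{v4_block} holds {v4_block.num_addresses:,} addresses")
print(f"{v6_block} holds {v6_block.num_addresses:,} addresses")
```

One route to that single IPv6 block covers far more addresses than all of IPv4 put together, but the table still needs a row for it, so the storage problem doesn't go away.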
On the plus side, for the first time in a long time, when someone yelled “The Internet is broken!” they were actually right.
Comments
2 responses to “The Internet is Down”
Once again, a VERY GOOD explanation of something that is rather complicated. We had clients shrieking at us yesterday. Since it looks like there may be more breakdowns, it will be handy to have something understandable to share with others.
@Jenny Klein: Counting on the news to report on what’s actually going on … Interestingly I saw more non US people impacted by this.