The Long Haul
In 1999 I interviewed to be a network engineer at WorldCom. At the time WorldCom was the world’s largest network service provider, larger than the next six largest networks–Sprint, Qwest, AT&T, etc–combined. A popular screensaver displayed a WorldCom Borg Cube made of logos of companies they’d acquired–UUNET, MCI, CompuServe and dozens more.
The cab dropped me off at a shiny black behemoth of a building west of Columbus, Ohio that I would learn later was affectionately referred to as The Death Star. After a short wait in a lobby befitting a five-star hotel I was taken up the elevator and then through security doors into an enormous network operations center with three-story ceilings, massive movie theater displays, observation booths, catwalks–it was like walking onto the NORAD set from WarGames.
The hiring manager, Jeff, took me to his office behind the theater displays. We had a nice chat and I couldn’t help but notice his office was filled with Nerf weapons and ammo–it was a foam arsenal. I had visions of epic Nerf wars in that space, running across the catwalks dodging darts.
Then we went on a tour and he showed me the world’s most complete networking lab, bigger than a football field and filled with at least two each of every piece of networking hardware known to exist, from modems to massive WAN switches. The intent was to be able to recreate any possible networking scenario in a lab environment. There was an entire side room filled floor-to-ceiling with shelves of thousands of color-coded patch cables. Another room was filled with NICs, expansion cards, memory, every sort of hardware upgrade available so the boxes could be precisely configured to match the target environment. People flew in from all over the world to use the lab; Cisco tested beta gear there.
I had had reservations about moving to Ohio, but they all washed away when Jeff told me I would have unfettered access to this lab. I remember hesitantly asking him after that if my access badge would get me in the building 24/7–he burst out laughing and assured me it would.
My first day I was given a tour by our lead trainer, Steve Androw. Steve showed me this one box in the middle of a line of server racks located in the backstage area behind the NOC’s theater displays and said “this is the master node–whatever you do, never press THIS button because it would shut off our HP OpenView monitoring and trigger…” at which point the finger he had been wagging at the button made contact. Monitoring of everything went down–X.25 wire transfer networks, pager networks used by the military, the modems at every major ISP, Visa, long-haul fiber–all of it wiped off the map, so to speak. Everybody was paged–this could have been a nuclear strike! Bernie Ebbers was paged. The Joint Chiefs of Staff were paged. Bank presidents. It was pandemonium.
I was a member of the Product Control Specialists group, my official title was Internet Network Engineer III. PCS engineers were the ultimate authority on the hardware and software products assigned to them and spoke to vendors as WorldCom’s official voice. We wrote the internal documentation for our assigned products, designed and taught classes about them and served as their ultimate escalation point in emergency situations. Our goal was to document, automate and train so well nobody would ever have to page us.
Pager networks had varying levels of priority. Normal people were level five. Physicians were level four. Police and military were level three. Key government officials level two. POTUS was level one. Our team carried level zero pagers.
Each PCS had two areas of responsibility out of:
- Backbone–this covered most of the global WAN; we dwarfed all other providers
- IP Network–we provided a large chunk of US Internet service, mostly through resellers
- X.25 Network–we handled most US bank wire transfers
- Dial-up–we leased all modems used by large ISPs of the time like AOL & MSN
- Visa’s VITAL network–we handled all US Visa transactions
- Legacy CompuServe–we handled networking for CompuServe’s in-house-built PDP-10 clones, referred to as “the 36-bit hosts”
- Training–we all designed, wrote and taught classes, but one person was primarily responsible for onboarding classes, coordination and maintaining the teaching lab
I was assigned Backbone and IP Network. I managed long-haul fiber, WAN switches, CSU/DSUs, routers, DSLAMs, VPNs and similar. I did not get to work in the fancy room; I worked across the hall in a small, hermetically-sealed fart enclosure called The Fishbowl. Jeff’s Nerf arsenal turned out to be toys Jeff had confiscated to prevent the Nerf battles of my fantasies–in fact Jeff turned out to be a literal cop. His job in the military had been hunting down AWOL soldiers and he spoke lovingly of his “jackboots” and the fun of dramatically landing helicopters in small towns to scare the crap out of everybody, telling each captured soldier he’d have to shoot them if they caused a disruption in the helicopter, for safety.
Ever wonder why busy signals seemed to be the same in the late 1990s no matter which ISP you tried? All the major ISPs leased their modems from us. Each ISP had unique phone numbers, but those numbers all connected to the same banks of modems and we had no mechanism to provision capacity granularly nor any interest in doing so.
Remember how credit card readers used to dial a seven-digit number to authorize transactions? Seems like a security issue if you think about it–just reprogram the box to dial a malicious number that steals credentials or approves bad transactions or whatever you like. Nope. Each of those devices was hardwired to dial the same seven-digit numbers (one per card type) and those numbers were assigned to the respective processing network in every area code that was or ever would be.
December 31, 1999 I volunteered to work overnight as Y2K rolled in. I had first spotted a Y2K bug in the wild at Citicorp several years before–an eight-year certificate of deposit was showing it would have a value of $0 upon maturity in 2000–and I was very curious to see what would happen. We’d had everything patched for months and there were no anticipated issues, but large-scale networking is complex and we couldn’t control preparations made by our peer networks.
WorldCom had hired a team of high-priced Y2K consultants to monitor the situation. They set up a war room in our teaching lab. We started off open-minded about their contributions, but after a couple of hours it was clear they had no idea what they were doing. They had a graphical dashboard set up to monitor network health and it froze on them early on. For a while we entertained ourselves by periodically stopping in and asking them how things were going, at which point they’d look up from whatever they were reading, glance at the frozen dashboard and give a thumbs up.
Once that got stale we used the remote admin tools on the lab computers to whisper creepy things and flash the screens and whatnot. They seemed oblivious to that, so finally we started playing klaxon sounds and loudly saying “oh no, everything’s down!” and similar through the lab computers. That was enough to finally get them to pay attention and notice their frozen dashboard.
As we were having a good laugh about this and scorning the consultants, a guy came running up in a panic. It had just turned midnight in New Zealand and we were receiving frantic reports that the phone lines were out. No emergency services! People were about to start dying!
It turned out people in New Zealand were so concerned about Y2K that hundreds of thousands of them all checked for dial tone right at midnight and many didn’t get one because POTS switches are always oversubscribed.
I mentioned X.25 networks for banking, and banks indeed were our best customers on that front, but X.25 over leased lines was the enterprise standard at the time. It seems crazy in 2019, but back then large companies ran their own wide-area networks in parallel with the Internet. This is the problem that VPNs were brought to market to solve.
The first big VPN project–we were told it was the largest networking contract ever awarded–was closed via handshake deal between Bernie Ebbers and Toyota chairman Hiroshi Okuda. Toyota would pay tens of millions per year for Internet connections at all of its facilities and a VPN for its internal traffic and we would manage it all. This was a departure from standard telco practice in which support stopped at the “demarc”. The point of demarcation at your home is probably a grey box on the outside of the building that connects the outside telephone wires to the inside telephone wires. The phone company is responsible for everything on their side of the grey box, you or your landlord are responsible for everything on the other side. Sometimes the phone company will work on the lines on your side of the demarc, but there’s almost always a charge because they are working on your network, not theirs.
We had done add-on “beyond the demarc” services before, reacting to needs of enterprise customers, but this was the first time they were an integral part of a new deal.
The Internet components of the Toyota product were all standard fare, DS3s and so forth, but we were using a new CSU/DSU vendor and we’d never actually built a mesh VPN. Xedia was chosen to supply the VPN routers. They were a nifty shade of purple and their specs matched our requirements. The shell for their operating system was a TCL REPL, which was handy because you couldn’t ssh into them and had to manage them with Expect.
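Since the routers had no ssh, every management session was an expect-and-send dialogue: wait for a prompt, send a command, wait for the next prompt. Here is a minimal Python sketch of that pattern. The prompts, credentials and the `show vpn tunnels` command are all hypothetical, and the transport is a simulated in-memory stream rather than the dial-up serial lines the real Expect scripts used:

```python
import io

class ExpectSession:
    """Minimal expect-style driver: consume output from a stream until a
    prompt appears, then send the next command. Real sessions ran over
    dial-up serial lines; here the transport is any file-like pair."""

    def __init__(self, reader, writer):
        self.reader = reader
        self.writer = writer

    def expect(self, prompt):
        """Read one character at a time until `prompt` is seen; return
        everything read, including the prompt itself."""
        buf = ""
        while prompt not in buf:
            chunk = self.reader.read(1)
            if not chunk:
                raise EOFError(f"stream closed while waiting for {prompt!r}")
            buf += chunk
        return buf

    def send(self, line):
        """Send a command followed by a carriage return + newline."""
        self.writer.write(line + "\r\n")

# Simulated router session (prompts and output are made up for the demo).
fake_router = io.StringIO("login: password: xedia% tunnels: 8 active\nxedia% ")
commands = io.StringIO()

s = ExpectSession(fake_router, commands)
s.expect("login: ")
s.send("admin")              # placeholder credentials
s.expect("password: ")
s.send("secret")
s.expect("xedia% ")
s.send("show vpn tunnels")   # hypothetical command name
out = s.expect("xedia% ")    # capture everything up to the next prompt
print(out)
```

The real scripts were written in Expect proper, which was a natural fit precisely because the routers’ shell was itself a TCL REPL.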
The Xedia routers were each capable of running 64 simultaneous point-to-point VPN connections and we had a large pile of them to support Toyota’s needs–each VPN connection was to replace a physical line they had been leasing.
They were ordered and installed and set up, but nobody actually bothered to test them until the project got to me. I set up a VPN tunnel, no problem. I set up four, no problem. I set up ten, no dice. A bit of trial and error showed I could make eight simultaneous connections.
I rang up Xedia and spoke with a proud and enthusiastic engineer who was really excited about the project I was telling him about until he realized we were implementing now and I had boxes on my desk. He stammered out that they would be able to support 64 simultaneous connections in the near future but they hadn’t expected anyone to need it yet.
He said he’d get back to me and I went to tell Jeff the bad news: Implementation had gone all-in on these routers, they were installed all over the world, they didn’t work and we weren’t going to deliver. It was a catastrophe. Our CEO and company were about to lose serious face.
Then the next day a miracle happened. The Xedia engineer called me back and said they were releasing an OS update that would support 64 simultaneous connections. There was a caveat–it wouldn’t work on any routers with an ISDN expansion card–but we could work around that by using external ISDN interfaces.
About six weeks later we were preparing to go live–Implementation would bring the system up and hand it over to our team to operate in production. Around 6pm I was getting ready to go home and noticed a huddle down the hall. Implementation could not get the new CSU/DSUs online. They’d never worked with Larscom products or even heard of them and indicated they’d been unable to get support–somebody made a frustrated comment that their entire support team was a guy named Lars who lived in his mother’s basement.
They were out of ideas, giving up and going home. I had spent the past few weeks learning Larscom and Xedia inside and out, building tools for them, integrating them into our monitoring, writing classes for them. I offered to help, but they refused at first because our teams were so siloed that they didn’t realize I was now WorldCom’s Larscom expert. Even once I clarified that they were hesitant–this wasn’t how things were done–but once I pointed out they had nothing to lose they agreed.
I tried for a few minutes in the office, but I was hungry and had been thinking about a cold beer before all this happened, so I packed it up and went to Pizzeria Uno and had a bacon burger and a Sam Adams. I can remember it all with unusual clarity, senses heightened with a mix of adrenaline and calm. The burger was overcooked but it was delicious and the beer was cold and perfect.
I walked home from there and got to work. It took about 30 minutes to figure out the issue, get the first couple of Larscoms up and phone into the NOC with instructions to get the rest of them up. It took about 90 minutes total, dinner break and walk home inclusive.
Toyota’s managed Internet+VPN went live the next morning on schedule. Jeff passed me word from our VP that Bernie Ebbers was grateful, had sent me a personal “attaboy” and I was on a fast track to VP.
I didn’t care much about being a VP at the time, but I remember being so happy that I’d “made it”–I was on the CEO of WorldCom’s radar as an up-and-comer. I’d never have to worry about finding a job again.
Bernie Ebbers, of course, is still in prison as I write this. He won’t be eligible for parole until 2028. He’s considered the fifth-worst US CEO and the tenth-most corrupt CEO of all time.
Part of my job involved fielding calls from FBI agents demanding data related to investigations they were working on. They invariably had no warrant and when I told them they needed one they invariably got nasty and started to threaten me, tell me they could make my life difficult for not cooperating. I took no small pleasure in telling them my name was Jeff and giving them Jeff’s direct line so they could follow up.
Jeff loved reaming FBI agents out for this nonsense, threatening to tell their bosses or have them arrested. Having a cop boss turned out to work pretty well.
In general our organization was rather happily diverse, but not the PCS team. We were a bunch of white dudes in a secure room. Nobody ever said anything inappropriate about our co-workers or anything like that, but it was raunchy and it was fun. We were kings.
WorldCom choosing a vendor could be everything. Remember when US Robotics claimed to have the world’s best-selling modems? That was thanks to a WorldCom contract.
Despite having literally zero say in purchase decisions we were nonetheless the face of WorldCom to our vendors. We were wined and dined at lunch on a regular basis. One vendor offered to lease a line between my apartment and our data center so I could have a free blazing SDSL connection at home to “test out” their gear.
One day Jeff came into the Fishbowl and asked if we would be OK hiring a woman for the PCS team. He had someone talented in mind, but he didn’t want to put her through the process if we were going to reject her for being a woman.
Only one member of the team was willing to give her a shot. I am ashamed to say it was not me. I have no excuse; I knew it was wrong. I didn’t understand how wrong it was at the time, but that’s irrelevant. It remains the biggest regret of my career.
But Jeff the cop? He worked on us for a good two weeks trying to change our minds. He argued, he begged, he pleaded, he tried bribes–all to no avail. We were awful; Jeff was a hero.
Remember in WarGames the kid war dials numbers looking for remote access, then he has to figure out a password to get in? 16 years after WarGames wide swaths of the Internet were managed via dial-up modems. Most of the managed boxes simply had modems attached to their serial ports–there were no passwords or authentication at all. The only security was via obscurity. Not knowing the phone numbers, not knowing what the boxes were or what they did was what kept the Internet up in stub areas away from major data centers–which was most of the world at that time. Even the Larscom CSU/DSUs we’d just installed for Toyota were managed this way.
Monitoring happened via piles of US Robotics modems attached to console servers dialing up machines around the world and running Expect scripts 24/7. This would overheat the modems so they each had a little plastic clip-on fan attached and they were all spread out to let the heat dissipate–it was a hilarious contrast from the polished NORAD NOC on the other side of the wall.
In the US these outposts were relatively easy to manage, but outside the US things could get tricky. For whatever reason I was assigned to handle Asia while my counterparts in Virginia handled the rest of the world.
It worked like this: as the Internet was being quickly expanded we needed places to put gear, but we generally didn’t need offices or staff. Instead we would lease a data closet in some building and have wires run there–with no small amount of effort dealing with local governments and permits and contractors and so on. It was a truly global project. Our remote management capabilities, as mentioned, were rudimentary. Thus we relied on maintaining relationships with the other business or businesses sharing the building with our closets. Our sales people would stop by when they were in the area, bring some cookies or whatever and chat folks up, keep things friendly. So when we needed eyes and ears in the closet we had the number of the neighboring business and they had keys. I would loop a translating service in, ring up some random business and ask them to go look and tell me about the blinking lights.
This worked perfectly fine in practice, but it could get awkward–mostly for the sales reps–when the tenant of the building changed. We wound up sharing space with a couple of brothels and an “opium den” the sales rep called it though I’m guessing it was more of an illegal pharmacy than a red-curtained movie set.
After one particularly long and frustrating call dealing with a wiring closet in Kuala Lumpur I made an out-of-band remote management breakthrough: cameras. We’d set up cameras in each wiring closet attached to a box we could dial into. When we needed to see what was going on with the blinking lights we could use the cameras and cut down on third party interaction.
Yes, we knew about SNMP–in fact we probably had the world’s largest OpenView/SNMP installation–but not all gear supported SNMP and stub closets didn’t have alternate networks, anyway. If the closet couldn’t talk to our network there was no path for SNMP traffic to reach us. There wasn’t enough infrastructure to form a mesh.
I have read a lot about mesh networks as a backup Internet solution if the large corporations that operate it drop the ball or interfere, either at the request of government or for their own purposes. But mesh networks only work across population centers and, besides, it’s not even a hard problem to solve in an emergency–in population centers you can run Ethernet cables building to building.
The hard part about the Internet is long haul, the backbone. Trawlers laying cables in the ocean–that’s hard. Getting fiber across the continental United States? That’s even harder. You have to dig a trench across thousands of properties, deal with dozens if not hundreds of state, county and local governments.
The traditional way to do this, since the days of the telegraph, has been to piggyback on railroad rights-of-way. Railroads either own the land they operate on or they have existing agreements with the landowners. If you are a large corporation with piles of cash and lawyers none of this is terribly difficult. But how do you make an Internet backbone in the continental United States without piles of resources?
I believe the answer is in the power of local government. Railroads have a vested interest in maintaining good relationships with local governments along their routes. If there is conflict local governments can apply pressure on the railroad and landowners. When local governments ask for help, railroads listen. I envision a network of old train towns–lovely, historic towns along the railroad currently underutilized thanks to automobiles and highways–whose local governments all pressure the railroad and landowners not only for fiber rights-of-way but also for hike/bike trails to connect them together. This, then, would lead to a “distributed tech center” connected via rail travel & shipping, high-throughput fiber and trails. It would need to be a ring around the continental US–something like connecting interstates 5, 95, 10 and 80.
Thus, instead of concentrating wealth in only a handful of coastal cities, we can spread it out across the country and make a major impact on state and local economies–even mitigate electoral college concerns by lifting all boats.
If you are interested in learning more about building a distributed tech center and enjoy a scripted podcast, check out The Remote Outcast Podcast.