Whiskey Tango Foxtrot ‽ Over

by Bryan

Yesterday Marcy posted “The Clapper Review: How to Fire 90% of SysAdmins?”

I read it and learned that USCYBERCOM employs 1,000 systems administrators with the same access as Edward Snowden. After I got over the shock, and the speculation as to what they needed a thousand SysAdmins for, I started to think about the cost. Snowden made over $100,000/year, so the cost of a thousand Snowdens would be over $100 million/year coming out of our taxes. That is a significant administrative cost. SysAdmins don’t collect intelligence or analyze it; they just keep the computer networks up for the organization.

The problem isn’t that they want to cut the number of SysAdmins, it’s that they have such a mess that they feel they need a thousand SysAdmins to keep it working.

4 comments

1 Badtux { 08.14.13 at 9:57 pm }: They’re running Google-scale infrastructure, and if you do that, you need Google-scale people. Google DevOps (the word for what these people do, which is *not* just plugging in machines and adding users; there’s a whole private cloud infrastructure to maintain and automation to write, and yes, DevOps is deeply involved in that) is around 10,000 people. It’s highly compartmentalized, so that only a few people have the sort of “God” access that Snowden had, but it simply takes a lot of people to keep the massive infrastructure for Big Data up and going, period.

Personally, I’m surprised they have *only* 1,000 sysadmins…
2 Bryan { 08.14.13 at 10:31 pm }: The facility in Hawaii is definitely not in the thick of things, so if Snowden had the kind of access he obviously did have in Hawaii, there must be a large number of people with that kind of access.

This may be a matter of definitions, i.e. what Clapper calls sysadmins may not be what I think of, or the jobs you think of.

If they need these people now, I can’t see them being replaced with hardware, as Clapper seems to be suggesting – hardware doesn’t write code, it executes it.

I’ve known more than a few world-class coders, and few, if any, of them could qualify for a security clearance, or would want to work for the government. They tended not to be ‘team players’.
3 Badtux { 08.15.13 at 2:33 am }: That’s Clapper’s problem. Big Data needs a lot of very bright people to keep it running, to program the routers and resurrect machines that have gone dark and basically run around replacing the vacuum tubes as they burn out (just joking about the last, but sometimes replacing disk drives in a modern infrastructure feels like about the same chore: when you have hundreds of thousands of disk drives, there will be tens of thousands of disk drive failures during the course of a year), and if Clapper wants good ones, they are… unreliable. The fact that he can’t find 1,000 reliable sysops out of over a million IT people in America is hilarity-inducing.
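
As a rough back-of-the-envelope check on the drive numbers above, here is a minimal sketch. The fleet size and the annualized failure rate are illustrative assumptions only, not figures from any real facility; the point is just that a failure rate of a few percent across hundreds of thousands of drives lands in the "tens of thousands per year" range.

```python
def expected_drive_failures(drive_count: int, annual_failure_rate: float) -> float:
    """Expected number of drive failures per year for a fleet of drives."""
    return drive_count * annual_failure_rate


# Both inputs are assumptions chosen for illustration.
fleet_size = 300_000   # "hundreds of thousands of disk drives"
afr = 0.05             # assumed 5% annualized failure rate

per_year = expected_drive_failures(fleet_size, afr)
per_day = per_year / 365

print(f"{per_year:,.0f} failures/year, about {per_day:.0f} per day")
```

With these made-up inputs the estimate comes out at roughly fifteen thousand failed drives a year, or about forty every day that somebody has to find, pull, and replace.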

An example of why you need so many people: I spent today with a very bright young DevOps engineer working on how to automate a new cloud deployment that we’re doing, and it took all day just to get the basic design template deployed. And that was a very small cluster, “only” twelve machines. And it took two of us all day just to get a *non-functioning* design template up and going (i.e., one that got all twelve machines up and running bare Linux with no applications, in the correct subnets, with the correct routing and firewalls to control security between them; it doesn’t do anything yet). And that was re-using firewall rules and routing templates from a *previous* deployment, just changing things as needed to work in the new configuration. And this was without writing actual programs, just hacking a giant JSON file (as in, several thousand lines) that defined each facet of what the virtual infrastructure looked like. Software-defined networking is really cool and all, but really tedious too.
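
To give a flavor of what such a definition involves, here is a toy sketch in Python. It is not the template format of any real cloud provider, and every name in it (`build_template`, `validate`, the `web`/`app`/`db` subnets, the ports) is hypothetical. It only shows the general shape: machines assigned to subnets, firewall rules between subnets, and the kind of cross-checking that gets done by eye when the file is edited by hand.

```python
import json


def build_template(machines_per_subnet: dict[str, int], firewall: list[dict]) -> dict:
    """Build a toy infrastructure definition: subnets, machines, firewall rules."""
    subnets = {}
    machines = []
    for index, (name, count) in enumerate(machines_per_subnet.items()):
        # One /24 per subnet; the addressing scheme is an arbitrary assumption.
        subnets[name] = {"cidr": f"10.0.{index}.0/24"}
        for n in range(count):
            machines.append(
                {"name": f"{name}-{n + 1}", "subnet": name, "image": "bare-linux"}
            )
    return {"subnets": subnets, "machines": machines, "firewall": firewall}


def validate(template: dict) -> list[str]:
    """Return a list of problems: references to subnets that are not defined."""
    known = set(template["subnets"])
    errors = []
    for machine in template["machines"]:
        if machine["subnet"] not in known:
            errors.append(f"machine {machine['name']}: unknown subnet {machine['subnet']}")
    for i, rule in enumerate(template["firewall"]):
        for side in ("from", "to"):
            if rule[side] not in known:
                errors.append(f"firewall rule {i}: unknown subnet {rule[side]}")
    return errors


# Hypothetical three-tier layout, twelve machines in total.
rules = [
    {"from": "web", "to": "app", "port": 8080, "action": "allow"},
    {"from": "app", "to": "db", "port": 5432, "action": "allow"},
]
template = build_template({"web": 4, "app": 4, "db": 4}, rules)
problems = validate(template)
text = json.dumps(template, indent=2)

print(f"{len(template['machines'])} machines, {len(problems)} problems, "
      f"{len(text.splitlines())} lines of JSON")
```

Even this toy version runs to well over fifty lines of JSON for twelve machines and two firewall rules, with no routing, keys, or failover described at all, which hints at how a real definition grows to several thousand lines.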

Now, multiply that by hundreds of thousands of physical machines constantly being requisitioned by project groups to do their own particular tasks, with reconfigurations of the virtual infrastructures needed on a daily basis to keep those project groups isolated from each other in this virtual cloud of hardware… that is a lot of people. Amazon gets away with only a few hundred DevOps engineers by basically making us customers do the hard work of reconfiguring bare metal virtual machines into new networks with correct firewall rules, keys, multi-factor authentication, etc. to keep them secure, basically turning us into tens of thousands of unpaid DevOps engineers. But the Feds don’t have that luxury.

It’s hard for people not working in Big Data to grasp the sheer *scale* of what we’re doing. And yes, Clapper is right that the tools to handle that scale aren’t there yet, which is why we’re hacking JSON files by hand to do this stuff. But the tools are not a panacea. Having good tools would make it easier, but nothing will replace actually understanding how to create multi-layered secure infrastructures with security layers and mechanisms to make it hard for an exploit of one machine to do anything to another machine. Most of what we did today was looking at a picture of the infrastructure on the whiteboard and drawing arrows and boxes representing subnets and routes and firewall rules and which set of machines went where to allow scalability and security, and discussing various scenarios and how we were going to handle them, including failovers to backup infrastructure. Having better tools wouldn’t have sped that process up one bit.
4 Bryan { 08.15.13 at 11:54 am }: The more time you spend in planning and design, the less time you spend in coding.

Pulling coax for a physical network in an existing building was always time-consuming, but if you did it right the next guy who had to expand the system had an easy time of it. Looking at a backplane with all of the cables the same color and none of them ID’d was always a bad start. Then you looked in the suspended ceiling and saw hundreds of feet of coax piled up. Doing that virtually has a lot of appeal.

Why would anyone work for the government when there is better pay and benefits working for Amazon or Google or any of the other Cloud providers for the same skill set? The best way to accomplish the job is to use military people who already have the commitment and clearance, who can be identified through testing, and train them to do it. A thousand E-6s and E-7s would be a hell of a lot cheaper and more dependable than outside contractors. When they retired, they would have marketable job skills to supplement the reduced military pensions that we hand out. It would certainly take months or years to train them, but it would be a better solution.

There are all kinds of ‘tools’ for coding something as simple as web pages, but in the end to get exactly what you want, you end up with source code and a text editor. People who do this for a living have created their own ‘tools and templates’, and after enough time are usually editing what they have already done to produce the new project they are working on, with the added benefit of knowing what is going on in the code.

The scale is the problem that none of the people who wanted to do this has ever really understood. They are going to need to build a new factory to get the disk drives they need for that place in Utah, which will probably drive the price of drives through the roof and move more people over to SSDs, as they become much more price competitive with rust.