In Search of Egg Baskets

The problem started because a little bit of knowledge can be a dangerous thing. It was 1999. I had a new job. My predecessor had ordered a new server, which arrived after I did. I didn’t know the first thing about setting up or running a server. But I had web access, and I had some help, and before I knew it, the students and staff had accounts and storage space, we were hosting our own web site, and technology had become mission-critical.

Photo credit: Billie Hara on Flickr

Before that point, if the technology didn’t work, there was always a plan B. Teachers didn’t check email every day. We didn’t have computers in all of our classrooms. Students were almost never required to use computers, except in computer-specific classes.

It wasn’t long before I felt comfortable with the new server. It helped that it would break occasionally. With no consultant time remaining and no budget, I learned to fix things. Eventually, we added another server. And then another. And then one for each building. And then backup servers and email servers and archive servers. Then, administrators started buying applications that required their own servers: online gradebooks, professional development packages, food service point of sale systems. Before I knew it, I had 30 servers.

And the mission-critical nature of most of them had become evident. If something was the matter, I’d hear about it quickly. And keeping everything running was proving harder and harder, especially with limited budgets and even more limited staff.

So, two years ago, I started looking at virtualization. If we could take one piece of hardware and run multiple virtual servers on it, that would be easier to manage. We’d have fewer physical devices. We’d save money. We’d save energy. The grass would be greener. The air would be cleaner. We’d all be smarter and happier.

At the same time, we were having a problem with disk storage. Not only were we out of space on many of the servers that held staff and student data, but we were also out of backup space.

So, last year, I added a new 30 TB storage array and virtual server to my infrastructure. It was wonderful. I made a pile of all of the old servers that we could now “retire.” Everything was in the array. Life was good.

It was Mark Twain (probably) who wrote, “Put all your eggs in the one basket and — WATCH THAT BASKET.” Lately, we’ve been having trouble with the basket. Specifically, one of the hard drive controllers on the storage array has been a bit flaky. After countless hours of troubleshooting, support calls, and annoying reliability problems for my users, it appears to be working now. But I don’t have the confidence in it that I did a couple months ago.

Perhaps the worst part is that we now depend on it so much. It’s used by the early-riser teachers who are in at 6:00 AM. It’s used by the elementary school teachers who are still at school at 4:00 PM. It’s used by teachers and students working from home, sometimes until midnight or later. If I need to reboot it or pull a controller out to troubleshoot, it has to be done between about 2:00 AM and 6:00 AM. Since I get neither comp time nor overtime, that gets old quickly.

So, I’m pulling those old servers back off the shelf. After some hard drive upgrades, they’ll all become virtual servers, so I can move resources around more easily when something’s not working. But to really get the redundancy we should have, I’d need to invest a lot more money than we have in the budget, and a lot more time than I have.

We can have a lot of technology. We can have reliable technology. We can have inexpensive technology. But we can’t have all of those at the same time.