Figure 1. An example of a supply air plenum that is not supplying air.


Forget high-tech or high-expense fixes for a minute. Have you tried to bump efficiency by looking for the next aisle over, deploying a Trim and Respond strategy, or trying the soon-to-be-famous Paper On A Stick approach? Some of the biggest names in the data center universe think that you plant the seeds of real savings with some mighty simple steps.

If you have ever gone apple-picking in the fall, then you know that early September beats late October. It’s the difference between plucking low-hanging fruit (literally) and having a kid stand on your shoulders, long pole in hand, poking the tree tops, causing the few worm-ridden apples remaining to fall on your miserable noggin. One year, we went so late in the season that when the charming horse-drawn haywagon dropped us off among the trees, we were greeted by the sight of bins full of store-bought apples waiting to be “picked.”

Figure 2. Little things add up. An example of a floor brush kit that can help minimize leakage at cut outs.

In the data center environment, there are equivalents to this apple harvest timing challenge. There are numerous no-cost, low-cost, simple steps that you can take today to tune up a data center. Plug some leaks here, move a temperature sensor there, and before you know it, you have increased operating efficiencies and lowered operating costs while simultaneously looking smart and thrifty. All while avoiding the weight on your shoulders and the inevitable headaches that come with procrastinating in your own orchard of technology.

A GOOD TIME AT UPTIME

I was fortunate enough to attend the Uptime Institute Symposium held in Santa Clara, CA, in May of this year. I knew the speakers list beforehand (Google, Facebook, Microsoft, AOL, et al.), so I was not surprised by the numerous presentations on airside economization (good thing), expanded ambient conditions in data centers (good thing), and nifty modular design concepts (I’m still looking for the value proposition for the average data center operator). But what frankly surprised … nay … dare I say, delighted me was the number of presentations on good, common-sense steps that John Q. Datacenter could easily understand and apply in existing environments.

So much is written about new data centers and new data center concepts, be they Coops, PACs, or Pods, that when someone comes down from the mountaintop and actually reaches out to the simple operators of legacy data centers, it can be startling. But it shouldn’t be. Because anyone who knows anything about the big guys knows that they have their fair share of legacy data centers, too. 

Figure 3. Adding blanking panels can keep expensive cold air from bypassing gear.

Consequently, if Google, as an example, is really serious about reducing energy usage across the board, then they have to take every square foot seriously. So that includes that 10,000-sq-ft raised-floor computer room in Cleveland, landlocked in the basement of Building 201, just as much as the headline-grabbing seawater-cooled jewel in fancy Finland. (For the record, I have no idea if Google has properties in Ohio, and I live in St. Louis, so who am I to cast aspersions at Cleveland … but hopefully you get my point.)

So who was sharing? Well, I won’t list everyone because that would be cheating to reach my promised word count, but let me point out a few of my favorites so that you can do some self-guided research: Robert “Dr. Bob” Sullivan, PhD; Orlando Castro and Steve Press with Kaiser Permanente; and Chris Malone with Google.

And just so we are all on the same page, the types of solutions they were talking about and that I will be suggesting are predicated on a “typical” legacy data center design, which usually incorporates the following.
  • Traditional downflow computer room air conditioning (CRAC) units

  • A raised access floor utilized as a supply plenum

  • IT cabinets arranged in a hot aisle/cold aisle configuration


THE PATH OF LEAST RESISTANCE

The path to increased efficiency parallels the well-worn path of least resistance. The path of least resistance that I speak of is the mindless jaunt that air will take simply because it knows no better. You can do all kinds of nifty things in the data center, including relocating floor grilles, implementing aisle separation, and even making modifications to your CRAC units, but if you have an “undisciplined” plenum, then it probably doesn’t matter.

When was the last time you looked under your (or your client’s) raised floor? Are the penetrations through the walls and floor sealed? Are the cutouts under the equipment enclosed in any way? Does cabling under the floor block the plenum? All of these are signs of an undisciplined plenum, and because they are normally out of sight and out of mind, they will stealthily trump the steps you take above the floor or at the equipment.

In turn, the first step to an efficient data center is to examine the plenum and establish order. This is especially important in older data centers where cable management protocols may be lax or non-existent and so many changes have been made over the years that it may not be surprising to find Jimmy Hoffa buried under a mound of Cat-5 behind CRAC-004.

A troubling example of an unmanaged plenum that has overtaken a data center and come to dictate its operation is the National Computing Center (NCC) of our very own Social Security Administration. Figure 1 is an undoctored photograph of the raised-floor environment within the NCC. Of course, the face of the technician has been blacked out, but can you blame him? You think you have problems? Has your plenum made national headlines? Act now, or you may be staring at an unflattering photo in the upcoming company newsletter.

Simple actions to take:
  • Seal holes in walls and floors with plenum-rated products.

  • Clean up the cabling: Try to establish dedicated horizontal planes for cabling to avoid restricting or affecting airflow.

  • Consider floor brush kits at cut-outs, similar to those made by Hoffman and KoldLok (Figure 2).

  • Replace missing or ill-fitting floor tiles.

  • Seal up the gaps around the CRAC cabinet and the raised floor, including between the CRAC and the wall if it isn’t installed tight.

  • And if you have a drop ceiling, return plenum or not, check for openings to adjacent spaces and seal them up, too.


IF YOU'RE HOT, THINK COLD

Once your plenum is in order you may notice that some hot spots got better, but gosh darn it, some are still out there. You may even have new hot spots. Some cold spots are likely as well, but conventional wisdom would dictate that you attack the hot spots first, right? Not so fast. Recall that data centers are air cooled, and we have established that air is in no way a wise medium, conventional or otherwise. Just because you have a hot spot doesn’t mean that’s where the problem is originating; that’s just where it’s manifesting itself.

When you clean up the plenum, new pathways for the air will be opened up, and places in your data center that used to require multiple floor tiles just to get the required cooling may now only need one or two. So while that boosted airflow may seem like a good thing at first, the reality is that extra air in aisle A-1 is actually air stolen from aisle D-4.

So using one of the most sensitive sensing devices known to man (yourself), walk the aisles and see where things feel too cold, then remove the perforated panels there and replace them with solid ones, or close their dampers if so equipped. I would caution you not to use the old standby of cardboard and duct tape. Not only does it look bad, but it is a tripping hazard. All of the efficiency improvements will be for naught if you cause an outage due to a tech taking a header into a bank of blade servers.

Another byproduct of better flow under the floor may be induction at floor diffusers where before there was normal flow. In particular, if you had a wall of wire just a few feet in front of a CRAC, then that blockage would have eaten up the velocity and forced air to flow through diffusers between the CRAC and the cabling. Remove that blockage, and now the air flows unfettered and, in turn, that diffuser may actually experience induction due to the high velocity so near the CRAC. This can be especially true with old-school CRACs that incorporate forward curved fans.

Another sophisticated device to determine if this problem exists is one for which I am applying for a patent: It’s called paper-on-a-stick. Basically, if you don’t like bending over, you can use that duct tape I told you to avoid earlier and tape a sheet of paper to a stick. Walk around the data center, and if the paper is sucked to a perforated tile instead of sent fluttering above it, then you have induction, my friend.

Simple actions to take:
  • Cover, close, or preferably replace unneeded perforated tiles in cold spots with solid tiles.

  • Remove or relocate floor diffusers experiencing induction.
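
If you want to keep notes from that walk-through, the two checks above boil down to a tiny decision rule. The following is only a sketch in Python; the tile labels and the `tile_action` helper are made up for illustration, and the two inputs are simply what you felt in the aisle and what the paper-on-a-stick did.

```python
def tile_action(feels_too_cold, paper_pulled_down):
    """Suggest an action for one perforated tile (hypothetical helper).

    feels_too_cold:    the aisle felt overcooled at this tile
    paper_pulled_down: paper-on-a-stick was sucked to the tile (induction)
    """
    if paper_pulled_down:
        return "remove or relocate (induction)"
    if feels_too_cold:
        return "replace with solid tile or close damper"
    return "leave in place"


# Hypothetical walk-through notes: tile label -> (too cold?, paper pulled down?)
survey = {
    "A-1/03": (True, False),
    "A-1/04": (False, False),
    "B-2/01": (False, True),
}

for tile, (cold, pulled) in survey.items():
    print(tile, "->", tile_action(cold, pulled))
```

Induction is checked first because a tile that pulls air down into the floor is not delivering any cooling at all, whatever the aisle feels like.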



LOW-TECH AISLE SEPARATION

Hopefully, by this point you are seeing some improvements. Note that none of the steps have required any tweaking of CRAC unit setpoints or expensive modifications. Many of these ideas can hopefully be achieved using operational funding vs. new capital expenditures. If your data center is like the typical data center, even though things may be getting better, there are probably some additional opportunities to explore above the floor level.

Start in the cold aisle. If at any place you have an unobstructed view of an adjacent hot aisle, you just found a window of opportunity. And I mean anywhere. It’s surprising to me how many empty cabinets you can find in a data center.

Recently, I was invited to walk through the flagship data center for a major colocation provider. They are as serious about saving energy as the next guy, and they are leaders in our industry. But even though they had a model layout with clearly defined hot aisles and cold aisles, in more than one instance there were empty cabinets without doors or panels to block airflow.

Figure 4. A simple example of a flexible air curtain and hot aisle containment.

Now this may simply be a byproduct of being a colo. Perhaps they have agreements with clients that don’t allow them to get inside the cage or racks once leased. So if, like the colo, you work with a CTO who would rather give you his World of Warcraft on-line password than let you touch the inside of his precious cabinets, then I would suggest you can and should establish some basic air management protocols.

Simple actions to take:
  • Establish and enforce a regular and linear geometry within the data center: Consistent aisle widths; cabinets in rows of a regular length; no gaps in the rows.

  • If you have empty cabinets with perforated doors that can be replaced with solid doors easily, do it.

  • Obtain rack blanking kits from the rack manufacturers or firms like Tripp-Lite. These can range in size from 1U up to 8U and cost less than $5 apiece (Figure 3).

  • If cabinets are missing in a row, provide rigid blank-off panels or curtains similar to those provided by Polargy and Simplex (Figure 4).

  • Give a shower curtain to the CTO and ask him to stand in the gap.


STOP FIGHTING

So far it has all been about air management. And while I took the low-tech approach to aisle separation, you can feel free to go wild and apply a full segregation strategy of your choosing. But according to Dr. Bob Sullivan (the creator of the hot-aisle/cold-aisle concept, no less), you can get as much as 10 kW per cabinet by applying these basic strategies without a huge investment. Now those are Dr. Bob’s numbers, and I winced when I heard them, but even if the number is half that, that’s still a lot of bang for your legacy buck. The bottom line is that if you manage the paths and eliminate short circuits and mixing, you will get more from your cooling infrastructure.

The last simple step I would suggest is a request that might make you blanch, but I want you to embrace your inner rebel and start tweaking setpoints on your cooling and humidity levels. I recommend an approach commonly referred to as “Trim and Respond.” I try this at home on my A/C thermostat and my wife typically responds by slapping the back of my head, but I think servers are slightly more forgiving. Basically, Trim and Respond is a strategy where you adjust a setpoint and wait to see what happens. If nothing happens, you trim again and so on, until you either get a negative response or you reach a threshold you are comfortable with. And because I am not a big proponent of tight bands on humidification levels in the typical data center, I would suggest you tweak your humidity band first.

Why? Because there is a very high probability that your humidification sequence is fighting your dehumidification sequence and you are simultaneously humidifying and dehumidifying. That’s a lot of energy being wasted to likely stay within an artificially tight limit. Another potential benefit is that if you widen your relative humidity band enough and raise your CRAC leaving air temperature sufficiently, there’s a better than 50/50 chance that you will end up with dry coils. And dry coils mean lower static drop and less water in the data center environment.
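
One way to find out whether your units really are fighting is to take a snapshot of what each CRAC is doing at the same moment and look for humidifiers and dehumidifiers running together. The sketch below assumes you can get that status by hand or from your monitoring system; the `CracStatus` fields and unit names are hypothetical, not any vendor's API.

```python
from dataclasses import dataclass


@dataclass
class CracStatus:
    """One snapshot of a CRAC unit (hypothetical fields, not a vendor API)."""
    name: str
    humidifying: bool
    dehumidifying: bool


def find_fighting(units):
    """Return (humidifying names, dehumidifying names) if both lists are non-empty.

    If nobody is humidifying, or nobody is dehumidifying, there is no fight
    and two empty lists come back.
    """
    adding = [u.name for u in units if u.humidifying]
    removing = [u.name for u in units if u.dehumidifying]
    if adding and removing:
        return adding, removing
    return [], []


# Hypothetical snapshot of one room
snapshot = [
    CracStatus("CRAC-001", humidifying=True, dehumidifying=False),
    CracStatus("CRAC-002", humidifying=False, dehumidifying=True),
    CracStatus("CRAC-003", humidifying=False, dehumidifying=False),
]

adding, removing = find_fighting(snapshot)
if adding:
    print("Fighting:", adding, "humidifying while", removing, "dehumidifying")
```

A single snapshot only proves the fight at one instant; repeating the check over a day or two gives a better picture of how often it happens.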

Figure 5. Although not pretty, this installation helped make the case for more formal air separation (5).

So how wild do you get? That’s your call based on your comfort levels, the criticality of your data center’s mission, and the amount of authority you have. But I would work inside the ASHRAE limits for starters. In practice, I work with a one-step-forward, two-steps-back approach: If you trim by increasing the temperature setpoint 1°F and you get a negative response (equipment alarm, failure, or operator complaint), then you reset the temperature setpoint back down 2°F. The intent is to stay off the ragged edge.
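
For those who think better in code, here is a minimal sketch of that loop, assuming a 1°F trim and a 2°F back-off as described above. The `has_negative_response` callback is a stand-in for whatever tells you about alarms, failures, or complaints; it and the example numbers are hypothetical, and in a real room you would wait hours or days between trims rather than loop straight through.

```python
def trim_and_respond(setpoint_f, limit_f, has_negative_response,
                     trim_f=1.0, backoff_f=2.0):
    """Raise a temperature setpoint in small trims until a limit or a complaint.

    setpoint_f:            starting setpoint, degrees F
    limit_f:               the threshold you are comfortable with, degrees F
    has_negative_response: caller-supplied function (hypothetical) returning
                           True if the new setpoint drew an alarm, failure,
                           or operator complaint
    """
    while setpoint_f + trim_f <= limit_f:
        setpoint_f += trim_f                  # one step forward
        if has_negative_response(setpoint_f):
            return setpoint_f - backoff_f     # two steps back, then stop
    return setpoint_f                         # reached the limit quietly


# Hypothetical room that starts complaining at 78 degrees F
final = trim_and_respond(72.0, 81.0, lambda t: t >= 78.0)
print(final)  # prints 76.0
```

Note that the back-off lands below the last setpoint that worked, which is the whole point of staying off the ragged edge.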

Simple step-by-step actions to take:
  • Incrementally trim your humidity setpoint range until you reach the latest ASHRAE-recommended absolute dewpoint values of 42°F and 59°F.

  • Incrementally trim your temperature setpoint range until you reach the latest ASHRAE-recommended values of 64°F and 81°F.

  • Now, if you’re feeling especially saucy, adjust humidity levels to the ASHRAE allowable RH of 20% to 80% and maximum dewpoint of 62°F.

  • And if you are officially on the wild side, do the same for temperature until you are at the ASHRAE allowable temperatures of 59°F to 90°F.
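
To keep track of where a given reading sits relative to those two sets of limits, a small lookup like the sketch below may help. It uses only the numbers from the bullets above (degrees Fahrenheit and percent RH), so it is a rough guide rather than a full statement of the ASHRAE envelopes; the function name and the sample readings are hypothetical, and where you take the reading is up to you.

```python
# Envelope numbers copied from the bullets above (degrees F, percent RH).
RECOMMENDED_TEMP_F = (64, 81)
RECOMMENDED_DEWPOINT_F = (42, 59)
ALLOWABLE_TEMP_F = (59, 90)
ALLOWABLE_RH_PCT = (20, 80)
ALLOWABLE_MAX_DEWPOINT_F = 62


def within(value, limits):
    """True if value lies inside the inclusive (low, high) range."""
    low, high = limits
    return low <= value <= high


def classify_reading(temp_f, dewpoint_f, rh_pct):
    """Label one reading as 'recommended', 'allowable', or 'outside'."""
    allowable = (within(temp_f, ALLOWABLE_TEMP_F)
                 and within(rh_pct, ALLOWABLE_RH_PCT)
                 and dewpoint_f <= ALLOWABLE_MAX_DEWPOINT_F)
    if not allowable:
        return "outside"
    if within(temp_f, RECOMMENDED_TEMP_F) and within(dewpoint_f, RECOMMENDED_DEWPOINT_F):
        return "recommended"
    return "allowable"


# Hypothetical readings: (temperature F, dewpoint F, RH percent)
for reading in [(75, 50, 41), (85, 60, 43), (95, 50, 22)]:
    print(reading, "->", classify_reading(*reading))
```

The allowable check runs first so that a reading never counts as recommended unless it also sits inside the wider allowable limits.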


IN CONCLUSION

I hope you have found this article helpful. There is nothing groundbreaking here, but for some in the mission critical world, these items represent a mindset that they abandoned a long time ago. They have accepted the inefficiencies of their facility, striking a Faustian bargain to tolerate operational mediocrity by overcooling and overpressurizing in order to avoid confrontation or complaint. If that’s you, then have at it.

But if your job is to run a data center in the 21st century, then I bet management ranks saving costs and energy second only to ensuring uptime. And if you don’t choose to pick these proverbial apples now, one day you may experience what I experienced in that barren orchard a few years ago: Someone may have come in and picked all the apples for you.

So, get pickin’.  ES

CITED WORKS

1. http://symposium.uptimeinstitute.com/symposium-2011-program

2. http://www.reuters.com/article/2011/05/24/idUS118386640920110524

3. http://www.datacenterknowledge.com/archives/2011/02/22/social-security-works-to-avert-data-center-failure/

4. Dickens, K. “Don’t Just Step Outside Your Comfort Zone…Jump!” Engineered Systems. August, 2010:22-28.

5. https://communities.bmc.com/communities/blogs/green-it/2011/02/02/isolation-expermentation