Despite vast investments in always-on power, data centers the world over continue to grapple with power disruptions and outages. Because continuous availability is the key performance indicator, new tools to reduce the likelihood of an on-site power failure are essential.

Globally, 69% of data center owners and operators experienced a power-related outage between 2019 and 2021. Half of those outages had a “significant impact in the form of cost, time, and reputation.” The biggest cause of those outages was on-site power failure. Further, the cost of outages is increasing. Outages costing between $100,000 and $1 million rose year over year in both 2020 and 2021, according to the Uptime Institute Global Data Center Survey 2021.

The data on outages is telling: focusing investments on electrical infrastructure continues to be a critical differentiator. Without question, new approaches are essential to achieve continuous availability. New intelligence in power distribution systems is delivering higher levels of availability through data-driven insights, which make it possible to spot problems that would otherwise go unnoticed.

Changing the paradigm for electrical maintenance

On-site power problems are a stubborn and costly challenge. An organization stands to make a big impact by reducing the risk of a power distribution challenge, preventing future issues, and reducing the duration of outages. To achieve these goals, here’s what is needed from an electrical system and power distribution equipment:

  • Insight into equipment health providing foresight into a problem as it develops (and before it causes an outage).
  • Information on the nature of the issue, so you’re able to dispatch maintenance teams to make a targeted fix, not look for the proverbial needle in the haystack.
  • The ability to get equipment back online quickly when problems do occur.

FIGURE 1: A digital approach to switchgear maintenance can help reduce walk-through time and electrical exposure through detailed data and set alerts, keeping personnel informed without putting people in front of energized equipment.
Images courtesy of Eaton

Innovations in power distribution equipment are delivering solutions to keep the power on, always. Data-driven insights and alerts from switchgear and circuit breakers are expanding the tools available to evaluate and prevent an outage. And when there’s a fault, arc-quenching technology dramatically reduces the duration of an outage caused by an arc flash event, enabling equipment to come back online without having to be replaced entirely.

Five ways intelligent switchgear reduces outages

1. Delivering always-on data and powerful insights

New data from the circuit breaker trip units, meters, and protective relays provide a host of individual insights, and when combined, can provide a comprehensive view into equipment health that can be used for predictive maintenance and improving system reliability. For example, specific insights from circuit breaker trip units include:

  • Operational data that shows when a circuit breaker mechanism was last exercised and if the mechanism was bound or jammed.
  • Total number of operations to provide indication of the endurance wear on a circuit breaker mechanism and indication of contact wear.
  • Number of interruptions and the magnitude of the energy interrupted to gain insight into contact wear and arc chute condition.
  • The magnitude of short-circuit events, which can damage the contacts, integrity, and dielectric strength of the circuit breaker. This magnitude is compared to the rating of the circuit breaker and weighed as a factor in the health of the device.
  • The environmental temperature, which is one of the most important measurements. The highest temperature recorded is saved in the analysis, along with its date and time.
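As a rough illustration of how the trip unit insights above could be combined into a single maintenance screen, here is a minimal Python sketch. The record layout, field names, and every threshold are hypothetical and chosen only for illustration. They do not represent any vendor's data model or published limits.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TripUnitSnapshot:
    """Hypothetical record of values a trip unit might report."""
    days_since_last_operation: int
    total_operations: int
    rated_operations: int          # assumed mechanical endurance rating
    fault_interruptions: int
    max_fault_current_ka: float    # largest short-circuit current seen
    interrupting_rating_ka: float  # breaker short-circuit rating
    peak_temperature_c: float      # highest temperature recorded


def maintenance_flags(s: TripUnitSnapshot) -> List[str]:
    """Return plain-language flags; all thresholds are illustrative only."""
    flags = []
    if s.days_since_last_operation > 365:
        flags.append("mechanism not exercised in over a year")
    if s.total_operations >= 0.8 * s.rated_operations:
        flags.append("operations count near endurance rating")
    if s.fault_interruptions >= 3:
        flags.append("repeated interruptions: inspect contacts and arc chutes")
    if s.max_fault_current_ka >= 0.8 * s.interrupting_rating_ka:
        flags.append("fault current close to interrupting rating")
    if s.peak_temperature_c >= 85.0:
        flags.append("high peak temperature recorded")
    return flags


# Example: a breaker that has sat idle and has seen one large fault.
snapshot = TripUnitSnapshot(400, 1200, 10000, 1, 55.0, 65.0, 61.5)
for flag in maintenance_flags(snapshot):
    print(flag)
```

In practice, the limits would come from the manufacturer's ratings and the site's maintenance program rather than fixed constants.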

With NFPA 70B shifting from a guide to a standard, preventative maintenance operations for electrical installations in data centers will be affected. Vendors who provide preventative maintenance technology within equipment, like continuous monitoring and alarming, may be able to help reduce the frequency of prescribed maintenance operations. That said, when an alarm occurs, the needed maintenance should be addressed right away.

2. Protecting the cybersecurity of connected systems

Cybersecurity risks to connected systems have never been greater, as malicious threat actors look to exploit system vulnerabilities (weaknesses and gaps in protection). This means routine maintenance now includes a full assessment of a system’s attack surface, vulnerabilities, and maintenance practices — performed yearly at a minimum.

The digital transformation of industrial environments enables continuous maintenance and real-time monitoring of operational technology (OT) networks, which is a critical differentiator for supporting trusted connections. This involves real-time asset identification along with detection of vulnerabilities, anomalous activity, and rogue devices. Alerting, logging, and alarming can be configured for centralized collection and correlation across all asset types, giving a comprehensive view of system operation and potential intrusions, risks, and threats.
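To make the idea of rogue device detection concrete, the sketch below compares hardware addresses observed on an OT network against a known asset inventory. It is a deliberately simplified illustration: the function name, the inventory format, and the use of MAC addresses as the only identifier are assumptions, and real monitoring tools rely on much richer signals.

```python
from typing import Dict, Iterable, List


def find_rogue_devices(inventory: Dict[str, str],
                       observed_macs: Iterable[str]) -> List[str]:
    """Return observed MAC addresses that are missing from the inventory.

    inventory maps a MAC address to an asset name (hypothetical format).
    Matching is case-insensitive, and each unknown address is reported once.
    """
    known = {mac.lower() for mac in inventory}
    already_seen = set()
    rogue = []
    for mac in observed_macs:
        mac = mac.lower()
        if mac not in known and mac not in already_seen:
            rogue.append(mac)
        already_seen.add(mac)
    return rogue


# Example with made-up addresses.
inventory = {"00:1A:2B:3C:4D:5E": "feeder meter"}
observed = ["00:1a:2b:3c:4d:5e", "DE:AD:BE:EF:00:01", "de:ad:be:ef:00:01"]
print(find_rogue_devices(inventory, observed))
```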

Today, operators can even go a step further by integrating industrial network defense, a service focused on network boundaries, into the centralized management system. Network boundary defenses provide asset visibility, functional isolation, traffic restrictions, secure remote access, and general protection against unauthorized access to critical assets. They can be readily deployed with minimal disruption to existing operations and network architecture and easily integrated into everyday centralized maintenance routines.

3. Continuous thermal monitoring identifies problems in real time

Traditionally, maintenance staff conduct infrared (IR) scans of switchgear during regular maintenance, which can be on an annual or even three-year cycle, depending on environmental conditions and the process itself. When the IR scan indicates an issue, trained staff needs to open the switchgear cabinet to investigate. Digitalization enables a fundamentally different approach.

Today, continuous thermal monitoring of gear enables a shift away from in-person IR scans toward data that’s always available and doesn’t require time-consuming or costly labor. When there’s a problem, an automated alert informs staff of the issue and that the equipment needs servicing.

In other words, there’s no need to wait for the annual (or three-year) maintenance scan because a problem can be identified while it’s developing. People do not need to perform time-intensive scans on equipment that does not need to be serviced, which reduces unneeded safety risks and proximity to energized systems. Instead of searching for a problem, staff can be alerted to a specific issue and focus their time on fixing it to prevent downtime.
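The alerting logic behind continuous thermal monitoring can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name, the reading format, and both limits (a 40 C rise above ambient and a 5 C per hour rate of rise) are hypothetical and are not taken from any standard or product.

```python
from typing import List, Optional, Tuple

Reading = Tuple[float, float]  # (minutes since start, temperature in deg C)


def thermal_alert(readings: List[Reading],
                  ambient_c: float,
                  max_rise_c: float = 40.0,
                  max_rate_c_per_hr: float = 5.0) -> Optional[str]:
    """Check readings against two illustrative limits.

    Returns an alert message, or None when neither limit is exceeded.
    """
    if not readings:
        return None
    t_last, temp_last = readings[-1]
    rise = temp_last - ambient_c
    if rise > max_rise_c:
        return f"rise of {rise:.1f} C above ambient exceeds limit"
    if len(readings) >= 2:
        t_first, temp_first = readings[0]
        hours = (t_last - t_first) / 60.0
        if hours > 0:
            rate = (temp_last - temp_first) / hours
            if rate > max_rate_c_per_hr:
                return f"temperature climbing at {rate:.1f} C/hr"
    return None


# Example: a joint that is still within the absolute limit but heating up.
history = [(0.0, 45.0), (60.0, 48.0), (120.0, 58.0)]
print(thermal_alert(history, ambient_c=25.0))
```

Checking the rate of rise as well as the absolute level is what lets an alert fire while a problem is still developing, before any hard limit is reached.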

4. Easy access to test reports

Digitalization of switchgear and circuit protection means there’s far more ready access to testing data, which removes guesswork when plant managers establish maintenance programs. There are new avenues to access original manufacturer factory test reports, making it easier for users to find information about the equipment as it was originally shipped. In addition to the test reports, the next generation of electronic trip units allows for integrated secondary injection testing, which verifies everything connected through the trip unit. In other words, electronic trip units can verify the complete integrity of the connections, sensing circuitry, microprocessor, and circuit breaker mechanics. There’s no need for bulky test kits because testing capabilities are built directly into the digitally enabled circuit breakers.

5. Arc-quenching capabilities

There are three factors that impact the severity of an arcing event: available power, distance to a fault, and duration of a fault. NFPA 70E outlines six risk control methods, including both preventative and protective risk control, in the following hierarchy: elimination, substitution, engineering controls, awareness, administrative controls, and personal protective equipment.

New engineering controls can be designed into the system using arc-quenching technology to minimize the duration of a fault. This technology consists of two main parts: an arc flash relay and arc-quenching device. When the relay detects an arc fault inside of the gear, it sends a signal to the arc-quenching device, which produces a lower impedance arc fully contained inside an arc-containment vessel. The lower impedance arc collapses the voltage and immediately extinguishes the arcing fault as the current begins to flow into the arc-quenching device. This occurs in less than 4 ms or about one-quarter of a cycle, which is an order of magnitude faster than traditional technologies.
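The "one-quarter of a cycle" figure can be checked with simple arithmetic. The short sketch below converts a duration in milliseconds into a fraction of an AC cycle. The 60 Hz and 50 Hz line frequencies are assumptions added here for illustration, since the frequency is not stated above.

```python
def cycle_fraction(duration_ms: float, frequency_hz: float) -> float:
    """Return a duration expressed as a fraction of one AC cycle."""
    period_ms = 1000.0 / frequency_hz  # one cycle lasts 1/f seconds
    return duration_ms / period_ms


# Compare 4 ms against one cycle at two common line frequencies.
for hz in (60.0, 50.0):
    print(f"{hz:.0f} Hz: 4 ms = {cycle_fraction(4.0, hz):.2f} cycle")
```

At 60 Hz one cycle lasts about 16.7 ms, so 4 ms is just under a quarter of a cycle. At 50 Hz it is one-fifth.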

Driving down on-site power problems

Industry research across power-intensive applications shows data center operators and owners are leading on digital enablement and seeking the next opportunities to streamline operations and improve availability. Real consideration needs to go into technologies that affect continuous uptime. Advances in digitalization and arc science are paying off, creating new possibilities for uptime and safety. Adopting data-driven insights can also evolve maintenance approaches by enabling your teams to better anticipate problems, fix them more quickly, and minimize unnecessary physical work on systems that are already performing well. All in all, new digital solutions will help drive down the persistent occurrence of on-site power outages and will make them less impactful when they inevitably do occur.