Critical Facility Mechanical Equipment And System Reliability
Come to terms with the key terms, and learn how to wrestle with the cumulative effect as each component tries to edge the overall system reliability downward.
In today’s world, so-called “high-performance, sustainable” facilities are a dime a dozen. But many of these buildings rely on overly complex mechanical systems to carry out their mission. While these systems may meet design requirements, they routinely fall short of performance expectations — a direct result of the mismatch between elaborate systems and the often-limited resources of the facilities personnel expected to operate them.
You were just selected to design a critical exhaust system. The critical exhaust system must maintain negative space pressurization without failure for the five-year time period between scheduled maintenance services. The client wants to review the proposed critical exhaust system design reliability calculations.
Several questions need to be answered. What is reliability? How reliable does the system need to be? How does the equipment selection affect reliability? How does the system design affect reliability? To address reliability, one needs to understand concepts such as failure rate, availability, and mean time between failures (MTBF).
Failure rate (λ) is the rate at which an item of equipment or a system fails. It is important to understand that each item of equipment or system has a variable failure rate. The graph illustrated in Figure 1 has three distinct failure rate phases: infant mortality period, constant failure period, and wear-out failure period (bathtub curve). During the infant mortality period, equipment failures typically are caused by manufacturing defects. A good manufacturing quality control program will remove the faulty equipment before it reaches the purchaser. During the constant failure period, equipment failures are random and are caused by operation out of specified design parameters, lack of maintenance, exposure to contaminants, and/or product design error. During the wear-out failure period, the various components have extensive operational wear and eventually a component will fail from fatigue or from being substantially out of the manufacturer’s dimensional tolerance. The failure rate (λ) equation is:
λ = r/T
where: r = total number of failures occurring during the defined time frame.
T = total running time or cycles during the defined time frame.
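The failure rate equation can be sketched in a few lines of Python. The fleet size, running hours, and failure count below are hypothetical numbers chosen only for illustration, and the MTBF line assumes the constant failure period, where MTBF is the reciprocal of the failure rate.

```python
def failure_rate(failures: int, running_time: float) -> float:
    """Return lambda = r / T for r failures over total running time T."""
    if running_time <= 0:
        raise ValueError("running_time must be positive")
    return failures / running_time


# Hypothetical field data: 10 fans, each run 8,760 hours (one year),
# with 2 failures logged across the fleet.
total_hours = 10 * 8760
lam = failure_rate(2, total_hours)

# Assumes a constant failure rate, so MTBF = 1 / lambda.
mtbf_hours = 1 / lam

print(f"failure rate = {lam:.3e} failures/hour")
print(f"MTBF         = {mtbf_hours:,.0f} hours")
```

Note that T is the pooled running time of all units observed, not the calendar time of the study.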
Availability is the percentage of time an item of equipment, a subsystem, or a system is operational. For example, a continuously operating fan during a given year with 99% availability translates to the fan not being operationally available for 3.65 days a year. Availability is typically expressed in nines notation, and its simplest equation is:
A = Uptime/(Uptime + Downtime)
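As a quick check of the 99% example, the sketch below converts an availability figure into expected downtime per year for a continuously operating item. The function names are illustrative, and a 365-day year is assumed.

```python
def availability(uptime: float, downtime: float) -> float:
    """A = Uptime / (Uptime + Downtime), both in the same time unit."""
    return uptime / (uptime + downtime)


def annual_downtime_days(avail: float) -> float:
    """Days per 365-day year that a continuously operating item is unavailable."""
    return (1 - avail) * 365


# Two, three, and four nines of availability.
for a in (0.99, 0.999, 0.9999):
    days = annual_downtime_days(a)
    print(f"{a:.2%} available -> {days:.3f} days ({days * 24:.2f} h) down per year")
```

Each additional nine cuts the allowable downtime by a factor of 10.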
MTBF is the average time between equipment and/or system failures or planned maintenance during operation. For the example critical exhaust system, the MTBF is five years.
Reliability (R) is the probability of an item of equipment, a subsystem, or a system performing its required functions under stated conditions for a defined time frame. Stated differently, reliability is a measure of the probability of failure-free operation during a given interval. Several mathematical distribution models are available for determining reliability, including the Gaussian, Rayleigh, Weibull, and exponential models. This article will focus on the exponential distribution model. The exponential reliability equation is:
R(t) = e^(-t/MTBF) or R(t) = e^(-λt)
Where: e = Euler’s Number (2.71828)
λ = failure rate (1/MTBF)
t = time
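A minimal sketch of the exponential model follows. It evaluates R(t) for a given MTBF and also inverts the equation to find the MTBF needed for a target reliability over a mission time. The function names and the 90% target are assumptions for illustration. Under this model, a mission time equal to the MTBF yields a reliability of e^(-1), or about 37%.

```python
import math


def reliability(t: float, mtbf: float) -> float:
    """Exponential model: R(t) = e^(-t / MTBF)."""
    return math.exp(-t / mtbf)


def required_mtbf(t: float, target: float) -> float:
    """Invert R(t) = e^(-t / MTBF) to get the MTBF that meets a target R(t)."""
    if not 0 < target < 1:
        raise ValueError("target must be between 0 and 1")
    return -t / math.log(target)


mission_years = 5

# Mission time equal to MTBF gives e^-1.
print(f"R(5 yr) with MTBF = 5 yr: {reliability(mission_years, 5):.1%}")

# Illustrative target: 90% chance of no failure over the five-year interval.
print(f"MTBF for 90% over 5 yr:  {required_mtbf(mission_years, 0.90):.1f} years")
```

Time and MTBF must be in the same unit (years here; hours work equally well).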
HOW RELIABLE DOES THE MECHANICAL SYSTEM HAVE TO BE?
There are two questions that need to be answered. What is the financial/business disruption impact of a failure? How much reliability can you afford to buy?
Table 1 lists reliability percentages and the number of failures that can be anticipated during one year, five years, and 10 years of operation. For example, a mechanical system with 40% reliability can be expected to have a failure about once a year, while a mechanical system with 80% reliability can be expected to have a failure only about once every five years. There are three important points to understand:
- Reliability is a probability. The mechanical system failure can occur during the first week of operation or may not occur until after 10 years of operation.
- The probability of a failure can be reduced but never eliminated.
- The overall mechanical system reliability is the cumulative result of the reliability of every item of equipment and component in the system.
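The relationship between a reliability percentage and an anticipated number of failures can be sketched as follows. This is a sketch under a stated assumption, not a reproduction of Table 1: it treats each percentage as a one-year reliability under the exponential model, so the failure rate is -ln(R) failures per year. That assumption roughly reproduces the 40% and 80% examples in the text.

```python
import math


def expected_failures(annual_reliability: float, years: float) -> float:
    """Expected failures over `years`, assuming R(1 yr) = e^(-lambda * 1)."""
    lam = -math.log(annual_reliability)  # failures per year
    return lam * years


print("  R    1 yr    5 yr   10 yr")
for r in (0.40, 0.80, 0.90, 0.95, 0.99):
    row = [expected_failures(r, y) for y in (1, 5, 10)]
    print(f"{r:>4.0%}  " + "  ".join(f"{n:6.2f}" for n in row))
```

These are expected values only. As the first point notes, an individual failure can still occur in the first week or not at all in 10 years.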
If a mechanical system failure does not have a substantial financial/business disruption impact, the mechanical system reliability does not need to be very high. For example, a restaurant or convenience store may only need a 30% or 40% reliable mechanical system. In most cases, these types of buildings would not have a specified reliability requirement.
If a mechanical system failure has a substantial financial/business disruption impact, the mechanical system reliability needs to be very high. Looking at Table 1 and the example critical exhaust system, a mechanical system reliability of 80% would not be acceptable because one failure is probable during the five years between system maintenance services. The 90%, 95%, and 99% reliability levels each have fewer than one probable failure during the five-year period between maintenance services. The selected reliability needs to be based upon acceptable risk; cost of failure; and cost of building, operating, and maintaining higher-reliability mechanical systems.
There is a point of diminishing return where the cost to obtain a higher reliability substantially outweighs the benefit received.
HOW RELIABLE DOES THE EQUIPMENT NEED TO BE?
The system equipment has the largest influence on the overall system reliability. Was reliability included in your last critical facility mechanical equipment specification? Probably not. The present industry standard of care for equipment specifications typically does not include a reliability performance requirement. The closest that most equipment specifications come to specifying reliability is by specifying bearing basic rating life requirements (e.g., L10 = 40,000 hours, L10 = 80,000 hours, L50 = 200,000 hours, L50 = 400,000 hours). These bearing basic rating lives are based upon a reliability of 90% or less. So, if the specified equipment has a reliability requirement above 90%, the present industry standard of care for equipment specifications is inadequate.
When determining equipment reliability, the reliability for each component needs to be determined. For the example critical exhaust system, there may be one or more exhaust fans. The exhaust fan design can have a substantial impact on its reliability. Looking at Figure 2, a belt-driven exhaust fan will have a fan housing, impeller, shaft, bearings, sheaves, belts, and motor reliability making up its overall reliability. The belt-driven exhaust fan reliability equation is:
Rbelt-driven fan = Rfh * Rfi * Rfs * Rb1 * Rb2 * Rfsh * Rms * Rfb * Rm
A direct-drive exhaust fan will have a fan housing, impeller, shaft, bearings, and motor reliability making up its overall reliability. The direct-drive exhaust fan reliability equation is:
Rdirect-drive fan = Rfh * Rfi * Rfs * Rb1 * Rb2 * Rm
If we assume all exhaust fan components have 95% reliability, the belt-driven exhaust fan will have a reliability of 63% and the direct-drive exhaust fan will have a reliability of 74%. Based upon the number of components that make up each exhaust fan and the operating performance requirements being the same, the direct-drive exhaust fan will usually have a higher reliability than the belt-driven exhaust fan. The cumulative exhaust fan component reliability makes it difficult to have a high-reliability exhaust fan. There are two important points to remember:
- Equipment reliability is the product of all individual component reliabilities, so the overall equipment reliability cannot be higher than that of the least reliable component (the proverbial weakest link).
- The more components an item of equipment has, the more difficult it is to achieve a high reliability (assuming none of the components is a backup).
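The two fan equations can be checked with a short sketch. The component names below are an assumed reading of the equation subscripts (fan sheave, motor sheave, and belt for the belt-driven fan), and the uniform 95% figure is the same assumption used in the text.

```python
import math

# Assumed component lists; the names are an interpretation of the subscripts.
BELT_DRIVEN = ["fan housing", "impeller", "shaft", "bearing 1", "bearing 2",
               "fan sheave", "motor sheave", "belt", "motor"]
DIRECT_DRIVE = ["fan housing", "impeller", "shaft", "bearing 1", "bearing 2",
                "motor"]


def series_reliability(component_reliabilities) -> float:
    """All components must work, so the reliabilities multiply."""
    return math.prod(component_reliabilities)


belt = series_reliability([0.95] * len(BELT_DRIVEN))
direct = series_reliability([0.95] * len(DIRECT_DRIVE))

print(f"belt-driven  ({len(BELT_DRIVEN)} components): {belt:.1%}")
print(f"direct-drive ({len(DIRECT_DRIVE)} components): {direct:.1%}")
```

Because every factor is at most 1, the product can never exceed the smallest factor, which is the weakest-link point made above.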
DESIGNING MECHANICAL SYSTEMS TO MEET RELIABILITY REQUIREMENTS
To determine the required mechanical equipment reliability that needs to be specified, it is important to know how the individual mechanical equipment integrates into the overall mechanical system. The equipment can be in series or in parallel, as illustrated in Figure 3. The reliability equation for a mechanical system with two items of equipment in series is:
Rsystem = Requipment 1 * Requipment 2
The reliability equation for a mechanical system with two items of equipment in parallel is:
Rsystem = [1 – (1 - Requipment 1) (1 - Requipment 2)]
Using 95% reliability for all equipment, the system reliability calculations for the equipment in series and in parallel are:
Rsystem = 0.95 * 0.95 = 90.25% reliable (series)
Rsystem = [1 – (1 – 0.95) (1 – 0.95)] = 99.75% reliable (parallel)
Per the reliability calculations, having the equipment in parallel increased system reliability while having the equipment in series reduced system reliability. It is important to note that the additional reliability achieved by having two items of equipment in parallel is based upon the system being able to operate on one item of equipment and upon the system remaining operational while the failed item of equipment is repaired or replaced. If the system must be deactivated to make the equipment repair or replacement, the system reliability is the same as if the equipment were installed in series.
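The series and parallel equations, along with the repair caveat above, can be sketched as follows. The function names and the `repairable_online` flag are illustrative. The flag simply encodes the article's statement that redundancy only counts when the system can keep running during a repair.

```python
import math


def series(*reliabilities: float) -> float:
    """Every item must work: multiply the individual reliabilities."""
    return math.prod(reliabilities)


def parallel(*reliabilities: float) -> float:
    """Any one item suffices: 1 minus the chance that all items fail."""
    return 1 - math.prod(1 - r for r in reliabilities)


def two_unit_system(r1: float, r2: float, repairable_online: bool) -> float:
    """Credit the redundancy only if the system stays up during repair."""
    return parallel(r1, r2) if repairable_online else series(r1, r2)


print(f"series:                 {series(0.95, 0.95):.2%}")
print(f"parallel:               {parallel(0.95, 0.95):.2%}")
print(f"parallel, offline fix:  {two_unit_system(0.95, 0.95, False):.2%}")
```

The parallel formula assumes the two items fail independently. A shared cause of failure, such as a common power feed, would reduce the benefit.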
Utilizing the belt-driven exhaust fan reliability of 63% and the direct-drive exhaust fan reliability of 74%, the subsequent reliability for putting two belt-driven fans in parallel and two direct-drive exhaust fans in parallel is as follows:
Rbelt-driven = [1 – (1 – 0.63) (1 – 0.63)] = 86% reliable
Rdirect-drive = [1 – (1 – 0.74) (1 – 0.74)] = 93% reliable
When running two exhaust fans in parallel, both the belt-driven and direct-drive exhaust systems had substantially better reliability than a single fan (86% versus 63%, and 93% versus 74%). Adding a third exhaust fan to the exhaust systems had the following impact on system reliability:
Rbelt-driven = [1 – (1 – 0.63) (1 – 0.63) (1 – 0.63)] = 95% reliable
Rdirect-drive = [1 – (1 – 0.74) (1 – 0.74) (1 – 0.74)] = 98% reliable
With the third exhaust fan, both the belt-driven and direct-drive exhaust systems had improved reliability over the dual-fan exhaust system and the belt-driven exhaust system reliability was comparable to that of the direct-drive exhaust system.
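The same calculation can be turned around to ask how many identical parallel fans a target reliability requires. The sketch below assumes independent failures and that any single fan can carry the system. The 90% target and the `max_fans` cap are illustrative choices.

```python
def parallel_identical(fan_reliability: float, n: int) -> float:
    """Reliability of n identical fans in parallel: 1 - (1 - R)^n."""
    return 1 - (1 - fan_reliability) ** n


def fans_needed(fan_reliability: float, target: float, max_fans: int = 10):
    """Smallest fan count meeting the target, or None if max_fans is not enough."""
    for n in range(1, max_fans + 1):
        if parallel_identical(fan_reliability, n) >= target:
            return n
    return None


for label, r in (("belt-driven", 0.63), ("direct-drive", 0.74)):
    levels = ", ".join(f"{parallel_identical(r, n):.0%}" for n in (1, 2, 3))
    print(f"{label:12s} 1/2/3 fans: {levels}; "
          f"fans for 90% target: {fans_needed(r, 0.90)}")
```

Each added fan multiplies the remaining failure probability by (1 - R), so the gain shrinks with every unit while the installed cost does not.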
With the example critical exhaust system having two direct-drive parallel exhaust fans, ductwork, and two dampers, the reliability equation is as follows:
Rdirect-drive exhaust system = Rductwork * Rdamper1 * Rdamper2 * [1 – (1 – Rfan1)(1 – Rfan2)]
Note: to simplify the discussion, the electrical service reliability was considered part of motor reliability.
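The complete system equation can be evaluated with a short sketch. The ductwork and damper reliabilities below are hypothetical placeholders, not values from the article. Only the 74% direct-drive fan figure comes from the earlier calculation.

```python
def exhaust_system_reliability(r_ductwork: float, r_damper1: float,
                               r_damper2: float, r_fan1: float,
                               r_fan2: float) -> float:
    """Ductwork and dampers in series with two fans in parallel."""
    fans_in_parallel = 1 - (1 - r_fan1) * (1 - r_fan2)
    return r_ductwork * r_damper1 * r_damper2 * fans_in_parallel


# Hypothetical ductwork (0.99) and damper (0.98) reliabilities;
# 0.74 is the direct-drive fan reliability calculated earlier.
r_system = exhaust_system_reliability(0.99, 0.98, 0.98, 0.74, 0.74)
r_fans_only = 1 - (1 - 0.74) ** 2

print(f"parallel fans alone: {r_fans_only:.1%}")
print(f"complete system:     {r_system:.1%}")
```

The non-redundant ductwork and dampers sit in series with the fan pair, so they pull the system reliability below that of the fans alone no matter how many fans are added.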
Many high-technology facilities (hospitals, data centers, pharmaceutical production, nuclear plants, and chemical plants) have very demanding environmental requirements. A mechanical system failure can have a catastrophic impact on research, product quality, data, radioactive/biological/chemical contamination containment, and even human life. It is imperative the design for these mechanical systems include detailed reliability planning and calculations. Equipment reliability requirements must be included in the equipment specifications and system reliability requirements must be included in construction documentation.