Tuesday, 06 Jan 09
Improving Reliability of Complex Power Systems

By Dave Cooper

Power requirements continue to increase in complexity as voltages drop and the number of voltage rails increases to support new generations of ICs.

At the same time, expectations for reliability and availability continue to increase due to the ongoing drive to reduce equipment downtime. This article examines ways in which to reconcile these apparently conflicting aspects of power system design. High reliability power converters are a key part of the solution but must be supported by a well-chosen overall equipment architecture, as well as attention to details in the power system integration.

Figure 1: Typical 48 V card power system.

Power System Complexity
Card power requirements have changed dramatically over the past few years, and the days when cards operated from a single 5 V power rail are gone. In current products it is not uncommon to have six or more voltages on a single card, and some high-end systems may have 20 or more separate power rails. Furthermore, the performance expectations for the power system are more demanding. Very low voltages must be delivered efficiently at high current, and must meet increasingly tight regulation, ripple, and transient specifications.

In addition to the need for very low voltage rails, many ICs impose requirements for sequencing and tracking between power rails during startup and shutdown. The power rails must be controlled such that the difference between them does not exceed the specified voltage and/or time limits, even under short-term transient conditions. Combine these requirements with the need to provide monitoring of all rails for overvoltage (OV) and undervoltage (UV) protection, and it is easy to see that card power systems are no longer simple.

Power System Implementation
As an example of a card power system, Figure 1 shows a typical product powered from 48 VDC, such as a communications system or a high-end compute server. DC/DC converters provide the voltage rails needed for the card, and maintain the required isolation between the 48 V input and the logic outputs. In this example, two isolated DC/DC converters are shown (usually referred to as "bricks") together with three non-isolated point-of-load (POL) power converters. Of course, many different combinations of power converters can be used to meet the specific needs of any particular card.
Power management coordinates the operation of the DC/DC converters, and is necessary on both primary and secondary sides of the isolation, as shown. Although details vary, power management functions typically include:
• startup and shutdown of the power system at specified input voltage;
• controlled startup and shutdown of all outputs in the required sequence;
• monitoring of all outputs for OV and UV faults;
• controlled shutdown if a fault occurs;
• adjustment (trim) of output voltages if required;
• and reporting power status to the system controller.
 The reliability of this type of power system depends on the details of the design, including the DC/DC converters and the power management.
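
As a rough illustration of the sequencing, monitoring, and fault-shutdown functions listed above, the following Python sketch enables rails in order, checks each one against an OV/UV window, and shuts down in reverse order if a fault is found. It is not taken from any particular power management device: the Rail class, the start_up function, the read_voltage callback, and the 5% tolerance window are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Rail:
    """One output rail (hypothetical model for illustration only)."""
    name: str
    nominal_v: float
    tolerance: float = 0.05  # assumed +/-5 % OV/UV window
    enabled: bool = False

    def status(self, measured_v: float) -> str:
        """Classify a measured voltage as 'OK', 'OV' or 'UV'."""
        if measured_v > self.nominal_v * (1 + self.tolerance):
            return "OV"
        if measured_v < self.nominal_v * (1 - self.tolerance):
            return "UV"
        return "OK"


def start_up(rails, read_voltage):
    """Enable rails in list order and check each one.

    read_voltage(rail) is a hypothetical callback returning the measured
    voltage. On the first fault, every rail started so far is switched
    off in reverse order. Returns (success, log).
    """
    log = []
    started = []
    for rail in rails:
        rail.enabled = True
        started.append(rail)
        state = rail.status(read_voltage(rail))
        log.append(f"{rail.name}: {state}")
        if state != "OK":
            # Controlled shutdown: last rail started is first to go off.
            for r in reversed(started):
                r.enabled = False
                log.append(f"{r.name}: off")
            return False, log
    return True, log


if __name__ == "__main__":
    rails = [Rail("3.3V", 3.3), Rail("1.8V", 1.8), Rail("1.2V", 1.2)]
    # Simulate a 1.8 V rail that fails to come up.
    ok, log = start_up(rails, lambda r: 0.0 if r.name == "1.8V" else r.nominal_v)
    print(ok, log)
```

A real design would also enforce timing limits between rails and report status to the system controller. Those steps are omitted here to keep the sketch short.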

Power Reliability
Power reliability can be considered in two fundamentally different ways:
1. A bottom-up approach based on component failure rate. This aspect of reliability is typically expressed as MTBF (Mean Time Between Failures) or FITs (Failures in Time). Since 1 FIT = 1 failure in 10^9 device hours, 1000 FITs = 1 million hours MTBF.
2. A top-down approach based on ability to perform the required functions. This aspect of system-level reliability can be addressed by simulation and testing of the complete system. The testing must be sufficient to ensure that the design meets all required functions under all operating conditions, a process called "qualification".
 It is important to consider both of these aspects in your design. A very good MTBF is of little value to the customer if the power system shuts down every time there are thunderstorms in the area.
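
The FIT and MTBF figures used in the bottom-up approach are two expressions of the same failure rate, and the conversion can be sketched in a few lines of Python. The function names are illustrative only. The arithmetic follows directly from the definition of 1 FIT as one failure in 10^9 device hours.

```python
HOURS_PER_FIT_UNIT = 1e9  # 1 FIT = 1 failure per 10^9 device hours


def fits_to_mtbf_hours(fits: float) -> float:
    """Convert a failure rate in FITs to an MTBF in hours."""
    if fits <= 0:
        raise ValueError("FIT rate must be positive")
    return HOURS_PER_FIT_UNIT / fits


def mtbf_hours_to_fits(mtbf_hours: float) -> float:
    """Convert an MTBF in hours to a failure rate in FITs."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return HOURS_PER_FIT_UNIT / mtbf_hours


if __name__ == "__main__":
    print(fits_to_mtbf_hours(1000))  # 1000000.0 hours, i.e. 1 million hours
```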

Reliability Improvement
There are three fundamental ways to improve reliability of any system:
1.  Use fewer components.
2.  Make the components more reliable.
3.  Make the system function even if components fail.
Each of these can play a part in improving power system reliability, together with comprehensive qualification testing.

Fewer Components
One area where component count can often be reduced is in the power system management. A dedicated power management IC can replace a large number of discrete components used for monitoring and control, such as comparators, op amps, optocouplers, and RC time delays. At the same time, a power management IC can offer much better performance than a discrete solution, improving system reliability by accurately reporting marginal performance while avoiding nuisance trips. For example, the Potentia PS-2610 measures each output rail voltage every 40 microseconds using an 8-bit analog-to-digital converter (ADC). The PS-2610 uses digital filtering to allow fast response to a real OV condition while avoiding false OV or UV shutdown due to voltage spikes.
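
The idea of filtering ADC samples so that a brief spike does not cause a shutdown can be illustrated with a simple consecutive-sample filter. This is a generic sketch, not a description of the PS-2610's actual filter, which the article does not detail. The threshold code, the required sample count, and the function name first_trip_index are assumptions made for the example. Only the 40 microsecond sample period and the 8-bit code range come from the text.

```python
SAMPLE_PERIOD_US = 40  # sampling interval quoted in the text
ADC_MAX_CODE = 255     # full-scale code of an 8-bit ADC


def first_trip_index(samples, threshold, required):
    """Return the index at which an OV trip would be declared, or None.

    A trip is declared only when `required` consecutive samples exceed
    `threshold`; any in-range sample resets the count. Both parameters
    are illustrative assumptions.
    """
    run = 0
    for i, code in enumerate(samples):
        if not 0 <= code <= ADC_MAX_CODE:
            raise ValueError("sample outside 8-bit range")
        run = run + 1 if code > threshold else 0
        if run >= required:
            return i
    return None


if __name__ == "__main__":
    spike = [180, 180, 255, 180, 180]    # single-sample voltage spike
    real_ov = [180, 210, 212, 215, 220]  # sustained overvoltage
    print(first_trip_index(spike, threshold=200, required=3))    # None
    print(first_trip_index(real_ov, threshold=200, required=3))  # 3
```

With three consecutive samples required, a sustained fault is detected roughly 3 x 40 = 120 microseconds after it begins, while a spike shorter than that is ignored. Choosing the sample count is a trade-off between response time and immunity to nuisance trips.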


The power system topology (the configuration of DC/DC converters) directly influences the reliability. A typical POL contains fewer internal components than an isolated brick, and the failure rate can be significantly lower. The manufacturer's quoted failure rate for a typical POL is about 200 FITs (equivalent to an MTBF of five million hours) whereas a typical brick is about 500 FITs (which is an MTBF of two million hours). On the other hand, a POL usually has a lower output power than a brick so you may need more of them to meet your total power requirement. Reliability is of course only one of many factors in the choice of power topology, but by considering reliability early in the design you can make the best trade-off for your application.
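
To see how topology affects the bottom-up figure, the typical failure rates quoted above can be summed for the Figure 1 example of two bricks and three POLs. The sketch below assumes a simple series model, in which the failure of any one converter counts as a failure of the card and failure rates are constant, so the FIT values add. It covers only the converters, and the function name total_fits is illustrative.

```python
def total_fits(parts):
    """Sum failure rates for a series system.

    parts maps a part name to a (count, FITs per part) pair.
    Assumes any single part failure fails the whole system.
    """
    return sum(count * fits for count, fits in parts.values())


if __name__ == "__main__":
    # Converter counts from the Figure 1 example, typical FITs from the text.
    card = {"isolated brick": (2, 500), "POL": (3, 200)}
    fits = total_fits(card)
    mtbf_hours = 1e9 / fits
    print(fits)        # 1600
    print(mtbf_hours)  # 625000.0
```

Under these assumptions the five converters together contribute 1600 FITs, or an MTBF of 625,000 hours, before power management and other components are counted.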
It is difficult to reduce the number of components needed to generate the power rails themselves, except by using custom-designed multi-output DC/DC converters. A custom DC/DC converter requires substantial time and effort, both for design and for qualification testing. Furthermore, a different design is typically needed for each card in a system, so this approach can be justified only for very high volume applications.

More Reliable Components
Component reliability is influenced primarily by the qualification and quality control processes used in manufacturing, and by the stresses applied in the application. One way to improve the reliability of power conversion is to take a modular approach, using standard off-the-shelf DC/DC converters as components in your design. These units are built in high volume using an automated process with full quality control, and have excellent performance and reliability. You will avoid the need to calculate component stresses within the power converter, since the design has been optimized during the manufacturer's in-house qualification.


Similarly, plan your power management design around a dedicated power management IC rather than a general-purpose device such as a gate array or a microcontroller. A power management design using a microcontroller or gate array will require extensive testing under both normal operation and fault conditions. This is needed to ensure that logical errors in programming do not cause incorrect behavior. Conversely, the behavior of the dedicated power management device has already been fully tested and qualified by the manufacturer, and only the operating parameters (voltage levels, time delays) need to be programmed.


Fault Tolerance
System reliability can be dramatically improved by designing the system to be fault tolerant. In the ideal case, if any component fails there is a backup available to instantly take its place, and system performance is unaffected by the failure. The term "availability" is used to express the proportion of time for which the system performs as expected. The provision of backup components is referred to as "redundancy". In a practical system there are limits to the degree of redundancy that can be achieved, and availability can never reach 100%. Through careful design, redundancy can provide almost complete protection against any single fault, and can achieve 99.999% (five nines) availability or better.
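
The relationship between availability, repair time, and redundancy can be sketched numerically. The example below uses the standard steady-state formula, availability = MTBF / (MTBF + MTTR), and assumes that the two units in a redundant pair fail independently and that a failed unit is always detected and repaired. The function names are illustrative assumptions, not figures or methods from any specific system.

```python
HOURS_PER_YEAR = 8760


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of a single repairable unit."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def redundant_pair(unit_availability: float) -> float:
    """Availability of two units where either one can carry the load.

    Assumes independent failures: the pair is down only when both
    units are down at the same time.
    """
    return 1 - (1 - unit_availability) ** 2


def downtime_minutes_per_year(avail: float) -> float:
    """Expected downtime per year for a given availability."""
    return (1 - avail) * HOURS_PER_YEAR * 60


if __name__ == "__main__":
    # Five nines corresponds to a little over five minutes per year.
    print(round(downtime_minutes_per_year(0.99999), 3))  # 5.256
    # A unit at three nines, duplicated, reaches about six nines.
    print(round(redundant_pair(0.999), 6))  # 0.999999
```

The independence assumption is the weak point in practice: a shared fault, or a first failure that goes unreported, removes most of the benefit shown here.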

In most redundant systems, the redundancy is achieved by duplicating entire cards. For example, two identical control processor cards can be used in a shelf, either of which can take control if the other fails. The 48 V distribution systems are also duplicated, with dual 48 V feeds to each card from independent circuit breakers. If any individual circuit breaker trips, the cards still receive uninterrupted power through the second feed. In most cases it is not considered beneficial to duplicate the on-card power system itself, and any failure of the card (power or otherwise) is repaired simply by replacing the card.


For redundancy to be effective, it is vital that all component failures are immediately reported to the operator so that maintenance can be carried out before the backup fails. In the power system, this implies not only comprehensive monitoring of all output voltage rails but also monitoring of fuses and power feeds to detect any loss of redundancy. Additional monitoring such as input current measurement and thermal sensing can provide advance warning of overload conditions and further improve reliability. Potentia offers several power management devices that are designed to monitor these primary side functions as well as the secondary side components.

Summary
Even though the power system has become more complex, high reliability can still be achieved. By minimizing component count you can improve your failure rate and achieve a high calculated MTBF. By providing effective power management, you can implement features that improve the overall equipment reliability. Remember that reliability is much more than just MTBF, and carry out thorough qualification testing of your power system to ensure it fully meets equipment requirements under all conditions.
Dave Cooper is with Potentia Semiconductor.

© All materials on this web site are copyright protected and the property of CLB Media Inc.
For permission reprinting or reproducing any materials please email your requests.
© CLB MEDIA INC., 2009 Canadian Electronics Magazine