Kevin Parmenter, Excelsys Technologies, Vice President Applications Engineering, N. America
Power supplies are the underpinning of any electronic system. This article presents the five main reasons power supplies fail and describes the precautions design engineers should take to avoid system failures. It is based on data spanning all markets and applications, from industrial, medical and military electronics to telecom, datacom, computing and scientific fields. It covers the most common power electronics applications, from low cost to extremely high cost, including flight simulators, digital signage, medical equipment, test and measurement equipment, semiconductor equipment and more.
Top Causes of Power Supply Failures
A widely used rule of thumb, rooted in the Arrhenius relationship between temperature and failure rate, is that for every 10 degrees C you keep the power supply’s ambient temperature below 40 degrees C, you double the MTBF. Conversely, for every 10 degrees C the ambient temperature rises, the MTBF is cut in half (i.e., your power supply is half as reliable). Many, but not all, of the failure mechanisms on this list are related to temperature.
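The doubling and halving rule can be written as a simple scaling formula. The sketch below is only an illustration of that rule of thumb: the function name and the 500,000-hour reference MTBF are hypothetical, not data from this article.

```python
def scaled_mtbf(mtbf_ref_hours, ambient_c, ref_c=40.0, step_c=10.0):
    """Estimate MTBF at ambient_c using the 10-degree rule of thumb.

    MTBF doubles for every step_c degrees below ref_c and halves for
    every step_c degrees above it. This is an approximation only.
    """
    return mtbf_ref_hours * 2.0 ** ((ref_c - ambient_c) / step_c)


if __name__ == "__main__":
    ref = 500_000  # hypothetical MTBF at 40 C, in hours
    for t in (30, 40, 50, 60):
        print(f"{t} C: {scaled_mtbf(ref, t):,.0f} h")
```

Under this rule, a supply run at 60 degrees C ambient would be expected to show one quarter of its 40 degrees C MTBF.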
More and more, we are seeing end equipment built with plastic chassis instead of the metal chassis that have traditionally been used, which affects thermal performance as well as EMC. Anything you can do to improve thermal management around the power supply in your system is of critical importance.
Fans are the number one failure mechanism of power supplies, as shown by both military MTBF simulations and Bellcore standards, and as demonstrated both in simulation and in the field. As the only electromechanical moving part in a power supply, the fan is prone to fail even in the most properly designed power supplies. Often we see a no-fan requirement for the power supply, only to have the end user add fans to remove the heat of the entire system. This approach just moves the problem from one place to another.
Another problem in the industry is the proliferation of counterfeit fans in the supply chain. In one case I know of, a customer bought a substitute fan that was visually indistinguishable from the original, except that it moved 30% less air and drew a different amount of power. It is important to make sure your power supply partner has processes to keep counterfeit parts out of the supply chain; otherwise that low cost power supply is going to get expensive very quickly.
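To see why a 30% airflow shortfall matters, a first-order estimate relates the air temperature rise through an enclosure to the dissipated power and the volumetric airflow: rise = P / (density × specific heat × flow). The sketch below assumes approximate sea-level air properties, and the 60 W heat load and 10 CFM fan rating are hypothetical; only the 30% reduction comes from the case described above.

```python
AIR_DENSITY = 1.2          # kg/m^3, approximate sea-level value (assumption)
AIR_CP = 1005.0            # J/(kg*K), approximate specific heat of air
CFM_TO_M3S = 0.000471947   # cubic feet per minute to cubic metres per second


def air_temp_rise_c(dissipated_w, airflow_cfm):
    """First-order air temperature rise for a given heat load and airflow."""
    flow_m3s = airflow_cfm * CFM_TO_M3S
    return dissipated_w / (AIR_DENSITY * AIR_CP * flow_m3s)


if __name__ == "__main__":
    heat_w = 60.0      # hypothetical dissipated power
    rated_cfm = 10.0   # hypothetical airflow of the genuine fan
    genuine = air_temp_rise_c(heat_w, rated_cfm)
    substitute = air_temp_rise_c(heat_w, rated_cfm * 0.7)  # 30% less air
    print(f"genuine fan:    {genuine:.1f} C rise")
    print(f"substitute fan: {substitute:.1f} C rise")
```

Because the rise is inversely proportional to airflow, 30% less air means a rise about 1/0.7, or roughly 1.43 times, higher for the same heat load.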
A fanless system can be sealed, which also eliminates other issues, including moisture ingress. In outdoor applications such as digital signage, a sealed system keeps out leaves, bugs, twigs and bird nests as well as rain and moisture, plus salt and fog in maritime applications.
Removing the fan increases reliability by 25% and is the best solution for avoiding failure. A good design that keeps the efficiency of the power supply high enough makes fans unnecessary.
The key to good power electronics design is: “don’t need a fan if you can help it.” To address this need, Excelsys recently introduced a convection-cooled modular power supply that delivers 600W of output power without using fan-assisted cooling (Figure 1).
Figure 1. The CoolX 600 Series fanless power supply offers built-in withstand for very high input voltages and surges.
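The link between efficiency and the need for a fan comes down to how much heat must be removed. As a rough sketch, the heat dissipated inside a supply is the output power multiplied by (1/efficiency − 1). The efficiency values below are hypothetical and are not specifications of any product; only the 600W output figure is taken from the text.

```python
def dissipated_power_w(output_w, efficiency):
    """Heat dissipated inside the supply for a given output power.

    efficiency is a fraction between 0 and 1 (for example 0.93 for 93%).
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the range (0, 1]")
    return output_w * (1.0 / efficiency - 1.0)


if __name__ == "__main__":
    for eff in (0.85, 0.90, 0.93):  # hypothetical efficiencies
        print(f"{eff:.0%}: {dissipated_power_w(600.0, eff):.1f} W of heat")
```

At 600W out, moving from 85% to 93% efficiency cuts the heat to be removed from roughly 106W to roughly 45W, which is the kind of margin that can make convection cooling practical.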
Contrary to popular belief, capacitor technology makes a lot of progress every year; however, capacitors are still prone to failure if they are overstressed, if substitutions are made in production or if counterfeit parts enter the supply chain.
Capacitors, especially electrolytics, can fail in many different ways: they may be swollen, leaking, exploded or shorted, or they may show reduced capacitance or increased in-circuit ESR. Sometimes excess heat causes the damage. Electrolytic capacitors can also leak electrolyte, which causes further damage through corrosion, eating away PCB traces and creating other problems (Figure 2).
Figure 2. This example shows the damage caused by leaking electrolytic material from a capacitor.
To prevent failures, use high quality capacitors from name brands. Also, derate, derate. Keep capacitors as cool as possible and watch the ripple currents to make sure the parts are not excessively stressed. It’s important to know that the storage life of electrolytic capacitors is limited: we specify a maximum of two years of unpowered storage for the power supply so that the electrolyte is not affected by sitting unpowered for a long time. This limit usually gets overlooked. As power designers we avoid electrolytic capacitors if we can, but if we can’t avoid them, we get the best we can find.
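The checks described above (derating, ripple current and storage age) can be collected into a simple screening routine. This is a minimal sketch: the function name and the 0.8 derating factors are illustrative assumptions rather than recommendations from this article or any manufacturer; only the two-year storage limit comes from the text.

```python
from datetime import date

MAX_STORAGE_YEARS = 2.0  # unpowered storage limit stated in the text


def capacitor_warnings(applied_v, rated_v, ripple_a, rated_ripple_a,
                       last_powered, today,
                       voltage_derating=0.8, ripple_derating=0.8):
    """Return a list of warnings for one electrolytic capacitor.

    The 0.8 derating factors are illustrative assumptions only.
    """
    warnings = []
    if applied_v > rated_v * voltage_derating:
        warnings.append("voltage exceeds derated limit")
    if ripple_a > rated_ripple_a * ripple_derating:
        warnings.append("ripple current exceeds derated limit")
    stored_years = (today - last_powered).days / 365.25
    if stored_years > MAX_STORAGE_YEARS:
        warnings.append("unpowered storage exceeds two years")
    return warnings


if __name__ == "__main__":
    # Hypothetical part: 450 V rating run at 400 V, stored for three years.
    print(capacitor_warnings(400.0, 450.0, 1.0, 2.0,
                             date(2020, 1, 1), date(2023, 1, 1)))
```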
Power switching components, typically MOSFETs, bear the brunt of the power supply’s operation. They can fail if the heat sinking is inadequate, if the device is subjected to drain overvoltage, drain overcurrent or gate overvoltage, or if the internal antiparallel diode is overstressed.
Proper design, attention to the control circuitry, control loop testing and component derating go a long way toward ensuring correct operation and a nice long life for the MOSFET in the application.
Power diodes can also fail due to improper heat sinking, poor airflow or other thermal management problems. Schottky diodes can be damaged by overvoltage when driving inductive circuits; they are not as forgiving as MOSFETs during overvoltages. Switching losses in rectifiers can also be a large source of heat. Reverse recovery (Trr) tails lengthen somewhat with temperature, which raises the dissipation and therefore the temperature; this positive feedback loop can damage the part. The problem must be carefully considered during design to keep the dissipation low. Proper design, component selection and characterization, along with derating, will do wonders.
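The positive feedback between temperature and switching loss can be illustrated with a very simple iteration. The sketch below assumes, purely for illustration, that the loss rises linearly with junction temperature; real reverse recovery behavior is more complex, and every parameter value here is a hypothetical placeholder rather than data for any device.

```python
def settle_junction_temp(ambient_c, rth_c_per_w, loss_ref_w,
                         temp_coeff_per_c, ref_c=25.0,
                         limit_c=150.0, max_iter=1000):
    """Iterate T = ambient + Rth * P(T) with P(T) rising linearly with T.

    Returns the settled junction temperature in degrees C, or None if the
    temperature passes limit_c first (treated here as thermal runaway).
    All parameters are hypothetical placeholders.
    """
    temp = ambient_c
    for _ in range(max_iter):
        loss_w = loss_ref_w * (1.0 + temp_coeff_per_c * (temp - ref_c))
        new_temp = ambient_c + rth_c_per_w * loss_w
        if new_temp > limit_c:
            return None
        if abs(new_temp - temp) < 0.01:
            return new_temp
        temp = new_temp
    return None


if __name__ == "__main__":
    # Same hypothetical diode, good versus poor heat sinking.
    print(settle_junction_temp(50.0, 10.0, 3.0, 0.01))  # settles
    print(settle_junction_temp(50.0, 25.0, 3.0, 0.01))  # exceeds the limit
```

In this simplified model the loop settles only while the product of thermal resistance, reference loss and temperature coefficient stays below 1, and even then the settled temperature must remain under the device limit. That is why lower thermal resistance and lower dissipation both widen the margin.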
Control ICs often have an unusual region of operation and, if misunderstood or misapplied, can lead to failure. This includes incorrect clock operation or improper PCB layout, which will make the control IC susceptible to noise or oscillation. All controller ICs have their own unique behavior and need to be well understood in the application, including the work-arounds and “undocumented features” for the intended application.
To avoid failures with commercial control ICs, start-up conditions must be understood. Current limiting, soft-start modes, proper gate drive, spacing and measurement of the control loops must all be addressed to ensure stable operation over all conditions. The control ICs must work perfectly every time; otherwise damage will likely be seen in the MOSFETs, since they take the brunt of the energy when the control IC fails or becomes unstable. With digital controllers increasingly used in power electronics designs, software has become one more potential issue alongside the control IC. Sometimes it is the control IC itself that fails; however, it’s usually the switching MOSFETs that end up being taken out.
Environmental issues from moisture ingress are sometimes seen in medical electronics, where equipment is cleaned with disinfecting solutions that enter the power supply’s ventilation openings and fan ports (another reason to eliminate fans). Moisture corrodes the electronics and eventually leads to failures. Other failure modes from the use environment include surges and transients well above the power supply’s ratings and the levels in many IEC standards, which usually damage the semiconductor components in the front end of the power supply. Some of these environmental concerns can be controlled by design in the application and some cannot.
Other environmental problems are lightning strikes and other induced power line surges and transients (Figure 3). The toll from these causes can be minimized by careful design and test of the power supply and by adding external protection components. For example, there are excellent surge protection devices from Littelfuse, such as the LSP10240 series, that can handle tremendous transients and surges to protect the AC input of a system. Newer power supplies have surge protection designed in, and some are also designed to withstand 300 VAC for five seconds, since power line stability is not guaranteed globally.
Figure 3. This photo shows burned capacitors caused by an open-air arc from a lightning strike.
Loads are another environmental consideration, particularly reactive loads such as regenerative motor drives, battery chargers, supercapacitors and more. The nature of the load should be considered and, where needed, protection circuits such as diodes can be added. In your application, this could prevent 250 volts from a motor acting as a generator from being applied to the 24-volt outputs of your power supply.
In many of the applications I work with that have reactive loads, the problem is solved with reactive load modules such as the XGR and XGT modules from Excelsys. These modules have bypass diodes and blocking circuitry built in, which eliminates the need for any external circuitry to protect the power supply from back EMF. This approach often does wonders.
There are other conditions that can cause power supplies to fail but, based on the research, the ones I’ve described happen most frequently. When designing a system, the main rule is to make the power supply itself the first consideration – not the last.
Engineers should try to eliminate the fan by using a fanless power supply if possible. They should also use legitimate components and create a well designed, robust system. It is also important to choose a power supply partner that offers an extended warranty, which helps ensure that they know what they are doing. But it is also vitally important for the engineer to understand the warranty. For example, when your low cost power supply fails, the warranty might mean only that a replacement unit is shipped with your next 1000-piece MOQ order from a far off land. That remedy does not begin to cover the cost of the failure.
A high-quality power supply company will take the lessons learned from experience and incorporate them into new designs to increase reliability and reduce field issues. A company willing to offer a long term warranty is signaling its confidence that you won’t have issues in the field in the first place.