Impact of Hardware Design on Failure Mode of Protective Devices

Authors: Paul Myrda and Charles Perry, Electric Power Research Institute, USA

This article presents the design characteristics and assumptions at the substation and in the protection device, along with field testing precautions, factory testing considerations, and other factors relevant to a control unit failure at a utility substation. As the utility industry migrates to smart grid technology and further ramps up its use of microprocessor-based products, design considerations and device failure modes become even more critical. In this particular case, device design changes and field upgrades to installed devices were required to mitigate the root cause.

Background
In the fall of 2009, personnel at a utility 12 kV substation were performing routine testing of a 125 V DC transfer switch (Figure 1). During this testing, after the 125 V DC transfer switch for the bus section was switched from Normal to Emergency, all circuit breakers on the bus section tripped open automatically and then closed sequentially several times. In total, six circuit breakers operated. To prevent further operations, the circuit breakers' Local/Remote switches were turned to Local. One of the circuit breakers was then closed locally, but it tripped when it was switched back to Remote.

Upon investigation, it was discovered that the bus section control unit had failed in a state where all outputs were high. When personnel tried to interrogate the unit, it failed to communicate through either the front or the rear port. When the control unit was powered down, all output relays were observed to de-energize. All output relays picked up again instantaneously when the control unit was repowered. The DC transfer switch that was exercised also supplies 125 V DC control power to this control unit.

This failure mode is not acceptable since it can cause customer outages. Because several of these control units are installed at the utility's substations, there was an urgent need to find the root cause of the failure and have it corrected.
The control unit was left out of service, awaiting replacement. As a result, the utility did not have remote indications or controls for this bus section, and all circuit breakers in the bus section had to remain in Local for isolation. The metering for the transformer was also out of service because of the defective control unit, which in turn prevented the station Voltage VAR Control (VVC) from functioning.

Preliminary Findings
The failed control unit was removed from the field and shipped to the supplier to determine the root cause of the failure.  The supplier's initial comments follow:

  • The most significant finding is that extensive damage occurred around Vcc and Vss pins, and the devices were exposed to electrical overstress (over-voltage or over-current) originating at those pins
  • No specific signature or chip failure mechanism, like latch-up conditions on parasitic bipolar transistors within the MOS structures, was recognized and acknowledged by the lab
  • Our further investigation is focused on simulating the cause of the device failure corresponding to the transfer switch operation as documented in materials obtained from the customer
  • Effects on control unit power supply during battery bank switching are of primary interest

Control unit grounding requirements:

  • The grounding method described in the control unit user's manual is very clear about the function of each grounding terminal on the back panel

Proper grounding practices must be followed. The two back panel grounding terminals, the power supply ground terminal and the chassis protective earth stud, may have equivalent DC resistance; however, at high frequencies these two connections do not provide equivalent EMI immunity.

Initial Inspection at Vendor Facility
The failed control unit was removed from the test rack. When the front panel was removed, an electrically programmable logic device (EPLD) was found to have a burn mark on it. This chip, IC2, is responsible for controlling the output relays on the control unit. Figure 2a shows the burn mark on the EPLD.

After a second board was removed, a chip similar in appearance to IC2, designated IC1, was found to have a similar brown spot on it. This chip controls general logic (Figure 2b). Inspection of the main board revealed no other visible problems. Voltage was then applied to the unit to perform a temperature analysis. The results follow:

  • Temperature was measured off center, at the lower right corner of the label
  • At startup the temperature was 26 °C
  • IC1 reached a steady 51 °C after 5 minutes
  • IC2 reached 60 °C and was still rising after 5 minutes
  • After 10 minutes IC1 remained at 51 °C
  • IC2 increased to 65 °C
  • At the center of IC2 the temperature was 110 °C
  • At the center of IC1 (on the dark mark) the temperature was 73 °C
  • According to the manufacturer's specifications, the maximum operating temperature is 85 °C
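The comparison behind these readings can be sketched in a few lines of Python. This is only an illustration, taking the listed values in degrees Celsius (85 °C as the manufacturer's maximum); the dictionary layout, the location labels, and the function name `over_limit` are assumptions made for this sketch and are not part of the supplier's report.

```python
# Readings from the list above, in degrees Celsius.
# The data layout and names are illustrative assumptions only.
MAX_OPERATING_TEMP_C = 85  # manufacturer's maximum operating temperature

readings_c = {
    ("IC1", "label corner, 5 min"): 51,
    ("IC2", "label corner, 5 min"): 60,
    ("IC1", "label corner, 10 min"): 51,
    ("IC2", "label corner, 10 min"): 65,
    ("IC1", "center"): 73,
    ("IC2", "center"): 110,
}


def over_limit(readings, limit_c):
    """Return readings above limit_c, mapped to their excess in deg C."""
    return {
        key: temp - limit_c
        for key, temp in readings.items()
        if temp > limit_c
    }


exceeded = over_limit(readings_c, MAX_OPERATING_TEMP_C)
for (chip, location), margin in exceeded.items():
    print(f"{chip} at {location}: {margin} C above the rated maximum")
```

With these values, only the reading at the center of IC2 is above the rated maximum. The center of IC1 is 12 °C below it.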

As shown in Figure 3, the IC2 temperature rose above the manufacturer's maximum operating temperature, which further indicates a failure of this chip. At this point, the remaining boards were removed from the control unit, cross-referenced with a parts list, and inspected for failures. No other visible failures were found on the remaining boards.

 

 
