Developing the holistic approach - from single element testing to system validation

Authors: T. Schossig and C. Pritchard, OMICRON electronics GmbH, Austria

Elements become Systems

The first protection systems were built with single-function primary devices (Figure 1). But the complexity of the panels very early on led to electromechanical (EM) relays aggregating multiple elements (e.g. direction and impedance).
EM relays are still in operation and have outlived their successors, the electro-static relays (Figure 2). But in the early 1980s we finally saw a technological revolution: the microprocessor-based or digital relays, which to this day are the de facto standard when building or refurbishing a protection system. Digital relays brought the benefits of:

  • Reducing the complexity, size and cost of the protection panel by combining the functionality of many EM relays into one digital relay
  • No setting drift caused by mechanical components
  • Adding fault reporting
  • Possibility to add communication

Since the first digital relays, many evolutionary steps have followed that are important when we want to shape the future of protection testing.

From multifunctional protection relays to multipurpose devices: What used to be a protection relay with the single purpose of protecting the power system must nowadays fulfill multiple purposes, for example serving as a:

  • Phasor measurement unit (PMU)
  • Meter
  • Power Quality device
  • Automation and control unit

IED + Communication = System: Of all technological developments, communication and the data it makes available have the biggest impact on PAC systems today. There is almost no transmission line today that is not protected by a differential protection scheme, a permissive transfer tripping scheme, or both. Distribution schemes use communication for isolation and restoration. Inside substations, the effort of applying breaker failure, reverse blocking and fast bus transfer schemes is decreasing. With IEC 61850 communication, engineers can simply decide whether they want to engineer such schemes, without the need for wiring changes. And with the first implementations of centralized protection and wide area protection systems, we get a glimpse of what is next.

History of control systems: Control was an important topic from the beginning. Figure 4 shows an example from the 1920s. The driving mechanism stored 100 operations. After that, it had to be reset by hand, which included a visual check.
Multifunctional digital relays come with hundreds of functions and include control.
With IEC 61850 and Sampled Values, often combined with non-conventional instrument transformers, IEDs (Intelligent Electronic Devices) and even digital centralized protection, automation and control became possible, as the readers of this magazine are well aware.

Developments: So, control has developed from classical control to complex control systems, just as protection has, and both have merged into combined PAC. Automation functions such as busbar change, interlocking, visualization of topology and load control became reality. Whereas in the past line communication with 150 or 300 baud was sufficient for control operations and the transmission of sum alarms, more bandwidth was now required. Network technology entered the substation. While IEC 60870-5-104 was the first step in describing communication via Ethernet, a new concept had to be developed that contained data modelling, a description of services, and flexibility.
Communication, especially with IEC 61850, became a part of the PAC system. Real-time communication with GOOSE opens a wide range of opportunities. Position indications can be exchanged via multicast, and protection as well as automation information can be distributed quickly within and even outside substations. The digital substation, as covered in the last issues, pushed this development dramatically.
This has an impact on testing and testing technology.

Evolution of Protection Testing

History of protection testing: The history of protection testing is covered in the current issue of the magazine and will be continued.
Test sets had to be multifunctional (Figure 5). They had to be portable (Figure 6).
Automation already started at this time. Most important of all was to have fun while testing, as can be seen in Figure 7.
With the digital relays, a new era in testing started (Figure 8).

History of SAS testing:

Substation Automation Systems have been available for a long time. They have been put into operation and tested in detail during commissioning and site acceptance ("bit test"). Routine or maintenance testing was not common practice.

Why are we testing protection? Utilities install and maintain power system protection with the intent to maintain or increase power system reliability. To achieve this goal, utilities, standardization committees and federal agencies developed rules, processes and best practices that shaped state-of-the-art testing today. This can sometimes lead us to forget the goal of system reliability and to test only for compliance and to avoid legal claims. Therefore, we must constantly realign the testing strategy with the goal of improving power system reliability and reshape state-of-the-art protection testing accordingly.

Let's make this goal a little more tangible.  A well-designed protection system maintains an appropriate balance between selectivity, dependability, security and speed. If the protection is not working correctly during a system event, we call it a misoperation. By testing the protection, we try to find errors before they cause a misoperation. Therefore, with limited testing time and resources, we must invest our efforts where we can prevent misoperations. Which raises the question: what is causing today's misoperations?

The causes for misoperations in modern protection systems: In the early days of electromechanical (EM) protection, the cause for misoperations was very often found in the relay itself. Temperature, vibration and other influences caused the relay threshold to drift or caused complete mechanical failure. To avoid such relay failures, testing thresholds – for example over-current pick-up values and time delays – is a proven technique.
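Such a threshold test can be sketched in a few lines. The example below is only an illustration under stated assumptions: it assumes an overcurrent element with the IEC standard inverse characteristic, and the settings, measured values, tolerances and function names are all hypothetical.

```python
def iec_standard_inverse_time(current_a, pickup_a, tms):
    """Nominal trip time of an IEC standard inverse overcurrent curve.

    t = TMS * 0.14 / ((I / Is) ** 0.02 - 1); no trip at or below pickup.
    """
    ratio = current_a / pickup_a
    if ratio <= 1.0:
        return float("inf")
    return tms * 0.14 / (ratio ** 0.02 - 1.0)


def within_tolerance(measured, nominal, rel_tol, abs_tol=0.0):
    """True if the measured value lies inside the larger of both tolerances."""
    return abs(measured - nominal) <= max(rel_tol * nominal, abs_tol)


# Hypothetical settings: 1000 A pickup, time multiplier 0.1.
nominal_time_s = iec_standard_inverse_time(current_a=2000.0, pickup_a=1000.0, tms=0.1)

# Hypothetical measurements taken with a test set.
pickup_ok = within_tolerance(measured=1012.0, nominal=1000.0, rel_tol=0.03)
time_ok = within_tolerance(measured=1.02, nominal=nominal_time_s,
                           rel_tol=0.05, abs_tol=0.04)
```

A drifted EM relay would show up here as a measured pickup or trip time outside the tolerance band, which is exactly the failure mode this kind of test was designed to catch.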
Since the introduction of the first microprocessor-based relays, the power and complexity of relays have continued to grow. Modern relays have more than 30 protection elements and approximately 1000 setting values plus freely programmable logic. Thanks to now-affordable communication technology, these relays are part of a bigger protection scheme. At the same time, fewer mechanical parts suggest that the causes of misoperations are shifting. The NERC misoperation study confirms this hypothesis with numbers (see Figure 10).

This study shows the reported main causes of misoperations in North America over the span of a year. Relay failures cause 20% of all misoperations, which still justifies conventional testing methods to some extent, though it must be considered that a significant number of EM relays are still in operation in North America. As almost every transmission system relies on line differential protection, transfer tripping schemes or both, it is no surprise that communication failures rank third as a cause of misoperations. With more IEC 61850 substations being built, protection engineers will start taking advantage of this and will more often design schemes such as breaker failure, fast bus transfer etc. This will increase the relative share of misoperations caused by communication.

But what really underlines the shift in misoperations is that the most common causes of misoperations are setting, logic & design errors. These errors are usually assigned to protection engineers, who should ideally avoid making such errors. But looking at this study, we must accept that setting, logic & design errors will sneak through the engineering process. Fortunately, this is why we test! Ideally the field test would act as a last safety net, finding the setting, logic & design errors before the system goes into operation.

With the goal of protection testing in mind and knowing the cause for misoperations, we can come up with a better testing strategy. We identified three sub-goals:

  • Making sure the protection is operating correctly under real-world scenarios and realistic current and voltage signals
  • Testing a system of IEDs simultaneously, to make sure the communication and coordination are working correctly
  • Increasing efforts for testing protection automation and logic

System Testing Equipment: Typical Components

Single line: Every activity in a substation starts with a look at the single line diagram. IEC 61850 also offers the opportunity to include single line information. The upcoming IEC 61850 part 6-2 (Configuration description language for extensions for human machine interfaces) will even describe the elements.
Modern test equipment must include and visualize single line information.

Testing the communication of several IEDs instead of a single one: With IEC 61850 it became possible to connect to several IEDs during a test and to show test GOOSE messages sent out as multicast (one to many). This is the move from a single device under test (DUT) to multiple DUTs (Figure 9).

Power system simulation: The simplest method to test the protection system for real-world scenarios is to use a power system simulation to calculate the voltage and current signals. Such a power system simulation can be directly integrated into the testing software (SW). After entering the power system data of the small network segment we need to investigate, we can define fault scenarios (or other system conditions). The power system simulation will calculate the scenario and directly output the signals via the test set. Using a transient simulation, advanced protection functions like power swing blocking, time domain protection etc. and scenarios involving CT saturation, series compensation etc. can be tested.
Contrary to a steady-state threshold test, such a test validates whether the protection system works correctly under real power system conditions (Figure 11).
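As a rough illustration of what a transient simulation computes, the following sketch evaluates the textbook fault current of a single-phase series R-L circuit, including the decaying DC offset that a steady-state test would not contain. It is a deliberately simplified assumption, not the network model of any particular testing software; all function names and parameter values are hypothetical.

```python
import math


def fault_current(t, v_peak, r_ohm, l_henry, freq_hz=50.0, inception_angle=0.0):
    """Fault current of a series R-L circuit switched onto a sinusoidal source.

    i(t) = V/|Z| * [sin(wt + a - phi) - sin(a - phi) * exp(-t / tau)]
    with phi = atan(wL / R) and tau = L / R (r_ohm must be > 0).
    """
    omega = 2.0 * math.pi * freq_hz
    z = math.hypot(r_ohm, omega * l_henry)
    phi = math.atan2(omega * l_henry, r_ohm)
    tau = l_henry / r_ohm
    steady_state = math.sin(omega * t + inception_angle - phi)
    dc_offset = -math.sin(inception_angle - phi) * math.exp(-t / tau)
    return v_peak / z * (steady_state + dc_offset)


def sample_waveform(duration_s, sample_rate_hz, **circuit):
    """Sample the fault current at a fixed rate, as a test set would replay it."""
    n = int(round(duration_s * sample_rate_hz))
    return [fault_current(k / sample_rate_hz, **circuit) for k in range(n)]


# Hypothetical source and line data for a 100 ms fault record.
waveform = sample_waveform(0.1, 4000.0, v_peak=100.0, r_ohm=1.0, l_henry=0.01)
```

The inception angle determines how much DC offset appears, which is one reason why the same fault location can stress a relay (or saturate a CT) very differently from one shot to the next.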

Testing more than one IED (also distributed): Using a power system simulation also helps us with our goal of testing a system of IEDs. The biggest challenge when testing multiple IEDs is to calculate consistent test signals for each IED within a test step. The simulation takes care of this, as every simulation step calculates the test signals for each bay simultaneously.
The next challenge is to inject all test signals simultaneously. If the IED is still connected conventionally to the process values (e.g. via CTs and VTs), the capabilities of a single test set are often the limiting factor. To overcome this limitation, it is known practice to time-synchronize multiple test sets. The innovation here is to control these test sets from one single application.
If testing in only one substation, the test sets can time-synchronize each other by providing an IEEE 1588 PTP master themselves via their Ethernet port – no additional GPS signal required. The testing SW calculates the transient signals within the simulation and transfers the signals onto the test sets. After the transfer is completed, the SW will set a start time (usually 10 s ahead) for all test sets. The test sets will start the execution and record the binary events. After the execution is completed, the binary events are transferred back to the software for assessment.
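The transfer-then-arm sequence described above can be sketched as follows. This is a minimal model under stated assumptions: `SimulatedTestSet` and `schedule_synchronized_shot` are hypothetical names, the test sets are assumed to share a synchronized clock already, and no real test set API is used.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SimulatedTestSet:
    """Hypothetical stand-in for one time-synchronized test set."""
    name: str
    waveform: List[float] = field(default_factory=list)
    start_time_s: Optional[float] = None

    def load(self, waveform: List[float]) -> None:
        self.waveform = list(waveform)

    def arm(self, start_time_s: float) -> None:
        if not self.waveform:
            raise RuntimeError(f"{self.name}: no waveform transferred yet")
        self.start_time_s = start_time_s


def schedule_synchronized_shot(test_sets: List[SimulatedTestSet],
                               waveforms: Dict[str, List[float]],
                               now_s: float,
                               lead_time_s: float = 10.0) -> float:
    """Transfer all waveforms first, then give every test set the same start time."""
    for test_set in test_sets:
        test_set.load(waveforms[test_set.name])
    start_time_s = now_s + lead_time_s
    for test_set in test_sets:
        test_set.arm(start_time_s)
    return start_time_s


sets = [SimulatedTestSet("bay_1"), SimulatedTestSet("bay_2")]
start = schedule_synchronized_shot(
    sets, {"bay_1": [0.0, 1.0], "bay_2": [0.0, -1.0]}, now_s=100.0)
```

Arming only after every transfer has finished is the point of the lead time: a slow transfer to one test set must not cause that set to miss the common start.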

If the test sets are positioned in different substations, each substation will need a GPS clock to provide the time. To control the remote test sets, a PC at the remote station (with Internet connection) needs to run a remote agent that grants the controlling PC remote access to the test set. The principle of execution is the same.
The benefits of controlling multiple test sets with one SW are:

  • No coordination via phone required
  • Troubleshooting a test from one PC
  • All relay responses available in one software for an overall assessment
  • One report of the whole test

The remaining issue with these setups is the wiring effort, which increases linearly with the number of test sets. With IEC 61850 process bus on the horizon, the wiring effort becomes almost zero. As all process data (currents, voltages, CB status etc.) are available on one or multiple switches, we can freely decide which part of the system we want to test. The test isolation would be performed using the mode data object in each logical node of the IED. Afterwards the test set would start injecting Sampled Values and GOOSE with the simulation flag set to true. On the IEDs under test, LPHD.SIM will be set to true correspondingly. After all test cases are executed successfully, the routine would have to be reversed.
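The isolate, inject and reverse routine can be sketched as a context manager that always restores the original state. This is a simplified model, not an IEC 61850 client: `IedModel` and `isolated_for_test` are hypothetical names, the mode is modelled once per IED rather than per logical node, and the rule that an IED accepts simulated messages only while its simulation setting is true is a simplification.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class IedModel:
    """Hypothetical, heavily simplified view of an IED's test-related state."""
    name: str
    mod: str = "on"      # mode, modelled once per IED for brevity
    sim: bool = False    # whether simulated messages are accepted

    def accepts(self, message_is_simulated: bool) -> bool:
        # Simplification: simulated traffic is used only while sim is true.
        return self.sim if message_is_simulated else not self.sim


@contextmanager
def isolated_for_test(ieds):
    """Put the IEDs into test mode and restore the previous state afterwards."""
    previous = [(ied, ied.mod, ied.sim) for ied in ieds]
    try:
        for ied in ieds:
            ied.mod = "test"
            ied.sim = True
        yield ieds
    finally:
        # Reverse the routine even if a test case raised an exception.
        for ied, mod, sim in previous:
            ied.mod = mod
            ied.sim = sim


ieds = [IedModel("line_protection"), IedModel("busbar_protection")]
with isolated_for_test(ieds) as under_test:
    accepted_during_test = [ied.accepts(message_is_simulated=True) for ied in under_test]
```

Restoring the state in a `finally` branch reflects the main operational risk of this approach: an IED that is accidentally left in test mode after the test.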

Testing logic: Logic errors are one of the main causes of misoperations. While standard protection elements are already type tested at the manufacturer, custom logic is completely in the hands of the engineer. Therefore, we already concluded that we must focus more on testing automation and logic. The challenge in testing logic until today is that the test sequences must be built manually. Such a test sequence must satisfy multiple conditions, which involve not only binary logic but also voltages and currents. Testing systems of IEDs, even in different substations, adds further challenges.
When testing logic from a system perspective, we only need to make sure that when a trip or close command is sent, the breaker operates within the simulation and the correct current flow is simulated across all IEDs. If this is not the case, it might be considered a breaker failure, and logic that would become active after the first trip cannot be executed.
The capability of a simulation to react to a command of the system under test is usually called real-time closed-loop. But real-time simulation systems are only suitable for the lab, require expert knowledge and a high investment.
A suitable alternative to hard real time is an iterative closed-loop algorithm.

Assuming a test case with a line fault: As already described, the transient waveforms are calculated and transferred to the test sets. The test sets will simultaneously start the injection (of the first iteration). Shortly after fault inception, once the fault current is injected, the IED will respond with a trip. The waveforms didn't include any breaker opening, so the test system will immediately stop the injection of all test sets. Based on the assumption that the trip will occur with the same time delay after fault inception again, new waveforms are calculated, containing the fault inception and a breaker open event. The test sets will simultaneously start to inject the (second iteration) waveform. Again, after fault inception the IED under test will trip, but this time the waveforms contain the breaker opening event. With some time delay the IED will issue a close command. Now the same procedure will repeat. Finally, after all breaker commands have been automatically added to the simulation, the final iteration will be taken as the result for assessment, which achieves a result similar to that of a real-time simulator. The whole process runs fully automated, which simplifies logic testing. To test an auto-reclose sequence, the tester only places a fault on the line and the iterative closed-loop will take over. Figure 12 shows an example with two iterations. A miscoordination becomes immediately visible in the power system diagram.
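The iteration described above can be sketched as a small loop. This is only a toy model under stated assumptions: `toy_relay` is a hypothetical stand-in that trips a fixed delay after fault inception and recloses a fixed dead time after the breaker has opened, the timing values are invented, and the actual waveform calculation and injection are left out.

```python
def toy_relay(breaker_events, fault_time_s=0.1, trip_delay_s=0.03, dead_time_s=0.5):
    """Hypothetical relay: returns the commands it issues for one injection."""
    commands = [(fault_time_s + trip_delay_s, "trip")]
    open_times = [t for t, action in breaker_events if action == "open"]
    if open_times:
        # A reclose is only issued once the relay has seen the breaker open.
        commands.append((open_times[0] + dead_time_s, "close"))
    return commands


def iterative_closed_loop(relay, breaker_time_s=0.05, max_iterations=10):
    """Re-run the scenario until every relay command has a breaker event."""
    breaker_events = []
    for iteration in range(1, max_iterations + 1):
        commands = relay(breaker_events)      # one injection of the waveforms
        handled = len(breaker_events)
        if len(commands) <= handled:
            # No unexpected command: this iteration is the final result.
            return breaker_events, iteration
        # Assume the command repeats with the same timing in the next run
        # and add the matching breaker operation to the simulation.
        time_s, command = commands[handled]
        action = "open" if command == "trip" else "close"
        breaker_events.append((time_s + breaker_time_s, action))
    raise RuntimeError("closed-loop iteration did not converge")


events, iterations = iterative_closed_loop(toy_relay)
```

In this toy case two iterations each uncover one new command (the trip, then the reclose), and a further run with both breaker events in the waveforms serves as the result for assessment. The approach relies on the relay responding repeatably from shot to shot, which is the same assumption stated in the text.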

When to Test

Routine testing? -> Check for changes in the system:
Another important consideration, other than the testing tools and methodology, is the lifecycle of a protection system. The testing tools and methodology described so far can be applied to the following stages in the lifecycle:

  • Pre-qualification testing
  • Factory acceptance testing
  • Site acceptance testing
  • Commissioning
  • Maintenance or Routine Testing

Generally, it can be said that the later the stage at which an error is found, the costlier it will be. Conversely, costs can even be reduced by eliminating redundancy in the testing process. Discussing every stage of this process in detail would go beyond the scope of this article, except for the topic of routine testing, which must be re-evaluated.
The main goal of most routine tests today is to verify the correct functioning of the relay asset. The testing interval mainly depends on the technology being used: electromechanical relays must regularly be re-adjusted, and even microprocessor-based relays can have component failures now and then. Digital substations, though, can take advantage of the fact that GOOSE and Sampled Value communication has a heartbeat by design. If an IED fails, for example because of DC supply issues, its communication drops out, which can raise an alarm. This would further extend the routine test interval for most IEDs. Only the end points, such as standalone merging units and switch control units, would still require a shorter interval.
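The supervision idea can be sketched as a simple timeout check on the last message received from each publisher. The function name, the publisher names and the timestamps are hypothetical, and a single timeout for all streams is an assumption made for brevity.

```python
def find_silent_publishers(last_seen_s, now_s, time_allowed_to_live_s):
    """Return the publishers whose last message is older than the timeout."""
    silent = []
    for name, seen_s in last_seen_s.items():
        if now_s - seen_s > time_allowed_to_live_s:
            silent.append(name)
    return sorted(silent)


# Hypothetical timestamps (seconds) of the last message seen per IED.
last_seen = {"IED_A": 9.5, "IED_B": 4.0, "MU_1": 9.9}
alarms = find_silent_publishers(last_seen, now_s=10.0, time_allowed_to_live_s=2.0)
```

Such a check only shows that a device is alive and communicating; it says nothing about whether its settings and logic are still correct.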
By contrast, the intervals in which the power system itself changes are getting shorter. These opposing trends can lead to a situation in which, before the first routine test is performed, the settings and design are no longer appropriate for the new power system status. Therefore, we must consider that the conventional routine test will become a re-commissioning, checking whether the present protection scheme is still doing its job. For a re-commissioning, the system-based test used during commissioning can easily be reused. Only the power system data must be updated before the test plan can be re-executed.

Testing the SAS after firmware update, security: With the introduction of complex substation automation systems, distributed functionality in combination with protection, and updates required by security measures, the topic became more important. Meanwhile, testing communication and automation is regarded as essential and time-consuming. Testing solutions are available (PAC World 2018-03) (see Figure 13).

Testing Together
As structures in the utility are developed further, it is no longer possible to distinguish between protection testing and control testing in all cases. Whoever is testing the protection should be testing the communication to the HMI as well. Changing the communication might have an impact on the protection setup of the IED and demand a new protection test. A slightly different configuration revision in IEC 61850 GOOSE (confRev) can cause a GOOSE subscription to fail and, for instance, stop a breaker failure scheme from working. So, the protection testing solution must be capable of working with IEC 61850 GOOSE, setting devices under test to test mode and utilizing sampled measured values. Testing the SAS with the original protection system delivers trustworthy results, from the source to the local and remote HMI.
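The confRev problem can be illustrated with a small consistency check between what a publisher sends and what a subscriber was configured to expect. The sketch is an assumption-laden illustration: the function name and all values are hypothetical, and real subscribers may check more or fewer fields than the four GOOSE header fields compared here.

```python
CHECKED_FIELDS = ("gocbRef", "datSet", "goID", "confRev")


def subscription_mismatches(published, expected):
    """List the header fields in which a received GOOSE differs from the
    subscriber's configuration."""
    return [name for name in CHECKED_FIELDS
            if published.get(name) != expected.get(name)]


# Hypothetical subscriber configuration, engineered before the change.
expected = {"gocbRef": "IED1LD0/LLN0$GO$gcbBF", "datSet": "IED1LD0/LLN0$dsBF",
            "goID": "BF_Start", "confRev": 3}

# Hypothetical publisher after its data set was re-engineered.
published = dict(expected, confRev=4)

mismatches = subscription_mismatches(published, expected)
```

Running such a comparison for every subscription after a configuration change points directly at the streams that need re-engineering and a new test.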

As we see, protection, automation and control and their testing are growing together. Testing the PAC system as a system becomes reality.  Modern testing solutions have to take this into account.


Thomas Schossig (IEEE) received his master's degree in Electrical Engineering at the Technical University of Ilmenau (Germany) in 1998. He worked as a project engineer for control systems and as a team leader for protective relaying at VA TECH SAT in Germany from 1998 until 2005.
In 2006 he joined OMICRON as a product manager for substation communication products.
He is author of several papers and a member of standardization WGs.

Christopher Pritchard Dipl.-Ing. (FH) received a diploma in Electrical Engineering at the University of Applied Science in Dortmund. He joined OMICRON electronics in 2006 where he worked in application software development in the field of testing solutions for protection and measurement systems and is now the responsible Product Manager for system-based testing solutions.
