Root Cause Analysis of complex systems

See also – “Root cause, a discussion”

 

There are six key steps in addressing the root cause of a problem:

1 – Describe all the symptoms of the problem in detail

2 – Identify possible causes

3 – Address the most probable cause

4 – Test to see if the problem is solved

5 – Repeat steps 3 and 4 as necessary

6 – If needed, make the fix permanent and apply it to similar equipment in a similar operating context.
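
Steps 3 to 5 form a simple loop, which can be sketched in Python as below. This is only an illustration of the sequence in the list above: the function name `find_root_cause` and its parameters `apply_fix` and `problem_solved` are hypothetical and stand in for whatever the operator and mechanic actually do.

```python
from typing import Callable, Iterable, Optional


def find_root_cause(
    candidates: Iterable[str],
    apply_fix: Callable[[str], None],
    problem_solved: Callable[[], bool],
) -> Optional[str]:
    """Address one candidate cause at a time, testing after each attempt.

    All names here are hypothetical placeholders for illustration.
    """
    for cause in candidates:      # step 5: repeat as necessary
        apply_fix(cause)          # step 3: address the next candidate cause
        if problem_solved():      # step 4: test to see if the problem is solved
            return cause          # ready for step 6: make the fix permanent
    return None                   # nothing worked; go back to steps 1 and 2
```

A return value of `None` means the list of possible causes from step 2 was incomplete, which is where a system expert's input becomes important.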

For simple systems, all these steps can be followed by a couple of people who are experienced in their operation and maintenance, perhaps just an equipment operator and an area mechanic. For more complex systems, others need to be involved.

Let’s look at an example of a very complex system, an automobile.

In this case, the vehicle is a small truck with a manual transmission, and the problem is intermittent fast idling.

 

Step 1 – Describe the problem.

The information for this step usually comes from the operator, although if Maintenance has been involved in some troubleshooting, the mechanic will also contribute. For this example, the problem is described as follows:

– High idle speed occurs only when the engine is hot – e.g. after running at highway speed for over 30 km.

– When coming to a stop, e.g. approaching a red light, as soon as the clutch is depressed or transmission put in neutral, the idle speed increases to 3,300 rpm until the vehicle comes to a complete standstill, when the idle speed immediately returns to normal.

– When the engine has just warmed up (after about 3 km) the problem may occur, but the idle speed is lower, about 1,500 to 2,500 rpm.

– The problem is intermittent in the long term. It happens for a couple of weeks about once per year and is not affected by temperature or humidity.

– It is also intermittent in the short term. It may occur at one set of traffic lights but not at the next.

 

Step 2 – Identify possible causes.

While the operator and mechanic may be able to successfully carry out these steps for many systems, for very complex systems such as an automobile, a system expert’s input is essential. In some cases, it is simply not possible to identify probable causes without extensive detailed knowledge of the system and all its components and their purpose.

In this example, the dealer’s engine control system specialist identified the following components as possibly contributing to the problem:

– the rear differential speed sensor

– the engine control computer

– vacuum hoses

– the throttle position sensor

– the mass air flow sensor

– the idle air control valve

– the PCV valve

Even the most experienced “operators” (drivers) are probably unaware that some of these components exist, let alone the effect they may have on engine speed.

 

Steps 3, 4 and 5 – Address the most probable cause, test and repeat as necessary.

In this and many other cases, it is more logical to start with the possible causes that are easiest to check, before turning to the most probable cause. Here, the mechanic checked the PCV valve and the vacuum hoses, then began temporarily replacing components with available spares, starting with the rear differential speed sensor and then the idle air control valve.

Replacing the idle air control valve solved the problem.
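
The "easiest to check first" approach can be illustrated with a short Python sketch. The `check_effort` ranks are assumptions made up for this illustration, not measured values: the first four simply reproduce the order the mechanic followed, and the remaining three are placeholders. The names `Candidate`, `order_by_effort` and `troubleshoot` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    check_effort: int  # hypothetical rank, 1 = easiest to check or swap


# Effort ranks are illustrative assumptions. Ranks 1 to 4 follow the order
# described in the example; ranks 5 to 7 are placeholders.
CANDIDATES = [
    Candidate("engine control computer", 7),
    Candidate("rear differential speed sensor", 3),
    Candidate("vacuum hoses", 2),
    Candidate("throttle position sensor", 5),
    Candidate("mass air flow sensor", 6),
    Candidate("idle air control valve", 4),
    Candidate("PCV valve", 1),
]


def order_by_effort(candidates):
    """Return the candidates sorted so the easiest checks come first."""
    return sorted(candidates, key=lambda c: c.check_effort)


def troubleshoot(candidates, fixes_problem):
    """Try candidates in order; return (culprit or None, names tried)."""
    tried = []
    for candidate in order_by_effort(candidates):
        tried.append(candidate.name)
        if fixes_problem(candidate):
            return candidate.name, tried
    return None, tried


# Simulate the outcome reported in the example: the idle air control valve.
culprit, tried = troubleshoot(
    CANDIDATES, lambda c: c.name == "idle air control valve"
)
print(culprit, "found after", len(tried), "checks")
```

In practice the ordering would weigh both the effort of a check and the likelihood of each cause; this sketch sorts on effort alone to keep the idea visible.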

 

This problem had been clearly described by the operator but could not be solved even by very experienced and capable local auto mechanics. It required the high level of expertise available only from the vehicle manufacturer to reach a solution.

This is typical of complex industrial systems, especially electronic and hydraulic controls, which are inherently so reliable that on-site and local people get little experience in troubleshooting and repairing them.

For complex systems, a true expert is an essential member of the RCA team.

 

 


Don Armstrong, P Eng

President, Veleda Services Ltd