Reducing the risk of unspared critical components

Related articles:

– What parts should be in your Maintenance Stores, and why?

– How effective is PM in reducing downtime?

– Getting the most from your PM programme.

– Equipment criticality ratings – are they of value?

It is not economical to carry spares for everything in a manufacturing plant. For some very expensive and critical components, such as power transformers and large gear reducers, the cost may be so high that a decision is made to accept the risk of operating without a spare, either installed or in the Stores. Modern manufacturing methods, supported by standards such as ISO 9000, have increased the reliability of machinery to the point where carrying a spare is less necessary than it used to be. However, there is still an obvious risk if no spare is kept on hand. Several actions can be taken to minimize this risk.
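As a rough illustration of the trade-off described above, the sketch below compares the expected yearly cost of the extra downtime incurred without a spare against the yearly cost of holding one. The function names and every figure (failure probability, waiting time, downtime cost, spare price and carrying rate) are hypothetical assumptions chosen only to show the arithmetic, not data from any plant.

```python
# Hypothetical comparison of carrying a spare versus accepting the risk.
# Every number below is an illustrative assumption, not plant data.

def expected_annual_risk_cost(failure_prob_per_year, extra_downtime_days, downtime_cost_per_day):
    """Expected yearly cost of the extra downtime incurred with no spare on hand."""
    return failure_prob_per_year * extra_downtime_days * downtime_cost_per_day


def annual_holding_cost(spare_price, carrying_rate):
    """Yearly cost of owning the spare (capital, storage, preservation)."""
    return spare_price * carrying_rate


risk = expected_annual_risk_cost(
    failure_prob_per_year=0.02,    # assumed: about one failure in 50 years
    extra_downtime_days=90,        # assumed wait for a replacement
    downtime_cost_per_day=50_000,  # assumed lost production per day
)
holding = annual_holding_cost(spare_price=500_000, carrying_rate=0.25)

print(f"Expected annual risk cost: ${risk:,.0f}")
print(f"Annual holding cost:       ${holding:,.0f}")
print("Carry a spare" if risk > holding else "Accept the risk and manage it")
```

With these assumed figures the expected risk cost is lower than the holding cost, which is the situation the rest of this article addresses. A simple expected-value comparison does not capture how damaging a single long outage can be, so it should be treated as a starting point only.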

The “Three P’s” principle to reduce operating risk is described below. This principle is primarily based on Reliability Centred Maintenance (RCM) fundamentals. RCM is the best way to ensure reliability and is universally used in the design and maintenance of equipment where a failure may have disastrous results, such as aircraft and submarines. However, RCM in industrial plants is very expensive to implement, and I believe that a well-designed preventive maintenance programme that is followed with discipline is a more economical approach for the general plant (see “How effective is PM in reducing downtime?”). For critical, unspared components, on the other hand, full RCM, and more, may well be justified. (For more on RCM, “RCM II” by John Moubray is recommended reading.)

The “Three P’s” principle is Prevent, Predict and Prepare.

  1. Prevent – Prevent failure by ensuring that the component is as well protected from damage and wear as is economically possible.
  2. Predict – a) Use the best available predictive maintenance tools to provide the earliest possible warning of a failure, and
     b) examine the conditions under which the component is operating (its “operating context”) to predict its probable life.
  3. Prepare – Establish procedures and processes that can reduce downtime in the event that a failure does occur.

Looking at the “Three P’s” in more detail:

The first step is to determine exactly what it is that you want to prevent, predict or prepare for. To do this the standard technique is to perform a “Failure Mode and Effects Analysis” (FMEA), a key component of RCM.

FMEA involves examining each component in detail and determining the various ways it could fail. It is a formal analysis which should be carried out by a team of experienced operators, maintenance people and engineers who are thoroughly familiar with the equipment and its operating context.

An example of part of an FMEA for a large gear reducer is shown below.

In this example, each sub-component is examined to determine the possible ways that it could fail (the “failure modes”), the effect of each failure mode, the probability of each failure mode occurring and the “failure development period”. The failure development period is the amount of warning that can be expected, measured from the time it can first be determined that something is starting to fail until the component can no longer perform its intended function.

For each failure mode with a significant probability of occurrence, actions to predict or prevent it are identified. Obviously, for failure modes which have a very short failure development period (such as a fuse failing), there will be no predictive actions.
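The screening logic described above can be sketched as a small data structure and one rule. This is a minimal sketch only: the field names, the three-level probability scale, the sample failure modes, the 14-day inspection interval, and the rule that prediction is worthwhile only when the failure development period is longer than the inspection interval are all assumptions made for illustration, not part of any formal FMEA standard.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    sub_component: str
    mode: str
    effect: str
    probability: str                # assumed scale: "low", "medium" or "high"
    development_period_days: float  # warning time before functional failure


def plan_actions(fm, inspection_interval_days, significant=("medium", "high")):
    """Return the kinds of action worth identifying for one failure mode."""
    if fm.probability not in significant:
        return []  # probability not significant: no action identified
    actions = ["prevent"]
    # Prediction only helps if an inspection can fall inside the warning period.
    if fm.development_period_days > inspection_interval_days:
        actions.append("predict")
    return actions


# Hypothetical entries for a gear reducer drive, for illustration only.
modes = [
    FailureMode("bearing", "fatigue spalling", "reducer seizes", "medium", 60),
    FailureMode("control fuse", "blows", "drive stops", "medium", 0),
    FailureMode("housing", "cracks", "oil loss", "low", 30),
]

for fm in modes:
    print(fm.sub_component, fm.mode, plan_actions(fm, inspection_interval_days=14))
```

In this sketch the fuse, with no warning period, receives only preventive actions, which matches the point made above about very short failure development periods.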

Actions to prevent failure may include changing operating procedures (e.g. frequent cleaning, avoiding overloads), changing maintenance procedures (e.g. increasing oil change frequency, checking the torque of electrical connection fasteners) or redesign (e.g. replacing lip seals with bearing isolators, changing lubricant type, installing shear pin couplings or installing cooling systems for electronic equipment).

Actions to predict failures include standard predictive maintenance methods, such as vibration measurements, oil analysis, infra-red inspections of loaded electrical equipment and other appropriate testing. For critical, unspared components, the frequency and extent of these measurements will normally be greater than for other operating equipment. Not all failure modes lend themselves to predictive maintenance (see “Getting the most from your PM programme”).

The second type of prediction is to obtain an expert opinion on the expected life of the component in its actual operating context. For the large gear reducer in the above example, if it is operating in clean, dry conditions, not subject to shock loading and generally loaded below its continuous rating, its life expectancy may be many times the average life for similar components. This knowledge will help to establish the predictive maintenance methods to be used and to assess the value of the various options for increasing protection, as shown under “Preventive methods” in the FMEA example. The equipment manufacturer is in the best position to assist with this prediction, but if this is not possible a local expert should be approached. For the gear reducer example, a local gear manufacturer or your lubricant supplier’s technical staff may be good resources.

The third “P” is to Prepare. Even after the best preventive measures are in place and predictive maintenance is well-established, there will always remain some possibility of a failure. For critical, unspared equipment, it is worth having a contingency plan in place and filed under the equipment location number so that it can be used, perhaps years in the future, if a breakdown does occur. Some examples of such preparation are:

– where delivery times from the equipment manufacturer are very long, one option is to have a local machine shop make a drawing of the component in advance, so that in the event of a breakdown they can quickly make a replacement that is at least good enough to run until an OEM part has been ordered and delivered.

– develop a repair procedure for all likely failure modes.

– have available an alternative component that could be used for long enough to obtain a replacement from the manufacturer. This may include designing or making an adaptor base.

– locate other companies that use the same component and reach some agreement on sharing spares.

A couple of examples of preparedness include:

– a mine with a large ball mill engaged a metallurgist to develop a welding procedure to repair or rebuild a broken tooth on the large ring gear drive, should a tooth ever break. A spare ring gear would cost over $500,000, with a delivery time of about 12 months.

– a sawmill purchased a hydraulic drive and power pack from a large fishing vessel that was being scrapped. In the event of a breakdown, this drive unit could temporarily replace any one of several large conveyor drive units once a temporary drive base to fit the hydraulic motor had been fabricated, which could be done quickly.

At the very least, critical unspared components should be clearly identified as such so that the people responsible for their operation and maintenance give them the care they deserve.

© Veleda Services Ltd

Don Armstrong, President