Allocating and analyzing downtime

– The Operations/Maintenance relationship

As a part of a reliability improvement programme, many process industries assign the losses resulting from each downtime or lost production event to the department “responsible” – usually operations, mechanical or electrical (and perhaps others). Frequently this allocation is based on someone’s perception of who is to blame, which can have significant negative effects.

I recall the following amazing discussion at a morning meeting in a kraft pulp mill. About 50 tonnes of production had been lost because of a leak in an elbow in a secondary knotter reject pipe (the secondary knotter is the last stage of coarse debris removal after the pulp leaves the digesters, where wood chips are cooked to raw pulp).

Operations Superintendent – “The reject pipe is a piece of mechanical equipment, so this is mechanical downtime.”

Maintenance Superintendent – “Sorry, but the pipe wore out because your operators have been running the chip piles too low and putting a lot of gravel through the system. It’s not designed for that. It’s Operations’ fault.”

Operations – “But the reason for that is the chip conveyor to the chip pile keeps breaking down so the pile has got so low that they’re scraping the bottom, and the chip conveyor problem is definitely mechanical.”

Maintenance – “The conveyor breaks down because the operators aren’t keeping the conveyor gallery clean and chips get caught under the drums and run the belt off – that’s an Operations problem.”

Operations – “But we can’t clean it because the air line that supplies the lances for blowing under the conveyor is rusted out and we have no air there – that’s mechanical.”

Maintenance – “OK – so what’s the work order to repair the air line? We’ll get on to it tomorrow.”

Operations – “We haven’t submitted a work order yet – I’ll do that right now.”

The point is that downtime is downtime, and the focus needs to be on preventing problems from recurring, not on who is to blame. Arguing about blame drives a wedge through the Operations/Maintenance partnership. Such arguments are an inevitable result of the blaming process, and will be much more frequent (and louder) if a “departmental downtime” measurement is a component of an incentive programme, which it often is.

In this example, a prudent manager would recognize that the real root causes of the problem were a lack of communication between Operations and Maintenance and a failure to follow the work order/backlog/priority-setting/scheduling business process. If there is a mechanical inspection programme in place, it should also be reviewed if it is missing such fundamental problems as corroded service piping.

There is a better way.

Instead of assigning blame, a much more positive and productive approach is to always treat downtime as a joint responsibility of the Operations/Maintenance partnership. Record all losses against the equipment or event which resulted in the downtime (e.g. “Eq No. 23-4567, No. 3 hot oil pump” or “Raw material delivery delayed by rail strike”). (For more on measuring downtime, see “Measuring Reliability”.) Then assign the responsibility for action to the department which is in the best position to initiate and follow through so that the problem will be prevented from recurring. This department may not even be directly involved in the day-to-day operation. For example, in a pulp mill where a large fibreglass pipe failed during start-up, the root cause was determined to be a lack of training of the operators in the correct start-up procedure. Responsibility was assigned to Engineering to train the operators in the fundamentals of pump and piping systems and to develop a standard operating procedure for starting each pump.
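As a rough illustration of this record-keeping, the sketch below logs each loss against a piece of equipment or an event rather than a department, and carries a separate field for the department assigned to prevent recurrence. It is a minimal sketch only: the class and field names, the tonnage figures and the actions are all hypothetical and are not taken from any particular downtime system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DowntimeEvent:
    """One lost-production event (hypothetical record layout)."""
    source: str         # equipment or external event, never a department
    tonnes_lost: float  # production lost (illustrative figures only)
    action_owner: str   # department best placed to prevent recurrence
    action: str         # what that department has agreed to do


def losses_by_source(events):
    """Total lost production per equipment item or event."""
    totals = defaultdict(float)
    for event in events:
        totals[event.source] += event.tonnes_lost
    return dict(totals)


# Illustrative data; the tonnages and actions are invented for the example.
events = [
    DowntimeEvent("Eq No. 23-4567, No. 3 hot oil pump", 12.0,
                  "Maintenance", "Review inspection route for this pump"),
    DowntimeEvent("Eq No. 23-4567, No. 3 hot oil pump", 8.0,
                  "Engineering", "Write a standard start-up procedure"),
    DowntimeEvent("Raw material delivery delayed by rail strike", 30.0,
                  "Purchasing", "Assess buffer stock levels"),
]

totals = losses_by_source(events)
for source, tonnes in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{tonnes:6.1f} t  {source}")
```

Note that the loss totals are keyed on the equipment or event only. The action owner is recorded for follow-up, but no loss is ever summed by department, so there is nothing to argue about at the morning meeting.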

Frequently, the first action required after a major downtime event is to conduct a Root Cause Analysis, which may involve just a couple of knowledgeable people but may, on occasion, require extensive investigation (see “Root cause, a discussion”).

Of course, managers need to ensure that the people assigned responsibility for preventing problems from recurring do take the necessary action. If a part of that action is to initiate a work order for some preventive or corrective maintenance or redesign, then it is also necessary to ensure that that work gets done.

If your operation is qualified under ISO 9000, then including Maintenance in its scope provides a tool to track “Corrective Action Requests” (CARs) and “Preventive Action Requests” (PARs). Otherwise, a “How initiated” field in the work order database, with “Investigation” as one of the values in its drop-down list, enables managers to focus on all such work orders. The “Reason” field can be used to further separate “Investigation” work orders into safety, operations, environment, etc. See “Work Order coding” for more details.
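The filtering described above might look something like the following sketch. It assumes a work order record with “How initiated” and “Reason” fields as described; the class name, the work order numbers and the values other than “Investigation” are hypothetical, and a real maintenance system would do this with its own query or report tools.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class WorkOrder:
    """Hypothetical work order record with the two coding fields."""
    number: str
    how_initiated: str  # "Investigation" is from the text; other values are assumed
    reason: str         # e.g. "Safety", "Operations", "Environment"
    complete: bool


def open_investigation_orders(orders):
    """Work orders raised by an investigation that are not yet done."""
    return [wo for wo in orders
            if wo.how_initiated == "Investigation" and not wo.complete]


def count_by_reason(orders):
    """Number of work orders under each value of the Reason field."""
    return Counter(wo.reason for wo in orders)


# Illustrative data only.
orders = [
    WorkOrder("WO-1001", "Investigation", "Operations", False),
    WorkOrder("WO-1002", "Inspection", "Operations", False),
    WorkOrder("WO-1003", "Investigation", "Environment", True),
    WorkOrder("WO-1004", "Investigation", "Safety", False),
]

open_investigations = open_investigation_orders(orders)
by_reason = count_by_reason(open_investigations)

for wo in open_investigations:
    print(wo.number, wo.reason)
```

A manager reviewing this list each week can see at a glance which actions arising from downtime investigations are still outstanding, and in which area.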

There is no value in assigning blame and good value in addressing individual problems, but the best value comes from improving systems, such as business processes, so that the philosophy of problem avoidance becomes ingrained in the culture.


Don Armstrong, President
don.armstrong@veleda.ca
250-655-8267 Pacific Time
