Examining the Processes of RCM and TPM
What do they ultimately achieve and are the two approaches compatible?
Author: Ross Kennedy
President, The Centre for TPM (Australasia)
The Background of Reliability Centred Maintenance (RCM)
RCM evolved during the 1950s in the aircraft industry as a result of a number of major reliability studies concerning complex equipment. In particular, the 1960 FAA / Airline Industry Reliability Program Study was initiated to respond to rapidly increasing maintenance costs, poor availability, and concern over the effectiveness of traditional time-based preventive maintenance. This, like several other initial studies, centred around challenging the traditional approach to scheduled maintenance programs which were based on the concept that every item on a piece of complex equipment has a 'right age' at which complete overhaul is necessary to ensure safety and operating reliability. Through these 'reliability programs' it was discovered that many types of failures could not be prevented or effectively reduced by such 'right age' overhauls no matter how intensively they were performed.
Two notable and surprising findings from the 1960 FAA / Airline Industry Reliability Program were that:
- scheduled overhauls had little effect on the overall reliability of a complex item unless the item had a dominant failure mode; and that
- there were many items found for which there was no effective form of scheduled maintenance.
A New Perspective on Failure
As the results of these various aircraft reliability studies unfolded, the traditional views of equipment failure, as depicted by the First Generation (pre World War II) and Second Generation (post World War II) curves, were challenged. Finally, a new series of Third Generation failure curves was developed relating to specific types of equipment on aircraft (see Figure 2). Various studies have since been carried out to relate these curves to other industries.
It became evident from the Third Generation failure patterns that views of equipment failure needed to change, as did what should be done to prevent failure. Imposed age limits and Time-Based Maintenance schedules often do little or nothing to improve the reliability of complex equipment. As shown in Figure 3, traditional maintenance can actually increase failure rates by introducing infant mortality into otherwise stable systems.
To address these issues, maintenance was faced with four challenges:
- to deal effectively with each type of failure process with appropriate maintenance tactics;
- to improve maintenance productivity by moving towards a more pro-active and planned approach;
- to extend run length between scheduled shutdowns; and
- to ensure the active support and cooperation of people from the maintenance, material, operations and technical functions.
Reliability Centred Maintenance provides a maintenance oriented framework to meet these challenges. RCM can be defined as: a structured, logical process for developing or optimising the maintenance requirements of a physical resource in its operating context to realise its "inherent reliability", where "inherent reliability" is the level of reliability which can be achieved with an effective maintenance program. This level of reliability is a function of the equipment's design and cannot be improved without redesign.
RCM is basically a methodology to balance the resources being used with the required inherent reliability based on the following precepts:
- a failure is an unsatisfactory condition and maintenance attempts to prevent such conditions from arising;
- the consequences of failure determine the priority of the maintenance effort;
- equipment redundancy should be eliminated, where appropriate;
- condition-based or predictive maintenance tactics are favoured over traditional time-based methods; and
- run-to-failure is acceptable, where warranted.
RCM Seven Step Implementation Process
RCM has seven logical review steps as shown above which are structured in an iterative process usually based on risk analysis and which depend on a clear understanding of the business objectives and requirements.
Two key tools are used in RCM: the Decision or Logic Diagram, which is called MSG-3 (Maintenance Steering Group - model 3) in the aircraft industry where it evolved; and FMECA (Failure Mode, Effect and Criticality Analysis).
The Decision Diagram is used to select maintenance tactics that are technically feasible and worth doing. Figure 4 shows a simple example of a Decision Diagram; however, in practice, a more comprehensive logic analysis is performed using more sophisticated diagrams.
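As a rough illustration of how such a decision diagram works, the sketch below walks a failure mode through a much-simplified sequence of questions. It is not MSG-3 or any published logic diagram; the class, field names and question order are assumptions chosen to mirror the precepts listed earlier (condition-based tactics preferred over time-based ones, run-to-failure accepted where warranted, redesign as the last resort).

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """Hypothetical description of one failure mode for the sketch below."""
    name: str
    condition_task_applicable: bool  # condition monitoring is feasible and worth doing
    age_related: bool                # likelihood of failure rises with age or usage
    failure_tolerable: bool          # consequences of failure are acceptable


def select_tactic(fm: FailureMode) -> str:
    """Pick a maintenance tactic using a simplified, illustrative question order."""
    if fm.condition_task_applicable:
        return "condition-based maintenance"
    if fm.age_related:
        return "time-based maintenance"
    if fm.failure_tolerable:
        return "run-to-failure"
    return "redesign"


# Hypothetical failure modes, for illustration only.
examples = [
    FailureMode("bearing wear", True, True, False),
    FailureMode("filter clogging", False, True, False),
    FailureMode("indicator lamp burnout", False, False, True),
    FailureMode("random control card fault", False, False, False),
]
for fm in examples:
    print(f"{fm.name}: {select_tactic(fm)}")
```

A real analysis asks more questions at each step (hidden failures, safety and environmental consequences, task intervals), which is why more sophisticated diagrams are used in practice.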
Reliability Centred Maintenance has been renamed a number of times to distance it from its hi-tech origins and occasionally indicate a fresh approach - these names include RAM, RMA, R&M, MTA, MSG-3, RCM I and RCM II.
Fortunately, although the names have changed, the underlying principles of RCM have not! RCM was developed as a strategic methodology for developing a cost effective maintenance plan by identifying:
- what you want out of your equipment;
- what your equipment can do;
- the way in which it may fail to meet your requirements; and
- what you can do to ensure your equipment meets your expectations in a safe and cost-effective manner.
This is achieved using a progressive logical approach based on identifying, for every significant maintainable item, its:
- Functional Failure
- Failure Effects
- Failure Cause
then applying a logic model to each item so as to identify tasks and maintenance inspection intervals.
It should be noted, however, that this approach is severely hampered if the issues of Basic Equipment Condition, Operating Standards and Accelerated Deterioration are not addressed first.
The Background to TPM
Unlike RCM, which emerged from the American aircraft industry, TPM had its genesis in the Japanese car industry in the 1970s. It evolved at Nippon Denso, a major supplier to the Toyota Car Company, as a necessary element of the newly developed Toyota Production System, which was originally thought to incorporate only Total Quality Control (TQC), Just in Time (JIT), and Total Employee Involvement (TEI). It was not until 1988, with the publication in English of the first of two authoritative texts on the subject by Seiichi Nakajima, that the western world recognised and started to understand the importance of TPM.
Suddenly it became obvious that TPM was a critical missing link, not only in achieving world class equipment performance to support TQC (variation reduction) and JIT (lead time reduction), but also as a powerful new means of improving overall company performance. Hence it has only been since the early 90s that TPM has started to spread rapidly throughout the western world, significantly improving the performance of manufacturing, processing, and mining companies. TPM is now having a major impact on bottom-line results by revitalising and enhancing the quality management approach to substantially improve capacity while significantly reducing not only maintenance costs but overall operational costs. Its successful implementation has also resulted in the creation of much safer and more environmentally sound workplaces.
The Evolution of TPM
Traditionally, high buffer stocks were allowed to develop between major pieces of plant & equipment to ensure that a problem with one piece would not affect production from the rest of the plant. Hence the role of maintenance was to cost effectively ensure major pieces of plant & equipment were available for an agreed proportion of scheduled time, for example 90%.
Because of the accepted practice of retaining high buffer stocks, most items of equipment could be considered independent. If the equipment in a process was maintained such that it achieved 90% availability, the availability of the process was 90%. If the equipment started to cause quality problems, these would probably be noticed in final quality inspection and the cause traced back to the offending piece of equipment and corrected by maintenance.
At Nippon Denso in 1970, with the introduction of the Toyota Production System, buffer stocks were substantially reduced in the quest for shorter leadtimes and improved quality. Statistical Process Control (SPC) supported by "Quality at Source" was introduced to ensure quality right first time, so as to provide maximum customer value through the highest quality at the lowest cost, supported by quick responsiveness and superior customer service. Hence in this quest for maximum customer value, buffer stocks were reduced both to shorten leadtimes and to force the identification of cost consuming problems. This resulted in individual equipment problems affecting the whole process.
If one piece of equipment stopped then shortly afterwards the whole process stopped. This made the equipment interdependent. Under these circumstances, the availability of the process became the product of the individual availabilities of each piece of equipment. Thus, a process involving four pieces of equipment maintained at 90% no longer had an overall process availability of 90%, but an availability of 90% x 90% x 90% x 90%, or about 66%!
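The arithmetic for interdependent equipment can be checked with a few lines of Python. This is a minimal sketch; the function name is illustrative, and it assumes the pieces of equipment fail independently of one another and that there are no buffers between them.

```python
from math import prod


def process_availability(availabilities):
    """Availability of a process whose equipment is interdependent (in series).

    With no buffer stock, the process runs only when every piece of
    equipment runs, so the individual availabilities multiply.
    """
    return prod(availabilities)


four_machines = [0.90, 0.90, 0.90, 0.90]
print(f"Process availability: {process_availability(four_machines):.1%}")
```

Adding a fifth or sixth machine at 90% to the list shows how quickly process availability falls as the line gets longer.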
Furthermore, as the quality approach changed to "Prevention at Source" by controlling process variables, equipment performance problems were identified much earlier. Conformance and reliability became much more important.
As buffer stocks reduced, substantial pressure was placed on the maintenance department to improve process performance. From a maintenance perspective, the department's performance had not deteriorated, yet the demand for substantial improvement in equipment availability was overwhelming.
This caused friction between the production and maintenance departments. Production departments demanded former levels of process availability and quicker response times from maintenance, who were often unable to comply due to traditional organisation structures which kept maintenance as a separate function. After much conflict between maintenance and production, engineering was called in to find a solution. They soon realised that, mathematically, for the process of four pieces of equipment to achieve its original goal of 90% availability, the individual availabilities needed to increase from 90% to about 97.5% (the fourth root of 90% is roughly 97.4%).
The traditional view of maintenance was to balance maintenance cost with an acceptable level of availability and reliability, often influenced by the level of buffer stocks which hid the immediate impact of equipment problems. In traditional companies, maintenance is seen as an expense that can easily be reduced in relation to the overall business, particularly in the short term. Conversely, maintenance managers have always argued that to increase the level of availability and reliability of the equipment, more expenditure needs to be committed to the maintenance budget. With the onset of substantial availability problems caused by the new way of running the plant, management soon realised that just giving more resources to the maintenance department was not going to produce a cost effective solution.
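The required individual availability follows from inverting the series calculation: if n identical pieces of equipment must together reach a target, each needs the n-th root of that target. The sketch below is illustrative only; the function name is an assumption, and it treats all pieces of equipment as having the same availability.

```python
def required_unit_availability(target, n_units):
    """Availability each of n identical, interdependent units must reach
    so that the whole process meets the target availability."""
    if not 0 < target <= 1 or n_units < 1:
        raise ValueError("target must be in (0, 1] and n_units at least 1")
    return target ** (1 / n_units)


needed = required_unit_availability(0.90, 4)
print(f"Each of 4 units needs about {needed:.1%} availability")
```

Raising the result to the fourth power returns the 90% process target, which is a quick way to confirm the figure.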
This conflict between maintenance cost and availability is similar to the old quality mind-set before the advent of Total Quality Control (TQC): that higher quality required more resources, and hence cost, for final inspection and rework. TQC emphasised "prevention at source" of the problem rather than by inspection at the end of the process. Instead of enlarging the inspection department, all employees were trained and motivated to be responsible for identifying problems at the earliest possible point in the process so as to minimise rectification costs. This did not mean disbanding the quality control department but having it now concentrate on more specialist quality activities such as variation reduction through process improvement. This new approach to quality demonstrated that getting quality right first time does not cost money but actually reduces the total cost of operating the business.
This new Quality approach of "prevention at source" was translated to the maintenance environment through the concept of TPM resulting in not only superior availability, reliability and maintainability of equipment but also significant improvements in capacity with a substantial reduction in both maintenance costs and total operational costs. TPM is based on "prevention at source" and is focused on identifying and eliminating the source of equipment deterioration rather than the more traditional approach of either letting equipment fail before repairing it, or applying preventive / predictive strategies to identify and repair equipment after the deterioration has taken hold and caused the need for expensive repairs.
TPM has developed over the years since its first introduction in 1970. Originally there were 5 Activities of TPM, now referred to as 1st Generation TPM (Total Productive Maintenance), which focused on improving equipment performance or effectiveness only. Late in the 1980s it was realised that even if the shopfloor were fully committed to TPM and the elimination or minimisation of the "six big losses", opportunities were still being lost because of poor production scheduling practices resulting in line imbalances or schedule interruptions. Hence the development of 2nd Generation TPM (Total Process Management), which focused on the whole production process.
Finally, in more recent times it has been recognised that the whole company must be involved if the full potential of the capacity gains and cost reductions is to be realised. Hence 3rd Generation TPM (Total Productive Manufacturing / Mining) has evolved, which now encompasses the 8 Pillars of TPM with the focus on the 16 Major Losses incorporating the 4Ms - Man, Machine, Methods, Materials. At the CTPM we have expanded the Japanese 8 Pillars to 10 Pillars of Australasian 3rd Generation TPM to better suit our needs in Australia and New Zealand, based on our extensive research of the past two and a half years. The 10 Pillars are:
- Safety & Environmental Management
- Focused Equipment & Process Improvement
- Work Area Management
- Operator Equipment Management
- Maintenance Excellence for TPM
- Education & Training
- Human Resource Management
- Administration & Support Systems Improvement
- New Equipment Management
- Process Quality Management
An important outcome of this new approach to equipment management, which is now supported by many success stories throughout the world in a variety of operational industries, has been that senior management have realised both that TPM is strategically important for a world competitive business and that TPM cannot be implemented by the maintenance department alone. TPM is a company wide improvement initiative involving all employees.
Although each enterprise may approach TPM in its own unique way, most approaches recognise the importance of measuring and improving overall equipment effectiveness along with the need to reduce both operational and maintenance costs in an environment that promotes continuous improvement.
Understanding the Importance of Overall Equipment Effectiveness
Many companies who recognise the important role equipment and process performance play in bottom-line results are turning to the measure which drives TPM, called Overall Equipment Effectiveness (OEE), which incorporates not only Availability but also Performance Rate and Quality Rate. In other words, OEE addresses all losses caused by the equipment: not being available when needed due to breakdowns or set-up and adjustment losses; not running at the optimum rate due to reduced speed or idling and minor stoppage losses; and not producing first pass A1 quality output due to defects and rework or start-up losses. A key objective of TPM is to cost effectively maximise Overall Equipment Effectiveness through the elimination or minimisation of all losses. A simple model outlining these losses is shown in Figure 5.
When many organisations first measure Overall Equipment Effectiveness, it is not uncommon to find they are achieving only around 40% - 60% (batch) or 50% - 75% (continuous process), whereas international best practice is recognised to be 85% or more (batch) and 95% or more (continuous process). In effect, this means most companies have the opportunity to increase capacity / productivity by 25% - 100%.
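A short calculation shows how the three rates combine and what a gap to best practice means for capacity. This sketch uses the commonly quoted form OEE = Availability x Performance Rate x Quality Rate; the function names and the input figures are hypothetical, chosen only to illustrate the arithmetic.

```python
def oee(availability, performance_rate, quality_rate):
    """Overall Equipment Effectiveness as the product of its three rates."""
    return availability * performance_rate * quality_rate


def capacity_gain(current_oee, target_oee):
    """Relative increase in good output from the same scheduled time
    if OEE rises from current_oee to target_oee."""
    return target_oee / current_oee - 1


# Hypothetical batch plant: each rate looks respectable on its own.
current = oee(availability=0.80, performance_rate=0.75, quality_rate=0.95)
print(f"Current OEE: {current:.1%}")
print(f"Capacity gain at 85% OEE: {capacity_gain(current, 0.85):.1%}")
```

The point of the example is that three individually reasonable rates multiply to an OEE well below best practice, which is where the capacity opportunity comes from.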
Understanding the Cost Impact of Failure
TPM significantly reduces operational and maintenance costs by focusing on the Root Cause of Failure through the creation of a sense of ownership by the plant & equipment operators, maintainers and support staff to encourage "prevention at source". To help understand the thinking behind TPM we need to investigate what causes failure.
Most of us have heard of the concept of the 'Root Cause of Failure' and the tool most commonly used to assist in the search for the root cause - the 5-Whys. The 5-Whys is a simple technique of asking why 5 times recognising that statistically it has been shown that after 5 whys you are most likely to be at the root cause. In the work place we rarely get to the root cause because we are too busy reacting to the symptoms of our problems. However, unless we get to the root cause we will always have problems reappearing.
What is the root cause of failure? Often, before failure we can have poor performance, prior to poor performance we may get moans and groans coming from our equipment, and before the moans and groans we will have accelerated deterioration (see Figure 6).
What do we mean by 'Accelerated Deterioration'? This is where a piece of equipment or part of a piece of equipment wears out quicker than is expected. That is, its life is shortened because its natural deterioration is accelerated.
Let us look at the failure mechanisms of the parts that make up our plant & equipment. Most pieces of equipment in our plants can be broken up into 3 broad categories:
From above we can see the different failure mechanisms for the three different categories of items. It is worth noting how TPM will actually reduce the life of your wear items due to the increase in throughput as your OEE increases some 50% or more.
Our main interest however, is with the Working Items. These by far make up the majority of items that need maintenance attention and contribute most to our overall maintenance spend. So let us understand the impact of the laws of physics on our working parts.
If, for example, I were to rub my hands together for the rest of the day what is going to happen? I will get very sore hands as they get several layers of skin rubbed off. To stop this from happening I would need to apply some form of lubrication to act as an interface between my hands.
Hence, proper lubrication provides an interface between moving surfaces, and a key role of lubrication is to be a sacrificial wear element. That is, the lubrication wears out as the moving surfaces interface with it. This is why it is recommended that we replace the oil in our cars at, say, every 10,000 km. This is not because the oil is dirty: even though it may look dirty, it is continuously filtered and clean. The reason for replacement is that the oil has worn out.
Accelerated deterioration occurs when:
- lubrication is not present;
- lubrication is incorrect for the application;
- lubrication between surfaces is forced out due to overload;
- lubrication wears out; or
- lubrication becomes contaminated.
Who has ever seen an operator "blow down" his equipment with compressed air, or hose it down with water? What is this process doing to the equipment? More than likely the operator is forcing contamination into the equipment without even realising it or caring about it. This contamination is a primary source of accelerated deterioration.
Many studies have been conducted to determine the impact of accelerated deterioration. Let us consider the situation of the working parts of your equipment. If you were to plot, say, the 30-year history of the actual life of a part that normally fails after 12 months, would you get a straight line? In most studies the result is a normal distribution: the part fails most often at around 12 months, but on other occasions it may fail earlier or later, often with a range of some 6 months either side of the 12-month majority. If we were to introduce a periodic or preventive maintenance plan for this part, what would be our strategy? Obviously, if we were to replace the part after 12 months we would still have a significant number of failures. If we were very conservative we could replace the part every 6 months. This would significantly reduce the failures, but we would have a very high maintenance cost. So what is the answer?
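The trade-off described above can be put into numbers. The sketch below assumes part life is normally distributed with a mean of 12 months and, purely as an illustrative assumption, a standard deviation of 2 months (so that the 6-month range either side corresponds to about three standard deviations). The function name and parameter values are hypothetical, not measured data.

```python
from statistics import NormalDist


def fraction_failing_before(interval_months, mean_life=12.0, sigma=2.0):
    """Share of parts expected to fail before a planned replacement.

    mean_life and sigma are illustrative assumptions for this sketch.
    """
    return NormalDist(mu=mean_life, sigma=sigma).cdf(interval_months)


for interval in (12, 9, 6):
    share = fraction_failing_before(interval)
    planned_changes_per_year = 12 / interval
    print(f"Replace every {interval:>2} months: {share:.1%} fail first, "
          f"{planned_changes_per_year:.1f} planned changes per year")
```

Under these assumptions, replacing at 12 months still lets about half the parts fail first, while replacing at 6 months brings failures well below 1% at the price of twice as many planned replacements.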
This is where TPM becomes so important. TPM is based on the precepts of:
- understand what causes the variation;
- reduce or minimise the variation; then
- look to improvement.
Under this approach the first task is to identify what is causing the variation. Studies conducted by the Japan Institute of Plant Maintenance and companies like DuPont and Tennessee Eastman Chemical Company have shown that 3 major physical conditions make up some 80% of the variation.
These physical conditions are:
- looseness;
- contamination; and
- lubrication problems.
The elimination of these three conditions is known as "establishing Basic Equipment Conditions". Once "basic equipment conditions" have been established, we find our normal distribution curve squashes up some 80% and moves to the right, thus significantly increasing the life of our parts.
In his book, TPM in Process Industries, Suzuki raises an important issue when he states:
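To see why a narrower distribution matters for planning, the sketch below compares the longest replacement interval that keeps early failures at or below 1% before and after basic equipment conditions are established. The starting spread (2 months), the 1% risk level and the shifted mean life (15 months) are hypothetical assumptions for illustration; only the 80% reduction in spread comes from the text.

```python
from statistics import NormalDist


def safe_interval(mean_life, sigma, accepted_risk=0.01):
    """Longest replacement interval (months) at which no more than
    accepted_risk of parts are expected to fail before replacement."""
    return NormalDist(mu=mean_life, sigma=sigma).inv_cdf(accepted_risk)


# Before: assumed 12-month mean life with a 2-month standard deviation.
before = safe_interval(mean_life=12.0, sigma=2.0)

# After: spread reduced by 80%; the 15-month mean is a hypothetical shift.
after = safe_interval(mean_life=15.0, sigma=2.0 * 0.2)

print(f"Safe interval before basic conditions: {before:.1f} months")
print(f"Safe interval after basic conditions:  {after:.1f} months")
```

With less variation the planned interval can sit close to the mean life, so far less of each part's useful life is thrown away for the same level of risk.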
"Implementing a periodic / preventive maintenance system before establishing basic conditions - when equipment is still dirty, nuts and bolts are loose or missing, and lubrication devices are not working properly - frequently leads to failures before the next major service is due.
To prevent these would require making the service interval unreasonably short, and the whole point of the preventive maintenance program would be lost.
Rushing into predictive maintenance is equally risky. Many companies purchase diagnostic equipment and software that monitors conditions, while neglecting basic maintenance activities.
It is impossible, however, to predict optimal service intervals in an environment where accelerated deterioration and operating errors are unchecked."
Impact of Multi-Skilling on Basic Equipment Conditions
Although multi-skilling has often been successful in creating a more flexible workforce, experience now highlights that when employees move from equipment to equipment, or area to area, they lose the motivation to seek out basic equipment problems or defects which, if left unchecked, will cause failure in the future (see Figure 6). The operators often demonstrate a lack of care for the equipment because they know they will soon be moved to another area or piece of equipment.
An area-based team approach which promotes the development of both base-skills and mastery-skills provides a means to achieve both flexibility and ownership within the workplace. Correctly formed area-based teams create an environment where employees can come to recognise the benefits for themselves to learn both the proper way to operate their equipment as well as how best to care for their equipment by maintaining "basic equipment conditions".
TPM implementation experience has shown that there is a definite relationship between failures and "basic equipment conditions" - no looseness, no contamination, and correct lubrication. Our experience with multi-skilling is that it takes away ownership and the motivation for operators to ensure basic equipment conditions.
Operational and maintenance costs will always be high without the framework of effective area-based teams, in which team members can focus on multi-skilled base skills to ensure team flexibility while also developing their mastery skills to become the experts at caring for, operating and detecting any defects that might develop in their equipment.
Equipment Defects - The Hidden Cause of Failure
The key driving objective of TPM is to eliminate or minimise, not just reduce, the six big losses. To achieve this, TPM is an ongoing journey to excellence that challenges our mind-sets. One such important challenge is to the traditional mind-set that focuses on either actual or potential failures or breakdowns and largely ignores equipment defects that can be the hidden cause of failure (see Figure 7).
Equipment defects or imperfections with our equipment are subtle and not always obvious. They "flow" into our plant & equipment due to various reasons: poor initial design or changes to the initial design requirements of our plant & equipment due to output requirement changes; the way we operate our plant & equipment and the environment we operate our plant & equipment in; imperfections in the maintenance materials we use and the way we carry out our maintenance activities; and last but not least, as a consequence of any failures which occur to our plant & equipment. They are often difficult to identify and correct because they are traditionally accepted as the norm. Equipment defects play a major part in causing "losses" in equipment performance.
TPM implementation experience has shown that there is a definite relationship between failures and equipment defects in that most failures can be traced back to equipment defects. In a TPM environment, the aim is to focus on equipment defects so as to eliminate the occurrence of failures and early deterioration. This focus on equipment defects has a large bearing on the way everyone in the company needs to become involved with TPM. All employees need to ask the question: "are my actions focused on avoiding defects or merely addressing the issues associated with defect removal". Being able to identify and correct equipment defects and then find their source so they can be avoided in the future is a major ingredient in the process of implementing TPM.
Using Operator Equipment Management Pillar to Induce Change
Operator Equipment Management is about "caring for equipment at the source" so as to ensure the "basic equipment conditions" are established and maintained, allowing planned preventive and predictive maintenance to be successfully implemented and administered by the maintenance department. Ultimately operators become responsible for the overall equipment effectiveness of their plant & equipment through a "root cause" approach to defect avoidance.
It is not a simple exercise to create an area-based team environment that promotes ownership with base skill flexibility and mastery skill specialty. Changes take time. A systematic approach, supported by a robust process, needs to be adopted to allow the changes to be implemented at a rate commensurate with the organisation's evolving culture.
Although implementation of Operator Equipment Management needs to be specific to the situation and plant environment, the final goal of achieving mature equipment-competent area-based teams is for the area-based teams to be responsible for the Overall Equipment Effectiveness (OEE) of their plant & equipment. This does not mean operators carry out all maintenance activities, but that they are responsible for knowing when they need to carry out the simple defect avoidance and maintenance service work themselves and when they should call in maintenance experts to repair problems which they have clearly identified.
The Relationship between RCM and TPM
The original precepts for RCM (listed earlier) were developed for the aircraft industry, where 'basic equipment conditions' (no looseness, contamination or lubrication problems) are mandatory, and where operators' (pilots') skill level, behaviour and training are of a high standard. Unfortunately, in most manufacturing and mining operations these 'basic equipment conditions' and operator skill and behaviour levels do not exist, thus undermining the basis of any RCM application.
For this reason, the application of TPM as a company wide improvement strategy is highly advisable to ensure:
- 'basic equipment conditions' are established; and
- 'equipment-competent' operators are developed
before attempting a full blown RCM analysis or a partial RCM approach following the basic RCM process. Failure to do this in an environment where basic equipment conditions and operator error are causing significant variation in the life of your equipment parts will block your ability to cost effectively optimise your maintenance tactics and spares holding strategies.
The other key difference between RCM and TPM is that RCM is promoted as a maintenance improvement strategy whereas TPM recognises that the maintenance function alone cannot improve reliability. Factors such as operator 'lack of care' and poor operational practices, poor 'basic equipment conditions', and adverse equipment loading due to changes in processing requirements (introduction of different products, raw materials, process variables etc) all impact on equipment reliability. Unless all employees become actively involved in recognising the need to eliminate or reduce all "losses" and to focus on 'defect avoidance' or 'early defect identification and elimination' failures will never be cost effectively eliminated in a manufacturing or mining environment.
It should be acknowledged that a TPM implementation is not a short-term fix. It is a continuous journey based on changing the work-area then the equipment so as to achieve a clean, neat, safe workplace through a "PULL" as opposed to a "PUSH" culture change process. Significant improvement should be evident within six months; however, full implementation can take many years before the full benefits of the new culture created by TPM are sustained. This time frame obviously depends upon where a company is in relation to its quality and maintenance activities and the resources being allocated to introduce this new mind-set of equipment management.
The Centre for TPM (Australasia)
In January 1996, the Centre for TPM (Australasia), a membership based organisation, was created with the mission to "promote and advance the knowledge and practice of TPM and conduct, promote and advance the public education of TPM throughout Australasia." The Centre, which has its head office in Wollongong with regional offices in Melbourne, Brisbane and Adelaide, provides networking, information exchange, training and consulting support and has a strong Research, Development & Innovation Division in co-operation with the Business School at the University of Wollongong.
For further information please contact the Centre for TPM (Australasia) on (02) 4226 6184.
About the author: Ross Kennedy - President, The Centre for TPM (Australasia)
A fitter and turner by trade, Ross has a Mechanical Engineering degree from the University of New South Wales and a Management degree from the University of Wollongong. He has more than 25 years of manufacturing and operational experience covering maintenance, production, operations and executive roles followed by 5 years of international consulting experience with the Manufacturing and Operations Group of Coopers & Lybrand's International Management Consulting Practice. In August 1994 Ross established his own consulting practice specialising in TPM. In January 1996, along with several colleagues, he founded the Centre for TPM (Australasia). Ross has been actively involved with TPM since 1990 and has delivered publicly over 100 papers and workshops on the subject both within Australia and overseas. He, along with his colleagues from the CTPM, is presently assisting a number of companies both in Australia and New Zealand to embark on TPM.
Copyright 1996-2009, The Plant Maintenance Resource Center. All Rights Reserved.
Revised: Thursday, 08-Oct-2015 11:54:32 AEDT