Peeling the Onion to Improve Human Reliability

Stacy Berkshire

Note: The views expressed in this article are those of the author and do not necessarily represent those of his/her employer, GxP Lifeline, its editor or MasterControl, Inc.

Humans will make mistakes. Industries have been focusing on strategies to reduce or even eliminate human error in an effort to keep consumers and employees safe. Pharmaceutical companies are responsible for ensuring that their products are safe, correctly labeled, of the correct strength, pure, and of good quality. In the pharmaceutical industry, when a deviation is identified as the result of an error, organizations follow a deviation investigation process to understand the root cause and put a corrective action in place to fix the problem. Often, human error is identified as the cause. The problem is that in many cases human error is considered the root cause without a more in-depth investigation to understand why the human made the error. A "root cause" of human error leads to no further investigation, which means the CAPAs (corrective and preventive actions) put in place may be neither relevant nor effective. Human error CAPAs often involve putting employees, many of them skilled and/or veteran employees, through remedial retraining. This is not only frustrating for the employee, who rolls her eyes when she is retrained on a job she has performed for many years, but it also fails to solve the actual problem. And because the true root cause was never identified, the human error almost always recurs.

Root Cause Investigation

To understand this better, let’s explore a case where a label was found to be incorrect. We find a misspelled ingredient on the label after the labels are printed. We investigate this and attribute the misspelling to the artist simply making an error and an auditor not catching it during review. We say, "The root cause is human error," and the CAPA we put in place is to review the procedure and re-train the artist who made the error and the auditor who missed the error when auditing.

Right? No, wrong!

Two weeks later, we find the artist has made another similar spelling error and the auditor did not catch it. Recurring errors tell us that the root cause was not identified or was identified incorrectly, that the CAPA was ineffective, or both.
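One way to spot this kind of recurrence is to count deviations that share the same process and error type. The following is a minimal sketch only: the record layout (a process name paired with an error type), the function name and the threshold are assumptions made for illustration, not part of any particular deviation system.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def recurring_deviations(
    records: Iterable[Tuple[str, str]], threshold: int = 2
) -> List[Tuple[str, str]]:
    """Return (process, error_type) pairs seen at least `threshold` times.

    Each record is a hypothetical (process, error_type) pair; a real
    deviation log would carry dates, lot numbers and CAPA references too.
    """
    counts = Counter(records)
    return sorted(pair for pair, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    log = [
        ("label artwork", "misspelling"),
        ("packaging", "wrong lot code"),
        ("label artwork", "misspelling"),
    ]
    # The repeated pair suggests the earlier CAPA did not address the cause.
    print(recurring_deviations(log))
```

A repeat flagged this way does not say what the root cause is; it only signals that the earlier investigation or CAPA deserves a second look.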

Diving deeper into this example, we take the extra step to get to the true root cause. We use a human error decision tree (Collazo, 2010) to help us drill down to the real root cause, starting with the "5 Whys" technique. So what do we learn? We discover the lighting in the area where the auditor reviews labels is insufficient. We also find the auditor's space is in a high-traffic area where interruptions are rampant. Compounding this is the fact that the auditor has worked excessive overtime due to a label conversion to a new logo. We now attribute the cause to the work environment. To eliminate the root cause, we put in a CAPA that moves the auditor into his own quiet space, where the lighting is sufficient for reviewing work. We also institute rotations to other work every two hours to give the auditor a break from tedious label reviews. Finally, we add a human error reduction technique for the artist who made the error in the first place: marking a backslash through every letter of each word when reviewing copy so that mistakes are easily identified. The error does not repeat and our right-the-first-time rate improves.
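The discipline of not stopping at "human error" can be sketched as a small investigation record. This is an illustrative structure only, not the decision tree from Collazo (2010): the class name, the method names and the list of labels treated as symptoms are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical labels that should never close an investigation on their own.
NON_ROOT_CAUSES = {"human error", "operator error"}


@dataclass
class Investigation:
    """A "5 Whys" chain for one deviation (illustrative structure only)."""

    deviation: str
    whys: List[str] = field(default_factory=list)

    def ask_why(self, answer: str) -> None:
        """Record the answer to the next "why?"."""
        self.whys.append(answer.strip())

    def root_cause(self) -> str:
        """Return the deepest answer, rejecting chains that stop at the person."""
        if not self.whys:
            raise ValueError("no causes recorded yet")
        last = self.whys[-1]
        if last.lower() in NON_ROOT_CAUSES:
            raise ValueError(f"'{last}' is a symptom; ask why the error occurred")
        return last


if __name__ == "__main__":
    inv = Investigation("Misspelled ingredient on printed label")
    inv.ask_why("Human error")  # calling root_cause() now would raise
    inv.ask_why("Auditor missed the misspelling during review")
    inv.ask_why("Review area is poorly lit and frequently interrupted")
    print(inv.root_cause())
```

The check is deliberately crude. Its only purpose is to force at least one more "why?" whenever the chain ends at the person rather than at the conditions around the person.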

Causes such as equipment, instrument or technological failures almost always receive thorough investigations and in-depth CAPAs. For instance, equipment receives planned preventive maintenance to help prevent failures, whereas employees are often expected to work long hours, extended shifts and overtime, sometimes performing tedious tasks, without a thought given to the errors that these practices may generate.

Causes of Human Errors

Human errors are caused by active failures or latent failures. Active failures can be directly associated with a human failure and include issues such as a lapse in judgment, a disregard for procedures, a lack of due diligence or skipping a step (Wiegmann & Shappell, 2003). Latent failures point to factors within a system and include preconditions for errors, poor management controls, and cultural or organizational influences. Latent failures may lie dormant for a long time (days, weeks, or months) until they finally contribute to a failure (Reason, 1990).

To improve human reliability, organizations need to spend time understanding their systems and potential system failures. The majority of human errors (some believe upwards of 90%) can be attributed to system failures, not human failures. W. Edwards Deming, the father of quality, preached the importance of looking at systems. In his book, Out of the Crisis, Deming stated, "The bulk of the causes of low quality and low productivity belong to the system and thus lie beyond the power of the workforce" (1982, p.24). Employees are often blamed for errors when the system (keep in mind, systems are created by management) is the culprit.

Typical systems with frequent failure modes are: the deviation investigation system; the organizational communication system; the SOP (written documentation) system; the performance management system (including recognition and accountability programs); the training system; the workplace layout and/or work environment system; the governance system; and the process improvement system. Failures within these systems cause humans to make errors; humans are not the root source of those errors (unless we count management, who create the systems, as the humans making the error).

Peeling the Onion

The systems that cause human error are analogous to layers of an onion. Diagnosing, preventing and correcting these system issues can help organizations ultimately get to human reliability, the "heart of the onion." The various system layers lie within an environment. A learning organization, which Senge (1990) describes as "an organization where people continually expand their capacity to create the results they truly desire, where new and expansive patterns of thinking are nurtured, where collective aspiration is set free, and where people are continually learning to see the whole together," is the best environment to foster human reliability. Human reliability depends on the systems working together and on their interconnectedness.

Quality Culture

Organizations that have a culture of quality see quality as being owned by all employees rather than by a quality department or function. In a quality culture, behaviors focus on right-the-first-time and error prevention. The quality department is integrated into operations and is not seen as policing operations. This culture is fostered by strong management support and oversight. Elements of a quality culture include employee involvement and engagement in quality decision-making and initiatives; quality systems thinking, including an understanding of the impact changes have on the systems; strong quality communication and education; and recognition programs that reward good quality behaviors.

Human and Organizational Performance

Periodically, organizations need to spend time evaluating their human and organizational performance. This is especially important when organizations or companies experience a great deal of change, such as growth, loss of business or leadership changes. As organizations change, their systems can break down if they are not regularly upgraded. Some important human and organizational elements include: the clarity of the vision and mission; alignment of goals; management support of and commitment to objectives; organizational structure and sufficiency of span of management control; effectiveness of procedures and documentation; understanding what behaviors are rewarded and/or punished; effectiveness of key processes; and whether physical facilities and space are adequate for good performance (Hale, 1998). An evaluation of human and organizational performance can help organizations adjust and shore up systems that may have changed or shifted over a period of time.

Management Controls

Management is responsible for providing adequate resources, proper oversight and governance, and fostering a quality culture. Several practices help to detect and prevent errors: setting up escalation processes and notifications when errors or quality events occur; reviewing metrics, trends and scorecards to identify internal as well as external quality system inadequacies; holding regular forums, such as quality councils, to report and share information up and down the line; providing adequate resources; recognizing employees for good quality behavior; and fostering and supporting the quality culture. These are all pivotal elements of the management control system.

Human Reliability

As the system layers are evaluated, assessed, and continuously improved, human reliability flourishes and can be further developed and fostered. Prevention, detection, and correction become a framework for the human reliability program. More investment is placed on prevention and detection and, because of this, failure costs begin to fall (Crosby, 1979). Prevention techniques are the focus so that errors can be eliminated. If prevention techniques cannot be identified, then strong detection techniques need to be put in place to catch errors before a product makes it out the doors to consumers. If an error does occur, correction is applied only after the root cause is identified through solid root cause analysis tools.

Prevention, Detection, and Correction Techniques

Prevention techniques include teaching and using human error reduction techniques. This human error reduction education helps employees understand how the mind works, how the left and right side of the brain are in a constant battle, and, through human error reduction techniques and tools, how employees can "trick the brain" to reduce or even eliminate errors. Other error prevention techniques involve poka-yoke (mistake-proofing); workplace organization (5S program) to prevent mix-ups; color coding equipment to error-proof line setups; highlighting important text, using document whitespace appropriately and using visuals, such as photos, in documents to reduce documentation errors; using checklists, templates, and standard work techniques to reduce complexity and opportunities for errors; and process mapping to identify, highlight, and segregate the critical-to-quality steps.
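As an illustration of mistake-proofing applied to the label example from earlier in this article, the sketch below checks each ingredient name on a label against an approved list and suggests the closest approved spelling. It is a minimal sketch under stated assumptions: the ingredient names, the function name and the similarity cutoff are hypothetical, and a real check would read from a controlled master record rather than a hard-coded list.

```python
import difflib
from typing import Dict, List

# Hypothetical approved names; a real list would come from a controlled record.
APPROVED_INGREDIENTS = [
    "acetaminophen",
    "microcrystalline cellulose",
    "magnesium stearate",
]


def check_label_ingredients(
    label_ingredients: List[str],
    approved: List[str] = APPROVED_INGREDIENTS,
) -> Dict[str, List[str]]:
    """Map each unrecognized label name to its closest approved spelling."""
    approved_lower = [name.lower() for name in approved]
    problems: Dict[str, List[str]] = {}
    for name in label_ingredients:
        key = name.strip().lower()
        if key not in approved_lower:
            # get_close_matches returns an empty list when nothing is similar.
            problems[name] = difflib.get_close_matches(
                key, approved_lower, n=1, cutoff=0.6
            )
    return problems


if __name__ == "__main__":
    # "sterate" is missing an "a", so the name is flagged with a suggestion.
    print(check_label_ingredients(["Acetaminophen", "magnesium sterate"]))
```

A check like this would run before labels are printed, so that a misspelling is stopped by the process and does not depend on a tired reviewer noticing it.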

Detection techniques range from the larger system assessments previously described to smaller, focused quality system audits of external suppliers and internal processes. Detection techniques also include using human error root cause diagnostic tools to identify true root causes, modifying equipment to detect errors or identify potential problems, and scanning both the internal and external environment for signs and signals that can help detect issues before they occur.

Correction is needed if errors cannot be prevented or detected. Effective CAPAs rely on disciplined, data-based decision-making and problem-solving. Programs such as Lean Six-Sigma can provide not only excellent tools but also principles, concepts, and models to help drive a process excellence culture.

Eliminating human interaction wherever possible, by installing equipment such as vision systems or upgrading technology to reduce the need for human intervention, is an obvious way to correct errors and prevent human errors from recurring.

All of these systems or "layers" discussed need to reside within a functioning learning organization, an environment where people learn together to create desired results through nurtured thinking and a clear understanding of expectations, where visible collective goals are in place and where people are continually learning to see the whole together.


Human error reduction depends on strong systems. Systems need to be continually reviewed and upgraded in order to keep pace with both internal and external changes. A quality culture in which employees own quality needs to be fostered. The human and organizational system needs to be assessed and upgraded periodically, especially when change occurs. Management needs to provide proper oversight and governance, and foster a strong quality culture. These interdependent layers need to be working effectively and efficiently in order to improve the heart: human reliability.


  • Collazo, G. (2010). Human Reliability—how to achieve it. FDA News Webinar.
  • Crosby, P. (1979). Quality is Free. New York: McGraw-Hill.
  • Deming, W.E. (1982). Out of the Crisis. Cambridge, MA: MIT Center of Advanced Educational Services.
  • Hale, J. (1998). The Performance Consultant's Fieldbook. San Francisco, CA: Jossey-Bass/Pfeiffer.
  • Reason, J. (1990). Human Error. Cambridge: Cambridge University Press.
  • Senge, P. (1990). The Fifth Discipline: The Art and Practice of the Learning Organization. New York: Doubleday.
  • Wiegmann, D., & Shappell, S. (2003). A Human Error Approach to Aviation Accident Analysis: The Human Factors Analysis and Classification System. Ashgate Publishing, Ltd.

Stacy Berkshire has held senior positions in the pharmaceutical industry and has worked within global quality, HR/organizational development and operations. Ms. Berkshire holds a master of arts degree in curricular education and counseling psychology from Western Michigan University, a master of arts degree in regulatory, quality, and compliance from Purdue University, and is a Villanova-trained Lean Six Sigma Black Belt. Contact her at