Beyond Human Error

One of my most frequently viewed posts is on human errors. I am intrigued by this. I’ve run training on root cause analysis a number of times and occasionally someone will question my claim that human error is not a root cause. Of course, it may be on the chain of cause-and-effect but why did the error occur? And you can be sure it’s not the first time the error has occurred – so why has it occurred on other occasions? What could be done to make the error less likely to occur? Using this line of questioning is how we can make process improvements and learn from things that go wrong rather than just blame someone for making a mistake and “re-training” them.

There is another approach to errors which I rather like. I was introduced to it by SAM Sather of Clinical Pathways. It comes from Gilbert’s Behavior Engineering Model and provides six categories that need to be in place to support the performance of an individual in a system:

    • Expectations & Feedback – Is there a standard for the work? Is there regular feedback?
    • Tools, Resources – Is there enough time to perform well? Are the right tools in place?
    • Incentives & Disincentives – Are incentives contingent on good performance?
    • Knowledge & Skills – Is there a lack of knowledge or skill for the tasks?
    • Capacity & Readiness – Are people the right match for the tasks?
    • Motives & Preferences – Is there recognition of work well done?
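
To make the model a little more concrete, here is a minimal sketch (in Python, purely illustrative and not part of Gilbert’s model itself) of working through the six categories as a checklist and flagging the gaps. The questions, answers and the TMF example are hypothetical:

    # A minimal sketch of working through Gilbert's six categories as a checklist.
    # The questions and answers are hypothetical examples, not a standard tool.
    BEM_CATEGORIES = {
        "Expectations & Feedback": "Is there a standard for the work, with regular feedback?",
        "Tools, Resources": "Is there enough time to perform well, and are the right tools in place?",
        "Incentives & Disincentives": "Are incentives contingent on good performance?",
        "Knowledge & Skills": "Do people have the knowledge and skills for the tasks?",
        "Capacity & Readiness": "Are people the right match for the tasks?",
        "Motives & Preferences": "Is there recognition of work well done?",
    }

    def gaps(answers):
        """Return the categories answered 'no' -- the likely places to focus improvement."""
        return [category for category, supported in answers.items() if not supported]

    # Hypothetical answers for a TMF document-submission process
    answers = {category: False for category in BEM_CATEGORIES}
    answers["Knowledge & Skills"] = True  # say staff do know how to submit documents

    for category in gaps(answers):
        print(f"Gap: {category} -- {BEM_CATEGORIES[category]}")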

 

Let’s take an example I’ve used a number of times: getting documents into the TMF. As you consider Gilbert’s Behavior Engineering Model you might ask:

    • Do those submitting documents know what the quality standard is?
    • Do they have time to perform the task well? Does the system help them to get it right first time?
    • Are there any incentives for performing well?
    • Do they know how to submit documents accurately?
    • Are they detail-oriented and likely to get it right?
    • Does the team celebrate success?

I have seen TMF systems where the answer to most of those questions is “no”. Is it any wonder that there are rejection rates of 15%, cycle times of many weeks, and TMFs that are never truly “inspection ready”?

After all, “if you always do what you’ve always done, you will always get what you’ve always got”. Time to change approach? Let’s get beyond human error.

Got questions or comments? Interested in training options? Contact me.

 

Text: © 2019 DMPI Ltd. All rights reserved.

DIGR-ACT® is a registered trademark of Dorricott Metrics & Process Improvement Ltd.

Picture: Based on Gilbert’s Behavior Engineering Model

Searching For Unicorns

I read recently that we have reached “peak unicorn”. I wonder if that is true. I joined a breakout discussion at SCOPE in Florida last month entitled “RBM and Critical Reasoning Skills” and the discussion shifted to unicorns. The discussion was about how difficult it is to find people with the right skills and experience for central monitoring. They need to understand the data and the systems. They need to have an understanding of processes at investigator sites. And they need to have the critical reasoning skills to make sense of everything they are seeing, to dig into the data and to escalate concerns to a broader group for consideration. Perhaps this is why our discussion turned to unicorns – these are people who are perhaps impossible to find.

It does, though, strike me in our industry how much we focus on the need for experience. Experience can be very valuable, of course, but it can also lead to “old” ways of thinking without the constant refreshing of a curious mind, new situations and people. And surely we don’t have to just rely on experience? Can’t we train people as well? After all, training is more than reading SOPs and having it recorded in your training record for auditors to check. It should be more than just the “how” for your current role. It should give you some idea of the “why” too and even improve your skills. I asked the group in the breakout discussion whether they thought critical reasoning skills can be taught – or do they come only from experience? Or are they simply innate?  The group seemed to think it was rather a mixture but the people who excel at this are those who are curious – who want to know more. Those who don’t accept everything at face value.

If we can help to develop people’s skills in critical reasoning, what training is available? Five Whys is often mentioned. I’ve written about some of the pitfalls of Five Whys previously. I’m excited to announce that I’ve been working with SAM Sather of Clinical Pathways to develop a training course to help people with those critical thinking skills. We see this as a gap in the industry and have developed a new, synthesized approach to help. If you’re interested in finding out more, go to www.digract.com.

Unfortunately, looking for real unicorns is a rather fruitless exercise. But by focusing on skills, perhaps we can help to train future central monitors in the new ways they need to think as they are presented with more and more data. And then we can leave the unicorns to fairy tales!

 

Text: © 2019 DMPI Ltd. All rights reserved.

To Err is Human But Human Error is Not a Root Cause

In a recent post I talked about Human Factors and different error types. You don’t necessarily need to classify human errors into these types but splitting them out this way helps us think about the different sorts of errors there are. It also helps us move on when we reach ‘human error’ while carrying out root cause analysis (using DIGR® or another method). Part of the problem with having ‘human error’ as a root cause is that there isn’t much you can do with your conclusion. To err is human, after all, so let’s move on to something else. But people make errors for a reason, and trying to understand why they made the error can lead us down a much more fruitful path to actions we can implement to try to prevent recurrence. If a pilot makes an error that leads to a near disaster or worse, we don’t just conclude that it was human error and there is nothing we can do about it. In a crash involving a self-driving car we want to go beyond “human error” as a root cause to understand why the human error might have occurred. As we get more self-driving cars on the road, we want to learn from every incident.

By getting beyond human error and considering different error types, we can start to think of what some actions are that we can implement to try to stop the errors occurring (“corrective actions”). Ideally, we want processes and systems to be easy and intuitive and the people to be well trained. When people are well trained but the process and/or system is complex, there are likely to be errors from time to time. As W. Edwards Deming once said, “A bad system will beat a good person every time.”

Below are examples of each of the error types described in my last post and example corrective actions.

    • Action errors (slips) – Example: entering data into the wrong field in EDC. Corrective action: error and sense checks to flag a possible error.
    • Action errors (lapses) – Example: forgetting to check the fridge temperature. Corrective action: a checklist that shows when the fridge was last checked.
    • Thinking errors (rule-based) – Example: reading a date written in American format as European (3/8/16 being 8-Mar-2016 rather than 3-Aug-2016). Corrective action: use an unambiguous date format such as dd-mmm-yyyy (see the sketch after this list).
    • Thinking errors (knowledge-based) – Example: incorrect use of a scale. Corrective action: ensure proper training and testing on use of the scale, with only trained staff allowed to use it.
    • Non-compliance (routine, situational and exceptional) – Example: not noting down details of the drug used in the Accountability Log due to rushing. Corrective action: regular checking by staff and consequences for not noting appropriately.
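
To make the rule-based date error above concrete, here is a minimal sketch (in Python, purely for illustration) showing how the same string parses to two different dates depending on the assumed convention, and how an unambiguous dd-mmm-yyyy format avoids the problem:

    from datetime import datetime

    raw = "3/8/16"
    # The same string gives two different dates depending on the assumed convention
    us = datetime.strptime(raw, "%m/%d/%y")  # American: month first -> 08-Mar-2016
    eu = datetime.strptime(raw, "%d/%m/%y")  # European: day first -> 03-Aug-2016
    print(us.strftime("%d-%b-%Y"), "vs", eu.strftime("%d-%b-%Y"))

    # Recording dates in dd-mmm-yyyy removes the ambiguity entirely
    unambiguous = datetime.strptime("03-Aug-2016", "%d-%b-%Y")
    print(unambiguous.date())  # 2016-08-03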

These are examples and you should be able to think of additional possible corrective actions. But then which ones would you actually implement? You want the most effective and efficient ones of course. You want your actions to be focused on the root cause – or the chain of cause and effect that leads to the problem.

The most effective actions are those that eliminate the problem completely – for example, adding an automated calculation of BMI (Body Mass Index) from height and mass rather than expecting staff to calculate it correctly. If it can’t go wrong, it won’t go wrong (the corollary of Murphy’s Law). This is mistake-proofing.
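
As a hedged sketch of this kind of mistake-proofing (not any particular EDC system’s API – the function below is hypothetical), the value can be derived rather than hand-entered:

    def bmi(weight_kg: float, height_m: float) -> float:
        """Body Mass Index = mass (kg) divided by height (m) squared, calculated automatically."""
        if weight_kg <= 0 or height_m <= 0:
            raise ValueError("Weight and height must be positive")
        return round(weight_kg / height_m ** 2, 1)

    print(bmi(70.0, 1.75))  # 22.9 -- the same result every time, with no transcription step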

The next most effective actions are ones that help people to get it right. Drop-down lists and clear, concise instructions are examples of this, although instructions do have their limitations (as I will discuss in a future post). “No-one goes to work to do a bad job!” (W. Edwards Deming again) – so let’s help them do a good job.

The least effective actions are ones that rely on a check catching an error right at the end of the process. For example, the nurse checking the expiry date on a vial before administering. That’s not to say these checks should not be there, but rather they should be thought of as the “last line of defence”.

Ideally, you also want some sort of check to make sure the revised process is working. This check is an early signal as to whether your actions are effective at fixing the problem.
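
One simple form such a check could take (a sketch only – the 5% threshold and the weekly figures are invented for illustration) is to track the error rate after the change and flag when it drifts above an agreed level:

    # Hypothetical weekly counts collected after the corrective action went live
    weekly = [
        {"week": "W1", "errors": 4, "total": 120},
        {"week": "W2", "errors": 2, "total": 135},
        {"week": "W3", "errors": 9, "total": 128},
    ]
    THRESHOLD = 0.05  # agreed trigger for re-investigating (illustrative only)

    for w in weekly:
        rate = w["errors"] / w["total"]
        status = "INVESTIGATE" if rate > THRESHOLD else "ok"
        print(f'{w["week"]}: {rate:.1%} {status}')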

Got questions or comments? Interested in training options? Contact me.

 

Text: © 2017 Dorricott MPI Ltd. All rights reserved.

DIGR® is a registered trademark of Dorricott Metrics & Process Improvement Ltd.

“To err is human” – Alexander Pope

Root Cause Analysis – A Mechanic’s View

My car broke down recently and I was stuck by the side of the road waiting for a recovery company. It gave me an opportunity to watch a real expert in root cause analysis at work.

He started by ascertaining exactly what the problem was – the car had just been parked and would now not start. He then went into a series of questions. How much had the car been driven that day? Was there any history of the car not starting or being difficult to start? Next he was clearly thinking of the process of how a car starts up – the electrics of turning the motor, drawing fuel into the engine, spark plugs igniting the fuel, pistons moving and the engine idling. He started at the beginning of the process. Could the immobiliser be faulty? Had I dropped the key? No. Maybe the battery was not providing enough power. So he attached a booster – but to no avail. What about the fuel? Maybe it had run out? But the gauge showed ½ tank – had I filled it recently? After all the gauge might be faulty. Yes, I had filled it that day. Maybe the fuel wasn’t getting to the engine – so he tapped the fuel pipe to try to clear any blockage. No. Then he removed the fuel pipe and hey presto, no fuel was coming through. It was a faulty fuel pump. And must have just failed. This all took about 10 minutes.

The mechanic was demonstrating very effective root cause analysis. It’s what he does every day. Without thinking about how to do it. I asked him whether he had come across “Five Whys” – no he hadn’t. And as I thought about Five Whys with this problem, I wondered how he might have gone about it. Why has the car stopped? Because it will not start. Why will the car not start? Erm. Don’t know. Without gathering information about the problem he would not be able to get to root cause.

Contrast the Five Whys approach with the DIGR® method:

Define – the car will not start

Is/Is not – the problem has just happened. No evidence of a problem earlier.

Go step-by-step – Starter motor, battery, immobiliser, fuel, spark plugs.

Root cause – He went through all the DIGR® steps and it was when going through the process step-by-step that he discovered the cause. He had various ideas en route and tested them until he found the cause. He could have kept going of course – why did the fuel pump fail? But he had gone far enough, to a cause he had control over and could fix.

Of course, he hadn’t heard of DIGR® and didn’t need it. But he was following the steps. In clinical trials, there is often not a physical process we can see and testing our ideas may not be quite so easy. But we can still follow the same basic steps to get to a root cause we can act on.

If you don’t carry out root cause analysis every day like this mechanic, perhaps DIGR® can help remind you of the key steps you should take. If you’re interested in finding out more, please feel free to contact me.

 

Photo: Craig Sunter (License)

Text: © 2017 Dorricott MPI Ltd. All rights reserved.

DIGR® is a registered trademark of Dorricott Metrics & Process Improvement Ltd.

Go Step-By-Step to get to Root Cause

In an earlier post, I described my DIGR® method of root cause analysis (RCA):

Define

Is – Is Not

Go Step By Step

Root Cause

In this post, I wanted to look more at Go Step By Step and why it is so powerful.

“If you can’t describe what you’re doing as a process, you don’t know what you’re doing” – a wonderful quote from W. Edwards Deming! And there is a lot of truth to it. In this blog, I’ve been using a hypothetical situation to help illustrate my ideas. Consider the situation where you are the Clinical Trial Lead on a vaccine study. Information is emerging that a number of the injections of trial vaccine have actually been administered after the expiry date of the vials. This has happened at several sites. You’ve taken actions to contain the situation for now. And have started using DIGR® to try to get to the root cause. It’s already brought lots of new information out and you’ve got to Go Step By Step. As you start to talk through the process, it becomes clear that not everyone has the same view of what each role in the process should do. A swim-lane process map for how vaccine should be quarantined shows tasks split into roles and helps the team to focus on where the failures are occurring:

In going step-by-step through the process, it becomes clear that the Clinical Research Associates (CRAs) are not all receiving the emails. Nor are they clear what they should do with them when they do receive them. The CRA role here is really a QC role however – the primary process takes place in the other two swimlanes. And it was the primary process that broke down – the email going from the Drug Management System to the Site (step highlighted in red).

So we now have a focus for our efforts to try to stop recurrence. You can probably see ways to redesign the process. That might work for future clinical trials but could lead to undesired effects in the current one. So a series of checks might be needed. For example, sending test emails from the system to confirm receipt by site and CRA or regular checks for bounced emails. Ensuring CRAs know what they should do when they receive an email would also help – perhaps the text in the email can be clearer.
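
As a hedged sketch of what a regular check for bounced emails might look like (the file names and column names below are hypothetical, not from any real drug management system):

    import csv

    def sites_needing_follow_up(sent_log="notifications_sent.csv", bounce_log="bounces.csv"):
        """Return site IDs whose expiry notification bounced and so needs CRA follow-up."""
        with open(bounce_log, newline="") as f:
            bounced_ids = {row["message_id"] for row in csv.DictReader(f)}
        with open(sent_log, newline="") as f:
            return sorted({row["site_id"] for row in csv.DictReader(f)
                           if row["message_id"] in bounced_ids})

    # Run after each notification cycle; any site listed gets a call rather than
    # relying on the email having arrived.
    # print(sites_needing_follow_up())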

By going step-by-step through the process as part of DIGR®, we bring the team back to what they have control of. We have moved away from blaming the pharmacists or the nurses at the two sites. Going down the blame route is never good in RCA as I will discuss in a future post. Reviewing the process as it should be also helps to combat cognitive bias which I’ve mentioned before.

As risk assessment, control and management are more clearly laid out in ICH GCP E6 (R2), process maps can help with risk identification and reduction too. To quote from section 5.0: “The sponsor should identify risks to critical trial processes and data”. Now we’ve discovered a process that is failing and could have significant effects on subject safety. By reviewing process maps of such critical processes, consideration can be given to the identification, prioritisation and control of risks. This might involve tools such as Failure Mode and Effects Analysis (FMEA) and redesign where possible in an effort to mistake-proof the process. This shows one way in which RCA and risk connect – the RCA led us to understand a risk better and we can then put in controls to try to reduce the risk (by reducing the likelihood of occurrence). We can even consider how, in future trials, we might be able to modify the process to make similar errors much less likely and so reduce the risk from the start. This is true prevention of error.
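
For readers less familiar with FMEA, the conventional scoring multiplies severity, occurrence and detectability into a Risk Priority Number (RPN) so that failure modes can be ranked. A minimal sketch follows; the failure modes and the 1–10 scores are invented for illustration, not a validated assessment:

    # Conventional FMEA scoring: RPN = severity x occurrence x detectability (each scored 1-10)
    failure_modes = [
        ("Expiry notification email not delivered to site", 9, 4, 8),
        ("Site quarantines the wrong vials",                7, 2, 5),
        ("CRA does not act on the notification copy",       5, 5, 6),
    ]

    for name, severity, occurrence, detectability in sorted(
            failure_modes, key=lambda fm: fm[1] * fm[2] * fm[3], reverse=True):
        print(f"RPN {severity * occurrence * detectability:4d}  {name}")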

In my next post I will talk about how (not) to ‘automate’ a process.

 

Text: © 2017 Dorricott MPI Ltd. All rights reserved.

DIGR® is a registered trademark of Dorricott MPI Ltd.

Overcoming the Hidden Assumptions of Root Cause Analysis

(Photo by Lars Ploughmann, Flickr; License)

In these strange days, when facts seem to matter less, I thought the pediment above the door of London’s Kirkaldy Testing and Experimenting Works from 1874 was rather good. Of course, with root cause analysis (RCA), we are trying to use all the facts available to get to root cause and not rely on lots of guesswork and opinion. In my last post I described a method of RCA that I called DIGR® and I explained why I think it is more effective than the oft-taught “Five Whys” method. As a reminder the steps to DIGR® are:

Define

Is – Is Not

Go Step By Step

Root Cause

When you decide you are going to carry out an RCA there are a number of hidden assumptions that you make. Being aware of these might mean you don’t fall into a trap. In the comments to my previous posts, people have mentioned some of these already and I wanted to explore five of them a little further.

1. Assuming that the effects you see are all due to the same root cause. In the example I have been using in this blog where expired vaccine was administered to several patients at two different sites, we carried out an RCA using DIGR®. In doing so, we assumed that the root cause of the different incidents was the same and the evidence we gathered in the DIGR® process seems to confirm that. But it is possible that these independent incidents have no common root cause – the issue occurred for different reasons at each site. As you review the evidence in the Is-Is Not and Root Cause parts of DIGR® it is worth remembering that the effects might be from different root causes. This is likely to show up when the analysis seems to be getting stuck and facts seem to be at odds with each other.

2. Assuming there is only one root cause. Often issues happen because of more than one root cause or ‘causal factor’. Sometimes there is benefit in focusing on just one of these but other times, there may be a benefit in considering more than one. In our example, we came to the conclusion that the root cause was that ‘the process of identifying expired batches and quarantining them has not been verified’. This is something we can tackle with actions and try to stop a recurrence of the issue. But we could have gone down the path of trying to understand why the checks in the process had failed on these occasions and tried to get to root cause on those. We would have started looking at Human Factors which I will cover in a subsequent post. You have to make a judgement on how many strands of the issue you want to focus your efforts on. In our example we have assumed that by focusing on the primary process, the pharmacists and nurses will not have expired vaccine and so their check (whilst still a good one) should never show up expired vaccine.

3. Assuming you have enough information to work out the cause and effect relationships. Frustrating though it is, it is not always possible to get to root cause with the facts you have available. You always want to use facts (evidence) to check whether your root cause is sound or whether you’re really just guessing. If there is no further information available you might have to put additional QC checks in place until you obtain more facts. In our example, if we had carried out an RCA using DIGR® straight after the first issue occurred, we might have focused on the root cause being at that particular site on the basis it had not happened at any others (the Is-Is Not part of DIGR®). But we might simply not know enough about exactly what happened at that one site. Of course, following further cases at another site, we realised that there was a more fundamental, systemic issue.

4. Assuming all facts presented are true. I’ve mentioned Edward Hodnett’s book from 1955 “The Art of Problem Solving” previously. There is a chapter on ‘facts’ and in it he says: “Be sceptical of assertions of fact that start, ‘J. Irving Allerdyce, the tax expert, says…’ There are at least ten ways in which these facts may not be valid. (1) Allerdyce may not have made the statement at all. (2) He may have made an error. (3) He may be misquoted. (4) He may have been quoted only in part.” Hodnett goes on to list another six possible reasons the facts might not be valid. This is not to say you should disbelieve people – but rather that you should be sceptical. Asking follow up questions such as “how do you know that?” and “do we have evidence for that?” help avoid erroneous facts setting you off in the wrong direction on your search for root cause.

5. Assuming that because an issue appears to be the same as another issue, the root cause is the same. One of the challenges with carrying out a good RCA is the lack of time. When we are pressurized to get results now, we focus on containing the issue and getting to root cause comes lower down in the priorities. After all, if we get to root cause and put fixes in place, we will help the organization in the future but it doesn’t help us now. As RCA is often a low priority, it is also rushed. And to quote Tim Lister from Tom DeMarco’s book Slack, “people under time pressure don’t think faster.” One way of short-cutting thinking is to use a cognitive short-cut and just assume that the root cause must be the same as a similar issue you saw years ago. If you go down that route you really need to test the root cause against the available facts to make sure it stands up in this case too. Deliberate use of the DIGR® method of RCA can help combat this cognitive bias as it takes you logically through the steps of Define, Is-Is Not, Go step by step and Root Cause. People need time to think.

DIGR® can help with the focus on facts rather than opinion in RCA. It helps pull together all the available facts rather than leaving some to the side by focusing on ‘why’ too early.

In my next post I will go into some more detail on the G of DIGR®. How using process maps can really help everyone involved to Go step by step and start to see where a process might fail.

 

Text © 2017 Dorricott MPI Ltd. All rights reserved.

DIGR® is a registered trademark of Dorricott MPI Ltd.

Use DIGR to get to the Root Cause!

(Photo: Martin Pettitt, License)

I want to thank everyone who read, commented or liked my last post – “Root Cause Analysis: we have to do better than Five Whys”. Many seemed to agree that the Five Whys approach is really not up to the job. The defense of Five Whys seemed to fall into a number of buckets – “It’s just a tool”, “It’s a philosophy, not a tool”, “It needs someone who is trained to use it”, “It’s not meant to be literal: it’s not only about whys”, “It’s not meant to be literal: five isn’t a magic number”. No-one tried defending the Lincoln Memorial example which is so often used to teach Five Whys. I really do think it is a poor tool on its own – at the very least, it is mis-named. I think we do people a disservice by suggesting “just ask why five times” – we over-simplify and mislead. I think there is a better way. One that is still simple but, importantly, doesn’t miss out key information to help get to root cause and is more likely to lead to consistent results. This is why I came up with the DIGR® method. At the end of this post I explain the basis for DIGR®. There are many sophisticated RCA methods and they have their place but I do think we’d do well to replace Five Whys with DIGR®:

  • Define the problem. You need to make sure everyone is focused on the same issue for the RCA. This sounds trivial but is an important step. What is the problem you are focusing on? You would be surprised how often this simple question brings up a discussion.
  • Is – Is Not. Consider Is – Is Not from the perspective of Where, When and How Many. Where is the issue and where is it not? How many are affected and how many not? When did the problem start or has it always been there?
  • Go step-by-step. Go step-by-step through the process. What should happen – is it defined? Was the process followed? Were Quality Control (QC) steps implemented and does data from them tell you anything? If an escalation occurred earlier was the issue dealt with appropriately? This is where a process map would help.
  • Root cause. Use the information gathered to generate possible root causes. Then use why questions until you get to the right level of cause – you need to get back far enough in the cause-effect process that you can implement actions to address the cause but not to go back too far. This is where experience becomes invaluable. Narrow down to one or two root causes – ideally with evidence to back them up.

Of course, once you have your root cause you will want to develop actions to address the root cause and to monitor the situation. I will talk more about these in future posts. For now, I want to use an example with the DIGR® method of RCA.

Consider a hypothetical situation where you are the Clinical Trial Lead on a vaccine study. Information is emerging that a number of the injections of trial vaccine have actually been administered after the expiry date of the vials. This has happened at several sites. The first thing you should do is contain the problem. You do not need DIGR® for this. When you have the chance to carry out the RCA, what might the DIGR® approach look like?

Define. Let’s make sure everyone agrees on what the problem is. It’s not that a nurse didn’t notice that a vial that was about to be administered was past its expiry date. Rather it is that expired vaccine has been administered to multiple patients at multiple sites.

Is – Is Not (Where and When). Where is the issue? It has happened in two sites in two regions (North America and Western Europe). In one site, it has happened twice and this is where the problem was discovered by the CRA reviewing documentation. Is there anything different about the sites where it happened versus those where it did not? There is only one batch that has actually passed the expiry date and not all sites received that batch. So there are many sites where this problem could not have occurred (yet). In fact in reviewing the data we see that for the sites with the expired batch, there have only been 30 administrations of the vaccine since the expiry date. So there was the potential for 30 cases and we have three at two sites. 27 other administrations were of unexpired vaccine.

Go step-by-step. What should actually happen? All vials in a batch share the same expiry date. The drug management system determines which vials are sent to which site based on the recruitment rate. The system flags when there are vials that are expiring soon at particular sites and sends an email. The email explains the action needed – to quarantine expired vials by placing them away from the non-expired ones, clearly labelled. These are then collected to be destroyed centrally. So this process must have failed somewhere. Further investigation highlights that the two sites did not receive the email. In fact, the email addresses used to send the notification to the sites have minor errors in them – not just at the two sites where the issue occurred but at another three. At the two sites with the issue, the emails did not arrive, so they were not informed of expired vaccine and did not specifically go in to quarantine them. There are also no checks in place to make sure the process works – test emails, checks for bounced emails, a copy to the CRA to follow up with the site, etc.
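
A hedged sketch of the kind of entry-time check that was missing (the site roster and the validation rule below are hypothetical):

    import re

    # Hypothetical site contact list as entered in the drug management system
    site_emails = {
        "Site 101": "pharmacy@site101-hospital.org",
        "Site 204": "pharmacy@site204-hospita.org",   # subtle typo -- still passes a format check
        "Site 305": "pharmacy site305-hospital.org",  # missing '@' -- caught at entry
    }

    EMAIL_FORMAT = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")

    def suspicious_entries(emails):
        """Flag addresses that fail a basic format check at the point of entry."""
        return {site: address for site, address in emails.items()
                if not EMAIL_FORMAT.match(address)}

    print(suspicious_entries(site_emails))  # {'Site 305': 'pharmacy site305-hospital.org'}

Note that a plausible-looking typo (Site 204 above) still passes a format check, which is exactly why checks on outcomes – test emails and bounce monitoring – matter as well.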

Root cause. Based on all the information brought together in this RCA, it seems that this was an issue waiting to happen. One route of enquiry is why the two sites did not check the expiry date prior to administration. This could go down the route of blame, which is unlikely to lead to root cause (as I will discuss in a future post). But a more fundamental question is how the nurses at these sites were given expired vaccine in the first place. We were lucky in 27 cases – presumably good practice at sites stopped the issue from occurring. But we don’t want to rely on luck. Why did the nurses and pharmacists have expired drug available to use? Because the process of identifying expired batches and quarantining them has not been verified. I would argue this is the root cause. You could go further and try to understand how the erroneous email addresses were entered into the drug management system, but the level we have got to means we can take action – it is within our control to stop this recurring. In other words, we are at the right level to develop countermeasures.

In my next post I will expose some of the hidden assumptions of RCA.

I hope you are intrigued by the DIGR® method of root cause analysis. Could we replace Five Whys with DIGR®? Of course, I welcome your thoughts, comments and challenges to the approach!


Some background to DIGR®

Some people seem naturally good at seeking out root cause. And when you try to formulate the method it is not easy. In DIGR® I have brought together various approaches. Define comes from the D in DMAIC as part of Six Sigma. It is also part of A3 methodology. Is – Is Not comes from the approach described by Kepner and Tregoe in “The New Rational Manager”. Go Step-by-Step comes from Lean Sigma’s process and systems approach – to quote W. Edwards Deming, “If you can’t describe what you’re doing as a process, you don’t know what you’re doing”. Root Cause is, in part, the Five Whys approach – but only used after gathering critical information from the other parts of DIGR® and without a need for five. To look at DIGR® from the approach of 5WH: D=Who and What, I=When and Where, G=How, R=Why.

 

Text © 2017 Dorricott MPI Ltd. All rights reserved.

DIGR® is a registered trademark of Dorricott MPI Ltd.