Why should we care about root causes?

So, there’s been an accident. Let’s patch everyone up and fix the bollard. Why do we care about how the accident happened? One of the reasons I enjoy training people is the questions they ask. Every time I run training, I get at least one question that really makes me think. And often, the question is surprisingly simple – on the surface at least. One of the areas I regularly train organisations on is root cause analysis methods and how issue management should link back to risk management. I presented on this topic at SCOPE Europe last year. So how intriguing it was, at a recent training session, to get a question I had not really considered in any depth before: why do we need the root causes of an issue at all?

The stock answer is that knowing the root causes helps you focus on them, to try to reduce the likelihood of such issues recurring in the future. It means you address the issue at its fundamentals rather than just treating the symptoms. It was here that the realisation hit me – we determine root causes primarily so we can reduce the risk of future issues. If we were not concerned about the risk of the issue recurring, there would be little point in spending time trying to get to root causes. And if it is about reducing the risk, then it is not just about the likelihood of the issue recurring. It could also be about the impact and possibly the detectability. We evaluate risks on these three, after all: likelihood, impact and detectability. For a traffic accident, if the root cause was that a child’s ball had rolled into the road and a car had swerved to avoid the child, hitting the bollard instead, we could:

      • Erect a fence next to the play area to stop balls going into the road (and children following them) – reducing likelihood
      • Reduce the speed limit near the play area to reduce the likelihood of serious injury – reducing impact
      • Erect motion sensors in the play area and link them to a flashing warning sign for road users – to improve detectability

Thinking of a clinical trial example: if the issue is that very few Adverse Events (AEs) are being reported from a particular site, and the root cause is determined to be a lack of site understanding of AE reporting, then to reduce the risk we could:

      • Work with the site to make sure they understand the reporting requirements (to reduce the likelihood)
      • Review source data and raise queries where AEs should have been reported but were not (to reduce the impact)
      • Monitor the Key Risk Indicator for AEs per participant visit at a greater frequency for that site to see if it picks up (to improve detectability)

You may do one or more of these. In risk terms, you are trying to reduce the risk by modifying one or more of – likelihood, impact and detectability. And, of course, you might decide to take these actions across all sites and even in other studies.

And it brings me back to that thorny problem of corrective actions and preventive actions. Corrective actions work on reducing the risk of the issue recurring – whether by reducing the likelihood or impact, or by improving detectability. If that is so, what on earth are preventive actions? Well, they should be about reducing the risk of issues ever happening – by building quality in from the start. Before a clinical trial starts, GCP requires that a risk assessment is carried out. And as part of the risk assessment, risks are evaluated and prioritised. The additional risk controls that are implemented before the start of the trial are true preventive actions.

It is unfortunate that GCP confuses the language by referring to corrective actions and preventive actions in relation to issue management rather than showing how they relate to risk. And from the draft of E6 R3, it appears that will not be fixed. ISO 9001 fixed this with the 2015 version. Let’s hope that one day we in clinical trials can catch up with thinking in other industries and stop confusing people as we do now.

As so often, we should ask the “why” question to get to a deeper truth – as encouraged by Taiichi Ohno. And I was very grateful to be reminded of this as part of a training program I was providing.

I have modified my training on both issue and risk management to show better how the two are intricately linked. Is your organization siloing issues and risks? If so, I think there is a better way.

No children, animals or balls were harmed in the writing of this blog post.

 

Text: © 2024 Dorricott MPI Ltd. All rights reserved.

Image: © 2024 Keith Dorricott

Contingencies: Time to Take out this Tool from the RBQM Toolbox

When evaluating risks in clinical trials, people normally evaluate the likelihood, impact and detectability, closely following the guidance in ICH E6 R2. For example, perhaps there is an assessment of eye morphology made by the investigators, using a relatively new rating scale. A risk might be: “inconsistency in applying the rating scale by investigators due to the unusual rating scale might lead to an inability to assess a key secondary endpoint.” The study team might decide that this is likely to happen, that the impact if it does would be high, and that detecting it during the study would be difficult. This risk would score high on all three dimensions and end up as one of the high risks for the study.
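As a minimal sketch of the arithmetic behind such a score (the 1–5 scales and the “high” threshold are my illustrative assumptions, not from ICH E6):

```python
# A minimal sketch of scoring one risk on likelihood, impact and
# detectability. The 1-5 scales and the "high" threshold are
# illustrative assumptions, not taken from ICH E6.

def risk_score(likelihood: int, impact: int, detectability: int) -> int:
    """Combined score; a higher detectability score means harder to detect."""
    return likelihood * impact * detectability

# The eye-morphology rating-scale risk: likely to happen (4), high
# impact (5), difficult to detect during the study (4).
score = risk_score(likelihood=4, impact=5, detectability=4)
print(score, "high risk" if score >= 48 else "lower priority")  # 80 high risk
```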

The next step in risk management is to look at the high risks and consider how they can be further controlled (or “mitigated”). I teach teams to look at the component risk scores for likelihood, impact and detectability and consider how, and whether, each of them can be influenced by additional controls.

To reduce likelihood, for example:

    • Protocol changes (e.g. use a more common scale or have a maximum number of participants per site)
    • Improved training including an assessment (but not “retraining”!)
    • Increasing awareness (e.g. with reminders and checklists)

And to improve detectability (and reduce its score), for example:

    • Implement additional manual checks (e.g. site or remote monitoring)
    • Close monitoring of existing or new Key Risk Indicators (KRIs)
    • Computational checks (e.g. edit checks in EDC)
    • Use of central readers

But what of the impact dimension? Are there any additional controls that might be able to reduce the impact? Here we need to think more about what impact means. As issues emerge, they rarely start with their maximum impact. For example, if there is a fire in an office building, it takes time before the building is burnt to the ground. So there are actions that can be taken after the emergence of an issue to reduce the overall impact. For a fire in an office building, examples of such actions are: having fire extinguishers available and people trained to use them, having clearly signed fire exits and people who have practiced exiting the building through regular fire drills, and having fire alarms that are regularly tested. These are actions put in place before the issue emerges so that, when it does, they are ready to implement and can reduce the overall impact. They are contingencies.

As I work through possible additional controls with teams, they typically look at the impact and decide there is no way they can affect it. For some risks that might be true but often there are contingencies that might be appropriate.

To reduce the impact, example contingency actions include:

    • Upfront planning for how to manage missing datapoints statistically
    • Planning for the option of a home healthcare visit if an on-site visit is missed
    • Preparing to be able to ship investigational product direct to patients if a pandemic strikes
    • Back-up sites

In our risk example “inconsistency in applying the rating scale by investigators due to unusual rating scale might lead to an inability to assess a key secondary endpoint,” the impact is the inability to assess a key secondary endpoint. But if we detect this emerging issue early enough, are there any actions we could take (and plan for upfront) that could help stop that maximum impact from being realised? Perhaps it is possible to take a picture that could be assessed at a later point if the issue emerges. Or remedial training could be prepared in case an investigator’s assessments appear too variable.

Of course, not all risks need risk controls. But contingencies are worth considering. In my experience, contingencies are a tool in the risk management toolbox that is not taken out often enough. Perhaps by helping teams understand how contingencies fit into the framework of RBQM, we can encourage better use of this tool.

 

Text: © 2023 Dorricott MPI Ltd. All rights reserved.

Image: © 2023 Keith Dorricott

Is the risk of modifying RBQM in GCP worth it?

At SCOPE Europe in Barcelona earlier this month, I took the opportunity to talk with people about the proposed changes to section 5.0 of ICH E6 on Quality Management. People mostly seemed as confused as I was by some of the proposed changes. It’s great that we get an opportunity to review and comment on the proposal before it is made final. But it is guesswork trying to determine why some of the changes have been proposed.

ICH E6 R2 was adopted in 2016 and section 5.0 was one of the major changes to GCP in twenty years. Since then, organizations have been working on its adoption with much success. Predefined Quality Tolerance Limits (QTLs) are one area that has received much discussion in industry, and much has been written about them. And I have listened to and personally led many discussions on the challenges of implementation (including through the long-running Cyntegrity mindsON RBQM series of workshops, which is nearing episode twenty this year!). So much time and effort has gone into implementing section 5.0, and much of it remains intact in the proposed revision in E6 R3. And there are some sensible changes being proposed.

But there are also some proposed changes that appear minor but might have quite an impact. I wonder if the risk of making the change is actually worth the potential benefit that is hoped for. An example of such a proposed change is the removal of the words “against existing risk controls” from section 5.0.3 – “The sponsor should evaluate the identified risks, against existing risk controls […]” We don’t know why these four words are proposed to be dropped in the revised guidance. But I believe dropping them could cause confusion. After all, if you don’t consider existing risk controls when evaluating a risk, then that risk will likely be evaluated as very high. For example, there may be an identified risk such as “If there are too many inevaluable lab samples then it may not be possible to draw a statistically valid conclusion on the primary endpoint.” Collection and analysis of lab samples is a normal activity in clinical trials, and there are lots of existing risk controls such as provision of dedicated lab kits, clear instructions, training, qualified personnel, specialised couriers, central labs, etc. If that risk is evaluated assuming none of the existing risk controls are in place, then I am sure it will come out as a high risk that should be controlled further. But maybe the existing risk controls are enough to bring the risk to an acceptable level without further risk controls. And there may be other risks that are more important to spend time and resource controlling.

We don’t know why the removal of these four words has been proposed, and there may be very sound reasons for it. As someone experienced in helping organizations implement RBQM, and as an educator and trainer, however, it is not clear to me. And I worry that a seemingly simple change like this may actually cause more industry confusion. It may divert time and resource away from the work of proper risk management and into process, system, and SOP updates. It may delay still further some of the laggards in implementing Risk-Based Quality Management (RBQM). Delaying implementation is bad for everyone, but particularly patients. They can end up on trials where risks are higher than they need to be, and they may not get access to new drugs as quickly because trials fail operationally (as their risks have not been properly controlled).

So my question is, is the risk of modifying RBQM in GCP worth it?

The deadline for comments on the draft of ICH E6 R3 has now passed. The guidance is currently planned for adoption in October 2024. I’ll be presenting my thoughts on the proposed changes at SCOPE in Florida in February.

Text: © 2023 Dorricott Metrics & Process Improvement Ltd. All rights reserved.

Picture: Neil Watkins

Enough is enough! Can’t we just accept the risk?

I attended SCOPE Europe 2022 in Barcelona recently. And there were some fascinating presentations and discussions in the RBQM track. One that really got me thinking was Anna Grudnicka’s on risk acceptance. When risks are identified and evaluated as part of RBQM, the focus of the team should move to how to reduce the overall risk to trial participants and to the ability to draw accurate conclusions from the trial. Typically, the team takes each risk, starting with those that score the highest, and decides how to reduce the scores. To reduce the risk scores (“control the risk”), they can try to make the risk less likely to occur, to reduce the impact if it does occur (a contingency), or to improve the detection of the risk (with a KRI, for example). It is unusual for there to be no existing controls for a risk. Clinical trials are not new, after all, and we already have SOPs, training, systems, monitoring, data review, etc. There are many ways we try to control existing risks. In her presentation, Anna was making the point that sometimes it may be the right thing to accept a risk without adding further controls. She described how at AstraZeneca they can estimate the programming cost of an additional Key Risk Indicator (a detection method) and use this to help make the decision on whether or not to implement that additional risk control.

Indeed, the decision on whether to add further controls is always a balance. What is the potential cost of those controls? And what is the potential benefit? Thinking of a non-clinical trial example, there are many level crossings in the UK. This is where a train line crosses a road at the same level. Some of these level crossings have no gates – only flashing lights. A better control would be to have gates that stop vehicles going onto the track as a train approaches. But even better would be a bridge. But, of course, these all have different costs and it isn’t practical to have a bridge to replace every level crossing. So most level crossings have barriers. But for less well-used crossings, where the likelihood of collision is lower, the flashing light version is considered to be enough and the risk is accepted. The balance of cost and benefit means the additional cost of barriers is not considered worth it for the potential gain.

So, when deciding whether to add further controls, you should consider the cost of those controls and the potential benefits. Neither side of the equation may be that easy to determine – but I suspect the cost is the easier of the two. We could estimate the cost of additional training or programming and monitoring of a new KRI. But how do we determine the benefit of the additional control? In the absence of data, this is always going to be a judgement.
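As a minimal sketch of how that judgement might be framed (every figure below is invented for illustration, not taken from Anna’s presentation):

```python
# A minimal sketch of the cost-benefit judgement for one additional
# control, e.g. programming and monitoring a new KRI. Every figure
# here is an invented illustration.

control_cost = 8_000       # estimated cost to build and monitor the KRI
issue_cost = 250_000       # rough cost if the risk materialises unchecked
p_issue = 0.05             # judged probability the risk materialises
risk_reduction = 0.5       # judged fraction of that risk the KRI removes

expected_benefit = issue_cost * p_issue * risk_reduction  # 6250.0
if expected_benefit > control_cost:
    print("The control looks worth implementing")
else:
    print("Consider accepting the risk")  # this branch runs with these numbers
```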

The important thing to remember is that not all risks on your risk register need to have additional controls. Make sure the controls you add are as cost-effective as possible and meet the goal of reducing the overall risk to trial participants and the ability to draw accurate conclusions from the trial.

 

Text: © 2022 Dorricott MPI Ltd. All rights reserved.

Image – © Walter Baxter CC2.0

And Now For Some Good News

It feels as though we need a good news story at the moment. And I was reading recently about the incredible success of the Human papillomavirus (HPV) vaccine. It really is yet another amazing example of the power of science. HPV is a large group of viruses that are common in humans but normally do not cause any problems. A small number of them, though, can lead to cancers and are deemed “high risk”. Harald zur Hausen isolated HPV strains in cervical cancer tumours back in the 1980s and theorised that the cancer was caused by HPV. This was subsequently proved right: in fact, we now think 99.7% of cervical cancers are caused by persistent HPV infection. This understanding, along with vaccine technology, led to the development of these amazing vaccines, which are, incredibly, as much as 99% effective against the high-risk virus strains. And the results speak for themselves, as you can see in the graphic above. It shows the percentage of women diagnosed with cervical cancer by age 20, by birth year – and the numbers have dropped dramatically as vaccination rates have increased. zur Hausen won the Nobel Prize in Medicine for his fundamental work, which has impacted human health to such a degree.

What particularly intrigued me about this story is that here in the UK there has been public concern that the frequency of testing for cervical cancer (via the “smear test”) is being reduced – in Wales specifically. The concern is that this is about reducing the cost of the screening programme. The reduction in frequency from 3 to 5 years is scientifically supported, however, because the test has changed. In the past, the test involved taking a smear and then looking for cancerous cells through a microscope. This test had various problems. First, the smear may not have come from a cancerous part of the cervix. Second, as it involves a human looking through a microscope, they might miss a cancerous cell in the early stages.

The new test, though, looks for the high-risk HPV strains. If HPV is present, it will be throughout the cervix and so will be detected regardless of where the sample is taken from. And it doesn’t involve a human looking through a microscope. But there is an added, huge benefit. Detecting a high-risk HPV strain doesn’t mean there is cancer – it is a risk factor. And so further screening can take place if this test is positive. This means that cancer can be detected at an earlier stage. Because the new test is so much better, and detects earlier, there is more time to act. Cervical cancer normally develops slowly.

In Risk-Based Quality Management (RBQM) in clinical trials, we identify risks, evaluate them, and then try to reduce the highest risks to the success of the trial (in terms of patient safety and the integrity of the trial results). One way to reduce a risk is to put a measurement in place. People I work with often struggle to understand how to assess the effectiveness of a risk measurement, but I think this cervical cancer testing gives an excellent example. The existing test (with the microscope) can detect the early stages of cancer. But the newer test can actually detect the risk of a cancer – it acts earlier in the cancer’s development cycle. The newer test detects with more time to act. And because of that, the test frequency is being reduced. The best risk measurements provide plenty of time to take action to reduce the impact – in this case, cervical cancer.

This example also demonstrates another point: that understanding the process (the cause and effect) means you can control the process better. In this case, by both eliminating the cause (via the HPV vaccine) and improving the measurement of the risk of cancer (via the test for high-risk HPV strains). Process improvement always starts with process understanding.

Vaccines have been in our minds rather more than usual over the last couple of years. It is sobering to think of the number of lives they have saved since their discovery in 1796 by Edward Jenner.

 

Text: © 2022 Dorricott MPI Ltd. All rights reserved.

Image – Vaccine Knowledge Project https://vk.ovg.ox.ac.uk/vk/hpv-vaccine

Is Risk Thinking Going Mainstream?

I sing in a chamber choir – rehearsals, though, have of course been over Zoom in recent months. I’m on the choir committee and we’ve been discussing what we might need to do to get back to singing together in the real world. And the conductor showed us a Risk Assessment that he’d been working on! I was really impressed. It showed different categories to consider for risks, such as preparation for rehearsal, attendee behaviour during rehearsals, rehearsal space, etc. The risks had been brainstormed. Each was scored for Likelihood and Impact. These scores were multiplied to determine a Risk Score. Mitigations were listed to try to reduce the high Risk Scores. Then each risk was re-scored assuming the mitigation was implemented – to see whether the score was now acceptable. We went through the risk assessment process and the main mitigation actions we needed to take were:

      1. Maintain social distancing at all times and wear masks when not singing.
      2. Register all attendees for track and trace purposes.
      3. No sharing of music, pencils, water etc. Choir members need to bring their own music.
      4. Rehearsal limited to one hour, then a 15-minute break in which the room is ventilated to prevent unacceptable build-up of aerosols, then continue with the rehearsal. Ideally, people go outside during the break (if not too cold).
      5. Clear instructions to the choir before, during and after. Including making it clear the rehearsal is not risk-free and no-one is obliged to attend.

Which I thought was pretty good.
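The scoring mechanics are simple enough to sketch (the 1–5 scales and the example scores below are my invention, not the conductor’s actual numbers):

```python
# A minimal sketch of the choir's risk register arithmetic: score each
# risk for Likelihood and Impact (on an assumed 1-5 scale), multiply to
# get a Risk Score, then re-score assuming the mitigation is in place.
# The risks and scores below are invented for illustration.

risks = [
    # (risk, likelihood, impact, likelihood_after, impact_after)
    ("Aerosol build-up during a long rehearsal", 4, 5, 2, 5),
    ("Virus transfer via shared music/pencils",  3, 4, 1, 4),
]

for name, l, i, l_after, i_after in risks:
    print(f"{name}: {l * i} -> {l_after * i_after}")  # e.g. 20 -> 10
```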

It really intrigued me that a small choir would be completing something like this. I helped develop the MCC’s Risk Assessment & Mitigation Management Tool 2.0 and there are interesting similarities – the brainstorming of risks, the use of Likelihood and Impact to provide a Risk Score, the mitigations, and the re-scoring to see if the Risk Score is at an acceptable level. And there are some differences too – in particular, there is no score for Detectability. I’ve often heard at meetings of the MCC and with other clients how difficult it is in clinical trials to get people really thinking critically for risk assessments. And how challenging the risk assessment can be to complete. I wonder if COVID-19 is helping to bring the concept of risk more into the mainstream (as mentioned in an article in the New Scientist on risk budgeting), and whether that might make it easier for those involved in clinical trials to think this way too.

Unfortunately, within days of us completing the choir rehearsal risk assessment, the government announced a new national lockdown. Which has stopped us moving forward for now. But we’re ready when restrictions ease. Zoom rehearsals for a while longer!

 

Text: © 2020 Dorricott MPI Ltd. All rights reserved.

Lack of Formal Documentation – Not a Root Cause

When conducting root cause analysis, “Lack of formal documentation” is a suggested root cause I have often come across. It seems superficially like a good, actionable root cause. Let’s get some formal documentation of our process in place. But, I always ask, “Will the process being formally documented stop the issue from recurring?” What if people don’t follow the formally documented process? What if the existing process is poor and we are simply documenting it? It might help, of course. But it can’t be the only answer. Which means this is not the root cause – or at least it’s not the only root cause.

When reviewing a process, I always start by asking those in the process what exactly they do and why. They will tell you what really happens. Warts and all. When you send the request but never get a response back. When the form is returned but the signature doesn’t match the name. When someone goes on vacation with their work in process and no-one knows what’s been done or what’s next. Then I take a look at the Standard Operating Procedure (SOP), if there is one. It never matches.

So, if we get the SOP to match the actual process, our problems will go away, won’t they? Of course not. You don’t only need a clearly defined process. You need people who know the process and follow it. And you also want the defined process to be good. You want it carefully thought through, with the ways it might fail considered. You can then build an effective process – one that is designed to handle the possible failures. And there is a great tool for this – Failure Mode and Effects Analysis (FMEA). Those who are getting used to Risk-Based Quality Management as part of implementing section 5.0 of ICH E6 (R2) will be used to the approach of scoring risks by Likelihood, Impact and Detectability. FMEA takes you through each of the process steps to develop your list of risks and prioritise them, prior to modifying the process to make it more robust. This is true preventive action: trying to foresee issues and stop them from ever occurring. If you send a request but don’t get a response back, why might that be? Could the request have gone into spam? Could it have gone to the wrong person? How might you handle it? Etc. Etc.
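As a minimal sketch of that prioritisation step (the failure modes come from the examples above; the 1–5 scores are invented):

```python
# A minimal sketch of FMEA-style prioritisation: walk through the
# process steps, list the ways each might fail, score Likelihood,
# Impact and Detectability (assumed 1-5 scales), and rank by the
# combined score. All scores are invented for illustration.

failure_modes = [
    # (process step, failure mode, likelihood, impact, detectability)
    ("Send request", "Request lands in spam",        4, 3, 4),
    ("Send request", "Sent to the wrong person",     2, 3, 3),
    ("Receive form", "Signature doesn't match name", 2, 4, 2),
]

for step, mode, l, i, d in sorted(failure_modes,
                                  key=lambda fm: fm[2] * fm[3] * fm[4],
                                  reverse=True):
    print(f"{l * i * d:>3}  {step}: {mode}")  # tackle the highest scores first
```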

Rather than the lack of a formally documented process being a root cause, it’s more likely that there is a lack of a well-designed and consistently applied process. And the action should be to agree the process, and then to work through how it might fail so you can build a robust process. Then document that robust process and make sure it is followed. And, of course, monitor the process for failures so you can continuously improve. Perhaps more easily said than done. But better to work on that than to spend time formally documenting a failing process and think you’ve fixed the problem.

Here are more of my blog posts on root cause analysis where I describe a better approach than Five Whys. Got questions or comments? Interested in training options? Contact me.

 

Text: © 2019 DMPI Ltd. All rights reserved.

Image: Standard Operating Procedures – State Dept, Bill Ahrendt

Please FDA – Retraining is NOT the Answer!

The FDA has recently issued a draft Q&A Guidance Document on “A Risk-Based Approach to Monitoring of Clinical Investigations”. Definitely worth taking a look. There are 8 questions and answers. Two that caught my eye:

Q2. “Should sponsors monitor only risks that are important and likely to occur?”

The answer mentions that sponsors should also “consider monitoring risks that are less likely to occur but could have a significant impact on the investigation quality.” These are the High Impact, Low Probability events that I talked about in this post. The simple model of calculating risk by multiplying Impact and Probability essentially prioritises a High Impact, Low Probability event the same as a Low Impact, High Probability event. But many experts in risk management think these should not be prioritised equally: High Impact, Low Probability events should be prioritised higher. So I think this is a really interesting answer.
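A toy illustration of the point (the scores and the impact-weighting are my own, not from the FDA guidance):

```python
# Toy illustration: under a simple Impact x Probability model, a High
# Impact/Low Probability event scores the same as a Low Impact/High
# Probability event. The 1-5 scores and the weighting are assumptions.

print(5 * 1 == 1 * 5)  # True: both events score 5 under Impact x Probability

# One possible way to rank high-impact events higher: weight impact
# more heavily, e.g. by squaring it.
weighted = lambda impact, prob: impact ** 2 * prob
print(weighted(5, 1), weighted(1, 5))  # 25 vs 5 -> high impact now ranks higher
```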

Q7. “How should sponsors follow up on significant issues identified through monitoring, including communication of such issues?”

One part of the answer here has left me aghast. “…some examples of corrective and preventive actions that may be needed include retraining…” I have helped investigate issues in clinical trials so many times, and run root cause analysis training again and again. I always tell people that retraining is not a corrective action. Corrective actions should be based on the root cause(s). See a previous post on this and the confusing terminology. If you think someone needs retraining, ask yourself “why?” Could it be:

      • They were trained but didn’t follow the training. Why? Could it be that one or more of the Behavior Engineering Model categories was not supported, e.g. they didn’t have time, they didn’t have the right tools, or they weren’t provided with regular feedback to tell them how they were doing? If it’s one of these, then focus on that. Retraining will not be effective.
      • They haven’t ever received training. Why? Maybe they were absent when the rest of the staff was trained and there was no plan to make sure they caught up later. They don’t need retraining – they were never trained. They need training. And is it possible that there might be others in this situation? Who else might have missed training and needs training now? Maybe at other sites too.
      • There was something missing from the training (as looks increasingly likely as one possible root cause in the tragic case of the Boeing 737 Max). Then the training needs to be modified. And it’s not about retraining one person or one site on training they had already received. It’s about training everyone on the revised training. Of course, later on, you might want to try to understand why an important component was missing from the training in the first place.

I firmly believe retraining is never the answer. There must be something deeper going on. If your only action is retraining, then you’ve not got to the root cause. I can accept reminding as an immediate action – but it’s not based on a root cause. It is more about providing feedback and is only going to have a short-term effect. An elephant may never forget but people do.

Got questions or comments? Interested in training options? Contact me.

 

Text: © 2019 DMPI Ltd. All rights reserved.

Save me from the snow – a perspective on risk

I recently attended and presented at the MCC Clinical Trial Risk and Performance Management Summit in Princeton. It was a fantastic event – always great to meet people you’ve been talking with on the phone, and there was a real energy and desire to exchange ideas and learn. Around noon on day two, snow started to fall. And it kept falling. I wasn’t concerned. After all, snow is hardly unusual in these parts and I assumed it would all be sorted out fine. Unfortunately, this was not to be the case. Our taxi was around an hour late arriving to take us to Newark airport. And the drive that should have taken 45 minutes took four hours. There were plenty of accidents and broken-down vehicles on the way. When we got near to the airport itself, things seemed to get worse, and at one point we were stuck, not moving, for around an hour. At the airport there was plenty of confusion as flight after flight was cancelled. The queue for the Customer Service Desk, for people to rebook flights and find a hotel, was around 400 people long. I estimated, based on the processing time, that it would take around 10 hours for the person at the end of the queue to be seen. My flight was delayed by five hours but did leave. Other delegates from the conference had flights cancelled and ended up in the airport overnight.

It did get me thinking about the whole thing from a risk perspective. This was, apparently, a rare event – so much snow settling in November. The probability of such an event was low. But the impact was quite significant on people trying to get anywhere, and many people’s plans were significantly disrupted. This is one of those high impact, low probability events which are actually rather difficult to manage from a risk perspective. Much more extreme examples are the 2011 Fukushima nuclear plant meltdown following a tsunami caused by an earthquake, and the possibility of a large asteroid hitting the earth. There’s even a UK government report on these high impact, low probability events from 2011, where a range of experts reviewed the latest research and different approaches. It’s important not to simply dismiss these risks – in particular because the probability is actually rather uncertain. The events happen rarely, which makes determining the true probability difficult. One approach is to improve detection – if you can detect early enough to take action, you can reduce the impact. And you can always have contingencies in place.

So back to the snow. I wonder, could they have predicted earlier that there was going to be so much snow? And that it would actually settle rather than melt away? Why didn’t they have better contingencies in place (e.g. gritting of roads, snow ploughs, better practices to deal with customers whose flights have been cancelled)? And here’s a scary thought – the probability of such events may be low. But it is uncertain. And with climate change, could it be that weather-related high impact, low probability events are actually becoming more common? Perhaps we need to improve our detection and contingencies for such events in the future.

On a final note, I will say I was very impressed by the stoicism of those impacted. I saw no-one getting angry. I saw people queuing in apparently hopeless queues without complaint. And there was plenty of good humour to go around. Enough to lift the spirits as we head into the holiday season!

 

Text and Picture: © 2018 Dorricott MPI Ltd. All rights reserved.