Satyan

No safety in numbers

Updated: Feb 1

Metric fixation is hurting patient safety, but are we truly ready to move beyond quantification?




"...but how would you measure that?"


Across my healthcare consulting and advisory work, one question I bump into with painful regularity is the one about measurability. When it comes to quality and safety in healthcare, the idea that the things that count must be countable (and its corollary, that countable things must count for something) saturates so much of our thinking that many take it as an inviolable article of faith. Recently, former United Kingdom health minister and eminent surgeon Lord Ara Darzi drew on a common axiom when asked to respond to an Imperial College report that ranked patient safety in the United Kingdom a lowly 21st out of 38 countries. He said, “You can’t improve things if you don’t measure them and this is our attempt to engage clinicians in all countries – not just the UK – to have better measurements and drive performance improvements.” No one stopped to ask how exactly measurement drives performance improvement, because it is now accepted doctrine that one leads to the other. A lot of what we deem worthwhile, how we make sense of impact, how we think about value, and how we assess the safety of our services all revolve around quantification. Numbers sit at the core of our sector's sense-making apparatus, and they exert a powerful shaping influence over much of what is undertaken in the name of assurance and improvement. So what outcomes do we have to show for all our devotion to measurement?


In "Still not Safe: Patient Safety and the Middle Managing of American Medicine", Wears and Sutcliffe lament the failures of the modern patient safety movement, ascribing its lack of impact to an uncritical adoption of top-down, rationalist, 'factory-like' production principles. They argue that this bureaucratised form of healthcare has drawn the cause of patient safety far away from its origins as a human-centred, dynamic and grass-roots (clinician-led) innovation movement. Instead it has led us towards a false reality: one centred on hard measures, compliance and control - into the realm of accounting rather than innovation, into stagnation rather than progress. Wears and Sutcliffe speak mainly of American medicine, but the book's lessons apply widely.


Go to any health system today and you would be hard-pressed to find a board committee, executive leadership group or clinical governance team that describes anything less than a tortured relationship with its patient safety data. It is staggering that, for all the countless person-hours hospital facilities have spent (for well over two decades) gathering, grooming, compiling, analysing, trending, reporting and iterating on the core measures of safety and quality, the highest levels of governance (arguably those with the widest-ranging access to information about an organisation) can still feel woefully under-informed about the level of safety at any given moment. It seems 200-page quarterly board reports, replete with facts, visualisations, traffic lights and thermometers, do little more than compound the problem. Unfortunately, within the current data-centric paradigm, the response to any perceived lack of insight is usually to push for more measurement, more forensic analysis and more reporting bloat. Not only does this tie up precious (senior-level) resources in producing, consuming and acting on these interpretations, but the bureaucratic measures that flow out in response to troubling signals in the data invariably erode clinical resilience (more on this later). And so the busy rituals of safety governance continue, all the while decoupling organisational safety functions from the many everyday opportunities to manage real risks in clinical care.


Rethinking metrics in the context of safety


At the risk of stating the obvious, this piece is not a critique of incident reporting in healthcare. There is a multitude of benefits in creating a culture of transparency when things go wrong, and the reporting of incidents is a big part of that. However, we run into problems when we ground our safety assurance and management efforts in quantitative measures derived from such data. I do believe there are concrete ways in which we can do better. Yet a good chunk of the process of delimiting a 'better' way requires first deconstructing the apparent problems in how we think about safety metrics and indicators in healthcare. Hopefully the tone of what follows comes across as discerning and balanced. If it opens up a productive debate, it will have served some purpose.


Problem 1: We assume that adverse event rates are an effective proxy for safety

Most informed observers agree that the trends we monitor using healthcare adverse event data are imperfect proxies for safety. But, imprecision aside, does this type of modelling still have some residual utility?


With the implementation of mandatory incident reporting in many countries, we now capture huge numbers of reports of unintended harm. However, we know (quite definitively, too) that reported incidents are only a subset of the true burden of harm [Hill et al, 2010]. Unfortunately, they are not a uniform subset (that is, we cannot assume the real rate is uniformly X% higher than the reported rate), and this is one confounder of efforts at interpretation. Certain incidents are more prone to under-reporting (such as those that occur overnight or that take place unobserved). There is also the lesser problem of over-reporting, with the weaponisation of incident reporting against colleagues being one troubling manifestation (#datixing).
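
To make that confounder concrete, here is a minimal, illustrative simulation (my own sketch; every figure in it is an invented assumption, not real data). It shows how reported incident counts can trend upwards even when the true rate of harm is completely flat, simply because the probability of a report being filed differs between observed and unobserved events and drifts as reporting culture changes.

```python
# Illustrative sketch only: all parameters are invented assumptions.
# True harm is held flat, but reporting probability varies by context
# (daytime vs overnight) and rises slowly as 'reporting culture' improves,
# so the reported trend moves even though the underlying harm does not.
import random

random.seed(42)

MONTHS = 24
TRUE_MONTHLY_HARM = 100      # assumed true (unchanging) rate of harm
DAY_REPORT_PROB = 0.6        # witnessed, daytime events: often reported
NIGHT_REPORT_PROB = 0.25     # overnight/unobserved events: often missed
NIGHT_FRACTION = 0.4         # assumed share of harm occurring overnight

for month in range(MONTHS):
    # Hypothetical reporting-culture drive: reporting slowly improves.
    culture_uplift = 1.0 + 0.02 * month

    reported = 0
    for _ in range(TRUE_MONTHLY_HARM):
        overnight = random.random() < NIGHT_FRACTION
        p = NIGHT_REPORT_PROB if overnight else DAY_REPORT_PROB
        if random.random() < min(1.0, p * culture_uplift):
            reported += 1

    print(f"month {month + 1:2d}: true harm = {TRUE_MONTHLY_HARM}, "
          f"reported = {reported}")
```

Run over a couple of simulated years, the reported counts climb steadily while the true harm never moves, which is exactly why a rising (or falling) trend-line cannot be read at face value.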


Even when they are accurate, selected indicators of safety can mislead as much as they inform - and often we only learn we were being misled in hindsight. Take the case of petroleum producer BP. Only two hours before the Deepwater Horizon offshore oil rig exploded, taking eleven lives and spilling a catastrophic amount of crude oil into the Gulf of Mexico, BP's Vice President for Drilling Operations was reportedly on the rig to celebrate seven years without a 'lost-time incident'. Four years after the disaster, US District Judge Carl Barbier would rule that BP was "guilty of gross negligence and wilful misconduct", with multiple external investigations finding significant and recurrent issues that had been explained away or ignored both by BP and by several of its contractors - all during a period when the high-level metrics were telling a vastly different story. This gradual unmooring of metrics from reality is not unknown within the safety literature [Dekker, 2016], and seems especially to plague organisations that lean hard into a target-driven management doctrine. There are some interesting side quests we could go on here, but I will leave those for later in this article. The central question is this: if declining rates of harm are to be treated with as much suspicion as escalating rates, and if the source data is not all that reliable in the first place, is there any residual value left in the trend-line?



Beyond the numbers, there are other (even more fundamental) problems with treating adverse events as proxies for safety. In conventional patient safety thinking (and by extension, in clinical governance), we tend to assume an inverse relationship between indicators (based on incident rates) and safety. The subtlety here is that although incidents are a countable variable (or, more precisely, a form of ratio data), safety is not. When our datasets are large enough, they allow us to 'read in' some correspondence between a gradually rising or falling indicator and the underpinning level of safety, but the logic breaks down when you zoom in a little. Many hospital wards periodically experience incident-free days. So do we interpret periods with zero events as representing maximal safety? Of course not. Our everyday experience of risk and safety is filled with shades and nuance, with very few absolutes. We might perceive a situation as risky and unsafe despite the absence of incidents (take the experience of ethnic minorities in some countries) or feel 'safe enough' to go about life normally in objectively dangerous places (think of the entrenched populace in conflict zones). Contemporary safety science agrees. Modern thinking on the topic regards the notion of 'objective safety' as illusory. Not only is safety relative, subjective and contextual - it is also a social construction, and its meaning has varied over time and across groups. It is fascinating to consider how culturally mediated such concepts are. Many traditional cultures do not have exact words for the western conception of safety. For example, Australian Aboriginal languages have terms for protection, safeguarding and kinship (active, purposeful words) but nothing that approximates the abstract, "disembodied" idea of 'safety' [Yunkaporta, 2019].


For a deeper treatment of the various definitions of safety and contemporary perspectives, I recommend James Pomeroy's excellent introduction here. Briefly, however, contemporary thinking no longer regards the mere absence of harm as an adequate surrogate for safety. What we are also seeing in the contemporary literature is a shift in emphasis away from objective measures of safety and towards a better understanding of the features of complex work that are desirable for safety. Safety concepts are increasingly being reframed in the language of positive capacities, proactive adaptations and intentional actions - in other words, everyday performance.


Safety is to be seen as something we produce through complex practice, rather than something we retain or 'defend'. This has meant that more research and practical attention is being focused on what used to be regarded as the boring 'business-as-usual' stretches between incidents. In the words of leading safety researcher Prof Erik Hollnagel, in order to understand what it takes to create safety, we have to pay attention to "what happens when nothing happens" [Hollnagel, 2019]. This way of thinking is yet to make major inroads into healthcare safety and quality, and there are many reasons for the slow uptake. I believe the sheer volume of incident reports in healthcare is a fundamental barrier of sorts. If a health service compiles several thousand reports of harm each year, it can seem as though failure is ubiquitous while success is fleeting. With so many signals of failure to contend with, it can feel morally problematic to turn our focus away from harm (even for a little while). There are also real capability questions, because practitioners cannot take familiar failure-oriented methods into explorations of normal work. Still, if we are serious about safety, and not just about averting incidents, this is a path we must go down.


The apparent distancing of safety from incidents presents a compelling secondary challenge: if safety is not the absence of harm, then how is healthcare meant to frame the relationship between safe practices and the rates of harm that we so assiduously curate? There is no easy answer. Take, for example, the risk of harm from hazardous chemicals in healthcare. Given that proven (and simple) engineered controls exist to manage hazardous chemicals, one might be tempted to treat this as a linear problem: you minimise harm by promoting compliance with safe handling practices. But what if an area has an exceptionally high number of incidents compared with its peers? We then have to consider the possibility of more complex causes at play - were workload factors involved, or could this be a signal of management shortfalls leading to maintenance lapses? Contemporary accident causation theories offer many competing explanations for how complex failures arise: if Reason's Swiss Cheese Model draws our attention to the adequacy of defence measures, Rasmussen's Dynamic Risk Management Model guides our focus towards the economic, production and safety pressures that cause a system's operating point to drift towards the boundary of failure. In my experience, no model explains every event perfectly, so it is important to be familiar with several and then assess the persuasiveness and explanatory power of each in a given situation.


To avoid getting sidetracked, it should suffice to say that in some situations, changes to everyday practices will produce immediate effects on incidents. In other situations, temporary degradations in work practices may have no effect at all. Equally, some degradations can incubate silently for months or years, combining and interacting with other risks such that when (and if) eventual effects do manifest, they are so far removed in time and space that they are functionally separated from the 'causes' they arose from. The aetiology of harm can be incredibly complex and is one of the most active areas of theoretical development within the sphere of safety (see Sidney Dekker's Foundations of Safety Science for more depth on the topic). Accident theories aside, the take-home message is this: while there is a definite link between the quality and safety of clinical work and the genesis of adverse events, the relationship is far too complex and contingent to be deciphered through any form of linear interpretation, especially one based on an analysis of rates of harm alone. In conclusion, I would posit that incident-based indicators tell us precious little about safety in healthcare. What does this mean for you?


Problem 2: Since rates of harm are of central importance to regulators and accreditation bodies, we assume they should be central for hospitals too.

In his best-selling book "Atomic Habits", James Clear talks a lot about goals and outcomes. Clear has worked with some of the world's most elite athletes, CEOs and high-performing teams, so you might expect his message to be one of unwavering focus on the prize: setting transformative targets, breaking these down into manageable interim goals and executing relentlessly. In reality, he advocates something quite different. Say you dream of playing in the NBA. Clear argues that setting outcome-based goals (aiming to be the best player in your school team, then in your freshman year of college, and so on) is not particularly useful. Not necessarily detrimental, but not useful either. Many people have such goals but never achieve them. Instead, Clear places the emphasis on creating the habits that produce small compounding benefits in your everyday performance. Habits and systems - that's what counts.


When we lay people listen to elite athletes describing their gruelling journeys to ultimate success, we tend to attribute their achievements to the clarity of their goals, because it fits the heroic narrative we know and love. This is not the real story though. According to Clear, everyone who competes professionally in a sport will at some point harbour dreams of winning the big one - the Tour de France, Le Mans, a Grand Slam. Yet the ones who succeed are uncompromising about creating and refining daily habits and systems. It's all about a million scaffolded adjustments to unremarkable everyday practices. He goes as far as saying that tracking progress against goals and outcomes is completely optional, because those outcomes are "lagging indicators of your systems and habits". Your body weight is a lagging indicator of your eating and exercise habits, just as your bank balance is a lagging indicator of your spending and saving habits. You are much better off focusing your energies on developing the habits you need. Time spent on this is a far better investment than staring at the scales, your bank statement and, perhaps, even your safety metrics.




Safety theorist Prof Sidney Dekker posits a related idea. He speaks of managers being held accountable for the 'dependent' variable (safety, or some marker of it in the form of an indicator) when in fact their sphere of influence might only stretch to a few (albeit very important) manipulable variables - such as ensuring rosters are filled, staff are trained and competent, workplaces are kept organised and equipment is well maintained. Holding managers accountable for things that are out of their control draws attention away from the things they should be focused on, which subverts the very system that creates movement in the dependent variable. The relationship many hospitals have with their safety indicators can be similar.


Published rates of harm serve a useful benchmarking function at the macro level. We can all agree that significant deviations from the mean deserve to be investigated and explained. Still, by the time these deviations stand out prominently enough (against peer data) to warrant external intervention, all that is left to investigate is the smouldering ruins of safety. The chronology of events leading to the investigation of the Mid Staffordshire NHS Foundation Trust's failures in the mid-2000s is instructive.


Indicators are legitimately useful in many spheres, but in the case of patient safety we should be more open to the possibility that we aren't helping our cause by focusing so heavily on the numbers. Organisations might do better, and go further, by channelling their collective attention towards understanding what drives safety at the grassroots. They might then devote available resources to strengthening the practices and systems that are most likely to create compounding benefits in performance. Do this consistently enough and the lagging outcome indicators and accreditation cycles should look after themselves. I say 'should' rather than 'will' because I know of no healthcare organisations that have travelled down this path. However, this does not require abandoning current approaches. As one of my mentors, Paul Plsek, used to say, "you just need to find a container big enough to manage your anxiety". In essence: create small safe-to-fail experiments to try out alternative ways of knowing and thinking (for one day, one shift, even one hour), harness researchers to help compile the evidence of your journeys, and be willing to innovate. It can be done.


Problem 3: We assume that it is impractical to attempt to keep track of frontline safety practices

In the rationalist framing that has come to dominate patient safety, we are prone to think automatically of compliance (with policies and procedures) when we talk about frontline safety practices - and thus mistakenly assume that any effort to track these capacities must involve some form of intrusive auditing and compliance monitoring. This is a misapprehension. From a contemporary understanding of safety in complex work, compliance with procedures is of little consequence. Current thinking suggests that the real-world practices that produce safety are rarely undertaken with safety as the core driver. The Safety II paradigm explains this succinctly. The sources of success and failure are the same: namely, clinicians and frontline teams making dynamic decisions to resolve goal conflicts and trade-offs. These decisions are made with imperfect information, under considerable uncertainty and with less than optimal resources. They mostly go well, but sometimes do not. When things don't go to plan, clinical teams are usually resourceful enough to retrieve the situation so that no harm is done. This is called 'adaptive' behaviour, and it is what produces resilient performance within complex systems like healthcare. When systems fail catastrophically at the first deviation from an intended plan, it is not a problem of compliance but a problem of 'brittleness' (a sign of a system that has become fundamentally inflexible).



From this standpoint, practices that enhance safety are those that reduce brittleness and foster resilience. Now here is a hard-won insight I'll give you for free: improvement practitioners conversant with Safety II know that the goals of resilience-enhancing measures are deeply aligned with the core priorities of clinical teams themselves. If safety leaders engage with clinical teams through questions like "what helps you get the job done in difficult circumstances?" and "how could we help you with that?", and if they acknowledge that the capacity for creative problem solving and the showing of initiative are incredibly valuable assets (rather than evidence of unwarranted deviations from protocol), they are unlikely to encounter pushback. Many safety and quality practitioners have experienced breathtaking reversals in their sense of agency, the impact of their work and their relationships with frontline teams when they have adopted resilience engineering methods over compliance-oriented ones. If this is all new and unfamiliar (but interesting), seek out the work of Erik Hollnagel and colleagues, or reach out to me for a chat.


Problem 4: We often overlook the risk of detrimental effects from the adoption of certain indicators

In 'The Tyranny of Metrics', Jerry Z. Muller unpacks the pernicious effects of management ideologies grounded in measurement and the many maladaptations that come from taking these ideas too far. The book works through a number of illuminating case studies, from policy making to military campaigns, education, policing and even healthcare. The story of Robert McNamara, the US Secretary of Defense during the Vietnam War, is a particularly unsettling example. McNamara, a Harvard-trained business executive, had played a part in pioneering efforts to apply statistical science to supply chain optimisation and troop movements in the Second World War. He then took these skills to the Ford Motor Company, rising to the rank of Company President, before accepting his political appointment. McNamara arrived with a clearly formulated doctrine of precise measurement. This led him to search for a central metric to gauge the progress of the war in Vietnam, and his department found it in the form of the body count of enemy combatants. McNamara reduced the notion of success to pure arithmetic: widening the gap between American losses and those of the Viet Cong. The implementation of this metric produced a number of perverse behaviours, including (quite predictably) inflation of the actual figures (double counting) but also the risky practice of sending American servicemen on increasingly dangerous missions into enemy territory to verify actual kills. This rather macabre account is a clear example of what is known as Campbell's law - an adage developed by the American social scientist and psychologist Donald Campbell. It states that "the more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor". The attachment of rewards and punishments to these indicators only serves as an accelerant for such organisational maladaptations.



It would be naive to think such patterns cannot arise in clinical practice when they have arisen (multiple times) in other safety-critical sectors [Dekker, 2016]. The increasing adoption of zero harm targets in healthcare is setting the stage for exactly this. Many groups internationally have adopted zero harm targets, and it is now also a matter of policy for the World Health Organization (WHO), whose Global Patient Safety Action Plan 2021-2030 sets out its zero harm orientation thus: "Make zero avoidable harm to patients a state of mind and a rule of engagement in the planning and delivery of health care everywhere." But 'how'? How will zero-harm targets be realised? New view thinking in safety (drawing on systems and social safety theories) at least delivers a mechanism for applying ourselves meaningfully to the task at hand without making outlandish claims. I can work with that. What I cannot fathom is how organisations can contemplate zero harm goals when we have demonstrably little experience in producing sustainable and intentional change at scale in patient safety. This is cargo cult science at its worst. Regardless, these blue-sky targets are not innocuous messages that we can emblazon on our safety 'merch' with no ill effects. They come with real costs and voracious appetites. Overzealous safety programs can do a lot of damage in these conditions, harming vital relationships with operational services and making managers less receptive to hearing bad news and more reticent about sharing it upwards. In turn, this can lead to workers being increasingly belittled, overtly shamed or implicitly stigmatised when incidents do occur (and they will occur) on their watch. What this will eventually yield is a culture of non-reporting of all but the most egregious incidents: an extended pattern of excellent numbers before the proverbial oil rig explodes. I concede that the data is yet to be compiled, and some knowledgeable commentators argue that such fears are overblown - perhaps they are right. For my part, I strongly suspect that zero harm targets will engender deep dysfunction in reporting practices and resurrect cultures of secrecy around harm - potentially reversing the singular win of the patient safety movement, which was to bring the issue of unintended harm into the light.


Conclusions


We have incredible opportunities to go down a science-backed path to make care safer for our patients: from applying contemporary safety science, to building capability in human factors-based systems improvement methods, to incorporating participatory methods that produce compassionate and customisable patient journeys. There is real work to be done, but most of it lies beyond the numbers.








References

Dekker, S. (2016). Drift into failure: From hunting broken components to understanding complex systems. CRC Press.


Hill, A. M., Hoffmann, T., Hill, K., Oliver, D., Beer, C., McPhail, S., ... & Haines, T. P. (2010). Measuring falls events in acute hospitals—a comparison of three reporting methods to identify missing data in the hospital reporting system. Journal of the American Geriatrics Society, 58(7), 1347-1352.


Hollnagel, E. (2019). Making health care resilient: from safety-I to safety-II. In Resilient health care (pp. 3-18). CRC Press.


Yunkaporta, T. (2019). Sand talk: How Indigenous thinking can save the world. Text Publishing.


