Spotting and rooting out bias in AI algorithms for healthcare

AI, machine learning

Algorithms are becoming more entrenched in our lives, a consequence of the growing stores of data and the push to make greater use of them. That is happening everywhere, but in health care the stakes are much higher because lives are on the line. Algorithmic calculations can affect diagnosis, treatment, the allocation of healthcare resources, and reimbursement. But bias lurking within algorithms can lead to unintended or suboptimal outcomes.

Many of the algorithms are proprietary “black boxes,” leaving outsiders to wonder how they reach their conclusions. Carol McCall, chief healthcare analytics officer at artificial intelligence software startup ClosedLoop, proposes one solution: open things up. The healthcare industry must insist on AI solutions that are traceable and auditable, so that they can be reviewed over time to assess whether they are drifting off course. For that to happen, algorithms must be transparent and explainable.

“What did we all learn in math class? Show your work. That’s what the teacher always said,” McCall explained. “I think all algorithms are going to have to stand in front of the teacher and kind of show their work. I think what’s going to change is who are they showing it to.”

McCall was a panelist on the “How to Incorporate Robust Bioethics in AI Algorithms” panel, held during MedCity News’ INVEST Digital Health virtual conference this week. She was joined by Erich Huang, chief science & innovation officer of Onduo, a Verily Life Sciences company that has developed an app that helps people manage diabetes, among other conditions. Sharona Hoffman, a professor of law and bioethics at the Case Western Reserve University School of Law, moderated the session.

Huang, who earlier in his career was a physician at Duke University Medical Center, said that in medicine, clinicians have to assume bias exists everywhere. He pointed to pulse oximeters, devices placed on the fingertip to estimate the pulse rate and oxygen saturation of the blood. Dark skin decreases the accuracy of the reading. The key is to understand the bias so that clinicians can work with it, or around it. If an algorithm is biased in a particular setting but performs satisfactorily when its use is narrowed, it can still be used.

Clinicians already make such choices in prescribing decisions. As an example, he pointed to the prescription of blood pressure-lowering ACE inhibitors. If some patients don’t do well on these drugs, “it doesn’t mean I throw out ACE inhibitors as a medication for hypertension. It just means that I’ll try to confine the use of ACE inhibitors to those who may respond and don’t have side effects. We need to know what the side effect profile of an algorithm is so that we can try to avoid using it in populations for whom those side effects are detrimental.”

McCall’s company has turned its approach of AI transparency into national recognition. In April, the Austin, Texas-based company was judged the winner of the Centers for Medicare and Medicaid Services health outcomes challenge, beating out industry giants such as IBM and Geisinger in the task of developing an algorithm that predicted unplanned hospital admissions and adverse events. Part of what the judges were looking for was the ability to assess an algorithm for bias.

What set ClosedLoop’s submission apart was explainability, McCall said. More than just producing a risk score, the startup was able to demonstrate and quantify the specific risk factors used by the algorithm that gave rise to any particular score. That feature offers visibility into the inner workings of the algorithm, so that data scientists and clinicians can see why it is producing certain answers. McCall said that this transparency and explainability will be key as value-based care becomes more widely adopted. These models will use algorithms, but if the algorithms are wrong or biased, the problems that value-based care is trying to solve could become worse.
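The idea McCall describes — a risk score accompanied by the specific factors that drove it — can be illustrated with a minimal sketch. Everything here (feature names, weights, baseline values, the patient) is hypothetical and assumes a simple linear model; production explainability typically relies on richer techniques such as Shapley-value attribution, but the principle is the same: decompose a score into per-factor contributions.

```python
import numpy as np

# Hypothetical risk factors and fitted linear-model weights (illustrative only)
features = ["age", "prior_admissions", "hba1c", "days_since_last_visit"]
weights = np.array([0.02, 0.60, 0.35, -0.01])
baseline = np.array([65.0, 1.0, 7.0, 90.0])  # population means

def explain_score(x):
    """Decompose a linear risk score into per-feature contributions
    measured relative to the population baseline."""
    contributions = weights * (x - baseline)
    score = float(contributions.sum())
    # Rank factors by the magnitude of their contribution to this score
    ranked = sorted(zip(features, contributions), key=lambda t: -abs(t[1]))
    return score, ranked

# A hypothetical patient: older, repeatedly admitted, elevated HbA1c
patient = np.array([78.0, 4.0, 9.5, 10.0])
score, factors = explain_score(patient)
```

Because the contributions sum exactly to the score, a reviewer can see not just that a patient is high risk but which factors are responsible — the property that made the submission auditable.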

Even if an algorithm has been well designed, its ability to avoid bias could change as circumstances change. As an example, Huang said that an algorithm developed for one health system could be less useful if that system merges with another. The change in the characteristics of the patient population in the combined entity changes the performance of the algorithm. New patient populations flowing into an algorithm happen more often than people talk about, Huang said.

“It’s mostly unrecognized because people haven’t looked for it,” he said. “We know it’s a real phenomenon in the machine learning world.”
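The phenomenon Huang describes — a quietly shifting patient population degrading a model — is what drift monitoring looks for. One common, simple check is the population stability index (PSI), sketched below on synthetic data; the feature, the populations, and the 0.25 threshold are illustrative rules of thumb, not a standard from the panel.

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """PSI: compare a feature's distribution at training time ('expected')
    against its distribution in current traffic ('actual')."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Floor the percentages to avoid log(0) for empty bins
    e_pct = np.clip(e_pct, 1e-6, None)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
train_ages = rng.normal(55, 10, 5000)   # population the model was built on
merged_ages = rng.normal(68, 12, 5000)  # older population after a merger
psi = population_stability_index(train_ages, merged_ages)
# A common rule of thumb: PSI above ~0.25 signals a major shift
# worth investigating or retraining for
```

Running such a check per feature on incoming data is one way to notice the "mostly unrecognized" drift before it shows up as degraded — or newly biased — predictions.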

The regulatory framework for algorithms is still in development. Rather than issue strict rules, the FDA has issued guidance, as it regards algorithms as “software as a medical device,” Huang said. He added that some level of regulation is needed to give the medical community confidence in the technology. McCall said that the degree of scrutiny given to an algorithm will depend on how it is used. Regardless, she said that companies can’t put forth algorithms that are “black boxes” whose workings are unexplained or not understood.

Part of the reason that bias persists in healthcare data is that the data are so siloed, Huang said. That means an algorithm may not directly measure what you want it to measure. McCall noted that a tsunami of data is coming as health care becomes more digitized. That’s leading some people to see data as a burden. But McCall believes the perception will change; in about five years, data will “go from burden to bonanza.”

“I actually don’t think it’s a question of whether AI is going to come. It is going to come,” she said. “We’re going to need it to sustain value-based care. We are at a crisis of affordability and access and equity that we have got to solve. Value-based care can help solve it, and you need algorithms to do it, and they have to be unbiased. So, giddy up.”