A host of AI tools have been quickly developed to respond to the Covid-19 pandemic, from algorithms to screen lung x-rays for signs of Covid-19, to triage tools to predict which patients will become critically ill. But how do we know they are working as planned?
Transparency in how these tools are developed and their intended use is critical, experts said at a virtual panel at CES.
“You can’t just set a piece of software in front of somebody and say, ‘trust me,’ particularly if they need to make decisions based on that information that are going to be critical to the patient,” said Christina Silcox, a digital health policy fellow at the Duke-Margolis Center for Health Policy.
Currently, healthcare algorithms occupy a regulatory gray area: many are not regulated by the Food and Drug Administration, which removes a critical source of outside verification. Some are considered wellness products, and some types of clinical decision support software were exempted from regulation as medical devices under the 21st Century Cures Act.
But the FDA plans to take a closer look, it indicated on Tuesday. The agency shared a list of planned actions, including developing methods to evaluate and improve machine learning algorithms, and improving transparency to users. It also plans to take a second look at its regulatory framework, including how it will handle software that “learns” over time.
Many of these AI tools function as “black box” models, meaning that it’s not clear exactly how the software uses various inputs, such as images or claims data, to reach a conclusion.
“That’s really different from an individual user not knowing how all of the equipment in their office might work. Because in that case someone understood it and someone evaluated the risk with that understanding of how it works,” Silcox said. “In this case, even the developers don’t understand how the software is doing what it’s doing.”
In other cases, developers may be incentivized not to share exactly what goes into the secret sauce, particularly if their algorithm can’t be patented.
So, additional means are needed to ensure these tools are trustworthy, such as testing for performance. That performance data needs to be completely independent from the data a software tool was trained on, and should be representative of the patient population for which a product is intended.
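As a rough illustration of what independent performance testing can look like, the Python sketch below holds out a set of patients that a tool never sees during training and then measures sensitivity and specificity on that group alone. It is not any vendor's or regulator's method. The record layout, the `patient_id` field, and the function names are assumptions made for this example.

```python
import random


def split_by_patient(records, test_fraction=0.3, seed=0):
    """Split records so that no patient appears in both sets.

    `records` is a list of dicts with a hypothetical "patient_id" key.
    Splitting by patient (not by row) keeps the test data independent
    of the training data even when a patient has several records.
    """
    patient_ids = sorted({r["patient_id"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(patient_ids)
    n_test = max(1, int(len(patient_ids) * test_fraction))
    test_ids = set(patient_ids[:n_test])
    train = [r for r in records if r["patient_id"] not in test_ids]
    test = [r for r in records if r["patient_id"] in test_ids]
    return train, test


def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

A split like this covers only independence. Whether the held-out patients resemble the population the product is intended for still has to be checked separately, for example by comparing age ranges or care settings.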
Putting data into context
A big concern is ensuring tools are implemented in the right context. For example, a tool developed for a children’s hospital is unlikely to work at a hospital for adults, said Dr. Jesse Ehrenfeld, chair of the board of trustees for the American Medical Association.
In other cases, external changes to how data is collected can affect the efficacy of AI tools.
“I’ve built lots of clinical decision support systems in my career, and I can’t tell you the number of times where there was an external systems change or process change downstream that broke everything,” he said. “The data flow stopped or it wasn’t collected in the same way or it had a different meaning.”
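One simple way to catch the kinds of breakage Ehrenfeld describes is to compare each new batch of data against a reference batch collected when the tool was known to work. The Python sketch below is a minimal example under stated assumptions: the field name, the thresholds, and the function names are hypothetical, and a real deployment would need checks chosen with clinical input.

```python
from statistics import mean


def profile(records, field):
    """Summarize one numeric field: share of missing values and the mean."""
    values = [r.get(field) for r in records]
    present = [v for v in values if v is not None]
    return {
        "missing_rate": 1 - len(present) / len(values) if values else 1.0,
        "mean": mean(present) if present else None,
    }


def check_feed(reference, incoming, field,
               max_missing_increase=0.1, max_relative_mean_shift=0.5):
    """Return warnings if `incoming` looks different from `reference`.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    ref = profile(reference, field)
    new = profile(incoming, field)
    warnings = []
    # A jump in missing values can mean the data flow stopped
    # or the field is no longer being collected.
    if new["missing_rate"] - ref["missing_rate"] > max_missing_increase:
        warnings.append(f"{field}: missing rate rose to {new['missing_rate']:.0%}")
    # A large shift in the mean can mean the field changed meaning,
    # for example a switch in units.
    if ref["mean"] is not None and new["mean"] is not None and ref["mean"] != 0:
        shift = abs(new["mean"] - ref["mean"]) / abs(ref["mean"])
        if shift > max_relative_mean_shift:
            warnings.append(f"{field}: mean shifted by {shift:.0%}")
    return warnings
```

For instance, if a hypothetical temperature field switched from Celsius to Fahrenheit upstream, the mean would move from about 37 to about 98, and the second check would flag it. A check like this only signals that something changed; working out what changed still requires knowing how the data was captured.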
While developers should check whether the data they’re using is correct, complete, and actually relevant, additional context about where it is coming from is critical, said Pat Baird, senior regulatory specialist for Philips. With a plethora of tools available to run analytics, his concern is that important clinical context will be overlooked.
“What I’ve learned over the years and what I worry about the most when it comes to data quality is people focusing on getting the numbers and getting data, and skipping over the fact that they need some knowledge about how was that data captured, what is that source,” he said. “They need some context.”
Photo credit: Pixtum, Getty Images