FDA: Make medical apps reliable, not risky

By Michael L. Millenson, April 21, 2021



The fear of Covid-19 catapulted symptom checkers from the periphery of public attention to the center. Wearables like the Apple Watch, Fitbit, and Oura Ring have shown promise as Covid-19 early-warning systems, while apps like MyCovidRisk let you enter your location and other factors for a quick estimate of your odds of infection.

But the deserved praise for these devices shouldn’t obscure a troubling underside of health-related apps. Loose regulatory policies have allowed many thousands of consumer-facing apps to avoid oversight by claiming to be “low risk,” and they make it impossible to clearly assess the benefits and risks even of those approved by the Food and Drug Administration.

Several years ago, colleagues and I found evidence of wide variations in the functionality, accuracy, and safety of consumer-facing diagnostic apps. Now, new research is highlighting the health and financial consequences of a “wild west” marketplace and the growing need for government action.

Perhaps the most startling finding comes from a 2021 study comparing the advice from 15 online symptom checkers to recommendations from a random collection of nondoctors. What should have been a slam dunk turned into a draw: The apps fared no better than “an average layperson” in correctly deciding whether a set of symptoms justified emergency care, a nonemergency physician visit, or simply self-care. Only the five best apps were more accurate than “most” participants.


Moreover, although people’s greatest need may be support for appropriate self-care, “it is in that very situation where symptom checkers perform the worst,” the researchers found.

It’s scant consolation that symptom checkers erred on the side of declaring an emergency, unless your idea of a good time is a panicked drive to the hospital followed by handing over your credit card. Emergency care is expensive for consumers, with out-of-pocket deductibles that can range from $250 to $1,000 even for those who are insured. Emergency care is even more expensive for the health care system as a whole, particularly as out-of-network providers proliferate. On top of that are the health hazards individuals face from the pervasive overuse of diagnostic tests and drug prescribing in the emergency department.

Nonetheless, most symptom checkers are classified as “low risk” devices not needing FDA validation, although the agency reserves the right to review them later under what it calls “enforcement discretion.” The low-risk category seems to include any app whose fine print disclaimer says it is not providing medical advice, no matter how the app is typically used. It’s akin to those vitamin concoctions with large letters on the bottle front proclaiming “Youth Restorative” while tiny type on the back disavows making actual health claims.

Even apps for detecting skin cancer are often labeled low risk, as if Whac-A-Mole and analyze-a-mole were merely different varieties of smartphone games. But one of them is deadly serious. At a time when skin cancer deaths are rising, a review of studies of these apps’ diagnostic accuracy concluded they “cannot be relied upon.”

If pandemic stress has left you perturbed about your mental state, the news about popular behavioral health apps isn’t much better. An in-depth review concluded they “may have poor or questionable support from the [medical] literature.” Want to use an app to try to get pregnant or prevent pregnancy? Be aware that app developers “seldom involve health professionals or users in the design, development, or deployment.” Study after study consistently finds similar issues with health apps.

Devices undergoing FDA review do a better job of involving professionals and users, but there’s still a very big catch: There are no accepted standards for validating the sensors that examine your skin lesions or listen to your heartbeat. That means consumers can’t find reliable comparative information on the effectiveness of devices with different groups of people, or the margin of error of the results. Add in opacity about FDA regulatory decisions and the result, as a STAT investigation concluded, is that patients and doctors “know very little about whether [the devices] will work or how they might affect the cost and quality of care.”

The growing importance of digital health in a post-pandemic world may finally attract policymakers’ attention. In March, an international team of researchers, with support from the ABIM Foundation and AcademyHealth, proposed a framework for evaluating digital health devices. The authors acknowledged the “tremendous promise” for such devices “to improve public health and individual wellbeing,” then homed in on what they delicately labeled the “potential to exacerbate low-value care.”

That wonk-friendly wording opened the door for a close examination of the ways in which unreliable medical advice can both hurt patients’ health and unnecessarily cost them — and the health care system — large amounts of money. The authors called on the FDA to study the effect on patients of its light-touch review process, while asking health care systems to take responsibility for identifying “the small subset of effective and rigorously evaluated apps.”

With the pandemic accelerating our reliance on digital health, more urgent action is needed. “Innovation” is not a fig leaf the FDA should be using to cover up its inattention to the evidence about ineffective and potentially injurious apps. Now is the time for Congress and senior policymakers to order regulators to ensure that all consumer-facing apps are reliable, not risky.

Michael Millenson is the president of Health Quality Advisors LLC and an adjunct associate professor of medicine at Northwestern University Feinberg School of Medicine.

