AI was supposed to save health care. What if it makes it more expensive?

NEW YORK — Last year, Mount Sinai Hospital switched on an artificial intelligence program to search the hospital’s records for evidence of malnourished patients in its wards. The numbers it turned up were eye-popping: 20 percent more cases were diagnosed than in the previous year.

Around the same time, Barbara Murphy, chief of the renowned health system’s department of medicine, was helping to develop another AI program, to predict whether diabetic patients are at near-term risk of kidney disease and to help prioritize specialist visits for those who are. One of the early findings, according to Murphy: “We probably need some more nephrologists.”

As hospital systems around the country unleash machine learning algorithms — computer models that function like millions of unblinking eyes inspecting patient records — such findings are becoming more common. The algorithms, deployed in hospitals over the past couple of years, are often designed to help locate the sickest patients, but in some cases, they also provide more opportunities to bill.

“One of the ways that we get reimbursed for stays is being able to treat and document what’s going on,” said Robbie Freeman, Mount Sinai’s vice president for clinical innovation. “We can document that malnutrition. It means the hospital will get credit for that patient and paid for the care.”

The specialists who are introducing machine learning and other tools say they should make health care more efficient by enabling better identification of riskier problems before they become tragic, expensive conditions. For example, Murphy’s algorithm could help doctors intervene to prevent diabetics from needing dialysis.

But there’s a big risk of overdiagnosis and recommendations for procedures or drugs that might not be necessary, said Kenneth Mandl of Harvard Medical School, who created a new term — “biomarkup” — for the tendency of new detection to spur more spending in medicine. Marketing and innovation fever tend to stimulate the use of tests that indicate risks for diseases like cancer and stroke before there is reliable data showing the tests lead to improved patient outcomes, he said.

As algorithms point out health risks that haven’t always been captured in the course of normal medicine, “the risks of overdiagnosis go up, and so can the risks of potential harms,” David Kent, a doctor and researcher at Tufts University, said at a recent FDA-sponsored event.

A recent study by Google scientists, for example, showed that one of the company’s computer models was better than doctors at reading mammograms to detect breast cancers. Speedy algorithms like the one Google deployed will allow radiologists to work faster and lead to more positive findings, Dell Medical School dermatologist Adewole Adamson wrote recently in the New England Journal of Medicine.

But the problem is, the algorithm won’t know which of those growths require attention or treatment, Adamson wrote. So, more biopsies will probably be ordered, with the possibility of more overdiagnosis and higher spending.

Hospital systems around the country are testing machine-learning algorithms that detect everything from smoldering blood infections or pneumonia to social issues that affect health, like poor housing and lack of access to transportation or nutritious food. How they deploy the programs may have a major impact on costs.

Many of the algorithms deliver a risk score — for instance, they might say a patient has a 60 percent chance of suffering a fall. Hospitals can then display the risk number on clinicians’ computer screens in green, yellow or red, depending on what level of risk they decide is worth addressing.
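To make the color coding concrete, here is a minimal sketch in Python of how a risk score might be mapped to a display color. The function name `risk_color` and the cutoffs of 30 percent and 60 percent are assumptions chosen for illustration. They are not values used by any hospital named in this article.

```python
def risk_color(score, yellow_at=0.30, red_at=0.60):
    """Map a predicted risk (0.0 to 1.0) to a display color.

    The default cutoffs are hypothetical; each hospital would pick its own.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability between 0 and 1")
    if score >= red_at:
        return "red"
    if score >= yellow_at:
        return "yellow"
    return "green"


# A patient with a 60 percent predicted chance of a fall
print(risk_color(0.60))
```

Under these assumed cutoffs, the 60 percent fall risk from the example above would show up in red. Raising or lowering `red_at` changes how many patients land in that category.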

Setting those thresholds is a delicate matter. If a “red” score is set too low, doctors will be flooded with alerts and probably ignore them. If the risk threshold is too high, clinicians run the risk of missing a serious condition that could harm a patient — and potentially expose their institution to legal jeopardy.

“There is always a tradeoff,” said Michelle Gong, chief of critical care at the Montefiore Health System in the Bronx. “You have to select a level of sensitivity you think is safe but doesn’t create too many alarms.”
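The tradeoff Gong describes can be sketched with a few lines of Python. The example below is purely illustrative: the eight risk scores and outcomes are made-up toy data, and the helper name `alert_tradeoff` is an assumption, not part of any hospital's system. It counts how many alerts a given cutoff would fire and what share of patients who actually had the event it would catch (the sensitivity).

```python
def alert_tradeoff(scores, had_event, threshold):
    """Return (sensitivity, number_of_alerts) for a given alert threshold."""
    # An alert fires for every patient whose score meets the threshold.
    alerts = [s >= threshold for s in scores]
    # True positives: alerted patients who really had the event.
    true_pos = sum(1 for a, e in zip(alerts, had_event) if a and e)
    total_events = sum(had_event)
    sensitivity = true_pos / total_events if total_events else 0.0
    return sensitivity, sum(alerts)


# Toy data, invented for illustration only.
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.70, 0.90]
had_event = [False, False, False, True, False, False, True, True]

for threshold in (0.3, 0.6):
    sensitivity, n_alerts = alert_tradeoff(scores, had_event, threshold)
    print(f"threshold={threshold}: {n_alerts} alerts, sensitivity={sensitivity:.2f}")
```

With this toy data, the lower cutoff catches all three events but fires five alerts, while the higher cutoff fires only two alerts and misses one event. Real deployments would weigh this on far larger datasets.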

Liability and safety concerns could push hospital systems to set thresholds to detect more cases, leading to more medical intervention and more billing, because “each false negative could represent a loss of life,” said Keith Dreyer, chief data science officer for Partners HealthCare. Bad drivers kill tens of thousands of people a year, but a single death caused by a driverless car in 2018 led Uber to suspend its testing of the vehicles, Dreyer noted.

Not everyone agrees that algorithms inevitably will drive up health spending. If algorithms are increasing costs, they probably aren’t being implemented correctly, said Eric Topol, director of the Scripps Research Translational Institute.

Connie Lehman, chief of breast imaging at Massachusetts General Hospital, said algorithms she has been testing there could reduce unnecessary consults by eliminating more false positives that could lead to sonograms and biopsies. If they were widely used, “you could immediately see lower costs,” she said.

In short, say Topol and others — including several AI leaders interviewed at four New York City-area hospital systems recently — the successful use of algorithms depends on so much more than just good technology. The industry’s uncertainty over how to deploy them is a major reason they haven’t had much impact on health care yet, said Parsa Mirhaji, director of the Center for Health Data Innovations at Montefiore.

“The coding is the easy part,” said Charles Esenwa, a neurologist at Montefiore who is trying to assess stroke risks in the hospital’s population with an algorithm that accounts for socioeconomic issues like housing and access to healthy food. Scientists need to work with accurate data that reflects the population they are working with, he said.

Yindalon Aphinyanaphongs, who leads a team implementing AI at New York University’s Langone Health System, sees it as a four-step process: building a model, assuring it delivers information to clinicians effectively, convincing doctors to use the information and making sure it benefits patients.

Getting buy-in from doctors is a big part of the process. They’ve already been burned by difficult-to-use electronic health records, which promised more progress than they delivered, noted Andrew Racine, chief medical officer at Montefiore.

The “black box” aspect of machine learning — the fact that computer models may make predictions based on calculations that are too complex to explain — could make it challenging to communicate the findings to clinicians. Health systems have taken different approaches to communicating algorithm results.

At Mount Sinai, the clinician gets a drop-down list of vital signs or other data that went into the computer’s decision-making process, according to Freeman. But at Montefiore, an algorithm for acute respiratory distress syndrome — a severe lung condition often overlooked — doesn’t convey the underlying data to doctors. So much information goes into the model — including 35 different measures — that it would only confuse clinicians, said Gong. Instead, hospital staff are instructed to respond to red alerts for severe cases by calling her critical care team for a consultation.

Doctors should get accustomed to not knowing the “why” behind an algorithm result, informaticist Blackford Middleton said at a medical informatics event in Washington.

“You’d never expect a pilot to understand every sensor on a plane,” he said. “Who can understand all the proteins and metabolites at the same time?”

For the most part, AI right now is being used as a kind of electronic triage, helping clinicians sort out the cases that require the most attention. At best, it may help them notice patterns or prod them to alter treatments in ways that could be lifesaving.

Gong has a federal grant to evaluate whether machine learning can help identify patients with acute respiratory distress. When doctors misdiagnose it as typical pneumonia, their treatments can aggravate the disease or even kill the patient.

Elsewhere, algorithms operate like electronic nags: At NewYork-Presbyterian, for example, the hospital uses AI to crawl through data and find anomalies in patient care, then nudges staff about them. One algorithm helps smooth patient discharges by reporting when a patient needs physical therapy and an MRI before going home, said Daniel Barchi, the system’s chief information officer. Better coordination can get a patient out of the hospital faster, saving money and, hopefully, getting them on the road to recovery faster, he said.
