A new study out from Google seems to show the promise of AI-assisted healthcare. Actually, it shows the threat.

Google researchers made headlines early this month for a study that claimed their artificial intelligence system could outperform human experts at finding breast cancers on mammograms. It sounded like a big win, and yet another example of how AI will soon transform healthcare: More cancers found! Fewer false positives! A better, cheaper way to provide high-quality medical care!
Hold on to your exclamation points. Machine-enabled healthcare may bring us many benefits in the years to come, but those will be contingent on the ways in which it's used. If doctors ask the wrong questions to begin with, putting AI to work pursuing faulty premises, then the technology will be a bust. It could even serve to amplify our earlier mistakes.
In a sense, that's what happened with the recent Google paper. It's trying to replicate, and then exceed, human performance on what is at its core a deeply flawed medical intervention. In case you haven't been following the decades-long controversy over cancer screening, it boils down to this: When you subject symptom-free people to mammograms and the like, you'll end up finding a lot of things that look like cancer but will never threaten anyone's life. As the science of cancer biology has advanced, and screening has become widespread, researchers have learned that not every tumor is destined to become deadly. In fact, many people harbor indolent forms of cancer that do not actually pose a risk to their health. Unfortunately, standard screening tests have proven most adept at finding precisely these indolent tumors, the slower-growing ones that would better be ignored.
This might not be so bad, in theory. When a screening test uncovers harmless cancer, you can just ignore it, right? The problem is, it's almost impossible to know at the time of screening whether any particular lesion will end up dangerous or no big deal. In practice, most doctors are inclined to treat any cancer that's discovered as a potential threat, and the question of whether or not mammograms actually save lives is a matter of intense debate. Some studies suggest they do; others find that they don't. But even if we take the rosiest interpretations of the literature at face value, the number of lives saved by this massive, widespread intervention is small. Some researchers have even calculated that mammography is, on balance, bad for patients' health; i.e., that its aggregate harms, in terms of the excess treatment it inspires and the tumors brought on by its radiation, outweigh any benefits.
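To see why screening symptom-free people generates so many alarms, it helps to run the arithmetic. The numbers below are illustrative assumptions, not figures from the Google study or from any particular trial; they simply show that when a disease is rare in the screened population, even a fairly accurate test yields mostly false alarms:

```python
# Hypothetical screening numbers, chosen for illustration only.
prevalence = 0.005    # assume ~5 cancers per 1,000 people screened
sensitivity = 0.90    # P(test positive | cancer present)
specificity = 0.91    # P(test negative | no cancer)

# Among everyone screened, split the positive results into
# true positives (real cancers caught) and false positives
# (healthy people flagged anyway).
true_pos = prevalence * sensitivity
false_pos = (1 - prevalence) * (1 - specificity)

# Positive predictive value: the chance that a positive result
# actually reflects cancer.
ppv = true_pos / (true_pos + false_pos)
print(f"PPV = {ppv:.1%}")  # a small minority of positives are real
```

With these assumed inputs, fewer than one in ten positive results reflects an actual cancer, and that is before the overdiagnosis problem, since many of the "true" positives are indolent tumors that would never have caused harm.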
In other words, AI systems like the one from Google promise to combine humans and machines to facilitate cancer diagnosis, but they also have the potential to worsen pre-existing problems such as overtesting, overdiagnosis, and overtreatment. It's not even clear whether the improvements in false-positive and false-negative rates reported this month would apply in real-world settings. The Google study found that AI performed better than radiologists who were not specifically trained in examining mammograms. Would it come out on top against a team of more specialized experts? It's hard to say without a trial. Furthermore, most of the images assessed in the study were created with imaging devices made by a single company. It remains to be seen whether these results would generalize to images from other machines.
The problem goes beyond just breast-cancer screening. Part of the appeal of AI is that it can scan through reams of familiar data and pick out variables that we never realized were important. In principle, that power could help us to diagnose any early-stage disease, in the same way the subtle squiggles of a seismograph can give us early warnings of an earthquake. (AI helps there, too, by the way.) But sometimes those hidden variables really aren't important. For instance, your dataset might be drawing from a cancer screening clinic that is only open for lung cancer tests on Fridays. As a result, an AI algorithm could decide that scans taken on Fridays are more likely to show lung cancer. That trivial relationship would then get baked into the formula for making further diagnoses.
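A quick simulation makes the Friday effect concrete. Everything below is invented for illustration: a synthetic clinic that schedules lung-cancer scans only on Fridays. The day of the week carries no biological meaning, yet positive results cluster on Fridays purely because of scheduling, which is exactly the kind of spurious correlation a model trained on this data could latch onto:

```python
import random

random.seed(0)

# Hypothetical synthetic data. Lung-cancer scans, which have a
# higher positive rate than routine scans, are booked only on
# Fridays; every other day sees routine scans alone.
records = []
for _ in range(10_000):
    day = random.choice(["Mon", "Tue", "Wed", "Thu", "Fri"])
    if day == "Fri":
        scan_type = random.choice(["lung", "routine"])
    else:
        scan_type = "routine"
    base_rate = 0.15 if scan_type == "lung" else 0.02
    positive = random.random() < base_rate
    records.append((day, positive))

def positive_rate(keep_day):
    subset = [pos for day, pos in records if keep_day(day)]
    return sum(subset) / len(subset)

friday_rate = positive_rate(lambda d: d == "Fri")
other_rate = positive_rate(lambda d: d != "Fri")

# A model shown only (day, result) pairs would learn that Fridays
# are "riskier," even though the effect is pure scheduling.
print(f"P(positive | Friday)     = {friday_rate:.3f}")
print(f"P(positive | other days) = {other_rate:.3f}")
```

The correlation is real in the data but meaningless in the clinic; deploy that model somewhere with a different booking calendar and its Friday "insight" becomes a liability.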