Artificial intelligence is being applied recklessly to analyze data in some areas of biomedical research, leading to inaccurate findings, a leading US computer scientist and medical statistician warned on Friday.
"I would not trust a very large fraction of the discoveries currently being made using machine learning techniques applied to large datasets," said Genevera Allen of Baylor College of Medicine and Rice University at the annual meeting of the American Association for the Advancement of Science.
Machine learning is a form of AI that is widely used to find patterns and associations in scientific and medical data, for example between genes and diseases. In precision medicine, researchers look for groups of patients with similar DNA profiles, so that treatments can be targeted to their particular genetic form of the disease.
"Many of these techniques are designed to always make a prediction," Dr Allen said. "They never come back with 'I don't know' or 'I haven't discovered anything', because they are not designed to."
She was reluctant to point a finger at individual studies, but said uncorroborated findings from recently published machine learning analyses of cancer data were a good example.
"There are cases where discoveries cannot be reproduced," Dr Allen said. "The clusters discovered in one study are quite different from the clusters found in another. Why? Because most machine learning techniques today always say: 'I found a group.' Sometimes it would be much more useful if they said: 'I think some of these are really grouped together, but I'm unsure about these others.'"
When machine learning identifies a particular link between patients' genes and a trait of their illness, human scientists can then construct a scientific rationale for the discovery. But that does not necessarily mean that it is correct.
"There is always a story that you can construct to show why a particular group of genes are grouped together," Dr Allen said.
Computer scientists are only now beginning to appreciate the problem, which threatens to lead medical researchers down false paths and waste resources on trying to confirm results that cannot be reproduced.
Dr Allen and colleagues are trying to improve statistical techniques and machine learning technology so that AI can critique its own data analysis and indicate the likelihood that a particular finding is genuine rather than a random association.
"One idea is to deliberately perturb the data to see whether the results survive the disturbance," she said.