Even if big data and machine learning methods sometimes yield spectacular results, they do not make it possible to understand the mechanisms involved. Those can only be elucidated by fundamental research, which remains essential to scientific progress. This is the reminder issued in an editorial published in the journal Science Signaling by Holden Thorp, editor-in-chief of Science, and Michael Yaffe, editor of Science Signaling.
“Faced with the accumulation of vast scientific data banks and new [computational and statistical] methods for analyzing this big data, it is tempting to believe that the main advances in biomedical science will come mainly from translating these troves of information directly into applications for health care, for agriculture and for strategies to counter climate change, rather than from discoveries generated by basic research,” write Thorp and Yaffe. At the same time, they point out that the main national, international and private funding agencies seem to allocate a substantial share of their grants to this approach aimed at developing immediate practical applications.
Both are concerned about the future of fundamental research, which aims to increase our knowledge of the world and to discover the fundamental mechanisms at play in the phenomena we observe, mechanisms which can then be exploited to develop biomedical applications.
Long-term scientific losses
They argue that it remains important to invest in basic science, even though many contend that scientific research should focus on more pragmatic concerns, especially at this time, given the incredible potential of artificial intelligence (AI) technologies and the abundance of scientific data — all in order to respond to the monumental challenges facing our species, if not the entire planet. They “warn against research that places too much emphasis on short-term technological gain, which would lead to long-term scientific losses”.
“The greatest advances in science are still the result of proven research methods,” they emphasize, giving the example of Paxlovid, the antiviral used against COVID-19, which was developed thanks to our understanding of viral enzymology and traditional medicinal chemistry, as well as that of cancer immunotherapy, which was developed from new knowledge about immunology acquired through basic research.
The use of advanced machine learning methods, such as deep learning, in the biological sciences has revealed how deficient our fundamental understanding still is, they say.
Vital basic research
The AlphaFold and RoseTTAFold programs can accurately predict the three-dimensional structure of a protein from its amino acid sequence using deep learning methods: an astonishing feat that humans have never achieved, even though the underlying physico-chemical principles have been known since the 1950s, they report.
“This example shows us that there are very fundamental aspects of the protein folding process that we don’t yet understand. Continuing basic research aimed at understanding this process is vital if we are to close the gap between the prediction made by AI and our scientific understanding,” they write.
Another example: machine learning is better than doctors at detecting pathologies in mammography, chest X-ray and CT scan images. “But what these approaches cannot do adequately is explain exactly what the computer sees when it makes a diagnosis or a classification.”
And machine learning could not have predicted that a coronavirus, which had been the subject of basic research since the 1960s, would become the pathogen posing the greatest threat to humans in the past hundred years. Nor could it have predicted that mRNA vaccines would protect us from it. The fruits of fundamental scientific research have never been more crucial than in this episode, they point out.
In which direction to look
“If we have RNA vaccines today, it is largely thanks to the fundamental research carried out by the Hungarian scientist Katalin Karikó, who, 25 years ago, sought to understand how mRNA works, at a time when it was not in fashion. […] The results of basic research are unpredictable. Maybe they won’t lead anywhere, maybe the answer will be negative, but it is important to know that, because it tells us to stop looking in that direction,” adds Yves Gingras, director of the Science and Technology Observatory.
Believing that, with algorithms based on deep machine learning and massive amounts of data, we will no longer need theories because the data will give us the answers “constitutes a kind of regression toward empiricism,” he argues.
“AI techniques are based on algorithms that only establish correlations, that only look for relationships between various elements in a mass of data. And once established, these relationships make it possible to predict future cases by induction and extrapolation, with a certain probability,” says the professor of history and sociology of science at UQAM.
“Making predictions is not explaining. If we were content merely to predict, we could just as well keep using the planetary model of Ptolemy’s epicycles [the Greek astronomer and mathematician of the second century CE], which is wrong but works. If the computer is supplied with enough epicycles, it does give good predictions of the positions of the planets as seen from Earth. But that is mere empirical prediction, of the kind that allowed even the Babylonians to predict eclipses without really understanding how they happen,” he says.
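Gingras’s epicycle analogy can be made concrete: an epicycle model is just a sum of circular motions, mathematically a truncated Fourier series, and fitting one to observed positions is pure curve-fitting with no physics involved. A minimal sketch in Python (the signal, frequency and noise level here are invented for illustration, not drawn from the article):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated "observed" longitude of a planet (radians): a slow drift
# plus a periodic wobble, with small measurement noise.
def true_signal(t):
    return 0.5 * t + 0.3 * np.sin(2.1 * t)

t_obs = np.linspace(0.0, 10.0, 200)
y_obs = true_signal(t_obs) + rng.normal(0.0, 0.01, t_obs.size)

# "Epicycle" basis: a deferent (constant + drift) plus harmonics at an
# assumed base frequency -- i.e. a truncated Fourier series.
def design_matrix(t, n_epicycles=3, omega=2.1):
    cols = [np.ones_like(t), t]
    for k in range(1, n_epicycles + 1):
        cols += [np.cos(k * omega * t), np.sin(k * omega * t)]
    return np.column_stack(cols)

# Least-squares fit: the model "learns" the correlations in the data
# without any theory of why the planet moves this way.
coef, *_ = np.linalg.lstsq(design_matrix(t_obs), y_obs, rcond=None)

# Extrapolate to future times: predictions stay accurate even though
# the model explains nothing about the underlying mechanism.
t_new = np.linspace(10.0, 12.0, 50)
y_pred = design_matrix(t_new) @ coef
max_err = np.max(np.abs(y_pred - true_signal(t_new)))
print(f"max prediction error: {max_err:.4f}")
```

The fit predicts future positions well, which is exactly Gingras’s point: predictive accuracy alone says nothing about whether the model reflects the true mechanism.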
“It took Kepler and Newton to really explain — through the formulation of a physical theory based on the attraction between planets revolving around the Sun — what Ptolemy, and before him the Babylonian scribes, were content merely to predict,” he recalls in a column published in the magazine Pour la science.
False predictions
Mr. Gingras also points out that “predictions [produced by AI methods] never work 100% of the time. They generally stagnate at around 80%; they are wrong in 20% of cases. So there is uncertainty.”
“As long as we do not know the mechanism, we cannot have full confidence,” he recalls. “On the other hand, we can trust, for example, the computers that today plan a trip to the Moon, because they apply Newton’s, Kepler’s and Einstein’s equations, which we know to be valid.”
The authors of the editorial published in Science Signaling do not condemn the use of AI, which they see rather as a technique for accelerating scientific discoveries. “Algorithms are technology, they’re useful, but they can’t replace science,” adds Mr. Gingras.
“More discoveries will emerge if we first rely on a better understanding of biology to guide data analysis, rather than naively expecting them to arrive the other way around,” conclude Thorp and Yaffe.