In a study published in the scientific journal PNAS, researchers from St John's College and the University of Cambridge report how they have used machine learning to identify the biological languages of cancer and Alzheimer's, as well as other neurodegenerative diseases. The researchers say that their eventual aim is to decipher the languages they identify in order to 'correct the grammatical mistakes inside cells that cause disease'.
Professor Tuomas Knowles, lead author of the paper and a Fellow at St John's College, said: "Bringing machine-learning technology into research into neurodegenerative diseases and cancer is an absolute game-changer. Ultimately, the aim will be to use artificial intelligence to develop targeted drugs to dramatically ease symptoms or to prevent dementia from happening at all."
In other words, the researchers trained a computer to recognize patterns of neurodegeneration and disease that scientists have already identified. First author Dr. Kadi Liis Saar explains: "The human body is home to thousands and thousands of proteins and scientists don't yet know the function of many of them. We asked a neural network-based language model to learn the language of proteins. We specifically asked the program to learn the language of shapeshifting biomolecular condensates -- droplets of proteins found in cells -- that scientists really need to understand to crack the language of biological function and malfunction that cause cancer and neurodegenerative diseases like Alzheimer's. We found it could learn, without being explicitly told, what scientists have already discovered about the language of proteins over decades of research." From this training, the model learns to pick out similar patterns that human researchers may have missed over the years.
The authors compare their technology to the algorithms that streaming services and social networks use to suggest the next episode, video, or friend. Dr. Saar comments: "We fed the algorithm all of the data held on the known proteins so it could learn and predict the language of proteins in the same way these models learn about human language and how WhatsApp knows how to suggest words for you to use. Then we were able to ask it about the specific grammar that leads only some proteins to form condensates inside cells. It is a very challenging problem and unlocking it will help us learn the rules of the language of disease."
The team says that machine learning is revolutionary because it can reveal connections in data that researchers have not yet thought to look for. To support future investigations, they have made their network open access.