CancerGPT – This Proposed LLM-Powered Model Will Predict Drug Pair Synergies
In recent times, foundation models, or ‘generalist’ models, have established themselves as dominant forces in artificial intelligence. Trained on gigantic datasets, these models can handle a wide variety of tasks with little or no task-specific training. They are versatile, with strong capabilities in language processing, computer vision, visual comprehension, and more.
Although Large Language Models (LLMs) have proven adept at few-shot learning in a number of applications, such as robotics, computer vision, and natural language processing, a comprehensive investigation of how well they generalize to unseen problems in more complex fields like biology is still pending.
According to researchers from the University of Texas, the University of Massachusetts Amherst, and the University of Texas Health Science Center, LLMs, which typically acquire knowledge from unstructured literature, could offer a novel approach to biological prediction problems where sample sizes are small and structured data is absent.
Drug Pair Synergy & Combination Therapy
In the field of few-shot biological prediction, a common yet vital issue is the prediction of drug pair synergy in different kinds of cancer that have not been thoroughly studied.
Today, it is common practice to use combinations of drugs to treat diseases that are challenging to treat, such as cancer, infectious diseases, and neurological disorders. In many cases, combination therapy produces better therapeutic outcomes than single-drug therapy. Predicting the efficacy of drug combinations has become a major focus of research in the discovery and development of new medications.
Drug pair synergy describes how combining two drugs produces a more potent therapeutic effect than either drug does on its own.
Predicting the synergy of a drug pair is difficult because of the vast number of possible combinations and the complexity of the underlying biological systems.
Machine Learning in Drug Prediction
To predict drug pair synergy, a number of computational methods have been developed, most notably using machine learning. For some tissues, however, such as bone and soft tissue, only a small quantity of experimental data is available.
In contrast, the majority of data covers prevalent cancer types in specific tissues, such as cancers of the breast and lungs. Machine learning models that rely on large datasets may therefore struggle to train on the data-poor tissues.
Early studies generalized the synergy score to cell lines in other tissues using relational or contextual information, disregarding the molecular and cellular differences between these tissues.
Another line of research has focused on narrowing the gap between tissues by using diverse, high-dimensional information, such as chemical or genetic profiles.
In this work, the researchers seek to address the challenge described above using LLMs. They claim that despite the lack of structured data and the inconsistent features, the scientific literature still offers helpful details on many types of cancer.
Manually collecting predictive information on such biological entities from the literature is difficult. Their approach instead draws on the prior knowledge from scientific publications that is encoded in LLMs.
The few-shot drug pair synergy prediction model they developed converts the prediction task into a natural-language inference task and generates answers based on the knowledge contained in LLMs.
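To make the idea of recasting a tabular prediction as a natural-language question more concrete, here is a minimal Python sketch. It is an illustration under assumptions, not the researchers' implementation: the record fields, the prompt wording, and the helper names (`DrugPairRecord`, `row_to_prompt`, `build_few_shot_prompt`) are all hypothetical, and the drug and cell line values are placeholders rather than real data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DrugPairRecord:
    """One tabular row. The field names are hypothetical, for illustration only."""
    drug_a: str
    drug_b: str
    cell_line: str
    tissue: str


def row_to_prompt(row: DrugPairRecord) -> str:
    """Render a tabular row as a yes/no question in plain English."""
    return (
        f"The first drug is {row.drug_a}. The second drug is {row.drug_b}. "
        f"The cell line is {row.cell_line}. The tissue is {row.tissue}. "
        "Is this drug pair synergistic? Answer yes or no."
    )


def build_few_shot_prompt(
    examples: List[Tuple[DrugPairRecord, bool]],
    query: DrugPairRecord,
) -> str:
    """Prepend k labeled examples to the query; an empty list gives a zero-shot prompt."""
    parts = []
    for row, is_synergistic in examples:
        answer = "yes" if is_synergistic else "no"
        parts.append(f"{row_to_prompt(row)}\nAnswer: {answer}")
    # The final block is left open so a language model can complete the answer.
    parts.append(f"{row_to_prompt(query)}\nAnswer:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    # Placeholder values, not real experimental data.
    labeled = [(DrugPairRecord("drug_p", "drug_q", "cell_line_0", "bone"), True)]
    query = DrugPairRecord("drug_x", "drug_y", "cell_line_1", "bone")
    print(build_few_shot_prompt(labeled, query))
```

The resulting string would be passed to a language model, whose yes/no completion is read as the synergy prediction. Passing an empty list of examples corresponds to the zero-shot setting, and a handful of labeled rows to the few-shot setting.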
CancerGPT – proposes a few-shot learning approach based on LLMs to predict the synergy of drug pairs in rare tissues that lack structured data and features.
CancerGPT (~124M parameters) is comparable to a larger fine-tuned GPT-3 model (~175B) on drug pair synergy prediction.… pic.twitter.com/i25OH2NZyQ
— elvis (@omarsar0) April 24, 2023
According to their experimental results, their few-shot LLM prediction model outperformed robust tabular prediction methods in the majority of cases and was remarkably accurate even under zero-shot conditions.
This strong few-shot performance on one of the most challenging biological prediction tasks matters to the wider biomedical community, since it shows the high potential of “generalist” biomedical artificial intelligence.