Gradient boosting machines (GBMs) and related machine learning methods are at the forefront of artificial intelligence (AI) and data science applied to the life sciences and medicine. Gradient boosting is a family of learning techniques that builds a predictive model by sequentially combining the outputs of multiple weaker models, typically decision trees. During a session at Analytica in Munich, Germany, experts discussed the use of GBM in life sciences and medicine.
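To make the idea of sequentially combining weak models concrete, here is a minimal sketch of gradient boosting for regression. It assumes squared-error loss, a single numeric feature, and depth-1 decision trees (stumps). All names (`fit_stump`, `fit_gbm`, `predict_gbm`, `GBMModel`) are hypothetical and chosen for illustration; this is not the software used by any of the speakers.

```python
from dataclasses import dataclass
from typing import List, Tuple

# A stump is (threshold, value_if_x_le_threshold, value_if_x_gt_threshold).
Stump = Tuple[float, float, float]


@dataclass
class GBMModel:
    base: float            # initial constant prediction (mean of targets)
    learning_rate: float   # shrinkage applied to every stump
    stumps: List[Stump]    # weak learners, in the order they were fitted


def fit_stump(xs: List[float], residuals: List[float]) -> Stump:
    """Pick the single split on x that minimizes squared error on residuals."""
    best, best_err = None, float("inf")
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must leave points on both sides
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if err < best_err:
            best, best_err = (threshold, lm, rm), err
    if best is None:  # all x values identical: fall back to a constant
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best


def predict_stump(stump: Stump, x: float) -> float:
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_gbm(xs: List[float], ys: List[float],
            n_rounds: int = 50, learning_rate: float = 0.1) -> GBMModel:
    """Each round fits a stump to the current residuals, which are the
    negative gradient of the squared-error loss, then adds a shrunken
    copy of that stump to the ensemble."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps: List[Stump] = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return GBMModel(base, learning_rate, stumps)


def predict_gbm(model: GBMModel, x: float) -> float:
    return model.base + model.learning_rate * sum(
        predict_stump(s, x) for s in model.stumps)


def mse(ys: List[float], preds: List[float]) -> float:
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


if __name__ == "__main__":
    # Toy data (made up for illustration): y = x^2 on ten points.
    xs = [float(i) for i in range(10)]
    ys = [x * x for x in xs]
    model = fit_gbm(xs, ys)
    print("baseline MSE:", mse(ys, [model.base] * len(ys)))
    print("boosted MSE: ", mse(ys, [predict_gbm(model, x) for x in xs]))
```

No single stump fits the curve well, but each one corrects part of the error left by its predecessors, so the training error of the ensemble falls below that of the constant baseline. Production libraries add deeper trees, regularization, and other loss functions on top of this basic loop.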
The first lecture of this session was presented by Bing Zhang of Baylor College of Medicine in Houston, Texas, and was titled “Leveraging Artificial Intelligence to Illuminate the Dark Phosphoproteome.” The presentation addressed the challenge of analyzing and interpreting mass spectrometry-based phosphoproteomic data effectively. Zhang’s team used machine learning (ML) and deep learning (DL) methods to improve the analysis of these data, with the goal of understanding the so-called “dark phosphoproteome.” Specifically, they developed the DeepRescore2 software, which uses deep learning-based predictions of retention time and fragment ion intensity to improve phosphopeptide identification and phosphosite localization. Zhang also discussed the IDPpub computational pipeline, which leverages the BioBERT language model to extract phosphorylation sites from biomedical abstracts, facilitating the identification of the regulatory enzymes and biological functions of phosphosites.
The second lecture of this session was presented by Lennart Martens from the life sciences research institute VIB and Ghent University in Belgium. The presentation was titled “Machine Learning-Based Spotlights to Illuminate Precision Medicine” and focused on integrating machine learning models into mass spectrometry-based proteomics. Martens highlighted the significant improvement in identification performance achieved when machine learning models such as MS2PIP and DeepLC are coupled with the MS2Rescore variant of the Percolator rescoring engine. These models improve information retrieval from proteomic data, providing new insights into the underlying biology and pathology encoded in existing datasets. Martens also pointed to the potential for machine learning models to reveal detailed information about molecular pathologies and to map protein activity on a proteome scale, which could have implications for precision medicine.
The third presentation, by Fan Liu from the Leibniz-Forschungsinstitut für Molekulare Pharmakologie (FMP) in Berlin, Germany, was titled “Development of Structural Interactomics and Its Application in Cell Biology” and focused on proteome-wide cross-linking mass spectrometry as a way to capture protein interactions and molecular spatial arrangements. Liu highlighted advances in experimental methods and software tools that have generated extensive data on protein-protein interactions (PPIs) in several biological systems. These data provide insight into protein localizations, interactions, and subcellular architectures, and they serve as valuable training data for AI-based methods that identify PPIs and the specific amino acid sequences or structural features of proteins that play a crucial role in mediating the binding of proteins to each other.
The final lecture of the session was given by AP Gamiz-Hernandez of Stockholm University in Sweden, who presented “Overview of the Molecular Principles of Protein Function and Disease.” The talk addressed the energy metabolism of cells and the challenges in understanding the energy transduction mechanism of the oxidative phosphorylation (OXPHOS) complexes, which are located in the inner mitochondrial membrane and are responsible for generating adenosine triphosphate (ATP) for cellular energy. Gamiz-Hernandez discussed combining molecular dynamics simulations with machine learning models to predict structure-based chemical reactivity, such as pKa and redox potentials, in proteins. This approach aimed to identify key residues responsible for protein function and disease-related mutations, thereby providing insight into the molecular principles underlying protein function and disease mechanisms.