Researchers Develop Highly Accurate Carbohydrate Binding Site Prediction Algorithm DeepGlycanSite

As the most abundant organic substances in nature, carbohydrates are essential for life. Carbohydrates interact with diverse protein families to modulate various biological processes, including immune response, cell differentiation and neural development. Understanding how carbohydrates regulate proteins in physiological and pathological processes presents opportunities to address crucial biological problems and develop new therapeutics. However, the diversity and complexity of carbohydrates pose a challenge in experimentally identifying the sites where carbohydrates bind to and act on proteins. Hence, the development of a reliable carbohydrate-binding site predictor is paramount in uncovering carbohydrate-protein interactions.
In a study published in Nature Communications on June 17, a research team led by CHENG Xi and WEN Liuqing from the Shanghai Institute of Materia Medica (SIMM) of the Chinese Academy of Sciences and WANG Dingyan from the Lingang Laboratory, along with collaborators, introduced a new carbohydrate-binding site predictor, DeepGlycanSite. This predictor substantially outperforms previous state-of-the-art methods and effectively predicts binding sites for diverse carbohydrates.
By incorporating geometric and evolutionary features of proteins into a deep equivariant graph neural network with a transformer architecture, DeepGlycanSite accurately predicts carbohydrate-binding sites on a given protein structure. On independent test sets comprising more than one hundred unique carbohydrate-binding proteins, the researchers compared DeepGlycanSite with state-of-the-art binding site predictors. DeepGlycanSite achieved an average Matthews correlation coefficient (MCC) and an average precision above 0.62, whereas all alternative methods had an average MCC and precision below 0.35. For monosaccharide- or disaccharide-binding site prediction, the average MCC and precision of DeepGlycanSite were more than twice those of the alternative methods. For oligosaccharide- or nucleotide-binding site prediction, its average MCC and precision remained above 0.60. Collectively, DeepGlycanSite performed well across various carbohydrate-binding site prediction tasks, highlighting its general applicability.
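For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how MCC and precision are commonly computed for per-residue binding-site predictions. It is not the authors' evaluation code. The function names and the ten-residue toy labels are hypothetical and chosen only for illustration; the study's actual labeling and averaging procedure is not described here.

```python
import math


def confusion_counts(true_labels, predicted_labels):
    """Count TP, FP, TN, FN for binary per-residue labels (1 = binding)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth and pred:
            tp += 1
        elif pred:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def precision(tp, fp):
    """Fraction of predicted binding residues that are truly binding."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0.0 when a margin is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy protein with 10 residues, 3 of which truly bind a carbohydrate.
true_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(true_labels, predicted_labels)
print(f"precision = {precision(tp, fp):.3f}")
print(f"MCC       = {mcc(tp, fp, tn, fn):.3f}")
```

MCC is often preferred over accuracy for binding-site prediction because binding residues are typically a small minority of a protein, and MCC accounts for all four cells of the confusion matrix.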
In addition, DeepGlycanSite can predict the specific binding site for a query carbohydrate. The researchers built a network model, DeepGlycanSite+Ligand, which processes the protein structure together with the two-dimensional chemical structure of the query carbohydrate, using extra modules to handle the ligand. DeepGlycanSite+Ligand could distinguish the specific binding sites of query carbohydrates from various classes, whereas previous state-of-the-art methods were ineffective at distinguishing mono-, di-, or oligosaccharide-binding sites.
As an example application, the researchers used DeepGlycanSite+Ligand to identify the specific carbohydrate-binding site on a functionally important G-protein-coupled receptor, P2Y purinoceptor 14 (P2Y14). P2Y14 regulates immune responses and is associated with asthma, kidney injury and lung inflammation. In a calcium mobilization assay, the researchers found that guanosine 5’-diphosphate-fucose (GDP-Fuc) activates human P2Y14 with a half-maximal effective concentration (EC50) of 0.49 ± 0.04 μM. As an essential sugar nucleotide in mammals, GDP-Fuc is critically involved in tumor growth and metastasis across various cancers. The GDP-Fuc-induced activation of P2Y14 had not been reported before, so how GDP-Fuc acts on this receptor was unknown. Using DeepGlycanSite, the researchers identified G80, D81 and N90 as forming the guanosine-5’-diphosphate-sugar recognition site of P2Y14, and these findings were validated by mutagenesis studies.
Carbohydrates are critical mediators of biological function. Their remarkably diverse structures and varied activities present exciting opportunities for understanding many areas of biology. DeepGlycanSite will not only help decipher the biological functions of carbohydrates and carbohydrate-binding proteins but also provide a powerful tool for the development of carbohydrate drugs.

DOI: 10.1038/s41467-024-49516-2
 
 
DeepGlycanSite network architecture (Image by Xi Cheng)
Contact:
JIANG Qingling
Shanghai Institute of Materia Medica, Chinese Academy of Sciences
E-mail: qljiang@stimes.cn