New Breakthrough in Carbohydrate Chemistry: Decoding Glycosylation Reactions with Statistical Analysis and Machine Learning
Breakthrough in Sugar Chemistry: Unravelling Synthetic Carbohydrate via Statistical Analysis and Machine Learning
Angew. Chem. Int. Ed. 2021, 60, 12413 – 12423
Chun-Wei Chang, Mei-Huei Lin, Chieh-Kai Chan, Kuan-Yu Su, Chia-Hui Wu, Wei-Chih Lo, Sarah Lam, Yu-Ting Cheng, Pin-Hsuan Liao, Chi-Huey Wong,* and Cheng-Chung Wang*
Carbohydrates are widely distributed on cell membranes, where they mediate signal transduction between cells as well as bacterial and viral infection. Tumor cells display abundant abnormal glycan sequences, and bacterial capsular polysaccharides differ greatly from mammalian glycoconjugates, which makes tumor-associated carbohydrates and capsular polysaccharides promising vaccine candidates. However, the development of carbohydrate-based vaccines and medicines has been severely limited by the lack of reliable guidelines for glycosylation, the core reaction of carbohydrate synthesis. Without efficient and stable control over the stereoselectivity and yield of the glycosylation reaction, mass production of carbohydrate-based vaccines and medicines is impractical.
Recently, Dr. Cheng-Chung Wang, an associate research fellow at the Institute of Chemistry, Academia Sinica, and Dr. Chi-Huey Wong, a former president of Academia Sinica, together with their research teams, integrated laboratory experiments, quantitation, big data analysis and machine learning algorithms to build a program called “GlycoComputer” (http://chemwww.chem.sinica.edu.tw/ChemicalGlycosylation/index.php), which enables precise prediction of glycosylation reactions. The teams developed an acceptor nucleophilicity constant (Aka), which summarizes steric, electronic and structural effects, to quantify the reactivity of hydroxyl groups and thereby connect synthetic experiments with computer algorithms. The work was published in Angewandte Chemie International Edition in February 2021.
At least eleven factors, spanning the reactants and the reaction environment, influence the outcome of a glycosylation reaction. A subtle change in the building blocks can greatly alter the stereoselectivity and yield. Optimizing this reaction therefore usually comes down to trial and error, which has kept mass production and manufacturing of complex carbohydrate molecules out of reach. The GlycoComputer, established by Dr. Wang and Dr. Wong, applies the concept of computer-aided synthesis to predict the stereoselectivity and yield of a glycosylation reaction before any bench work is carried out, and is expected to greatly facilitate the production of oligosaccharides and carbohydrate-based vaccines and medicines.
Dr. Wang remarked, “Conventional carbohydrate synthesis is a trial-and-error process, and empirical rules rely heavily on human judgment and are often misled by it. Big data analysis and machine learning provide an evaluation platform to analyze the different factors in glycosylation reactions and to uncover potential parameters.” With the GlycoComputer program, the team analyzed a diverse range of glycosylation donors and acceptors with well-defined reactivity, together with promoters. The applicability of the program was further validated by the synthesis of a carbohydrate antigen, which showed that stereoselectivity and yield can be accurately estimated without sophisticated computational processing. Integrating this program is expected to greatly simplify the production of carbohydrate molecules in the future.
Dr. Chun-Wei Chang is the first author of this study. The corresponding authors, Dr. Cheng-Chung Wang and Dr. Chi-Huey Wong, acknowledge financial support from Academia Sinica and the Ministry of Science and Technology, Taiwan.
The full article, entitled “Automated Quantification of Hydroxyl Reactivities: Prediction of Glycosylation Reactions”, can now be found on the Angewandte Chemie International Edition website at: https://onlinelibrary.wiley.com/doi/full/10.1002/anie.202013909
GlycoComputer: http://chemwww.chem.sinica.edu.tw/ChemicalGlycosylation/index.php
Media Contact:
Dr. Cheng-Chung Wang, Associate Research Fellow, Institute of Chemistry, Academia Sinica
Email: wangcc@chem.sinica.edu.tw
(Tel) +886-2-5572-8618
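Glycans are major and essential structures on the cell surface. Their structural complexity gives rise to specific molecular recognition and governs many important functions, such as signal transduction between cells, mediation of bacterial and viral infection, and regulation of disease onset and progression. Carbohydrate-based vaccine development, for example, takes the glycans unique to pathogen capsules and cancer cell surfaces as templates: their molecular structures are analyzed to identify glycan sequences with high immunogenicity, which help the body's immune system produce the corresponding antibodies and thereby achieve precision therapy. However, glycans are very difficult to synthesize. Industry and academia still lack general rules and guidelines for obtaining carbohydrate derivatives in large quantity and high purity. The root cause is that in the core reaction, glycosylation, stereoselectivity is hard to predict and yield is hard to evaluate reliably, which severely limits the development of carbohydrate-based vaccines and drugs.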
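Recently, a research team jointly led by Associate Research Fellow Cheng-Chung Wang of the Institute of Chemistry, Academia Sinica, and Academician Chi-Huey Wong of the Genomics Research Center, Academia Sinica, combined machine learning, statistical analysis and traditional synthesis to develop the “GlycoComputer” software and its website, http://chemwww.chem.sinica.edu.tw/ChemicalGlycosylation/index.php. Through molecular quantification and an online database of the acceptor nucleophilicity constant (Aka), the tools consolidate the steric, electronic and structural effects of carbohydrate molecules on synthetic reactions. They bridge molecular science, algorithms and organic synthesis, so that accurate prediction of carbohydrate synthesis is no longer a dream. The results were formally published on February 26, 2021 in the international journal Angewandte Chemie International Edition.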
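In traditional carbohydrate vaccine development, more than 11 variables involving the reactants and the synthetic environment must be considered, which makes screening and optimizing synthetic routes lengthy and disorderly. Subtle structural differences in the building blocks further govern the yield and stereochemistry of glycosidic bond formation, and the laborious, time-consuming development process has kept mass production of complex carbohydrate drugs out of reach. In view of this, the research teams of Cheng-Chung Wang and Chi-Huey Wong created the GlycoComputer prediction software to address this bottleneck. By bringing computer assistance to conventional synthetic practice, the software lets researchers use predicted results to select the synthetic route most likely to succeed before running any reaction, greatly accelerating synthesis development and promoting the production of oligosaccharides and carbohydrate-based vaccines.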
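Associate Research Fellow Cheng-Chung Wang noted: “Traditional synthesis is often limited to blind trial and error, and empirical rules are easily biased by subjective views and human interpretation. Machine learning and statistical analysis therefore provide an objective evaluation platform, based on large amounts of data, for dissecting the many variables in carbohydrate synthesis.” With the GlycoComputer program, the reactivities of the various sugars, reactants and reagents can all be clearly quantified, then analyzed and examined one by one. Without requiring complex computation, the prediction system was used to accurately synthesize a carbohydrate antigen vaccine and to estimate the synthetic outcome successfully. It is expected to greatly simplify the production of carbohydrate drug molecules in the future.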
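This research was supported by Academia Sinica and the Ministry of Science and Technology. The first author is Dr. Chun-Wei Chang of Cheng-Chung Wang's laboratory; the corresponding authors are Associate Research Fellow Cheng-Chung Wang of the Institute of Chemistry and Academician Chi-Huey Wong of the Genomics Research Center, Academia Sinica. Article title: Automated Quantification of Hydroxyl Reactivities: Prediction of Glycosylation Reactions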
Reference: https://onlinelibrary.wiley.com/doi/full/10.1002/anie.202013909
GlycoComputer website: http://chemwww.chem.sinica.edu.tw/ChemicalGlycosylation/index.php
Media Contact:
Dr. Cheng-Chung Wang, Associate Research Fellow, Institute of Chemistry, Academia Sinica, wangcc@chem.sinica.edu.tw
(Tel) +886-2-5572-8618