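1. Jiangsu Key Laboratory of Media Design and Software Technology, Jiangnan University, Wuxi, Jiangsu 214122
2. School of Science, Jiangnan University, Wuxi, Jiangsu 214122
CHENG Zhixing, male, works on machine learning and bioinformatics. E-mail: 6191611021@stu.jiangnan.edu.cn
DING Yanrui, female, professor, works on machine learning and bioinformatics. E-mail: yr_ding@jiangnan.edu.cn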
CHENG Zhixing, DING Yanrui. Prediction framework for drug-target interactions with multi-source descriptors fusion[J]. Journal of Northwest University (Natural Science Edition), 2023,53(5):782-793. DOI: 10.16152/j.cnki.xdxbzr.2023-05-010.
Identifying drug-target interactions (DTIs) is a key task in drug discovery and plays an important role in virtual screening, drug repurposing, and the identification of drug side effects. Identifying DTIs through traditional biological experiments is often expensive and time-consuming. Following the success of deep learning in natural language processing and other fields, many studies have used deep learning to predict potential DTIs. However, many previous studies used only single-view features of drugs and proteins, ignoring the potential value of multi-source descriptors. This paper proposes DFDTI, a drug-target interaction prediction framework that fuses multi-source descriptors to make full use of the structural information of drugs and proteins from different views. First, fully connected (FC) layers generate low-dimensional representations of the multi-source descriptors. Then, because different types of descriptors contribute differently to DTI prediction, a channel attention mechanism assigns a weight to each descriptor. Next, a single-layer Transformer encoder enhances the feature representations of the descriptors. Finally, the enhanced feature representations of the drug and the protein are concatenated and fed into a deep neural network (DNN) to predict DTIs. Experimental results show that DFDTI effectively fuses different types of descriptors and outperforms all baseline methods considered in this paper on all three types of evaluation metrics.
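The four stages named in the abstract (FC projection, channel attention, Transformer-style enhancement, concatenation into a DNN) can be illustrated with a minimal sketch. It is not the authors' implementation. The descriptor lengths, the embedding size `DIM`, the random weights, the squeeze-and-excitation form chosen for the channel attention, and the simplified `self_attention` function used in place of a full Transformer encoder layer are all assumptions made for illustration.

```python
import math
import random

def init_linear(rng, n_in, n_out):
    """Random weights and zero biases for one fully connected layer."""
    s = 1.0 / math.sqrt(n_in)
    w = [[rng.uniform(-s, s) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def linear(x, layer):
    """Apply y = W x + b to a single input vector."""
    w, b = layer
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def relu(v):
    return [max(0.0, a) for a in v]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]

def channel_attention(channels, fc1, fc2):
    """Assumed SE-style weighting: squeeze each descriptor channel to its
    mean, pass the means through two FC layers, then rescale each channel."""
    squeezed = [sum(c) / len(c) for c in channels]
    weights = sigmoid(linear(relu(linear(squeezed, fc1)), fc2))
    scaled = [[w * a for a in c] for w, c in zip(weights, channels)]
    return scaled, weights

def self_attention(tokens):
    """Simplified stand-in for the Transformer encoder layer: single-head
    scaled dot-product self-attention with a residual connection, without
    learned projections, layer normalization, or a feed-forward block."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        m = max(scores)  # subtract the maximum for a numerically stable softmax
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        attn = [e / total for e in exps]
        mixed = [sum(a * t[i] for a, t in zip(attn, tokens)) for i in range(d)]
        out.append([x + y for x, y in zip(q, mixed)])
    return out

def init_encoder(rng, descriptor_sizes, dim):
    """One projection layer per descriptor plus two layers for the attention."""
    n = len(descriptor_sizes)
    return {
        "proj": [init_linear(rng, size, dim) for size in descriptor_sizes],
        "se1": init_linear(rng, n, n),
        "se2": init_linear(rng, n, n),
    }

def encode(descriptors, params):
    """FC projection -> channel attention -> self-attention -> flatten."""
    channels = [linear(x, fc) for x, fc in zip(descriptors, params["proj"])]
    channels, weights = channel_attention(channels, params["se1"], params["se2"])
    tokens = self_attention(channels)
    return [a for t in tokens for a in t], weights

rng = random.Random(0)
DIM = 8                       # hypothetical embedding size
drug_sizes = [16, 12, 20]     # hypothetical lengths of three drug descriptors
protein_sizes = [10, 14]      # hypothetical lengths of two protein descriptors

drug_params = init_encoder(rng, drug_sizes, DIM)
protein_params = init_encoder(rng, protein_sizes, DIM)
n_in = DIM * (len(drug_sizes) + len(protein_sizes))
dnn = [init_linear(rng, n_in, 16), init_linear(rng, 16, 1)]

# Random vectors stand in for real fingerprints and sequence descriptors.
drug = [[rng.random() for _ in range(s)] for s in drug_sizes]
protein = [[rng.random() for _ in range(s)] for s in protein_sizes]

drug_vec, drug_w = encode(drug, drug_params)
protein_vec, protein_w = encode(protein, protein_params)
hidden = relu(linear(drug_vec + protein_vec, dnn[0]))   # concatenate, then DNN
score = sigmoid(linear(hidden, dnn[1]))[0]

print(f"interaction score: {score:.3f}")
print("drug descriptor weights:", [round(w, 3) for w in drug_w])
```

With untrained random weights the printed score carries no meaning. The sketch only shows how descriptors of different lengths are brought to a common dimension, weighted per descriptor, and fused into one prediction.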
drug-target interaction; feature fusion; attention mechanism; deep learning