Deep Learning-Driven Protein Design

DOI: 10.23977/acss.2024.080120

Author(s)

Ao You 1, Xiaobing Xu 1

Affiliation(s)

1 School of Information, Yunnan Normal University, Kunming, China

Corresponding Author

Ao You

ABSTRACT

Proteins are the material basis of all living organisms and the primary carriers of life activities, and they play a pivotal role in regulating physiological functions. Each protein is composed of a specific amino acid sequence that folds into a distinct structure, which enables functions such as biocatalysis, metabolic regulation, immune defense, transport, and storage. Designing novel proteins with targeted biological functions is a central task in protein engineering, with broad applications in synthetic biology and drug research. This paper reviews recent advances in deep learning-assisted protein design. It introduces the relevant language models and generative models, and discusses related research and open challenges from both the sequence and the structure perspectives. Finally, it offers an outlook on the future development of deep learning-assisted protein design.

KEYWORDS

Deep learning; Protein design; Language model; Generative model; Protein sequence; Protein structure

CITE THIS PAPER

Ao You, Xiaobing Xu, Deep Learning-Driven Protein Design. Advances in Computer, Signals and Systems (2024) Vol. 8: 170-175. DOI: http://dx.doi.org/10.23977/acss.2024.080120.


All published work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright © 2016 - 2031 Clausius Scientific Press Inc. All Rights Reserved.