BERT4Bitter: A bidirectional encoder representations from transformers (BERT)-based model for improving the prediction of bitter peptides

Research output: Contribution to journal › Article › peer-review

151 Scopus citations

Abstract

Motivation: The identification of bitter peptides through experimental approaches is an expensive and time-consuming endeavor. Due to the huge number of newly available peptide sequences in the post-genomic era, the development of automated computational models for the identification of novel bitter peptides is highly desirable.

Results: In this work, we present BERT4Bitter, a bidirectional encoder representations from transformers (BERT)-based model for predicting bitter peptides directly from their amino acid sequence without using any structural information. To the best of our knowledge, this is the first time a BERT-based model has been employed to identify bitter peptides. Compared to widely used machine learning models, BERT4Bitter achieved the best performance, with an accuracy of 0.861 and 0.922 for cross-validation and independent tests, respectively. Furthermore, extensive empirical benchmarking experiments on the independent dataset demonstrated that BERT4Bitter clearly outperformed the existing method, with improvements of 8.0% in accuracy and 16.0% in Matthews correlation coefficient, highlighting the effectiveness and robustness of BERT4Bitter. We believe that the BERT4Bitter method proposed herein will be a useful tool for rapidly screening and identifying novel bitter peptides for drug development and nutritional research.
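The abstract reports two evaluation metrics for the binary bitter/non-bitter task: accuracy and the Matthews correlation coefficient (MCC). As a minimal sketch of how these are conventionally computed from a confusion matrix, the Python below uses the standard textbook definitions. The function names and the confusion counts are hypothetical, chosen only for illustration; they are not taken from the paper or its datasets.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary classifier.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts (NOT results from the paper):
# tp = bitter peptides predicted bitter, tn = non-bitter predicted non-bitter,
# fp = non-bitter predicted bitter, fn = bitter predicted non-bitter.
tp, tn, fp, fn = 45, 40, 10, 5

print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are imbalanced; that is one reason the two metrics are often reported together.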

Original language: English
Pages (from-to): 2556-2562
Number of pages: 7
Journal: Bioinformatics
Volume: 37
Issue number: 17
State: Published - 1 Sep 2021
Externally published: Yes
