TY - JOUR
T1 - PRR-HyPred
T2 - A two-layer hybrid framework to predict pattern recognition receptors and their families by employing sequence encoded optimal features
AU - Firoz, Ahmad
AU - Malik, Adeel
AU - Ali, Hani Mohammed
AU - Akhter, Yusuf
AU - Manavalan, Balachandran
AU - Kim, Chang Bae
N1 - Publisher Copyright: © 2023 Elsevier B.V.
PY - 2023/4/15
Y1 - 2023/4/15
N2 - Pattern recognition receptors (PRRs) recognize distinct features on the surface of pathogens or damaged cells and play key roles in the innate immune system. PRRs are divided into various families, including Toll-like receptors, retinoic acid-inducible gene-I-like receptors, nucleotide oligomerization domain-like receptors, and C-type lectin receptors. As these are implicated in host health and several diseases, their accurate identification is indispensable for their functional characterization and for targeted therapeutic approaches. Here, we construct PRR-HyPred, a novel two-layer hybrid framework in which the first layer predicts whether a given sequence is a PRR or a non-PRR using a support vector machine, and the second layer assigns each predicted PRR sequence to a specific family using a random forest-based classifier. Based on a 10-fold cross-validation test, PRR-HyPred achieved 83.4 % accuracy in the first layer and 95 % in the second, with Matthews correlation coefficient values of 0.639 and 0.816, respectively. This is the first method that can both predict PRRs and classify them into specific families. PRR-HyPred is available as a web portal at https://procarb.org/PRRHyPred/. We hope that it will be a valuable tool for the large-scale prediction and classification of PRRs and will facilitate future studies.
KW - Boruta
KW - Feature selection
KW - Machine learning
KW - Pattern recognition receptors
KW - Random forest
KW - Support vector machines
UR - https://www.scopus.com/pages/publications/85148039903
U2 - 10.1016/j.ijbiomac.2023.123622
DO - 10.1016/j.ijbiomac.2023.123622
M3 - Article
C2 - 36773859
AN - SCOPUS:85148039903
SN - 0141-8130
VL - 234
JO - International Journal of Biological Macromolecules
JF - International Journal of Biological Macromolecules
M1 - 123622
ER -