Abstract
Proline cis-trans isomerization (CTI) plays a key role in the rate-determining steps of protein folding. Accurate prediction of proline CTI is therefore important for understanding protein folding, splicing, cell signaling, and transmembrane active transport in both humans and animals. Our goal is to develop a state-of-the-art proline CTI predictor based on biophysically motivated intelligent consensus modeling that uses sequence information only (i.e., position-specific scores generated by PSI-BLAST). Current computational proline CTI predictors reach Q2 accuracies of about 70-73 percent and a Matthews correlation coefficient (MCC) of about 0.40, using sequence-based evolutionary information together with predicted protein secondary structure. Our approach, which combines a novel decision tree-based consensus model with a randomized meta-learning technique, achieves 86.58 percent Q2 accuracy and 0.74 MCC on the same proline CTI data set, a better result than that of any existing computational proline CTI predictor reported in the literature.
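The two figures of merit quoted in the abstract, Q2 accuracy and the Matthews correlation coefficient, can both be computed from the counts of a two-class confusion matrix. The sketch below is illustrative only and is not the authors' evaluation code: it assumes the standard definitions, with Q2 as the fraction of residues assigned to the correct one of two states. The function names, the choice of "cis" as the positive class, and the convention of returning 0.0 when the correlation is undefined are all assumptions made for this example.

```python
import math


def confusion_counts(y_true, y_pred, positive="cis"):
    """Count TP, TN, FP, FN for a two-state labeling.

    The label "cis" as the positive class is an assumption for illustration.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred != positive and truth != positive:
            tn += 1
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        else:
            fn += 1  # predicted negative, actually positive
    return tp, tn, fp, fn


def q2_accuracy(tp, tn, fp, fn):
    """Two-state accuracy: fraction of residues assigned the correct state."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in the range [-1, 1].

    Returns 0.0 when any marginal sum is zero (a common convention,
    adopted here as an assumption).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy labels, invented purely to exercise the functions.
    truth = ["cis", "trans", "trans", "cis", "trans", "trans"]
    preds = ["cis", "trans", "cis", "trans", "trans", "trans"]
    counts = confusion_counts(truth, preds)
    print("Q2 :", q2_accuracy(*counts))
    print("MCC:", mcc(*counts))
```

Because cis prolines are typically much rarer than trans prolines, accuracy alone can look high for a predictor that mostly outputs the majority class; the correlation coefficient penalizes that behavior, which is one reason both numbers are usually reported together.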
Original language | British English |
---|---|
Article number | 6646170 |
Pages (from-to) | 26-32 |
Number of pages | 7 |
Journal | IEEE/ACM Transactions on Computational Biology and Bioinformatics |
Volume | 11 |
Issue number | 1 |
State | Published - Jan 2014 |
Keywords
- ensemble methods
- intelligent systems
- machine-learning
- Proline cis-trans isomerization