iPFPi: A System for Improving Protein Function Prediction through Cumulative Iterations

Research output: Contribution to journal, Article, peer-review

7 Scopus citations

Abstract

We propose a classifier system called iPFPi that predicts the functions of un-annotated proteins. iPFPi assigns to an un-annotated protein P the functions of the GO annotation terms that are semantically similar to P. Both an un-annotated protein P and a GO annotation term T are represented by their characteristics. The characteristics of P are the GO terms found within the abstracts of the biomedical literature associated with P. The characteristics of T are the GO terms found within the abstracts of the biomedical literature associated with the proteins annotated with the function of T. Let F and F′ be the important (dominant) sets of characteristic terms representing T and P, respectively. iPFPi annotates P with the function of T if F and F′ are semantically similar. We constructed a novel semantic similarity measure that takes several factors into consideration, such as the dominance degree of each characteristic term t in set F. The dominance degree is based on the score of t, a value that reflects the dominance status of t relative to the other characteristic terms and is computed using a pairwise beats-and-loses procedure. Every time a protein P is annotated with the function of T, iPFPi updates and optimizes the current scores of the characteristic terms for T based on the weights of the characteristic terms for P, and set F is updated accordingly. Thus, the accuracy of predicting the function of T as the function of subsequent proteins improves. This prediction accuracy keeps improving iteratively over time through the cumulative weights of the characteristic terms representing the proteins that are successively annotated with the function of T. We evaluated the quality of iPFPi by comparing it experimentally with two recent protein function prediction systems. The results showed marked improvement.
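The loop described in the abstract (score terms pairwise, keep the dominant set, compare sets, then accumulate weights after each annotation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the scoring formula, the dominance cutoff, or the semantic similarity measure. The sketch assumes that a term's score is the number of pairwise comparisons it wins minus the number it loses, that dominant terms are those with a positive score, and it swaps in a simple score-weighted overlap for the paper's semantic similarity measure. All function names, the `threshold` parameter, and the term labels are hypothetical.

```python
from collections import Counter


def beats_loses_scores(weights):
    """Assumed scoring: +1 for every term a term out-weighs, -1 for every term that out-weighs it."""
    terms = list(weights)
    scores = {t: 0 for t in terms}
    for i, a in enumerate(terms):
        for b in terms[i + 1:]:
            if weights[a] > weights[b]:
                scores[a] += 1
                scores[b] -= 1
            elif weights[a] < weights[b]:
                scores[a] -= 1
                scores[b] += 1
    return scores


def dominant_set(weights):
    """Assumed cutoff: a term is dominant if its beats-and-loses score is positive."""
    return {t: s for t, s in beats_loses_scores(weights).items() if s > 0}


def similarity(f_t, f_p):
    """Score-weighted overlap of two dominant sets (a stand-in, not the paper's measure)."""
    union = set(f_t) | set(f_p)
    if not union:
        return 0.0
    shared = set(f_t) & set(f_p)
    num = sum(min(f_t[t], f_p[t]) for t in shared)
    den = sum(max(f_t.get(t, 0), f_p.get(t, 0)) for t in union)
    return num / den


def annotate_and_update(term_weights, protein_weights, threshold=0.5):
    """Annotate P with T if the dominant sets are similar enough, then accumulate P's weights into T."""
    sim = similarity(dominant_set(term_weights), dominant_set(protein_weights))
    if sim >= threshold:
        # Counter.update adds counts, so T's weights grow cumulatively with each annotated protein.
        term_weights.update(protein_weights)
        return True
    return False
```

For example, with `t_weights = Counter({"t1": 5, "t2": 3, "t3": 1})` and `p_weights = Counter({"t1": 4, "t3": 1})`, calling `annotate_and_update(t_weights, p_weights)` accepts the annotation and leaves `t_weights` holding the summed weights, so the dominant set of T is recomputed from cumulative evidence the next time a protein is compared against it.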

Original language: British English
Article number: 6871330
Pages (from-to): 825-836
Number of pages: 12
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 12
Issue number: 4
DOIs
State: Published - 1 Jul 2015

Keywords

  • biomedical literature
  • protein annotation
  • protein function prediction
  • semantic similarity
