Predicting protein function from biomedical text

Kamal Taha, Paul D. Yoo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

We propose a classifier system called PFPBT that predicts the functions of un-annotated proteins. PFPBT assigns an un-annotated protein p the functional category of the annotated proteins that are semantically similar to p. Each protein p is represented by a vector of weights, where each weight reflects the significance of a molecule m in the biomedical abstracts associated with p; that is, each weight quantifies the likelihood of an association between m and p. The rationale is that all proteins bind to other molecules, and these molecules are highly predictive of the proteins' functions. Let S be the set of proteins that are semantically similar to an un-annotated protein p. Then p is annotated with the functional category f if its occurrence probability in the abstracts associated with the proteins in S whose functional category is f differs statistically significantly from its occurrence probability in the abstracts associated with the proteins in S that belong to all other functional categories. PFPBT automatically extracts each co-occurrence of a protein-molecule pair that represents a semantic relationship between the pair. We present novel semantic rules, based on the syntactic structures of sentences, for identifying the semantic relationship within each co-occurrence of a protein-molecule pair in a sentence. We evaluated PFPBT by comparing it experimentally with two systems. The results showed an improvement.
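The similarity-based assignment described in the abstract can be illustrated with a minimal sketch. It is not the PFPBT implementation: it assumes cosine similarity as the semantic-similarity measure and a simple majority vote among the k nearest annotated proteins, and it omits both the statistical significance test and the extraction of weights from abstracts. The names `cosine` and `predict_category`, the molecule weights, and the category labels are all hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse weight vectors stored as dicts."""
    dot = sum(w * v.get(m, 0.0) for m, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict_category(query, annotated, k=3):
    """Return the most common category among the k annotated proteins
    most similar to the query (assumed voting rule, not the paper's test)."""
    ranked = sorted(annotated, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(category for _, category in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: molecule -> weight vectors with known categories.
annotated = [
    ({"ATP": 0.9, "glucose": 0.4}, "kinase"),
    ({"ATP": 0.8, "Mg2+": 0.3}, "kinase"),
    ({"DNA": 0.9, "zinc": 0.5}, "transcription factor"),
    ({"DNA": 0.7, "RNA": 0.2}, "transcription factor"),
]

# Hypothetical un-annotated protein p.
query = {"ATP": 0.7, "glucose": 0.2, "Mg2+": 0.1}
print(predict_category(query, annotated, k=3))
```

In this toy setup the query shares molecules only with the two proteins labelled "kinase", so they rank highest and determine the vote. A closer approximation of the described system would replace the vote with a significance test over occurrence probabilities.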

Original language: British English
Title of host publication: 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3275-3278
Number of pages: 4
ISBN (Electronic): 9781424492718
DOIs
State: Published - 4 Nov 2015
Event: 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2015 - Milan, Italy
Duration: 25 Aug 2015 → 29 Aug 2015

Publication series

Name: Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS
Volume: 2015-November
ISSN (Print): 1557-170X

Conference

Conference: 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2015
Country/Territory: Italy
City: Milan
Period: 25/08/15 → 29/08/15
