An information extraction system for protein function prediction

Kamal Taha, Paul D. Yoo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

We present a Natural Language Processing extraction system called IESforPFP, which retrieves useful information from biomedical abstracts. IESforPFP aims to advance the state of the art in biological text mining by applying a novel computational linguistic technique. By retrieving significant patterns of association between proteins and molecules from biomedical abstracts, IESforPFP can determine the functions of unannotated proteins. The system determines the semantic relationship between each protein-molecule pair in a sentence using novel semantic rules. It applies a semantic relationship extraction model that retrieves information from different structural forms of constituents in sentences. In the framework of IESforPFP, each protein p is represented by a vector of weights. Each weight reflects the significance of a molecule m in the biomedical abstracts associated with p; that is, each weight quantifies the likelihood of the association between m and p. IESforPFP determines the set of annotated proteins that are semantically similar to p by comparing their vectors. It then annotates p with the functions of these annotated proteins. We evaluated the quality of IESforPFP by comparing it experimentally with two other systems. The results showed marked improvement.
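The vector comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure, weighting scheme, or neighbour count IESforPFP uses, so cosine similarity, the parameter k, the function names, and the toy weights below are all assumptions made for illustration, not the authors' method.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse weight vectors (dict: molecule -> weight).

    Cosine similarity is an assumption; the abstract only says vectors are compared.
    """
    dot = sum(w * v.get(m, 0.0) for m, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict_functions(query_vec, annotated, k=2):
    """Annotate a protein with the functions of its k most similar annotated proteins.

    annotated maps a protein name to (weight vector, set of known functions).
    The choice of k nearest proteins is hypothetical.
    """
    ranked = sorted(
        annotated.items(),
        key=lambda item: cosine(query_vec, item[1][0]),
        reverse=True,
    )
    functions = set()
    for _name, (_vec, funcs) in ranked[:k]:
        functions |= funcs
    return functions


# Toy data: the weights and functions are invented, not taken from the paper.
annotated = {
    "P1": ({"ATP": 0.9, "Mg2+": 0.4}, {"kinase activity"}),
    "P2": ({"ATP": 0.7, "glucose": 0.6}, {"kinase activity", "glucose binding"}),
    "P3": ({"DNA": 0.8, "zinc": 0.5}, {"DNA binding"}),
}
query = {"ATP": 0.8, "glucose": 0.3}  # weight vector of an unannotated protein p

predicted = predict_functions(query, annotated, k=2)
print(sorted(predicted))
```

In this toy case the unannotated protein shares molecules with P1 and P2 but none with P3, so it inherits the functions of the first two only.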

Original language: British English
Title of host publication: 2015 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781479969265
State: Published - 16 Oct 2015
Event: IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2015 - Niagara Falls, Canada
Duration: 12 Aug 2015 to 15 Aug 2015

Publication series

Name: 2015 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2015

Conference

Conference: IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2015
Country/Territory: Canada
City: Niagara Falls
Period: 12/08/15 to 15/08/15

Keywords

  • biological NLP
  • biomedical literature
  • dependency parser
  • information extraction
  • text mining
