BioHCDP: A hybrid constituency-dependency parser for biological NLP information extraction

Kamal Taha, Mohammed Al Zaabi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

One of the key goals of biological Natural Language Processing (NLP) is the automatic extraction of information from biomedical publications. Most current constituency and dependency parsers overlook the semantic relationships between the constituents of a sentence and may not be well suited to capturing complex long-distance dependencies. In this paper we propose BioHCDP, a hybrid constituency-dependency parser for biological NLP information extraction. BioHCDP aims to advance the state of the art in biological text mining by applying novel computational linguistic techniques that address these two limitations: (1) it determines the semantic relationship between each pair of constituents in a sentence using novel semantic rules, and (2) it applies semantic relationship extraction models that represent the relationships of different patterns of usage in different contexts. BioHCDP can be used to extract various classes of data from biological texts, including protein function assignments, genetic networks, and protein-protein interactions. We compared BioHCDP experimentally with three systems, and the results showed marked improvement.

Original language: British English
Title of host publication: IEEE SSCI 2014 - 2014 IEEE Symposium Series on Computational Intelligence - CIDM 2014
Subtitle of host publication: 2014 IEEE Symposium on Computational Intelligence and Data Mining, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 78-85
Number of pages: 8
ISBN (Electronic): 9781479945191
State: Published - 13 Jan 2015
Event: 5th IEEE Symposium on Computational Intelligence and Data Mining, CIDM 2014 - Orlando, United States
Duration: 9 Dec 2014 to 12 Dec 2014

Publication series

Name: IEEE SSCI 2014 - 2014 IEEE Symposium Series on Computational Intelligence - CIDM 2014: 2014 IEEE Symposium on Computational Intelligence and Data Mining, Proceedings

Conference

Conference: 5th IEEE Symposium on Computational Intelligence and Data Mining, CIDM 2014
Country/Territory: United States
City: Orlando
Period: 9/12/14 to 12/12/14

Keywords

  • biological NLP
  • biomedical literature
  • dependency parsers
  • information extraction
  • text mining
