TY - JOUR
T1 - Determining the semantic similarities among gene ontology terms
AU - Taha, Kamal
PY - 2013
Y1 - 2013
N2 - We present in this paper novel techniques that determine the semantic relationships among Gene Ontology (GO) terms. We implemented these techniques in a prototype system called GoSE, which resides between the user application and the GO database. Given a set S of GO terms, GoSE returns another set S′ of GO terms, where each term in S′ is semantically related to each term in S. Most current research focuses on determining the semantic similarities among GO terms based solely on their IDs and proximity to one another in the GO graph structure, while overlooking the contexts of the terms, which may lead to erroneous results. The context of a GO term T is the set of other terms whose existence in the GO graph structure is dependent on T. We propose novel techniques that determine the contexts of terms based on the concept of existence dependency. We present a stack-based sort-merge algorithm employing these techniques for determining the semantic similarities among GO terms. We evaluated GoSE experimentally and compared it with three existing methods. The results of measuring the semantic similarities among genes in KEGG and Pfam pathways retrieved from the DBGET and Sanger Pfam databases, respectively, show that our method outperforms the other three methods in recall and precision.
AB - We present in this paper novel techniques that determine the semantic relationships among Gene Ontology (GO) terms. We implemented these techniques in a prototype system called GoSE, which resides between the user application and the GO database. Given a set S of GO terms, GoSE returns another set S′ of GO terms, where each term in S′ is semantically related to each term in S. Most current research focuses on determining the semantic similarities among GO terms based solely on their IDs and proximity to one another in the GO graph structure, while overlooking the contexts of the terms, which may lead to erroneous results. The context of a GO term T is the set of other terms whose existence in the GO graph structure is dependent on T. We propose novel techniques that determine the contexts of terms based on the concept of existence dependency. We present a stack-based sort-merge algorithm employing these techniques for determining the semantic similarities among GO terms. We evaluated GoSE experimentally and compared it with three existing methods. The results of measuring the semantic similarities among genes in KEGG and Pfam pathways retrieved from the DBGET and Sanger Pfam databases, respectively, show that our method outperforms the other three methods in recall and precision.
KW - Gene ontology (GO)
KW - Related terms
KW - Semantic similarity
UR - http://www.scopus.com/inward/record.url?scp=84885116012&partnerID=8YFLogxK
U2 - 10.1109/JBHI.2013.2248742
DO - 10.1109/JBHI.2013.2248742
M3 - Article
C2 - 24592450
AN - SCOPUS:84885116012
SN - 2168-2194
VL - 17
SP - 512
EP - 525
JO - IEEE Journal of Biomedical and Health Informatics
JF - IEEE Journal of Biomedical and Health Informatics
IS - 3
ER -