TY - GEN
T1 - Automatic protein structure classification through structural fingerprinting
AU - Aung, Zeyar
AU - Tan, Kian Lee
PY - 2004
Y1 - 2004
AB - In this paper, we present a new scheme named "CP-Mine" for automatic three-dimensional (3D) protein structure classification using structural fingerprints. We represent a 3D protein structure as a CPset, which is the set of inter-SSE (secondary structure element) contact patterns (CPs) present in the protein. Given a database of protein structures with known class labels, containing n distinct protein structure classes, we generate a fingerprint for each class by mining the frequent CPsets from all of its member protein structures. To predict the class label of an unknown protein, we generate the CPset of that protein and compute its intersection with the fingerprint of each protein structure class in turn. The labels of the classes with the highest degree of intersection are then returned as the answer. The proposed method is a pure classification scheme in that no structural comparison, alignment, or searching needs to be performed. Preliminary experimental results show that our method can classify protein structures accurately and efficiently.
UR - http://www.scopus.com/inward/record.url?scp=4544267274&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:4544267274
SN - 0769521738
SN - 9780769521732
T3 - Proceedings - Fourth IEEE Symposium on Bioinformatics and Bioengineering, BIBE 2004
SP - 508
EP - 515
BT - Proceedings - Fourth IEEE Symposium on Bioinformatics and Bioengineering, BIBE 2004
T2 - Proceedings - Fourth IEEE Symposium on Bioinformatics and Bioengineering, BIBE 2004
Y2 - 19 May 2004 through 21 May 2004
ER -