TY - JOUR
T1 - Can We Convert Genotype Sequences Into Images for Cases/Controls Classification?
AU - Muneeb, Muhammad
AU - Feng, Samuel
AU - Henschel, Andreas
N1 - Publisher Copyright: Copyright © 2022 Muneeb, Feng and Henschel.
PY - 2022
Y1 - 2022
N2 - Converting genotype sequences into images offers several advantages, including visualization, classification, and comparison of genotype data. This study converted genotype sequences into images, applied two-dimensional convolutional neural networks (2DCNN) for case/control classification, and compared the results with those of a one-dimensional convolutional neural network (1DCNN). Surprisingly, the average accuracy over multiple runs was 0.86 for the 2DCNN and 0.89 for the 1DCNN, a difference of only 0.03, which suggests that the 2DCNN algorithm can also work on genotype sequences. Moreover, the results generated by the 2DCNN exhibited less variation than those generated by the 1DCNN, indicating greater stability. The purpose of this study is to draw the research community’s attention to encoding schemes for genotype data and to machine learning algorithms that can be applied to case/control classification once the representation of the genotype data is changed.
AB - Converting genotype sequences into images offers several advantages, including visualization, classification, and comparison of genotype data. This study converted genotype sequences into images, applied two-dimensional convolutional neural networks (2DCNN) for case/control classification, and compared the results with those of a one-dimensional convolutional neural network (1DCNN). Surprisingly, the average accuracy over multiple runs was 0.86 for the 2DCNN and 0.89 for the 1DCNN, a difference of only 0.03, which suggests that the 2DCNN algorithm can also work on genotype sequences. Moreover, the results generated by the 2DCNN exhibited less variation than those generated by the 1DCNN, indicating greater stability. The purpose of this study is to draw the research community’s attention to encoding schemes for genotype data and to machine learning algorithms that can be applied to case/control classification once the representation of the genotype data is changed.
KW - applied machine learning
KW - bioinformatics
KW - genetics
KW - genotype-phenotype prediction
KW - image classification
UR - https://www.scopus.com/pages/publications/85142870880
U2 - 10.3389/fbinf.2022.914435
DO - 10.3389/fbinf.2022.914435
M3 - Article
AN - SCOPUS:85142870880
VL - 2
JO - Frontiers in Bioinformatics
JF - Frontiers in Bioinformatics
M1 - 914435
ER -