A Deep Learning Autoencoder for Genetic Imputation Towards the Prediction of Complex Common Diseases

  • Abdallah Mohammed Hadid

Student thesis: Master's Thesis

Abstract

Genetic imputation is a crucial pre-processing step in many genome-wide association studies (GWAS) that focus on disease prediction. Because imputation quality directly affects the accuracy of GWAS results, various well-established imputation tools have been developed. However, these conventional methods are computationally expensive, require specialized software, and depend heavily on the coverage of the reference panels employed. In this work, we address these limitations by proposing a deep learning autoencoder for the imputation of human genotype data. The proposed model is fully automated, requires no preprocessing of the genotype data, and does not depend on the quality of any reference panel. In addition to its flexibility and simplicity, our results show that the model performs robustly with minimal training and competes with the classical imputation tools while substantially outperforming them in terms of time complexity. Overall, the proposed autoencoder architecture achieves an imputation accuracy of 95% for chromosome 6 and 93% for chromosome 22 of the 1000 Genomes dataset.
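
The abstract reports imputation accuracy without stating how it is computed. One common way to score an imputer, given here only as a minimal sketch and not as the thesis's actual protocol, is to hide a random fraction of known genotype calls, impute them, and count how many hidden calls are recovered. The 0/1/2 genotype encoding, the `MISSING` sentinel, all function names, and the most-frequent-genotype imputer (a simple stand-in for the autoencoder) are assumptions made for illustration.

```python
import random
from collections import Counter

MISSING = -1  # assumed sentinel for a hidden genotype call (genotypes assumed coded 0/1/2)


def mask_genotypes(genotypes, fraction, seed=0):
    """Return a copy with a random fraction of calls hidden, plus the hidden positions."""
    rng = random.Random(seed)
    masked = [row[:] for row in genotypes]
    hidden = []
    for i, row in enumerate(masked):
        for j in range(len(row)):
            if rng.random() < fraction:
                row[j] = MISSING
                hidden.append((i, j))
    return masked, hidden


def impute_most_frequent(masked):
    """Stand-in imputer (NOT the autoencoder): fill each SNP column with its
    most common observed genotype, or 0 if the column has no observed calls."""
    imputed = [row[:] for row in masked]
    n_snps = len(masked[0]) if masked else 0
    for j in range(n_snps):
        observed = [row[j] for row in masked if row[j] != MISSING]
        fill = Counter(observed).most_common(1)[0][0] if observed else 0
        for row in imputed:
            if row[j] == MISSING:
                row[j] = fill
    return imputed


def imputation_accuracy(truth, imputed, hidden):
    """Fraction of hidden calls whose imputed genotype equals the true genotype."""
    if not hidden:
        raise ValueError("no hidden positions to score")
    correct = sum(truth[i][j] == imputed[i][j] for i, j in hidden)
    return correct / len(hidden)
```

Under this setup, any imputer that accepts the masked matrix and returns a filled matrix could replace `impute_most_frequent`, so a trained model and a baseline can be scored on the same hidden positions.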
Date of Award: Jul 2022
Original language: American English

Keywords

  • Imputation; Autoencoder; Genotype; SNP.
