Genetic annotation aware dimensionality reduction in UK Biobank

  • Aisha Abdalla Alsuwaidi

Student thesis: Master's Thesis


This study analyzed how the top ten death-causing diseases cluster in the UK Biobank using dimensionality reduction. Few studies have identified ethnicity-related diseases associated with risk SNPs. Although diseases have been analyzed in various ways, there were no analyses of how multiple diseases cluster differently across ethnic groups. For this analysis, annotation-based queries were run against the dbSNP database, and detailed graphs were produced using principal component analysis (PCA) and cluster analysis. The aim was to help researchers visualize the clustering of diseases in the UK Biobank. Of the ten death-causing diseases in the UK, chronic lower respiratory diseases and pancreatic cancer were the only ones for which people of different ethnicities were observed to be affected differently.
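The abstract does not describe the thesis pipeline in detail, so the following is only a minimal sketch of the general idea of combining PCA with cluster analysis. It runs on synthetic genotype-like data, not UK Biobank records. The helper names `pca_project` and `kmeans` are hypothetical, and k-means is an assumed choice of clustering method, since the abstract does not say which one was used.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project rows of X onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)  # center each feature (here: each SNP column)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T  # sample coordinates in PC space

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (assumed method); returns one label per row."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every center, then nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # skip empty clusters
                centers[j] = points[labels == j].mean(axis=0)
    return labels

# Synthetic stand-in data (an assumption for illustration only):
# 200 samples x 50 SNPs coded as 0/1/2 allele counts, drawn from two
# groups with different allele frequencies.
rng = np.random.default_rng(42)
group_a = rng.binomial(2, 0.2, size=(100, 50))
group_b = rng.binomial(2, 0.6, size=(100, 50))
genotypes = np.vstack([group_a, group_b]).astype(float)

pcs = pca_project(genotypes, n_components=2)  # reduce 50 SNPs to 2 PCs
labels = kmeans(pcs, k=2)                     # cluster samples in PC space
```

Plotting `pcs` colored by `labels`, or by a known group annotation, would give the kind of cluster visualization the abstract refers to.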
Date of Award: May 2022
Original language: American English


  • PCA
  • dimensionality reduction
  • cluster analysis
  • dbSNP
  • UKBB
