Identification of Halophilic Bacteria from a Meta-analysis of Microbial Community Profiles Using Bioinformatics Tools for Differential Abundance Analysis

  • Ameera Aldarmaki

Student thesis: Master's Thesis


Microorganisms have surprising biotechnological potential, as some of them can survive harsh conditions such as high salinity. Extreme salt concentrations limit life, and hypersaline regions have low biological diversity. Nevertheless, some microorganisms have evolved and adapted to thrive in such conditions. Consequently, halophilic and halotolerant microorganisms have attracted the interest of many biologists and other scientists for the development of biological processes such as bioremediation of marine oil spills and saline wastewater treatment. In this study, publicly available 16S rRNA sequence profiles from saline, hypersaline and non-saline water and sediment samples from different studies were retrieved from the Sequence Read Archive (SRA) and a SQL database. The 16S rRNA gene exists in almost all bacteria and has been widely used as a marker gene to characterize the composition of microbial communities. Many bacterial strains cannot be studied in isolation, and large numbers of 16S profiles have become available thanks to advances in Next Generation Sequencing. The 16S rRNA sequence profiles were subjected to enrichment analysis to identify which bacteria dominate saline and hypersaline environments. Metagenomics therefore provides a promising alternative for identifying halophiles, assessing how diverse these bacteria are, and visualizing the phylogenetic distribution of the identified halophiles. The QIIME bioinformatics pipeline was the main software package used throughout this study, as it contains a powerful set of analysis tools for microbial community profiles. Two different algorithms were used to perform differential abundance analysis: DESeq2 and metagenomeSeq.
In the analysis, 0.08% (DESeq2) and 0.119% (metagenomeSeq) of the discovered Operational Taxonomic Units (OTUs) were the most significantly differentially abundant in saline and hypersaline samples; 6.4% (both DESeq2 and metagenomeSeq) were observed in both saline and non-saline environments; and 1.23% (DESeq2) and 0.21% (metagenomeSeq) were the most significant OTUs in non-saline samples. Statistical analysis of a comprehensive set of saline and non-saline samples showed that the OTUs significantly differentially abundant in these environments are water and soil bacteria, such as the families Rhodobacteraceae and Pelagibacteraceae. With the assistance of statistical methods, we can corroborate and complement previous results from isolation studies, as well as provide a phylogenetic overview of halophiles in the tree of life.
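As an illustration of how percentages like those above can be derived from differential abundance output, the following is a minimal sketch and not the thesis workflow. It assumes each OTU comes with a log2 fold change and a raw p-value, that a positive fold change means higher abundance in saline samples, and that significance is called at an adjusted p-value below 0.05. The function names, the sign convention and the threshold are all hypothetical choices made for this example; the Benjamini-Hochberg procedure is used here as a common multiple-testing correction.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def summarize_otus(results, alpha=0.05):
    """Summarize OTUs as percentages per category.

    results: list of (otu_id, log2_fold_change, raw_p_value) tuples.
    Assumption: positive log2 fold change = more abundant in saline samples.
    """
    if not results:
        return {"saline": 0.0, "non_saline": 0.0, "not_significant": 0.0}
    adjusted = benjamini_hochberg([p for _, _, p in results])
    counts = {"saline": 0, "non_saline": 0, "not_significant": 0}
    for (_, log2fc, _), padj in zip(results, adjusted):
        if padj >= alpha:
            counts["not_significant"] += 1
        elif log2fc > 0:
            counts["saline"] += 1
        else:
            counts["non_saline"] += 1
    total = len(results)
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    # Made-up values for illustration only, not data from the thesis.
    example = [
        ("otu1", 2.0, 0.001),
        ("otu2", -1.5, 0.002),
        ("otu3", 0.1, 0.8),
        ("otu4", 0.5, 0.9),
    ]
    print(summarize_otus(example))
```

In practice, DESeq2 and metagenomeSeq report their own adjusted p-values, so the correction step would normally be taken from the tool output instead of being recomputed.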
Date of Award: Dec 2017
Original language: American English


  • Microorganisms
  • bioremediation
  • Sequence Read Archive (SRA)
  • SQL Database
  • bioinformatics
  • Operational Taxonomic Units (OTUs)
