Improving Enhancer Identification with a Multi-Classifier Stacked Ensemble Model: Journal of Molecular Biology

B.A. Mir, M.U. Rehman, H. Tayara, K.T. Chong

    Research output: Contribution to journal › Article › peer-review

    9 Scopus citations

    Abstract

    Enhancers are DNA regions that control gene expression. They are usually found upstream or downstream of a gene, or even within a gene's introns, and typically lie far from the genes they control. Integrating experimental and computational approaches makes it possible to uncover these regulatory elements within DNA sequences. Experimental techniques such as ChIP-seq and ATAC-seq can identify genomic regions that are associated with transcription factors or accessible to regulatory proteins. Computational techniques, on the other hand, can predict enhancers from sequence features and epigenetic modifications. In this study, we developed a multi-classifier stacked ensemble (MCSE-enhancer) model that can accurately identify enhancers. We used feature descriptors derived from various physicochemical properties as input to six baseline classifiers and built a stacked classifier, which outperformed previous enhancer classification techniques in terms of accuracy, specificity, sensitivity, and Matthews correlation coefficient. Our model achieved an accuracy of 81.5%, representing a 2–3% improvement over existing models. © 2023 Elsevier Ltd
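The abstract compares models on accuracy, specificity, sensitivity, and Matthews correlation coefficient. As a minimal sketch of how these four metrics follow from a binary confusion matrix (enhancer vs. non-enhancer), the snippet below uses the standard textbook definitions. The function name `binary_metrics` and the example counts are hypothetical illustrations, not taken from the paper or its data.

```python
from math import sqrt


def binary_metrics(tp, tn, fp, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fn: enhancers predicted correctly / missed.
    tn/fp: non-enhancers predicted correctly / wrongly flagged.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Sensitivity (recall): fraction of true enhancers that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-enhancers that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not results from the paper).
metrics = binary_metrics(tp=90, tn=90, fp=10, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, so it stays informative when the two classes are imbalanced. That is one reason it is commonly reported alongside sensitivity and specificity.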
    Original language: British English
    Journal: J. Mol. Biol.
    Volume: 435
    Issue number: 23
    DOIs
    State: Published - 2023

    Keywords

    • bioinformatics
    • computational biology
    • DNA sequences
    • enhancers
    • meta classification
    • Computational Biology
    • DNA
    • Enhancer Elements, Genetic
    • Genomics
    • Transcription Factors
    • transcription factor
    • Article
    • catboost classifier
    • classifier
    • controlled study
    • enhancer region
    • ensemble learning
    • extra tree classifier
    • extreme gradient boosting
    • false negative result
    • false positive result
    • feature extraction
    • gene identification
    • intermethod comparison
    • light gradient boosting
    • machine learning
    • measurement accuracy
    • multi classifier stacked ensemble
    • multilayer perceptron
    • physical chemistry
    • random forest
    • sensitivity and specificity
    • genetics
    • genomics
    • procedures
