TY - JOUR
T1 - Maximum parsimony interpretation of chromatin capture experiments
AU - Homouz, Dirar
AU - Kudlicki, Andrzej S.
N1 - Funding Information:
The research was supported by Clinical and Translational Science Award from the National Center for Advancing Translational Sciences UL1TR000071, UL1TR001439 and NIH grant GM112131. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. We thank Gang Chen for assistance with code optimization and testing.
Publisher Copyright:
© 2019 Homouz, Kudlicki. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2019/11/1
Y1 - 2019/11/1
AB - We present a new approach to characterizing the global geometric state of chromatin from HiC data. Chromatin conformation capture techniques (3C and its variants: 4C, 5C, HiC, etc.) probe the spatial structure of the genome by identifying physical contacts between genomic loci within the nuclear space. In whole-genome conformation capture (HiC) experiments, the signal can be interpreted as spatial proximity between genomic loci, and physical distances can be estimated from the data. However, the observed spatial proximity signal does not directly translate into persistent contacts within the nuclear space. Attempts to infer a single conformation of the genome within the nuclear space lead to internal geometric inconsistencies, notoriously violating the triangle inequality. These inconsistencies have been attributed to the stochastic nature of chromatin conformation or to experimental artifacts. Here we demonstrate that they can be explained by a mixture of cells contained in the sample, each in one of only several conformational states. We have developed and implemented a graph-theoretic approach that identifies the properties of such postulated subpopulations. We show that the geometrical conflicts in a standard yeast HiC dataset can be explained by only a small number of homogeneous populations of cells (4 populations are sufficient to reconcile the 95,000 most prominent impossible triangles; 8 populations can explain the top 375,000 geometric conflicts). Finally, we analyze the functional annotations of genes differentially interacting between the populations, suggesting that each inferred subpopulation may be involved in a functionally different transcriptional program.
UR - http://www.scopus.com/inward/record.url?scp=85075539277&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0225578
DO - 10.1371/journal.pone.0225578
M3 - Article
C2 - 31765406
AN - SCOPUS:85075539277
SN - 1932-6203
VL - 14
JO - PLoS ONE
JF - PLoS ONE
IS - 11
M1 - e0225578
ER -