Researchers at Harvard University recently constructed the first high-resolution 3-D map of the folded genome. Informally referred to as the “loop-ome”, this map has significantly increased scientists’ understanding of the structural basis of gene regulation. The 3-D folded structure of the genome helps explain how a single genome can give rise to many different cell types.
The human genome is around 3,000,000,000 nucleotide bases in length. The average length of a nucleotide is roughly 0.6 nanometres, so the genome packed into every microscopic cell nucleus in the body would be approximately 1.8 metres long if stretched out. DNA of this length must therefore be compacted into a tightly coiled molecule. At the most basic level, DNA is wrapped roughly twice around a group of proteins called histones, creating a nucleosome. This structure is condensed further into a helical fibre 30 nanometres in diameter. This fibre forms the fundamental structure of DNA in the cell; however, it can be condensed again by folding into a series of looped domains that each contain up to 2 million base pairs. This is how most DNA is found in a resting cell. When these loops are themselves folded, the result is referred to as heterochromatin, which is typically less accessible to proteins that bind DNA and therefore serves a protective role. In fact, all of these structures play an important role in gene regulation. Scientists have long known about the looped structure of DNA, but mapping these loops had been regarded as an overwhelming challenge because of the volume of data involved.
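The length estimate above is simple arithmetic, and a short Python sketch makes it easy to check. The two constants are taken directly from the figures quoted in this article; the function name `stretched_length_m` is hypothetical, chosen only for illustration.

```python
# Figures quoted in the article (not independently measured here).
GENOME_BASES = 3_000_000_000  # nucleotide bases in the human genome
NM_PER_BASE = 0.6             # approximate length of one nucleotide, in nanometres


def stretched_length_m(n_bases: int, nm_per_base: float) -> float:
    """Return the end-to-end length, in metres, of n_bases laid out in a line."""
    return n_bases * nm_per_base * 1e-9  # 1 nanometre = 1e-9 metres


length_m = stretched_length_m(GENOME_BASES, NM_PER_BASE)
print(f"Unfolded genome length: {length_m:.1f} m")
```

Changing either constant shows how sensitive the estimate is to the assumed length per nucleotide.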
“More and more, we’re realising that folding is regulation… When you see genes turn on or off, what lies behind that is a change in folding. It’s a different way of thinking about how cells work,” explains study co-first author Suhas Rao. By mapping these genomic loops, the team has revealed thousands of previously unknown loops. Knowing where the loops are and how they function may be essential to identifying the genetic basis of cancer in the future. The map was produced with an improved version of Hi-C, a method the team first introduced in 2009 that identifies which parts of the genome are touching when it is folded up in the cell nucleus. First, the genome is frozen in place and then fragmented into small pieces. The ends of the fragments are marked with biotin and fused to other marked fragments that are their neighbours in the 3-D structure. The fused fragments are then extracted and sequenced.
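The last step of Hi-C, turning sequenced fused fragments into a map of which regions touch, can be sketched as counting pairs in fixed-size bins. This is a toy illustration and not the team's pipeline: the function `build_contact_map`, the bin size and the example positions are all made up for the sketch, and it assumes a single chromosome with each fused fragment already reduced to a pair of genomic positions.

```python
from collections import Counter


def build_contact_map(pairs, bin_size):
    """Count contacts between genomic bins.

    pairs    -- iterable of (pos_a, pos_b) positions, in bases, for the two
                ends of each fused fragment (hypothetical input format)
    bin_size -- width of each bin in bases; smaller bins mean higher resolution
    """
    contacts = Counter()
    for pos_a, pos_b in pairs:
        bin_a, bin_b = pos_a // bin_size, pos_b // bin_size
        # Contacts are symmetric, so store each pair of bins in sorted order.
        contacts[tuple(sorted((bin_a, bin_b)))] += 1
    return contacts


# Made-up positions: three fragments link the regions near 1.0 Mb and 1.5 Mb.
example_pairs = [
    (1_002_000, 1_501_000),
    (1_004_500, 1_503_200),
    (1_503_900, 1_001_100),
    (2_000_000, 2_004_000),
]
contact_map = build_contact_map(example_pairs, bin_size=5_000)
for (bin_a, bin_b), count in sorted(contact_map.items()):
    print(f"bins {bin_a} and {bin_b}: {count} contact(s)")
```

Two distant bins with an unusually high count are the kind of signal that suggests a loop. A real analysis would also need normalisation and statistical testing, which this sketch omits.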
Since this study began five years ago, computer processors have become more powerful, enabling the researchers to produce maps at very high resolution. Powerful CPUs and big-data tools are a necessity when modelling how millions of DNA bases interact with millions of others.
The team found that many of the loops they discovered are conserved between cell types, and even between humans and mice. This means that, in addition to the 1-D sequence of DNA being conserved between species, the 3-D structure is conserved as well. The study also showed that the 3 billion bases of DNA are apportioned into approximately 10,000 loops, in contrast to previous estimates of more than 1 million.
One of the most significant results from the paper concerns the role of a single protein, CTCF, in the formation of these loops; CTCF is thought to be involved in the 3-D structure of chromatin. Interestingly, the DNA sequence motifs that CTCF binds at the two anchors of a loop must point towards each other, even when separated by a great distance. Furthermore, these loops appear to differ between the sexes. For example, the largest loops are found exclusively in females. Co-author Huntley stated, “The copy of the X chromosome that is off in females contains gigantic loops that are up to 30 times the size of anything we see in males”.
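The orientation rule for CTCF motifs described above can be expressed as a small check. This sketch assumes each anchor's motif is labelled with a strand, '+' for pointing towards higher genomic coordinates and '-' for pointing towards lower ones; the `Anchor` class and the `is_convergent` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anchor:
    position: int  # genomic coordinate of the CTCF motif, in bases
    strand: str    # '+' points towards higher coordinates, '-' towards lower


def is_convergent(a: Anchor, b: Anchor) -> bool:
    """Return True if the two motifs point towards each other."""
    upstream, downstream = sorted((a, b), key=lambda anchor: anchor.position)
    return upstream.strand == "+" and downstream.strand == "-"


# Made-up anchor positions, 500,000 bases apart.
facing = is_convergent(Anchor(1_000_000, "+"), Anchor(1_500_000, "-"))
facing_away = is_convergent(Anchor(1_000_000, "-"), Anchor(1_500_000, "+"))
same_direction = is_convergent(Anchor(1_000_000, "+"), Anchor(1_500_000, "+"))
print(facing, facing_away, same_direction)
```

Only the first pair satisfies the rule, regardless of how far apart the two anchors are.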
How might the complexity of DNA folding in humans compare with that in other organisms, such as chimpanzees?