Wednesday, 10 December 2014

The chromatin code

A dynamic landscape


The chromatin polymer is a dynamic assembly of DNA and various nuclear proteins that spatially regulates eukaryotic gene transcription by structural adaptation and genome compartmentalisation. 

Gene expression is primarily regulated by chromatin accessibility. The repeating unit of chromatin is the nucleosome, composed of approximately 147 base pairs of DNA encircling an octamer of two copies of each of the four core histone proteins: H2A, H2B, H3, and H4. A relaxed chromatin structure where DNA is loosely associated with nucleosomes (euchromatin) is permissive to the transcriptional machinery, allowing the synthesis of RNA. By contrast, genes are silenced when DNA is tightly wrapped around histones and buried within the chromatin (heterochromatin). It is by selective activation and silencing of different genes that the more than 200 distinct cell types of the human body are derived from a single source of genetic information. Furthermore, chromatin dynamics provide the adaptive agency for cells to manage environmental variations with rapid changes in gene expression.

Small chemical modifications that decorate the chromatin landscape distinguish active and silent genomic regions. Modifications frequently occur at specific amino acid residues on histones as well as on the DNA itself, and together they comprise the epigenome.

The chemical signatures of chromatin architecture influence gene expression by several distinct mechanisms. Individual modifications can recruit functional complexes responsible for the structural reorganisation of chromatin. Other modifications directly affect the stability of the histone-DNA complex to promote an open conformation. These gene-regulatory effects, in combination with substantial functional interplay between histone modifications, have led to the proposition of a chromatin code that greatly extends the information potential of the genetic code by shaping transcriptional competency.


The epigenome is diverse


Methylation of DNA, predominantly at cytosine nucleotides adjacent to guanine residues (CpG) at gene promoters, is mainly associated with gene suppression and corresponds most closely to the etymological interpretation of epigenetics. Regulatory proteins mechanistically interpret this prevalent epigenetic modification by recruiting chromatin-remodelling complexes to establish transcriptionally repressive chromatin. The recent discovery of the Ten-Eleven Translocation (TET) family of dioxygenases, which oxidise methyl-cytosine to generate hydroxymethyl-cytosine, has inspired strong interest in transitions between methylated and unmethylated DNA in mammalian cells.

Equally important is the wealth of chemical groups dynamically written to and erased from the protruding N-terminal tails of histones by specialised enzymes and complexes. Here the methyl modification also occupies a prominent role in gene regulation when assigned to lysine (K) and arginine (R) residues. Moreover, the effect on transcription depends on the site of modification. For example, methylation at lysines 9 and 36 of H3 histones associates with gene repression and activation, respectively. Furthermore, diverse degrees of methylation (lysine residues can be mono-, di-, or tri-methylated, and arginine residues mono-methylated and asymmetrically or symmetrically di-methylated) are differentially distributed across chromatin and ascribed distinct functional roles in gene regulation. A prominent example is the range of methyl modifications occurring on lysine 4 of H3 histones. The tri-methylated form of this modification punctuates the start of actively transcribed genes. On the other hand, di-methylation is dispersed throughout gene bodies, and mono-methylation is frequently observed at both proximal and distal regulatory regions of active genes. Counteracting the activities of histone methylases are enzymes responsible for the removal of methyl groups, such as JmjC-domain-containing proteins and LSD1.

In contrast, site-specific histone lysine acetylation is generally associated with open chromatin and transcriptional activation. Regulated by the opposing activities of histone acetyltransferases (HATs) and deacetylases (HDACs), acetyl groups are thought to disrupt the charge states of histone tails, altering their contact with the nucleosome and reducing their affinity for DNA. Additionally, acetylation of lysines 9 and 27 of H3 histones directly competes with the transcriptionally repressive methylation of these residues.
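Site- and degree-specific marks like those described above are commonly written in a compact shorthand, for example H3K4me3 for tri-methylation of lysine 4 on histone H3, or H3K27ac for acetylation of lysine 27 on H3. The following is a minimal Python sketch of how such names could be broken into their parts. The names `HistoneMark`, `parse_mark`, and `describe` are hypothetical and chosen for illustration only, and the sketch deliberately ignores the symmetric versus asymmetric distinction for arginine di-methylation.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Histone, residue letter (K = lysine, R = arginine), position,
# then either "ac" or "me" followed by the degree of methylation.
_MARK_PATTERN = re.compile(r"^(H2A|H2B|H3|H4)([KR])(\d+)(?:(ac)|(me)([123]))$")

_RESIDUES = {"K": "lysine", "R": "arginine"}
_DEGREES = {1: "mono", 2: "di", 3: "tri"}


@dataclass(frozen=True)
class HistoneMark:
    histone: str            # one of the four core histones
    residue: str            # "lysine" or "arginine"
    position: int           # residue number within the histone
    modification: str       # "acetylation" or "methylation"
    degree: Optional[int]   # 1-3 for methylation, None for acetylation


def parse_mark(name: str) -> HistoneMark:
    """Parse shorthand such as 'H3K4me3' or 'H3K27ac' into its parts."""
    match = _MARK_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognised histone mark: {name!r}")
    histone, letter, position, acetyl, _methyl, degree = match.groups()
    if acetyl and letter != "K":
        # This sketch only models acetylation of lysine residues.
        raise ValueError("acetylation is modelled on lysine only")
    if letter == "R" and degree == "3":
        # Arginine is mono- or di-methylated, never tri-methylated.
        raise ValueError("arginine cannot be tri-methylated")
    return HistoneMark(
        histone=histone,
        residue=_RESIDUES[letter],
        position=int(position),
        modification="acetylation" if acetyl else "methylation",
        degree=int(degree) if degree else None,
    )


def describe(mark: HistoneMark) -> str:
    """Spell a parsed mark out in words."""
    prefix = f"{_DEGREES[mark.degree]}-" if mark.degree else ""
    return (f"{prefix}{mark.modification} of {mark.residue} "
            f"{mark.position} on histone {mark.histone}")
```

Under these assumptions, `describe(parse_mark("H3K4me3"))` gives "tri-methylation of lysine 4 on histone H3", the mark that punctuates the start of actively transcribed genes.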

Other important histone modifications include phosphorylation, ubiquitination, SUMOylation, ADP-ribosylation, and the recently emerged O-linked addition of N-acetylglucosamine (O-GlcNAcylation).



Figure: Gene function is primarily regulated by chromatin accessibility. Shown here is a summary of acetyl (purple) and methyl (red) modifications of specific arginine (R) and lysine (K) residues on N-terminal tails of the heavily modified H3 and H4 histones, as well as DNA methylation at active and silent regions of chromatin.


Who regulates the regulators?


The enzymes responsible for the transfer and removal of histone modifications demonstrate strong specificity toward particular amino acid positions within histone tails. Research has characterised a substantial catalogue of enzymes that catalyse more than 100 histone tail marks, and several biological factors are known to influence the expression of these epigenetic modifiers. However, a number of questions remain unanswered regarding the precise mechanisms underlying their enzymatic function. Specifically, how are chromatin-modifying reactions regulated?

One current topic in the field is the effect of metabolism on the epigenome. Several components of the epigenetic machinery require intermediates of cellular metabolism for enzymatic function. Dietary factors such as folate and B vitamins influence cellular concentrations of S-adenosyl methionine, the universal methyl donor for methylation reactions. Similarly, acetyl-CoA generated from glucose and fatty acid metabolism is the essential acetyl group donor for lysine acetylation reactions. Furthermore, some DNA and histone demethylase enzymes utilise specific intermediates of the citric acid cycle.

Secondly, post-translational modifications are not restricted to chromatin. In fact, the same enzymes that write and erase epigenetic marks modify a diverse array of non-histone proteins. These include not only transcription factors, which can independently influence gene expression, but also other histone modifiers. For example, acetylation enhances the enzymatic functions of the P/CAF, p300, and MYST acetylases, and lowers the methylase activity of SUV39H1. On the other hand, lysine methylation activates SUV39H1 and destabilises the DNA methylase DNMT1. Characterisation of this network of modifications controlling enzyme function and consequent epigenetic chromatin modulation holds immense potential to further our understanding of gene regulation.

Finally, an important role for non-coding RNA molecules in the establishment of cell-type and gene-specific chromatin modification patterns has recently emerged. Advances in nucleic acid sequencing technologies have revealed that while approximately 90% of the human genome is transcribed, only 1-2% of it encodes proteins. Non-coding RNAs vary in length and function, with short transcripts such as microRNAs playing numerous regulatory roles in gene expression, primarily at the mRNA level. Importantly, long non-coding RNAs (lncRNAs) interact with epigenetic enzymes to direct their chromatin binding. For example, the HOTAIR lncRNA simultaneously recruits the Polycomb Repressive Complex 2 and the LSD1 demethylase for coordinated methylation of lysine 27 and demethylation of lysine 4 on H3 histones. Future research is anticipated to uncover many scaffolding and tethering roles for lncRNAs in the specific localisation of chromatin changes.



Mapping the chromatin landscape

Whether these observations constitute a true code is often questioned. The existence of a strict chromatin code implies that definite patterns of post-translational histone modifications instruct a rigid functional outcome. Unlike the causal nature of the genetic code, it is more likely that combinatorial patterns of histone modifications create a biased chromatin landscape that generally favours a particular transcriptional outcome. Nonetheless, genetic knock-out animals and similar approaches in cells demonstrate the necessity of epigenetic regulation.
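The difference between a strict code and a biased landscape can be made concrete with a toy comparison. The sketch below is purely illustrative and is not a published model: the signs of the weights follow the associations described in this post, while their magnitudes, the logistic form, and the names `strict_code` and `activation_probability` are assumptions made up for the example.

```python
import math
from typing import Iterable

# Signs follow the associations described in the text;
# the magnitudes are hypothetical and carry no biological meaning.
HYPOTHETICAL_WEIGHTS = {
    "H3K4me3": 2.0,                    # marks the start of active genes
    "H3K27ac": 1.5,                    # acetylation, open chromatin
    "H3K9me3": -2.0,                   # repressive methylation
    "promoter_CpG_methylation": -1.5,  # DNA methylation, gene suppression
}

# A strict code: each exact pattern maps to one fixed outcome.
_STRICT_TABLE = {
    frozenset({"H3K4me3", "H3K27ac"}): "active",
    frozenset({"H3K9me3"}): "silent",
}


def strict_code(marks: Iterable[str]) -> str:
    """Rigid reading: any pattern missing from the table has no defined outcome."""
    return _STRICT_TABLE.get(frozenset(marks), "undefined")


def activation_probability(marks: Iterable[str]) -> float:
    """Biased reading: marks shift the odds of transcription without fixing it."""
    score = sum(HYPOTHETICAL_WEIGHTS.get(mark, 0.0) for mark in set(marks))
    return 1.0 / (1.0 + math.exp(-score))
```

In this toy picture, a mixed pattern such as H3K4me3 together with H3K9me3 has no entry in the strict table, whereas the biased reading simply returns an intermediate probability, which is closer to the view that modifications favour rather than dictate a transcriptional outcome.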

Chromatin modifications are increasingly studied in development and disease, and considerable interest surrounds the pharmacological targeting of epigenetic enzymes for therapy. HDAC inhibitors in particular are under rigorous investigation for clinical use in the treatment of human diseases such as cancer and heart disease. Clearly, the thorough characterisation of the epigenome holds immense promise for the clinic.

Increased availability and use of massively parallel sequencing has allowed the epigenetic analysis of various cell types and biological contexts. Accelerated generation of epigenomic datasets has driven the accumulation of large repositories of data. This shift to a systems-level perspective signifies a fundamental change in the way cell biology is investigated, rapidly propelling our modern knowledge of medicine and biology. The generation of cell-specific epigenomic maps and transcriptome profiles is fundamental to a truly comprehensive understanding of gene regulation.


References



3. Bernstein BE, et al. (2005) Genomic maps and comparative analysis of histone modifications in human and mouse. Cell 120(2): 169-81

4. Heintzman ND, et al. (2007) Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat Genet 39(3): 311-8

5. Klose RJ, et al. (2006) JmjC-domain-containing proteins and histone demethylation. Nat Rev Genet 7(9): 715-27

6. Verdone L, et al. (2005) Role of histone acetylation in the control of gene expression. Biochem Cell Biol 83(3): 344-53

7. Hanover JA. (2010) Epigenetics gets sweeter: O-GlcNAc joins the "histone code". Chem Biol 17(12): 1272-4

8. Donohoe DR & Bultman SJ. (2012) Metaboloepigenetics: interrelationships between energy metabolism and epigenetic control of gene expression. J Cell Physiol 227(9): 3169-77

9. Santos-Rosa H, et al. (2003) Mechanisms of P/CAF auto-acetylation. Nucleic Acids Res 31(15): 4285-92

10. Thompson PR, et al. (2004) Regulation of the p300 HAT domain via a novel activation loop. Nat Struct Mol Biol 11(4): 308-15

11. Yuan H, et al. (2012) MYST protein acetyltransferase activity requires active site lysine autoacetylation. EMBO J 31(1): 58-70

12. Vaquero A, et al. (2007) SIRT1 regulates the histone methyl-transferase SUV39H1 during heterochromatin formation. Nature 450(7168): 440-4

13. Wang D, et al. (2013) Methylation of SUV39H1 by SET7/9 results in heterochromatin relaxation and genome instability. PNAS 110(14): 5516-21

14. Esteve PO, et al. (2009) Regulation of DNMT1 stability through SET7-mediated lysine methylation in mammalian cells. PNAS 106(13): 5076-81

15. Mercer TR, et al. (2009) Long non-coding RNAs: insights into functions. Nat Rev Genet 10(3): 155-9

16. Clark MB, et al. (2011) The reality of pervasive transcription. PLoS Biol 9(7)

17. Tsai MC, et al. (2010) Long noncoding RNA as modular scaffold of histone modification complexes. Science 329(5992): 689-93

18. Mathiyalagan P, et al. (2014) Interplay of chromatin modifications and non-coding RNAs in the heart. Epigenetics 9(1): 101-12