Nucleic acid modifications in regulation of gene expression

Nucleic acids carry a wide range of chemical modifications. In contrast to previous views that these modifications are static and play only fine-tuning roles, recent research advances paint a much more dynamic picture. Nucleic acids carry diverse modifications and employ these chemical marks to exert essential or critical influences on a variety of cellular processes in eukaryotic organisms. This review covers several nucleic acid modifications that play important regulatory roles in biological systems, especially in regulation of gene expression: 5-methylcytosine (5mC) and its oxidative derivatives, and N6-methyladenine (6mA) in DNA; N6-methyladenosine (m6A), pseudouridine (Ψ), and 5-methylcytosine (m5C) in messenger RNA and long non-coding RNA. Modifications in other non-coding RNAs, such as tRNA, miRNA, and snRNA, are also briefly summarized. We provide a brief historical perspective of the field, and highlight recent progress in identifying diverse nucleic acid modifications and exploring their functions in different organisms. Overall, we believe that work in this field will yield additional layers of both chemical and biological complexity as we continue to uncover functional consequences of known nucleic acid modifications and discover new ones.

A Brief History of DNA 5-methylcytosine methylation in higher eukaryotes

The existence of cytosine methylation (5mC) in genomic DNA was first reported by Wyatt in 1951 (Wyatt, 1951). More than two decades later, the regulatory maintenance of the 5mC pattern across cell divisions was proposed (Holliday and Pugh, 1975; Riggs, 1975). The activity of the postulated writer enzymes, mammalian methyltransferases, was detected in cellular extracts early on (Kalousek and Morris, 1968; Roy and Weissbach, 1975). It was not until 1983, however, that the first DNA methyltransferase, Dnmt1, was purified by the Ingram group (Bestor and Ingram, 1983). Dnmt1 was shown to preferentially methylate hemimethylated DNA at CpG sites, and its loss in mouse embryonic stem cells (mESCs) leads to genome-wide depletion of CpG methylation, indicating the methylation-maintaining role of Dnmt1 during DNA replication. Besides the maintenance of 5mC, de novo DNA methylation, i.e., the establishment of 5mC on unmethylated DNA, was detected in early pluripotent embryonic cells by the Jaenisch group in 1982 (Jahner et al., 1982). Subsequent homology studies in mouse carried out by the Li group led to the discovery of Dnmt3a and Dnmt3b, two enzymes responsible for de novo methylation of proviral DNA and repetitive sequences (Okano et al., 1999; Okano et al., 1998). Later the same group showed that these methyltransferases are also required, in cooperation with Dnmt3L, for the establishment of de novo methylation on maternally imprinted genes (Hata et al., 2002).
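The maintenance logic described above can be illustrated with a toy model. The following Python sketch is only an idealization under stated assumptions: each CpG site is reduced to a pair of booleans (inherited strand, partner strand), replication always produces an unmethylated new strand, and maintenance restores every hemimethylated site with perfect efficiency. The names `replicate` and `maintain` are hypothetical and do not correspond to any published tool.

```python
from typing import List, Tuple

# A CpG site is modeled as (inherited strand methylated, partner strand methylated).
Site = Tuple[bool, bool]


def replicate(sites: List[Site]) -> Tuple[List[Site], List[Site]]:
    """Semiconservative replication (idealized).

    Each daughter duplex keeps one parental strand and pairs it with a
    newly synthesized strand that carries no methylation.
    """
    daughter_a = [(first, False) for first, _ in sites]
    daughter_b = [(second, False) for _, second in sites]
    return daughter_a, daughter_b


def maintain(sites: List[Site]) -> List[Site]:
    """Idealized maintenance methylation.

    Hemimethylated sites (exactly one strand methylated) become fully
    methylated; fully unmethylated sites are left untouched, so no de novo
    methylation happens in this step.
    """
    return [(True, True) if a != b else (a, b) for a, b in sites]


# Hypothetical parental pattern: methylated, unmethylated, methylated.
parent = [(True, True), (False, False), (True, True)]
daughter_a, daughter_b = replicate(parent)
print(daughter_a)            # hemimethylated where the parent was methylated
print(maintain(daughter_a))  # parental pattern restored
```

In this sketch, skipping `maintain` over successive rounds of `replicate` loses the pattern, which mirrors the genome-wide depletion of CpG methylation reported upon loss of Dnmt1.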

The functional outcomes of DNA methylation are generally associated with the repression of gene expression. Early studies benefited largely from the inhibitory effect of 5-azacytidine on DNA methylation in living cells, in which treatment with this nucleoside analog was shown to reactivate silenced genes (Clough et al., 1982; Jones and Taylor, 1980; Mohandas et al., 1981). A later study using Dnmt1 knockout mice also revealed that loss of methylation led to the reactivation of several naturally inactive genes (Li et al., 1993). These findings suggested the repressive nature of DNA methylation.

A more direct approach for the functional investigation of 5mC in genomic DNA involves the discovery and characterization of proteins that recognize 5mC and carry out subsequent actions, i.e., 5mC effectors or readers (Figure 1). The first 5mC reader to be characterized was the methyl-CpG binding protein complex MeCP1, which was identified by the Bird group (Meehan et al., 1989). Subsequent studies eventually revealed four 5mC readers belonging to the methyl-CpG binding domain (MBD) family: MeCP2, MBD1, MBD2, and MBD4 (MBD3 in this family is not a 5mC reader) (Hendrich and Bird, 1998). Among them, MeCP2, MBD1, and MBD2 have been shown to be involved in 5mC-dependent transcriptional repression (Bird and Wolffe, 1999). An unrelated protein, the p120 catenin partner Kaiso, was also found to be a specific 5mC reader that functions as a methylation-dependent transcriptional repressor (Prokhortchouk et al., 2001). The discovery and subsequent characterization of 5mC readers led to a more comprehensive mechanistic elucidation of DNA methylation in gene expression regulation. This specific binding of reader proteins to methylated CpG results in repression of gene expression and represents a fundamental epigenetic mechanism of particular importance in higher organisms.

Figure 1. Scheme of the reversible cytosine methylation in DNA, and binding proteins that are known or proposed to bind modified cytosine derivatives (Liyanage et al., 2014).

DNA methylation has long been associated with regulation of gene expression. Together with histone modifications, DNA methylation modulates chromatin structure and affects cognate gene expression by maintaining various expression patterns across cell types (Cheng and Blumenthal, 2010; De Carvalho et al., 2010). The presence of DNA methylation in the promoter region is directly connected to repression of transcription. Two possible models describe how methylation induces this repression. In the “indirect” model, DNA methylation recruits reader proteins that act as transcriptional repressors, e.g., MeCP and Kaiso proteins, which prevent transcription factors from accessing the promoter region. In the “direct” model, the methyl mark itself interferes with the binding of certain transcription factors and thus prevents the activation of the corresponding genes (Boyes and Bird, 1991; Iguchi-Ariga and Schaffner, 1989; Kovesdi et al., 1987; Nan et al., 2007; Watt and Molloy, 1988). In contrast, DNA methylation in the gene body shows a positive correlation with gene expression (Aran et al., 2011; Ball et al., 2009; Hellman and Chess, 2007; Laurent et al., 2010; Lister et al., 2009; Rauch et al., 2009; Shann et al., 2008; Yang et al., 2014). Gene body DNA methylation may affect transcription elongation, regulate RNA splicing, and alter nucleosome positioning, further highlighting the diverse functions of DNA methylation in gene expression (Chodavarapu et al., 2010; Jones, 2012; Laurent et al., 2010; Lorincz et al., 2004). It should be noted that the transcriptional regulatory roles of DNA methylation typically synergize with various histone marks, as the methyltransferases, demethylases, and readers of DNA methylation interact with histone marks or histone modification enzymes.

Dynamic spectrum of cytosine epigenetic marks in DNA: from 5-methylcytosine to its oxidative derivatives

Based on early observations, DNA 5mC methylation was considered to be dynamic and reversible (Mayer et al., 2000; Oswald et al., 2000; Wu and Zhang, 2010). However, although the writers and readers of 5mC were identified and characterized early on, the identity of the erasers, enzymes that remove the 5mC methyl mark, remained a mystery for several decades. This changed with the discovery of 5-hydroxymethylcytosine (5hmC) in the mammalian genome and the identification of TET (ten-eleven translocation) proteins. TET proteins are methylcytosine dioxygenases that utilize dioxygen to oxidize 5mC to 5hmC (Ito et al., 2010; Kriaucionis and Heintz, 2009; Tahiliani et al., 2009). Subsequent studies demonstrated that the TET enzymes can further oxidize 5hmC to 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC) (He et al., 2011; Ito et al., 2011; Pfaffeneder et al., 2011). Both 5fC and 5caC can be recognized and excised by human thymine DNA glycosylase (TDG), followed by base excision repair (BER) to replace the modified cytosine with a normal cytosine, completing the active demethylation process (He et al., 2011; Maiti and Drohat, 2011). Additionally, the 5mC oxidation derivatives 5hmC, 5fC, and 5caC may also be passively diluted to the unmethylated state through cell division (Figure 1) (Inoue and Zhang, 2011).
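The arithmetic behind passive dilution can be sketched briefly. The Python snippet below is a minimal illustration rather than a quantitative model: it assumes that a mark is never copied onto newly synthesized strands, so that semiconservative replication halves the fraction of marked strands at each division. The function name `marked_strand_fraction` is hypothetical.

```python
def marked_strand_fraction(divisions: int, initial_fraction: float = 1.0) -> float:
    """Fraction of DNA strands still carrying a mark after replication.

    Assumes no maintenance: every new strand is unmodified, so each round
    of semiconservative replication halves the marked fraction.
    """
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must lie between 0 and 1")
    return initial_fraction * 0.5 ** divisions


# Print the expected fraction over the first few divisions.
for n in range(5):
    print(n, marked_strand_fraction(n))
```

Under these assumptions, three divisions leave one eighth of the strands marked, which is why passive dilution depends on replication whereas the TET/TDG/BER route does not.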

The close relationship between cytosine methylation and levels of gene expression has prompted intense research focused on the presence and patterns of cytosine methylation within promoter regions (Deaton and Bird, 2011). However, recent progress reveals that distal regulatory elements, genomic regions that lie farther away from genes but are occupied by transcription factors and can loop back to interact with and regulate transcription, may undergo much more dynamic methylation and demethylation (Booth et al., 2012; Lu et al., 2015a; Shen et al., 2013; Song et al., 2013; Wu et al., 2014; Xia et al., 2015; Yu et al., 2012). Genome-wide studies support the hypothesis that active demethylation, associated with the presence of methylated cytosine oxidation derivatives, may play critical roles in cell development and stem cell maintenance (Shen et al., 2014).

Furthermore, the process of active demethylation is known to occur in certain biological contexts. For example, during fertilization, the loss of 5mC in paternal chromosome and the appearance of 5hmC/5fC/5caC have been observed by immunostaining, suggesting the association between active DNA demethylation and embryonic development. The genome-wide distributions of all 5mC oxidation derivatives have also been mapped by recent advances in developing next-generation sequencing methods, which showed wide-spread active demethylation events at the genomic level along with the association of 5hmC/5fC/5caC with functional elements (Inoue et al., 2011; Inoue and Zhang, 2011; Lu et al., 2015b; Smith et al., 2012). In addition, some human cancers have been associated with aberrant TET activity. Reduced 5hmC abundance and downregulation of TET activity were observed during tumor progression, including melanoma, hepatocellular carcinoma, and hematopoietic malignancies (James et al., 2008; Ko et al., 2010; Lian et al., 2012; Liu et al., 2013). These findings indicate that 5mC oxidation derivatives could be used as markers in cancer diagnostics and prognostics. The DNA 5mC oxidation and demethylation pathway could also be targeted for therapeutic interventions.

A new eukaryotic DNA epigenetic mark: N6-methyladenine

Another methylation modification, N6-methyladenine (6mA or m6dA), exists in the genomic DNA of prokaryotes and plays critical roles (Figure 2A) (Ratel et al., 2006). In bacteria, 6mA serves as an important marker participating in DNA repair, replication, and cell defense (Campbell and Kleckner, 1990; Collier et al., 2007; Low et al., 2001; Lu et al., 1994; Messer and Noyer-Weidner, 1988; Ogden et al., 1988). In particular, 6mA is a marker in restriction–modification (R-M) systems, in which 6mA can be recognized by the corresponding restriction endonucleases as a label that protects the host genome from restriction digestion while allowing the degradation of unmethylated foreign DNA (Murray, 2002). The deletion of a 6mA methyltransferase in pathogenic Escherichia coli leads to global transcription changes, indicating that 6mA has significant regulatory functions beyond serving as a host genetic marker (Fang et al., 2012). Intriguingly, although bacteria employ R-M systems to cleave foreign genomic DNA such as bacteriophage DNA, 6mA methyltransferases were found to be encoded by some viral DNA (Arnold et al., 2000; Baranyi et al., 2000; Magrini et al., 1997; Schlagman and Hattman, 1983).

Figure 2. N6-methylation on adenine in genomic DNA. (A) A brief overview of the biological functions of methyl groups in bacterial genomic DNA. (B) High-throughput mapping of N6-methyladenine (6mA) in Chlamydomonas reinhardtii revealed a unique distribution pattern in the genome with complete depletion at transcription start sites (TSS) and high enrichment at the linker region between nucleosomes. (C) In Caenorhabditis elegans 6mA is installed by DAMT-1 and reversibly removed by NMAD-1. The “crosstalk” between 6mA and histone modification, particularly histone H3 methylation, indicates critical roles that 6mA may play in gene expression regulation. (D) 6mA in Drosophila melanogaster could be converted back to A by the Tet homolog DMAD. Intriguingly, the 6mA level is correlated with the expression level of transposons, supporting the regulatory significance of 6mA in eukaryotes.

It should be noted that besides 6mA, bacterial genomes also contain N4-methylcytosine (4mC or m4dC) and 5mC (Figure 2A) (Ehrlich et al., 1987; Vanyushin et al., 1968). These cytosine modifications are also used by bacterial restriction–modification (R-M) systems as defense mechanisms. These two cytosine modifications can be differentiated at base resolution using a revised TAB-seq protocol (Kahramanoglou et al., 2012; Li et al., 2015).

In addition to prokaryotes, several eukaryotes have relatively abundant 6mA in genomic DNA (Cummings et al., 1974; Harrison et al., 1986; Hattman et al., 1978; Rae and Spear, 1978). However, functions of 6mA in eukaryotic systems remained unclear until very recently. The lack of R-M systems suggests that 6mA mainly exerts regulatory roles in these unicellular eukaryotes (Ehrlich and Zhang, 1990). On the other hand, the existence of 6mA in higher eukaryotes, which was supported by indirect evidence, hinted that 6mA might be another potential epigenetic DNA mark in eukaryotes in addition to 5-methylcytosine (Ratel et al., 2006).

In 2015, three groups independently reported the presence of 6mA in three different eukaryotes, shedding light on the function of this methylation modification in eukaryotes (Fu et al., 2015; Greer et al., 2015; Zhang et al., 2015). The existence of 6mA in algal genomic DNA was verified decades ago, but its distribution and biological function were not addressed at the time. By developing and employing several high-throughput sequencing approaches to map 6mA in Chlamydomonas genomic DNA, it was revealed that 6mA is not only enriched at ApT dinucleotides around transcription start sites (TSS), but also labels actively transcribed genes and marks the linker DNA regions between adjacent nucleosomes, indicating a potential gene activation function of adenine methylation in the Chlamydomonas genome (Figure 2B) (Fu et al., 2015).

The discovery of 6mA, its demethylase NMAD-1, and its potential methyltransferase DAMT-1 in C. elegans changed the previous view that C. elegans lacks DNA methylation, raising the intriguing possibility that 6mA serves as the DNA methylation mark in place of 5mC. The phenotypes caused by deletion of nmad-1 and damt-1 and the crosstalk between adenine methylation and histone modifications indicate a potential gene activation role of 6mA (Figure 2C) (Greer et al., 2015).

By knocking out demethylase candidates in the Drosophila genome and monitoring the 6mA level, Zhang, Huang et al. found that the Drosophila Tet homolog is likely responsible for the demethylation of 6mA in the Drosophila genome. The identified DNA 6mA demethylase (DMAD) regulates the level of 6mA during embryogenesis and tissue homeostasis. Further sequencing analyses revealed that the dynamic demethylation is correlated with transposon expression and plays a critical role in development (Figure 2D) (Zhang et al., 2015).

Beyond the DNA: N6-methyladenosine methylation and other RNA modifications on messenger RNA and long non-coding RNA in regulating post-transcriptional gene expression

In addition to genomic DNA being regulated by different chemical modifications, RNA molecules are also decorated with similar modifications, and some of them have been appreciated for decades. For example, the existence of N6-methyladenosine (m6A) in mRNA was discovered in 1974 in both eukaryotic and viral mRNAs (Desrosiers et al., 1974; Desrosiers et al., 1975; Dubin and Taylor, 1975; Perry and Kelley, 1974; Perry et al., 1975; Wei and Moss, 1974). m6A is the most prevalent internal modification in mRNAs and long non-coding RNAs (lncRNAs) in higher eukaryotes (Wei et al., 1975). It was revealed that, in the mammalian transcriptome, approximately 3 m6A marks exist per mRNA molecule and occur within a consensus motif of G(m6A)C (70%) or A(m6A)C (30%), but the methylation percentage at each site varies substantially (Carroll et al., 1990; Harper et al., 1990; Horowitz et al., 1984; Kane and Beemon, 1985; Schibler et al., 1977; Wei et al., 1976; Wei and Moss, 1977).
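The consensus described above can be turned into a simple sequence scan. The following is a minimal Python sketch, not a published tool: it lists adenosines that sit in a GAC or AAC context and are therefore compatible with the G(m6A)C or A(m6A)C consensus. The function name `candidate_m6a_sites` and the example sequence are hypothetical. Because the methylated fraction at each site varies substantially, a motif match shows only that a site fits the consensus, not that it is methylated.

```python
import re
from typing import List, Tuple

# G or A, followed by the candidate adenosine, followed by C.
_CONSENSUS = re.compile(r"[GA]AC")


def candidate_m6a_sites(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based index of the central A, trinucleotide context) pairs.

    Only the sequence context is checked; no methylation data are used.
    """
    seq = sequence.upper()
    sites = []
    for match in _CONSENSUS.finditer(seq):
        central_a = match.start() + 1  # the A in the middle of the motif
        sites.append((central_a, match.group()))
    return sites


# Hypothetical RNA fragment used purely for illustration.
print(candidate_m6a_sites("GGACUAACGAC"))
# [(2, 'GAC'), (6, 'AAC'), (9, 'GAC')]
```

A real analysis would combine such candidates with experimental evidence, for example the transcriptome-wide mapping data discussed later in this section.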

The methylation of adenosine in mRNA could be dynamic and could provide a layer of biological regulation (He, 2010). The methyl group on the N6 position of adenosine is installed in the nucleus by the m6A ‘writer’, a multicomponent complex that contains methyltransferase like 3 (METTL3), methyltransferase like 14 (METTL14), and Wilms’ tumor 1-associating protein (WTAP) (Figure 3) (Bokar, 2005; Bokar et al., 1994; Bokar et al., 1997; Liu et al., 2014; Ping et al., 2014; Wang et al., 2014b). Deficiency in the methyltransferase complex leads to significant phenotypes, such as a block in the differentiation of mESCs, lethality at the early stage of mouse embryo development, and developmental arrest or defects in gametogenesis in yeast, flies, and plants (Batista et al., 2014; Bodi et al., 2012; Bokar, 2005; Clancy et al., 2002; Dominissini et al., 2012; Geula et al., 2015; Hongay and Orr-Weaver, 2011; Schwartz et al., 2013; Zhong et al., 2008). In zebrafish, the knockdown of METTL3 leads to a smaller head, eyes, and brain ventricle, and a curved notochord (Ping et al., 2014). Moreover, the m6A level is thought to be regulated by miRNA through the modulated binding of METTL3 to mRNA (Chen et al., 2015).

Figure 3. N6-methyladenosine (m6A) in mRNA and its biological significance. The reversible methylation and demethylation process occurs in the nucleus, catalyzed by the methyltransferase complex and demethylases, respectively. The m6A modification has profound effects on mRNA fate: it switches mRNA to active translation mode, and also accelerates its decay rate.

Two human AlkB family proteins, the fat mass and obesity-associated protein (FTO) and ALKBH5, serve as m6A ‘erasers’, acting as RNA demethylases that remove m6A methylation in mammalian poly(A)-tailed RNA (Figure 3) (Jia et al., 2011; Zheng et al., 2013). FTO has significant effects on development while ALKBH5 affects spermatogenesis, suggesting that m6A influences multiple biological phenomena (Boissel et al., 2009; Claussnitzer et al., 2015; Fischer et al., 2009; Jia et al., 2011; Peters et al., 1999; Zheng et al., 2013). The ‘reader’ proteins YTHDF1 and YTHDF2, which bind specifically to m6A and have been shown to interact with thousands of mRNA targets, mediate methylation-dependent regulation of translation efficiency and mRNA decay, respectively (Figure 3) (Wang et al., 2014a; Wang et al., 2015). Besides the direct reader proteins, two groups have independently demonstrated that m6A methylation also modulates mRNA and lncRNA structure transcriptome-wide, which dramatically affects protein-RNA interactions and thereby impacts mRNA abundance and the alternative splicing of the methylated RNA (Liu et al., 2015; Spitale et al., 2015). hnRNPA2B1 has also been shown to selectively recognize methylated pri-microRNA and promote microRNA maturation (Alarcón et al., 2015a; Alarcón et al., 2015b). In a separate study, m6A depletion was found to prolong nuclear retention and delay the nuclear exit of mature mRNAs of the clock genes Per2 and Arntl, connecting m6A to the pace of the circadian cycle and to clock speed and stability (Fustin et al., 2013). The discovery that m6A is also present in other organisms further indicates its critical roles in biological functions (Deng et al., 2015; Luo et al., 2014).

Transcriptome-wide sequencing revealed that m6A shows distinct distribution patterns in eukaryotic transcriptomes. In mammals, m6A is enriched in the 3′-UTR around stop codons and also marks long exons, which may be related to mRNA splicing (Dominissini et al., 2012; Meyer et al., 2012; Schwartz et al., 2014b). In Arabidopsis mRNA, this mark is enriched not only in the 3′-UTR but also in the 5′-UTR (Luo et al., 2014). It has been suggested that the selectivity of the methylation machinery is modulated by the cofactors in the m6A methyltransferase complex, yet the detailed mechanisms underlying the specific distribution remain to be unveiled (Schwartz et al., 2014b).

Besides m6A, pseudouridine (Ψ) also exists as an internal mRNA modification. Pseudouridine is the most abundant post-transcriptional modification in the RNA realm (Charette and Gray, 2000a; Ge and Yu, 2013). It is well known that pseudouridylation is catalyzed by pseudouridine synthases, and that this modification plays critical roles in ribosomal RNA and non-coding RNA (ncRNA), with defects in pseudouridylation leading to noticeable phenotypes (Fujiwara and Harigae, 2013; Heiss et al., 1998; Hoareau-Aveilla et al., 2008; Jiang et al., 1993; Toh and Mankin, 2008; Zebarjadian et al., 1999). However, the prevalence and distribution of this abundant modification in mRNA have only recently been revealed, with the transcriptome-wide distribution of pseudouridines uncovered by a selective chemical-labelling approach coupled with high-throughput sequencing (Carlile et al., 2014; Li et al., 2015; Schwartz et al., 2014a). Several hundred to thousands of pseudouridylation sites have been revealed in mRNA. Pseudouridylation in mRNA is also dynamically tuned in response to environmental changes and stresses, which in turn affects the stability of mRNA, indicating potential regulatory roles of pseudouridine in mRNA metabolism (Carlile et al., 2014; Li et al., 2015; Schwartz et al., 2014a). Intriguingly, the replacement of the initial uridine in stop codons with pseudouridine leads to “read-through” by specific tRNA species, which suggests a potential recoding role of this modification (Fernandez et al., 2013; Karijolich and Yu, 2011).

Cytosine can be methylated in RNA to form 5-methylcytosine (m5C). The m5C modification of several tRNAs installed by DNMT2 noticeably increases the stability of these tRNAs (Kiani et al., 2013; Schaefer et al., 2010; Tuorto et al., 2012). Multiple RNA cytosine methyltransferases (RCMTs) are present in mammals. The knockout of NSUN4 or the NSUN4 partner protein in mitochondria led to lethality in mouse models (Cámara et al., 2011; Metodiev et al., 2014). In humans, defects or misregulation of RCMTs have been related to cancer, infertility, and intellectual disability (Frye and Watt, 2006; Harris et al., 2007; Hussain et al., 2013; Khan et al., 2012; Khosronezhad et al., 2015; Okamoto et al., 2012). The m5C methylation has been known to exist in mRNA for decades (Dubin and Taylor, 1975; Squires et al., 2012). With a considerable number of m5C sites identified in mRNA by recent transcriptome-wide approaches, it is possible that the same group of methyltransferases that install tRNA and rRNA methylations could also install mRNA m5C methylation (Khoddami and Cairns, 2013; Squires et al., 2012). Moreover, aberrant mRNA m5C methylation could contribute to the phenotypes observed upon mutation or disruption of these methyltransferases.

Modifications on other non-coding RNA that could affect gene expression

tRNA modifications are known to affect translation and a range of physiological processes (Phizicky and Hopper, 2015). In S. cerevisiae, 74 genes are involved in the installation of ~25 chemically distinct modifications present at 36 positions in cytoplasmic tRNAs (Cantara et al., 2011; Machnicka et al., 2013). Modifications in tRNAs have profound effects on critical biological processes. For instance, in S. cerevisiae, mutations in genes related to 16 of the 25 modifications lead to significant phenotypes, including lethality, poor growth, and temperature sensitivity (Phizicky and Hopper, 2015). These modifications are known to affect the stability of the tRNAs and the efficiency of protein translation (Phizicky and Hopper, 2015; Tuorto et al., 2012). Moreover, tRNA modifications also play key roles in stress response and immune recognition (Chan et al., 2012; Fu et al., 2010; Kaiser et al., 2014; Pang et al., 2014; Schaefer et al., 2010; Umeda et al., 2005). Transcriptome-wide methods have been developed to map certain tRNA modifications and to quantify tRNA levels in a high-throughput manner (Cozen et al., 2015; Zheng et al., 2015).

miRNA is a class of endogenous non-coding RNA that regulates gene expression at the post-transcriptional level by pairing to mRNA (Bartel, 2009). The biogenesis of mature miRNA starts from the processing of primary miRNA (pri-miRNA) by the microprocessor complex, which includes the RNA-binding protein DGCR8 and the type III RNase Drosha (Denli et al., 2004; Gregory et al., 2004; Han et al., 2004; Landthaler et al., 2004). The specificity of DGCR8 binding to primary miRNA is less well understood, and it is thought that RNA modifications might be involved. Recent studies reveal that m6A on pri-miRNA plays critical roles in the interaction between pri-miRNA and DGCR8, and that the inhibition of pri-miRNA methylation leads to a global decrease in mature miRNA levels, indicating a key function of m6A in miRNA biogenesis (Alarcón et al., 2015b). RNA modifications also occur in mature miRNA. In plants, 2′-O-methylation is installed by HEN1 at the 3′ end of both miRNA and miRNA*, protecting miRNA from 3′-to-5′ degradation and uridylation (Li et al., 2005; Zhao et al., 2012). The concentration of a specific miRNA species is determined by the equilibrium of biogenesis and degradation. 2′-O-methylation regulates gene expression by affecting the amount of miRNAs through tuning the rate of degradation. In addition to miRNA, siRNAs in plants and Drosophila, as well as piRNAs in animals, are also 2′-O-methylated by HEN1 or HEN1 orthologs (Billi et al., 2012; Horwich et al., 2007; Kamminga et al., 2010; Kamminga et al., 2012; Kirino and Mourelatos, 2007; Kurth and Mochizuki, 2009; Li et al., 2005; Montgomery et al., 2012; Saito et al., 2007). On the other hand, phospho-dimethylation at the 5′ end of pre-miRNA by BCDIN3D leads to the loss of the negative charge of pre-miRNA, thus decreasing the extent of miRNA maturation, which serves as another regulatory mechanism to control the abundance of miRNA in the cell (Xhemalce et al., 2012).
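The statement that miRNA concentration reflects the equilibrium of biogenesis and degradation can be made concrete with a simple rate model. As an illustrative assumption that is not taken from the cited studies, suppose a miRNA is produced at a constant rate k_syn and degraded with a first-order rate constant k_deg, so that dM/dt = k_syn - k_deg * M and the steady-state level is k_syn / k_deg. The Python sketch below encodes only this relation; the function name and the numerical values are hypothetical.

```python
def steady_state_level(k_syn: float, k_deg: float) -> float:
    """Steady-state abundance for constant synthesis and first-order decay.

    Setting dM/dt = k_syn - k_deg * M to zero gives M = k_syn / k_deg.
    """
    if k_syn < 0:
        raise ValueError("k_syn must be non-negative")
    if k_deg <= 0:
        raise ValueError("k_deg must be positive")
    return k_syn / k_deg


# Hypothetical rates in arbitrary units: same synthesis, slower decay.
unprotected = steady_state_level(k_syn=10.0, k_deg=0.5)
protected = steady_state_level(k_syn=10.0, k_deg=0.25)
print(unprotected, protected)  # 20.0 40.0
```

In this toy model, halving the degradation rate doubles the steady-state level without any change in biogenesis, which is the kind of effect expected if a modification such as 2′-O-methylation slows degradation.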

Another class of non-coding RNA species, small nuclear RNA (snRNA), is an integral part of the spliceosome and is extensively modified post-transcriptionally, with 2′-O-methylation and pseudouridylation being the major modifications (Massenet et al., 1998; Reddy and Busch, 1988). The modifications can be installed by either RNA-dependent or RNA-independent mechanisms (reviewed in Karijolich and Yu, 2010). These modifications can alter the structural stability, base stacking, and hydrogen bond formation of snRNA (Agris, 1996). For example, pseudouridine can stabilize certain secondary structures by strengthening base stacking and providing an extra water-mediated interaction between the base and the backbone, while 2′-O-methylation alters the interaction of the ribose with its environment (Arnez and Steitz, 1994; Auffinger and Westhof, 1997, 1998; Charette and Gray, 2000b; Davis, 1995; Helm, 2006). Both modifications play critical roles in small nuclear ribonucleoprotein (snRNP) complex assembly, spliceosome formation, and the splicing process (Helm, 2006; Karijolich and Yu, 2010).

Concluding Remarks and Future Outlook

Our understanding of nucleic acid modifications has evolved over the past few decades. We now appreciate that DNA methylation, as a bona fide epigenetic marker, is not only inheritable and dynamic, but also involved in diverse regulatory processes. Aberrant DNA methylation patterns are known to be associated with many types of human diseases including cancer (Feinberg et al., 2006; Jones and Baylin, 2007). The imbalance of the DNA methylation, which was previously thought to be caused by the dysfunction of methylation machinery, now is also considered to be induced by the abnormal status of demethylation machinery through TET-mediated active 5mC demethylation. In human cancer cells, 5hmC is largely depleted and the expression levels of TET genes tend to be reduced (Jin et al., 2011; Yang et al., 2013). Both 5mC and 5hmC could serve as disease markers for early diagnosis and prognosis. The recent discoveries of 6mA as a functional DNA mark in eukaryotic genomic DNA raise the possibility that 6mA plays regulatory roles complementary to 5mC. With a better understanding of 6mA methyltransferase and 6mA demethylase and discovery of potential reader proteins, DNA methylation looks to be a ubiquitous epigenetic marker in almost all kingdoms of life. The interplay between 5mC and 6mA in species that contain both marks, and the transition from 6mA to 5mC in regulating certain biological processes are fascinating subjects that remain to be investigated not only on the functional aspects but also with the perspective of evolution.

Unlike genomic DNA, RNA undergoes more complicated post-transcriptional processing: RNA splicing significantly increases the complexity of gene expression by alternatively joining exons and removing introns; RNA editing alters the nucleoside sequence of specific transcripts, which may or may not change protein-coding regions or potential splicing sites to further diversify the transcriptome; RNA chemical modifications, most of which do not affect nucleotide sequence, are much more diverse and functionally versatile, suggesting broader functional impacts (Figure 4) (He, 2010; Slotkin and Nishikura, 2013). While tRNA and rRNA modifications as well as mRNA cap methylations have been studied in the past, it was only recently that the internal m6A methylation in mRNA was shown to be reversible (Fu et al., 2014; He, 2010; Jia et al., 2013; Pan, 2013; Zheng et al., 2013). Recent studies have uncovered this mRNA methylation as a new realm of biological regulation at the post-transcriptional level. It is now believed that internal modifications, such as m6A and Ψ, are distributed in unique patterns and affect multiple RNA metabolic processes to impact gene expression. While Ψ and m5C may not be reversed in RNA, other methylations on heteroatoms resembling m6A could be more broadly spread and be dynamically and reversibly regulated by specific enzyme systems. They could affect RNA metabolism and function via RNA structure alteration or recognition by specific reader proteins. As new modifications and new functions continue to emerge, these chemical marks on RNA may collectively provide additional tuning that affects biological outcomes at the post-transcriptional level. With the development of new approaches to quantitatively analyze RNA modifications in a transcriptome-wide manner, a quantitative picture of how chemical modifications affect gene expression regulation and their effects in various human diseases will emerge. RNA modifications may very likely mirror histone modifications: multiple chemical marks on bio-macromolecules that are dynamically controlled by multiple enzymes and proteins to enable synergistic regulation of the metabolism, processing, and function of the target RNA.

Figure 4. A partial spectrum of diverse RNA chemical modifications.

In summary, the diverse chemical modifications in nucleic acids provide essential or critical chemical-coding processes that exponentially expand the complexity of eukaryotic organisms. These modifications serve as another layer of information carrier, precisely regulating almost every aspect of cell physiology. These pathways provide new opportunities for chemical biologists to investigate the underlying mechanisms, manipulate the modification status to affect gene expression, and develop small molecules or other means to tune these pathways for fundamental research and therapeutic purposes in the future.

Acknowledgments

C.H. is supported by National Institutes of Health GM071440. C.H. is also an investigator of the Howard Hughes Medical Institute.

References