Study : In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features


RNA structure plays critical roles in processes ranging from ligand-sensing to regulation of translation, polyadenylation, and splicing. Lack of genome-wide in vivo RNA structural data, however, has limited our understanding of how RNA structure regulates gene expression in living cells. Here we present a high-throughput, genome-wide in vivo RNA structure probing method, “Structure-Seq”, in which dimethylsulfate (DMS) methylation of unprotected As and Cs is identified by next-generation sequencing. Application of this method to Arabidopsis thaliana seedlings yielded the first in vivo genome-wide RNA structure map at nucleotide resolution for any organism, with quantitative structural information across more than 10,000 transcripts. Our analysis reveals a three-nucleotide periodic repeat pattern in the structure of coding regions, as well as a less structured region immediately upstream of the start codon, and shows that these features are strongly correlated with translation efficiency. We also found patterns of strong and weak secondary structure at sites of alternative polyadenylation, as well as strong secondary structure at 5′ splice sites that correlates with unspliced events. Remarkably, in vivo structures of mRNAs annotated for stress responses are poorly predicted in silico, while mRNA structures of genes related to cell function maintenance are well predicted. Global comparison of several structural features between these two categories shows that the mRNAs associated with stress responses tend to have more single-strandedness, longer maximal loop length, and higher free energy per nucleotide, features that may allow these RNAs to undergo conformational changes in response to environmental conditions. Structure-Seq allows the RNA structurome and its biological roles to be interrogated on a genome-wide scale and should be applicable to any organism.
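The three-nucleotide periodicity mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' analysis pipeline. It assumes a hypothetical per-nucleotide reactivity profile for one coding region, given as a list of floats, with None where no signal is available (for example at G and U positions, which DMS does not report on here). The function names codon_position_means and period3_power_fraction are illustrative only.

```python
import cmath
from statistics import mean


def codon_position_means(reactivity):
    """Mean reactivity at codon positions 1, 2 and 3.

    `reactivity` is assumed to start at the first nucleotide of the
    coding region; None entries (no data) are skipped.
    """
    buckets = [[], [], []]
    for i, value in enumerate(reactivity):
        if value is not None:
            buckets[i % 3].append(value)
    return [mean(b) if b else float("nan") for b in buckets]


def period3_power_fraction(reactivity):
    """Share of one-sided, non-DC spectral power found at period 3.

    Expects a profile without missing values. The profile is trimmed
    to a multiple of 3 so that period 3 falls on an exact DFT bin.
    """
    n = len(reactivity) - len(reactivity) % 3
    if n == 0:
        return 0.0
    x = reactivity[:n]
    mu = sum(x) / n
    centered = [v - mu for v in x]

    def power(k):
        # Squared magnitude of DFT coefficient k (naive O(n) per bin).
        s = sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, v in enumerate(centered))
        return abs(s) ** 2

    total = sum(power(k) for k in range(1, n // 2 + 1))
    return power(n // 3) / total if total > 0 else 0.0


# Toy profile (made up): high reactivity at every first codon position.
toy = [1.0, 0.0, 0.0] * 10
print(codon_position_means(toy))    # [1.0, 0.0, 0.0]
print(period3_power_fraction(toy))  # close to 1 for a purely periodic signal
```

A value of period3_power_fraction near 1 indicates that almost all variation in the profile repeats every three nucleotides, while a flat or aperiodic profile gives a value near 0. A real analysis would average over many transcripts and handle missing positions explicitly.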


Accession number Name Taxon