Genetic diversity in the processing and transcriptomic diversity in the targeting of microRNAs
MicroRNAs are short RNA molecules that are central to the regulation of many cellular and developmental pathways. They are processed in several stages, from structured precursors in the nucleus into mature microRNAs in the cytoplasm, where they direct protein complexes to regulate gene expression through often imperfect base-pairing with target messenger RNAs. The broad aim of this project is to better understand how polymorphisms and new mutations can disrupt microRNA processing and targeting, and ultimately to define their contributions to human disease. I have taken two approaches towards this. The first approach is to comprehensively identify microRNA targets by developing and applying a novel computational pipeline that identifies microRNA binding events genome-wide in RNA-RNA interaction datasets. I use this to examine the transcriptomic diversity of microRNA binding, finding microRNA binding events along the full length of protein-coding transcripts and with a variety of non-coding RNAs. This reveals enrichment for non-canonical microRNA binding at promoters and in intronic regions around splice sites, and identifies highly spatially clustered binding sites within transcripts that may act as competitive endogenous RNAs, effectively sequestering microRNAs. Using statistical models and new cell-fractionated RNA-seq data, I rank the features of microRNAs and their binding sites that contribute to the strength and specificity of their interaction, to provide a better understanding of the major determinants of microRNA targeting. The second approach is to directly identify DNA sequence changes in microRNA precursors that alter processing efficiency and thereby affect mature microRNA abundance; such changes are routinely overlooked in the search for disease- or trait-associated causal variants.
I have systematically screened public datasets for both rare and common polymorphisms that overlap microRNA precursors and are correlated with mature microRNA levels as measured by short RNA sequencing. I use these eQTL SNPs to examine the most important microRNA precursor regions and sequence motifs. Several of these SNPs have been observed as risk factors in cancer or other clinically relevant traits, and are correlated with microRNA processing efficiency. I demonstrate that a specific DNA change, known to be important in the development of some cancers, is located in a microRNA precursor and affects the balance of the two products, miR-146a-3p and miR-146a-5p, that can be produced from that single precursor. This provides new insights into the mechanisms of microRNA production and the aspects of genetic mis-regulation that result in cancer. I find further examples of common human polymorphisms that appear to affect microRNA production from their precursors; several of these variants are independently implicated in human immune disease and cancer susceptibility, and are associated with other complex traits. As they exhibit a molecular phenotype and immediately lead to testable mechanistic hypotheses of trait causality, these variants could provide a route into the frequently intractable problem of mechanistically linking non-coding genetic variation to human phenotypes. Applying similar studies to patient DNA has revealed rare and unique DNA changes that are now candidates for causing human disease and are the subject of follow-up experimental studies. Collectively, this work has started to define which sequence changes in microRNAs are likely to disrupt their function, and provides a paradigm for the analysis of microRNA sequence variants in human genetic disease.
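As a purely illustrative sketch of the screening idea described above, and not the pipeline used in this work, the following Python shows one way a variant could be checked for overlap with a microRNA precursor and then tested for association with the balance of the precursor's two arms. The function names, the coordinate convention (1-based, inclusive intervals), the pseudocount, and the use of a simple least-squares fit against genotype dosage are all assumptions made for the example.

```python
import math


def overlaps_precursor(snp_pos, precursors):
    """Return names of precursors whose inclusive [start, end] interval contains snp_pos.

    `precursors` maps a (hypothetical) precursor name to a (start, end) tuple
    on the same chromosome as the SNP.
    """
    return [name for name, (start, end) in precursors.items()
            if start <= snp_pos <= end]


def log2_arm_ratio(count_5p, count_3p, pseudocount=1.0):
    """log2 ratio of 5p to 3p read counts; the pseudocount (an assumption) avoids division by zero."""
    return math.log2((count_5p + pseudocount) / (count_3p + pseudocount))


def slope_and_r(x, y):
    """Least-squares slope of y on x and the Pearson correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("no variation in genotype or expression")
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


# Toy inputs, invented for illustration only (not real coordinates or data).
precursors = {"example-precursor": (100, 180)}
dosage = [0, 1, 2]                  # copies of the alternate allele per sample
arm_counts = [(3, 3), (7, 3), (15, 3)]   # (5p reads, 3p reads) per sample

if overlaps_precursor(150, precursors):
    ratios = [log2_arm_ratio(c5, c3) for c5, c3 in arm_counts]
    slope, r = slope_and_r(dosage, ratios)
```

A real analysis would additionally need normalised expression values, covariates such as population structure, and multiple-testing correction, none of which are shown here.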