MicroRNAs (miRNAs) are versatile post-transcriptional regulators of gene expression. Canonical miRNAs are generated through the two-step DROSHA- and DICER-mediated processing of primary miRNA (pri-miRNA) transcripts with optimal or suboptimal features for DROSHA and DICER cleavage and loading into Argonaute (AGO) proteins, whereas multiple hairpin-structured RNAs are encoded in the genome and could be a source of non-canonical miRNAs. Advances in miRNA biogenesis research have revealed details of the structural basis of miRNA processing and cluster assistance mechanisms that facilitate the processing of suboptimal hairpins encoded together with optimal hairpins in polycistronic pri-miRNAs. In addition, a deeper investigation of miRNA–target interaction has provided insights into the complexity of target recognition with distinct outcomes, including target-directed miRNA degradation (TDMD) and cooperation in target regulation by multiple miRNAs.
1. Introduction
MicroRNAs (miRNAs) are versatile post-transcriptional regulators of gene expression
[1][2][3]. MiRNAs are small non-coding RNAs (ncRNAs), approximately 22 nt in length, which mainly utilize the 5′ seed sequences (nucleotides 2–7) to recognize diverse target mRNAs and direct them for suppression. Because target recognition depends on short seed sequences, one miRNA affects many genes, and one gene can be affected by multiple miRNAs. In particular, the 3′-UTR of an mRNA can contain multiple miRNA target sites and undergo complex, context-dependent regulation by multiple miRNAs. In addition, related miRNA families are frequently encoded together in polycistronic pri-miRNAs
[4]. The coordinated or network regulation of both miRNA biogenesis and miRNA–target interaction is prevalent in miRNA biology.
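As a rough illustration of seed-based target recognition, the following Python sketch scans a 3′-UTR for sites that are perfectly complementary to nucleotides 2–7 of a miRNA. The function names and the toy UTR are hypothetical and introduced only for illustration; the two miRNA sequences are the commonly annotated hsa-let-7a-5p and hsa-miR-21-5p sequences. Real target prediction weighs many more features than a perfect seed match.

```python
# Minimal sketch of seed matching; names and the toy UTR are assumptions.

def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def seed_sites(mirna: str, utr: str) -> list[int]:
    """0-based start positions in `utr` (5'->3') that pair perfectly
    with miRNA nucleotides 2-7 (the seed)."""
    seed = mirna[1:7]                      # nucleotides 2-7, 1-based
    site = reverse_complement(seed)        # what the seed pairs with
    width = len(site)
    return [i for i in range(len(utr) - width + 1) if utr[i:i + width] == site]


MIRNAS = {
    "let-7a": "UGAGGUAGUAGGUUGUAUAGUU",
    "miR-21": "UAGCUUAUCAGACUGAUGUUGA",
}

# Hypothetical 3'-UTR fragment carrying two let-7a sites and one miR-21 site.
UTR = "AAACUACCUCAGGGUACCUCUUGAUAAGCUAGG"

sites = {name: seed_sites(seq, UTR) for name, seq in MIRNAS.items()}
print(sites)  # {'let-7a': [4, 14], 'miR-21': [24]}
```

The toy UTR shows both sides of the many-to-many relationship: one miRNA can have several sites in a transcript, and one transcript can carry sites for several miRNAs.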
2. Biogenesis of Canonical miRNAs
The biogenesis of canonical and non-canonical miRNAs has been systematically and quantitatively characterized in many studies (
Figure 1a)
[1][2][3]. Canonical miRNAs in animals are transcribed by RNA polymerase II as long primary miRNA transcripts (pri-miRNAs). The highly active transcription of cell type-specific miRNAs via super-enhancers contributes to highly biased expression patterns of miRNAs, where a small subset of miRNAs dominates miRNA expression and function
[5]. Hairpin structures within pri-miRNAs are cleaved by DROSHA and DGCR8 endonuclease complexes, yielding hairpin RNAs (precursor miRNAs, pre-miRNAs). The export of pre-miRNAs from the nucleus to the cytoplasm is mediated by exportin-5 (XPO5) and RAN-GTP, while the presence of other export mechanisms has been suggested
[6]. In the cytoplasm, pre-miRNAs are further processed by DICER endonucleases to yield miRNA duplexes
[7]. The miRNA duplex is loaded into Argonaute (AGO) proteins (AGO1–AGO4 in mammals). Once loaded, only one strand, termed the guide strand, whose 5′-nucleotide interacts with the MID domain of AGO proteins, is retained to form the final AGO–miRNA complex, termed the RNA-induced silencing complex (RISC). The choice of the guide strand depends on the identity of the 5′-nucleotide and the thermodynamic stability of the two ends of the miRNA duplex; a strand with 5′-uridine or 5′-adenosine and thermodynamically unstable 5′ ends is preferred
[8][9][10][11]. The resultant RISC binds to target mRNAs mainly through sequence complementarity between seed sequences and target sites within mRNAs. TNRC6 (GW182) proteins, interacting partners of AGO, play important roles in target repression by interacting with the poly(A)-binding protein and recruiting the PAN2-PAN3 and the CCR4-NOT deadenylation complexes to the target mRNAs. Target recognition via AGO2 accompanies stepwise conformational changes in the AGO2 structure
[12][13].
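The guide-strand rule described above (a 5′-uridine or 5′-adenosine and a thermodynamically less stable 5′ end are favored) can be sketched as a simple two-criterion comparison. This is a minimal illustration under stated assumptions: the `Strand` class, the equal weighting of the two criteria, and the free-energy values are hypothetical, and the end stabilities are supplied by the user rather than computed from a thermodynamic model.

```python
# Minimal sketch of guide-strand choice; all names and numbers are assumptions.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Strand:
    name: str
    sequence: str   # 5'->3'
    end_dg: float   # hypothetical free energy (kcal/mol) of the duplex end
                    # holding this strand's 5' nucleotide; closer to zero
                    # means a less stable end


def guide_score(strand: Strand, other: Strand) -> int:
    """One point per favorable feature (equal weighting is an assumption)."""
    score = 0
    if strand.sequence[0] in ("U", "A"):   # preferred 5' nucleotide
        score += 1
    if strand.end_dg > other.end_dg:       # less stable 5' end
        score += 1
    return score


def pick_guide(a: Strand, b: Strand) -> Optional[Strand]:
    """Return the favored strand, or None when the two criteria tie."""
    score_a, score_b = guide_score(a, b), guide_score(b, a)
    if score_a == score_b:
        return None
    return a if score_a > score_b else b


# Illustrative duplex with invented sequences and energies.
strand_5p = Strand("5p", "UGAGGUAGUAGGUUGUAUAGUU", end_dg=-5.1)
strand_3p = Strand("3p", "CUAUACAAUCUACUGUCUUUCC", end_dg=-8.3)

guide = pick_guide(strand_5p, strand_3p)
print(guide.name if guide else "no clear preference")
```

Returning `None` on a tie is a deliberate choice: when the two criteria disagree, this sketch makes no call rather than guessing which feature dominates.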
Figure 1. Structural basis of efficient pri-miRNA processing. (a) Outline of canonical miRNA biogenesis. (b) The structure of a pri-miRNA and related molecules is shown. The numbers in parentheses indicate the features of the miRNA precursor sequences contributing to efficient pri-miRNA processing. (1) UG and UGU motifs are recognized by DROSHA and DGCR8, respectively. Thus, the DROSHA–DGCR8 complex contributes to the correct measurement of the miRNA basal stem length. (2) A CNNC motif is bound by SRSF3. SRSF3 has multifaceted roles in miRNA processing by facilitating pri-miRNA processing efficiency, modulating alternative processing, and preventing 5p-nick processing and inverse processing. (3) An mGHG motif is bound by the dsRBD of DROSHA and helps DROSHA identify the miRNA cleavage site. (4) Large asymmetric internal loops, i.e., large mismatches, inhibit the cleavage of the 3p strand and therefore downregulate miRNA expression. Pri-miRNA, primary miRNA; mGHG, mismatched GHG; dsRBD, double-stranded RNA binding domain.
3. Structural Basis of Pri-miRNA Processing
Efficient pri-miRNA processing requires several pri-miRNA sequence features, including a narrow range of tolerable pri-miRNA stem lengths (35 ± 1 base pairs), a CNNC motif (SRp20/SRSF3-binding motif) 16–18 nt downstream of the DROSHA processing site, a UG motif at the base of the pri-miRNA hairpin, a mismatched GHG motif in the basal stem region, a stable lower basal stem structure (typically fewer than four mismatches), and a UGU(GUG) motif in the apical loop (
Figure 1b)
[14][15][16][17]. These features are highly conserved among animals
[18]. Recent studies have provided mechanistic insights into these sequence features. DROSHA and DGCR8 form a heterotrimeric DROSHA–DGCR8 complex consisting of one DROSHA and two DGCR8 molecules; the heterotrimeric complex recognizes a basal UG motif and an apical UGU motif via DROSHA and DGCR8, respectively
[18][19]. This model is supported by recently solved cryo-EM structures of human DROSHA and DGCR8, which show that a molecular ruler consisting of two double-stranded RNA binding domains (dsRBDs) from DROSHA and DGCR8 is important for the measurement of pri-miRNA stem lengths between two dsRNA–single-stranded RNA (ssRNA) junctions
[20][21].
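The sequence features listed above lend themselves to a simple checklist. The Python sketch below tests a pre-annotated hairpin against the stem length, basal UG, apical UGU/GUG, lower-stem stability, and CNNC criteria. The `PriMirnaHairpin` record, its field names, the coordinate convention for the 3′ flank (position 1 is the first nucleotide after the DROSHA cleavage site), and the example values are all assumptions made for illustration. The mismatched GHG motif is omitted because scoring it requires explicit base-pairing information.

```python
# Minimal checklist sketch; record layout and example values are assumptions.
import re
from dataclasses import dataclass


@dataclass
class PriMirnaHairpin:
    stem_length_bp: int          # stem length between the two junctions
    basal_region: str            # sequence around the basal junction
    apical_loop: str             # apical loop sequence
    lower_stem_mismatches: int   # mismatches in the lower basal stem
    flank_3p: str                # sequence downstream of the DROSHA site


def cnnc_in_window(flank_3p: str, first: int = 16, last: int = 18) -> bool:
    """True if a CNNC motif starts at 1-based position first..last."""
    for pos in range(first, last + 1):
        if re.fullmatch(r"C[ACGU]{2}C", flank_3p[pos - 1:pos + 3]):
            return True
    return False


def feature_report(hairpin: PriMirnaHairpin) -> dict[str, bool]:
    """Evaluate each feature independently and report pass/fail."""
    return {
        "stem_length_35_pm_1": 34 <= hairpin.stem_length_bp <= 36,
        "basal_UG": "UG" in hairpin.basal_region,
        "apical_UGU_or_GUG": any(m in hairpin.apical_loop for m in ("UGU", "GUG")),
        "stable_lower_stem": hairpin.lower_stem_mismatches < 4,
        "CNNC_16_to_18": cnnc_in_window(hairpin.flank_3p),
    }


# Hypothetical hairpin that satisfies every criterion in the checklist.
example = PriMirnaHairpin(
    stem_length_bp=35,
    basal_region="AUGCA",
    apical_loop="CUGUGA",
    lower_stem_mismatches=1,
    flank_3p="A" * 16 + "CAUC" + "AAAA",   # CNNC starts at position 17
)

for feature, ok in feature_report(example).items():
    print(f"{feature}: {'yes' if ok else 'no'}")
```

Reporting each feature separately, rather than a single pass/fail verdict, reflects the point made in the text that hairpins range from optimal to suboptimal depending on which features they carry.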
For some miRNAs, the position of the mismatched GHG motif determines the DROSHA cleavage site
[22]. The dsRBD of DROSHA recognizes the mismatched GHG motif and thereby places the catalytic center in the appropriate position
[22]. This recognition contributes to the cleavage precision of the Microprocessor (DROSHA–DGCR8) complex
[17][22].
4. Cluster Assistance in Pri-miRNA Processing
A series of recent studies revealed the interdependency of the processing of multiple pri-miRNA hairpins embedded in polycistronic pri-miRNA transcripts
[23][24][25][26][27][28]. Pri-miRNAs often contain multiple miRNA hairpins, and this clustered arrangement can assist in the processing of otherwise defective or suboptimal hairpins. In particular, the presence of neighboring optimal pri-miRNA hairpins enhances the processing of suboptimal pri-miRNA hairpins to facilitate miRNA production
[23][24][25][26][27]. This “cluster assistance” phenomenon is observed for diverse suboptimal pri-miRNA hairpins, including (1) miR-451 in the miR-144-451 cluster, which has a short loop and short stem, thereby relying on DICER-independent and AGO2-dependent maturation, and (2) miR-15a in the miR-15a-16-1 cluster, which has a large unpaired region in its lower stem
[23][24][25][26][27]. Consistent with this, suboptimal canonical miRNA hairpins with a short loop preferentially reside in polycistronic pri-miRNAs
[24]. Similarly, optimal miRNA hairpins enhance the processing of other miRNA hairpins within clustered viral miRNA transcripts in cis
[28]. Importantly, the biological function of the erythrocyte-specific miR-144-451 cluster depends on a balance between the optimal miR-144 hairpin and the suboptimal miR-451 hairpin (
Figure 2)
[23]. In erythropoiesis, miR-144 not only facilitates miR-451 production, but also represses canonical miRNA biogenesis through DICER suppression (
Figure 2a), thereby making miR-451 the most abundant miRNA in erythrocytes
[23].
Figure 2. Cluster assistance mechanisms for processing of suboptimal pri-miRNAs. (a) The pri-miR-144 recruits DROSHA–DGCR8 and is processed by the canonical pathway. In erythropoiesis, miR-144 represses canonical miRNA biogenesis through DICER suppression, thereby making miR-451 the most abundant miRNA in erythrocytes. (b) For the processing of the suboptimal pri-miR-451, several scenarios have been proposed regarding the assistance from the optimal pri-miR-144. Scenario 1: DROSHA–DGCR8 recruitment to the optimal hairpins facilitates recruitment of another DROSHA–DGCR8 to the suboptimal ones through the dimerization properties of SAFB2 and ERH. Scenario 2: DROSHA–DGCR8 at the optimal hairpins is transferred to the suboptimal ones for continuous processing.
5. Structural Basis of Pre-miRNA Processing
DICER, which belongs to the RNase III family, mediates pre-miRNA processing in the cytoplasm. DICER has two RNase III domains and one dsRBD located at the C-terminal region, the PAZ domain in the middle region, and three tandem RNA helicase domains (DExD/H domain) located at the N-terminal region, which are associated with pre-miRNA binding and cleavage. Among a series of functional and structural studies of DICER
[7][29][30][31][32], a recent study demonstrated that the DExD/H domain has an ATP-independent essential structural role in mice and ensures the high fidelity of miRNA biogenesis in vivo
[29]. The PAZ domain is thought to recognize the 5′- and 3′-ends of pre-miRNAs and determine their cleavage sites
[33]. Consistent with this, a 3′-end mono-uridylation of a subset of pre-miRNAs with a shorter (1 nt) 3′ overhang promotes DICER processing
[34]. In addition, apical loops or upper stem-loop regions (USL) of pre-miRNAs play important roles in DICER cleavage
[35][36][37][38]. Recent high-throughput DICER cleavage assays revealed that a single-nucleotide bulge (22-bulge) facilitates the cleavage activity of DICER on shRNAs and human pre-miRNAs, and the stem lengths and defects in two RNase III domains and dsRBD differentially affect single cleavage events by DICER
[39].
6. Inverse Regulation of miRNAs by Target RNAs: Target-Directed miRNA Degradation (TDMD)
While the processing of pri-miRNAs and pre-miRNAs and the generation of miRNA duplexes occur more rapidly than the generation of most mRNAs, miRNAs are generally very stable, with half-lives extending to days
[40][41]. Nevertheless, the half-lives of individual miRNAs vary among cell types
[41], and miRNAs are rapidly degraded in a context-dependent manner. In particular, highly complementary target RNAs promote miRNA decay through a process called target-directed miRNA degradation (TDMD) (
Figure 3a)
[42][43][44]. Initial examples of TDMD include the artificial target RNAs that induce 3′-end remodeling of small RNAs and the viral RNAs that induce miRNA decay
[45][46]. Well-characterized examples of endogenous TDMD include the Cdr1as-miR-7-Cyrano and Nrep-miR-29b networks
[47][48][49]. The Cdr1as-miR-7-Cyrano network consists of Cdr1as, a circular RNA with more than 70 binding sites for miR-7; miR-7 itself; and the long ncRNA Cyrano, which has a site that pairs extensively with miR-7 (
Figure 3b)
[47][48]. While Cyrano mediates the TDMD of miR-7, Cdr1as interacts with miR-7 and prevents miR-7 downregulation (
Figure 3b)
[47][48]. Cdr1as knockout mice display impaired sensorimotor gating, together with downregulation of miR-7 and upregulation of miR-7 targets in the brain
[47]. The Nrep-miR-29b network consists of miR-29b and Nrep, an mRNA with a near-perfect miR-29b target site in its 3′-UTR. Genetic disruption of the miR-29 site within Nrep in mice induces upregulation of cerebellar miR-29b and impairs coordination and motor learning
[49].

Figure 3. miRNA decay by target-directed miRNA degradation (TDMD). (a) Distinct outcomes of target recognition: target repression vs. TDMD. (b) The long ncRNA Cyrano mediates TDMD of miR-7. ZSWIM8, an E3 ubiquitin ligase, binds to AGO2 and induces polyubiquitination and degradation of AGO2. By comparison, Cdr1as, a circular RNA with more than 70 binding sites for miR-7, interacts with miR-7 and protects miR-7 from downregulation.
7. miRNA Dosage Control by Fine-Tuning of miRNA Biogenesis Pathways
In contrast to TDMD, miRNA abundance is also controlled globally through the fine-tuning of miRNA biogenesis pathways
[50]. As for DICER, miR-103/-107 and miR-144 were reported to induce global downregulation of canonical miRNAs through the repression of DICER
[23][51]. As described earlier, regulation of DICER by miR-144 is important for miRNA regulation during erythropoiesis
[23]. As for DROSHA–DGCR8, DGCR8 mRNAs contain hairpin structures that are cleaved by the DROSHA–DGCR8 complex; the mRNAs are thereby destabilized and subject to post-transcriptional autoregulation
[52]. DGCR8 is also regulated by alternative transcription initiation
[53]. Alternative transcription initiation using downstream promoters is enhanced during the differentiation of mouse embryonic stem cells and yields shorter DGCR8 mRNAs that do not have stem-loop structures
[53]. Loss of the stem-loop structures allows DGCR8 mRNAs to escape autoregulation and results in an imbalanced DGCR8–DROSHA protein stoichiometry and irreversible aggregation of DROSHA–DGCR8
[53]. This reduces the efficiency of the pri-miRNA processing and the abundance of mature miRNAs, leading to de-repression of lipid metabolic mRNA targets
[53]. This mechanism appears to be important in germ layer specification.
8. Impacts of Target Site Properties on Target Regulation
Target recognition by miRNAs mainly depends on the seed sequence of miRNAs
[2][54]. The mechanistic understanding of target recognition by miRNAs has been improved by advances in computational approaches to predict miRNA targets, assessment of target regulation at RNA and/or translation levels upon perturbation of miRNAs, and high-throughput analyses of AGO2-bound RNAs, such as crosslinking-immunoprecipitation (CLIP)
[1].
Although target repression via non-canonical target sites is typically modest, non-canonical target sites appear to have context-dependent roles in miRNA function. A class of non-canonical target sites with extensive base-pairing to the 3′ end of the miRNA, but not the 5′ end, has been reported to function only in CDS regions
[55]. Target repression through this class of target sites is independent of GW182 and involves translation repression with transient ribosome stalling but not mRNA destabilization
[55]. While miRNAs frequently have non-templated 3′-uridine through TUT4/TUT7-mediated uridylation, a recent report described that complementary base pairing through 3′-uridine enables the repression of otherwise non-responsive seed-mismatch non-canonical targets
[56].