Lack of Splicing Distinguishes piRNA-producing Transcripts from other RNAs in Mammals
In the germ cells of animals, PIWI-interacting RNAs (piRNAs) guide the silencing of transposons and the regulation of genes to ensure normal fertility. In adult mammalian testes, most piRNAs derive from ~100 discrete loci called pachytene piRNA clusters, which produce canonical RNA polymerase II transcripts, while some piRNAs derive directly from transposon transcripts. Like most canonical transcripts, these piRNA-producing transcripts bear 5’ caps and 3’ poly(A) tails, yet they meet a distinct fate: processing by the piRNA biogenesis machinery. What feature distinguishes piRNA-producing transcripts from other transcripts is unknown.
Our analysis of transposon transcripts that produce piRNAs focuses on the koala retrovirus (KoRV), a gammaretrovirus transposon that invaded the koala genome 200–40,000 years ago and has not yet been suppressed by the adaptive piRNA pathway in the koala population. KoRV produces both spliced Env mRNAs and unspliced transcripts encoding Gag, Pol, and the viral genome, but we found that only the unspliced transcripts are processed into mature piRNAs. Our analysis further shows that this selective piRNA processing of unspliced transposon transcripts is conserved from insects to mammals.
Turning to piRNA-producing clusters, we focus on mice, in which 100 pachytene piRNA clusters have been well defined by RNA-seq, small RNA-seq, CAGE-seq, PAS-seq, and H3K4me3 ChIP-seq. Our analysis reveals that lack of splicing marks long transcripts (≥10 kb) for piRNA processing, and that these long unspliced transcripts account for 84% of the adult testis piRNA pool. Evolutionary analysis confirms that lack of splicing is a critical feature of mammalian piRNA-producing transcripts.
Together, our results suggest that bypassing splicing is a conserved molecular pattern that differentiates piRNA-producing transcripts from non-piRNA-producing transcripts and can be recognized by the piRNA biogenesis machinery.