
RNA-sequencing of fusion genes in AML
ding break-point regions. Diagnostic application of RNA-sequencing has the potential to overcome these limitations through systematic detection of fusion genes on a transcriptome-wide level, as demonstrated in these three examples: (i) NUP98-NSD1 is a biomarker for poor prognosis, and NUP98 fusions in general were found to define a clinically relevant distinct subgroup in AML,24-26 but reliable detection of the underlying cryptic translocation t(5;11)(q35.2;p15.4) by karyotyping is not possible.27 Of note, we identified NUP98-NSD1 in eight samples using RNA-sequencing, as well as further known fusion genes in 22 samples that showed no or only low evidence for these fusions by either karyotyping or molecular diagnostics. (ii) We observed discrepancies between results from routine diagnostics and RNA-sequencing, i.e., one sample showed a translocation t(6;11)(q27;q23) according to karyotyping. This translocation results in a KMT2A-AFDN fusion, but RNA-sequencing reported a KMT2A-MLLT10 fusion with high evidence, corresponding to translocation t(10;11)(p12;q23). Furthermore, KMT2A rearrangements were reported by break-apart FISH without any evidence for a rearrangement by karyotyping in two cases. Fusion detection by RNA-sequencing identified a KMT2A-MLLT10 fusion in these samples. Since different KMT2A fusions may be assigned to different risk groups in the European LeukemiaNet classification,6 the correct description of the fusion may have therapeutic consequences. (iii) In another sample, karyotyping and FISH identified a t(15;17)(q24;q21) translocation, typically resulting in a PML-RARA fusion transcript (no information on PCR status was available), while RNA-sequencing identified a PML-CASC3 fusion, with CASC3 being located ~170 kb upstream of RARA. Unfortunately, no information was available on this patient’s response to all-trans retinoic acid treatment.
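The comparisons in examples (i) to (iii) amount to checking the fusion expected from a cytogenetic finding against the fusion reported by RNA-sequencing. A minimal sketch of such a check follows; the translocation-to-fusion pairs are taken from the text above, while the function name `compare_calls` and the returned labels are illustrative assumptions and not part of the study's workflow.

```python
# Translocation -> expected fusion, as stated in the surrounding text.
EXPECTED_FUSION = {
    "t(5;11)(q35.2;p15.4)": "NUP98-NSD1",
    "t(6;11)(q27;q23)": "KMT2A-AFDN",
    "t(10;11)(p12;q23)": "KMT2A-MLLT10",
    "t(15;17)(q24;q21)": "PML-RARA",
}


def compare_calls(karyotype, rnaseq_fusion):
    """Compare the fusion expected from a karyotype with an RNA-seq call.

    Hypothetical helper: returns 'concordant', 'discordant', or
    'no expectation' when the karyotype is not in the lookup table.
    """
    expected = EXPECTED_FUSION.get(karyotype)
    if expected is None:
        return "no expectation"
    return "concordant" if expected == rnaseq_fusion else "discordant"


# Examples (ii) and (iii) from the text are both flagged as discordant.
example_ii = compare_calls("t(6;11)(q27;q23)", "KMT2A-MLLT10")
example_iii = compare_calls("t(15;17)(q24;q21)", "PML-CASC3")
```

A flag of this kind only marks samples for review; it cannot decide which method is correct.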
In addition to standard diagnostic methods that are used in clinical routine, targeted RNA-sequencing panels are becoming increasingly popular for high-throughput detection of annotated fusion genes and were shown to be more sensitive than classical approaches.28
Admittedly, RNA-sequencing-based fusion callers report many false positive events due to technical and biological properties, such as sequencing errors, false mapping, homologous genomic regions, polymorphic genes, or exceptionally high gene expression.29 Some genes are therefore prone to be reported in fusion gene artifacts, requiring reasonable filtering to maintain sensitivity while increasing specificity of the fusion detection analysis. Current callers integrate blacklists of fusion events into their built-in filters, which are compiled from public databases. However, technical differences between sequencing protocols and fusion calling algorithms may result in specific fusion artifacts that are not covered by those blacklists. Therefore, the generation of an additional customized blacklist further improves the specificity in RNA-sequencing-based fusion analyses. Furthermore, we found genes that form fusions with many distinct partners, indicating that these events are likely artifacts. The PS, developed in the present study, evaluates fusion events using this characteristic and filters events based on scores obtained from known fusions. However, the PS depends on the sequencing properties and the number of samples from which the score was derived. Thus, we defined cutoffs for the individual cohorts separately. Furthermore, the number of fusion-supporting reads correlates with the number of reads supporting the expression of the individual partner genes. The FTS, also developed in this study, measures the abundance of fusion transcripts relative to their respective partner gene transcripts. Most known fusions had an FTS around 0.3, but fusions present in subclones only, or fusions found in samples with lower tumor load, will yield lower scores. As a tradeoff between specificity and sensitivity, we defined the median of all FTS detected in unknown fusion events as a cutoff. In addition, we observed unknown fusion events with high recurrence that passed all preceding filter steps in some samples, while these fusion events were filtered out in most other samples. This may indicate transcript artifacts of error-prone genes. The RS filter therefore excludes fusion events that failed at least one preceding filter in most of the identified cases. The integration of our PS, FTS, customized blacklist and RS filter into our detection strategy substantially reduced the fusion calls that were most likely false or irrelevant. Selection of fusion events consistently found between Arriba and FusionCatcher further increased the evidence of fusion candidates. As an additional source of evidence for fusion events, we utilized individual gene expression values of the partner genes. The expression of a fusion transcript is mostly driven by the promoter of the 5' partner gene and the expression of the 3' partner should therefore adjust to the levels of the 5' partner. Although this simplified assumption neglects the influence of 3' enhancers and other regulatory elements, we observed substantially elevated expression of the 3' partner if it is usually not expressed or expressed at low levels only. Consistently, 3' partner genes with inherently similar expression as the 5' partner showed no or only marginal adjustments in expression levels.
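The two scoring ideas above can be illustrated with a short sketch. The exact PS and FTS formulas are not given in this passage, so both functions below are assumptions: `partner_counts` simply counts distinct partners per gene, and `fusion_ratio` divides fusion-supporting reads by the summed reads of both partner genes as a stand-in for the FTS. Only the median-based cutoff follows the text directly. The gene names starting with `GENE_` and all read counts are invented for illustration.

```python
from collections import defaultdict
from statistics import median


def partner_counts(events):
    """Count the distinct fusion partners seen for each gene.

    events: iterable of (gene_5p, gene_3p) tuples pooled over all samples.
    Genes with many distinct partners are candidates for artifacts.
    """
    partners = defaultdict(set)
    for gene_5p, gene_3p in events:
        partners[gene_5p].add(gene_3p)
        partners[gene_3p].add(gene_5p)
    return {gene: len(seen) for gene, seen in partners.items()}


def fusion_ratio(fusion_reads, reads_5p, reads_3p):
    """Assumed stand-in for the FTS: fusion reads / partner gene reads."""
    partner_reads = reads_5p + reads_3p
    return fusion_reads / partner_reads if partner_reads else 0.0


def median_cutoff(unknown_scores):
    """Cutoff taken as the median score of all unknown fusion events."""
    return median(unknown_scores)


# Hypothetical pooled calls: GENE_X pairs with three different partners.
events = [("KMT2A", "MLLT10"), ("NUP98", "NSD1"),
          ("GENE_X", "GENE_A"), ("GENE_X", "GENE_B"), ("GENE_X", "GENE_C")]
counts = partner_counts(events)
cutoff = median_cutoff([0.02, 0.05, 0.10, 0.20, 0.40])
```

Because partner counts grow with cohort size and sequencing depth, any threshold on them would have to be set per cohort, in line with the cohort-specific cutoffs described above.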
However, genomic rearrangements do not necessarily result in a fusion transcript but may have other effects, e.g., the reallocation of the 3' enhancer of GATA2 in inv(3)(q21.3q26.2)/t(3;3)(q21.3;q26.2)-positive leukemia, causing overexpression of MECOM and GATA2 haploinsufficiency.30,31 Although there is usually no fusion transcript in these patients, we found evidence for the transposition of MECOM by chimeric reads found in several affected samples (data not shown).
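The recurrence-based RS filter and the caller-consensus step of the filtering strategy described above can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: "most of the identified cases" is interpreted as more than 50% (parameter `max_fail_fraction`), the record layout with the keys `sample`, `fusion`, and `passed_prior` is hypothetical, and the two callers' outputs are represented as plain sets of (sample, fusion) tuples rather than parsed Arriba or FusionCatcher files.

```python
from collections import defaultdict


def recurrence_filter(calls, max_fail_fraction=0.5):
    """Drop fusions that failed earlier filters in most samples.

    calls: list of dicts with keys 'sample', 'fusion', and
    'passed_prior' (True if the call passed all preceding filters).
    Returns the passing calls whose fusion is not mostly filtered elsewhere.
    """
    totals = defaultdict(int)
    fails = defaultdict(int)
    for call in calls:
        totals[call["fusion"]] += 1
        if not call["passed_prior"]:
            fails[call["fusion"]] += 1
    robust = {f for f in totals if fails[f] / totals[f] <= max_fail_fraction}
    return [c for c in calls if c["passed_prior"] and c["fusion"] in robust]


def consensus(calls_a, calls_b):
    """Keep (sample, fusion) pairs reported by both callers."""
    return set(calls_a) & set(calls_b)


# Hypothetical input: GENE_A-GENE_B fails earlier filters in 2 of 3 samples.
calls = [
    {"sample": "S1", "fusion": "GENE_A-GENE_B", "passed_prior": True},
    {"sample": "S2", "fusion": "GENE_A-GENE_B", "passed_prior": False},
    {"sample": "S3", "fusion": "GENE_A-GENE_B", "passed_prior": False},
    {"sample": "S1", "fusion": "KMT2A-MLLT10", "passed_prior": True},
]
kept = recurrence_filter(calls)
shared = consensus({("S1", "KMT2A-MLLT10"), ("S1", "GENE_A-GENE_B")},
                   {("S1", "KMT2A-MLLT10")})
```

In this toy input, the GENE_A-GENE_B call in sample S1 is removed even though it passed all earlier filters there, which is the behavior the RS filter is described as having.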
Among our fusion candidates, we identified the novel recurrent fusion gene NRIP1-MIR99AHG, which results from inversion inv(21)(q11.2;q21.1). Interestingly, both Nanopore sequencing and RNA-sequencing revealed different breakpoint positions in NRIP1-MIR99AHG-positive samples. None of the identified fusion transcripts included an annotated consensus coding sequence, and therefore translation to a protein product is rather unlikely. NRIP1 was described as a transcriptional repressor,32 playing a role in hematologic malignancies,33,34 and was found to be involved in other fusions.35 A disruption of the corresponding gene by the NRIP1-MIR99AHG rearrangement might therefore contribute to leukemogenesis. On the other hand, overexpression of MIR99AHG and accompanying enhanced proliferation were previously demonstrated in acute megakaryoblastic leukemia cell lines (with MIR99AHG referred to as MONC).36 Furthermore, MIR99AHG is the host gene of miR-99a/let-7c/miR-125b-2, a microRNA cluster also shown to influence homeostasis of hematopoietic stem and progenitor cells.37 Interestingly, the identified fusion breakpoint in the MIR99AHG locus was located between let-7c and miR-125b-2, thereby disrupting the tricistronic gene cluster. This aberration as well as fusion-induced transcription of the 3' region of MIR99AHG may constitute a mechanism of leukemogen-
haematologica | 2022; 107(1)