
P. Kerbs et al.
(FIMM, n=57).18 RNA-sequencing was performed as described previously.13-18 The patients’ characteristics are summarized in Table 1 and Online Supplementary Table S1. Sequencing metrics are summarized in Table 2. In addition, RNA-sequencing data of healthy samples were obtained from the Gene Expression Omnibus Database (Online Supplementary Table S2; n=36) and the FIMM cohort (n=3). All study protocols were approved by the institutional review boards of the participating centers. All patients provided written informed consent for scientific use of surplus samples in accordance with the Declaration of Helsinki.
Definitions and metrics for evaluating performance in fusion gene detection
Comprehensive definitions and metrics are provided in the Online Supplementary Methods. In brief, recurrent and reliably detected fusion genes that were reported by public databases were defined as known fusions. Furthermore, fusion genes that were found with high evidence by at least one method in routine diagnostics were defined as a benchmark (true fusions). High and low evidence were defined separately for Karyotyping, molecular diagnostics and RNA-sequencing (Online Supplementary Methods, Online Supplementary Figure S1).
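The benchmark definition above can be illustrated with a minimal sketch. The record layout (`Finding`), the method labels and the sample identifiers are hypothetical and serve only as an illustration; the sketch also assumes a literal reading of the definition, in which only Karyotyping and molecular diagnostics count as routine methods. The exact evidence criteria are those of the Online Supplementary Methods and are not reproduced here.

```python
from dataclasses import dataclass

# Assumption: routine diagnostics comprise these two methods (literal reading of the text).
ROUTINE_METHODS = frozenset({"karyotyping", "molecular"})


@dataclass(frozen=True)
class Finding:
    """One reported fusion in one sample (hypothetical record layout)."""
    sample: str
    fusion: str    # e.g. "PML-RARA"
    method: str    # "karyotyping", "molecular" or "rnaseq"
    evidence: str  # "high" or "low"


def true_fusions(findings, benchmark_methods=ROUTINE_METHODS):
    """Return (sample, fusion) pairs found with high evidence by at least one benchmark method."""
    return {
        (f.sample, f.fusion)
        for f in findings
        if f.method in benchmark_methods and f.evidence == "high"
    }


# Illustrative records only; these are not data from the study.
findings = [
    Finding("S1", "PML-RARA", "karyotyping", "high"),
    Finding("S1", "PML-RARA", "rnaseq", "low"),
    Finding("S2", "KMT2A-MLLT3", "molecular", "low"),
    Finding("S3", "DEK-NUP214", "rnaseq", "high"),
    Finding("S4", "KMT2A-MLLT10", "molecular", "high"),
]

benchmark = true_fusions(findings)
print(sorted(benchmark))
```

Passing a different `benchmark_methods` set, for example one that includes `"rnaseq"`, shows how the benchmark would change under a broader reading of the definition.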
Filtering strategies
Initially, built-in filters of the callers were applied and fusions were filtered by a custom-generated blacklist (Online Supplementary Methods). The Promiscuity Score (PS), developed in this study, excluded fusion events whose respective partner genes were frequently called in other distinct fusion events, since these are likely artifacts. Furthermore, low read coverage of a fusion event relative to the read coverage of its partner genes indicates an artifact. Our custom Fusion Transcript Score (FTS) measures, in transcripts per million, the expression of a fusion relative to the expression of its partner genes. Fusion events with a low FTS were excluded. The Robustness Score (RS) of a fusion gene is defined as the ratio between the number of samples in which this fusion gene passed all applied filters and the total number of samples in which this fusion gene was called. Only fusion genes passing all filters in at least half of the reported samples were considered. A comprehensive description of the filtering is enclosed in the Online Supplementary Methods.
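As an illustration of the Robustness Score described above, the following minimal sketch computes the RS as the fraction of samples in which a called fusion passed all filters, and keeps fusions with RS of at least 0.5. The input layout (tuples of fusion, sample and a pass flag), the function names and the placeholder fusion names are assumptions made for this example; the PS and FTS filters are not implemented because their formulas are given only in the Online Supplementary Methods.

```python
from collections import defaultdict


def robustness_scores(calls):
    """Compute RS per fusion.

    calls: iterable of (fusion, sample, passed_all_filters) tuples (hypothetical layout).
    RS = samples in which the fusion passed all filters / samples in which it was called.
    """
    called = defaultdict(set)
    passed = defaultdict(set)
    for fusion, sample, ok in calls:
        called[fusion].add(sample)
        if ok:
            passed[fusion].add(sample)
    return {fusion: len(passed[fusion]) / len(samples) for fusion, samples in called.items()}


def robust_fusions(calls, min_rs=0.5):
    """Keep fusions that passed all filters in at least half of the samples reporting them."""
    return {fusion for fusion, rs in robustness_scores(calls).items() if rs >= min_rs}


# Illustrative calls only; GENEA-GENEB and GENEC-GENED are placeholder names.
calls = [
    ("KMT2A-MLLT3", "S1", True),
    ("KMT2A-MLLT3", "S2", True),
    ("KMT2A-MLLT3", "S3", False),
    ("GENEA-GENEB", "S1", False),
    ("GENEA-GENEB", "S2", False),
    ("GENEA-GENEB", "S3", True),
    ("GENEA-GENEB", "S4", False),
    ("GENEC-GENED", "S1", True),
    ("GENEC-GENED", "S2", False),
]

scores = robustness_scores(calls)
kept = robust_fusions(calls)
print(scores)
print(sorted(kept))
```

In this toy input, a fusion that passes in exactly half of its samples is retained, which matches the "at least half" wording of the text.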
Results
Close correlation of fusion detection by routine diagnostics and RNA-sequencing
In 806 patients’ samples, we identified 138 true fusions, which provided the benchmark for the comparison of performance in fusion gene detection between routine diagnostics (Karyotyping, molecular diagnostics) and RNA-sequencing (Figure 1, Online Supplementary Table S3). Of the 138 true fusions, Karyotyping identified 121 (87.7%) and molecular diagnostics identified 107 (77.5%) with high evidence. In addition, Karyotyping identified 11 (8%) and molecular diagnostics identified five (3.6%) true fusions with low evidence. Fusion gene detection by RNA-sequencing resulted in 124 (89.9%) positive findings (high evidence: 115, low evidence: 9), thereby missing 14 true fusions (AMLCG: n=10; Beat AML: n=4).
Notably, samples from the AMLCG cohort showed lower overall coverage and mappability of sequencing reads than other samples (Table 2). In particular, CBFB and KMT2A showed poor coverage (Online Supplementary Figure S2); these two genes were involved in eight of the ten true fusions that RNA-sequencing failed to detect in those samples. Further fusions missed by RNA-sequencing were DEK-NUP214 and GTF2I-RARA. Overall, in samples from the AMLCG cohort, substantially fewer fusion events were detected by FusionCatcher, while Arriba detected twice as many, compared to samples from other cohorts (Table 2).
In the Beat AML cohort, we observed discrepancies in reported fusions between RNA-sequencing and clinical routine in three of four cases of true fusions missed by RNA-sequencing: (i) Karyotyping reported t(6;11)(q27;q23) resulting in KMT2A-AFDN, while RNA-sequencing detected KMT2A-MLLT10 resulting from t(10;11)(p12;q23); (ii) Karyotyping reported del(2)(p21p23) resulting in EML4-ALK, while RNA-sequencing detected KMT2A-MLLT3 resulting from t(9;11)(p21;q23); (iii) Karyotyping reported der(17)t(15;17)(q24;q21) and inv(17)(q21q21) resulting in PML-RARA and STAT5B-RARA, respectively, while RNA-sequencing detected PML-RARA but not STAT5B-RARA. In the fourth case, a PML-RARA fusion was found by FISH while Karyotyping indicated a normal karyotype in this sample.
RNA-sequencing identifies known fusions missed by routine and yields additional candidates
Before filtering, a total of 25,817 and 56,594 fusion events were detected in 806 samples by Arriba and FusionCatcher, respectively (mean 32 and 70 per sample, respectively) (Table 2). We applied filtering strategies as shown in Figure 2A. PS filter cutoffs for individual cohorts were set at 8.25 for AMLCG, 3.5 for DKTK, 16.5 for Beat AML and 3.5 for FIMM (Online Supplementary Figure S3A, Online Supplementary Methods). In addition to our previously described cutoffs for FTS5’ and FTS3’ (Online Supplementary Methods), we set a minimum FTS for unknown fusion events based on the median FTS of all detected unknown fusions (FTS ≥0.1) (Online Supplementary Figure S3B). Finally, we defined highly reliable fusion gene candidates based on

Table 2. Statistics for RNA-sequencing, mapping and fusion calling.

| Metric | AMLCG | DKTK | Beat AML | FIMM |
|---|---|---|---|---|
| RNA selection | poly(A) | poly(A) | poly(A) | Total RNA (rRNA depleted) |
| Avg. library size in millions (range) | 30.6 (19.1-97.8) | 33.7 (23.4-43.3) | 55.1 (24.7-126.8) | 57.4 (23.9-119.9) |
| Avg. % uniquely mapped reads (range) | 80 (44.2-94.1) | 90.7 (82-93.7) | 91.4 (80.9-94.3) | 86.3 (70.4-93.3) |
| Avg. % reads mapped to exons (range) | 72.4 (40.5-87.6) | 81.5 (75.6-85.7) | 76.8 (60.1-86.8) | 51 (20.2-67.9) |
| Avg. insert size (range) | 248.1 (97-455) | 257.1 (217-289) | 186.7 (131-246) | 219.5 (141-289) |
| Avg. fusion events called by Arriba | 48.3 | 23.2 | 24.1 | 27.8 |
| Avg. fusion events called by FusionCatcher | 12.9 | 113.1 | 97.8 | 71.4 |

Avg: average.
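The detection rates and per-sample means quoted in the Results can be re-derived from the stated counts. The short sketch below does this arithmetic; all numbers are taken from the text, while the variable names and dictionary labels are chosen here for illustration only.

```python
# Counts as stated in the Results text.
TRUE_FUSIONS = 138
N_SAMPLES = 806

detected = {
    "Karyotyping (high evidence)": 121,
    "Molecular diagnostics (high evidence)": 107,
    "Karyotyping (low evidence)": 11,
    "Molecular diagnostics (low evidence)": 5,
    "RNA-sequencing (high + low evidence)": 115 + 9,
}


def pct(count, total=TRUE_FUSIONS):
    """Percentage of the benchmark, rounded to one decimal place."""
    return round(100 * count / total, 1)


rates = {label: pct(count) for label, count in detected.items()}

# True fusions not found by RNA-sequencing (reported as AMLCG n=10 plus Beat AML n=4).
missed_by_rnaseq = TRUE_FUSIONS - detected["RNA-sequencing (high + low evidence)"]

# Unfiltered fusion events per caller, averaged over all samples.
events_before_filtering = {"Arriba": 25_817, "FusionCatcher": 56_594}
mean_events = {caller: round(n / N_SAMPLES) for caller, n in events_before_filtering.items()}

for label, rate in rates.items():
    print(f"{label}: {rate}%")
print("Missed by RNA-sequencing:", missed_by_rnaseq)
print("Mean events per sample:", mean_events)
```

The computed values agree with the reported 87.7%, 77.5%, 8%, 3.6% and 89.9%, with the 14 missed true fusions, and with the means of 32 and 70 events per sample.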
haematologica | 2022; 107(1)