L.-A. Sutton et al.
independent of the assay design or sequencing protocol. A full description of these variants is provided in the ‘Interpretation of false-positive findings’ section in the Online Supplementary Appendix and in the Online Supplementary Table S14.
High-sensitivity assay to confirm variant calling
To determine whether the low-frequency mutations were true variants or false-positives arising from the library preparation or sequencing steps, the entire experiment was repeated using a custom-designed HaloPlex high-sensitivity assay and gDNA from 38 of the CLL patients originally sequenced (additional material was unavailable for 10 of the original 48 samples). The HaloPlexHS system follows a similar workflow to the standard HaloPlex assay (described in the Online Supplementary Materials and Methods); however, a cardinal feature is the attachment of a UMI to individual captured DNA molecules. During downstream analyses the UMI is used to collapse reads originating from the same molecule, thereby improving base calling accuracy and permitting accurate quantification of the mutant allele fraction by excluding PCR amplification bias and improving discrimination of variant nucleotides from background sequencing errors.
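The principle of UMI-based read collapsing can be illustrated with a minimal sketch. This is not the HaloPlexHS analysis software: the function names, the toy reads, and the simple majority-vote rule are assumptions made for illustration, and real pipelines also group reads by mapping position and weigh base qualities.

```python
from collections import Counter, defaultdict


def collapse_by_umi(reads):
    """Collapse reads that share a UMI into one consensus base per molecule.

    `reads` is an iterable of (umi, base) pairs observed at a single
    genomic position (hypothetical input format).
    """
    bases_per_umi = defaultdict(list)
    for umi, base in reads:
        bases_per_umi[umi].append(base)
    # Majority vote within each UMI family suppresses isolated read errors.
    return {
        umi: Counter(bases).most_common(1)[0][0]
        for umi, bases in bases_per_umi.items()
    }


def allele_fraction(bases, alt):
    """Fraction of observations that carry the alternative allele."""
    bases = list(bases)
    return sum(b == alt for b in bases) / len(bases)


# Toy data: reference base "C", variant base "T".
reads = [
    ("AAC", "T"), ("AAC", "T"), ("AAC", "T"), ("AAC", "T"),  # one mutant molecule, amplified 4x
    ("GTA", "C"), ("GTA", "C"), ("GTA", "T"),                # wild-type molecule, one erroneous read
    ("CCG", "C"),
    ("TTG", "C"), ("TTG", "C"),
]

raw_vaf = allele_fraction((base for _, base in reads), "T")  # counts every read
consensus = collapse_by_umi(reads)
umi_vaf = allele_fraction(consensus.values(), "T")           # counts every molecule

print(f"raw VAF = {raw_vaf:.2f}, UMI-collapsed VAF = {umi_vaf:.2f}")
# prints "raw VAF = 0.50, UMI-collapsed VAF = 0.25"
```

In this toy example the raw read count overstates the mutant fraction because one mutant molecule was preferentially amplified and one wild-type read carries an error; counting consensus molecules removes both effects.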
A median of 95.3% of bases within the targeted ROI achieved at least 100x coverage (Online Supplementary Table S15). Of the 128 somatic variants identified during the comparative analysis, excluding 18 mutations in samples not re-sequenced, 103 variants were confirmed (VAF range: 0.5-100%) (Online Supplementary Table S9). No additional variants were identified in the repeat NGS data that were not identified previously, while a few low-frequency variants (VAF <5%) could not be verified. The high-sensitivity assay data also enabled us to investigate the variants found in only a single center (described above) and provided further evidence that variants detected by only a single center were indeed false-positives (Online Supplementary Table 10).
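As a quick check of the arithmetic behind these counts, the sketch below derives the number of evaluable variants and the resulting confirmation rate. It assumes that every variant in a re-sequenced sample was assessable; the variable names are illustrative, and the derived percentage is not a figure reported in the text.

```python
identified = 128       # somatic variants from the comparative analysis
not_resequenced = 18   # variants in samples without material for the repeat run
confirmed = 103        # variants confirmed by the high-sensitivity assay

evaluable = identified - not_resequenced   # variants that could be re-tested
unverified = evaluable - confirmed         # variants that were not confirmed
confirmation_rate = confirmed / evaluable

print(f"{confirmed}/{evaluable} evaluable variants confirmed "
      f"({confirmation_rate:.1%}); {unverified} unverified")
# prints "103/110 evaluable variants confirmed (93.6%); 7 unverified"
```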
Discussion
As the number of genes with diagnostic, prognostic, or predictive significance increases, there is a need for robust assays that detect multiple alterations simultaneously. Whether for research, in order to provide a better understanding of the molecular mechanisms driving disease pathogenesis and evolution, or within clinical diagnostics, targeted gene panels in combination with NGS have evolved as a pragmatic cost-effective approach. These assays are also uniquely positioned to yield a volume of data that is more manageable and easier to interpret than the complex datasets generated by WES or WGS. As a consequence, targeted gene panels are becoming increasingly popular in the cancer molecular diagnostics arena and systems for implementing NGS within routine clinical practice need to be established. This is essential since, despite its numerous attractive attributes, mutational profiling by NGS can be challenging and involves workflows comprising several distinct parts, i.e., wet-bench components, bioinformatics analyses, and clinical interpretation of the variant calls. In addition, with the abundance of assays and sequencing platforms on the market, all with their own technical and methodological specifications, ensuring test-to-test reproducibility and inter-laboratory reliability is fundamental.
Several regulatory and professional organizations have published guidelines and best-practice recommendations to assist laboratories in the transition from the previous gold-standard methodology of Sanger sequencing for mutation detection to NGS.33-37 Whilst these reports focus on critical aspects of clinical gene sequencing such as documentation of the protocols and reporting the specificity and sensitivity of an assay, comparative analyses of different enrichment techniques and workflows are rare.38,39 Additional inter-laboratory studies focusing on variant calling comparisons between different methodologies are therefore required. To aid in this endeavor, the present study compared the ability of different NGS technologies to accurately detect a spectrum of mutations with varying VAF within genes of prognostic relevance in CLL, specifically focusing on the sensitivity and reproducibility of targeted gene panels. Aside from the particular gene panel utilized in the various test centers, we kept parameters as uniform as possible, including use of the same sequencing system, same patient samples etc. In order to avoid confounding results due to individual customization of bioinformatic pipelines at participating institutes, i.e., combinations of aligners, variant callers, differences in filtering parameters and quality control, bioinformatics was performed centrally.
A first critical step in our analysis was to compare the metrics of sequence coverage and depth. Variability in depth of coverage between centers using different technologies could be attributed to the differing size of the panel designs, i.e., despite targeting the same regions, the HaloPlex design includes redundancy to ensure targets are covered even if an amplicon drops out, whereas the Multiplicom technology sequenced fewer samples per run. While incomplete coverage was not a consistent problem for any of the methods utilized, considerable differences in the coverage of EGR2 and NFKBIE were observed, particularly when using the TSCA gene panel. Certain regions within these genes failed to amplify and had lower read depth in comparison to other genes targeted, indicating that they were intrinsically more difficult to amplify. This may stem from the varying ability of the different probe and primer sets to anneal in these high GC regions, thus leading to a reduction in the efficiency of target capture or amplification. These difficulties may have been exacerbated by the decline in the TSCA chemistry over time. As a final note on coverage, direct comparison of the sequencing depth obtained from panels designed with/without UMI is not meaningful, since the number of reads does not reflect the actual number of unique template gDNA molecules, as many reads will be duplicates generated during PCR. While molecular barcodes do not prevent PCR duplication from occurring, they do facilitate removal of the duplicates, and hence the overall coverage can appear lower, albeit more accurate.
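The final point on coverage can be made concrete with a small sketch contrasting raw read depth with the number of unique template molecules at one position. The function name and the toy UMI list are hypothetical, and treating each distinct UMI as exactly one molecule is a simplifying assumption that ignores UMI sequencing errors and UMI collisions.

```python
def depth_summary(read_umis):
    """Return (raw depth, unique-molecule depth, duplication rate).

    `read_umis` lists the UMI attached to every read covering one
    position (hypothetical input format).
    """
    raw_depth = len(read_umis)
    if raw_depth == 0:
        return 0, 0, 0.0
    unique_depth = len(set(read_umis))  # one count per distinct UMI
    duplication_rate = 1 - unique_depth / raw_depth
    return raw_depth, unique_depth, duplication_rate


# Ten reads that derive from only four template molecules.
umis = ["AAC"] * 4 + ["GTA"] * 3 + ["CCG"] + ["TTG"] * 2
raw, unique, dup = depth_summary(umis)
print(f"raw depth {raw}x, unique depth {unique}x, duplication rate {dup:.0%}")
# prints "raw depth 10x, unique depth 4x, duplication rate 60%"
```

A panel without UMI would report the 10x figure here, whereas a UMI-aware analysis would report 4x, which is why the two depths are not directly comparable.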
Next, the ability of disparate NGS methods to detect variants with varying allelic frequencies was compared and we observed a high degree of concordance and accuracy when performing pairwise analysis. Imposing an arbitrary VAF cut-off, e.g., 10% or 5%, at the initial stage of the analysis could provide an inaccurate view of concordance, as variants bordering a threshold may appear to be discrepant; hence we first looked at the agreement between variants irrespective of their VAF and yielded
haematologica | 2021; 106(3)