
Prediction of primary resistant AML
death in aplasia or death due to indeterminate cause.4 More recent recommendations define primary refractory disease as a failure to achieve complete remission (CR) or CR with incomplete hematologic regeneration (CRi) after two courses of induction treatment.2 Regardless of the definition, RD is associated with extremely poor survival. Patients with RD cannot currently be identified with high specificity before the start of treatment, and therapeutic resistance remains one of the main problems in AML therapy.3
It is difficult to quantify predictive ability. Usually, the area under the receiver-operating characteristic curve (AUC) is used to describe predictive ability, where a value of 1 indicates perfect prediction and 0.5 indicates no predictive ability.3,6 AUC values of 0.6-0.7, 0.7-0.8 and 0.8-0.9 are considered poor, fair and good, respectively.3,6 An AUC of more than 0.9 would be desirable.
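As an illustration of how an AUC is obtained and read against the bands above, the following minimal Python sketch uses the rank formulation: the AUC equals the probability that a randomly chosen resistant patient receives a higher score than a randomly chosen responder, with ties counted as one half. The function names, the toy scores, the label for values below 0.6, and the assignment of boundary values to the upper band are illustrative assumptions and are not taken from the study.

```python
from itertools import product


def auc(scores_pos, scores_neg):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    scores_pos: predicted scores of patients with the event (e.g. RD).
    scores_neg: predicted scores of patients without the event.
    Ties contribute 0.5, so a constant predictor gives 0.5.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one score in each class")
    wins = 0.0
    for p, n in product(scores_pos, scores_neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def auc_band(value):
    """Verbal band for an AUC, following the ranges quoted in the text.

    Assumption: a value on a shared boundary (0.7, 0.8) goes to the
    upper band; values below 0.6 get a placeholder label.
    """
    if value > 0.9:
        return "desirable"
    if value >= 0.8:
        return "good"
    if value >= 0.7:
        return "fair"
    if value >= 0.6:
        return "poor"
    return "below the quoted bands"


# Toy example (made-up scores, not study data)
resistant = [0.9, 0.8, 0.4]
responders = [0.7, 0.3, 0.2]
print(auc(resistant, responders), auc_band(auc(resistant, responders)))
```

Applied to the values reported in the text, `auc_band(0.68)` returns "poor" and `auc_band(0.78)` returns "fair".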
Several tools have been developed to predict therapeutic response in AML. The AML-score by Krug et al., based on standard clinical and laboratory variables including genetics, was developed to predict CR or early death in older patients (≥60 years) treated with intensive chemotherapy.7 The score reached a “poor” predictive ability (AUC=0.68) in the validation set.7 A study using comparable variables to identify RD, analyzing 4601 patients of all age groups from the MRC/NCRI, HOVON/SAKK, SWOG and MD Anderson Cancer Center, achieved an AUC of 0.78 by bootstrap-adjusted validation in the training sets.3 The inclusion of extensive genetic testing data from the initial diagnostic workup in this classifier did not significantly improve the ability to predict primary resistance to treatment in younger patients.6,8 These ‘maximal’ models yielded AUCs of 0.77-0.80 but were not validated in independent data sets.6
We hypothesized that we could improve the prediction of RD by combining standard clinical and laboratory variables, mutation data and gene expression data of large homogeneously treated patient cohorts to design a new classifier.
Methods
Patients
In this study, we used three independent data sets, hereafter referred to as training set 1, training set 2 and the validation set. All patients included in the analysis received cytarabine- and anthracycline-based induction treatment. All patients included in the German AML Cooperative Group (AMLCG) trials were scheduled to receive at least one high-dose cytarabine-containing course as part of their double induction treatment before they were considered resistant.
Training set 1 consisted of 407 patients randomized and treated in the multicenter phase III AMLCG-1999 trial (clinicaltrials.gov identifier 00266136) between 1999 and 2005.9,10 The patients are part of a previously published gene expression data set (GSE37642) and samples were analyzed on Affymetrix arrays.11,12 Patient selection was based on the availability of information on response to induction treatment. All patients with a t(15;17), myelodysplastic syndrome (MDS), or an overall survival (OS) of less than 16 days were excluded.
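The selection rules for training set 1 can be written as a simple filter. The sketch below is only an illustration: the `Patient` record and its field names are hypothetical, and the toy cohort is made up; only the criteria themselves (known induction response, no t(15;17), no MDS, OS of at least 16 days) come from the text.

```python
from dataclasses import dataclass

MIN_OS_DAYS = 16  # patients with an OS of less than 16 days are excluded


@dataclass
class Patient:
    # Hypothetical record layout, for illustration only
    patient_id: str
    response_known: bool   # information on induction response available
    has_t_15_17: bool      # t(15;17) present
    has_mds: bool          # myelodysplastic syndrome rather than AML
    os_days: int           # overall survival in days


def eligible(p: Patient) -> bool:
    """True if the patient meets the selection criteria described in the text."""
    return (
        p.response_known
        and not p.has_t_15_17
        and not p.has_mds
        and p.os_days >= MIN_OS_DAYS
    )


# Toy cohort (made-up values, not study data)
cohort = [
    Patient("A", True, False, False, 400),
    Patient("B", True, True, False, 900),    # excluded: t(15;17)
    Patient("C", True, False, False, 15),    # excluded: OS < 16 days
    Patient("D", False, False, False, 200),  # excluded: response unknown
    Patient("E", True, False, True, 300),    # excluded: MDS
]
selected = [p.patient_id for p in cohort if eligible(p)]
print(selected)
```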
Training set 2 consisted of samples from 462 AML patients treated in various trials of the Haemato-Oncology Foundation for Adults in the Netherlands (HOVON). These samples were analyzed by Affymetrix arrays and clinical and gene expression data are publicly available (GSE14468).13,14 Thirteen patients had to be excluded due to early death (OS <16 days) or missing follow-up data.
Finally, the validation set consisted of all patients with available material treated in the AMLCG-2008 study (clinicaltrials.gov identifier 01382147), a randomized, multicenter phase III trial (n=210).15 These patients were analyzed by RNAseq. Because of the high response rate in the AMLCG-2008 trial (CR and CRi: 289 of 387, 75%) and the low rate of resistant patients (48 of 387, 12%), we decided to include an additional 40 patients with RD in the validation set to increase the statistical power; these patients were treated in the AMLCG-1999 trial. We selected these patients by including all patients with RD from the AMLCG-1999 trial who were not part of training set 1 and who had sufficient material for analysis. Subsequently, only patients matching the control treatment arm of the AMLCG-2008 trial were selected for RNAseq and included in the validation set (n=40). Cytogenetic data were missing in 8 cases from the validation set (not done: n=3; no cells dividing: n=5); since cytogenetic information is required to calculate most risk scores, these patients were excluded from subsequent analysis. The gene expression data are publicly available through the Gene Expression Omnibus Web site (GSE106291).
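The counts quoted for the validation set can be checked with a few lines of arithmetic. The variable names below are illustrative, and the sketch assumes that the 8 cases without cytogenetic data are subtracted from the combined 250 sequenced patients; the text does not state the final number explicitly.

```python
def pct(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Response rates reported for the AMLCG-2008 trial
treated = 387
print(pct(289, treated))  # CR and CRi, quoted as 75%
print(pct(48, treated))   # resistant patients, quoted as 12%

# Composition of the validation set
amlcg_2008 = 210           # AMLCG-2008 patients with available material
added_rd_1999 = 40         # additional RD patients from AMLCG-1999
missing_cytogenetics = 8   # not done (3) + no dividing cells (5)

sequenced = amlcg_2008 + added_rd_1999
evaluable = sequenced - missing_cytogenetics  # assumption: removed from the combined set
print(sequenced, evaluable)
```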
Details regarding the treatment regimens are described in the Online Supplementary Appendix. A detailed flow chart describing the patient cohorts and selection process is shown in Online Supplementary Figure S1. All study protocols were in accordance with the Declaration of Helsinki and were approved by the institutional review boards of the participating centers. All patients provided written informed consent for inclusion in the clinical trial and in the genetic analyses.
Molecular workup
Cytogenetic analyses in the AMLCG trials were performed centrally, and risk groups were defined according to the 2010 UK Medical Research Council (MRC) and the European LeukemiaNet (ELN) 2017 genetic risk classification (ELN2017). Patients were characterized for NPM1 and CEBPA mutations, FLT3 internal tandem duplications (FLT3-ITD), and KMT2A (formerly MLL) partial tandem duplications (KMT2A-PTD) using standard methods described previously.16 Targeted amplicon sequencing of 68 recurrently mutated genes, as published recently, was used for genetic characterization in training set 1 and the validation set.17 RNAseq libraries were prepared using the Sense mRNA Seq Library Prep Kit V2 (Lexogen, n=238) and the TruSeq RNA Library Preparation V2 Kit (Illumina, n=12). Between 500 and 1000 ng of total RNA [RNA integrity number (RIN) >7] was used as input material. All sequencing was paired-end and used polyadenylated-selected RNA; the Lexogen libraries were additionally stranded. Processing details and sequencing metrics are provided in the Online Supplementary Appendix. Samples were sequenced on a HiSeq 1500 instrument (Illumina) as 100 bp reads to a targeted depth of 20 million mappable paired reads per sample according to the “Standards, Guidelines and Best Practices for RNA-Seq v.1.0 (June 2011)”18 recommendations of the ENCODE Consortium. Samples were aligned with STAR (v2.4.0)19 to the human hg19 reference genome and analyzed using the DESeq2 package.20 Details regarding the workflow are provided in the Online Supplementary Appendix.
Development of the predictive classifier
The aim of the study was to develop a predictor that accurately identifies patients with RD. To achieve this goal, we used clinical markers, cytogenetics (defined according to the MRC), mutational analysis of 68 recurrently mutated genes in AML and gene expression markers to construct a predictive model. All gene expression variables were scaled to a mean of 0 and a variance of 1.
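Scaling to a mean of 0 and a variance of 1 corresponds to the usual z-score transformation, z = (x - mean) / SD, applied to each gene. The sketch below illustrates this with the Python standard library. It is not the authors' code: the function names and toy values are made up, and the use of the population standard deviation and of per-gene scaling across all samples of one data set are assumptions, since the text does not specify these details.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Scale one gene's expression values to mean 0 and variance 1."""
    mu = fmean(values)
    sd = pstdev(values, mu)  # population SD (assumption, see lead-in)
    if sd == 0:
        raise ValueError("a constant variable cannot be scaled to unit variance")
    return [(v - mu) / sd for v in values]


def standardize_matrix(expr):
    """Apply the scaling gene by gene.

    expr maps a gene name to its list of per-sample expression values.
    """
    return {gene: standardize(vals) for gene, vals in expr.items()}


# Toy example (made-up values, not study data)
expr = {"GENE_A": [2.0, 4.0, 6.0], "GENE_B": [10.0, 10.5, 12.5, 11.0]}
scaled = standardize_matrix(expr)
for gene, z in scaled.items():
    print(gene, round(fmean(z), 6), round(pstdev(z), 6))
```

Scaling each variable this way puts genes with very different expression ranges on a common scale before they enter a model alongside the clinical and genetic variables.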
haematologica | 2018; 103(3)
457

































































































