Page 102 - Haematologica - Vol. 105 n. 6 - June 2020

  L. Mazzarella et al.
gressive mutation accumulation and clonal stem cell expansion that accompanies aging.2 Although obesity has recently emerged as a prominent risk factor for a variety of solid tumors,3 its impact on hematologic neoplasms has received less attention. A moderate but consistently positive correlation between body mass index (BMI) and incidence of leukemias has been identified in observational studies,4-6 yet none of the collected evidence has been considered strong enough to establish obesity as a bona fide risk factor for AML.3,7 Most studies did not distinguish between myeloid/lymphoid and acute/chronic forms, nor between genetic subtypes within each form. AML is recognized as a highly heterogeneous disease with genetically diverse subtypes.8 Subtypes have radically different outcomes and, similarly, their risk may be differentially affected by environmental factors. Identification of subtype-specific risk associations, however, is made difficult by their rarity.
A genetic subset of AML, acute promyelocytic leukemia (APL), is characterized by a specific chromosomal translocation, t(15;17), homogeneous biology and response to the clinical agents all-trans retinoic acid (ATRA) and arsenic trioxide, which have made it the most curable form of AML to date.9 We previously demonstrated that the risk of relapse after ATRA/idarubicin is significantly increased in overweight/obese APL patients.10 In the present report, we investigated the association of overweight/obesity with the risk of developing APL and other leukemias. We describe the results of multiple studies across four western populations with significantly different dietary regimens and prevalence of obesity. All the studies demonstrated an increased risk of developing APL in overweight/obese subjects. In an effort to generate mechanistic hypotheses to explain this relationship, we analyzed transcriptomic and mutational data from the AML project in The Cancer Genome Atlas (TCGA)11 and identified alterations selectively associated with obesity and/or APL which may be involved in obesity-associated leukemogenesis.
Methods
UK population-based study: data collection and statistical methods
Full details of the methods for the UK population study were described previously.6 The study was approved by the London School of Hygiene and Tropical Medicine Ethics Committee. To identify outcomes of specific leukemia subtypes, Clinical Practice Research Datalink (CPRD) clinical records were searched for codes relating to specific leukemia subgroups. We controlled for multiple covariates at the time of the BMI record(s): age, smoking status, alcohol use, previous diabetes diagnosis, index of multiple deprivation, and calendar period; analyses were stratified by gender. We excluded people with missing smoking [49,206 of 5.24 million (0.9%)] and alcohol [394,196 of 5.24 million (7.5%)] status. Confidence intervals (CI) in Figure 1 are presented at the 99% level; all other CIs are presented at the 95% level.
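The practical difference between the 99% and 95% levels can be illustrated with a Wald-type interval for a ratio estimate. This is only a sketch: the estimate and standard error below are hypothetical, and the text above does not state how the reported intervals were computed.

```python
import math
from statistics import NormalDist


def ratio_ci(log_estimate, std_error, level):
    """Two-sided Wald interval for a ratio, built on the log scale.

    Assumption: the log of the ratio is approximately normal with the
    given standard error.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # normal quantile for the level
    lower = math.exp(log_estimate - z * std_error)
    upper = math.exp(log_estimate + z * std_error)
    return lower, upper


# Hypothetical ratio of 1.20 with a standard error of 0.05 on the log scale.
log_ratio, se = math.log(1.20), 0.05
ci95 = ratio_ci(log_ratio, se, 0.95)
ci99 = ratio_ci(log_ratio, se, 0.99)
```

The 99% interval is always wider than the 95% interval for the same estimate, which matters when comparing Figure 1 with the other reported intervals.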
Cross-sectional studies: data collection and statistical methods
Acute promyelocytic leukemia cases from Spain were extracted from the PETHEMA database and comprised 414 cases diagnosed between 1998 and 2012. APL cases from Italy comprised 134 adult patients treated under the AIDA protocol and included in the previously described cohort.10 APL cases from the USA included the entire cohort of the published AML The Cancer Genome Atlas (TCGA) project11 (n=20) plus 22 additional APL cases, unselected for any clinical variable, diagnosed at Washington University, St. Louis, MO, USA (Expanded TCGA cohort). For all case cohorts, BMI was measured at the time of diagnosis.
Data collection was approved by the Research Ethics Board of each participating institution, as referenced.11-14 Data sources for expected BMI in the local population are described in the Online Supplementary Appendix.
We compared the distribution of BMI observed in the three APL case cohorts to the distribution of BMI expected in the general population of the same countries. Specifically, to calculate the expected distribution of BMI in Italy, we used data from the Italian National Institute of Statistics,14 and we selected the area of Lazio, where the APL cases were diagnosed, in the years 2000-2010. For Spain, we used data from Eurostat,15 and we selected the general population of Spain in the year 2008, the only year available. For both Italy and Spain, the expected BMI distribution was calculated using the available age- and gender-specific BMI distribution of the general population classified into three categories (<25, 25-29.9, ≥30). For the USA, we used the 2009-2010 data from the American National Health and Nutrition Examination Survey.16 The expected BMI distribution was calculated using the available race-, age-, and gender-specific BMI distribution of the general population classified into four categories (<25, 25-29.9, 30.0-34.9, ≥35).
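One plausible way to turn stratum-specific population BMI distributions into expected case counts is sketched below. It assumes that each case is assigned to an age/gender stratum and that the expected count in a BMI category is the sum, over strata, of the number of cases times the population share of that category. The strata, shares and case counts are hypothetical and are not taken from the study or its supplementary material.

```python
BMI_CATEGORIES = ("<25", "25-29.9", ">=30")


def expected_counts(cases_per_stratum, population_shares):
    """Expected number of cases per BMI category.

    cases_per_stratum: {stratum: number of observed cases}
    population_shares: {stratum: {BMI category: share in the general population}}
    """
    totals = {cat: 0.0 for cat in BMI_CATEGORIES}
    for stratum, n_cases in cases_per_stratum.items():
        shares = population_shares[stratum]
        if abs(sum(shares.values()) - 1.0) > 1e-9:
            raise ValueError(f"shares for {stratum} do not sum to 1")
        for cat in BMI_CATEGORIES:
            totals[cat] += n_cases * shares[cat]
    return totals


# Hypothetical strata (gender, age band) and values, for illustration only.
cases = {("F", "18-44"): 30, ("F", "45+"): 20, ("M", "18-44"): 25, ("M", "45+"): 25}
shares = {
    ("F", "18-44"): {"<25": 0.70, "25-29.9": 0.20, ">=30": 0.10},
    ("F", "45+"): {"<25": 0.50, "25-29.9": 0.30, ">=30": 0.20},
    ("M", "18-44"): {"<25": 0.60, "25-29.9": 0.30, ">=30": 0.10},
    ("M", "45+"): {"<25": 0.40, "25-29.9": 0.40, ">=30": 0.20},
}
expected = expected_counts(cases, shares)
```

Because every case contributes shares that sum to one, the expected counts add up to the total number of cases, so observed and expected counts are directly comparable.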
The global null hypothesis that the observed counts did not differ from the expected ones across the BMI categories was tested in a null Poisson regression model, where the observed counts were considered as the dependent variable and the expected counts as the offset. BMI was included in the model as an ordinal variable to test the log-linear relationship between BMI and the observed-to-expected ratio (i.e. to test for linear trend). The P-value of Pearson's χ2 goodness-of-fit test was reported.
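The two tests described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the authors' code: the counts are hypothetical, the expected counts are assumed to sum to the observed total (so the goodness-of-fit test has one degree of freedom fewer than the number of categories), BMI categories are scored 0, 1, 2, and the trend is assessed with a Wald test on the slope of log(mu) = log(expected) + b0 + b1 * score.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:  # recurrence Q(a+1, y) = Q(a, y) + y^a e^-y / Gamma(a+1)
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def pearson_gof(observed, expected):
    """Pearson chi-square statistic, degrees of freedom and P-value."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df, chi2_sf(stat, df)


def poisson_trend(observed, expected, scores, max_iter=50, tol=1e-10):
    """Fit the offset Poisson model by Newton-Raphson; return slope, SE, P."""
    def information(b0, b1):
        mu = [e * math.exp(b0 + b1 * s) for e, s in zip(expected, scores)]
        i00 = sum(mu)
        i01 = sum(m * s for m, s in zip(mu, scores))
        i11 = sum(m * s * s for m, s in zip(mu, scores))
        return mu, i00, i01, i11

    b0, b1 = math.log(sum(observed) / sum(expected)), 0.0
    for _ in range(max_iter):
        mu, i00, i01, i11 = information(b0, b1)
        u0 = sum(o - m for o, m in zip(observed, mu))
        u1 = sum(s * (o - m) for o, m, s in zip(observed, mu, scores))
        det = i00 * i11 - i01 * i01
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (i00 * u1 - i01 * u0) / det
        b0, b1 = b0 + d0, b1 + d1
        if max(abs(d0), abs(d1)) < tol:
            break
    _, i00, i01, i11 = information(b0, b1)
    se = math.sqrt(i00 / (i00 * i11 - i01 * i01))
    p_value = math.erfc(abs(b1 / se) / math.sqrt(2.0))  # two-sided Wald test
    return b1, se, p_value


# Hypothetical counts for three BMI categories (<25, 25-29.9, >=30).
observed = [40, 35, 25]
expected = [56.0, 29.5, 14.5]
stat, df, p_gof = pearson_gof(observed, expected)
slope, slope_se, p_trend = poisson_trend(observed, expected, scores=[0, 1, 2])
```

A positive slope means the observed-to-expected ratio rises with BMI category; exp(slope) is the multiplicative change in that ratio per category.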
Expression data analysis
Expression data (RPKM matrix) were downloaded from the AML TCGA data portal. Cases with available RNAseq, BMI and French-American-British (FAB) classification data (177 of 200) were used in the present study. Cases were classified by FAB into "APL" (FAB="M3") and "non-APL" (FAB ≠ "M3"), and by BMI into "obese" (BMI ≥ 30) and "non-obese" (BMI < 30). Genes with < 0.2 reads per kilobase per million mapped reads (RPKM) in at least 75% of patients were removed.11 The Quantitative Set Analysis for Gene Expression method, as implemented in the quSAGE package15 in the R programming language (v3.2.3), was used to conduct supervised gene set enrichment analysis. For each expressed gene, the quSAGE algorithm calculates a probability density function (PDF) of differential expression between two groups of samples. For each gene set, it then calculates "activity", i.e. the mean difference in log-expression of the individual genes included in the gene set. Gene sets with False Discovery Rate (FDR) < 0.05 were considered significant. We used the Kyoto Encyclopedia of Genes and Genomes (KEGG) and chemical and genetic perturbations (CGP) gene set collections, downloaded from MSigDB (http://software.broadinstitute.org/gsea/msigdb/). The CGP collection was used to confirm enrichment of previously identified APL-specific gene signatures16 (Online Supplementary Table S1). Our main focus was the KEGG collection, as it is enriched for metabolism-associated gene annotations.16 The script to generate the present results is available on request.
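The low-expression filter and the point estimate of gene set "activity" can be sketched as follows. This is not the quSAGE implementation, which works with full probability density functions rather than point estimates. The gene names, sample names and RPKM values are hypothetical, and the log2(RPKM + 1) transform is an assumption made here for illustration.

```python
import math

RPKM_FLOOR = 0.2         # threshold stated in the text
MAX_LOW_FRACTION = 0.75  # genes low in at least this share of patients are removed


def keep_gene(rpkm_values):
    """True unless the gene is below the floor in at least 75% of samples."""
    low = sum(1 for value in rpkm_values if value < RPKM_FLOOR)
    return low / len(rpkm_values) < MAX_LOW_FRACTION


def mean(values):
    return sum(values) / len(values)


def gene_set_activity(expr, group_a, group_b, gene_set):
    """Mean over genes of (mean log-expression in A) - (mean log-expression in B).

    expr: {gene: {sample: RPKM}}; genes absent from expr are skipped.
    Assumption: log-expression is log2(RPKM + 1).
    """
    diffs = []
    for gene in gene_set:
        if gene not in expr:
            continue
        row = expr[gene]
        log_a = [math.log2(row[s] + 1.0) for s in group_a]
        log_b = [math.log2(row[s] + 1.0) for s in group_b]
        diffs.append(mean(log_a) - mean(log_b))
    return mean(diffs) if diffs else None


# Hypothetical RPKM matrix: two "APL" and two "non-APL" samples.
rpkm = {
    "G1": {"s1": 3.0, "s2": 3.0, "s3": 1.0, "s4": 1.0},
    "G2": {"s1": 7.0, "s2": 7.0, "s3": 3.0, "s4": 3.0},
    "G3": {"s1": 0.1, "s2": 0.0, "s3": 0.1, "s4": 0.5},  # low in 3 of 4 samples
}
filtered = {g: row for g, row in rpkm.items() if keep_gene(list(row.values()))}
activity = gene_set_activity(filtered, ["s1", "s2"], ["s3", "s4"], ["G1", "G2", "G3"])
```

In this toy example G3 is removed by the filter, so the activity is averaged over G1 and G2 only. A positive value indicates higher average log-expression of the set in the first group.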
Mutational data analysis
For the analysis of the TCGA data, mutational data were retrieved from the TCGA AML paper,11 and AML driver genes (restricted to those with at least 2 mutations in the dataset) were