
C.W. Thorball et al.
Introduction
Human immunodeficiency virus (HIV) infection is associated with a markedly increased risk of several types of cancer compared to the general population.¹⁻³ This elevated cancer risk can be attributed partly to viral-induced immunodeficiency, frequent co-infections with oncogenic viruses (e.g., Epstein-Barr virus [EBV], hepatitis B and hepatitis C viruses, human herpesvirus 8 [HHV-8] and papillomavirus), and increased prevalence of traditional risk factors such as smoking.⁴,⁵ However, even taken together, these risk factors may not fully explain the excess cancer burden seen in the HIV-infected (HIV+) population.⁵,⁶
A previous study performed in the Swiss HIV Cohort Study (SHCS) identified two AIDS-defining cancers, Kaposi sarcoma and non-Hodgkin lymphoma (NHL), as the main types of cancer found among HIV+ patients (NHL representing 34% of all identified cancers).⁴ The relative risk of developing NHL in HIV+ patients was highly elevated compared to the general population (period-standardized incidence ratio [SIR] = 76.4).⁴ High HIV plasma viral load, absence of antiretroviral therapy (ART) and low CD4+ T-cell counts are known predictive factors for NHL.⁷,⁸ The introduction of ART into clinical practice has led to improved overall survival and restoration of immunity by decreasing viral load and increasing CD4+ T-cell counts, and in turn to a decreased risk of developing NHL. However, the risk remains substantially elevated compared to the general population (SIR = 9.1 [range, 8.3–10.1])⁹ and NHL still represents 20% of all cancers in people living with HIV in the ART era.¹⁰ NHL associated with HIV are predominantly aggressive B-cell lymphomas. Although they are heterogeneous, they share several pathogenic mechanisms involving chronic antigen stimulation, impaired immune response, cytokine deregulation and reactivation of the oncogenic viruses EBV and HHV-8.¹¹
The emergence of genome-wide approaches in human genomics has led to the discovery of many associations between common genetic polymorphisms and susceptibility to several diseases, including HIV infection and multiple types of cancer.¹²,¹³ Recent genome-wide association studies (GWAS) of NHL have identified multiple susceptibility loci in the European population.¹⁴⁻²² These variants are located in the genes LPXN,²¹ BTNL2,²³ EXOC2, NCOA1,¹⁴ PVT1,¹⁴,²² CXCR5, ETS1, LPP, and BCL2²² for various subtypes of NHL, as well as BCL6 in the Chinese population.²⁴ Strong associations with variation in human leukocyte antigen (HLA) genes have also been reported.¹⁵,¹⁸,²² However, no genome-wide analysis of the occurrence of NHL has been reported in the setting of HIV infection, and the specific mechanisms driving the development of these lymphomas remain largely unknown.
Here we report the results of the first genome-wide analysis of NHL susceptibility in individuals chronically infected with HIV. We combined three HIV cohort studies from France, Switzerland and the USA and searched for associations between >6 million single nucleotide polymorphisms (SNP) and a diagnosis of NHL. We identified a novel genetic locus near CXCL12 that is associated with the development of NHL among HIV+ individuals.
Methods
Ethics statement
The SHCS, the Primo ANRS and ANRS CO16 Lymphovir cohorts (ANRS) and the Multicenter AIDS Cohort Study (MACS) have been approved by the competent Ethics Committees/Institutional Review Boards of all participating institutions. Written informed consent, including consent for human genetic testing, was obtained from all study participants.
Study participants and contributing centers
Genotyping and phenotypic data were obtained from a total of 2,202 HIV+ individuals enrolled in the SHCS, ANRS and MACS cohorts (278 cases and 1,924 controls) (Table 1). For details on inclusion criteria and cohorts, refer to the Online Supplementary Appendix.
Quality control and imputation of genotyping data
The genotyping data from each cohort were filtered and imputed in a similar way, with each genotyping array processed separately to minimize potential batch effects. All variants were first flipped to the correct strand orientation with BCFTOOLS (v1.8), using the human genome build GRCh37 as reference. Variants were removed if their minor allele frequency (MAF) deviated by more than 20% from the 1000 Genomes Project phase 3 EUR reference panel or by more than 10% between genotyping chips in the same cohort.
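The two MAF-deviation filters can be illustrated with a minimal Python sketch. It assumes that the 20% and 10% thresholds are absolute differences in allele frequency, which the text does not state explicitly; the function name, variant labels and frequencies are hypothetical and are not taken from the study.

```python
def passes_maf_checks(maf_by_chip, maf_reference,
                      max_ref_dev=0.20, max_chip_dev=0.10):
    """Return True if a variant survives both MAF-deviation filters.

    Assumption: deviations are absolute differences in allele frequency.
    """
    mafs = list(maf_by_chip.values())
    # Filter 1: the MAF on every chip must stay close to the reference panel.
    if any(abs(maf - maf_reference) > max_ref_dev for maf in mafs):
        return False
    # Filter 2: MAFs measured on different chips of one cohort must agree.
    return max(mafs) - min(mafs) <= max_chip_dev


# Hypothetical variants: ({chip: MAF}, reference-panel MAF)
variants = {
    "variant_1": ({"chipA": 0.21, "chipB": 0.24}, 0.25),  # passes both filters
    "variant_2": ({"chipA": 0.05, "chipB": 0.45}, 0.30),  # too far from reference
    "variant_3": ({"chipA": 0.05, "chipB": 0.18}, 0.12),  # chips disagree
}
kept = [name for name, (by_chip, ref) in variants.items()
        if passes_maf_checks(by_chip, ref)]
print(kept)
```

Only the first hypothetical variant is retained; the second deviates too much from the reference panel and the third differs too much between chips.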
The quality control (QC)-filtered genotypes were phased with EAGLE2²⁵ and missing genotypes were imputed using PBWT²⁶ with the Sanger Imputation Service,²⁷ taking the 1000 Genomes Project phase 3 panel as reference. Only high-quality variants with an imputation score (INFO) >0.8 were retained for further analyses.
Genome-wide association testing and meta-analysis
In order to search for associations between human genomic variation and the development of HIV-related NHL, we first performed separate GWAS within each cohort (SHCS, ANRS and MACS) prior to combining the results in a meta-analysis.
For each cohort separately, the imputed variants were filtered using PLINK (v2.00a2LM)²⁸ to remove those with high missingness (>0.1), low MAF (<0.02) or deviation from Hardy-Weinberg equilibrium (P <1e-6). Population structure was determined and principal components were calculated using EIGENSTRAT (v6.1.4)²⁹ and the HapMap3 reference panel.³⁰ All individuals not clustering with the European HapMap3 samples were excluded from further analyses. The samples were screened using KING (v2.1.3)³¹ to ensure that no duplicate or cryptically related samples were included. Single-marker case-control association analyses were performed using linear mixed models, with genetic relationship matrices calculated between pairs of individuals according to the leave-one-chromosome-out principle, as implemented in GCTA mlma-loco (v1.91.4beta).³²,³³ Sex was included as a covariate, except in the MACS cohort, which only includes men.
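The per-variant thresholds above can be sketched as follows. This is an illustration only, not the PLINK implementation: it substitutes a 1-degree-of-freedom chi-square test for the Hardy-Weinberg test (PLINK uses an exact test), and the function names and genotype counts are hypothetical.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) P value for deviation from Hardy-Weinberg equilibrium.

    Simplification: a stand-in for the exact test used by PLINK.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2))


def keep_variant(n_aa, n_ab, n_bb, n_missing,
                 max_missing=0.1, min_maf=0.02, min_hwe_p=1e-6):
    """Apply the missingness, MAF and HWE thresholds to genotype counts."""
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return False
    missingness = n_missing / (n_called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1.0 - freq_a)
    return (missingness <= max_missing
            and maf >= min_maf
            and hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= min_hwe_p)


print(keep_variant(360, 480, 160, 0))    # in equilibrium, common, complete
print(keep_variant(500, 0, 500, 0))      # no heterozygotes: strong HWE deviation
print(keep_variant(990, 10, 0, 0))       # MAF below 0.02
print(keep_variant(360, 480, 160, 200))  # missingness above 0.1
```

Only the first hypothetical variant passes all three thresholds.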
The results of the three GWAS were combined across cohorts using a weighted Z-score-based meta-analysis in PLINK (v1.90b5.4), after exclusion of the variants that were not present in all three cohorts.
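A weighted Z-score meta-analysis of this kind can be sketched in a few lines of Python. The sketch assumes Stouffer's method with weights equal to the square root of each cohort's sample size; the exact weighting applied by PLINK is not described in the text. It also assumes that the per-cohort Z-scores refer to the same effect allele. The function name and input values are hypothetical.

```python
import math


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-cohort Z-scores into one Z-score and two-sided P value.

    Assumption: weights are sqrt(sample size) (Stouffer's weighted method).
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    z_meta = numerator / denominator
    # Two-sided P value from the standard normal distribution.
    p_meta = math.erfc(abs(z_meta) / math.sqrt(2))
    return z_meta, p_meta


# Hypothetical Z-scores for one variant in three cohorts of equal size.
z_meta, p_meta = weighted_z_meta([2.0, 2.0, 2.0], [100, 100, 100])
print(z_meta, p_meta)
```

With three concordant cohorts of equal size, the combined Z-score equals the per-cohort Z-score multiplied by the square root of three, which shows how consistent moderate signals reinforce each other, while signals of opposite sign cancel out.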
Other methods
The details of the cohorts and other methods used, i.e., fine mapping, prediction of causal variants, long-range chromatin interactions, transcriptomic effects, comparisons to GWAS in
2234
haematologica | 2021; 106(8)