
Host genomics of HIV-related lymphoma
the general population and information on data sharing can be found in the Online Supplementary Appendix.
Results
Study participants and association testing
In order to identify human genetic determinants of HIV-associated NHL, we performed case-control GWAS in three groups of HIV+ patients of European ancestry (SHCS, ANRS and MACS). The characteristics of the study participants are presented in Table 1. In total, genotyping data were obtained for 278 cases (NHL+/HIV+) and 1,924 matched controls (NHL-/HIV+). With this sample size, we had 80% power to detect a common genetic variant (10% minor allele frequency) with a relative risk of 2.5, assuming an additive genetic model and using Bonferroni correction for multiple testing (Pthreshold=5e-8).34
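As a rough check on where this threshold comes from, the conventional genome-wide level corresponds to a family-wise error rate of 0.05 spread over about one million independent common-variant tests. The figure of one million is the usual convention and an assumption here, not a number stated in this study:

```latex
P_{\text{threshold}} = \frac{\alpha}{m} = \frac{0.05}{10^{6}} = 5 \times 10^{-8}
```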
After genome-wide imputation and quality control, 6.2 million common variants were tested for association with the development of NHL using linear mixed models including sex as a covariate. Results were combined across cohorts using a weighted Z-score-based meta-analysis (Figure 1A). The genomic inflation factor (lambda) was in all cases very close to 1 (range, 1.00–1.01), indicating an absence of systematic inflation of the association results (Figure 1B; Online Supplementary Figure S2).
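The two quantities in this paragraph can be illustrated with a minimal Python sketch. It assumes Stouffer-style weighting by the square root of each cohort's effective sample size, which is a common scheme but is not confirmed by the text as the weighting used here; the function names and the example Z-scores are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df


def effective_n(n_cases, n_controls):
    """Effective sample size of an unbalanced case-control cohort."""
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


def weighted_z_meta(z_scores, weights):
    """Combine per-cohort Z-scores: Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2))."""
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    z = numerator / denominator
    p_two_sided = 2.0 * _STD_NORMAL.cdf(-abs(z))
    return z, p_two_sided


def genomic_inflation(z_scores):
    """Lambda: median observed chi-square (z squared) over its expected median."""
    return median(z * z for z in z_scores) / CHI2_1DF_MEDIAN


# Cohort sizes (cases, controls) taken from Table 1.
cohorts = {"SHCS": (145, 1090), "ANRS": (61, 562), "MACS": (72, 272)}
weights = [sqrt(effective_n(ca, co)) for ca, co in cohorts.values()]

# Hypothetical per-cohort Z-scores for one variant (illustration only).
example_z = [4.0, 3.0, 0.5]
meta_z, meta_p = weighted_z_meta(example_z, weights)
```

A lambda near 1 means the median test statistic matches its expectation under the null, which is how the reported range of 1.00–1.01 is read as an absence of systematic inflation.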
Association results
We observed significant associations with the development of HIV-related NHL at a single locus on chromosome 10, downstream of CXCL12 (Figure 1C). A total of seven SNP in this locus had P-values lower than the genome-wide significance threshold (P<5e-8), with rs7919208 displaying the strongest association (Table 2). This association was only detected in the SHCS and ANRS cohorts and not among MACS study participants (Online Supplementary Table S1).
Fine mapping of the CXCL12 locus
In order to identify the causal variant(s) among associated SNP and determine their potential functional effects, we used a multi-level fine mapping approach, combining the statistical fine mapping tool PAINTOR to obtain a 99% credible set and the deep learning framework DeepSEA to predict any effects on chromatin marks and transcription factor binding these variants may have. Using PAINTOR, we identified a single variant, rs7919208, having a high posterior probability (=100%) of being causal among the 99% credible set based on the integration of the association results, linkage disequilibrium (LD) structure and enrichment of genomic features in this locus (Figure 2).
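The notion of a 99% credible set can be sketched as follows. This is a generic construction from per-variant posterior probabilities, not PAINTOR's own code or API: variants are ranked by posterior and added until their normalized cumulative posterior reaches the requested coverage. The function name, variant labels and posterior values are hypothetical.

```python
def credible_set(posteriors, coverage=0.99):
    """Smallest set of variants whose normalized posteriors sum to >= coverage.

    `posteriors` maps a variant identifier to its posterior probability of
    being causal; values are renormalized so that they sum to 1.
    """
    total = sum(posteriors.values())
    if total <= 0:
        raise ValueError("posterior probabilities must sum to a positive value")

    # Rank variants from most to least likely to be causal.
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)

    chosen, cumulative = [], 0.0
    for variant, posterior in ranked:
        chosen.append(variant)
        cumulative += posterior / total
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical posteriors for four variants at one locus (illustration only).
example = {"snpA": 0.90, "snpB": 0.08, "snpC": 0.015, "snpD": 0.005}
selected = credible_set(example)
```

When one variant carries essentially all of the posterior mass, as reported for rs7919208, it dominates the credible set regardless of how many other variants are needed to reach the coverage level.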
Consistent with the PAINTOR result, DeepSEA also identified rs7919208 as the sole variant, among the 99% credible set, predicted to have a functional impact by significantly increasing the probability of binding by the B-cell transcription factors BATF (log2 fold-change=3.27) and JUND (log2 fold-change=2.91) (Online Supplementary Table S2). Further analysis of the genomic sequence surrounding rs7919208 and the JASPAR transcription factor binding site (TFBS) motifs for BATF and JUND revealed that the rs7919208 G>A polymorphism creates the TFBS motif required for the binding of these transcription factors (Figure 3A).
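To give a sense of scale for these scores, the short sketch below converts a log2 fold-change back to a linear ratio. It assumes the reported values can be read as plain base-2 logarithms of a ratio between the alternative and reference allele predictions; how DeepSEA defines that ratio internally is not stated here, and the helper name is hypothetical.

```python
def log2fc_to_fold(log2_fold_change):
    """Convert a log2 fold-change to the corresponding linear fold-change."""
    return 2.0 ** log2_fold_change


# Values reported in the text for the two transcription factors.
batf_fold = log2fc_to_fold(3.27)
jund_fold = log2fc_to_fold(2.91)
```

Under that reading, the reported values correspond to roughly a 9.6-fold (BATF) and a 7.5-fold (JUND) increase for the minor allele.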
Long-range chromatin interactions
In order to assess the potential functional links between the TFBS created in the presence of the minor allele of rs7919208 and the nearby genes, we performed an analysis of promoter capture Hi-C data and topologically associating domains (TAD). We used the well-characterized GM12878 lymphoblastoid cell line produced by EBV transformation of B lymphocytes collected from a female European donor as a model.
First, in order to examine the interaction potential of the rs7919208 region with nearby promoters, we analyzed available promoter capture Hi-C data obtained from the GM12878 cell line. This analysis revealed a significant interaction between the rs7919208 region and the CXCL12 promoter, suggesting a possible modulating impact of rs7919208 on the transcription of that gene (Figure 3B). Second, in order to further validate this observed genomic interaction, we analyzed available TAD calls from GM12878 cells,35 using the 3D Genome Browser for visualization36 (Figure 3C). We observed that rs7919208 is located within a large TAD together with CXCL12, supporting the potential for interaction between the new TFBS at rs7919208 and CXCL12.
Transcriptomic effects of rs7919208
We did not observe any association between rs7919208 and mRNA expression levels of CXCL12 in peripheral
Table 1. Summary of included samples and studies.
Cohort             Cases    Controls    Lambda
SHCS               145      1,090       1.00
  Age (median)     61       58
  Sex (male, %)    91%      80%
ANRS               61       562         1.00
  Age (median)     50       34
  Sex (male, %)    89%      87%
MACS               72       272         1.01
  Age (median)     69       68
  Sex (male, %)    100%     100%

Cohort    Genotyping chips; Years of NHL diagnosis; Control inclusion criteria
SHCS      Illumina HumanOmniExpress-24, Human1M, Human610, HumanHap550, HumanCore-12; 2000 - 2017; HIV < 2005, no cancer diagnosis as of 2017 & matched with age
ANRS      Illumina Human Omni5 Exome 4v 1-2, Illumina 300; 2008 - 2015; No cancer diagnosis
MACS      Illumina 1MV1, Human1M-Duo, HumanHap550; 1985 - 2013; Matched to cases in terms of age, treatment & time of infection
Cohort and patient characteristics for the Swiss HIV Cohort Study (SHCS), the Primo ANRS and ANRS CO16 Lymphovir cohorts (ANRS) and the Multicenter AIDS Cohort Study (MACS) cohorts. Lambda indicates the genomic inflation factor from the individual cohort genome-wide association studies (GWAS). NHL: non-Hodgkin lymphoma; HIV: human immunodeficiency virus.
haematologica | 2021; 106(8)