Genetic Variation in Putative Regulatory Loci Controlling Gene Expression in Breast Cancer
Candidate SNPs were tested for association with an unselected, genome-wide pool of tumor mRNA transcripts in 50 unrelated patients with breast cancer. The SNPs were selected from 203 candidate genes of the reactive oxygen species (ROS) pathway. We describe a general statistical framework for the simultaneous analysis of gene expression data and SNP genotype data measured in the same cohort. This framework revealed significant associations between subsets of SNPs and transcripts that shed light on the underlying biology. We identified SNPs in EGF, IL1A, MAPK8, XPC, SOD2 and ALOX12 that are associated with the expression patterns of a significant number of transcripts, indicating the presence of regulatory SNPs in these genes. SNPs in a total of 115 genes were found to act in trans, and SNPs in 43 of these 115 genes were found to act both in cis and in trans. Finally, we identified biclusters: subsets of SNPs that share a significantly large number of associations with a common set of transcripts. The subsets of transcripts significantly associated with the same set of SNPs, or with a single SNP, were functionally coherent in GO and pathway analyses and were co-expressed in other, independent data sets, suggesting that many of the observed associations lie within the same functional pathways. This paper is the first study to correlate germline SNP genotype data with somatic gene expression data in breast tumors. It provides a statistical framework for further genotype-expression correlation studies in cancer data sets.
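
To make the idea of counting the transcripts associated with a single SNP concrete, the following is a minimal sketch and not the framework used in the paper. The abstract does not name the test statistic, so the one-way ANOVA F statistic, the permutation p-value, the 0/1/2 genotype coding, the threshold `alpha`, and all function names here are assumptions chosen for illustration only.

```python
import random
from statistics import mean


def f_statistic(expression, genotypes):
    """One-way ANOVA F statistic of expression values across genotype groups."""
    groups = {}
    for value, genotype in zip(expression, genotypes):
        groups.setdefault(genotype, []).append(value)
    k, n = len(groups), len(expression)
    if k < 2 or n <= k:
        return 0.0  # not enough groups or samples to compare
    grand = mean(expression)
    between = sum(len(v) * (mean(v) - grand) ** 2 for v in groups.values())
    within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def permutation_p_value(expression, genotypes, n_perm=1000, rng=None):
    """Permutation p-value: shuffle genotype labels and recompute F."""
    rng = rng or random.Random(0)
    observed = f_statistic(expression, genotypes)
    shuffled = list(genotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if f_statistic(expression, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


def count_associated_transcripts(genotypes, transcripts, alpha=0.05,
                                 n_perm=1000, seed=0):
    """Number of transcripts whose expression is associated with one SNP.

    `transcripts` maps a transcript name to its per-patient expression values,
    listed in the same patient order as `genotypes` (hypothetical layout).
    """
    rng = random.Random(seed)
    return sum(
        permutation_p_value(expr, genotypes, n_perm, rng) <= alpha
        for expr in transcripts.values()
    )
```

In a genome-wide setting the per-transcript threshold would need a multiple-testing correction, and whether a SNP's count is itself "a significant number" would require a second null distribution over counts; both steps are omitted from this sketch.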