A novel approach to simulate gene-environment interactions in complex diseases

By Roberto Amato, Michele Pinelli, Daniel D’Andrea, Gennaro Miele, Mario Nicodemi, Giancarlo Raiconi and Sergio Cocozza


Background: Complex diseases are multifactorial traits caused by both genetic and environmental factors. They represent the major part of human diseases and include those with the largest prevalence and mortality (cancer, heart disease, obesity, etc.). Although a large amount of information has been collected about both genetic and environmental risk factors, there are few examples of studies on their interactions in the epidemiological literature. One reason may be the incomplete knowledge of the power of the statistical methods designed to search for risk factors and their interactions in these data sets. An improvement in this direction would lead to a better understanding and description of gene-environment interactions. To this end, a possible strategy is to challenge the different statistical methods against data sets where the underlying phenomenon is completely known and fully controllable, for example simulated ones.

Results: We present a mathematical approach that models gene-environment interactions. With this method it is possible to generate simulated populations having gene-environment interactions of any form, involving any number of genetic and environmental factors and also allowing non-linear interactions such as epistasis. In particular, we implemented a simple version of this model in a Gene-Environment iNteraction Simulator (GENS), a tool designed to simulate case-control data sets in which a one-gene, one-environment interaction influences the disease risk. The main aim has been to allow the input of population characteristics by using standard epidemiological measures and to implement constraints that make the simulator's behaviour biologically meaningful.

Conclusions: With the multi-logistic model implemented in GENS it is possible to simulate case-control samples of complex diseases in which gene-environment interactions influence the disease risk. The user has full control of the main characteristics of the simulated population, and a Monte Carlo process allows random variability. A knowledge-based approach reduces the complexity of the mathematical model by using reasonable biological constraints and makes the simulation more understandable in biological terms. Simulated data sets can be used for the assessment of novel statistical methods or for the evaluation of statistical power when designing a study.
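To make the idea in the abstract concrete, here is a minimal sketch of a Monte Carlo case-control simulation with a one-gene, one-environment interaction. It is not the GENS implementation: the abstract does not give the model's equations, so this sketch assumes a single logistic risk function with an interaction term, Hardy-Weinberg genotype frequencies, and a standard normal exposure. The function name, parameter names, and default values are all hypothetical.

```python
import math
import random


def simulate_case_control(n_population, n_cases, n_controls, maf=0.3,
                          baseline_log_odds=-3.0, beta_gene=0.2,
                          beta_env=0.3, beta_interaction=0.5, seed=0):
    """Return a list of (genotype, exposure, diseased) records.

    Illustrative assumptions (not taken from the paper):
    - genotype counts risk alleles (0, 1, 2) under Hardy-Weinberg equilibrium
    - exposure is a standard normal variable
    - disease risk follows a logistic model with a gene-by-environment term
    """
    rng = random.Random(seed)
    cases, controls = [], []
    for _ in range(n_population):
        # Draw two alleles independently, each a risk allele with probability maf.
        genotype = int(rng.random() < maf) + int(rng.random() < maf)
        exposure = rng.gauss(0.0, 1.0)
        # Log-odds of disease, including the interaction term.
        logit = (baseline_log_odds
                 + beta_gene * genotype
                 + beta_env * exposure
                 + beta_interaction * genotype * exposure)
        risk = 1.0 / (1.0 + math.exp(-logit))
        # Monte Carlo step: disease status is a Bernoulli draw with this risk.
        diseased = rng.random() < risk
        record = (genotype, exposure, int(diseased))
        (cases if diseased else controls).append(record)
    if len(cases) < n_cases or len(controls) < n_controls:
        raise ValueError("population too small for the requested sample sizes")
    # Case-control sampling: draw cases and controls separately.
    return rng.sample(cases, n_cases) + rng.sample(controls, n_controls)


sample = simulate_case_control(n_population=20000, n_cases=200, n_controls=200)
```

Because the true effect sizes are set by the caller, a data set generated this way can be passed to a statistical method to check whether it recovers the interaction, which is the use case the abstract describes.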

Topics: RA
Publisher: BioMed Central Ltd.
Year: 2010
OAI identifier: oai:wrap.warwick.ac.uk:3003
