The challenge for genetic epidemiologists: how to analyze large numbers of SNPs in relation to complex diseases
Genetic epidemiologists have taken up the challenge of identifying genetic polymorphisms involved in the development of diseases. Many have collected data on large numbers of genetic markers but are not familiar with the available methods for assessing their association with complex diseases. Statistical methods have been developed for analyzing the relation of large numbers of genetic and environmental predictors to disease or disease-related variables in genetic association studies. In this commentary we discuss logistic regression analysis; neural networks, including the parameter decreasing method (PDM) and genetic programming optimized neural networks (GPNN); and several non-parametric methods, which include the set association approach, the combinatorial partitioning method (CPM), the restricted partitioning method (RPM), the multifactor dimensionality reduction (MDR) method and the random forests approach. The relative strengths and weaknesses of these methods are highlighted. Logistic regression and neural networks can handle only a limited number of predictor variables, depending on the number of observations in the dataset. Therefore, they are less useful than the non-parametric methods for association studies with large numbers of predictor variables. GPNN, on the other hand, may be a useful approach for selecting and modeling important predictors, but its ability to select the important effects in the presence of large numbers of predictors needs to be examined. Both the set association approach and the random forests approach are able to handle a large number of predictors and are useful in reducing these predictors to a subset with an important contribution to disease. The combinatorial methods give more insight into combination patterns for sets of genetic and/or environmental predictor variables that may be related to the outcome variable.
As the non-parametric methods have different strengths and weaknesses, we conclude that, for genetic association studies using the case-control design, applying a combination of several methods, including the set association approach, MDR and the random forests approach, will likely be a useful strategy for finding the important genes and interaction patterns involved in complex diseases.
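To make the MDR idea mentioned in this abstract more concrete, the following is a minimal sketch of its core step, assuming the commonly described formulation: each multi-locus genotype cell is labeled high risk when its case-to-control ratio meets a threshold, and the resulting one-dimensional attribute is scored by its classification error. The function names, the threshold default and the toy data are illustrative assumptions, not taken from the commentary, and cross-validation is omitted.

```python
from collections import Counter


def mdr_high_risk_cells(genotypes, status, threshold=1.0):
    """Return the set of multi-locus genotype cells labeled high risk.

    genotypes: one tuple of genotype codes per individual (hypothetical coding).
    status:    1 for a case, 0 for a control.
    """
    cases = Counter(g for g, y in zip(genotypes, status) if y == 1)
    controls = Counter(g for g, y in zip(genotypes, status) if y == 0)
    high = set()
    for cell in set(cases) | set(controls):
        # cases / controls >= threshold, written without a division so that
        # cells containing no controls do not raise an error.
        if cases[cell] >= threshold * controls[cell]:
            high.add(cell)
    return high


def mdr_error(genotypes, status, high):
    """Fraction of individuals misclassified by the high/low-risk labeling."""
    predicted = [1 if g in high else 0 for g in genotypes]
    wrong = sum(p != y for p, y in zip(predicted, status))
    return wrong / len(status)


# Toy data: disease status depends only on the combination of two loci.
toy_genotypes = [(0, 0), (0, 1), (1, 0), (1, 1)] * 2
toy_status = [0, 1, 1, 0] * 2
toy_high = mdr_high_risk_cells(toy_genotypes, toy_status)
toy_error = mdr_error(toy_genotypes, toy_status, toy_high)
```

In this toy example neither locus is informative on its own, yet the two-locus labeling separates cases from controls, which is the kind of interaction pattern the combinatorial methods are meant to expose.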
Reconsidering Association Testing Methods Using Single-Variant Test Statistics as Alternatives to Pooling Tests for Sequence Data with Rare Variants
Association tests that pool minor alleles into a measure of burden at a locus have been proposed for case-control studies using sequence data containing rare variants. However, such pooling tests are not robust to the inclusion of neutral and protective variants, which can mask the association signal from risk variants. Early studies proposing pooling tests dismissed methods for locus-wide inference using nonnegative single-variant test statistics based on unrealistic comparisons. However, such methods are robust to the inclusion of neutral and protective variants and therefore may be more useful than previously appreciated. In fact, some recently proposed methods derived within different frameworks are equivalent to performing inference on weighted sums of squared single-variant score statistics. In this study, we compared two existing methods for locus-wide inference using nonnegative single-variant test statistics to two widely cited pooling tests under more realistic conditions. We established analytic results for a simple model with one rare risk and one rare neutral variant, which demonstrated that pooling tests were less powerful than even Bonferroni-corrected single-variant tests in most realistic situations. We also performed simulations using variants with realistic minor allele frequency and linkage disequilibrium spectra, disease models with multiple rare risk variants and extensive neutral variation, and varying rates of missing genotypes. In all scenarios considered, existing methods using nonnegative single-variant test statistics had power comparable to or greater than two widely cited pooling tests. Moreover, in disease models with only rare risk variants, an existing method based on the maximum single-variant Cochran-Armitage trend chi-square statistic in the locus had power comparable to or greater than another existing method closely related to some recently proposed methods. 
We conclude that efficient locus-wide inference using single-variant test statistics should be reconsidered as a useful framework for devising powerful association tests in sequence data with rare variants.
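The masking effect described in this abstract can be illustrated with a small sketch. It assumes the usual large-sample form of the Cochran-Armitage trend chi-square, N times the squared Pearson correlation between genotype score and case status, and uses a simple minor-allele count per individual as the pooled burden. The function names and the toy genotypes are illustrative assumptions, not the authors' implementation.

```python
def trend_chi2(scores, status):
    """Trend chi-square computed as N * r^2 between scores and case status."""
    n = len(status)
    mean_s = sum(scores) / n
    mean_y = sum(status) / n
    cov = sum((s - mean_s) * (y - mean_y) for s, y in zip(scores, status))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_y = sum((y - mean_y) ** 2 for y in status)
    if var_s == 0 or var_y == 0:
        return 0.0  # a monomorphic variant or a single outcome class carries no signal
    return n * cov * cov / (var_s * var_y)


def max_single_variant_chi2(variants, status):
    """Largest single-variant trend statistic in the locus."""
    return max(trend_chi2(v, status) for v in variants)


def burden_chi2(variants, status):
    """Trend statistic on the pooled minor-allele count per individual."""
    pooled = [sum(column) for column in zip(*variants)]
    return trend_chi2(pooled, status)


# Toy locus: one risk variant (carried by cases) and one protective variant
# (carried by controls), each given as minor-allele counts per individual.
toy_status = [1, 1, 1, 1, 0, 0, 0, 0]
risk_variant = [1, 1, 0, 0, 0, 0, 0, 0]
protective_variant = [0, 0, 0, 0, 1, 1, 0, 0]
toy_locus = [risk_variant, protective_variant]

toy_burden = burden_chi2(toy_locus, toy_status)
toy_max = max_single_variant_chi2(toy_locus, toy_status)
```

Here the pooled count is uncorrelated with case status because the two variants cancel, while the maximum single-variant statistic stays positive. A locus-wide p-value for the maximum would still need a multiple-testing correction or a permutation procedure, which this sketch leaves out.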
Force and Compliance Measurements on Living Cells Using Atomic Force Microscopy (AFM)
We describe the use of atomic force microscopy (AFM) in studies of cell adhesion and cell compliance. Our studies use the interaction between leukocyte function associated antigen-1 (LFA-1) and intercellular adhesion molecule-1 (ICAM-1) as a model system. The forces required to unbind a single LFA-1/ICAM-1 bond were measured at different loading rates. These data were used to determine the dynamic strength of the LFA-1/ICAM-1 complex and to characterize the activation potential that this complex overcomes during its breakage. Force measurements acquired at the multiple-bond level provided insight into the mechanism of cell adhesion. In addition, the AFM was used as a microindenter to determine the mechanical properties of cells. The applications of these methods are described using data from a previous study.
A genetic ensemble approach for gene-gene interaction identification
Background: It has now become clear that gene-gene interactions and gene-environment interactions are ubiquitous and fundamental mechanisms for the development of complex diseases. Though considerable effort has been put into developing statistical models and algorithmic strategies for identifying such interactions, their accurate identification has proven to be very challenging. Methods: In this paper, we propose a new approach for identifying such gene-gene and gene-environment interactions underlying complex diseases. It is a hybrid algorithm that combines a genetic algorithm (GA) and an ensemble of classifiers (called a genetic ensemble). Using this approach, the original problem of SNP interaction identification is converted into a data mining problem of combinatorial feature selection. By collecting various single nucleotide polymorphism (SNP) subsets as well as environmental factors generated in multiple GA runs, patterns of gene-gene and gene-environment interactions can be extracted using a simple combinatorial ranking method. Also considered in this study is the idea of combining identification results obtained from multiple algorithms. A novel formula based on the pairwise double fault is designed to quantify the degree of complementarity. Conclusions: Our simulation study demonstrates that the proposed genetic ensemble algorithm has identification power comparable to that of Multifactor Dimensionality Reduction (MDR) and slightly better than that of Polymorphism Interaction Analysis (PIA), which are the two most popular methods for gene-gene interaction identification. More importantly, the identification results generated by our genetic ensemble algorithm are highly complementary to those obtained by PIA and MDR.
Experimental results from our simulation studies and real-world data application also confirm the effectiveness of the proposed genetic ensemble algorithm, as well as the potential benefits of combining identification results from different algorithms.
A random forest approach to the detection of epistatic interactions in case-control studies
Background: The key roles of epistatic interactions between multiple genetic variants in the pathogenesis of complex diseases notwithstanding, the detection of such interactions remains a great challenge in genome-wide association studies. Although some existing multi-locus approaches have been successful on small-scale case-control data, the "combinatorial explosion" curse prohibits their application to genome-wide analysis. It is therefore indispensable to develop new methods that are able to reduce the search space for epistatic interactions from an astronomical number of all possible combinations of genetic variants to a manageable set of candidates. Results: We studied case-control data from the viewpoint of binary classification. More precisely, we treated single nucleotide polymorphism (SNP) markers as categorical features and adopted the random forest to discriminate cases from controls. On the basis of the Gini importance given by the random forest, we designed a sliding window sequential forward feature selection (SWSFS) algorithm to select a small set of candidate SNPs that could minimize the classification error and then statistically tested up to three-way interactions of the candidates. We compared this approach with three existing methods on three simulated disease models and showed that our approach is comparable to, and sometimes more powerful than, the other methods. We applied our approach to a genome-wide case-control dataset for Age-related Macular Degeneration (AMD) and successfully identified two SNPs that were reported to be associated with this disease. Conclusion: Besides existing pure statistical approaches, we demonstrated the feasibility of incorporating machine learning methods into genome-wide case-control studies.
The Gini importance offers yet another measure for the associations between SNPs and complex diseases, thereby complementing existing statistical measures to facilitate the identification of epistatic interactions and the understanding of epistasis in the pathogenesis of complex diseases.
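The Gini importance referred to in this abstract is built from impurity decreases at individual tree splits. The sketch below computes that decrease for a single split of cases and controls on one categorical SNP. It is a simplified building block under stated assumptions: a multiway split on genotype rather than the binary splits many random forest implementations use, with no bootstrapping and no averaging over trees. The function names are illustrative and this is not the authors' SWSFS algorithm.

```python
from collections import defaultdict


def gini_impurity(labels):
    """Gini impurity of a list of binary labels (1 = case, 0 = control)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def gini_decrease(genotypes, labels):
    """Impurity decrease obtained by splitting the sample on one SNP genotype."""
    groups = defaultdict(list)
    for genotype, label in zip(genotypes, labels):
        groups[genotype].append(label)
    n = len(labels)
    # Impurity after the split is the size-weighted mean of the child impurities.
    after = sum(len(child) / n * gini_impurity(child) for child in groups.values())
    return gini_impurity(labels) - after


# A SNP that separates cases from controls versus one that does not.
labels = [0, 0, 1, 1]
informative = gini_decrease([0, 0, 2, 2], labels)
uninformative = gini_decrease([0, 1, 0, 1], labels)
```

A random forest accumulates such decreases for each feature across all nodes and trees, so SNPs that repeatedly produce large decreases receive a high importance.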
Dynamic force microscopy for imaging of viruses under physiological conditions
Dynamic force microscopy (DFM) allows imaging of the structure and assessment of the function of biological specimens in their physiological environment. In DFM, the cantilever is oscillated at a given frequency and touches the sample only at the end of its downward movement. Accordingly, the problem of lateral forces displacing or even destroying biomolecules is virtually nonexistent, as the contact time and friction forces are reduced. Here, we describe the use of DFM in studies of human rhinovirus serotype 2 (HRV2) weakly adhering to mica surfaces. The capsid of HRV2 was reproducibly imaged without any displacement of the virus. Release of the genomic RNA from the virions was initiated by exposure to low-pH buffer, and snapshots of the extrusion process were obtained. In the following, the technical details of previous DFM investigations of HRV2 are summarized.
Resolving the Role of Actomyosin Contractility in Cell Microrheology
Einstein's original description of Brownian motion established a direct relationship between thermally-excited random forces and the transport properties of a submicron particle in a viscous liquid. Recent work based on reconstituted actin filament networks suggests that nonthermal forces driven by the motor protein myosin II can induce large non-equilibrium fluctuations that dominate the motion of particles in cytoskeletal networks. Here, using high-resolution particle tracking, we find that thermal forces, not myosin-induced fluctuating forces, drive the motion of submicron particles embedded in the cytoskeleton of living cells. These results resolve the roles of myosin II and contractile actomyosin structures in the motion of nanoparticles lodged in the cytoplasm, reveal the biphasic mechanical architecture of adherent cells (stiff contractile stress fibers interdigitating in a network at the cell cortex and a soft actin meshwork in the body of the cell), validate the method of particle tracking-microrheology, and reconcile seemingly disparate atomic force microscopy (AFM) and particle-tracking microrheology measurements of living cells.
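The relationship this abstract attributes to Einstein can be sketched for the simplest case of a sphere in a purely viscous liquid, using the Stokes-Einstein relation D = k_B T / (6 pi eta a) and the mean squared displacement MSD = 2 d D t in d dimensions. The particle radius, viscosity and temperature below are illustrative assumptions, not values from the study, and the cytoplasm itself is viscoelastic rather than purely viscous.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant


def stokes_einstein_diffusion(radius_m, viscosity_pa_s, temperature_k=310.0):
    """Diffusion coefficient (m^2/s) of a sphere in a viscous liquid."""
    drag = 6.0 * math.pi * viscosity_pa_s * radius_m
    return BOLTZMANN_J_PER_K * temperature_k / drag


def mean_squared_displacement(diffusion_m2_s, lag_s, dims=2):
    """MSD (m^2) of free diffusion after a time lag, tracked in dims dimensions."""
    return 2.0 * dims * diffusion_m2_s * lag_s


# Assumed example: a 100 nm diameter bead in a water-like liquid at 37 degrees C.
example_d = stokes_einstein_diffusion(radius_m=50e-9, viscosity_pa_s=1e-3)
example_msd = mean_squared_displacement(example_d, lag_s=1.0)
```

For purely thermal motion in a viscous liquid the MSD grows linearly with the time lag, which is why the measured MSD of tracked particles can be used to infer local mechanical properties.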