84,879 research outputs found
Search for Risk Haplotype Segments with GWAS Data by Use of Finite Mixture Models
The region-based association analysis has been proposed to capture the
collective behavior of sets of variants by testing the association of each set instead of individual variants with the disease. Such an analysis typically
involves a list of unphased multiple-locus genotypes with
potentially sparse frequencies in cases and controls.
To tackle the problem of the sparse distribution, a two-stage approach was proposed in literature: In the first stage, haplotypes are computationally inferred from genotypes, followed by a haplotype co-classification. In the second stage, the association analysis is performed on the inferred haplotype groups. If a haplotype is unevenly distributed between the case and control samples, this
haplotype is labeled as a risk haplotype. Unfortunately, the in-silico reconstruction of haplotypes might produce a proportion of
false haplotypes which hamper the detection of rare but true
haplotypes. Here, to address the issue, we propose an alternative approach: In Stage 1, we cluster genotypes instead of inferred haplotypes and estimate the
risk genotypes based on a finite mixture model. In Stage 2, we infer risk haplotypes from risk genotypes inferred from the
previous stage.
To estimate the finite mixture model, we propose an EM algorithm with a novel data partition-based initialization.
The performance of the proposed procedure is assessed by
simulation studies and a real data analysis. Compared to the existing
multiple Z-test procedure, we find that the power of genome-wide association studies can be increased by using the proposed procedure
Localization of adaptive variants in human genomes using averaged one-dependence estimation.
Statistical methods for identifying adaptive mutations from population genetic data face several obstacles: assessing the significance of genomic outliers, integrating correlated measures of selection into one analytic framework, and distinguishing adaptive variants from hitchhiking neutral variants. Here, we introduce SWIF(r), a probabilistic method that detects selective sweeps by learning the distributions of multiple selection statistics under different evolutionary scenarios and calculating the posterior probability of a sweep at each genomic site. SWIF(r) is trained using simulations from a user-specified demographic model and explicitly models the joint distributions of selection statistics, thereby increasing its power to both identify regions undergoing sweeps and localize adaptive mutations. Using array and exome data from 45 ‡Khomani San hunter-gatherers of southern Africa, we identify an enrichment of adaptive signals in genes associated with metabolism and obesity. SWIF(r) provides a transparent probabilistic framework for localizing beneficial mutations that is extensible to a variety of evolutionary scenarios
Detection of regulator genes and eQTLs in gene networks
Genetic differences between individuals associated to quantitative phenotypic
traits, including disease states, are usually found in non-coding genomic
regions. These genetic variants are often also associated to differences in
expression levels of nearby genes (they are "expression quantitative trait
loci" or eQTLs for short) and presumably play a gene regulatory role, affecting
the status of molecular networks of interacting genes, proteins and
metabolites. Computational systems biology approaches to reconstruct causal
gene networks from large-scale omics data have therefore become essential to
understand the structure of networks controlled by eQTLs together with other
regulatory genes, and to generate detailed hypotheses about the molecular
mechanisms that lead from genotype to phenotype. Here we review the main
analytical methods and softwares to identify eQTLs and their associated genes,
to reconstruct co-expression networks and modules, to reconstruct causal
Bayesian gene and module networks, and to validate predicted networks in
silico.Comment: minor revision with typos corrected; review article; 24 pages, 2
figure
Bayesian Model Comparison in Genetic Association Analysis: Linear Mixed Modeling and SNP Set Testing
We consider the problems of hypothesis testing and model comparison under a
flexible Bayesian linear regression model whose formulation is closely
connected with the linear mixed effect model and the parametric models for SNP
set analysis in genetic association studies. We derive a class of analytic
approximate Bayes factors and illustrate their connections with a variety of
frequentist test statistics, including the Wald statistic and the variance
component score statistic. Taking advantage of Bayesian model averaging and
hierarchical modeling, we demonstrate some distinct advantages and
flexibilities in the approaches utilizing the derived Bayes factors in the
context of genetic association studies. We demonstrate our proposed methods
using real or simulated numerical examples in applications of single SNP
association testing, multi-locus fine-mapping and SNP set association testing
Statistical Modeling of Epistasis and Linkage Decay using Logic Regression
Logic regression has been recognized as a tool that can identify and model non-additive genetic interactions using Boolean logic groups. Logic regression, TASSEL-GLM and SAS-GLM were compared for analytical precision using a previously characterized model system to identify the best genetic model explaining epistatic interaction for vernalization-sensitivity in barley. A genetic model containing two molecular markers identified in vernalization response in barley was selected using logic regression while both TASSEL-GLM and SAS-GLM included spurious associations in their models. The results also suggest the logic regression can be used to identify dominant/recessive relationships between epistatic alleles through its use of conjugate operators
Recommended from our members
A survey of intrusion detection techniques in Cloud
Cloud computing provides scalable, virtualized on-demand services to the end users with greater flexibility and lesser infrastructural investment. These services are provided over the Internet using known networking protocols, standards and formats under the supervision of different managements. Existing bugs and vulnerabilities in underlying technologies and legacy protocols tend to open doors for intrusion. This paper, surveys different intrusions affecting availability, confidentiality and integrity of Cloud resources and services. It examines proposals incorporating Intrusion Detection Systems (IDS) in Cloud and discusses various types and techniques of IDS and Intrusion Prevention Systems (IPS), and recommends IDS/IPS positioning in Cloud architecture to achieve desired security in the next generation networks
- …