273 research outputs found

A Bayesian method for evaluating and discovering disease loci associations

Author: A Galvin
AB Moffa
B Kuschel
B Tycko
C Hoggart
D Heckerman
DF Easton
DJ Hunter
DR Velez
EM Reiman
GF Cooper
Gregory F. Cooper
H Shi
J Wakefield
J Wu
JD Storey
JD Storey
JD Storey
JS Barnholtz-Sloan
KD Coon
L Ding
LW Hahn
M McCarthy
M. Michael Barmada
MD Fallin
Michael J. Becich
N Bonifaci
N Risch
P Sebastiani
R Grose
RA Fisher
RA Fisher
RE Neapolitan
RE Neapolitan
RE Neapolitan
S Visweswaran
S Wacholder
Vladimir Brusic
X Jiang
X Jiang
X Jiang
X Liang
Xia Jiang
Y Benjamin
Y Zhang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2011

Background: A genome-wide association study (GWAS) typically involves examining representative SNPs in individuals from some population. A GWAS data set can concern a million SNPs and may soon concern billions. Researchers investigate the association of each SNP individually with a disease, and it is becoming increasingly commonplace to also analyze multi-SNP associations. Techniques for handling so many hypotheses include the Bonferroni correction and recently developed Bayesian methods. These methods can encounter problems. Most importantly, they are not applicable to a complex multi-locus hypothesis which has several competing hypotheses rather than only a null hypothesis. A method that computes the posterior probability of complex hypotheses is a pressing need.
Methodology/Findings: We introduce the Bayesian network posterior probability (BNPP) method which addresses the difficulties. The method represents the relationship between a disease and SNPs using a directed acyclic graph (DAG) model, and computes the likelihood of such models using a Bayesian network scoring criterion. The posterior probability of a hypothesis is computed based on the likelihoods of all competing hypotheses. The BNPP can be used not only to evaluate a hypothesis that has previously been discovered or suspected, but also to discover new disease loci associations. The results of experiments using simulated and real data sets are presented. Our results concerning simulated data sets indicate that the BNPP exhibits both better evaluation and discovery performance than does a p-value based method. For the real data sets, previous findings in the literature are confirmed and additional findings are reported.
Conclusions/Significance: We conclude that the BNPP resolves a pressing problem by providing a way to compute the posterior probability of complex multi-locus hypotheses. A researcher can use the BNPP to determine the expected utility of investigating a hypothesis further. Furthermore, we conclude that the BNPP is a promising method for discovering disease loci associations. © 2011 Jiang et al.
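
The abstract above states that the posterior probability of a hypothesis is computed from the likelihoods of all competing hypotheses. A minimal sketch of that normalization step follows, assuming the hypotheses are mutually exclusive and exhaustive. The function name `posterior_probabilities`, the prior values, and the example likelihoods are illustrative assumptions and are not taken from the paper; the BNPP method itself obtains its likelihoods from a Bayesian network scoring criterion, which is not reproduced here.

```python
import math


def posterior_probabilities(log_likelihoods, priors):
    """Return P(H_i | data) for each competing hypothesis H_i.

    log_likelihoods[i] is log P(data | H_i) and priors[i] is P(H_i).
    Both inputs are hypothetical; the hypotheses are assumed to be
    mutually exclusive and exhaustive.
    """
    if len(log_likelihoods) != len(priors):
        raise ValueError("need one prior per hypothesis")
    # Work in log space: log P(data | H_i) + log P(H_i).
    log_joint = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    # Subtract the maximum before exponentiating to avoid underflow.
    peak = max(log_joint)
    weights = [math.exp(v - peak) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


# Three competing hypotheses with made-up likelihoods and priors.
example = posterior_probabilities(
    [math.log(0.2), math.log(0.1), math.log(0.1)],
    [0.5, 0.25, 0.25],
)
```

Working in log space matters in practice because likelihoods over thousands of individuals are far too small to represent directly as floating-point numbers.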

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

D-Scholarship@Pitt

Learning predictive interactions using information gain and Bayesian network scoring

Author: Jao J
Jiang X
Neapolitan R
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2015

Background: The problems of correlation and classification are long-standing in the fields of statistics and machine learning, and techniques have been developed to address these problems. We are now in the era of high-dimensional data, which is data that can concern billions of variables. These data present new challenges. In particular, it is difficult to discover predictive variables when each variable has little marginal effect. An example concerns Genomewide Association Studies (GWAS) datasets, which involve millions of single nucleotide polymorphisms (SNPs), where some of the SNPs interact epistatically to affect disease status. Towards determining these interacting SNPs, researchers developed techniques that addressed this specific problem. However, the problem is more general, and so these techniques are applicable to other problems concerning interactions. A difficulty with many of these techniques is that they do not distinguish whether a learned interaction is actually an interaction or whether it involves several variables with strong marginal effects.
Methodology/Findings: We address this problem using information gain and Bayesian network scoring. First, we identify candidate interactions by determining whether together variables provide more information than they do separately. Then we use Bayesian network scoring to see if a candidate interaction really is a likely model. Our strategy is called MBS-IGain. Using 100 simulated datasets and a real GWAS Alzheimer's dataset, we investigated the performance of MBS-IGain.
Conclusions/Significance: When analyzing the simulated datasets, MBS-IGain substantially out-performed nine previous methods at locating interacting predictors, and at identifying interactions exactly. When analyzing the real Alzheimer's dataset, we obtained new results and results that substantiated previous findings. We conclude that MBS-IGain is highly effective at finding interactions in high-dimensional datasets. This result is significant because we have increasingly abundant high-dimensional data in many domains, and to learn causes and perform prediction/classification using these data, we often must first identify interactions.
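
The first step described in the abstract above, checking whether variables together provide more information than they do separately, can be illustrated with a generic mutual-information calculation. This is a sketch under stated assumptions: the helper names (`entropy`, `information_gain`, `interaction_gain`) and the toy data are hypothetical, and the exact quantity used by MBS-IGain may differ from this textbook formulation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(features, target):
    """I(features; target) = H(target) - H(target | features)."""
    n = len(target)
    groups = {}
    for f, t in zip(features, target):
        groups.setdefault(f, []).append(t)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(target) - conditional


def interaction_gain(x1, x2, target):
    """Information the pair provides beyond the two variables taken alone."""
    joint = list(zip(x1, x2))
    return (
        information_gain(joint, target)
        - information_gain(x1, target)
        - information_gain(x2, target)
    )


# Toy XOR-style data: neither variable is informative alone,
# but the pair determines the target exactly.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [0, 1, 1, 0]
xor_gain = interaction_gain(x1, x2, y)
```

A positive value flags a candidate interaction, while two variables with strong independent marginal effects would yield a value near zero or below, which is the distinction the abstract highlights.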

Crossref

Directory of Open Access Journals

PubMed Central

D-Scholarship@Pitt

FigShare

Learning genetic epistasis using Bayesian network scoring criteria

Author: A Heidema
A Herbert
AJ Brookes
B Han
BA Logsdon
BM Armes
CJ Verzilli
D Brinza
D Heckerman
D Thomas
DR Velez
E Castillo
E Perrier
E Segal
EM Reiman
FV Jensen
FV Jensen
GF Cooper
HJ Cordell
J Pearl
J Rissanen
J Suzuki
J Wu
JC Lambert
JH Moore
K Korb
KD Coon
LW Hahn
M Chickering
M Fishelson
M Fishelson
M Michael Barmada
M Spinola
MD Ritchie
N Friedman
N Friedman
N Friedman
N Friedman
N Friedman
P Sebastiani
P Spirtes
RE Neapolitan
RE Neapolitan
RE Neapolitan
RI Nagel
Richard E Neapolitan
RW Robinson
S Visweswaran
Shyam Visweswaran
T Silander
TT Wu
W Bateson
W Wongseree
X Jiang
X Wan
X Zhang
Xia Jiang
Y Meng
Y Meng
YM Cho
Publication venue: BioMed Central
Publication date: 01/03/2011

Background: Gene-gene epistatic interactions likely play an important role in the genetic basis of many common diseases. Recently, machine-learning and data mining methods have been developed for learning epistatic relationships from data. A well-known combinatorial method that has been successfully applied for detecting epistasis is Multifactor Dimensionality Reduction (MDR). Jiang et al. created a combinatorial epistasis learning method called BNMBL to learn Bayesian network (BN) epistatic models. They compared BNMBL to MDR using simulated data sets. Each of these data sets was generated from a model that associates two SNPs with a disease and includes 18 unrelated SNPs. For each data set, BNMBL and MDR were used to score all 2-SNP models, and BNMBL learned significantly more correct models. In real data sets, we ordinarily do not know the number of SNPs that influence phenotype. BNMBL may not perform as well if we also scored models containing more than two SNPs. Furthermore, a number of other BN scoring criteria have been developed. They may detect epistatic interactions even better than BNMBL. Although BNs are a promising tool for learning epistatic relationships from data, we cannot confidently use them in this domain until we determine which scoring criteria work best or even well when we try learning the correct model without knowledge of the number of SNPs in that model.
Results: We evaluated the performance of 22 BN scoring criteria using 28,000 simulated data sets and a real Alzheimer's GWAS data set. Our results were surprising in that the Bayesian scoring criterion with large values of a hyperparameter called α performed best. This score performed better than other BN scoring criteria and MDR at recall using simulated data sets, at detecting the hardest-to-detect models using simulated data sets, and at substantiating previous results using the real Alzheimer's data set.
Conclusions: We conclude that representing epistatic interactions using BN models and scoring them using a BN scoring criterion holds promise for identifying epistatic genetic variants in data. In particular, the Bayesian scoring criterion with large values of a hyperparameter α appears more promising than a number of alternatives.
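
To make the role of the hyperparameter α concrete, the sketch below computes a Bayesian Dirichlet score for a single disease node given its SNP parents. It assumes the widely used BDeu form, in which α is spread uniformly over the cells of the conditional probability table; the abstract does not spell out the formula, so this choice, the function name `bdeu_log_score`, and its parameters are assumptions for illustration only.

```python
from collections import Counter
from math import lgamma


def bdeu_log_score(parent_configs, child_states, q, r, alpha):
    """Log marginal likelihood of a child node given its parents (BDeu form).

    parent_configs[i] is the joint parent value for sample i (any hashable),
    child_states[i] is the child value for sample i,
    q is the number of possible parent configurations,
    r is the number of child states,
    alpha is the equivalent sample size hyperparameter.
    """
    n_j = Counter(parent_configs)
    n_jk = Counter(zip(parent_configs, child_states))
    a_j = alpha / q          # prior mass per parent configuration
    a_jk = alpha / (q * r)   # prior mass per (configuration, state) cell
    score = 0.0
    # Unobserved configurations and cells contribute exactly zero,
    # so only observed counts need to be visited.
    for count in n_j.values():
        score += lgamma(a_j) - lgamma(a_j + count)
    for count in n_jk.values():
        score += lgamma(a_jk + count) - lgamma(a_jk)
    return score


# Sanity check: no parents (q=1), binary child, alpha=2 gives a uniform
# Beta(1, 1) prior, under which the sequence (0, 1) has probability 1/6.
no_parent_score = bdeu_log_score([(), ()], [0, 1], q=1, r=2, alpha=2.0)
```

Candidate models with different SNP parent sets can be compared by this score on the same data; larger α corresponds to a stronger, smoother prior over the table entries.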

Crossref

Directory of Open Access Journals

PubMed Central

D-Scholarship@Pitt

Corrected score methods for estimating Bayesian networks with error-prone nodes

Author: Aragam B
Chickering DM
Cormen TH
Edwards D
Fuller WA
Grace YY
Hauser A
Jensen FV
Lauritzen SL
Neapolitan RE
Pearl J
Robinson RW
Spirtes P
Suzuki J
Tibshirani R
Zheng X
Publication date: 10/02/2020

Motivated by inferring cellular signaling networks using noisy flow cytometry data, we develop procedures to draw inference for Bayesian networks based on error-prone data. Two methods for inferring causal relationships between nodes in a network are proposed based on penalized estimation methods that account for measurement error and encourage sparsity. We discuss consistency of the proposed network estimators and develop an approach for selecting the tuning parameter in the penalized estimation methods. Empirical studies are carried out to compare the proposed methods and a naive method that ignores measurement error with applications to synthetic data and to single cell flow cytometry data

arXiv.org e-Print Archive

University of Memphis Digital Commons

Crossref

A spatio‑temporal model of homicide in El Salvador

Author: A Mollié
AB Lawson
D Archer
D Gamerman
DJ Lunn
DJ Spiegelhalter
J Besag
J Law
JL Neapolitan
JM Cruz
L Cohen
M Townsley
MA Andersen
PL Brantingham
RJ Bursik
W Rosa Alvarado
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 22/08/2015

This paper examines the spatio-temporal evolution of homicide across the municipalities of El Salvador. It aims to identify both temporal trends and spatial clusters that may contribute to the formation of time-stable corridors lying behind a historically recurrent high homicide rate. The results reveal significant clusters of high-homicide municipalities in the western part of the country that have remained stable over time, and a process of formation of high-homicide clusters in the eastern region. The results show an increasing homicide trend from 2002 to 2013 with significant municipality-specific differential trends across the country. The data suggest that links may exist between the dynamics of homicide rates, drug trafficking and organized crime.

Crossref

Repositorio Digital de la Ciencia y Cultura de El Salvador REDICCES

An evolutionary technique to approximate multiple optimal alignments

Author: A Adriansyah
B Dongen van
B Vázquez-Barreiros
D Reißner
D Ruppert
F Mannhardt
F Taymouri
F Taymouri
J Munoz-Gama
M Koorneef
M Leoni de
R Neapolitan
SB Needleman
SJJ Leemans
T Murata
WMP Aalst van der
WMP Aalst van der
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

The alignment of observed and modeled behavior is an essential aid for organizations, since it opens the door for root-cause analysis and enhancement of processes. The state-of-the-art technique for computing alignments has exponential time and space complexity, hindering its applicability for medium and large instances. Moreover, the fact that there may be multiple optimal alignments is perceived as a negative situation, while in reality it may provide a more comprehensive picture of the model’s explanation of observed behavior, from which other techniques may benefit. This paper presents a novel evolutionary technique for approximating multiple optimal alignments. Remarkably, the memory footprint of the proposed technique is bounded, representing an unprecedented guarantee with respect to the state-of-the-art methods for the same task. The technique is implemented into a tool, and experiments on several benchmarks are provided.

Crossref

UPCommons. Portal del coneixement obert de la UPC

P26. Metabolism of sialo-glyco-conjugates is defective in Huntington’s disease

Author: Amico Enrico
Annual Meeting of the Neapolitan Brain Group 8. <2018
Boltje Thomas J.
Castaldo Salvatore
Di Pardo Alba
Maglione Vittorio
Pepe Giuseppe
Publication date: 01/01/2019

EleA@UniSA - Università degli Studi di Salerno

Learned student models with item to item knowledge structures

Author: A. Kobsa
A. Mitrovic
B. Martin
C. Carmona
C. Chow
C. Conati
C.E. Dowling
E. Millán
F.V. Jensen
G.F. Cooper
J. Heller
J. Pearl
J. Vomlel
J.-C. Falmagne
J.-C. Falmagne
J.-D. Zapata-Rivera
J.-P. Doignon
K. Tatsuoka
K.P. Murphy
M. Kambouri
M.C. Desmarais
M.C. Desmarais
Michel C. Desmarais
Michel Gagnon
N. Friedman
P. Domingos
P. Spirtes
Peyman Meshkinfam
R.E. Neapolitan
V. Kodaganallur
W.-C. Wang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Dynamic reliability assessment of flare systems by combining fault tree analysis and Bayesian networks

Author: Abbassi R.
Berrouane M. T.
BP
Henley E.
IEA
Mohammed Taleb-Berrouane
Neapolitan R. E.
Pearl J.
Sinaki S.
Sohag Kabir
Taleb-Berrouane M.
Yiannis Papadopoulos
Publication venue: 'Informa UK Limited'
Publication date: 24/09/2019

Flaring is a combustion process commonly used in the oil and gas industry to dispose of flammable waste gases. Flare flameout occurs when these gases escape unburnt from the flare tip, causing the discharge of flammable and/or toxic vapor clouds. The toxic gases released during this process have the potential to initiate safety hazards and cause serious harm to the ecosystem and human health. Flare flameout could be caused by environmental conditions, equipment failure, and human error. However, to better understand the causes of flare flameout, a rigorous analysis of the behavior of flare systems under failure conditions is required. In this article, we used fault tree analysis (FTA) and the dynamic Bayesian network (DBN) to assess the reliability of flare systems. In this study, we analyzed 40 different combinations of basic events that can cause flare flameout to determine the event with the highest impact on system failure. In the quantitative analysis, we use both constant and time-dependent failure rates of system components. The results show that combining these two approaches allows for robust probabilistic reasoning on flare system reliability, which can help improve the safety and asset integrity of process facilities. The proposed DBN model constitutes a significant step to improve the safety and reliability of flare systems in the oil and gas industry.

Repository@Hull - Worktribe

Crossref

Bradford Scholars

Constraint solving in uncertain and dynamic environments - a survey

Author: A. Borning
A. Davenport
A. Mackworth
B. Faltings
B. Freeman-Benson
C. Boutilier
C. Lottaz
D. Fowler
E. Gelle
E. Hebrard
F. Fages
G. Verfaillie
Gérard Verfaillie
H. E. Sakkout
I. Miguel
J. Amilhastre
J. Doyle
J. Kleer de
J. Pearl
L. Bordeaux
M. Ginsberg
M. Littman
M. Puterman
M. Sannella
M. Yokoo
N. Jussien
N. Muscettola
Narendra Jussien
P. Berlandier
P. V. Hentenryck
R. Alami
R. Bryant
R. Debruyne
R. Debruyne
R. Dechter
R. Dechter
R. Neapolitan
R. Wallace
S. Bistarelli
S. Minton
T. Schiex
T. Vidal
T. Walsh
U. Montanari
W. Harvey
Y. Georget
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

This article follows a tutorial given by the authors on dynamic constraint solving at CP 2003 (Ninth International Conference on Principles and Practice of Constraint Programming) in Kinsale, Ireland. It aims at offering an overview of the main approaches and techniques that have been proposed in the domain of constraint satisfaction to deal with uncertain and dynamic environments.

Crossref

INRIA a CCSD electronic archive server

HAL Mines Nantes