Model selection and local geometry
We consider problems in model selection caused by the geometry of models
close to their points of intersection. In some cases---including common classes
of causal or graphical models, as well as time series models---distinct models
may nevertheless have identical tangent spaces. This has two immediate
consequences: first, in order to obtain constant power to reject one model in
favour of another we need local alternative hypotheses that decrease to the
null at a slower rate than the usual parametric $n^{-1/2}$ (typically we will
require $n^{-1/4}$ or slower); in other words, to distinguish between the
models we need large effect sizes or very large sample sizes. Second, we show
that under even weaker conditions on their tangent cones, models in these
classes cannot be made simultaneously convex by a reparameterization.
This shows that Bayesian network models, amongst others, cannot be learned
directly with a convex method similar to the graphical lasso. However, we are
able to use our results to suggest methods for model selection that learn the
tangent space directly, rather than the model itself. In particular, we give a
generic algorithm for learning Bayesian network models.
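The rate claim in this abstract can be motivated by a short heuristic, sketched here under the assumption that both models are smooth near a common point and share a tangent space there. The symbols $M_1$, $M_2$, $\theta_0$, $\theta_n$ and $\delta_n$ are introduced only for illustration and are not taken from the abstract.

```latex
% Assumption: M_1 and M_2 are smooth near \theta_0 and have the same tangent space at \theta_0.
% A point of M_1 at distance \delta from \theta_0 is then second-order close to M_2:
\[
  \theta \in M_1,\quad \|\theta - \theta_0\| = \delta
  \quad\Longrightarrow\quad
  \operatorname{dist}(\theta, M_2) = O(\delta^2).
\]
% A regular parametric test detects departures from M_2 of order n^{-1/2}, so a
% local alternative \theta_n \in M_1 with \|\theta_n - \theta_0\| = \delta_n needs
\[
  \delta_n^2 \gtrsim n^{-1/2}
  \quad\Longleftrightarrow\quad
  \delta_n \gtrsim n^{-1/4}.
\]
```

This is only an order-of-magnitude argument; it shows why an effect that would be detectable at the parametric rate in a regular problem has to be much larger here.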
Graphical methods for inequality constraints in marginalized DAGs
We present a graphical approach to deriving inequality constraints for
directed acyclic graph (DAG) models, where some variables are unobserved. In
particular we show that the observed distribution of a discrete model is always
restricted if any two observed variables are neither adjacent in the graph, nor
share a latent parent; this generalizes the well known instrumental inequality.
The method also provides inequalities on interventional distributions, which
can be used to bound causal effects. All these constraints are characterized in
terms of a new graphical separation criterion, providing an easy and intuitive
method for their derivation. Comment: A final version will appear in the proceedings of the 22nd Workshop
on Machine Learning and Signal Processing, 201
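For readers unfamiliar with the instrumental inequality mentioned in this abstract, the following is a minimal sketch of a check for its standard discrete form (due to Pearl), which requires that max over x of the sum over y of the max over z of P(X=x, Y=y | Z=z) is at most 1. It is not the graphical method of the paper. The function name and the array layout `p[z, x, y]` are assumptions made for this illustration.

```python
import numpy as np


def instrumental_inequality_holds(p_xy_given_z, tol=1e-12):
    """Check the discrete instrumental inequality.

    Assumed layout: p_xy_given_z[z, x, y] = P(X=x, Y=y | Z=z).
    The inequality is: max_x sum_y max_z P(x, y | z) <= 1.
    """
    p = np.asarray(p_xy_given_z, dtype=float)
    # Each slice p[z] must be a probability distribution over (x, y).
    if not np.allclose(p.sum(axis=(1, 2)), 1.0):
        raise ValueError("each p[z] must sum to 1")
    # Max over z, then sum over y, then max over x.
    worst = p.max(axis=0).sum(axis=1).max()
    return bool(worst <= 1.0 + tol)


# Hypothetical example 1: X copies Z and Y copies X (Z affects Y only via X).
compatible = np.zeros((2, 2, 2))
compatible[0, 0, 0] = 1.0
compatible[1, 1, 1] = 1.0

# Hypothetical example 2: X is always 0 but Y copies Z (a direct Z -> Y effect).
violating = np.zeros((2, 2, 2))
violating[0, 0, 0] = 1.0
violating[1, 0, 1] = 1.0

print(instrumental_inequality_holds(compatible))  # True
print(instrumental_inequality_holds(violating))   # False
```

A violation rules out the instrumental variable structure. Satisfying the inequality does not by itself confirm that structure.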
Graphs for margins of Bayesian networks
Directed acyclic graph (DAG) models, also called Bayesian networks, impose
conditional independence constraints on a multivariate probability
distribution, and are widely used in probabilistic reasoning, machine learning
and causal inference. If latent variables are included in such a model, then
the set of possible marginal distributions over the remaining (observed)
variables is generally complex, and not represented by any DAG. Larger classes
of mixed graphical models, which use multiple edge types, have been introduced
to overcome this; however, these classes do not represent all the models which
can arise as margins of DAGs. In this paper we show that this is because
ordinary mixed graphs are fundamentally insufficiently rich to capture the
variety of marginal models.
We introduce a new class of hyper-graphs, called mDAGs, and a latent
projection operation to obtain an mDAG from the margin of a DAG. We show that
each distinct marginal of a DAG model is represented by at least one mDAG, and
provide graphical results towards characterizing when two such marginal models
are the same. Finally we show that mDAGs correctly capture the marginal
structure of causally-interpreted DAGs under interventions on the observed
variables.
Margins of discrete Bayesian networks
Bayesian network models with latent variables are widely used in statistics
and machine learning. In this paper we provide a complete algebraic
characterization of Bayesian network models with latent variables when the
observed variables are discrete and no assumption is made about the state-space
of the latent variables. We show that it is algebraically equivalent to the
so-called nested Markov model, meaning that the two are the same up to
inequality constraints on the joint probabilities. In particular these two
models have the same dimension. The nested Markov model is therefore the best
possible description of the latent variable model that avoids consideration of
inequalities, which are extremely complicated in general. A consequence of this
is that the constraint finding algorithm of Tian and Pearl (UAI 2002,
pp519-527) is complete for finding equality constraints.
Latent variable models suffer from difficulties of unidentifiable parameters
and non-regular asymptotics; in contrast the nested Markov model is fully
identifiable, represents a curved exponential family of known dimension, and
can easily be fitted using an explicit parameterization. Comment: 41 pages
Causal Inference through a Witness Protection Program
One of the most fundamental problems in causal inference is the estimation of
a causal effect when variables are confounded. This is difficult in an
observational study, because one has no direct evidence that all confounders
have been adjusted for. We introduce a novel approach for estimating causal
effects that exploits observational conditional independencies to suggest
"weak" paths in an unknown causal graph. The widely used faithfulness condition
of Spirtes et al. is relaxed to allow for varying degrees of "path
cancellations" that imply conditional independencies but do not rule out the
existence of confounding causal paths. The outcome is a posterior distribution
over bounds on the average causal effect via a linear programming approach and
Bayesian inference. We claim this approach should be used in regular practice
along with other default tools in observational studies. Comment: 41 pages, 7 figures
One-Component Regular Variation and Graphical Modeling of Extremes
The problem of inferring the distribution of a random vector given that its
norm is large requires modeling a homogeneous limiting density. We suggest an
approach based on graphical models which is suitable for high-dimensional
vectors.
We introduce the notion of one-component regular variation to describe a
function that is regularly varying in its first component. We extend the
representation and Karamata's theorem to one-component regularly varying
functions, probability distributions and densities, and explain why these
results are fundamental in multivariate extreme-value theory. We then
generalize the Hammersley-Clifford theorem to relate asymptotic conditional
independence to a factorization of the limiting density, and use it to model
multivariate tails.
Asthma management: an ecosocial framework for disparity research
Background: Asthma management disparities (AMD) between African American and White Americans are significant and alarming. Research frameworks have suggested various determinants that affect the unfair distribution of asthma management resources between socially more and less advantaged groups. Ecosocial models organize determinants into individual/family, healthcare, community, and sociocultural levels. Multilevel interventions can affect AMD through simultaneous actions on different levels and on the pathways between determinants.
Objective: Provide a comprehensive summary of the known determinants of AMD.
Method: Peer-reviewed research frameworks of AMD from 1998-2009 were retrieved from the PubMed/Web of Science databases using (“Socioeconomic Factors”[Mesh] OR (“Healthcare Disparities”[Mesh] OR “Health Status Disparities”[Mesh])) AND “Asthma”[Mesh] AND (“African Americans”[Mesh] OR “Ethnic Groups”[Mesh]). Abstracts were assessed for a focus on AMD and its determinants. Articles were analyzed for ecosocial levels and determinants.
Results: 13 research frameworks described 34 determinants. Compared to other levels, the individual/family level received the most emphasis, and frameworks using healthcare and community levels were the narrowest in focus. Stress, poverty, violence/crime, quality of care, healthcare access, and indoor air quality were well-described determinants.
Conclusions: Multilevel investigations should include these well-described determinants of AMD and increase knowledge of pathway interactions between healthcare and community levels.
Predicting and controlling the dynamics of infectious diseases
This paper introduces a new optimal control model to describe and control the
dynamics of infectious diseases. In the present model, the average time of
isolation (i.e. hospitalization) of the infectious population is the main
time-dependent parameter that defines the spread of infection. All the
preventive measures aim to decrease the average time of isolation under given
constraints.
Dynamics of Ebola epidemics in West Africa 2014
This paper investigates the dynamics of Ebola virus transmission in West
Africa during 2014. The reproduction numbers for the total period of the epidemic
and for different consecutive time intervals are estimated based on a newly
suggested linear model. It contains one major variable - the average time of
infectiousness (time from onset to hospitalization) that is considered as a
parameter for controlling the future dynamics of epidemics.
Numerical implementations are carried out on data collected from three
countries, Guinea, Sierra Leone and Liberia, as well as the total data collected
worldwide. Predictions are provided by considering different scenarios
involving the average times of infectiousness for the next few months, and the
end of the current epidemic is estimated according to each scenario.