Foundational principles for large scale inference: Illustrations through correlation mining
When can reliable inference be drawn in the "Big Data" context? This paper
presents a framework for answering this fundamental question in the context of
correlation mining, with implications for general large scale inference. In
large scale data applications like genomics, connectomics, and eco-informatics
the dataset is often variable-rich but sample-starved: a regime where the
number of acquired samples (statistical replicates) is far fewer than the
number of observed variables (genes, neurons, voxels, or chemical
constituents). Much recent work has focused on understanding the
computational complexity of proposed methods for "Big Data." Sample complexity,
however, has received relatively less attention, especially in the setting where
the sample size is fixed and the dimension grows without bound. To
address this gap, we develop a unified statistical framework that explicitly
quantifies the sample complexity of various inferential tasks. Sampling regimes
can be divided into several categories: 1) the classical asymptotic regime
where the variable dimension is fixed and the sample size goes to infinity; 2)
the mixed asymptotic regime where both variable dimension and sample size go to
infinity at comparable rates; 3) the purely high dimensional asymptotic regime
where the variable dimension goes to infinity and the sample size is fixed.
Each regime has its niche, but only the last regime applies to exa-scale data
dimension. We illustrate this high dimensional framework for the problem of
correlation mining, where it is the matrix of pairwise and partial correlations
among the variables that is of interest. We demonstrate various regimes of
correlation mining based on the unifying perspective of high dimensional
learning rates and sample complexity for different structured covariance models
and different inference tasks.
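The sample-starved regime described in the abstract above can be illustrated with a small simulation. The sketch below is not taken from the paper: the function names, the Gaussian data model, and the choice of 5 samples are assumptions made for illustration only. It draws independent variables, so every true correlation is zero, and reports the largest absolute sample correlation as the number of variables grows while the sample size stays fixed.

```python
import math
import random


def sample_correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_spurious_correlation(columns):
    """Largest |sample correlation| over all pairs of variables."""
    best = 0.0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            best = max(best, abs(sample_correlation(columns[i], columns[j])))
    return best


def simulate(n_samples=5, dims=(5, 50, 200), seed=0):
    """Fixed sample size, growing dimension (hypothetical settings).

    All variables are independent standard normals, so any large
    sample correlation found here is spurious.
    """
    rng = random.Random(seed)
    # Draw the largest data set once; smaller dimensions reuse its
    # first p variables, so the reported maximum cannot decrease with p.
    columns = [
        [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        for _ in range(max(dims))
    ]
    return {p: max_spurious_correlation(columns[:p]) for p in dims}


if __name__ == "__main__":
    for p, r in simulate().items():
        print(f"p={p:4d}  max |r| = {r:.3f}")
```

With only a handful of samples, the maximum spurious correlation approaches 1 as the dimension grows, which is one way to see why sample complexity matters when the sample size cannot be increased.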
Sparse Median Graphs Estimation in a High Dimensional Semiparametric Model
In this manuscript a unified framework for conducting inference on complex
aggregated data in high dimensional settings is proposed. The data are assumed
to be a collection of multiple non-Gaussian realizations with underlying
undirected graphical structures. Utilizing the concept of median graphs in
summarizing the commonality across these graphical structures, a novel
semiparametric approach to modeling such complex aggregated data is provided
along with robust estimation of the median graph, which is assumed to be
sparse. The estimator is proved to be consistent in graph recovery and an upper
bound on the rate of convergence is given. Experiments on both synthetic and
real datasets are conducted to illustrate the empirical usefulness of the
proposed models and methods.
Blending Learning and Inference in Structured Prediction
In this paper we derive an efficient algorithm to learn the parameters of
structured predictors in general graphical models. This algorithm blends the
learning and inference tasks, which results in a significant speedup over
traditional approaches, such as conditional random fields and structured
support vector machines. For this purpose we utilize the structures of the
predictors to describe a low dimensional structured prediction task which
encourages local consistencies within the different structures while learning
the parameters of the model. Convexity of the learning task provides the means
to enforce the consistencies between the different parts. The
inference-learning blending algorithm that we propose is guaranteed to converge
to the optimum of the low dimensional primal and dual programs. Unlike many of
the existing approaches, the inference-learning blending allows us to
efficiently learn high-order graphical models over regions of any size and with
a very large number of parameters. We demonstrate the effectiveness of our
approach, presenting state-of-the-art results in stereo estimation, semantic
segmentation, shape reconstruction, and indoor scene understanding.
Faithfulness and learning hypergraphs from discrete distributions
The concepts of faithfulness and strong-faithfulness are important for
statistical learning of graphical models. Graphs are not sufficient for
describing the association structure of a discrete distribution. Hypergraphs
representing hierarchical log-linear models are considered instead, and the
concept of parametric (strong-) faithfulness with respect to a hypergraph is
introduced. Strong-faithfulness ensures the existence of uniformly consistent
parameter estimators and enables building uniformly consistent procedures for a
hypergraph search. The strength of association in a discrete distribution can
be quantified with various measures, leading to different concepts of
strong-faithfulness. Lower and upper bounds for the proportions of
distributions that do not satisfy strong-faithfulness are computed for
different parameterizations and measures of association. Comment: 23 pages, 6 figures
Analysis and design of multiagent systems using MAS-CommonKADS
This article proposes an agent-oriented methodology called MAS-CommonKADS and develops a case study. This methodology extends the knowledge engineering methodology CommonKADS with techniques from object-oriented and protocol engineering methodologies. The methodology consists of the development of seven models: Agent Model, that describes the characteristics of each agent; Task Model, that describes the tasks that the agents carry out; Expertise Model, that describes the knowledge needed by the agents to achieve their goals; Organisation Model, that describes the structural relationships between agents (software agents and/or human agents); Coordination Model, that describes the dynamic relationships between software agents; Communication Model, that describes the dynamic relationships between human agents and their respective personal assistant software agents; and Design Model, that refines the previous models and determines the most suitable agent architecture for each agent, and the requirements of the agent network.
mgm: Estimating Time-Varying Mixed Graphical Models in High-Dimensional Data
We present the R-package mgm for the estimation of k-order Mixed Graphical
Models (MGMs) and mixed Vector Autoregressive (mVAR) models in high-dimensional
data. These are useful extensions of graphical models for only one variable
type, since data sets consisting of mixed types of variables (continuous,
count, categorical) are ubiquitous. In addition, we allow the stationarity
assumption of both models to be relaxed by introducing time-varying versions
of MGMs and mVAR models based on a kernel weighting approach. Time-varying models
offer a rich description of temporally evolving systems and make it possible to
identify external influences on the model structure such as the impact of interventions.
We provide the background of all implemented methods and provide fully
reproducible examples that illustrate how to use the package.
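The kernel weighting idea mentioned in the abstract above can be sketched generically. The code below is written in Python and does not reproduce the interface of the mgm R package; the function names, the Gaussian kernel, and the bandwidth value are assumptions chosen for illustration. Observations close in time to an estimation point receive weights near 1, distant ones receive weights near 0, and a local estimate is then computed from the weighted data. A weighted mean stands in here for the model fit.

```python
import math


def gaussian_kernel_weights(times, t_est, bandwidth):
    """Weights for a local estimate at time t_est.

    times are assumed to be normalized to [0, 1]; the weights are
    rescaled so that the largest weight equals 1.
    """
    raw = [math.exp(-0.5 * ((t - t_est) / bandwidth) ** 2) for t in times]
    top = max(raw)
    return [w / top for w in raw]


def weighted_mean(values, weights):
    """Kernel-weighted mean, a stand-in for a weighted model fit."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def local_means(values, est_points, bandwidth=0.05):
    """Local estimates of a time-varying mean at several time points."""
    n = len(values)
    times = [i / (n - 1) for i in range(n)]
    return [
        weighted_mean(values, gaussian_kernel_weights(times, t, bandwidth))
        for t in est_points
    ]


if __name__ == "__main__":
    # Toy series whose mean jumps from 0 to 1 halfway through,
    # mimicking an intervention that changes the system.
    series = [0.0] * 50 + [1.0] * 50
    for t, m in zip((0.1, 0.5, 0.9), local_means(series, (0.1, 0.5, 0.9))):
        print(f"t={t:.1f}  local mean = {m:.3f}")
```

A smaller bandwidth tracks changes more sharply but uses fewer effective observations per estimate, so the choice trades sensitivity against stability.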
Collaborative Verification-Driven Engineering of Hybrid Systems
Hybrid systems with both discrete and continuous dynamics are an important
model for real-world cyber-physical systems. The key challenge is to ensure
their correct functioning w.r.t. safety requirements. Promising techniques to
ensure safety seem to be model-driven engineering to develop hybrid systems in
a well-defined and traceable manner, and formal verification to prove their
correctness. Their combination forms the vision of verification-driven
engineering. Often, hybrid systems are rather complex in that they require
expertise from many domains (e.g., robotics, control systems, computer science,
software engineering, and mechanical engineering). Moreover, despite the
remarkable progress in automating formal verification of hybrid systems, the
construction of proofs of complex systems often requires nontrivial human
guidance, since hybrid systems verification tools solve undecidable problems.
It is, thus, not uncommon for development and verification teams to consist of
many players with diverse expertise. This paper introduces a
verification-driven engineering toolset that extends our previous work on
hybrid and arithmetic verification with tools for (i) graphical (UML) and
textual modeling of hybrid systems, (ii) exchanging and comparing models and
proofs, and (iii) managing verification tasks. This toolset makes it easier to
tackle large-scale verification tasks.
A toolkit of mechanism and context independent widgets
Most human-computer interfaces are designed to run on a static platform (e.g. a workstation with a monitor) in a static environment (e.g. an office). However, with mobile devices becoming ubiquitous and capable of running applications similar to those found on static devices, it is no longer valid to design static interfaces. This paper describes a user-interface architecture which allows interactors to be flexible about the way they are presented. This flexibility is defined by the different input and output mechanisms used. An interactor may use different mechanisms depending upon their suitability in the current context, user preference and the resources available for presentation using that mechanism.