Causal inference using the algorithmic Markov condition
Inferring the causal structure that links n observables is usually based upon
detecting statistical dependences and choosing simple graphs that make the
joint measure Markovian. Here we argue why causal inference is also possible
when only single observations are present.
We develop a theory of how to generate causal graphs explaining similarities
between single objects. To this end, we replace the notion of conditional
stochastic independence in the causal Markov condition with the vanishing of
conditional algorithmic mutual information and describe the corresponding
causal inference rules.
We explain why a consistent reformulation of causal inference in terms of
algorithmic complexity implies a new inference principle that also takes into
account the complexity of conditional probability densities, making it
possible to select among Markov equivalent causal graphs. This insight provides
a theoretical foundation of a heuristic principle proposed in earlier work.
We also discuss how to replace Kolmogorov complexity with decidable
complexity criteria. This can be seen as an algorithmic analog of replacing the
empirically undecidable question of statistical independence with practical
independence tests that are based on implicit or explicit assumptions on the
underlying distribution.
Comment: 16 figures
Entrograms and coarse graining of dynamics on complex networks
Using an information theoretic point of view, we investigate how a dynamics
acting on a network can be coarse grained through the use of graph partitions.
Specifically, we are interested in how aggregating the state space of a Markov
process according to a partition impacts on the thus obtained lower-dimensional
dynamics. We highlight that for a dynamics on a particular graph there may be
multiple coarse grained descriptions that capture different, incomparable
features of the original process. For instance, a coarse graining induced by
one partition may be commensurate with a time-scale separation in the dynamics,
while another coarse graining may correspond to a different lower-dimensional
dynamics that preserves the Markov property of the original process. Taking
inspiration from the literature of Computational Mechanics, we find that a
convenient tool to summarise and visualise such dynamical properties of a
coarse grained model (partition) is the entrogram. The entrogram gathers
certain information-theoretic measures, which quantify how information flows
across time steps. These information theoretic quantities include the entropy
rate, as well as a measure for the memory contained in the process, i.e., how
well the dynamics can be approximated by a first order Markov process. We use
the entrogram to investigate how specific macro-scale connection patterns in
the state-space transition graph of the original dynamics result in desirable
properties of coarse grained descriptions. We thereby provide a fresh
perspective on the interplay between structure and dynamics in networks, and
the process of partitioning from an information theoretic perspective. We focus
on networks that may be approximated by both a core-periphery and a clustered
organization, and highlight that each of these coarse grained descriptions can
capture different aspects of a Markov process acting on the network.
Comment: 17 pages, 6 figures
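To make two of the quantities in this abstract concrete, here is a minimal sketch in plain Python. It is not the authors' entrogram code. It computes the entropy rate of a Markov chain from its transition matrix and checks whether a partition is strongly lumpable, i.e. whether aggregating the states block by block yields a process that is again Markov. The function names, the power-iteration solver, and the example matrix are illustrative assumptions.

```python
import math

def stationary_distribution(P, iters=10_000, tol=1e-12):
    """Approximate the stationary distribution by power iteration.

    Assumes P is a row-stochastic matrix (list of lists) of an
    irreducible, aperiodic chain, so the iteration converges.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    return pi

def entropy_rate(P):
    """Entropy rate in bits: -sum_i pi_i sum_j P_ij log2 P_ij."""
    pi = stationary_distribution(P)
    return -sum(
        pi[i] * p * math.log2(p)
        for i, row in enumerate(P)
        for p in row
        if p > 0
    )

def is_lumpable(P, blocks, tol=1e-9):
    """Strong lumpability test: within each block, every state must send
    the same total probability mass to each block of the partition."""
    for block in blocks:
        for target in blocks:
            masses = [sum(P[i][j] for j in target) for i in block]
            if max(masses) - min(masses) > tol:
                return False
    return True

# Hypothetical 4-state chain with two tightly coupled pairs of states.
P = [
    [0.1, 0.8, 0.1, 0.0],
    [0.8, 0.1, 0.0, 0.1],
    [0.1, 0.0, 0.1, 0.8],
    [0.0, 0.1, 0.8, 0.1],
]
print(entropy_rate(P))
print(is_lumpable(P, [[0, 1], [2, 3]]))   # clustered partition
print(is_lumpable(P, [[0], [1, 2, 3]]))   # unbalanced partition
```

In this toy chain the clustered partition keeps the Markov property while the unbalanced one does not, which is the kind of difference between coarse grainings that the abstract describes.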
Differentially Private Empirical Risk Minimization
Privacy-preserving machine learning algorithms are crucial for the
increasingly common setting in which personal data, such as medical or
financial records, are analyzed. We provide general techniques to produce
privacy-preserving approximations of classifiers learned via (regularized)
empirical risk minimization (ERM). These algorithms are private under the
ε-differential privacy definition due to Dwork et al. (2006). First we
apply the output perturbation ideas of Dwork et al. (2006) to ERM
classification. Then we propose a new method, objective perturbation, for
privacy-preserving machine learning algorithm design. This method entails
perturbing the objective function before optimizing over classifiers. If the
loss and regularizer satisfy certain convexity and differentiability criteria,
we prove theoretical results showing that our algorithms preserve privacy, and
provide generalization bounds for linear and nonlinear kernels. We further
present a privacy-preserving technique for tuning the parameters in general
machine learning algorithms, thereby providing end-to-end privacy guarantees
for the training process. We apply these results to produce privacy-preserving
analogues of regularized logistic regression and support vector machines. We
obtain encouraging results from evaluating their performance on real
demographic and benchmark data sets. Our results show that both theoretically
and empirically, objective perturbation is superior to the previous
state-of-the-art, output perturbation, in managing the inherent tradeoff
between privacy and learning performance.
Comment: 40 pages, 7 figures, accepted to the Journal of Machine Learning
Research
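The output perturbation idea mentioned in this abstract can be sketched as follows: train the regularized classifier as usual, then add random noise to the learned weight vector before releasing it. The sketch below is an illustration, not the paper's code. It assumes a convex 1-Lipschitz loss, feature vectors with norm at most 1, and an L2 regularizer with weight `reg_lambda`, under which the L2-sensitivity of the minimizer is 2/(nΛ). The names `sample_noise` and `output_perturbation` are hypothetical.

```python
import math
import random

def sample_noise(dim, beta, rng):
    """Draw b in R^dim with density proportional to exp(-beta * ||b||_2).

    The norm of such a vector is Gamma(shape=dim, scale=1/beta) and its
    direction is uniform on the unit sphere.
    """
    norm = rng.gammavariate(dim, 1.0 / beta)
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    length = math.sqrt(sum(x * x for x in direction))
    return [norm * x / length for x in direction]

def output_perturbation(weights, n_samples, reg_lambda, epsilon, rng=None):
    """Return a noisy copy of already-trained ERM weights.

    Assumption: sensitivity 2 / (n_samples * reg_lambda), which gives the
    noise rate beta = n_samples * reg_lambda * epsilon / 2.
    """
    rng = rng or random.Random()
    beta = n_samples * reg_lambda * epsilon / 2.0
    noise = sample_noise(len(weights), beta, rng)
    return [w + b for w, b in zip(weights, noise)]

# Hypothetical trained weights for a 3-feature linear classifier.
rng = random.Random(0)
private_weights = output_perturbation(
    [0.5, -1.0, 2.0], n_samples=1000, reg_lambda=0.01, epsilon=1.0, rng=rng
)
print(private_weights)
```

Objective perturbation, the method the abstract proposes, instead adds a random linear term to the training objective before optimizing; it is not shown here.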
A General Framework for Sensor Placement in Source Localization
When an epidemic spreads in a given network of individuals or communities, can we detect its source using only the information provided by a small set of nodes? We propose a general framework that incorporates two dimensions. First, we can either rely exclusively on a set of selected nodes (i.e., sensors) which always reveal their state independently of any particular epidemic (these are called static), or we can add some sensors (called dynamic) as an epidemic spreads, depending on which additional information is required. Second, the method can either localize the source after an epidemic has spread through the entire network (offline), or while the epidemic is ongoing (online). We empirically study the performance of offline and online localization both with and without dynamic sensors. Our analysis shows that, by using dynamic sensors, the number of sensors necessary to localize the source is reduced by up to a factor of 10 and that, even with high-variance transmission delays, the source can be localized by using fewer than 5% of the nodes as sensors.
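As a toy illustration of why extra sensors help, the sketch below lists the nodes that are consistent with the sensors' infection times. It is a deliberate simplification and not the framework from the paper: it assumes an unweighted graph, a deterministic delay of one time unit per edge, and an unknown start time, whereas the abstract considers random (even high-variance) delays. All names are hypothetical.

```python
from collections import deque

def bfs_distances(adj, start):
    """Hop distances from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def candidate_sources(adj, infection_times):
    """Nodes s for which (infection time - distance from s) is the same
    for every sensor, i.e. a single start time explains all observations."""
    candidates = []
    for s in adj:
        dist = bfs_distances(adj, s)
        if any(v not in dist for v in infection_times):
            continue  # s cannot reach some infected sensor
        offsets = {t - dist[v] for v, t in infection_times.items()}
        if len(offsets) == 1:
            candidates.append(s)
    return candidates

# Path graph 0-1-2-3-4; the (hidden) source is node 1.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
one_sensor = candidate_sources(adj, {0: 1})
two_sensors = candidate_sources(adj, {0: 1, 4: 3})
print(one_sensor)
print(two_sensors)
```

With a single sensor every node remains a candidate because the start time is unknown; adding a second sensor, as a dynamic placement might, narrows the candidates to the true source in this example.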
Deep Graph Matching via Blackbox Differentiation of Combinatorial Solvers
Building on recent progress at the intersection of combinatorial optimization
and deep learning, we propose an end-to-end trainable architecture for deep
graph matching that contains unmodified combinatorial solvers. Using the
presence of heavily optimized combinatorial solvers together with some
improvements in architecture design, we advance state-of-the-art on deep graph
matching benchmarks for keypoint correspondence. In addition, we highlight the
conceptual advantages of incorporating solvers into deep learning
architectures, such as the possibility of post-processing with a strong
multi-graph matching solver or the indifference to changes in the training
setting. Finally, we propose two new challenging experimental setups. The code
is available at https://github.com/martius-lab/blackbox-deep-graph-matching
Comment: ECCV 2020 conference paper
Complexity Theory, Game Theory, and Economics: The Barbados Lectures
This document collects the lecture notes from my mini-course "Complexity
Theory, Game Theory, and Economics," taught at the Bellairs Research Institute
of McGill University, Holetown, Barbados, February 19--23, 2017, as the 29th
McGill Invitational Workshop on Computational Complexity.
The goal of this mini-course is twofold: (i) to explain how complexity theory
has helped illuminate several barriers in economics and game theory; and (ii)
to illustrate how game-theoretic questions have led to new and interesting
complexity theory, including several recent breakthroughs. It consists of two
five-lecture sequences: the Solar Lectures, focusing on the communication and
computational complexity of computing equilibria; and the Lunar Lectures,
focusing on applications of complexity theory in game theory and economics. No
background in game theory is assumed.
Comment: Revised v2 from December 2019 corrects some errors in and adds some
recent citations to v1. Revised v3 corrects a few typos in v
Learning how to act: making good decisions with machine learning
This thesis is about machine learning and statistical approaches
to decision making. How can we learn from data to anticipate the
consequence of, and optimally select, interventions or actions?
Problems such as deciding which medication to prescribe to
patients, who should be released on bail, and how much to charge
for insurance are ubiquitous, and have far reaching impacts on
our lives. There are two fundamental approaches to learning how
to act: reinforcement learning, in which an agent directly
intervenes in a system and learns from the outcome, and
observational causal inference, whereby we seek to infer the
outcome of an intervention from observing the system.
The goal of this thesis is to connect and unify these key
approaches. I introduce causal bandit problems: a synthesis that
combines causal graphical models, which were developed for
observational causal inference, with multi-armed bandit problems,
which are a subset of reinforcement learning problems that are
simple enough to admit formal analysis. I show that knowledge of
the causal structure allows us to transfer information learned
about the outcome of one action to predict the outcome of an
alternate action, yielding a novel form of structure between
bandit arms that cannot be exploited by existing algorithms. I
propose an algorithm for causal bandit problems and prove bounds
on the simple regret demonstrating it is close to minimax
optimal and better than algorithms that do not use the additional
causal information.
Efficient algorithms for analyzing large scale network dynamics: Centrality, community and predictability
Large scale networks are an indispensable part of our daily life; be they biological networks, smart grids, academic collaboration networks, social networks, vehicular networks, or the networks that are part of various smart environments, they are fast becoming ubiquitous. The successful realization of applications and services over them depends on efficient solutions to their computational challenges, which are compounded by network dynamics. The core challenges underlying large scale networks, for example determining central (influential) nodes (and edges) and the interactions and contacts among nodes, are the basis for the success of applications and services. Though at first glance these challenges seem trivial, the network characteristics affect how effectively and efficiently they can be evaluated. In this dissertation we thus propose to leverage the structural characteristics and temporal dynamics of large scale networks in addressing these core conceptual challenges.
We propose a computationally efficient divide-and-conquer algorithm that leverages the underlying network community structure for deterministic computation of betweenness centrality indices for all nodes. As an integral part of it, we also propose a computationally efficient agglomerative hierarchical community detection algorithm. Next, we propose a novel probabilistic link prediction algorithm, based on network structure evolution, that predicts the set of links occurring over subsequent time periods with higher accuracy. To best capture the evolution process and achieve higher prediction accuracy, we propose using multiple time scales with the Markov prediction model. Finally, we propose to capture the multi-periodicity of human mobility patterns with the sinusoidal intensity function of a cascaded nonhomogeneous Poisson process, in order to predict future contacts over mobile networks. We use real data sets and benchmark approaches to validate the better performance of our proposed approaches. --Abstract, page iii
Back to the Source: an Online Approach for Sensor Placement and Source Localization
Source localization, the act of finding the originator of a disease or rumor in a network, has become an important problem in sociology and epidemiology. The localization is done using the infection state and time of infection of a few designated sensor nodes; however, maintaining sensors can be very costly in practice. We propose the first online approach to source localization: we deploy a priori only a small number of sensors (which reveal if they are reached by an infection) and then iteratively choose the best location to place new sensors in order to localize the source. This approach allows for source localization with a very small number of sensors; moreover, the source can be found while the epidemic is still ongoing. Our method applies to a general network topology and performs well even with random transmission delays.
Metrical Service Systems with Transformations
We consider a generalization of the fundamental online metrical service systems (MSS) problem where the feasible region can be transformed between requests. In this problem, which we call T-MSS, an algorithm maintains a point in a metric space and has to serve a sequence of requests. Each request is a map (transformation) f : A → B between subsets A and B of the metric space. To serve it, the algorithm has to go to a point a ∈ A, paying the distance from its previous position. Then, the transformation is applied, modifying the algorithm’s state to f(a). Such transformations can model, e.g., changes to the environment that are outside of an algorithm’s control, and we therefore do not charge any additional cost to the algorithm when the transformation is applied. The transformations also make it possible to model requests occurring in the k-taxi problem.
We show that for α-Lipschitz transformations, the competitive ratio is Θ(α)^(n−2) on n-point metrics. Here, the upper bound is achieved by a deterministic algorithm and the lower bound holds even for randomized algorithms. For the k-taxi problem, we prove a competitive ratio of Õ((n log k)^2). For chasing convex bodies, we show that even with contracting transformations no competitive algorithm exists.
The problem T-MSS has a striking connection to the following deep mathematical question: Given a finite metric space M, what is the required cardinality of an extension M̂ ⊇ M where each partial isometry on M extends to an automorphism? We give partial answers for special cases
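The cost model of T-MSS described above can be illustrated with a short simulation. This sketch uses a naive greedy rule (move to the nearest admissible point) purely to show how costs are charged; it is not the algorithm analyzed in the paper and carries no competitive guarantee. Representing each request as a Python dict from points of its domain to their images, and the name `serve_greedily`, are assumptions made for illustration.

```python
def serve_greedily(start, requests, dist):
    """Serve a T-MSS request sequence with a nearest-point rule.

    Each request is a dict mapping every point a of its domain A to its
    image f(a). The algorithm pays only for moving to a; applying the
    transformation f is free.
    """
    position, total = start, 0.0
    for f in requests:
        a = min(f, key=lambda p: dist(position, p))  # nearest point of A
        total += dist(position, a)                   # movement cost
        position = f[a]                              # free transformation
    return position, total

# Points on the integer line with the usual distance.
line_distance = lambda x, y: abs(x - y)
requests = [
    {2: 5, 3: 3},    # moving to 2 teleports the state to 5
    {4: 4, 10: 0},   # identity on 4, while 10 is sent to 0
]
final_position, total_cost = serve_greedily(0, requests, line_distance)
print(final_position, total_cost)
```

Here the algorithm pays 2 to reach point 2, is moved to 5 at no charge, and then pays 1 to reach point 4.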