Classical and quantum algorithms for scaling problems
This thesis is concerned with scaling problems, which have a plethora of connections to different areas of mathematics, physics and computer science. Although many structural aspects of these problems are understood by now, we only know how to solve them efficiently in special cases. We give new algorithms for non-commutative scaling problems with complexity guarantees that match the prior state of the art. To this end, we extend the well-known (self-concordance based) interior-point method (IPM) framework to Riemannian manifolds, motivated by its success in the commutative setting. Moreover, the IPM framework does not obviously suffer from the same obstructions to efficiency as previous methods. It also yields the first high-precision algorithms for other natural geometric problems in non-positive curvature.
For the (commutative) problems of matrix scaling and balancing, we show that quantum algorithms can outperform the (already very efficient) state-of-the-art classical algorithms. Their time complexity can be sublinear in the input size; in certain parameter regimes they are also optimal, whereas in others we show no quantum speedup over the classical methods is possible. Along the way, we provide improvements over the long-standing state of the art for searching for all marked elements in a list, and for computing the sum of a list of numbers.
We identify a new application in the context of tensor networks for quantum many-body physics. We define a computable canonical form for uniform projected entangled pair states (as the solution to a scaling problem), circumventing previously known undecidability results. We also show, by characterizing the invariant polynomials, that the canonical form is determined by evaluating the tensor network contractions on networks of bounded size.
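For the commutative matrix scaling problem mentioned in the abstract, the classical baseline is Sinkhorn's alternating normalization. The sketch below is a generic textbook version for strictly positive matrices, not the thesis's quantum or Riemannian IPM algorithms; the function name and tolerances are invented for illustration.

```python
import numpy as np

def sinkhorn_scale(A, iters=500, tol=1e-9):
    """Alternately rescale rows and columns of a strictly positive matrix A
    so that the rescaled matrix is (approximately) doubly stochastic."""
    A = np.asarray(A, dtype=float)
    n, m = A.shape
    c = np.ones(m)  # column scaling factors
    for _ in range(iters):
        r = 1.0 / (A @ c)       # make row sums equal 1
        c = 1.0 / (A.T @ r)     # make column sums equal 1
        B = np.diag(r) @ A @ np.diag(c)
        if (np.abs(B.sum(axis=1) - 1).max() < tol
                and np.abs(B.sum(axis=0) - 1).max() < tol):
            break
    return np.diag(r) @ A @ np.diag(c)

B = sinkhorn_scale(np.array([[2.0, 1.0], [1.0, 3.0]]))
```

For strictly positive matrices this iteration is guaranteed to converge; the non-commutative problems treated in the thesis generalize exactly this kind of alternating normalization.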
LIPIcs, Volume 251, ITCS 2023, Complete Volume
A Contrastive Approach to Online Change Point Detection
We suggest a novel procedure for online change point detection. Our approach
expands an idea of maximizing a discrepancy measure between points from
pre-change and post-change distributions. This leads to a flexible procedure
suitable for both parametric and nonparametric scenarios. We prove
non-asymptotic bounds on the average running length of the procedure and its
expected detection delay. The efficiency of the algorithm is illustrated with
numerical experiments on synthetic and real-world data sets.
Comment: Accepted for presentation at AISTATS 2023; 28 pages.
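As a rough illustration of discrepancy-based online detection, the toy below flags a change once a sliding post-window's mean deviates from a fixed pre-change reference window. This is far simpler than the contrastive procedure in the abstract; the window sizes and threshold are invented for the example.

```python
import numpy as np

def online_detect(stream, ref_size=50, win_size=50, threshold=1.0):
    """Return the first time t at which the mean of the last `win_size`
    observations differs from the pre-change reference mean by more
    than `threshold`; a toy mean-discrepancy detector."""
    stream = np.asarray(stream, dtype=float)
    ref_mean = stream[:ref_size].mean()
    for t in range(ref_size + win_size, len(stream) + 1):
        window_mean = stream[t - win_size:t].mean()
        if abs(window_mean - ref_mean) > threshold:
            return t  # alarm time
    return None

rng = np.random.default_rng(0)
# Mean shifts from 0 to 3 at index 200.
data = np.concatenate([rng.normal(0, 1, 200), rng.normal(3, 1, 200)])
alarm = online_detect(data)
```

The two quantities bounded in the paper correspond here to how long the detector runs without a false alarm (average run length) and how soon after index 200 the alarm fires (detection delay).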
Classifier Calibration: A survey on how to assess and improve predicted class probabilities
This paper provides both an introduction to and a detailed overview of the
principles and practice of classifier calibration. A well-calibrated classifier
correctly quantifies the level of uncertainty or confidence associated with its
instance-wise predictions. This is essential for critical applications, optimal
decision making, cost-sensitive classification, and for some types of context
change. Calibration research has a rich history which predates the birth of
machine learning as an academic field by decades. However, a recent increase in
interest in calibration has led to new methods and the extension from the
binary to the multiclass setting. The space of options and issues to consider
is large, and navigating it requires the right set of concepts and tools. We
provide both introductory material and up-to-date technical details of the main
concepts and methods, including proper scoring rules and other evaluation
metrics, visualisation approaches, a comprehensive account of post-hoc
calibration methods for binary and multiclass classification, and several
advanced topics.
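Among the evaluation metrics such surveys cover, the binned Expected Calibration Error (ECE) is one of the most common. A minimal sketch for the binary case follows; the exact estimator has many variants, and this is one standard choice, not the paper's code.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE: average |empirical accuracy - mean confidence| over
    equal-width confidence bins, weighted by the fraction of samples
    falling in each bin."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (probs > lo) & (probs <= hi) if lo > 0 else (probs <= hi)
        if mask.any():
            conf = probs[mask].mean()   # mean predicted probability in bin
            acc = labels[mask].mean()   # empirical frequency of class 1
            ece += mask.mean() * abs(acc - conf)
    return ece

# A perfectly calibrated toy example: predicting 0.8 five times,
# with the positive class occurring 4 times out of 5.
probs = np.array([0.8, 0.8, 0.8, 0.8, 0.8])
labels = np.array([1, 1, 1, 1, 0])
ece = expected_calibration_error(probs, labels)
```

A well-calibrated classifier in the abstract's sense is exactly one whose ECE-style discrepancies vanish as the sample grows.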
Pairwise versus mutual independence: visualisation, actuarial applications and central limit theorems
Accurately capturing the dependence between risks, if it exists, is an increasingly relevant topic of actuarial research. In recent years, several authors have started to relax the traditional 'independence assumption', in a variety of actuarial settings. While it is known that 'mutual independence' between random variables is not equivalent to their 'pairwise independence', this thesis aims to provide a better understanding of the materiality of this difference. The distinction between mutual and pairwise independence matters because, in practice, dependence is often assessed via pairs only, e.g., through correlation matrices, rank-based measures of association, scatterplot matrices, heat-maps, etc. Using such pairwise methods, it is possible to miss some forms of dependence. In this thesis, we explore how material the difference between pairwise and mutual independence is, and from several angles.
We provide relevant background and motivation for this thesis in Chapter 1, then conduct a literature review in Chapter 2.
In Chapter 3, we focus on visualising the difference between pairwise and mutual independence. To do so, we propose a series of theoretical examples (some of them new) where random variables are pairwise independent but (mutually) dependent, in short, PIBD. We then develop new visualisation tools and use them to illustrate what PIBD variables can look like. We showcase that the dependence involved is possibly very strong. We also use our visualisation tools to identify subtle forms of dependence, which would otherwise be hard to detect.
In Chapter 4, we review common dependence models (such as elliptical distributions and Archimedean copulas) used in actuarial science and show that they do not allow for the possibility of PIBD data. We also investigate concrete consequences of the 'nonequivalence' between pairwise and mutual independence. We establish that many results which hold for mutually independent variables do not hold under pairwise independence alone. Those include results about finite sums of random variables, extreme value theory and bootstrap methods. This part thus illustrates what can potentially 'go wrong' if one assumes mutual independence where only pairwise independence holds.
Lastly, in Chapters 5 and 6, we investigate the question of what happens for PIBD variables 'in the limit', i.e., when the sample size goes to infinity. We want to see if the 'problems' caused by dependence vanish for sufficiently large samples. This is a broad question, and we concentrate on the important classical Central Limit Theorem (CLT), for which we find that the answer is largely negative. In particular, we construct new sequences of PIBD variables (with arbitrary margins) for which a CLT does not hold. We derive explicitly the asymptotic distribution of the standardised mean of our sequences, which allows us to illustrate the extent of the 'failure' of a CLT for PIBD variables. We also propose a general methodology to construct dependent K-tuplewise independent (K an arbitrary integer) sequences of random variables with arbitrary margins. In the case K = 3, we use this methodology to derive explicit examples of triplewise independent sequences for which no CLT holds. Those results illustrate that mutual independence is a crucial assumption within CLTs, and that having larger samples is not always a viable solution to the problem of non-independent data.
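The canonical small example of pairwise-independent-but-dependent (PIBD) variables takes X and Y to be independent fair ±1 coins and sets Z = XY: each pair is independent, yet Z is fully determined by (X, Y). A quick numerical check (illustrative only; the thesis constructs far richer examples):

```python
import numpy as np

# X, Y independent fair +/-1 coins; Z = X * Y.
# Each pair is independent (e.g. P(X=a, Z=b) = 1/4 for all sign choices),
# but the triple is mutually dependent: Z is a function of (X, Y).
rng = np.random.default_rng(1)
n = 200_000
X = rng.choice([-1, 1], size=n)
Y = rng.choice([-1, 1], size=n)
Z = X * Y

# Pairwise: all sample correlations vanish
# (E[XZ] = E[X^2 Y] = E[Y] = 0, and similarly for the other pairs).
assert abs(np.mean(X * Y)) < 0.02
assert abs(np.mean(X * Z)) < 0.02
assert abs(np.mean(Y * Z)) < 0.02
# Mutual dependence: X*Y*Z == 1 always, which mutual independence
# of three fair +/-1 coins would forbid (it would hold with prob. 1/2).
assert np.all(X * Y * Z == 1)
```

Any pairwise diagnostic (correlation matrix, scatterplot matrix) applied to (X, Y, Z) looks exactly like three independent coins, which is precisely the point the thesis makes about pairwise methods missing dependence.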
The stationary horizon as the central multi-type invariant measure in the KPZ universality class
The Kardar-Parisi-Zhang (KPZ) universality class describes a large class of
2-dimensional models of random growth, which exhibit universal scaling
exponents and limiting statistics. The last ten years have seen remarkable
progress in this area, with the formal construction of two interrelated
limiting objects, now termed the KPZ fixed point and the directed landscape
(DL). This dissertation focuses on a third central object, termed the
stationary horizon (SH). The SH was first introduced (and named) by Busani as
the scaling limit of the Busemann process in exponential last-passage
percolation. Shortly after, in the author's joint work with Seppäläinen, it
was independently constructed in the context of Brownian last-passage
percolation. In this dissertation, we give an alternate construction of the SH,
directly from the description of its finite-dimensional distributions and
without reference to Busemann functions. From this description, we give several
exact distributional formulas for the SH. Next, we show the significance of the
SH as a key object in the KPZ universality class by showing that the SH is the
unique coupled invariant distribution for the DL. A major consequence of this
result is that the SH describes the Busemann process for the DL. From this
connection, we give a detailed description of the collection of semi-infinite
geodesics in the DL, from all initial points and in all directions. As a
further evidence of the universality of the SH, we show that it appears as the
scaling limit of the multi-species invariant measures for the totally
asymmetric simple exclusion process (TASEP). This dissertation is adapted from
two joint works with Seppäläinen and two joint works with Busani and Seppäläinen.
Comment: v2: minor typos corrected. PhD dissertation, University of Wisconsin–Madison (2023). Contains material adapted from arXiv:2103.01172, arXiv:2112.10729, arXiv:2203.13242 and arXiv:2211.04651. Chapter 3 gives an alternate proof of the invariance of the SH shown in arXiv:2203.13242.
The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning
The mechanisms behind the success of multi-view self-supervised learning
(MVSSL) are not yet fully understood. Contrastive MVSSL methods have been
studied through the lens of InfoNCE, a lower bound of the Mutual Information
(MI). However, the relation between other MVSSL methods and MI remains unclear.
We consider a different lower bound on the MI consisting of an entropy and a
reconstruction term (ER), and analyze the main MVSSL families through its lens.
Through this ER bound, we show that clustering-based methods such as
DeepCluster and SwAV maximize the MI. We also re-interpret the mechanisms of
distillation-based approaches such as BYOL and DINO, showing that they
explicitly maximize the reconstruction term and implicitly encourage a stable
entropy, and we confirm this empirically. We show that replacing the objectives
of common MVSSL methods with this ER bound achieves competitive performance,
while making them stable when training with smaller batch sizes or smaller
exponential moving average (EMA) coefficients.
Github repo: https://github.com/apple/ml-entropy-reconstruction.
Comment: 18 pages: 9 of main text, 2 of references, and 7 of supplementary material. Appears in the proceedings of ICML 2023.
A multifractal model of asset (in)variances
This study extends Mandelbrot’s (2008) multifractal model of asset returns to model realized variances across different time frequencies. In a comparative manner, various degrees of time deformations are explored for implementation of the multiplicative cascade. In doing so, this study focuses on two effects: discontinuity measured by the specific power-law exponent and dependency measured by the Hurst exponent. This study shows that the benchmark model, for which Mandelbrot’s (2008) “cartoon” is the foundation, has some remarkable properties as it is capable of explaining the realized variances for the GBP/USD exchange rate and Bitcoin. Notably, the realized variances for crude oil and the S&P 500 require a more extreme time deformation. The invariance hypothesis is confirmed for all realized variances because the power-law exponents for weekly and monthly data coincide with predictions of the multifractal model. Overall, the novel results derived from the proposed multifractal models suggest that some realized variances of otherwise unrelated asset markets are driven by the same underlying “driving force”—a common multifractal cascade. © 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
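The multiplicative cascade underlying such multifractal models can be sketched with the classic binomial measure: unit mass on [0,1] is repeatedly split in two, each half receiving a random fraction of its parent's mass. A minimal version follows; the parameters are illustrative and not calibrated to the paper's data.

```python
import numpy as np

def binomial_cascade(levels, m0=0.6, rng=None):
    """Binomial multiplicative cascade: start with mass 1 on [0,1];
    at each level split every interval in two, sending fraction m0 of
    its mass to one half (chosen at random) and 1-m0 to the other.
    Returns the 2**levels cell masses, whose wildly uneven spread is
    the hallmark of multifractality."""
    rng = np.random.default_rng(rng)
    mass = np.array([1.0])
    for _ in range(levels):
        # independently per interval, decide which half gets fraction m0
        left = np.where(rng.random(mass.size) < 0.5, m0, 1.0 - m0)
        # interleave left/right child masses to keep spatial order
        mass = np.column_stack([mass * left, mass * (1.0 - left)]).ravel()
    return mass

mass = binomial_cascade(10, rng=0)
```

Total mass is conserved at every level by construction, while the distribution across cells becomes increasingly concentrated, mimicking the bursts of volatility the multifractal variance models capture.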
Measuring the impact of COVID-19 on hospital care pathways
Care pathways in hospitals around the world reported significant disruption during the recent COVID-19 pandemic, but measuring the actual impact is more problematic. Process mining can be useful for hospital management to measure the conformance of real-life care to what might be considered normal operations. In this study, we aim to demonstrate that process mining can be used to investigate process changes associated with complex disruptive events. We studied perturbations to accident and emergency (A&E) and maternity pathways in a UK public hospital during the COVID-19 pandemic. Coincidentally, the hospital had implemented a Command Centre approach for patient-flow management, affording an opportunity to study both the planned improvement and the disruption due to the pandemic. Our study proposes and demonstrates a method for measuring and investigating the impact of such planned and unplanned disruptions affecting hospital care pathways. We found that during the pandemic, both A&E and maternity pathways had measurable reductions in the mean length of stay and a measurable drop in the percentage of pathways conforming to normative models. There were no distinctive patterns in the monthly mean values of length of stay nor in conformance throughout the phases of the installation of the hospital’s new Command Centre approach. Due to a deficit in the available A&E data, the findings for A&E pathways could not be interpreted.
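Conformance checking, as used in the study, compares observed care pathways against a normative process model. A deliberately simplified sketch follows: the activity names and the transition-set "model" are invented, and real alignment-based fitness as used in process mining is considerably richer.

```python
# Toy conformance measure: the fraction of patient traces whose
# consecutive activity pairs are all permitted by a normative model,
# here encoded simply as a set of allowed transitions.
NORMATIVE = {("arrive", "triage"), ("triage", "treat"), ("treat", "discharge")}

def conforms(trace, model=NORMATIVE):
    """True if every consecutive pair of activities is allowed."""
    return all(pair in model for pair in zip(trace, trace[1:]))

def conformance_rate(log):
    """Fraction of traces in the event log that conform to the model."""
    return sum(conforms(t) for t in log) / len(log)

log = [
    ["arrive", "triage", "treat", "discharge"],  # normal pathway
    ["arrive", "treat", "discharge"],            # triage skipped
    ["arrive", "triage", "treat", "discharge"],
]
rate = conformance_rate(log)  # 2 of 3 traces conform
```

Tracking such a rate month by month, alongside mean length of stay, is the kind of longitudinal comparison the study uses to separate pandemic disruption from the Command Centre rollout.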