17,751 research outputs found

    Rank-based linkage I: triplet comparisons and oriented simplicial complexes

    Rank-based linkage is a new tool for summarizing a collection $S$ of objects according to their relationships. These objects are not mapped to vectors, and "similarity" between objects need be neither numerical nor symmetrical. All an object needs to do is rank nearby objects by similarity to itself, using a Comparator which is transitive, but need not be consistent with any metric on the whole set. Call this a ranking system on $S$. Rank-based linkage is applied to the $K$-nearest neighbor digraph derived from a ranking system. Computations occur on a 2-dimensional abstract oriented simplicial complex whose faces are among the points, edges, and triangles of the line graph of the undirected $K$-nearest neighbor graph on $S$. In $|S| K^2$ steps it builds an edge-weighted linkage graph $(S, \mathcal{L}, \sigma)$ where $\sigma(\{x, y\})$ is called the in-sway between objects $x$ and $y$. Take $\mathcal{L}_t$ to be the links whose in-sway is at least $t$, and partition $S$ into components of the graph $(S, \mathcal{L}_t)$, for varying $t$. Rank-based linkage is a functor from a category of out-ordered digraphs to a category of partitioned sets, with the practical consequence that augmenting the set of objects in a rank-respectful way gives a fresh clustering which does not "rip apart" the previous one. The same holds for single linkage clustering in the metric space context, but not for typical optimization-based methods. Open combinatorial problems are presented in the last section. Comment: 37 pages, 12 figures
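
    A minimal sketch of the thresholding step described above: given precomputed in-sway weights $\sigma$, keep the links with in-sway at least $t$ and partition $S$ into connected components via union-find. The function names and toy data are illustrative, not from the paper, and the computation of the in-sway weights themselves is not shown.

        from collections import defaultdict

        def partition_at(objects, insway, t):
            """Partition `objects` into connected components of the graph whose
            links {x, y} have in-sway sigma({x, y}) >= t (union-find)."""
            parent = {x: x for x in objects}

            def find(x):
                while parent[x] != x:
                    parent[x] = parent[parent[x]]  # path halving
                    x = parent[x]
                return x

            def union(x, y):
                rx, ry = find(x), find(y)
                if rx != ry:
                    parent[rx] = ry

            for (x, y), sigma in insway.items():
                if sigma >= t:
                    union(x, y)

            components = defaultdict(set)
            for x in objects:
                components[find(x)].add(x)
            return list(components.values())

        # Illustrative in-sway weights on a toy set of objects.
        S = ["a", "b", "c", "d"]
        insway = {("a", "b"): 3, ("b", "c"): 1, ("c", "d"): 2}
        print(partition_at(S, insway, t=2))  # e.g. [{'a', 'b'}, {'c', 'd'}]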

    Self-Supervised Learning to Prove Equivalence Between Straight-Line Programs via Rewrite Rules

    We target the problem of automatically synthesizing proofs of semantic equivalence between two programs made of sequences of statements. We represent programs using abstract syntax trees (AST), where a given set of semantics-preserving rewrite rules can be applied on a specific AST pattern to generate a transformed and semantically equivalent program. In our system, two programs are equivalent if there exists a sequence of application of these rewrite rules that leads to rewriting one program into the other. We propose a neural network architecture based on a transformer model to generate proofs of equivalence between program pairs. The system outputs a sequence of rewrites, and the validity of the sequence is simply checked by verifying it can be applied. If no valid sequence is produced by the neural network, the system reports the programs as non-equivalent, ensuring by design no programs may be incorrectly reported as equivalent. Our system is fully implemented for a given grammar which can represent straight-line programs with function calls and multiple types. To efficiently train the system to generate such sequences, we develop an original incremental training technique, named self-supervised sample selection. We extensively study the effectiveness of this novel training approach on proofs of increasing complexity and length. Our system, S4Eq, achieves 97% proof success on a curated dataset of 10,000 pairs of equivalent programs. Comment: 30 pages including appendix
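
    The validity check described above is mechanical: a proposed sequence of rewrites is accepted only if every rule applies and the final program equals the target. Below is a minimal sketch of that check on a toy tuple-based AST, with hypothetical rewrite rules applied at the root only (the actual system matches rule patterns at arbitrary AST positions); none of these names come from S4Eq.

        # Toy ASTs as nested tuples: ("add", x, y), ("const", c), ("var", name)

        def commute_add(node):
            """x + y -> y + x (illustrative semantics-preserving rewrite)."""
            if isinstance(node, tuple) and node[0] == "add":
                return ("add", node[2], node[1])
            return None  # rule does not apply at this node

        def add_zero(node):
            """x + 0 -> x."""
            if isinstance(node, tuple) and node[0] == "add" and node[2] == ("const", 0):
                return node[1]
            return None

        RULES = {"commute_add": commute_add, "add_zero": add_zero}

        def check_proof(source, target, proof):
            """Accept `proof` (a list of rule names applied at the root) only if every
            step applies and the final AST equals `target`; otherwise reject."""
            ast = source
            for rule_name in proof:
                rewritten = RULES[rule_name](ast)
                if rewritten is None:
                    return False      # invalid step: the sequence is not a proof
                ast = rewritten
            return ast == target      # never wrongly reports equivalence

        src = ("add", ("const", 0), ("var", "x"))
        tgt = ("var", "x")
        print(check_proof(src, tgt, ["commute_add", "add_zero"]))  # True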

    Properties of a model of sequential random allocation

    Probabilistic models of allocating shots to boxes according to a certain probability distribution have commonly been used for processes involving agglomeration. Such processes are of interest in many areas of research such as ecology, physiology, chemistry and genetics. Time could be incorporated into the shots-and-boxes model by considering multiple layers of boxes through which the shots move, where the layers represent the passing of time. Such a scheme with multiple layers, each with a certain number of occupied boxes, is naturally associated with a random tree. It lends itself to genetic applications where the number of ancestral lineages of a sample changes through the generations. This multiple-layer scheme also allows us to explore the difference in the number of occupied boxes between layers, which gives a measure of how quickly merges are happening. In particular, results for the multiple-layer scheme corresponding to those known for a single-layer scheme, where, under certain conditions, the limiting distribution of the number of occupied boxes is either Poisson or normal, are derived. To provide motivation and demonstrate which methods work well, a detailed study of a small, finite example is provided. A common approach for establishing a limiting distribution for a random variable of interest is to first show that it can be written as a sum of independent Bernoulli random variables, as this then allows us to apply standard central limit theorems. Additionally, it allows us to, for example, provide an upper bound on the distance to a Poisson distribution. One way of showing that a random variable can be written as a sum of independent Bernoulli random variables is to show that its probability generating function (p.g.f.) has all real roots. Various methods are presented and considered for proving that the p.g.f. of the number of occupied boxes in any given layer of the scheme has all real roots. By considering small finite examples, some of these methods could be ruled out for general N. Finally, the scheme for general N boxes and n shots is considered, where again a uniform allocation of shots is used. It is shown that, under certain conditions, the distribution of the number of occupied boxes tends towards either a normal or Poisson limit. Equivalent results are also demonstrated for the distribution of the difference in the number of occupied boxes between consecutive layers.
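
    A minimal Monte Carlo sketch of the single-layer scheme under uniform allocation, purely for illustration: it tabulates the empirical distribution of the number of occupied boxes, the quantity whose Poisson and normal limits the thesis studies. The function names and the small example are assumptions, not taken from the thesis.

        import random
        from collections import Counter

        def occupied_boxes(n_shots, n_boxes, rng=random):
            """Allocate n_shots uniformly at random into n_boxes and return
            how many distinct boxes end up occupied."""
            return len({rng.randrange(n_boxes) for _ in range(n_shots)})

        def empirical_distribution(n_shots, n_boxes, trials=10_000, seed=0):
            """Empirical distribution of the number of occupied boxes."""
            rng = random.Random(seed)
            counts = Counter(occupied_boxes(n_shots, n_boxes, rng) for _ in range(trials))
            return {k: c / trials for k, c in sorted(counts.items())}

        # Small finite example in the spirit of the thesis: a few shots into a few boxes.
        print(empirical_distribution(n_shots=5, n_boxes=10))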

    Path integrals and stochastic calculus

    Path integrals are a ubiquitous tool in theoretical physics. However, their use is sometimes hindered by the lack of control on various manipulations -- such as performing a change of the integration path -- one would like to carry out in the light-hearted fashion that physicists enjoy. Similar issues arise in the field of stochastic calculus, which we review to prepare the ground for a proper construction of path integrals. At the level of path integration, and in arbitrary space dimension, we not only report on existing Riemannian geometry-based approaches that render path integrals amenable to the standard rules of calculus, but also bring forth new routes, based on a fully time-discretized approach, that achieve the same goal. We illustrate these various definitions of path integration on simple examples such as the diffusion of a particle on a sphere. Comment: 96 pages, 4 figures. New title, expanded introduction and additional references. Version accepted in Advances in Physics
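
    As a hedged illustration of why the choice of time discretization matters (a generic toy example, not an excerpt from the review), the sketch below integrates the SDE dX = X dW with the Ito pre-point rule and with a Stratonovich mid-point rule; the two discretizations converge to different processes, which is the kind of ambiguity a careful construction of the path integral has to resolve.

        import math
        import random

        def simulate(scheme, T=1.0, steps=400, x0=1.0, rng=None):
            """Discretize dX = X dW with either the Ito (pre-point) rule or the
            Stratonovich (mid-point) rule, the latter solved exactly per step."""
            rng = rng or random.Random(0)
            dt = T / steps
            x = x0
            for _ in range(steps):
                dw = rng.gauss(0.0, math.sqrt(dt))
                if scheme == "ito":
                    x = x + x * dw                           # noise evaluated at the pre-point
                elif scheme == "stratonovich":
                    x = x * (1 + 0.5 * dw) / (1 - 0.5 * dw)  # implicit mid-point step
                else:
                    raise ValueError(scheme)
            return x

        def mean(scheme, samples=1_000):
            rng = random.Random(1)
            return sum(simulate(scheme, rng=rng) for _ in range(samples)) / samples

        # E[X_T] stays at 1 under Ito, but grows roughly like exp(T/2) under Stratonovich.
        print(mean("ito"), mean("stratonovich"))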

    Bayesian Optimization with Conformal Prediction Sets

    Bayesian optimization is a coherent, ubiquitous approach to decision-making under uncertainty, with applications including multi-arm bandits, active learning, and black-box optimization. Bayesian optimization selects decisions (i.e. objective function queries) with maximal expected utility with respect to the posterior distribution of a Bayesian model, which quantifies reducible, epistemic uncertainty about query outcomes. In practice, subjectively implausible outcomes can occur regularly for two reasons: 1) model misspecification and 2) covariate shift. Conformal prediction is an uncertainty quantification method with coverage guarantees even for misspecified models and a simple mechanism to correct for covariate shift. We propose conformal Bayesian optimization, which directs queries towards regions of search space where the model predictions have guaranteed validity, and investigate its behavior on a suite of black-box optimization tasks and tabular ranking tasks. In many cases we find that query coverage can be significantly improved without harming sample-efficiency. Comment: For code, see https://www.github.com/samuelstanton/conformal-bayesopt.git
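
    As background for the conformal ingredient only (a generic sketch, not the authors' implementation; see their repository for that), split conformal prediction turns any point predictor plus a held-out calibration set into intervals with finite-sample marginal coverage of at least 1 - alpha. The predictor and calibration data below are assumptions for illustration.

        import math

        def split_conformal_interval(predict, calib_x, calib_y, x_new, alpha=0.1):
            """Return a prediction interval for x_new with >= 1 - alpha marginal
            coverage, using absolute residuals on a held-out calibration set."""
            scores = sorted(abs(y - predict(x)) for x, y in zip(calib_x, calib_y))
            n = len(scores)
            # Conformal quantile: the ceil((n + 1) * (1 - alpha))-th smallest score,
            # clipped to the largest score for very small calibration sets.
            k = min(n - 1, math.ceil((n + 1) * (1 - alpha)) - 1)
            q = scores[k]
            center = predict(x_new)
            return center - q, center + q

        # Illustrative predictor and calibration data (assumed, not from the paper).
        predict = lambda x: 2.0 * x
        calib_x = [0.1 * i for i in range(50)]
        calib_y = [2.0 * x + 0.1 * ((-1) ** i) for i, x in enumerate(calib_x)]
        print(split_conformal_interval(predict, calib_x, calib_y, x_new=1.0))  # ~(1.9, 2.1)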

    From classical to quantum Oppenheimer-Snyder model: non-marginal case

    We first present a consistent canonical formulation of the general (non-marginal) Oppenheimer-Snyder model. The switching between comoving and stationary observers is achieved by promoting coordinate transformations between dust proper time and Schwarzschild-Killing time to canonical ones. This leads to a multivalued Hamiltonian which is deparameterizable. We then discuss the quantization of comoving and stationary observers by employing the method of Affine Coherent State Quantization (ACSQ). We thereby demonstrate that under certain conditions the quantum-corrected trajectories can replace the classical singularity by a bounce. We then show that both comoving and stationary observers see this bouncing collapse behavior. We finally discuss a switching between these classes of observers at the quantum level. Comment: 22 pages, 10 figures

    Learning disentangled speech representations

    A variety of informational factors are contained within the speech signal and a single short recording of speech reveals much more than the spoken words. The best method to extract and represent informational factors from the speech signal ultimately depends on which informational factors are desired and how they will be used. In addition, sometimes methods will capture more than one informational factor at the same time, such as speaker identity, spoken content, and speaker prosody. The goal of this dissertation is to explore different ways to deconstruct the speech signal into abstract representations that can be learned and later reused in various speech technology tasks. This task of deconstructing, also known as disentanglement, is a form of distributed representation learning. As a general approach to disentanglement, there are some guiding principles that elaborate what a learned representation should contain as well as how it should function. In particular, learned representations should contain all of the requisite information in a more compact manner, be interpretable, remove nuisance factors of irrelevant information, be useful in downstream tasks, and be independent of the task at hand. The learned representations should also be able to answer counter-factual questions. In some cases, learned speech representations can be re-assembled in different ways according to the requirements of downstream applications. For example, in a voice conversion task, the speech content is retained while the speaker identity is changed. In a content-privacy task, some targeted content may be concealed without affecting how surrounding words sound. While there is no single best method to disentangle all types of factors, some end-to-end approaches demonstrate a promising degree of generalization to diverse speech tasks. This thesis explores a variety of use-cases for disentangled representations including phone recognition, speaker diarization, linguistic code-switching, voice conversion, and content-based privacy masking. Speech representations can also be utilised for automatically assessing the quality and authenticity of speech, such as automatic MOS ratings or detecting deep fakes. The meaning of the term "disentanglement" is not well defined in previous work, and it has acquired several meanings depending on the domain (e.g. image vs. speech). Sometimes the term "disentanglement" is used interchangeably with the term "factorization". This thesis proposes that disentanglement of speech is distinct, and offers a viewpoint of disentanglement that can be considered both theoretically and practically.
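
    A minimal, purely illustrative sketch of the re-assembly idea for voice conversion: a content representation from a source utterance is decoded together with a speaker representation from a target utterance. The interfaces and stub components below are hypothetical and stand in for trained networks; none of these names come from the thesis.

        from dataclasses import dataclass
        from typing import Callable, Sequence

        ContentEmbedding = Sequence[float]
        SpeakerEmbedding = Sequence[float]

        @dataclass
        class DisentangledCodec:
            encode_content: Callable[[Sequence[float]], ContentEmbedding]  # "what was said"
            encode_speaker: Callable[[Sequence[float]], SpeakerEmbedding]  # "who said it"
            decode: Callable[[ContentEmbedding, SpeakerEmbedding], Sequence[float]]

            def convert_voice(self, source_wav, target_wav):
                """Voice conversion by re-assembly: keep the source's content
                representation, swap in the target speaker's representation."""
                content = self.encode_content(source_wav)
                speaker = self.encode_speaker(target_wav)
                return self.decode(content, speaker)

        # Stub components so the sketch runs; real systems would use trained encoders.
        codec = DisentangledCodec(
            encode_content=lambda wav: wav[::2],
            encode_speaker=lambda wav: [sum(wav) / len(wav)],
            decode=lambda content, speaker: [c + speaker[0] for c in content],
        )
        print(codec.convert_voice([0.1, 0.2, 0.3, 0.4], [0.5, 0.7]))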

    Moduli Stabilisation and the Statistics of Low-Energy Physics in the String Landscape

    In this thesis we present a detailed analysis of the statistical properties of the type IIB flux landscape of string theory. We focus primarily on models constructed via the Large Volume Scenario (LVS) and KKLT and study the distribution of various phenomenologically relevant quantities. First, we compare our considerations with previous results and point out the importance of Kähler moduli stabilisation, which has been neglected in this context so far. We perform different moduli stabilisation procedures and compare the resulting distributions. To this end, we derive the expressions for the gravitino mass, various quantities related to axion physics and other phenomenologically interesting quantities in terms of the fundamental flux dependent quantities $g_s$, $W_0$ and $\mathfrak{n}$, the parameter which specifies the nature of the non-perturbative effects. Exploiting our knowledge of the distribution of these fundamental parameters, we can derive a distribution for all the quantities we are interested in. For models that are stabilised via LVS we find a logarithmic distribution, whereas for KKLT and perturbatively stabilised models we find a power-law distribution. We continue by investigating the statistical significance of a newly found class of KKLT vacua and present a search algorithm for such constructions. We conclude by presenting an application of our findings. Given the mild preference for higher scale supersymmetry breaking, we present a model of the early universe, which allows for additional periods of early matter domination and ultimately leads to rather sharp predictions for the dark matter mass in this model. We find the dark matter mass to be in the very heavy range $m_{\chi}\sim 10^{10}-10^{11}\text{ GeV}$.