Quantum Algorithm Implementations for Beginners
As quantum computers become available to the general public, the need has
arisen to train a cohort of quantum programmers, many of whom have been
developing classical computer programs for most of their careers. While
currently available quantum computers have fewer than 100 qubits, quantum
computing hardware is widely expected to grow in terms of qubit count, quality,
and connectivity. This review aims to explain the principles of quantum
programming, which are quite different from classical programming, with
straightforward algebra that makes an understanding of the fascinating underlying
quantum mechanical principles optional. We give an introduction to quantum
computing algorithms and their implementation on real quantum hardware. We
survey 20 different quantum algorithms, attempting to describe each in a
succinct and self-contained fashion. We show how these algorithms can be
implemented on IBM's quantum computer, and in each case, we discuss the results
of the implementation with respect to differences between the simulator and the
actual hardware runs. This article introduces computer scientists, physicists,
and engineers to quantum algorithms and provides a blueprint for their
implementations.
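To make the gap between the algebra and an actual run concrete, here is a minimal state-vector simulation of the canonical Bell-state circuit in plain NumPy (a sketch of my own, not code from the review; on IBM hardware the same circuit would be written in a framework such as Qiskit, and noise would push the measured counts away from the ideal 50/50 split):

```python
import numpy as np

# Single-qubit gates as 2x2 matrices
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard
I2 = np.eye(2)

# CNOT on two qubits (control = qubit 0, target = qubit 1)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

# Start in |00>, apply H to qubit 0, then entangle with CNOT
state = np.array([1, 0, 0, 0], dtype=complex)
state = np.kron(H, I2) @ state
state = CNOT @ state                            # (|00> + |11>) / sqrt(2)

# Sample 1024 "shots" from the ideal measurement distribution
probs = np.abs(state) ** 2
counts = np.random.multinomial(1024, probs)
for bits, c in zip(["00", "01", "10", "11"], counts):
    print(bits, c)
```

The counts concentrate on the 00 and 11 outcomes; the simulator-versus-hardware discrepancies the review discusses show up as weight leaking into 01 and 10.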
Market state discovery
We explore the concept of financial market state discovery by assessing the robustness of two unsupervised machine learning algorithms: Inverse Covariance Clustering (ICC) and Agglomerative Super Paramagnetic Clustering (ASPC). The assessment is carried out by simulating market datasets of varying complexity; implementing ICC and ASPC to estimate the underlying states (using only simulated log-returns as inputs); and measuring the algorithms' ability to recover the underlying states, using the Adjusted Rand Index (ARI) as a performance metric. Experiments revealed that ASPC is a more robust and better-performing algorithm than ICC. ICC is able to produce competitive results in 2-state markets; however, its primary disadvantage is its inability to maintain strong performance in 3-, 4- and 5-state markets. For example, ASPC produced ARI scores up to 800% higher than ICC's in 5-state markets. Furthermore, ASPC does not rely on the delicate selection of hyper-parameters such as the number of states a priori. ICC's utility as a market state discovery algorithm is therefore limited.
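As a concrete illustration of this evaluation protocol, the sketch below scores a recovered state sequence against a planted ground truth with scikit-learn's adjusted_rand_score; the toy market simulator and the k-means stand-in for ICC/ASPC are my own placeholder assumptions, not the authors' setup:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)

# Toy 3-state market: 50 persistent regimes of 20 observations each,
# each state with its own mean return and volatility
true_states = np.repeat(rng.integers(0, 3, size=50), 20)
means = np.array([0.001, 0.0, -0.002])[true_states]
vols = np.array([0.005, 0.010, 0.020])[true_states]
log_returns = rng.normal(means, vols)

# Placeholder clustering on rolling mean / rolling variance features
w = 20
feats = np.column_stack([
    np.convolve(log_returns, np.ones(w) / w, mode="same"),
    np.convolve(log_returns ** 2, np.ones(w) / w, mode="same"),
])
pred_states = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(feats)

# ARI = 1 for perfect recovery, ~0 for a random labeling
print("ARI:", adjusted_rand_score(true_states, pred_states))
```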
2D growth processes: SLE and Loewner chains
This review provides an introduction to two dimensional growth processes.
Although it covers a variety of processes such as diffusion limited aggregation,
it is mostly devoted to a detailed presentation of stochastic Schramm-Loewner
evolutions (SLE) which are Markov processes describing interfaces in 2D
critical systems. It starts with an informal discussion, using numerical
simulations, of various examples of 2D growth processes and their connections
with statistical mechanics. SLE is then introduced and Schramm's argument
mapping conformally invariant interfaces to SLE is explained. A substantial
part of the review is devoted to revealing the deep connections between statistical mechanics and these growth processes, and more specifically, in the present context, between 2D critical systems and SLE. Some of SLE's remarkable
properties are explained, as well as the tools for computing with SLE. This
review has been written with the aim of filling the gap between the
mathematical and the physical literatures on the subject.
Comment: A review on Stochastic Loewner evolutions for Physics Reports, 172 pages, low quality figures, better quality figures upon request to the authors, comments welcome
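To give a flavor of the simulations the review opens with, the sketch below (my own illustration, not the authors' code) traces a discretized chordal SLE curve by composing inverse slit-map solutions of the Loewner equation dg_t/dt = 2 / (g_t - xi_t), with Brownian driving xi_t = sqrt(kappa) B_t:

```python
import numpy as np

rng = np.random.default_rng(1)
kappa, n, dt = 2.0, 1500, 1e-3          # SLE parameter and discretization

# Brownian driving function xi_t = sqrt(kappa) * B_t, piecewise constant
xi = np.sqrt(kappa) * np.cumsum(np.sqrt(dt) * rng.standard_normal(n))

def inv_slit(w, x, dt):
    """Inverse of the elementary slit map g(z) = x + sqrt((z - x)**2 + 4*dt),
    which solves the Loewner equation with constant driving x; the square
    root branch with Im >= 0 keeps the image in the upper half-plane."""
    s = np.sqrt((w - x) ** 2 - 4 * dt)
    return x + (s if s.imag >= 0 else -s)

# gamma(t_k) is obtained by pulling the slit tip back through earlier maps
trace = np.empty(n, dtype=complex)
for k in range(n):
    z = xi[k] + 2j * np.sqrt(dt)        # tip of the k-th elementary slit
    for j in range(k - 1, -1, -1):
        z = inv_slit(z, xi[j], dt)
    trace[k] = z

print(trace[:5])   # trace.real vs trace.imag approximates the SLE_kappa curve
```

Plotting trace.real against trace.imag shows the fractal interface; varying kappa moves between the well-known SLE phases (simple curves for kappa <= 4, space-filling curves for kappa >= 8).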
A network approach to topic models
One of the main computational and scientific challenges in the modern age is
to extract useful information from unstructured texts. Topic models are a popular machine-learning approach that infers the latent topical structure of a collection of documents. Despite their success, in particular that of the most widely used variant, Latent Dirichlet Allocation (LDA), and their numerous applications in sociology, history, and linguistics, topic models are known to
suffer from severe conceptual and practical problems, e.g. a lack of
justification for the Bayesian priors, discrepancies with statistical
properties of real texts, and the inability to properly choose the number of
topics. Here we obtain a fresh view on the problem of identifying topical
structures by relating it to the problem of finding communities in complex
networks. This is achieved by representing text corpora as bipartite networks
of documents and words. By adapting existing community-detection methods --
using a stochastic block model (SBM) with non-parametric priors -- we obtain a
more versatile and principled framework for topic modeling (e.g., it
automatically detects the number of topics and hierarchically clusters both the
words and documents). The analysis of artificial and real corpora demonstrates
that our SBM approach leads to better topic models than LDA in terms of
statistical model selection. More importantly, our work shows how to formally
relate methods from community detection and topic modeling, opening the
possibility of cross-fertilization between these two fields.
Comment: 22 pages, 10 figures, code available at https://topsbm.github.io
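The construction is easy to sketch. The fragment below builds the word-document bipartite network and hands it to graph-tool's nested-SBM inference; it is an illustration under stated assumptions (graph-tool installed, toy corpus, API details varying by version), not the released topsbm code at the URL above, which additionally constrains documents and words to separate groups:

```python
import graph_tool.all as gt

docs = [["quantum", "algorithm", "qubit"],
        ["market", "state", "clustering"],
        ["quantum", "qubit", "hardware"]]

g = gt.Graph(directed=False)
name = g.vp["name"] = g.new_vertex_property("string")
kind = g.vp["kind"] = g.new_vertex_property("int")   # 0 = document, 1 = word

word_vertex = {}
for d, tokens in enumerate(docs):
    vd = g.add_vertex()
    name[vd], kind[vd] = f"doc_{d}", 0
    for w in tokens:
        if w not in word_vertex:
            vw = g.add_vertex()
            name[vw], kind[vw] = w, 1
            word_vertex[w] = vw
        g.add_edge(vd, word_vertex[w])   # one edge per token occurrence

# Nested SBM inference: the hierarchy clusters documents and words jointly,
# and the number of groups (topics) is selected by the model itself.
state = gt.minimize_nested_blockmodel_dl(g)
state.print_summary()
```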
Unsupervised inference methods for protein sequence data
The abstract is in the attachment.
Modernizing Markov Chains Monte Carlo for Scientific and Bayesian Modeling
The advent of probabilistic programming languages has galvanized scientists to write increasingly diverse models to analyze data. Probabilistic models use a joint distribution over observed and latent variables to describe at once elaborate scientific theories, non-trivial measurement procedures, information from previous studies, and more. To effectively deploy these models in a data analysis, we need inference procedures which are reliable, flexible, and fast. In a Bayesian analysis, inference boils down to estimating expectation values and quantiles under a posterior distribution that is available only in unnormalized form. This estimation problem also arises in the study of non-Bayesian probabilistic models, a prominent example being the Ising model of Statistical Physics.
Markov chains Monte Carlo (MCMC) algorithms provide a general-purpose sampling method which can be used to construct sample estimators of moments and quantiles. Despite MCMC's compelling theory and empirical success, many models continue to frustrate MCMC, as well as other inference strategies, effectively limiting our ability to use these models in a data analysis. These challenges motivate new developments in MCMC. The term "modernize" in the title refers to the deployment of methods which have revolutionized Computational Statistics and Machine Learning in the past decade, including: (i) hardware accelerators to support massive parallelization, (ii) approximate inference based on tractable densities, (iii) high-performance automatic differentiation and (iv) continuous relaxations of discrete systems.
The growing availability of hardware accelerators such as GPUs has in the past years motivated a general MCMC strategy, whereby we run many chains in parallel with a short sampling phase, rather than a few chains with a long sampling phase. Unfortunately, existing convergence diagnostics are not designed for the "many short chains" regime. This is notably the case of the popular R-hat statistic, which claims convergence only if the effective sample size per chain is large. We present the nested R-hat, denoted nR-hat, a generalization of R-hat which does not conflate short chains and poor mixing, and offers a useful diagnostic provided we run enough chains and meet certain initialization conditions. Combined with nR-hat, the short-chain regime presents us with the opportunity to identify optimal lengths for the warmup and sampling phases, as well as the optimal number of chains: tuning parameters of MCMC which are otherwise chosen using heuristics or trial-and-error.
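For orientation, here is the classic (non-nested) R-hat that the chapter generalizes, in a compact NumPy sketch; this is the textbook formula, not the thesis's nested variant:

```python
import numpy as np

def r_hat(chains):
    """Classic potential scale reduction factor.
    chains: array of shape (n_chains, n_draws) for one scalar quantity."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()     # within-chain variance
    B = n * chain_means.var(ddof=1)           # between-chain variance
    var_hat = (n - 1) / n * W + B / n         # pooled variance estimate
    return np.sqrt(var_hat / W)

# Example: 4 chains targeting N(0, 1), one stuck at an offset mode
rng = np.random.default_rng(2)
good = rng.standard_normal((4, 1000))
bad = good.copy()
bad[0] += 3.0
print(r_hat(good), r_hat(bad))   # ~1.00 vs noticeably > 1
```

With short chains each chain explores only part of the target, so W underestimates the posterior variance and the classic statistic stays above 1 even when the pooled draws are adequate; as the abstract notes, the nested construction avoids conflating this with genuine non-convergence.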
We next focus on semi-specialized algorithms for latent Gaussian models, arguably the most widely used class of hierarchical models. It is well understood that MCMC often struggles with the geometry of the posterior distribution generated by these models. Using a Laplace approximation, we marginalize out the latent Gaussian variables and then integrate the remaining parameters with Hamiltonian Monte Carlo (HMC), a gradient-based MCMC. This approach combines MCMC and a distributional approximation, and offers a useful alternative to pure MCMC or pure approximation methods such as Variational Inference. We compare the three paradigms across a range of general linear models, which admit sophisticated priors, namely a Gaussian process prior and a horseshoe prior. To implement our scheme efficiently, we derive a novel automatic differentiation method called the adjoint-differentiated Laplace approximation. This differentiation algorithm propagates the minimal information needed to construct the gradient of the approximate marginal likelihood, and yields a scalable method that is orders of magnitude faster than state-of-the-art differentiation for high-dimensional hyperparameters. We next discuss the application of our algorithm to models with an unconventional likelihood, going beyond the classical setting of general linear models. This necessitates a non-trivial generalization of the adjoint-differentiated Laplace approximation, which we implement using higher-order adjoint methods. The resulting scheme works out to be both more general and more efficient. We apply it to an unconventional latent Gaussian model, identifying promising features and highlighting persistent challenges.
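To fix ideas, here is a standard embedded-Laplace computation in NumPy (following the classical GP recipe, not the thesis's adjoint-differentiated implementation; the Poisson likelihood and covariance kernel are illustrative assumptions). It approximates the log marginal likelihood that HMC would then explore over the hyperparameters:

```python
import numpy as np

def laplace_log_marginal(K, y, n_iter=20):
    """Laplace approximation to the log marginal likelihood of a latent
    Gaussian model with Poisson observations, y_i ~ Poisson(exp(theta_i)),
    theta ~ N(0, K). Newton iterations follow the standard GP recipe."""
    n = len(y)
    theta = np.zeros(n)
    for _ in range(n_iter):
        W = np.exp(theta)                      # negative log-lik. curvature
        grad = y - np.exp(theta)               # log-lik. gradient
        sW = np.sqrt(W)
        L = np.linalg.cholesky(np.eye(n) + sW[:, None] * K * sW[None, :])
        b = W * theta + grad
        a = b - sW * np.linalg.solve(L.T, np.linalg.solve(L, sW * (K @ b)))
        theta = K @ a                          # Newton update of the mode
    loglik = np.sum(y * theta - np.exp(theta))  # up to the log(y!) constant
    return loglik - 0.5 * a @ theta - np.sum(np.log(np.diag(L)))

# Toy example: squared-exponential prior covariance over 50 points
x = np.linspace(0, 1, 50)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.1 ** 2) + 1e-8 * np.eye(50)
y = np.random.default_rng(3).poisson(2.0, size=50)
print(laplace_log_marginal(K, y))
```

HMC needs only the gradient of this scalar with respect to the hyperparameters, which is exactly the quantity the adjoint-differentiated Laplace approximation computes cheaply.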
The final chapter of this dissertation focuses on a specific but rich problem: the Ising model of Statistical Physics, and its generalizations, the Potts and spin glass models. These models are challenging because they are discrete, precluding the immediate use of gradient-based algorithms, and exhibit multiple modes, notably at low temperatures. We propose a new class of MCMC algorithms to draw samples from Potts models by augmenting the target space with a carefully constructed auxiliary Gaussian variable. In contrast to existing methods of a similar flavor, our algorithm can take advantage of the low-rank structure of the coupling matrix and scales linearly with the number of states in a Potts model. The method is applied to a broad range of coupling and temperature regimes and compared to several sampling methods, allowing us to paint a nuanced algorithmic landscape.
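For contrast, here is the textbook baseline such samplers are measured against: a single-site Metropolis sweep for the 2D Ising model (a standard sampler, not the auxiliary-Gaussian algorithm proposed in the thesis):

```python
import numpy as np

rng = np.random.default_rng(4)
L_side, beta, n_sweeps = 32, 0.6, 200      # lattice size, inverse temperature
spins = rng.choice([-1, 1], size=(L_side, L_side))

for _ in range(n_sweeps):
    for i in range(L_side):
        for j in range(L_side):
            # Energy change from flipping spin (i, j), periodic boundaries
            nb = (spins[(i + 1) % L_side, j] + spins[(i - 1) % L_side, j]
                  + spins[i, (j + 1) % L_side] + spins[i, (j - 1) % L_side])
            dE = 2 * spins[i, j] * nb
            if rng.random() < np.exp(-beta * dE):   # Metropolis acceptance
                spins[i, j] *= -1

print("magnetization per spin:", spins.mean())
```

At low temperature this sampler gets trapped in a single magnetization mode, which is precisely the multimodality problem the auxiliary Gaussian construction targets.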
Global and Local Information in Clustering Labeled Block Models
The stochastic block model is a classical cluster-exhibiting random graph
model that has been widely studied in statistics, physics and computer science.
In its simplest form, the model is a random graph with two equal-sized
clusters, with intra-cluster edge probability p, and inter-cluster edge
probability q. We focus on the sparse case, i.e., p, q = O(1/n), which is
practically more relevant and also mathematically more challenging. A
conjecture of Decelle, Krzakala, Moore and Zdeborova, based on ideas from
statistical physics, predicted a specific threshold for clustering. The
negative direction of the conjecture was proved by Mossel, Neeman and Sly (2012), and more recently the positive direction was proved independently by Massoulie and by Mossel, Neeman, and Sly.
In many real network clustering problems, nodes contain information as well.
We study the interplay between node and network information in clustering by
studying a labeled block model, where in addition to the edge information, the
true cluster labels of a small fraction of the nodes are revealed. In the case
of two clusters, we show that below the threshold, a small amount of node
information does not affect recovery. On the other hand, we show that for any small amount of information, efficient local clustering is achievable as long as
the number of clusters is sufficiently large (as a function of the amount of
revealed information).
Comment: 24 pages, 2 figures. A short abstract describing these results will appear in the proceedings of RANDOM 2014.
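The sparse regime is easy to reproduce numerically. The sketch below (an illustration of the model setup, not the paper's method) samples a two-cluster SBM above the detection threshold (a - b)^2 > 2(a + b), where p = a/n and q = b/n, then attempts recovery with a naive spectral heuristic and scores it against the planted partition:

```python
import numpy as np
import networkx as nx
from sklearn.metrics import adjusted_rand_score

n, a, b = 2000, 12, 2   # p = a/n, q = b/n; here (a-b)^2 = 100 > 2(a+b) = 28
sizes = [n // 2, n // 2]
probs = [[a / n, b / n], [b / n, a / n]]
G = nx.stochastic_block_model(sizes, probs, seed=0)
truth = np.array([0] * sizes[0] + [1] * sizes[1])

# Spectral heuristic: sign of the eigenvector for the second-largest
# eigenvalue of the adjacency matrix (eigh returns ascending order)
A = nx.to_numpy_array(G)
eigvals, eigvecs = np.linalg.eigh(A)
labels = (eigvecs[:, -2] > 0).astype(int)

print("ARI:", adjusted_rand_score(truth, labels))
```

In the truly sparse regime, vanilla adjacency spectra are distorted by high-degree vertices; the positive results cited above rest on more robust constructions, such as spectra of path-counting or non-backtracking operators.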