
    A Framework for Monte Carlo based Multiple Testing

    We consider the situation in which one wishes to test multiple hypotheses with tests whose p-values cannot be computed explicitly but can be approximated using Monte Carlo simulation. This scenario occurs widely in practice. We are interested in obtaining the same rejections and non-rejections as would be obtained if the p-values for all hypotheses were available. The present article introduces a framework for this scenario by providing a generic algorithm for a general multiple testing procedure. We establish conditions which guarantee that the rejections and non-rejections obtained through Monte Carlo simulation are identical to the ones obtained with the p-values. Our framework is applicable to a general class of step-up and step-down procedures, which includes many established multiple testing corrections such as those of Bonferroni, Holm, Šidák, Hochberg, and Benjamini-Hochberg. Moreover, we show how to use our framework to improve algorithms available in the literature so as to yield theoretical guarantees on their results. These modifications can easily be implemented in practice and lead to a particular way of reporting multiple testing results: three sets together with an error bound on their correctness, demonstrated on a real biological dataset.
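
    As a minimal sketch of this style of reporting, assume each p-value is estimated from Monte Carlo exceedance counts and wrapped in a Clopper-Pearson confidence interval, which is then compared to a Bonferroni-corrected threshold. The interval choice, the correction, and all names below are illustrative assumptions, not the article's exact algorithm.

        # Sketch: classify hypotheses into three sets (rejected, non-rejected,
        # undecided) from Monte Carlo exceedance counts. Illustrative only.
        import numpy as np
        from scipy.stats import beta

        def clopper_pearson(k, n, eps):
            """Two-sided (1 - eps) confidence interval for a binomial proportion."""
            lo = beta.ppf(eps / 2, k, n - k + 1) if k > 0 else 0.0
            hi = beta.ppf(1 - eps / 2, k + 1, n - k) if k < n else 1.0
            return lo, hi

        def classify(exceedances, n, alpha=0.05, eps=1e-3):
            """exceedances[i]: null samples at least as extreme as the observed
            statistic for hypothesis i; n: Monte Carlo samples per hypothesis."""
            m = len(exceedances)
            threshold = alpha / m              # Bonferroni-corrected level
            rejected, non_rejected, undecided = [], [], []
            for i, k in enumerate(exceedances):
                lo, hi = clopper_pearson(k, n, eps / m)   # split error bound
                if hi < threshold:
                    rejected.append(i)         # p-value provably below threshold
                elif lo > threshold:
                    non_rejected.append(i)     # p-value provably above threshold
                else:
                    undecided.append(i)        # needs more Monte Carlo samples
            return rejected, non_rejected, undecided

    In this sketch, a union bound over the per-test intervals gives probability at least 1 - eps that every hypothesis in the first two sets received the decision it would receive under the exact p-values; the third set quantifies the remaining uncertainty.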

    QuickMMCTest - Quick Multiple Monte Carlo Testing

    Multiple hypothesis testing is widely used to evaluate scientific studies involving statistical tests. However, for many of these tests, p-values are not available analytically and are often approximated using Monte Carlo tests such as permutation or bootstrap tests. This article presents a simple algorithm based on Thompson sampling to test multiple hypotheses. It works with arbitrary multiple testing procedures, in particular with step-up and step-down procedures. Its main feature is to allocate Monte Carlo effort sequentially, generating more Monte Carlo samples for tests whose decisions are so far less certain. A simulation study demonstrates that, at low computational effort, the new approach yields higher power and a higher degree of reproducibility than previously suggested methods.
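
    The allocation step can be sketched as follows: maintain a Beta posterior for each unknown p-value, draw from the posteriors, and spend the next batch of Monte Carlo samples on hypotheses whose sampled decision disagrees with the decision implied by the current point estimate. The batch size, the Bonferroni threshold, and the disagreement rule are simplifying assumptions, not the exact QuickMMCTest algorithm.

        # Sketch of Thompson-sampling-style allocation of Monte Carlo effort.
        import numpy as np

        def allocate(draw_exceedances, m, budget, batch=100, alpha=0.05):
            """draw_exceedances(i, n): count of exceedances in n new null samples
            for hypothesis i (user-supplied simulation routine)."""
            rng = np.random.default_rng(0)
            k = np.zeros(m)                    # exceedance counts per hypothesis
            n = np.zeros(m)                    # samples drawn so far
            threshold = alpha / m              # Bonferroni, for simplicity
            spent = 0
            while spent < budget:
                p_draw = rng.beta(k + 1, n - k + 1)       # posterior draws
                p_hat = (k + 1) / (n + 1)                 # point estimates
                unstable = (p_draw <= threshold) != (p_hat <= threshold)
                targets = np.flatnonzero(unstable)
                if targets.size == 0:          # fall back to the closest calls
                    targets = np.argsort(np.abs(p_hat - threshold))[:max(1, m // 10)]
                for i in targets:              # refine the uncertain hypotheses
                    k[i] += draw_exceedances(i, batch)
                    n[i] += batch
                    spent += batch
            return (k + 1) / (n + 1)           # final p-value estimates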

    Statistical Methods for Monte-Carlo based Multiple Hypothesis Testing

    Statistical hypothesis testing is a key technique for statistical inference. The main focus of this work is multiple testing under the assumption that the analytical p-values underlying the tests for all hypotheses are unknown. Instead, we assume that they can be approximated by drawing Monte Carlo samples under the null. The first part of this thesis focuses on the computation of test results with a guarantee on their correctness, that is, decisions on multiple hypotheses which are identical to the ones obtained with the unknown p-values. We present MMCTest, an algorithm implementing a multiple testing procedure which yields correct decisions on all hypotheses (up to a pre-specified error probability) based solely on Monte Carlo simulation. MMCTest offers novel ways to evaluate multiple hypotheses, as it allows one to obtain the (previously unknown) correct decision on hypotheses (for instance, genes) in real data studies (again, up to an error probability pre-specified by the user). The ideas behind MMCTest are generalised in a framework for Monte Carlo based multiple testing, demonstrating that existing methods which give no guarantees on their test results can be modified to yield theoretical guarantees on the correctness of their outputs.

    The second part deals with multiple testing from a practical perspective. In practice, it might be desirable to forgo the additional computational effort needed for guaranteed decisions and to invest it instead in the computation of a more accurate ad-hoc test result. This is attempted by QuickMMCTest, an algorithm which adaptively allocates more samples to hypotheses whose decisions are more prone to random fluctuations, thereby achieving improved accuracy. This work also derives the optimal allocation of a finite number of samples to finitely many hypotheses under a normal approximation, where the optimal allocation is understood as the one minimising the expected number of erroneously classified hypotheses (with respect to the classification based on the analytical p-values). An empirical comparison of the optimal allocation of samples to the one computed by QuickMMCTest indicates that the behaviour of QuickMMCTest is not far from optimal.
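
    The optimal-allocation idea admits a toy illustration: under the normal approximation, the probability that an estimated p-value lands on the wrong side of the testing threshold is an explicit function of the number of samples, so a budget can be distributed greedily by marginal gain. The greedy rule below is a simplification for exposition, not the thesis's derivation, and in practice the true p-values would be replaced by plug-in estimates.

        # Toy greedy allocation minimising expected misclassifications under a
        # normal approximation; p is assumed known here only for illustration.
        import numpy as np
        from scipy.stats import norm

        def misclass_prob(p, n, t):
            """P(p-hat falls on the wrong side of threshold t), given n samples."""
            se = np.sqrt(p * (1 - p) / n)
            return norm.cdf(-np.abs(p - t) / se)

        def greedy_allocation(p, total, t=0.05, step=100):
            m = len(p)
            n = np.full(m, step, dtype=float)  # seed each hypothesis with one batch
            remaining = total - m * step
            while remaining > 0:
                gain = misclass_prob(p, n, t) - misclass_prob(p, n + step, t)
                n[int(np.argmax(gain))] += step  # spend where the gain is largest
                remaining -= step
            return n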

    Penalized Principal Component Analysis using Nesterov Smoothing

    Principal components computed via PCA (principal component analysis) are traditionally used to reduce dimensionality in genomic data or to correct for population stratification. In this paper, we explore the penalized eigenvalue problem (PEP), which reformulates the computation of the first eigenvector as an optimization problem and adds an L1 penalty constraint. The contribution of our article is threefold. First, we extend PEP by applying Nesterov smoothing to the original LASSO-type L1 penalty. This allows one to compute analytical gradients, which enable faster and more efficient minimization of the objective function associated with the optimization problem. Second, we demonstrate how higher-order eigenvectors can be calculated with PEP using established results from singular value decomposition (SVD). Third, using data from the 1000 Genomes Project, we empirically demonstrate that our proposed smoothed PEP increases numerical stability and yields meaningful eigenvectors. We further investigate the utility of the penalized eigenvector approach over traditional PCA.
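
    Nesterov smoothing of the L1 penalty yields the Huber function, whose gradient is available in closed form, so the first penalized eigenvector can be approximated by projected gradient ascent on the unit sphere. The step size, smoothing parameter, and update scheme below are illustrative assumptions, not the paper's exact algorithm.

        # Sketch: smoothed penalized eigenvalue problem (PEP) for the first
        # eigenvector of a symmetric matrix S (e.g., a covariance matrix).
        import numpy as np

        def huber_grad(v, mu):
            """Gradient of the Nesterov-smoothed absolute value, entrywise."""
            return np.clip(v / mu, -1.0, 1.0)

        def smoothed_pep(S, lam=0.1, mu=1e-3, lr=1e-2, iters=2000):
            """Approximate argmax over unit vectors of v'Sv - lam*||v||_1 (smoothed)."""
            rng = np.random.default_rng(0)
            v = rng.standard_normal(S.shape[0])
            v /= np.linalg.norm(v)
            for _ in range(iters):
                grad = 2 * S @ v - lam * huber_grad(v, mu)  # analytic gradient
                v = v + lr * grad
                v /= np.linalg.norm(v)         # project back onto the unit sphere
            return v

    Higher-order eigenvectors can then be obtained by deflation, e.g. replacing S with S - (v'Sv) vv' and repeating, in line with the SVD-based argument mentioned above.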

    Initial state encoding via reverse quantum annealing and h-gain features

    Quantum annealing is a specialized type of quantum computation that aims to use quantum fluctuations in order to obtain global minimum solutions of combinatorial optimization problems. D-Wave Systems, Inc., manufactures quantum annealers, which are available as cloud computing resources and allow users to program the anneal schedules used in the annealing computation. In this paper, we are interested in improving the quality of the solutions returned by a quantum annealer by encoding an initial state. We explore two D-Wave features that allow one to encode such an initial state: the reverse annealing and the h-gain features. Reverse annealing (RA) aims to refine a known solution following an anneal path that starts with a classical state representing a good solution, goes backwards to a point where a transverse field is present, and then finishes with a forward anneal. The h-gain (HG) feature allows one to put a time-dependent weighting scheme on the linear (h) biases of the Hamiltonian, and we demonstrate that this feature can likewise be used to bias the annealing to start from an initial state. We also consider a hybrid method consisting of a backward phase resembling RA and a forward phase using the HG initial state encoding. Importantly, we investigate the idea of iteratively applying RA and HG to a problem, with the goal of monotonically improving on an initial state that is not optimal. The HG encoding technique is evaluated on a variety of input problems, including the weighted Maximum Cut and weighted Maximum Clique problems, demonstrating that the HG technique is a viable alternative to RA for some problems. We also investigate how the iterative procedures perform for both RA and HG initial state encoding on random spin glasses with the native connectivity of the D-Wave Chimera and Pegasus chips.
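
    In the Ocean SDK, both encodings reduce to parameters of the sampling call. The sketch below is illustrative only: the schedule and gain values are arbitrary, the toy problem must in practice use active qubits and couplers of the target chip, and the reverse-annealing and h-gain features are only available on solvers that support them.

        # Sketch: reverse annealing vs. h-gain initial state encoding (Ocean SDK).
        from dwave.system import DWaveSampler

        sampler = DWaveSampler()               # requires D-Wave API access

        h = {0: 0.0, 4: 0.0}                   # toy Ising problem; real runs must
        J = {(0, 4): -1.0}                     # use active qubits/couplers
        initial_state = {0: 1, 4: 1}           # classical state to start from

        # Reverse annealing: start at s = 1 in initial_state, anneal backwards
        # to s = 0.45, pause, then anneal forward again.
        ra = sampler.sample_ising(
            h, J,
            anneal_schedule=[[0.0, 1.0], [5.0, 0.45], [15.0, 0.45], [20.0, 1.0]],
            initial_state=initial_state,
            reinitialize_state=True,
            num_reads=100,
        )

        # h-gain encoding: amplify linear biases pointing toward initial_state
        # early in the anneal, then fade the gain to zero (a problem with its
        # own linear terms would combine them with this encoding bias).
        hg = sampler.sample_ising(
            {q: -s for q, s in initial_state.items()},  # bias qubit q toward s
            J,
            h_gain_schedule=[[0.0, 4.0], [10.0, 0.0], [20.0, 0.0]],
            annealing_time=20,
            num_reads=100,
        )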

    Advanced unembedding techniques for quantum annealers

    The D-Wave quantum annealers make it possible to obtain high-quality solutions of NP-hard problems by mapping a problem in QUBO (quadratic unconstrained binary optimization) or Ising form onto the physical qubit connectivity structure of the D-Wave chip. However, the latter is restricted in that only a fraction of all pairwise couplers between physical qubits exists. Modeling the connectivity structure of a given problem instance thus necessitates computing a minor embedding of the variables in the problem specification onto the logical qubits, each of which consists of several physical qubits "chained" together to act as one. After annealing, however, it is not guaranteed that all chained qubits take the same value (-1 or +1 for an Ising model, 0 or 1 for a QUBO), and several approaches exist to assign a final value to each logical qubit (a process called "unembedding"). In this work, we present tailored unembedding techniques for four important NP-hard problems: Maximum Clique, Maximum Cut, Minimum Vertex Cover, and Graph Partitioning. Our techniques are simple and yet make use of structural properties of the problem being solved. Using Erdős–Rényi random graphs as inputs, we compare our unembedding techniques to three popular ones (majority vote, random weighting, and minimize energy). We demonstrate that our proposed algorithms outperform the currently available ones in that they yield solutions of better quality while being equally efficient computationally.
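
    As a contrast with the majority-vote baseline, the sketch below shows a hypothetical structure-aware rule for Maximum Clique: a vertex with a broken chain is kept only if it is adjacent to every vertex already selected, so the unembedded solution remains a clique. This illustrates the idea of exploiting problem structure and is not the paper's exact algorithms.

        # Sketch: majority-vote unembedding vs. a clique-aware rule.
        from collections import Counter

        def majority_vote(chain_values):
            """Baseline: a broken chain gets the value most of its qubits took."""
            return Counter(chain_values).most_common(1)[0][0]

        def clique_aware_unembed(embedding, sample, graph):
            """embedding: var -> physical qubits; sample: qubit -> 0/1;
            graph: var -> set of adjacent vars in the problem graph."""
            selected, broken = set(), []
            for var, chain in embedding.items():
                values = {sample[q] for q in chain}
                if len(values) == 1:           # intact chain: trust its value
                    if values.pop() == 1:
                        selected.add(var)
                else:
                    broken.append(var)         # resolve broken chains afterwards
            for var in broken:                 # keep the clique property intact
                if all(u in graph[var] for u in selected):
                    selected.add(var)
            return selected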

    Inferring the Dynamics of the State Evolution During Quantum Annealing

    To solve an optimization problem using a commercial quantum annealer, one has to represent the problem of interest as an Ising or quadratic unconstrained binary optimization (QUBO) problem and submit its coefficients to the annealer, which then returns a user-specified number of low-energy solutions. It would be useful to know what happens in the quantum processor during the anneal process, so that one could design better algorithms or suggest improvements to the hardware. However, existing quantum annealers cannot directly extract such information from the processor. Hence, in this work we propose to use advanced features of the D-Wave 2000Q to indirectly infer information about the dynamics of the state evolution during the anneal process. Specifically, the D-Wave 2000Q allows the user to customize the anneal schedule, that is, the schedule with which the anneal fraction is changed from the start to the end of the anneal. Using this feature, we design a set of modified anneal schedules whose outputs can be used to generate information about the states of the system at user-defined time points during a standard anneal. With this process, called "slicing", we obtain approximate distributions of the lowest-energy anneal solutions as the anneal time evolves. We use our technique to obtain a variety of insights into the annealer, such as how the state evolves during annealing and when individual bits in an evolving solution flip and stabilize, and we introduce a technique to estimate the freeze-out point of both the system as a whole and of individual qubits.
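
    A sliced schedule can be written down directly: anneal normally up to a chosen time point, then quench to s = 1 as quickly as possible, so that the returned samples approximate the state at the slice point. The 20 microsecond standard anneal and the quench rate below are illustrative assumptions; hardware limits on schedule slopes vary by chip.

        # Sketch: build "sliced" anneal schedules probing a standard 20 us anneal.
        def sliced_schedule(t_slice, s_slice, quench_us=1.0):
            """Anneal linearly to (t_slice, s_slice), then quench to s = 1."""
            return [[0.0, 0.0], [t_slice, s_slice], [t_slice + quench_us, 1.0]]

        # Slices at increasing points of a linear 20 us anneal (s = t / 20):
        schedules = [sliced_schedule(t, t / 20.0) for t in (4.0, 8.0, 12.0, 16.0)]
        # Each schedule is passed as anneal_schedule= to the sampler, and the
        # resulting distributions are compared across slice points.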