Probabilistic language models with model efficiency and data efficiency
Probabilistic language models have delivered remarkable performance improvements in natural language processing (NLP). This dissertation presents new approaches addressing three facets of language models from a probabilistic standpoint. The first concerns the attention mechanisms intrinsic to transformer architectures. Since the advent of the self-attention transformer, attention mechanisms have laid the groundwork for numerous state-of-the-art models. The central proposal here is alignment attention, which regularizes the query and key projection matrices within each self-attention layer. Second, despite the impressive performance of large language models across a range of applications, they typically demand vast labeled datasets. We propose an active learning method that selects samples to label using an acquisition function based on local sensitivity and learning difficulty: it creates data replicas via subtle perturbations and prioritizes the data points whose predictive likelihoods differ most from those of their replicas. Third, we turn to improving inference efficiency. We propose a switchable decision mechanism that accelerates inference by dynamically assigning computational resources to each data instance. By automatically deciding where to skip computation, and balancing quality against computational cost with constrained optimization, our dynamic neural generation networks enforce an efficient inference path and find an optimized trade-off. All three strategies consistently surpass established benchmarks in performance.
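The replica-based acquisition step described above can be sketched in a few lines. This is a minimal illustration, not the dissertation's exact method: the function names, the Gaussian perturbation model, and the KL-divergence score are all assumed choices for the "most distinct predictive likelihoods" criterion.

```python
import numpy as np

def acquisition_scores(model, pool, n_replicas=4, sigma=0.01, rng=None):
    """Toy local-sensitivity acquisition: score each unlabeled point by how
    much the model's predictive distribution shifts under small input
    perturbations. `model` is any callable mapping a batch of inputs to
    class probabilities. Gaussian noise and KL divergence are illustrative
    stand-ins for the dissertation's acquisition function."""
    rng = np.random.default_rng(rng)
    base = model(pool)                       # (N, C) predictive probabilities
    scores = np.zeros(len(pool))
    for _ in range(n_replicas):
        noisy = pool + sigma * rng.standard_normal(pool.shape)
        pert = model(noisy)
        # KL divergence between original and replica predictions
        scores += np.sum(base * (np.log(base + 1e-12) - np.log(pert + 1e-12)),
                         axis=1)
    return scores / n_replicas               # higher = more locally sensitive

def select_to_label(model, pool, k, **kw):
    """Pick the k most locally sensitive points to label next."""
    return np.argsort(-acquisition_scores(model, pool, **kw))[:k]
```

In an active learning loop, the selected indices would be sent to an annotator and the model retrained on the enlarged labeled set.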
Assessing, testing, and challenging the computational power of quantum devices
Randomness is an intrinsic feature of quantum theory. The outcome of any measurement will be random, sampled from a probability distribution that is defined by the measured quantum state. The task of sampling from a prescribed probability distribution therefore seems to be a natural technological application of quantum devices. And indeed, certain random sampling tasks have been proposed to experimentally demonstrate the speedup of quantum over classical computation, so-called "quantum computational supremacy".
In the research presented in this thesis, I investigate the complexity-theoretic and physical foundations of quantum sampling algorithms. Using the theory of computational complexity, I assess the computational power of natural quantum simulators and close loopholes in the complexity-theoretic argument for the classical intractability of quantum samplers (Part I). In particular, I prove anticoncentration for quantum circuit families that give rise to a 2-design and review methods for proving average-case hardness. I present quantum random sampling schemes that are tailored to large-scale quantum simulation hardware while meeting the highest standards of complexity-theoretic rigor. Using methods from property testing and quantum system identification, I shed light on how, and under which conditions, quantum sampling devices can be tested or verified in regimes that cannot be simulated on classical computers (Part II). I present a no-go result that rules out efficient verification of quantum random sampling schemes, as well as approaches by which this no-go result can be circumvented. In particular, I develop fully efficient verification protocols in what I call the measurement-device-dependent scenario, in which single-qubit measurements are assumed to function with high accuracy. Finally, I seek to understand the physical mechanisms governing the computational boundary between classical and quantum computing devices by challenging their computational power with tools from computational physics and the theory of computational complexity (Part III). I develop efficiently computable measures of the infamous Monte Carlo sign problem and assess them in terms of both their practicability as tools for easing the sign problem and the computational complexity of that task.
An overarching theme of the thesis is the quantum sign problem, which arises from destructive interference between paths, an intrinsically quantum effect. The (non-)existence of a sign problem takes on the role of a criterion delineating the boundary between classical and quantum computing devices. I begin the thesis by identifying the quantum sign problem as a root of the computational intractability of quantum output probabilities. It turns out that the intricate structure of the probability distributions to which the sign problem gives rise prohibits their verification from few samples. In an ironic twist, I show that assessing the intrinsic sign problem of a quantum system is itself an intractable problem.
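The standard Monte Carlo diagnostic behind such measures can be illustrated in a few lines. Both functions below are simplified stand-ins, assumed for illustration, and not the thesis's specific measures.

```python
import numpy as np

def average_sign(weights):
    """Monte Carlo average sign: Z / Z_abs for a list of (possibly negative)
    path weights. A value near 1 means no sign problem; a value near 0 means
    severe cancellation between paths. This is the textbook diagnostic, not
    the thesis's refined measures."""
    w = np.asarray(weights, dtype=float)
    return w.sum() / np.abs(w).sum()

def nonstoquasticity(H):
    """Toy measure of how far a real Hamiltonian matrix is from stoquastic
    form: the total magnitude of its *positive* off-diagonal entries, which
    are the ones that induce signs in the path integral. A hypothetical
    simplification for illustration."""
    off = H - np.diag(np.diag(H))
    return np.clip(off, 0.0, None).sum()
```

A stoquastic Hamiltonian (all off-diagonal entries non-positive) scores zero under the second measure, matching the intuition that it admits a sign-problem-free representation.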
On computationally efficient learning for stabilizers and beyond
Artificial intelligence, big data, machine learning, neural networks: look up any recent research proposal and with good probability at least one of these phrases will appear. It's no secret that learning has taken this era of computer science by storm in our attempt to create software that performs extremely complicated tasks. Since quantum mechanics is one of the most accurate models of our physical world currently known, it makes sense to ask what kinds of quantum systems can or cannot be learned. As with many problems in quantum information and quantum computing, the simplest non-trivial versions of these problems start with the stabilizer formalism. In this dissertation, we examine learning problems centered around the stabilizer formalism in various models from a theoretical standpoint, using the tools of computer science and quantum information. Specifically, our focus is on computational complexity rather than sample complexity. We begin by looking at learning in the tomographic sense. Here, one has black-box access to copies of an unknown quantum state |ψ⟩ and wants to learn properties of the state, or to output an approximation of |ψ⟩ outright. In this setting, [Mon17] gave an efficient learning algorithm for stabilizer states. The key algorithmic tool was Bell difference sampling, which allows one to sample from the stabilizer group of a stabilizer state. [GNW21] extended the analysis of Bell difference sampling beyond stabilizer states. Throughout Part I we turn to Bell difference sampling to improve upon learning algorithms for states with only a few (i.e., either O(log n) or strictly fewer than n, depending on context) T gates. By using symplectic Fourier analysis, the generalization of Boolean Fourier analysis to the symplectic vector space F_2^{2n}, we derive powerful tools for understanding the Bell difference sampling distribution.
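The symplectic form on F_2^{2n} underlying this Fourier analysis is easy to state concretely. The (a | b) coordinate ordering below is one common convention for splitting a vector into its X- and Z-parts, and is an assumption of this sketch.

```python
import numpy as np

def symplectic_product(x, y, n):
    """Symplectic inner product on F_2^{2n}: writing x = (a | b) and
    y = (c | d) with a, b, c, d in F_2^n, the form is
    [x, y] = a.d + b.c (mod 2). This is the bilinear form behind the
    symplectic Fourier analysis of the Bell difference sampling
    distribution; the coordinate split is a convention."""
    x, y = np.asarray(x) % 2, np.asarray(y) % 2
    a, b = x[:n], x[n:]
    c, d = y[:n], y[n:]
    return int((a @ d + b @ c) % 2)
```

Note that over F_2 this form is both alternating ([x, x] = 0) and symmetric, which is what makes the characteristic-2 Fourier theory behave differently from its real counterpart.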
With these tools we first give a tolerant property testing algorithm for stabilizer states. That is, we give an algorithm that distinguishes whether a state is ε₁-close to some stabilizer state or ε₂-far from all stabilizer states, for certain parameter regimes of ε₁ and ε₂. We use our improved knowledge of Bell difference sampling to improve upon the completeness and soundness analysis of the property tester given by [GNW21], which is not tolerant. A second application is stabilizer fidelity estimation and approximation. Given a state |ψ⟩ that is O(1)-close to a stabilizer state, we output such a stabilizer state in time 2^{O(n)}. This beats the previous 2^{O(n²)} brute-force search algorithm. Having such a stabilizer state also lets us determine how close |ψ⟩ is to being a stabilizer state. A third application is extending Montanaro's learning algorithm to the outputs of Clifford + O(log n) non-Clifford gate circuits. More generally, our algorithm interpolates between Montanaro's algorithm and pure state tomography algorithms, with runtime poly(n)·exp(t), where t is the number of non-Clifford gates. This asymptotically matches the runtime of classical simulation algorithms for such circuits. A key algorithmic step in this work is the ability to "compress" the "stabilizer-ness" of a state onto a few qubits, allowing the "non-stabilizer-ness" to be brute-forced on the remaining qubits. Our final application is pseudorandomness lower bounds. Introduced by [JLS18], a pseudorandom quantum state ensemble is a set of quantum states that are computationally indistinguishable from Haar random. By repurposing the algorithms above, we produce a test that behaves differently when given a state produced by a Clifford + T circuit with fewer than n T gates versus a Haar random state. We note that this is tight assuming the existence of linear-time quantum-secure one-way functions.
Pivoting now, we also study the stabilizer formalism in the PAC learning framework proposed by [Val84]. Here one does not have control over the measurements, but must make do regardless (within information-theoretic limits). We analyze the problem in two ways. First, we show that, unlike stabilizer states, learning the associated Clifford unitaries in the proper PAC model is NP-hard. This is done by a reduction from the problem of finding a full-rank matrix in an affine subspace of matrices over F_2. Second, we study stabilizer states in the presence of noise. We utilize the Statistical Query framework, a popular modification of the PAC learning framework that is inherently tolerant to noise. There, we again show hardness, by a reduction from Learning Parities with Noise. This gives evidence that even in the PAC model, stabilizer states are hard to learn with noise.
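The Learning Parities with Noise problem at the heart of that reduction is simple to state as a sampling process. The generator below (`lpn_samples` is a hypothetical name chosen here) shows why noise is the whole difficulty: without it, Gaussian elimination recovers the secret from n samples.

```python
import numpy as np

def lpn_samples(secret, m, noise_rate, rng=None):
    """Generate m Learning-Parities-with-Noise samples (a, <a, s> + e mod 2):
    each a is uniform over F_2^n, and each label is flipped independently
    with probability noise_rate. With noise_rate = 0 the secret s is
    recoverable by linear algebra; with constant noise the problem is
    believed to be hard, which underlies the Statistical Query hardness
    result described above. Toy illustration only."""
    rng = np.random.default_rng(rng)
    s = np.asarray(secret) % 2
    A = rng.integers(0, 2, size=(m, len(s)))     # random parity queries
    noise = rng.random(m) < noise_rate           # independent label flips
    labels = (A @ s + noise) % 2
    return A, labels
```

A Statistical Query learner only sees noisy expectations of such samples, which is exactly the access model in which the hardness is shown.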
Watermarking Cryptographic Functionalities from Standard Lattice Assumptions
A software watermarking scheme allows one to embed a mark into a program without significantly altering the behavior of the program. Moreover, it should be difficult to remove the watermark without destroying the functionality of the program. Recently, Cohen et al. (STOC 2016) and Boneh et al. (PKC 2017) showed how to watermark cryptographic functions such as PRFs using indistinguishability obfuscation. Notably, in their constructions, the watermark remains intact even against arbitrary removal strategies. A natural question is whether we can build watermarking schemes from standard assumptions that achieve this strong mark-unremovability property.
We give the first construction of a watermarkable family of PRFs that satisfies this strong mark-unremovability property from standard lattice assumptions (namely, the learning with errors (LWE) and the one-dimensional short integer solution (SIS) problems). As part of our construction, we introduce a new cryptographic primitive called a translucent PRF. Next, we give a concrete construction of a translucent PRF family from standard lattice assumptions. Finally, we show that using our new lattice-based translucent PRFs, we obtain the first watermarkable family of PRFs with strong unremovability against arbitrary strategies from standard assumptions.
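To fix ideas, the interface of a watermarking scheme can be sketched with a deliberately insecure toy: it "embeds" the mark by reserving a hidden input point whose output encodes it, so the mark is trivially removable. This is emphatically not the lattice-based construction, which uses translucent PRFs precisely so that the mark survives arbitrary tampering; all names here are invented for illustration.

```python
import hashlib, os

class ToyWatermarkablePRF:
    """Interface sketch only. A watermarkable PRF family supports key
    generation, evaluation, mark embedding, and mark extraction. This toy
    hides the mark at a secret trigger input -- an adversary who finds and
    reroutes that single point removes the mark, which is exactly the
    attack the real construction must survive."""
    def __init__(self):
        self.key = os.urandom(16)       # PRF key
        self.trigger = os.urandom(16)   # hidden point known to the extractor
        self.mark = None
    def eval(self, x: bytes) -> bytes:
        if self.mark is not None and x == self.trigger:
            return self.mark            # marked program deviates here only
        return hashlib.sha256(self.key + x).digest()
    def embed(self, mark: bytes):
        """Embed a mark; behavior changes on a single hidden input."""
        self.mark = hashlib.sha256(mark).digest()
    def extract(self, circuit) -> bytes:
        """Query the (possibly modified) program at the trigger point."""
        return circuit(self.trigger)
```

The real scheme's guarantee is that no efficient adversary, given the marked program, can produce a functionally close program from which extraction fails.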
LIPIcs, Volume 244, ESA 2022, Complete Volume