Probabilistic language models with model efficiency and data efficiency
Probabilistic language models have delivered remarkable performance improvements in natural language processing (NLP). This dissertation presents new approaches addressing three facets of language models from a probabilistic standpoint. The first concerns the attention mechanisms intrinsic to transformer architectures. Since the advent of the self-attention transformer, attention mechanisms have laid the groundwork for numerous state-of-the-art models. The central proposal here is alignment attention, which regularizes the query and key projection matrices within each self-attention layer. Second, despite the impressive performance of large language models across a range of applications, they typically demand vast labeled datasets. We propose an active learning method that selects samples to label using an acquisition function based on local sensitivity and learning difficulty: it creates data replicas via subtle perturbations and prioritizes the data points whose predictive likelihoods differ most from those of their replicas. Third, we turn to improving inference efficiency. We propose a switchable decision mechanism that accelerates inference by dynamically assigning computational resources to each data instance. By automatically deciding where to skip computation, and balancing quality against computational cost with constrained optimization, our dynamic neural generation networks enforce an efficient inference path and find an optimized trade-off. All three strategies consistently surpass established benchmarks in performance.
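The replica-based acquisition step described above can be sketched in a few lines. This is a minimal illustration, not the dissertation's exact method: the function names, the Gaussian perturbation model, and the KL-divergence score are all assumed choices for the "most distinct predictive likelihoods" criterion.

```python
import numpy as np

def acquisition_scores(model, pool, n_replicas=4, sigma=0.01, rng=None):
    """Toy local-sensitivity acquisition: score each unlabeled point by how
    much the model's predictive distribution shifts under small input
    perturbations. `model` is any callable mapping a batch of inputs to
    class probabilities. Gaussian noise and KL divergence are illustrative
    stand-ins for the dissertation's acquisition function."""
    rng = np.random.default_rng(rng)
    base = model(pool)                       # (N, C) predictive probabilities
    scores = np.zeros(len(pool))
    for _ in range(n_replicas):
        noisy = pool + sigma * rng.standard_normal(pool.shape)
        pert = model(noisy)
        # KL divergence between original and replica predictions
        scores += np.sum(base * (np.log(base + 1e-12) - np.log(pert + 1e-12)),
                         axis=1)
    return scores / n_replicas               # higher = more locally sensitive

def select_to_label(model, pool, k, **kw):
    """Pick the k most locally sensitive points to label next."""
    return np.argsort(-acquisition_scores(model, pool, **kw))[:k]
```

In an active learning loop, the selected indices would be sent to an annotator and the model retrained on the enlarged labeled set.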
Assessing, testing, and challenging the computational power of quantum devices
Randomness is an intrinsic feature of quantum theory. The outcome of any measurement will be random, sampled from a probability distribution that is defined by the measured quantum state. The task of sampling from a prescribed probability distribution therefore seems to be a natural technological application of quantum devices. And indeed, certain random sampling tasks have been proposed to experimentally demonstrate the speedup of quantum over classical computation, so-called "quantum computational supremacy".
In the research presented in this thesis, I investigate the complexity-theoretic and physical foundations of quantum sampling algorithms. Using the theory of computational complexity, I assess the computational power of natural quantum simulators and close loopholes in the complexity-theoretic argument for the classical intractability of quantum samplers (Part I). In particular, I prove anticoncentration for quantum circuit families that give rise to a 2-design and review methods for proving average-case hardness. I present quantum random sampling schemes that are tailored to large-scale quantum simulation hardware while meeting the highest standards of complexity-theoretic rigor. Using methods from property testing and quantum system identification, I shed light on how, and under which conditions, quantum sampling devices can be tested or verified in regimes that cannot be simulated on classical computers (Part II). I present a no-go result that rules out efficient verification of quantum random sampling schemes, as well as approaches by which this no-go result can be circumvented. In particular, I develop fully efficient verification protocols in what I call the measurement-device-dependent scenario, in which single-qubit measurements are assumed to function with high accuracy. Finally, I seek to understand the physical mechanisms governing the computational boundary between classical and quantum computing devices by challenging their computational power with tools from computational physics and the theory of computational complexity (Part III). I develop efficiently computable measures of the infamous Monte Carlo sign problem and assess them in terms of both their practicability as tools for easing the sign problem and the computational complexity of that task.
An overarching theme of the thesis is the quantum sign problem, which arises from destructive interference between paths, an intrinsically quantum effect. The (non-)existence of a sign problem takes on the role of a criterion delineating the boundary between classical and quantum computing devices. I begin the thesis by identifying the quantum sign problem as a root of the computational intractability of quantum output probabilities. It turns out that the intricate structure of the probability distributions to which the sign problem gives rise prohibits their verification from few samples. In an ironic twist, I show that assessing the intrinsic sign problem of a quantum system is itself an intractable problem.
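The standard Monte Carlo diagnostic behind such measures can be illustrated in a few lines. Both functions below are simplified stand-ins, assumed for illustration, and not the thesis's specific measures.

```python
import numpy as np

def average_sign(weights):
    """Monte Carlo average sign: Z / Z_abs for a list of (possibly negative)
    path weights. A value near 1 means no sign problem; a value near 0 means
    severe cancellation between paths. This is the textbook diagnostic, not
    the thesis's refined measures."""
    w = np.asarray(weights, dtype=float)
    return w.sum() / np.abs(w).sum()

def nonstoquasticity(H):
    """Toy measure of how far a real Hamiltonian matrix is from stoquastic
    form: the total magnitude of its *positive* off-diagonal entries, which
    are the ones that induce signs in the path integral. A hypothetical
    simplification for illustration."""
    off = H - np.diag(np.diag(H))
    return np.clip(off, 0.0, None).sum()
```

A stoquastic Hamiltonian (all off-diagonal entries non-positive) scores zero under the second measure, matching the intuition that it admits a sign-problem-free representation.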
On computationally efficient learning for stabilizers and beyond
Artificial intelligence, big data, machine learning, neural networks: look up any recent research proposal and with good probability at least one of these phrases will appear. It's no secret that learning has taken this era of computer science by storm in our attempt to create software that performs extremely complicated tasks. Since quantum mechanics is one of the most accurate models of our physical world currently known, it makes sense to ask what kinds of quantum systems can or cannot be learned. As with many problems in quantum information and quantum computing, the simplest non-trivial versions of these problems start with the stabilizer formalism. In this dissertation, we examine learning problems centered around the stabilizer formalism in various models from a theoretical standpoint, using the tools of computer science and quantum information. Specifically, our focus is on computational complexity rather than sample complexity. We begin by looking at learning in the tomographic sense. Here, one has black-box access to copies of an unknown quantum state |ψ⟩ and wants to learn properties of the state, or to output an approximation of |ψ⟩ outright. In this setting, [Mon17] gave an efficient learning algorithm for stabilizer states. The key algorithmic tool was Bell difference sampling, which allows one to sample from the stabilizer group of a stabilizer state. [GNW21] extended the analysis of Bell difference sampling beyond stabilizer states. Throughout Part I we turn to Bell difference sampling to improve upon learning algorithms for states with only a few (i.e., either O(log n) or strictly fewer than n, depending on context) T gates. By using symplectic Fourier analysis, the generalization of Boolean Fourier analysis to the symplectic vector space F_2^{2n}, we derive powerful tools for understanding the Bell difference sampling distribution.
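The symplectic form on F_2^{2n} underlying this Fourier analysis is easy to state concretely. The (a | b) coordinate ordering below is one common convention for splitting a vector into its X- and Z-parts, and is an assumption of this sketch.

```python
import numpy as np

def symplectic_product(x, y, n):
    """Symplectic inner product on F_2^{2n}: writing x = (a | b) and
    y = (c | d) with a, b, c, d in F_2^n, the form is
    [x, y] = a.d + b.c (mod 2). This is the bilinear form behind the
    symplectic Fourier analysis of the Bell difference sampling
    distribution; the coordinate split is a convention."""
    x, y = np.asarray(x) % 2, np.asarray(y) % 2
    a, b = x[:n], x[n:]
    c, d = y[:n], y[n:]
    return int((a @ d + b @ c) % 2)
```

Note that over F_2 this form is both alternating ([x, x] = 0) and symmetric, which is what makes the characteristic-2 Fourier theory behave differently from its real counterpart.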
With these tools we first give a tolerant property testing algorithm for stabilizer states. That is, we give an algorithm that distinguishes whether a state is ε₁-close to some stabilizer state or ε₂-far from all stabilizer states, for certain parameter regimes of ε₁ and ε₂. We use our improved knowledge of Bell difference sampling to improve upon the completeness and soundness analysis of the property tester given by [GNW21], which is not tolerant. A second application is stabilizer fidelity estimation and approximation. Given a state |ψ⟩ that is O(1)-close to a stabilizer state, we output such a stabilizer state in time 2^{O(n)}. This beats the previous 2^{O(n²)} brute-force search algorithm. Having such a stabilizer state also lets us determine how close |ψ⟩ is to being a stabilizer state. A third application is extending Montanaro's learning algorithm to the outputs of Clifford + O(log n) non-Clifford gate circuits. More generally, our algorithm interpolates between Montanaro's algorithm and pure state tomography algorithms, with runtime poly(n)·exp(t), where t is the number of non-Clifford gates. This asymptotically matches the runtime of classical simulation algorithms for such circuits. A key algorithmic step in this work is the ability to "compress" the "stabilizer-ness" of a state onto a few qubits, allowing the "non-stabilizer-ness" to be brute-forced on the remaining qubits. Our final application is pseudorandomness lower bounds. Introduced by [JLS18], a pseudorandom quantum state ensemble is a set of quantum states that are computationally indistinguishable from Haar random. By repurposing the algorithms above, we produce a test that behaves differently when given a state produced by a Clifford + T circuit with fewer than n T gates versus a Haar random state. We note that this is tight assuming the existence of linear-time quantum-secure one-way functions.
Pivoting now, we also study the stabilizer formalism in the PAC learning framework proposed by [Val84]. Here one does not have control over the measurements, but must make do regardless (within information-theoretic limits). We analyze the problem in two ways. First, we show that, unlike stabilizer states, learning the associated Clifford unitaries in the proper PAC model is NP-hard. This is done by a reduction from the problem of finding a full-rank matrix in an affine subspace of matrices over F_2. Second, we study stabilizer states in the presence of noise. We utilize the Statistical Query framework, a popular modification of the PAC learning framework that is inherently tolerant to noise. There, we again show hardness, by a reduction from Learning Parities with Noise. This gives evidence that even in the PAC model, stabilizer states are hard to learn with noise.
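The Learning Parities with Noise problem at the heart of that reduction is simple to state as a sampling process. The generator below (`lpn_samples` is a hypothetical name chosen here) shows why noise is the whole difficulty: without it, Gaussian elimination recovers the secret from n samples.

```python
import numpy as np

def lpn_samples(secret, m, noise_rate, rng=None):
    """Generate m Learning-Parities-with-Noise samples (a, <a, s> + e mod 2):
    each a is uniform over F_2^n, and each label is flipped independently
    with probability noise_rate. With noise_rate = 0 the secret s is
    recoverable by linear algebra; with constant noise the problem is
    believed to be hard, which underlies the Statistical Query hardness
    result described above. Toy illustration only."""
    rng = np.random.default_rng(rng)
    s = np.asarray(secret) % 2
    A = rng.integers(0, 2, size=(m, len(s)))     # random parity queries
    noise = rng.random(m) < noise_rate           # independent label flips
    labels = (A @ s + noise) % 2
    return A, labels
```

A Statistical Query learner only sees noisy expectations of such samples, which is exactly the access model in which the hardness is shown.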
Watermarking Cryptographic Functionalities from Standard Lattice Assumptions
A software watermarking scheme allows one to embed a mark into a program without significantly altering the behavior of the program. Moreover, it should be difficult to remove the watermark without destroying the functionality of the program. Recently, Cohen et al. (STOC 2016) and Boneh et al. (PKC 2017) showed how to watermark cryptographic functions such as PRFs using indistinguishability obfuscation. Notably, in their constructions, the watermark remains intact even against arbitrary removal strategies. A natural question is whether we can build watermarking schemes from standard assumptions that achieve this strong mark-unremovability property.
We give the first construction of a watermarkable family of PRFs that satisfies this strong mark-unremovability property from standard lattice assumptions (namely, the learning with errors (LWE) and the one-dimensional short integer solution (SIS) problems). As part of our construction, we introduce a new cryptographic primitive called a translucent PRF. Next, we give a concrete construction of a translucent PRF family from standard lattice assumptions. Finally, we show that using our new lattice-based translucent PRFs, we obtain the first watermarkable family of PRFs with strong unremovability against arbitrary strategies from standard assumptions.
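To fix ideas, the interface of a watermarking scheme can be sketched with a deliberately insecure toy: it "embeds" the mark by reserving a hidden input point whose output encodes it, so the mark is trivially removable. This is emphatically not the lattice-based construction, which uses translucent PRFs precisely so that the mark survives arbitrary tampering; all names here are invented for illustration.

```python
import hashlib, os

class ToyWatermarkablePRF:
    """Interface sketch only. A watermarkable PRF family supports key
    generation, evaluation, mark embedding, and mark extraction. This toy
    hides the mark at a secret trigger input -- an adversary who finds and
    reroutes that single point removes the mark, which is exactly the
    attack the real construction must survive."""
    def __init__(self):
        self.key = os.urandom(16)       # PRF key
        self.trigger = os.urandom(16)   # hidden point known to the extractor
        self.mark = None
    def eval(self, x: bytes) -> bytes:
        if self.mark is not None and x == self.trigger:
            return self.mark            # marked program deviates here only
        return hashlib.sha256(self.key + x).digest()
    def embed(self, mark: bytes):
        """Embed a mark; behavior changes on a single hidden input."""
        self.mark = hashlib.sha256(mark).digest()
    def extract(self, circuit) -> bytes:
        """Query the (possibly modified) program at the trigger point."""
        return circuit(self.trigger)
```

The real scheme's guarantee is that no efficient adversary, given the marked program, can produce a functionally close program from which extraction fails.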
LIPIcs, Volume 244, ESA 2022, Complete Volume