Analog Photonics Computing for Information Processing, Inference and Optimisation
This review presents an overview of the current state-of-the-art in photonics
computing, which leverages photons, photons coupled with matter, and
optics-related technologies for effective and efficient computational purposes.
It covers the history and development of photonics computing and modern
analogue computing platforms and architectures, focusing on optimization tasks
and neural network implementations. The authors examine special-purpose
optimizers, mathematical descriptions of photonics optimizers, and their
various interconnections. Disparate applications are discussed, including
direct encoding, logistics, finance, phase retrieval, machine learning, neural
networks, probabilistic graphical models, and image processing, among many
others. The main directions of technological advancement and associated
challenges in photonics computing are explored, along with an assessment of its
efficiency. Finally, the paper discusses prospects and the field of optical
quantum computing, providing insights into the potential applications of this
technology.
The Potts model and the independence polynomial: Uniqueness of the Gibbs measure and distributions of complex zeros
Part 1 of this dissertation studies the antiferromagnetic Potts model, which originates in statistical physics. In particular, the transition from multiple Gibbs measures to a unique Gibbs measure for the antiferromagnetic Potts model on the infinite regular tree is studied. This is called a uniqueness phase transition. A folklore conjecture about the parameter at which the uniqueness phase transition occurs is partly confirmed. The proof uses a geometric condition, which comes from analysing an associated dynamical system.
Part 2 of this dissertation concerns zeros of the independence polynomial. The independence polynomial originates in statistical physics as the partition function of the hard-core model. The location of the complex zeros of the independence polynomial is related to phase transitions in terms of the analyticity of the free energy, and it plays an important role in the design of efficient algorithms for approximately computing evaluations of the independence polynomial. Chapter 5 directly relates the location of the complex zeros of the independence polynomial to the computational hardness of approximating its evaluations. This is done by relating the set of zeros of the independence polynomial to chaotic behaviour of a naturally associated family of rational functions: the occupation ratios. Chapter 6 studies boundedness of the zeros of the independence polynomial of tori for sequences of tori converging to the integer lattice. It is shown that the zeros are bounded for sequences of balanced tori, but unbounded for sequences of highly unbalanced tori.
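For reference, the independence polynomial mentioned above is the hard-core partition function; in standard notation (chosen here, not quoted from the dissertation):

```latex
% Independence polynomial of a graph G at activity \lambda,
% i.e. the partition function of the hard-core model:
Z_G(\lambda) \;=\; \sum_{\substack{I \subseteq V(G) \\ I \text{ independent}}} \lambda^{|I|}
```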
The long-range Falicov-Kimball model and the amorphous Kitaev model: Quantum many-body systems I have known and loved
Large systems of interacting objects can give rise to a rich array of emergent behaviours. Make those objects quantum and the possibilities only expand. Interacting quantum many-body systems, as such systems are called, include essentially all physical systems. Luckily, we don't usually need to consider this full quantum many-body description. The world at the human scale is essentially classical (not quantum), while at the microscopic scale of condensed matter physics we can often get by without interactions. Strongly correlated materials, however, do require the full description. Some of the most exciting topics in modern condensed matter fall under this umbrella: the spin liquids, the fractional quantum Hall effect, high-temperature superconductivity and much more. Unfortunately, strongly correlated materials are notoriously difficult to study, defying many of the established theoretical techniques within the field. Enter exactly solvable models: interacting quantum many-body systems with extensively many local symmetries. The symmetries give rise to conserved charges. These charges break the model up into many non-interacting quantum systems which are more amenable to standard theoretical techniques. This thesis will focus on two such exactly solvable models.
The first, the Falicov-Kimball (FK) model, is an exactly solvable limit of the famous Hubbard model; it describes itinerant fermions interacting with a classical Ising background field. Originally introduced to explain metal-insulator transitions, it has a rich set of ground-state and thermodynamic phases. Disorder or interactions can turn metals into insulators, and the FK model features both transitions. We will define a generalised FK model in 1D with long-range interactions. This model shows a similarly rich phase diagram to its higher-dimensional cousins. We use an exact Markov chain Monte Carlo method to map the phase diagram and compute the energy-resolved localisation properties of the fermions. This allows us to look at how the move to 1D affects the physics of the model. We show that the model can be understood by comparison to a simpler model of fermions coupled to binary disorder.
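For orientation, a schematic Hamiltonian of the kind described above is sketched below; the thesis's exact couplings and conventions may differ, so the decay exponent α and the normalisation here are assumptions:

```latex
% Spinless fermions c_i hopping on a chain, coupled on-site to a classical
% Ising field S_i = \pm 1, plus a long-range interaction between the spins:
H \;=\; -\,t \sum_{\langle ij \rangle} \bigl( c_i^{\dagger} c_j + \mathrm{h.c.} \bigr)
\;+\; U \sum_i S_i \Bigl( n_i - \tfrac{1}{2} \Bigr)
\;+\; \sum_{i \neq j} \frac{J}{|i-j|^{\alpha}} \, S_i S_j
```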
The second, the Kitaev Honeycomb (KH) model, was one of the first solvable 2D models with a Quantum Spin Liquid (QSL) ground state. QSLs are generally expected to arise from Mott insulators, when frustration prevents magnetic ordering all the way down to zero temperature. The QSL state defies the traditional Landau-Ginzburg-Wilson paradigm of phases being defined by local order parameters. It is instead a topologically ordered phase. Recent work generalising non-interacting topological insulator phases to amorphous lattices raises the question of whether interacting phases like the QSLs can be similarly generalised. We extend the KH model to random lattices with fixed coordination number three generated by Voronoi partitions of the plane. We show that this model remains solvable and hosts a chiral amorphous QSL ground state. The presence of plaquettes with an odd number of sides leads to a spontaneous breaking of time-reversal symmetry. We unearth a rich phase diagram displaying Abelian as well as non-Abelian QSL phases with a remarkably simple ground-state flux pattern. Furthermore, we show that the system undergoes a phase transition to a conducting thermal metal state, and we discuss possible experimental realisations.
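For reference, the Kitaev honeycomb model referred to above has the standard bond-dependent form below; the amorphous generalisation keeps three bond labels on coordination-three Voronoi lattices, but the expression shown is the textbook honeycomb version:

```latex
% Ising couplings whose axis depends on the bond type \alpha \in \{x, y, z\}:
H \;=\; -\sum_{\alpha \in \{x,y,z\}} J_{\alpha} \sum_{\langle ij \rangle_{\alpha}} \sigma_i^{\alpha} \sigma_j^{\alpha}
```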
Solving Satisfiability Modulo Counting for Symbolic and Statistical AI Integration With Provable Guarantees
Satisfiability Modulo Counting (SMC) encompasses problems that require both
symbolic decision-making and statistical reasoning. Its general formulation
captures many real-world problems at the intersection of symbolic and
statistical Artificial Intelligence. SMC searches for policy interventions to
control probabilistic outcomes. Solving SMC is challenging because of its
highly intractable nature (NP^PP-complete), incorporating
statistical inference and symbolic reasoning. Previous research on SMC solving
lacks provable guarantees and/or suffers from sub-optimal empirical
performance, especially when combinatorial constraints are present. We propose
XOR-SMC, a polynomial algorithm with access to NP-oracles, to solve highly
intractable SMC problems with constant approximation guarantees. XOR-SMC
transforms the highly intractable SMC into satisfiability problems, by
replacing the model counting in SMC with SAT formulae subject to randomized XOR
constraints. Experiments on solving important SMC problems in AI for social
good demonstrate that XOR-SMC finds solutions close to the true optimum,
outperforming several baselines which struggle to find good approximations for
the intractable model counting in SMC.
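To make the XOR construction concrete, below is a toy sketch of approximate model counting with randomized parity constraints: if a formula usually stays satisfiable after k random XOR constraints are conjoined, it has roughly 2^k models. A brute-force loop stands in for the NP-oracle (a SAT solver in practice), and all function names are mine, not the paper's:

```python
import itertools
import random

def satisfies(clauses, assign):
    # CNF in DIMACS style: each clause is a list of signed ints.
    return all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses)

def xor_holds(xor, assign):
    # xor = (variables, parity): holds iff the XOR of the variables equals parity.
    vars_, parity = xor
    return sum(assign[v] for v in vars_) % 2 == parity

def still_sat(clauses, xors, n):
    # Brute force stands in for one NP-oracle (SAT) call; fine for tiny n.
    for bits in itertools.product([False, True], repeat=n):
        assign = {i + 1: b for i, b in enumerate(bits)}
        if satisfies(clauses, assign) and all(xor_holds(x, assign) for x in xors):
            return True
    return False

def log2_model_count(clauses, n, trials=20, rng=random.Random(0)):
    """Largest k such that the formula usually survives k random XOR
    constraints -- an estimate of log2(number of models)."""
    for k in range(n + 1):
        wins = sum(
            still_sat(clauses, [([v for v in range(1, n + 1) if rng.random() < 0.5],
                                 rng.randrange(2)) for _ in range(k)], n)
            for _ in range(trials))
        if 2 * wins < trials:  # majority of trials went UNSAT: stop
            return k - 1
    return n

# (x1 or x2) and (not x1 or x3) has 4 models, so the estimate should be near 2.
print(log2_model_count([[1, 2], [-1, 3]], n=3))
```

XOR-SMC applies this device inside an SMC formula, replacing each model-counting subproblem with SAT formulae subject to randomized XOR constraints, as the abstract describes.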
Combining Cubic Dynamical Solvers with Make/Break Heuristics to Solve SAT
Dynamical solvers for combinatorial optimization are usually based on 2nd-degree polynomial interactions, such as the Ising model. These exhibit high success for problems that map naturally to their formulation. However, SAT requires higher-degree interactions. As such, these quadratic dynamical solvers (QDS) have shown poor solution quality due to excessive auxiliary variables and the resulting increase in search-space complexity. Thus, recently, a series of cubic dynamical solver (CDS) models have been proposed for SAT and other problems. We show that such problem-agnostic CDS models still perform poorly on moderate-to-large problems, motivating the need for SAT-specific heuristics. With this insight, our contributions can be summarized in three points. First, we demonstrate that existing make-only heuristics perform poorly on scale-free, industrial-like problems when integrated into CDS. This motivates us to utilize break counts as well. Second, we derive a relationship between make/break counts and the CDS formulation that lets us recover break counts efficiently. Finally, we utilize this relationship to propose a new make/break heuristic and combine it with a state-of-the-art CDS, which is projected to solve SAT problems several orders of magnitude faster than existing software solvers.
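For readers unfamiliar with the terminology, the sketch below shows the make/break counts driving one step of a classic stochastic local search solver (WalkSAT-style). It illustrates the quantities the abstract refers to, not the paper's CDS coupling, and the names are mine:

```python
import random

def make_break(clauses, assign, v):
    """make: unsatisfied clauses that flipping v would satisfy;
    break: satisfied clauses that flipping v would break."""
    make = brk = 0
    for c in clauses:
        before = any(assign[abs(l)] == (l > 0) for l in c)
        assign[v] = not assign[v]          # tentatively flip v
        after = any(assign[abs(l)] == (l > 0) for l in c)
        assign[v] = not assign[v]          # undo the flip
        make += (not before) and after
        brk += before and not after
    return make, brk

def walksat_step(clauses, assign, rng, p=0.5):
    """One WalkSAT-style move: pick an unsatisfied clause, then flip either a
    random variable in it (probability p) or the one with lowest break count."""
    unsat = [c for c in clauses if not any(assign[abs(l)] == (l > 0) for l in c)]
    if not unsat:
        return True                        # formula satisfied
    c = rng.choice(unsat)
    if rng.random() < p:
        v = abs(rng.choice(c))
    else:
        v = min((abs(l) for l in c), key=lambda u: make_break(clauses, assign, u)[1])
    assign[v] = not assign[v]
    return False

rng = random.Random(1)
assign = {v: rng.random() < 0.5 for v in (1, 2, 3)}
while not walksat_step([[1, 2], [-1, 3], [-2, -3]], assign, rng):
    pass                                   # assign now satisfies the formula
```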
Low Power Memory/Memristor Devices and Systems
This reprint focusses on achieving low-power computation using memristive devices. The topic was designed as a convenient reference point: it contains a mix of techniques starting from the fundamental manufacturing of memristive devices all the way to applications such as physically unclonable functions, and also covers perspectives on, e.g., in-memory computing, which is inextricably linked with emerging memory devices such as memristors. Finally, the reprint contains a few articles representing how other communities (from typical CMOS design to photonics) are fighting on their own fronts in the quest towards low-power computation, as a comparison with the memristor literature. We hope that readers will enjoy discovering the articles within.
Smooth Monotonic Networks
Monotonicity constraints are powerful regularizers in statistical modelling.
They can support fairness in computer supported decision making and increase
plausibility in data-driven scientific models. The seminal min-max (MM) neural
network architecture ensures monotonicity, but often gets stuck in undesired
local optima during training because of vanishing gradients. We propose a
simple modification of the MM network using strictly-increasing smooth
non-linearities that alleviates this problem. The resulting smooth min-max
(SMM) network module inherits the asymptotic approximation properties from the
MM architecture. It can be used within larger deep learning systems trained
end-to-end. The SMM module is considerably simpler and less computationally
demanding than state-of-the-art neural networks for monotonic modelling. Still,
in our experiments, it compared favorably to alternative neural and non-neural
approaches in terms of generalization performance.
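As a rough illustration, the sketch below builds a monotone min-max module and smooths the hard min/max with LogSumExp, one natural choice of strictly increasing smooth surrogate; the paper's exact parameterisation may differ, and all names here are assumptions:

```python
import numpy as np

def smooth_max(x, beta=10.0, axis=-1):
    # LogSumExp: a smooth, strictly increasing surrogate for max.
    return np.log(np.sum(np.exp(beta * x), axis=axis)) / beta

def smooth_min(x, beta=10.0, axis=-1):
    return -smooth_max(-x, beta, axis)

def smm_forward(x, W, b, beta=10.0):
    """Min-max monotone module: linear units with positive weights, a smooth
    max within each group, then a smooth min across groups. Positive weights
    plus increasing non-linearities give monotonicity in every input."""
    # W: (groups, units, features); exp(W) enforces positive weights.
    z = np.einsum('guf,f->gu', np.exp(W), x) + b
    return smooth_min(smooth_max(z, beta, axis=-1), beta, axis=-1)

# Toy check that the output is monotone in each input coordinate:
rng = np.random.default_rng(0)
W, b = rng.normal(size=(3, 4, 2)), rng.normal(size=(3, 4))
x = np.array([0.3, -0.1])
assert smm_forward(x + 0.1, W, b) >= smm_forward(x, W, b)
```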
D4.2 Intelligent D-Band wireless systems and networks initial designs
This deliverable gives the results of the ARIADNE project's Task 4.2: Machine Learning based network intelligence. It presents the work conducted on various aspects of network management to deliver system-level, qualitative solutions that leverage diverse machine learning techniques. The different chapters present system-level, simulation and algorithmic models based on multi-agent reinforcement learning, deep reinforcement learning, learning automata for complex event forecasting, a system-level model for proactive handovers and resource allocation, model-driven deep learning-based channel estimation and feedback, as well as strategies for the deployment of machine learning based solutions. In short, D4.2 provides results on promising AI and ML based methods, along with their limitations and potentials, that have been investigated in the ARIADNE project.
Nesting optimization with adversarial games, meta-learning, and deep equilibrium models
Nested optimization, whereby an optimization problem is constrained by the solutions of other optimization problems, has recently seen a surge in its application to Deep Learning.
While the study of such problems started nearly a century ago in the context of market theory, many of the algorithms developed since do not scale to modern Deep Learning applications. In this thesis, I push the understanding and applicability of nested optimization to three machine learning domains: 1) adversarial games, 2) meta-learning and 3) deep equilibrium models. For each domain, I tackle a particular goal.
In 1), I adversarially learn model compression in the case where training data isn't available; in 2), I meta-learn hyperparameters for long optimization processes without introducing greediness; and in 3), I use deep equilibrium models to improve temporal coherence in video landmark detection.
The first part of my thesis deals with casting model compression as an adversarial game. Performing knowledge transfer from a large teacher network to a smaller student is a popular task in deep learning. However, due to growing dataset sizes and stricter privacy regulations, it is increasingly common not to have access to the data that was used to train the teacher. I propose a novel method which trains a student to match the predictions of its teacher without using any data or metadata. This is achieved by nesting the training optimization of the student with that of an adversarial generator, which searches for images on which the student poorly matches the teacher. These images are used to train the student in an online fashion. The student closely approximates its teacher for simple datasets like SVHN, and on CIFAR10 I improve on the state-of-the-art for few-shot distillation, despite using no data. Finally, I also propose a metric to quantify the degree of belief matching between teacher and student in the vicinity of decision boundaries, and observe a significantly higher match between the zero-shot student and the teacher than between a student distilled with real data and the teacher.
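A minimal sketch of this nested game, assuming PyTorch-style models and optimizers, is given below; the loss, batch size and step counts are illustrative rather than the thesis's exact procedure:

```python
import torch
import torch.nn.functional as F

def zero_shot_step(teacher, student, generator, z_dim, opt_g, opt_s,
                   batch=64, n_student_steps=10):
    """One outer iteration of the data-free adversarial game: the generator
    seeks inputs where student and teacher disagree most, then the student
    matches the teacher on freshly generated inputs."""
    # Generator step: ascend the student-teacher divergence.
    x = generator(torch.randn(batch, z_dim))
    div = F.kl_div(student(x).log_softmax(-1), teacher(x).softmax(-1),
                   reduction='batchmean')
    opt_g.zero_grad()
    (-div).backward()                      # maximise divergence
    opt_g.step()
    # Student steps: descend the same divergence on new generated inputs.
    for _ in range(n_student_steps):
        with torch.no_grad():
            x = generator(torch.randn(batch, z_dim))
            target = teacher(x).softmax(-1)
        loss = F.kl_div(student(x).log_softmax(-1), target,
                        reduction='batchmean')
        opt_s.zero_grad()
        loss.backward()
        opt_s.step()
```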
The second part of my thesis deals with meta-learning hyperparameters in the case when the nested optimization to be differentiated is itself solved by many gradient steps. Gradient-based hyperparameter optimization has earned widespread popularity in the context of few-shot meta-learning, but remains broadly impractical for tasks with long horizons (many gradient steps), due to memory scaling and gradient degradation issues. A common workaround is to learn hyperparameters online, but this introduces greediness which comes with a significant performance drop. I propose forward-mode differentiation with sharing (FDS), a simple and efficient algorithm which tackles memory scaling issues with forward-mode differentiation, and gradient degradation issues by sharing hyperparameters that are contiguous in time. I provide theoretical guarantees about the noise reduction properties of my algorithm, and demonstrate its efficiency empirically by differentiating through many gradient steps of unrolled optimization. I consider large hyperparameter search ranges on CIFAR-10, where I significantly outperform greedy gradient-based alternatives while achieving speedups compared to state-of-the-art black-box methods.
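The memory argument can be seen in a few lines: forward-mode differentiation pushes the tangent dw/d(hyperparameter) through the unrolled run alongside the weights, so no backward pass over the horizon is stored. The sketch below differentiates a validation loss with respect to a single learning rate; the hyperparameter sharing that defines FDS is omitted, and all names are mine:

```python
import numpy as np

def forward_hypergrad(w0, lr, steps, grad_train, grad_val, hess_vec):
    """Hypergradient of the validation loss w.r.t. the learning rate via
    forward-mode differentiation: the tangent t = dw/d(lr) travels forward
    with the weights, so memory is O(|w|) regardless of horizon length."""
    w, t = w0.copy(), np.zeros_like(w0)
    for _ in range(steps):
        g = grad_train(w)
        # Differentiate the update w <- w - lr * g with respect to lr:
        t = t - g - lr * hess_vec(w, t)    # hess_vec = Hessian-vector product
        w = w - lr * g
    return grad_val(w) @ t

# Toy quadratic: train loss 0.5||w - a||^2 (Hessian = I), val loss 0.5||w - b||^2.
a, b = np.array([1.0, 2.0]), np.array([1.5, 1.5])
hg = forward_hypergrad(np.zeros(2), lr=0.1, steps=100,
                       grad_train=lambda w: w - a,
                       grad_val=lambda w: w - b,
                       hess_vec=lambda w, t: t)
```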
The third part of my thesis deals with converting deep equilibrium models to a form of nested optimization in order to perform robust video landmark detection. Cascaded computation, whereby predictions are recurrently refined over several stages, has been a persistent theme throughout the development of landmark detection models. I show that the recently proposed deep equilibrium model (DEQ) can be naturally adapted to this form of computation, given appropriate regularization. My landmark model achieves state-of-the-art normalized mean error on the challenging WFLW facial landmark dataset, with fewer parameters and a training memory cost that does not grow with the number of recurrent modules. Furthermore, I show that DEQs are particularly suited for landmark detection in videos. In this setting, it is typical to train on still images due to the lack of labeled videos. This can lead to a "flickering" effect at inference time on video, whereby a model can rapidly oscillate between different plausible solutions across consecutive frames. I show that the DEQ root-solving problem can be turned into a constrained optimization problem in a way that emulates recurrence at inference time, despite not having access to temporal data at training time. I call this "Recurrence without Recurrence", and demonstrate that it helps reduce landmark flicker by introducing a new metric and contributing a new facial landmark video dataset targeting landmark uncertainty. On the hard subset of this new dataset, my model improves accuracy and temporal coherence compared to the strongest previously published model using a hand-tuned conventional filter.
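The sketch below conveys both ingredients in simplified form: a DEQ-style equilibrium solve by damped fixed-point iteration, and inference-time temporal coupling obtained by warm-starting each frame's solve from the previous frame's solution. The thesis formulates the coupling as a constrained optimization problem; warm-starting is a simplification of that idea, and the names are mine:

```python
import numpy as np

def deq_solve(f, x, z0, iters=100, tol=1e-6):
    """Equilibrium solve z* = f(z*, x) by damped fixed-point iteration
    (a stand-in for the Broyden/Anderson solvers typically used for DEQs)."""
    z = z0
    for _ in range(iters):
        z_next = 0.5 * z + 0.5 * f(z, x)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

def video_landmarks(f, frames, z_dim):
    """Temporal coupling at inference time only: each frame's equilibrium
    solve starts from the previous frame's solution, so nearby frames relax
    to nearby fixed points and flicker is damped."""
    z, outputs = np.zeros(z_dim), []
    for x in frames:
        z = deq_solve(f, x, z0=z)          # warm-start from the previous frame
        outputs.append(z)
    return outputs
```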