Adaptive Algorithms and Collusion via Coupling
We develop a theoretical model to study strategic interactions between
adaptive learning algorithms. Applying continuous-time techniques, we uncover
the mechanism responsible for collusion between Artificial Intelligence
algorithms documented by recent experimental evidence. We show that spontaneous
coupling between the algorithms' estimates leads to periodic coordination on
actions that are more profitable than static Nash equilibria. We provide a
sufficient condition under which this coupling is guaranteed to disappear, and
algorithms learn to play undominated strategies. We apply our results to interpret and complement experimental findings in the literature, and to design learning-robust, strategy-proof mechanisms. We show that ex-post
feedback provision guarantees robustness to the presence of learning agents. We
fully characterize the optimal learning-robust mechanisms: they are menu
mechanisms.

Comment: 57 pages, 13 figures
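The experimental literature the abstract refers to studies collusion between Q-learning pricing agents. A toy sketch of such a simulation (not the paper's continuous-time model; the price grid, demand rule, and learning parameters are all hypothetical) might look like:

```python
import random

# Two epsilon-greedy Q-learners repeatedly set prices in a toy
# Bertrand-style duopoly where the lowest price captures the market.
# All parameters below are illustrative assumptions.
PRICES = [1.0, 1.5, 2.0]
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1   # learning rate, discount, exploration

def profit(p_own, p_rival):
    if p_own < p_rival:
        return p_own                 # undercutting wins the whole market
    if p_own == p_rival:
        return p_own / 2             # ties split the market
    return 0.0

def run(periods=5000, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * len(PRICES) for _ in range(2)]   # one Q-row per agent
    history = []
    for _ in range(periods):
        acts = [rng.randrange(len(PRICES)) if rng.random() < EPS
                else max(range(len(PRICES)), key=lambda a: q[i][a])
                for i in range(2)]
        for i in range(2):
            r = profit(PRICES[acts[i]], PRICES[acts[1 - i]])
            q[i][acts[i]] += ALPHA * (r + GAMMA * max(q[i]) - q[i][acts[i]])
        history.append(tuple(acts))
    return q, history

q, history = run()
```

Tracking `history` over long runs is how such experiments detect whether the agents' estimates couple and settle on supra-competitive prices.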
Nonlinear opinion models and other networked systems
Networks play a critical role in many physical, biological, and social systems. In this thesis, we investigate tools to model and analyze networked systems. We first examine some of the ways in which we can model social dynamics that take place on networks. We then study two recently developed data-analysis methods that employ a network framework and explore new ways in which they can be used to find meaningful signals in large data sets.

In the first half of the thesis, we study opinion dynamics on networks. We begin by examining a class of opinion models, known as coevolving voter models (CVMs), that couple the mechanisms of opinion formation and changing social connections. We then propose a version of CVMs that incorporates nonlinearity. In our models, we assume that individuals strive to achieve harmony and avoid disagreement, both by changing their social connections to reflect their opinions and by changing their opinions to reflect their social connections. By taking a minimalist approach to modeling social dynamics, we hope to gain a deeper understanding of how these two mechanisms can give rise to social phenomena such as the "majority illusion". Comparing several versions of CVMs, we find that seemingly small changes in update rules can lead to strikingly different behaviors. A particularly interesting feature of our nonlinear CVMs is that, under certain conditions, the opinion state that is held initially by a minority of the nodes can effectively spread to almost every node in a network if the minority nodes view themselves as the majority. We then discuss an ongoing project that involves another class of opinion models called bounded-confidence models. Specifically, we examine extensions of bounded-confidence models on hypergraphs and discuss some preliminary findings.

In the second half of the thesis, we study problems in data analysis. We begin by considering topological structures as a tool to study integrated circuit (IC) devices. In particular, we examine a problem in the design and manufacturing of IC devices using topological data analysis (TDA), which is based on network structures called simplicial complexes. Failures in IC devices generally occur near the tolerance limits of photolithography systems, such as at the minimum separation distance between adjacent electronic components. However, for complex arrangements of electronic components, simply ensuring minimal separation is insufficient to guarantee that one can manufacture an IC design accurately and reliably. We apply tools from TDA to compare data from IC designs. Without inputting domain knowledge, we are able to infer several results about the IC design-manufacturing process. Finally, we discuss an ongoing project in the analysis of network data. Specifically, we explore applications of a recently developed algorithm called network dictionary learning (NDL) and discuss problems of network reconstruction and denoising using NDL on both synthetic and real-world networks.
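The coevolving update rule described above (rewire toward agreement, or adopt a neighbour's opinion) can be sketched minimally; the rewiring probability, graph size, and edge count below are illustrative assumptions, not the thesis's parameters:

```python
import random

# Minimal coevolving voter model: at each step pick a discordant edge
# (u, v); with probability REWIRE, u drops the edge and reconnects to
# a like-minded node, otherwise u adopts v's opinion.
REWIRE = 0.3

def cvm_step(adj, opinion, rng):
    discordant = [(u, v) for u in adj for v in adj[u]
                  if opinion[u] != opinion[v]]
    if not discordant:
        return False                      # no disagreement left: frozen state
    u, v = rng.choice(discordant)
    if rng.random() < REWIRE:
        candidates = [w for w in adj if w != u and w not in adj[u]
                      and opinion[w] == opinion[u]]
        if candidates:
            w = rng.choice(candidates)
            adj[u].discard(v); adj[v].discard(u)
            adj[u].add(w); adj[w].add(u)  # keep the graph undirected
    else:
        opinion[u] = opinion[v]           # opinion adoption
    return True

rng = random.Random(1)
n = 30
adj = {i: set() for i in range(n)}
for _ in range(60):                       # sprinkle random edges
    a, b = rng.sample(range(n), 2)
    adj[a].add(b); adj[b].add(a)
opinion = {i: rng.choice([0, 1]) for i in range(n)}
for _ in range(2000):
    if not cvm_step(adj, opinion, rng):
        break
```

The interesting regimes in the thesis come from varying the rewire/adopt balance and from making these choices nonlinear in the local opinion mix.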
Ray: A Distributed Execution Engine for the Machine Learning Ecosystem
In recent years, growing data volumes and more sophisticated computational procedures have greatly increased the demand for computational power. Machine learning and artificial intelligence applications, for example, are notorious for their computational requirements. At the same time, Moore's law is ending and processor speeds are stalling. As a result, distributed computing has become ubiquitous. While the cloud makes distributed hardware infrastructure widely accessible and therefore offers the potential of horizontal scale, developing these distributed algorithms and applications remains surprisingly hard. This is due to the inherent complexity of concurrent algorithms, the engineering challenges that arise when communicating between many machines, requirements like fault tolerance and straggler mitigation that arise at large scale, and the lack of a general-purpose distributed execution engine that can support a wide variety of applications.

In this thesis, we study the requirements for a general-purpose distributed computation model and present a solution that is easy to use yet expressive and resilient to faults. At its core, our model takes familiar concepts from serial programming, namely functions and classes, and generalizes them to the distributed world, therefore unifying stateless and stateful distributed computation. This model not only supports many machine learning workloads like training or serving, but is also a good fit for cross-cutting machine learning applications like reinforcement learning and for data processing applications like streaming or graph processing. We implement this computational model as an open-source system called Ray, which matches or exceeds the performance of specialized systems in many application domains, while also offering horizontal scalability and strong fault tolerance properties.
Progressive load balancing of asynchronous algorithms
Massively parallel supercomputers are susceptible to variable performance due to
factors such as differences in chip manufacturing, heat management and network congestion. As a result, the same code with the same input can have a different execution
time from run to run. Synchronisation under these circumstances is a key challenge
that prevents applications from scaling to large problems and machines.
Asynchronous algorithms offer a partial solution. In these algorithms fast processes
are not forced to synchronise with slower ones. Instead, they continue computing updates, and moving towards the solution, using the latest data available to them, which
may have become stale (i.e. the data is a number of iterations out of date compared
to the most recent version). While this allows for high computational efficiency, the
convergence rate of asynchronous algorithms tends to be lower than synchronous algorithms due to the use of stale values. A large degree of performance variability can
eliminate the performance advantage of asynchronous algorithms or even cause the
results to diverge.
To address this problem, we use the unique properties of asynchronous algorithms
to develop a load balancing strategy for iterative convergent asynchronous algorithms
in both shared and distributed memory. The proposed approach – Progressive Load
Balancing (PLB) – aims to balance progress levels over time, rather than attempting to
equalise iteration rates across parallel workers. This approach attenuates noise without
sacrificing performance, resulting in a significant reduction in progress imbalance and
improving time to solution.
The developed method is evaluated in a variety of scenarios using the asynchronous
Jacobi algorithm. In shared memory, we show that it can essentially eliminate the
negative effects of a single core in a node slowed down by 19%. Work stealing, an
alternative load balancing approach, is shown to be ineffective. In distributed memory,
the method reduces the impact of up to 8 slow nodes out of 15, each slowed down
by 40%, resulting in 1.03×–1.10× reduction in time to solution and 1.11×–2.89×
reduction in runtime variability. Furthermore, we successfully apply the method in
a scenario with real faulty components running 75% slower than normal. Broader
applicability of progressive load balancing is established by emulating its application
to asynchronous stochastic gradient descent where it is found to improve both training
time and the learned model’s accuracy.
Overall, this thesis demonstrates that enhancing asynchronous algorithms with
PLB is an effective method for tackling performance variability in supercomputers.
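The asynchronous Jacobi iteration used in the evaluation can be sketched in shared memory with threads; the small diagonally dominant system and the round structure below are illustrative assumptions, and PLB itself is not implemented here:

```python
import threading

# Asynchronous Jacobi sketch: each worker owns one component of x and
# repeatedly recomputes it from whatever (possibly stale) values of
# the other components it currently sees, with no barrier between its
# sweeps. Strict diagonal dominance keeps the iteration convergent
# despite stale reads.
n = 4
A = [[4.0 if i == j else 1.0 for j in range(n)] for i in range(n)]
b = [7.0] * n                          # exact solution: x = [1, 1, 1, 1]
x = [0.0] * n                          # shared state, updated without locks

def worker(i, sweeps=20):
    for _ in range(sweeps):
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - s) / A[i][i]    # publish own component immediately

for _ in range(100):                   # workers run unsynchronised per round
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

print([round(v, 6) for v in x])        # → [1.0, 1.0, 1.0, 1.0]
```

PLB's contribution sits on top of such an iteration: rather than equalising sweep rates, it rebalances work so that the progress levels (iteration counts) of the workers stay close together.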
Design and analysis of SRAMs for energy harvesting systems
PhD Thesis

At present, the battery is employed as a power source for a wide variety of microelectronic systems ranging from biomedical implants and sensor networks to portable devices. However, the battery has several limitations and poses many challenges for the majority of these systems. For instance, the design considerations of implantable devices concern the battery from two aspects: the toxic materials it contains, and its lifetime, since replacing the battery means a surgical operation. Another challenge appears in wireless sensor networks, where hundreds or thousands of nodes are scattered around the monitored environment and the battery of each node should be maintained and replaced regularly, yet the batteries in these nodes do not all run out at the same time.
Since the introduction of portable systems, the area of low-power design has witnessed extensive research, driven by industrial needs, towards the aim of extending the lives of batteries. Coincidentally, continuing innovations in the field of micro-generators have brought their outputs into the same power range as several portable applications. This overlap creates a clear opportunity to develop new generations of electronic systems that can be powered, or at least augmented, by energy harvesters. Such self-powered systems benefit applications where maintaining and replacing batteries is impossible, inconvenient, costly, or hazardous, in addition to decreasing the adverse effects the battery has on the environment.
The main goal of this research study is to investigate energy-harvesting-aware design techniques for computational logic in order to enable the capability of working under non-deterministic energy sources. As a case study, the research concentrates on a vital part of all computational loads, SRAM, which occupies more than 90% of the chip area according to the ITRS reports.
Essentially, this research conducted experiments to find out which design metric of an SRAM is the most vulnerable to unpredictable energy sources, which was confirmed to be the timing. Accordingly, the study proposed a truly self-timed SRAM, realized with complete handshaking protocols in the 6T bit-cell and regulated by fully Speed-Independent (SI) timing circuitry. The study proved the functionality of the proposed design in real silicon. Finally, the project enhanced other performance metrics of the self-timed SRAM, concentrating on the bit-line length and the minimum operational voltage, by employing several additional design techniques.

Umm Al-Qura University, the Ministry of Higher Education in the Kingdom of Saudi Arabia, and the Saudi Cultural Bureau
Towards a continuous dynamic model of the Hopfield theory on neuronal interaction and memory storage
The purpose of this work is to study the Hopfield model for neuronal
interaction and memory storage, in particular the convergence to the stored
patterns. Since the hypothesis of symmetric synapses is not true for the
brain, we will study how we can extend it to the case of asymmetric
synapses using a probabilistic approach. We then focus on the description
of another feature of the memory process and the brain: oscillations. Using the Kuramoto model, we will be able to describe them completely, capturing the synchronization between neurons. Our aim is therefore to understand how and why neurons can be seen as oscillators, and to establish a strong link between this model and the Hopfield approach.
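The synchronization behaviour referred to above can be sketched with a forward-Euler simulation of the Kuramoto model; the oscillator count, coupling strength, and frequency distribution are illustrative choices, not values from this work:

```python
import math, cmath, random

# N all-to-all coupled Kuramoto oscillators:
#   d(theta_i)/dt = omega_i + (K/N) * sum_j sin(theta_j - theta_i)
# With coupling K well above the synchronisation threshold, the order
# parameter r = |(1/N) sum_j exp(i*theta_j)| approaches 1.
N, K, dt, steps = 30, 4.0, 0.01, 2000
rng = random.Random(0)
omega = [rng.gauss(0.0, 0.5) for _ in range(N)]       # natural frequencies
theta = [rng.uniform(0, 2 * math.pi) for _ in range(N)]

def order_parameter(th):
    return abs(sum(cmath.exp(1j * t) for t in th)) / len(th)

for _ in range(steps):
    coupling = [K / N * sum(math.sin(tj - ti) for tj in theta)
                for ti in theta]
    theta = [(t + dt * (w + c)) % (2 * math.pi)
             for t, w, c in zip(theta, omega, coupling)]

r = order_parameter(theta)   # close to 1: phases are locked
```

Watching `r` grow from near 0 to near 1 as K increases is the standard way to see neurons-as-oscillators synchronize in this model.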
Checkpoint-based forward recovery using lookahead execution and rollback validation in parallel and distributed systems
This thesis studies a forward recovery strategy using checkpointing and optimistic execution in parallel and distributed systems. The approach uses replicated tasks executing on different processors for forward recovery and checkpoint comparison for error detection. To reduce overall redundancy, this approach employs lower static redundancy in the common error-free situation to detect errors than the standard N-Modular Redundancy (NMR) scheme does to mask errors. For the rare occurrence of an error, this approach uses some extra redundancy for recovery. To reduce the run-time recovery overhead, look-ahead processes are used to advance computation speculatively, and a rollback process is used to produce a diagnosis for correct look-ahead processes without rollback of the whole system. Both analytical and experimental evaluation have shown that this strategy can provide nearly error-free execution time, even under faults, with a lower average redundancy than NMR.
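The duplex comparison idea, two replicas in the error-free case with extra redundancy only on mismatch, can be sketched as follows; the task, fault model, and helper names are hypothetical illustrations, not the thesis's system:

```python
import hashlib, pickle

# Duplex checkpoint comparison: a task step runs on two "processors";
# matching checkpoint digests mean error-free progress, a mismatch
# triggers one extra execution that arbitrates between the replicas
# (cheaper on average than standing triple redundancy as in NMR).
def checkpoint_digest(state):
    return hashlib.sha256(pickle.dumps(state)).hexdigest()

def run_task(step_fn, state, faulty=False):
    state = step_fn(state)
    if faulty:
        state = state + 1              # simulated transient error
    return state

def duplex_step(step_fn, state, faults=(False, False)):
    results = [run_task(step_fn, state, f) for f in faults]
    digests = [checkpoint_digest(r) for r in results]
    if digests[0] == digests[1]:
        return results[0]              # common case: agree, commit checkpoint
    arbiter = run_task(step_fn, state) # extra redundancy only on demand
    for r, d in zip(results, digests):
        if d == checkpoint_digest(arbiter):
            return r                   # majority identifies the correct copy
    return arbiter

step = lambda s: s * 2 + 1
s = duplex_step(step, 3)                         # error-free path: 7
s2 = duplex_step(step, 3, faults=(True, False))  # recovered result: 7
```

In the thesis the third execution is replaced by speculative look-ahead processes plus a rollback diagnosis, so correct replicas need not wait for arbitration.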
Variational Gibbs inference for statistical model estimation from incomplete data
Statistical models are central to machine learning with broad applicability
across a range of downstream tasks. The models are controlled by free
parameters that are typically estimated from data by maximum-likelihood
estimation or approximations thereof. However, when faced with real-world
datasets many of the models run into a critical issue: they are formulated in
terms of fully-observed data, whereas in practice the datasets are plagued with
missing data. The theory of statistical model estimation from incomplete data
is conceptually similar to the estimation of latent-variable models, where
powerful tools such as variational inference (VI) exist. However, in contrast
to standard latent-variable models, parameter estimation with incomplete data
often requires estimating exponentially-many conditional distributions of the
missing variables, hence making standard VI methods intractable. We address
this gap by introducing variational Gibbs inference (VGI), a new
general-purpose method to estimate the parameters of statistical models from
incomplete data. We validate VGI on a set of synthetic and real-world
estimation tasks, estimating important machine learning models such as VAEs and
normalising flows from incomplete data. The proposed method, whilst
general-purpose, achieves competitive or better performance than existing
model-specific estimation methods.

Comment: Improved clarity and references. Added Algorithms 2-5. Experiment results remain unchanged.
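The combinatorial burden mentioned in the abstract, one conditional distribution per missingness pattern, can be made concrete with a counting sketch (illustrative only, not part of the VGI method):

```python
from itertools import combinations

# With d variables, each distinct missingness pattern needs its own
# conditional p(x_missing | x_observed). Excluding the fully observed
# and fully missing cases, there are 2**d - 2 such patterns, which is
# what makes naive VI over all conditionals intractable.
def missingness_patterns(d):
    patterns = []
    for k in range(1, d):                 # k = number of missing variables
        patterns.extend(combinations(range(d), k))
    return patterns

for d in (3, 10, 16):
    print(d, len(missingness_patterns(d)))   # 6, 1022, 65534
```

VGI sidesteps this blow-up by sharing one variational model across patterns and updating it with Gibbs-style conditional steps.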
Bio-inspired cellular machines: towards a new electronic paper architecture
Information technology has only been around for about fifty years. Although the beginnings of automatic calculation date from as early as the 17th century (W. Schickard built the first mechanical calculator in 1623), it took the invention of the transistor by W. Shockley, J. Bardeen and W. Brattain in 1947 to catapult calculators out of the laboratory and produce the omnipresence of information and communication systems in today's world. Computers not only boast very high performance, capable of carrying out billions of operations per second; they are also taking over our world, working their way into every last corner of our environment. Microprocessors are in everything, from the quartz watch to the PC via the mobile phone, the television and the credit card. Their continuing spread is very probable, and they will even be able to get into our clothes and newspapers. The incessant search for increasingly powerful, robust and intelligent systems is not only based on the improvement of technologies for the manufacture of electronic chips, but also on finding new computer architectures. One important source of inspiration in the search for new architectures is the biological world. Nature is fascinating for an engineer: what could be more robust, intelligent and able to adapt and evolve than a living organism? Out of a simple cell, equipped with its own blueprint in the form of DNA, develops a complete multi-cellular organism. The characteristics of the natural world have often been studied and imitated in the design of adaptive, robust and fault-tolerant artificial systems. The POE model summarizes the three major sources of bio-inspiration: the evolution of species (P: phylogeny), the development of a multi-cellular organism by division and differentiation (O: ontogeny) and learning by interaction with the environment (E: epigenesis).
This thesis aims to contribute to the ontogenetic branch of the POE model, through the study of three completely original cellular machines for which the basic element respects the six following characteristics: it is (1) reconfigurable, (2) of minimal complexity, (3) present in large numbers, (4) interconnected locally with its neighboring elements, (5) equipped with a display capacity and (6) equipped with a sensor allowing minimal interaction. Our first realization, the BioWall, is made up of a surface of 4,000 basic elements or molecules, capable of creating all cellular systems with a maximum of 160 × 25 elements. The second realization, the BioCube, transposes the two-dimensional architecture of the BioWall into a three-dimensional space, limited to 4 × 4 × 4 = 64 basic elements or spheres. It prefigures a three-dimensional computer built using nanotechnologies. The third machine, named BioTissue, uses the same hypothesis as the BioWall while pushing its performance to the limits of current technical possibilities and offering the benefits of an autonomous system. The convergence of these three realizations, studied in the context of emerging technologies, has allowed us to propose and define the computer architecture of the future: the electronic paper.
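The cellular principle behind these machines, many identical elements, each reconfigurable, locally interconnected, and displaying its own state, can be illustrated in software with the simplest possible cellular machine. This generic elementary cellular automaton is only an analogy, not the BioWall's actual molecule:

```python
# One row of identical elements: each is reconfigurable (via the rule
# number), reads only its two immediate neighbours, and "displays" a
# single bit of state.
def step(cells, rule=110):
    n = len(cells)
    return [(rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2
                      + cells[(i + 1) % n])) & 1
            for i in range(n)]

row = [0] * 31
row[15] = 1                        # single seed element in the middle
for _ in range(15):
    print(''.join('#' if c else '.' for c in row))   # the 1-bit display
    row = step(row, rule=110)
```

Changing `rule` reconfigures every element at once, which is the software analogue of reprogramming the machine's molecules.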