7,697 research outputs found
Asynchronous Logic Design with Flip-Flop Constraints
Some techniques are presented that permit the implementation of asynchronous sequential circuits using standard flip-flops. An algorithm is presented for the RS flip-flop, and it is shown that any flow table realizable with standard logic gates may be realized using this algorithm. The approach is shown to be directly applicable to synchronous circuits, and transition flip-flops (JK, D, and T) are analyzed using the ideas developed. Constraints that a flow table must satisfy to be realizable with transition flip-flops in asynchronous settings are derived, and upper and lower bounds on the number of transition flip-flops required to implement a given flow table are stated.
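As a minimal illustration of the behaviour such an algorithm builds on (the standard RS characterization, not the paper's algorithm itself), the next state of an RS flip-flop follows the characteristic equation Q+ = S + R'Q, with S = R = 1 disallowed. A sketch in Haskell, with hypothetical names:

```haskell
-- RS flip-flop characteristic behaviour: Q+ = S OR (NOT R AND Q),
-- with the input combination S = R = 1 disallowed.
rsNext :: Bool        -- ^ S (set)
       -> Bool        -- ^ R (reset)
       -> Bool        -- ^ Q (current state)
       -> Maybe Bool  -- ^ next state, or Nothing for the forbidden input
rsNext s r q
  | s && r    = Nothing                    -- forbidden: S and R both asserted
  | otherwise = Just (s || (not r && q))
```

The excitation problem such an algorithm solves is the inverse direction: given a desired transition from Q to Q+, choose S and R inputs consistent with this equation.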
Self-directedness, integration and higher cognition
In this paper I discuss connections between self-directedness, integration and higher cognition. I present a model of self-directedness as a basis for approaching higher cognition from a situated cognition perspective. According to this model, increases in sensorimotor complexity create pressure for integrative higher-order control and learning processes for acquiring information about the context in which action occurs. This generates complex articulated abstractive information processing, which forms the major basis for higher cognition. I present evidence indicating that the same integrative characteristics found in lower cognitive processes, such as motor adaptation, are present in a range of higher cognitive processes, including conceptual learning. This account helps explain situated cognition phenomena in humans because the integrative processes by which the brain adapts to control interaction are relatively agnostic concerning the source of the structure participating in the process. Thus, from the perspective of the motor control system, using a tool is not fundamentally different from simply controlling an arm.
A model and framework for reliable build systems
Reliable and fast builds are essential for rapid turnaround during
development and testing. Popular existing build systems rely on correct manual
specification of build dependencies, which can lead to invalid build outputs
and nondeterminism. We outline the challenges of developing reliable build
systems and explore the design space for their implementation, with a focus on
non-distributed, incremental, parallel build systems. We define a general model
for resources accessed by build tasks and show its correspondence to the
implementation technique of minimum information libraries: APIs that return only the
information the application plans to use. We also summarize
preliminary experimental results from several prototype build managers.
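To make the minimum-information idea concrete, here is a small sketch (hypothetical names, not the paper's actual API): tasks access resources only through an oracle that answers precise queries, so the build manager can record the exact dependency set as a by-product of running the task.

```haskell
import Data.IORef

-- a query returns exactly the information the task asked for, nothing more
newtype Query = FileContents FilePath
  deriving (Eq, Show)

-- a build task can only observe resources through the oracle it is given
newtype Task a = Task { runTask :: (Query -> IO String) -> IO a }

-- run a task against an oracle, logging every query it makes; the log is
-- the task's precise dependency set for the next incremental build
withDependencyLog :: (Query -> IO String) -> Task a -> IO (a, [Query])
withDependencyLog oracle (Task t) = do
  ref <- newIORef []
  let tracked q = modifyIORef ref (q :) >> oracle q
  r <- t tracked
  deps <- reverse <$> readIORef ref
  pure (r, deps)
```

Because the oracle never returns information the task did not ask for, the recorded queries cannot omit a real dependency, which is precisely the failure mode of manual dependency specification.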
Automatic Generation of Models of Microarchitectures
Detailed microarchitectural models are necessary to predict, explain, or optimize the performance of software running on modern microprocessors. Building such models often requires a significant manual effort, as the documentation provided by hardware manufacturers is typically not precise enough. The goal of this thesis is to develop techniques for generating microarchitectural models automatically. In the first part, we focus on recent x86 microarchitectures. We implement a tool to accurately evaluate small microbenchmarks using hardware performance counters. We then describe techniques to automatically generate microbenchmarks for measuring the performance of individual instructions and for characterizing cache architectures. We apply our implementations to more than a dozen different microarchitectures. In the second part of the thesis, we study more general techniques to obtain models of hardware components. In particular, we propose the concept of gray-box learning, and we develop a learning algorithm for Mealy machines that exploits prior knowledge about the system to be learned. Finally, we show how this algorithm can be adapted to minimize incompletely specified Mealy machines, a well-known NP-complete problem. Our implementation outperforms existing exact minimization techniques by several orders of magnitude on a number of hard benchmarks; it is even competitive with state-of-the-art heuristic approaches.
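For context, the Mealy-machine formalism that the learning and minimization algorithms operate on is simple to state; a minimal sketch with generic names, not the thesis's implementation:

```haskell
-- a Mealy machine: each input causes a transition and produces an output,
-- both depending on the current state
data Mealy s i o = Mealy
  { initialState :: s
  , step         :: s -> i -> (s, o)
  }

-- run the machine on an input word, collecting the output word
runMealy :: Mealy s i o -> [i] -> [o]
runMealy m = go (initialState m)
  where
    go _ []       = []
    go s (x : xs) = let (s', o) = step m s x in o : go s' xs
```

Gray-box learning, as described above, constrains the hypothesis space for `step` using prior knowledge about the component, rather than learning it from scratch.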
Parallel evaluation strategies for lazy data structures in Haskell
Conventional parallel programming is complex and error-prone. To improve programmer
productivity, we need to raise the level of abstraction with a higher-level
programming model that hides many parallel coordination aspects. Evaluation
strategies use non-strictness to separate the coordination and computation aspects
of a Glasgow parallel Haskell (GpH) program. This allows the specification of
high-level parallel programs, eliminating the low-level complexity of synchronisation and
communication associated with parallel programming.
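As a minimal example of this separation (standard GpH usage, not code from this thesis): the computation is ordinary sequential code, and the coordination is attached separately with a strategy.

```haskell
import Control.Parallel.Strategies

-- computation: unchanged sequential code
squares :: [Int] -> [Int]
squares = map (^ 2)

-- coordination: added non-invasively via a strategy; each list element
-- is sparked for parallel evaluation
squaresPar :: [Int] -> [Int]
squaresPar xs = map (^ 2) xs `using` parList rseq
```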
This thesis employs a data-structure-driven approach for parallelism derived through
generic parallel traversal and evaluation of sub-components of data structures. We
focus on evaluation strategies over list, tree and graph data structures, allowing
re-use across applications with minimal changes to the sequential algorithm.
In particular, we develop novel evaluation strategies for tree data structures, using
core functional programming techniques for coordination control and exploiting
non-strictness to control parallelism more flexibly. We apply the notion of fuel, a
resource that dictates how much parallelism is generated; the bi-directional flow of
fuel through a tree structure, implemented using a circular program definition, is a
novel way of controlling parallel evaluation. This is the first use of circular
programming in evaluation strategies, and it is complemented by a lazy function for
bounding the size of sub-trees.
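A fuel-driven tree strategy in this spirit can be sketched as follows (top-down fuel splitting only; the bi-directional, circular version described above is more involved):

```haskell
import Control.Parallel.Strategies

data Tree a = Leaf | Node (Tree a) a (Tree a)

-- fuel dictates parallelism generation: each spark consumes fuel, the
-- remainder is split between the subtrees, and exhausted fuel means
-- sequential evaluation
parTreeFuel :: Int -> Strategy a -> Strategy (Tree a)
parTreeFuel _ _ Leaf = pure Leaf
parTreeFuel fuel s (Node l x r)
  | fuel <= 0 = evalTree s (Node l x r)
  | otherwise = do
      l' <- rparWith (parTreeFuel half s) l
      x' <- s x
      r' <- rparWith (parTreeFuel (fuel - 1 - half) s) r
      pure (Node l' x' r')
  where half = (fuel - 1) `div` 2

-- sequential fallback once fuel runs out
evalTree :: Strategy a -> Strategy (Tree a)
evalTree _ Leaf         = pure Leaf
evalTree s (Node l x r) = Node <$> evalTree s l <*> s x <*> evalTree s r
```

A tree t can then be evaluated with, e.g., t `using` parTreeFuel 64 rdeepseq (given NFData on the elements); the fuel bound throttles the number of sparks independently of tree size.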
We extend these control mechanisms to graph structures and demonstrate performance
improvements on several parallel graph traversals. We combine circularity for
control, which improves the performance of the strategies, with circularity for
computation, using circular data structures. In particular, we develop a hybrid
traversal strategy for graphs that exploits breadth-first order to expose parallelism
initially, then proceeds in depth-first order to avoid the overhead of a full
parallel breadth-first traversal.
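The hybrid scheme can be sketched as follows, assuming a graph given as an adjacency function and generic names (not the thesis's implementation):

```haskell
import Control.DeepSeq (NFData)
import Control.Parallel.Strategies
import qualified Data.Set as Set

-- breadth-first for d levels to expose independent starting points, then a
-- depth-first exploration from each frontier vertex, sparked in parallel; for
-- simplicity each branch keeps a private visited set, so vertices reachable
-- from several frontier nodes may appear in more than one branch
hybrid :: (Ord v, NFData v) => (v -> [v]) -> Int -> v -> [[v]]
hybrid adj d root = branches `using` parList rdeepseq
  where
    bfs 0 vis fr = (vis, fr)
    bfs k vis fr =
      let next = Set.toList . Set.fromList $
                   [w | v <- fr, w <- adj v, w `Set.notMember` vis]
      in bfs (k - 1) (foldr Set.insert vis next) next
    (visited, frontier) = bfs d (Set.singleton root) [root]
    branches = [dfs (Set.delete v visited) [v] | v <- frontier]
    dfs _ [] = []
    dfs vis (v : vs)
      | v `Set.member` vis = dfs vis vs
      | otherwise          = v : dfs (Set.insert v vis) (adj v ++ vs)
```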
The efficiency of the tree strategies is evaluated on a benchmark program, and
two non-trivial case studies: a Barnes-Hut algorithm for the n-body problem and
sparse matrix multiplication, both using quad-trees. We also evaluate a graph search
algorithm implemented using the various traversal strategies.
We demonstrate improved performance on a server-class multicore machine with
up to 48 cores, with the advanced fuel splitting mechanisms proving to be more
flexible in throttling parallelism. To guide the behaviour of the strategies, we
develop heuristics for selecting their specific control parameters.
Quantum Cloning Machines and the Applications
The no-cloning theorem, which states that an unknown quantum state cannot be cloned
perfectly, is fundamental to quantum mechanics and quantum information science.
However, we can try to clone a quantum state approximately with optimal fidelity, or
instead try to clone it perfectly with the largest possible probability. Various
quantum cloning machines have thus been designed for
different quantum information protocols. Specifically, quantum cloning machines
can be designed to analyze the security of quantum key distribution protocols
such as BB84 protocol, six-state protocol, B92 protocol and their
generalizations. Well-known quantum cloning machines include the universal
quantum cloning machine, the phase-covariant cloning machine, the asymmetric
quantum cloning machine, and the probabilistic quantum cloning machine. In
recent years, much progress has been made in studying quantum cloning
machines and their applications and implementations, both theoretically and
experimentally. In this review, we give a complete description of these
important developments in quantum cloning and some related topics. The review
is self-contained and, in particular, presents some detailed formulations so
that further studies can build on
these results.
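As a concrete instance of optimal approximate cloning (the standard result of Bužek and Hillery for qubits, a fixture of such reviews), the universal 1-to-2 cloning transformation and its fidelity can be written as:

```latex
% universal 1 -> 2 qubit cloning (Buzek-Hillery), with |\Sigma> the initial
% blank-copy-plus-ancilla state and |\Psi^+> = (|01> + |10>)/sqrt(2):
\begin{align}
  |0\rangle\,|\Sigma\rangle &\to \sqrt{\tfrac{2}{3}}\;|00\rangle\,|{\uparrow}\rangle
    + \sqrt{\tfrac{1}{3}}\;|\Psi^{+}\rangle\,|{\downarrow}\rangle, \\
  |1\rangle\,|\Sigma\rangle &\to \sqrt{\tfrac{2}{3}}\;|11\rangle\,|{\downarrow}\rangle
    + \sqrt{\tfrac{1}{3}}\;|\Psi^{+}\rangle\,|{\uparrow}\rangle.
\end{align}
% Each output copy is then
% \rho = \tfrac{5}{6}\,|\psi\rangle\langle\psi|
%      + \tfrac{1}{6}\,|\psi^{\perp}\rangle\langle\psi^{\perp}|,
% giving fidelity F = \langle\psi|\rho|\psi\rangle = 5/6 for every input state.
```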