414 research outputs found
Separation logic for high-level synthesis
High-level synthesis (HLS) promises a significant shortening of the digital hardware design cycle by raising the abstraction level of the design entry to high-level languages such as C/C++. However, applications using dynamic, pointer-based data structures remain difficult to implement well, yet such constructs are widely used in software. Automated optimisations that leverage the memory bandwidth of dedicated hardware implementations by distributing the application data over separate on-chip memories and parallelise the implementation are often ineffective in the presence of dynamic data structures, due to the lack of an automated analysis that disambiguates pointer-based memory accesses. This thesis takes a step towards closing this gap. We explore recent advances in separation logic, a rigorous mathematical framework that enables formal reasoning about the memory access of heap-manipulating programs. We develop a static analysis that automatically splits heap-allocated data structures into provably disjoint regions. Our algorithm focuses on dynamic data structures accessed in loops and is accompanied by automated source-to-source transformations which enable loop parallelisation and physical memory partitioning by off-the-shelf HLS tools.
We then extend the scope of our technique to pointer-based memory-intensive implementations that require access to an off-chip memory. The extended HLS design aid generates parallel on-chip multi-cache architectures. It uses the disjointness property of memory accesses to support non-overlapping memory regions by private caches. It also identifies regions which are shared after parallelisation and which are supported by parallel caches with a coherency mechanism and synchronisation, resulting in automatically specialised memory systems. We show up to 15x acceleration from heap partitioning, parallelisation and the insertion of the custom cache system in demonstrably practical applications.Open Acces
Q(sqrt(-3))-Integral Points on a Mordell Curve
We use an extension of quadratic Chabauty to number fields,recently developed by the author with Balakrishnan, Besser and M ̈uller,combined with a sieving technique, to determine the integral points overQ(√−3) on the Mordell curve y2 = x3 − 4
Learning to see across domains and modalities
Deep learning has recently raised hopes and expectations as a general solution for many applications (computer vision, natural language processing, speech recognition, etc.); indeed it has proven effective, but it also showed a strong dependence on large quantities of data. Generally speaking, deep learning models are especially susceptible to overfitting, due to their large number of internal parameters.
Luckily, it has also been shown that, even when data is scarce, a successful model can be trained by reusing prior knowledge. Thus, developing techniques for \textit{transfer learning} (as this process is known), in its broadest definition, is a crucial element towards the deployment of effective and accurate intelligent systems into the real world.
This thesis will focus on a family of transfer learning methods applied to the task of visual object recognition, specifically image classification. The visual recognition problem is central to computer vision research: many desired applications, from robotics to information retrieval, demand the ability to correctly identify categories, places, and objects.
Transfer learning is a general term, and specific settings have been given specific names: when the learner has access to only unlabeled data from the target domain (where the model should perform) and labeled data from a different domain (the source), the problem is called unsupervised domain adaptation (DA). The first part of this thesis will focus on three methods for this setting.
The three presented techniques for domain adaptation are fully distinct: the first one proposes the use of Domain Alignment layers to structurally align the distributions of the source and target domains in feature space. While the general idea of aligning feature distribution is not novel,
we distinguish our method by being one of the very few that do so without adding losses. The second method is based on GANs: we propose a bidirectional architecture that jointly learns how to map the source images into the target visual style and vice-versa, thus alleviating the domain shift at the pixel level. The third method features an adversarial learning process that transforms both the images and the features of both domains in order to map them to a common, agnostic, space.
While the first part of the thesis presented general purpose DA methods, the second part will focus on the real life issues of robotic perception, specifically RGB-D recognition.
Robotic platforms are usually not limited to color perception; very often they also carry a Depth camera.
Unfortunately, the depth modality is rarely used for visual recognition due to the lack of pretrained models from which to transfer and little data to train one on from scratch.
We will first explore the use of synthetic data as proxy for real images by training a Convolutional Neural Network (CNN) on virtual depth maps, rendered from 3D CAD models, and then testing it on real robotic datasets. The second approach leverages the existence of RGB pretrained models, by learning how to map the depth data into the most discriminative RGB representation and then using existing models for recognition. This second technique is actually a pretty generic Transfer Learning method which can be applied to share knowledge across modalities
A Dynamically Scheduled HLS Flow in MLIR
In High-Level Synthesis (HLS), we consider abstractions that span from software to hardware and target heterogeneous architectures. Therefore, managing the complexity introduced by this is key to implementing good, maintainable, and extendible HLS compilers. Traditionally, HLS flows have been built on top of software compilation infrastructure such as LLVM, with hardware aspects of the flow existing peripherally to the core of the compiler. Through this work, we aim to show that MLIR, a compiler infrastructure with a focus on domain-specific intermediate representations (IR), is a better infrastructure for HLS compilers. Using MLIR, we define HLS and hardware abstractions as first-class citizens of the compiler, simplifying analysis, transformations, and optimization. To demonstrate this, we present a C-to-RTL, dynamically scheduled HLS flow. We find that our flow generates circuits comparable to those of an equivalent LLVM-based HLS compiler. Notably, we achieve this while lacking key optimization passes typically found in HLS compilers and through the use of an experimental front-end. To this end, we show that significant improvements in the generated RTL are but low-hanging fruit, requiring engineering effort to attain. We believe that our flow is more modular and more extendible than comparable open-source HLS compilers and is thus a good candidate as a basis for future research. Apart from the core HLS flow, we provide MLIR-based tooling for C-to-RTL cosimulation and visual debugging, with the ultimate goal of building an MLIR-based HLS infrastructure that will drive innovation in the field
A framework for network traffic analysis using GPUs
During the last years the computer networks have become an important part of our society.
Networks have kept growing in size and complexity, making more complex its management
and traffic monitoring and analysis processes, due to the huge amount of data and calculations
involved.
In the last decade, several researchers found effective to use graphics processing units (GPUs)
rather than a traditional processors (CPU) to boost the execution of some algorithms not related
to graphics (GPGPU). In 2006 the GPU chip manufacturer NVIDIA launched CUDA, a
library that allows software developers to use their GPUs to perform general purpose algorithm
calculations, using the C programming language.
This thesis presents a framework which tries to simplify the task of programming network traffic
analysis with CUDA to software developers. The objectives of the framework have been abstracting
the task of obtaining network packets, simplify the task of creating network analysis
programs using CUDA and offering an easy way to reuse the analysis code. Several network
traffic analysis have also been developed
Recommended from our members
Towards justifying computer algebra algorithms in Isabelle/HOL
As verification efforts using interactive theorem proving grow, we are in need of certified algorithms in computer algebra to tackle problems over the real numbers. This is important because uncertified procedures can drastically increase the size of the trust base and under- mine the overall confidence established by interactive theorem provers, which usually rely on a small kernel to ensure the soundness of derived results.
This thesis describes an ongoing effort using the Isabelle theorem prover to certify the cylindrical algebraic decomposition (CAD) algorithm, which has been widely implemented to solve non-linear problems in various engineering and mathematical fields. Because of the sophistication of this algorithm, people are in doubt of the correctness of its implementation when deploying it to safety-critical verification projects, and such doubts motivate this thesis.
In particular, this thesis proposes a library of real algebraic numbers, whose distinguishing features include a modular architecture and a sign determination algorithm requiring only rational arithmetic. With this library, an Isabelle tactic based on univariate CAD has been built in a certificate-based way: external, untrusted code delivers solutions in the form of certificates that are checked within Isabelle. To lay the foundation for the multivariate case, I have formalised various analytical results including Cauchy’s residue theorem and the bivariate case of the projection theorem of CAD. During this process, I have also built a tactic to evaluate winding numbers through Cauchy indices and verified procedures to count complex roots in some domains.
The formalisation effort in this thesis can be considered as the first step towards a certified computer algebra system inside a theorem prover, so that various engineering projections and mathematical calculations can be carried out in a high-confidence framework
- …