Towards A Practical High-Assurance Systems Programming Language
Writing correct and performant low-level systems code is a notoriously demanding job, even for experienced developers. To make matters worse, formally reasoning about its correctness properties introduces yet another level of complexity to the task. It requires considerable expertise in both systems programming and formal verification. The development can be extremely costly due to the sheer complexity of the systems and the nuances in them, if not assisted by appropriate tools that provide abstraction and automation.
Cogent is designed to alleviate the burden on developers when writing and verifying systems code. It is a high-level functional language with a certifying compiler, which automatically proves the correctness of the compiled code and also provides a purely functional abstraction of the low-level program to the developer. Equational reasoning techniques can then be used to prove functional correctness properties of the program on top of this abstract semantics, which is notably less laborious than directly verifying the C code.
To make Cogent a more approachable and effective tool for developing real-world systems, we further strengthen the framework by extending the core language and its ecosystem. Specifically, we enrich the language to allow users to control the memory representation of algebraic data types, while retaining the automatic proof with a data layout refinement calculus. We repurpose existing tools in a novel way and develop an intuitive foreign function interface, which provides users with a seamless experience when using Cogent in conjunction with native C. We augment the Cogent ecosystem with a property-based testing framework, which helps developers better understand the impact formal verification has on their programs and enables a progressive approach to producing high-assurance systems. Finally, we explore refinement type systems, which we plan to incorporate into Cogent for more expressiveness and better integration of systems programmers with the verification process.
Determination of the strong coupling αs from transverse energy-energy correlations in multi-jet events at √s = 13 TeV with the ATLAS detector
Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Facultad de Ciencias, Departamento de Física Teórica. Date of defense: 24-02-202
Modular Collaborative Program Analysis
With our world increasingly relying on computers, it is important to ensure the quality, correctness, security, and performance of software systems. Static analysis that computes properties of computer programs without executing them has been an important method to achieve this for decades. However, static analysis faces major challenges in increasingly complex programming languages and software systems and increasing and sometimes conflicting demands for soundness, precision, and scalability. In order to cope with these challenges, it is necessary to build static analyses for complex problems from small, independent, yet collaborating modules that can be developed in isolation and combined in a plug-and-play manner.
So far, no generic architecture to implement and combine a broad range of dissimilar static analyses exists. The goal of this thesis is thus to design such an architecture and implement it as a generic framework for developing modular, collaborative static analyses. We use several diverse case-study analyses from which we systematically derive requirements to guide the design of the framework. Based on this, we propose the use of a blackboard-architecture style collaboration of analyses that we implement in the OPAL framework. We also develop a formal model of our architecture's core concepts and show how it enables freely composing analyses while retaining their soundness guarantees.
We showcase and evaluate our architecture using the case-study analyses, each of which shows how important and complex problems of static analysis can be addressed using a modular, collaborative implementation style. In particular, we show how a modular architecture for the construction of call graphs ensures consistent soundness of different algorithms. We show how modular analyses for different aspects of immutability mutually benefit each other. Finally, we show how the analysis of method purity can benefit from the use of other complex analyses in a collaborative manner and from exchanging different analysis implementations that exhibit different characteristics. Each of these case studies improves over the respective state of the art in terms of soundness, precision, and/or scalability and shows how our architecture enables experimenting with and fine-tuning trade-offs between these qualities.
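The blackboard-style collaboration mentioned in this abstract can be illustrated with a minimal sketch. It is not OPAL's API: the function names, the property kinds, and the toy rule that a method is pure when its receiver class is immutable are all assumptions made purely for illustration. Each analysis reads from and posts to a shared store, and a driver reruns them until no result changes.

```python
from typing import Callable, Dict, List, Tuple

# Shared store: (entity, property kind) -> computed property value.
Blackboard = Dict[Tuple[str, str], str]
Analysis = Callable[[Blackboard], Blackboard]


def immutability_analysis(board: Blackboard) -> Blackboard:
    # Hypothetical facts; a real analysis would inspect the program.
    return {
        ("Point", "immutability"): "immutable",
        ("Buffer", "immutability"): "mutable",
    }


def purity_analysis(board: Blackboard) -> Blackboard:
    # Toy rule (an assumption): a method is pure if its receiver class
    # is immutable. The result depends on another module's output.
    results: Blackboard = {}
    for method, receiver in [("Point.norm", "Point"), ("Buffer.append", "Buffer")]:
        immutability = board.get((receiver, "immutability"))
        if immutability is None:
            continue  # dependency not posted yet; retry in a later round
        results[(method, "purity")] = (
            "pure" if immutability == "immutable" else "impure"
        )
    return results


def run_to_fixed_point(analyses: List[Analysis], max_rounds: int = 10) -> Blackboard:
    board: Blackboard = {}
    for _ in range(max_rounds):
        changed = False
        for analysis in analyses:
            for key, value in analysis(board).items():
                if board.get(key) != value:
                    board[key] = value
                    changed = True
        if not changed:
            break  # no analysis produced anything new
    return board


# Registration order does not matter: purity simply waits for its input.
board = run_to_fixed_point([purity_analysis, immutability_analysis])
```

Because the analyses communicate only through the store, either one could be swapped for a different implementation without touching the other, which is the plug-and-play property the abstract describes.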
Improving Data Locality in Applications through Execution Delegation
With the slowing or even death of Moore's Law, computer system architectures are trending toward more CPU cores. This trend has driven systems researchers to explore novel ways of utilizing this computational power for improved efficiency and performance. One such approach is to use this power to help alleviate the memory wall problem through execution delegation. The memory wall problem describes the issue whereby system performance hits a wall that is dictated by the latency of accessing main memory. Using execution delegation, the execution of the application on one core is delegated to another core. The desired result is that the cores of the system are specialized to access mostly disjoint sets of data. In this way, data locality and, therefore, performance are improved.
The aim of this work is to develop tools and methods for predicting situations in which execution delegation via user thread migration is useful for improving an application's data locality. To this end, a microbenchmarking tool named Accesstest is used to perform a systematic study of execution delegation via user thread migration. Further, an approach, which makes use of a working set characterization tool named Accessprof, is developed to predict the qualitative impact of delegating an execution sequence. This prediction approach is verified and used to improve the Apache HTTP server's performance by as much as 11%.
Operatic Pasticcios in 18th-Century Europe
In Early Modern times, techniques of assembling, compiling and arranging pre-existing material were part of the established working methods in many arts. In the world of 18th-century opera, such practices ensured that operas could become a commercial success because the substitution or compilation of arias fitting the singer's abilities proved the best recipe for fulfilling the expectations of audiences. Known as »pasticcios« since the 18th century, these operas have long been considered inferior patchwork. The volume collects essays that reconsider the pasticcio, contextualize it, define its preconditions, look at its material aspects and uncover its aesthetic principles.
LASSO – an observatorium for the dynamic selection, analysis and comparison of software
Mining software repositories at the scale of 'big code' (i.e., big data) is a challenging activity. As well as finding a suitable software corpus and making it programmatically accessible through an index or database, researchers and practitioners have to establish an efficient analysis infrastructure and precisely define the metrics and data extraction approaches to be applied. Moreover, for analysis results to be generalisable, these tasks have to be applied at a large enough scale to have statistical significance, and if they are to be repeatable, the artefacts need to be carefully maintained and curated over time. Today, however, a lot of this work is still performed by human beings on a case-by-case basis, with the level of effort involved often having a significant negative impact on the generalisability and repeatability of studies, and thus on their overall scientific value.
The general-purpose 'code mining' repositories and infrastructures that have emerged in recent years represent a significant step forward because they automate many software mining tasks at an ultra-large scale and allow researchers and practitioners to focus on defining the questions they would like to explore at an abstract level. However, they are currently limited to static analysis and data extraction techniques, and thus cannot support (i.e., help automate) any studies which involve the execution of software systems. This includes experimental validations of techniques and tools that hypothesise about the behaviour (i.e., semantics) of software, or data analysis and extraction techniques that aim to measure dynamic properties of software.
In this thesis, a platform called LASSO (Large-Scale Software Observatorium) is introduced that overcomes this limitation by automating the collection of dynamic (i.e., execution-based) information about software alongside static information. It features a single, ultra-large-scale corpus of executable software systems created by amalgamating existing Open Source software repositories and a dedicated DSL for defining abstract selection and analysis pipelines. Its key innovations are integrated capabilities for searching for and selecting software systems based on their exhibited behaviour and an 'arena' that allows their responses to software tests to be compared in a purely data-driven way. We call the platform a 'software observatorium' since it is a place where the behaviour of large numbers of software systems can be observed, analysed and compared.
Deployment of Deep Neural Networks on Dedicated Hardware Accelerators
Deep Neural Networks (DNNs) have established themselves as powerful tools for a wide range of complex tasks, for example computer vision or natural language processing. DNNs are notoriously demanding on compute resources and, as a result, dedicated hardware accelerators are developed for all use cases. Different accelerators provide solutions ranging from hyper-scaling cloud environments for the training of DNNs to inference devices in embedded systems. They implement intrinsics for complex operations directly in hardware. A common example is intrinsics for matrix multiplication. However, there exists a gap between the ecosystems of applications for deep learning practitioners and hardware accelerators. How DNNs can efficiently utilize the specialized hardware intrinsics is still mainly defined by human hardware and software experts.
Methods to automatically utilize hardware intrinsics in DNN operators are a subject of active research. Existing literature often works with transformation-driven approaches, which aim to establish a sequence of program rewrites and data-layout transformations such that the hardware intrinsic can be used to compute the operator. However, the complexity of this task has not yet been explored, especially for less frequently used operators like Capsule Routing. Not only is the implementation of DNN operators with intrinsics challenging; their optimization on the target device is also difficult. Hardware-in-the-loop tools are often used for this problem. They use latency measurements of implementation candidates to find the fastest one. However, specialized accelerators can have memory and programming limitations, so that not every arithmetically correct implementation is a valid program for the accelerator. These invalid implementations can lead to unnecessarily long optimization times.
This work investigates the complexity of transformation-driven processes to automatically embed hardware intrinsics into DNN operators. This complexity is explored with a custom, graph-based intermediate representation (IR). While operators like Fully Connected Layers can be handled with reasonable effort, increasing operator complexity or advanced data-layout transformations can lead to scaling issues.
Building on these insights, this work proposes a novel method to embed hardware intrinsics into DNN operators. It is based on a dataflow analysis. The dataflow embedding method allows the exploration of how intrinsics and operators match without explicit transformations. From the results it can derive the data layout and program structure necessary to compute the operator with the intrinsic. A prototype implementation for a dedicated hardware accelerator demonstrates state-of-the-art performance for a wide range of convolutions, while being agnostic to the data layout. For some operators in the benchmark, the presented method can also generate alternative implementation strategies to improve hardware utilization, resulting in a geo-mean speed-up of ×2.813 while reducing the memory footprint. Lastly, by curating the initial set of possible implementations for the hardware-in-the-loop optimization, the median time-to-solution is reduced by a factor of ×2.40. At the same time, the possibility of prolonged searches due to a bad initial set of implementations is reduced, improving the optimization's robustness by ×2.35.
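For readers unfamiliar with the "geo-mean speed-up" metric reported in this abstract, the following is a minimal sketch of how a geometric mean of per-operator speed-ups is typically computed. The speed-up values are hypothetical placeholders, not data from the thesis, and the function name is an assumption.

```python
import math
from typing import Sequence


def geometric_mean(values: Sequence[float]) -> float:
    """Return the n-th root of the product of n positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    # Averaging logarithms avoids overflow from multiplying many ratios.
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-operator speed-ups (baseline latency / new latency).
speedups = [1.0, 2.0, 4.0]
geo_mean_speedup = geometric_mean(speedups)
```

The geometric mean is the usual choice for averaging ratios because a ×2 speed-up on one operator and a ×0.5 slowdown on another cancel to ×1, which an arithmetic mean would not reflect.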
Molecular dynamics simulations of nanoclusters in neuromorphic systems
Neuromorphic computing is a new computing paradigm that deals with computing tasks using inter-connected artificial neurons inspired by the natural neurons in the human brain. This computational architecture is more efficient in performing many complex tasks such as pattern recognition and shows promise in overcoming some of the limitations of conventional computers. Among the emerging types of artificial neurons, a cluster-based neuromorphic device is a promising system with good cost-efficiency because of a simple fabrication process. This device functions using the formation and breakage of the connections ("synapses") between clusters, driven by the bias voltage applied to the clusters. The mechanisms of the formation and breakage of these connections are thus of the utmost interest. In this thesis, the molecular dynamics simulation method is used to explore the mechanisms of the formation and breakage of the connections ("filaments") between the clusters in a model of a neuromorphic device. First, the Joule heating mechanism of filament breakage is explored using a model consisting of an Au nanowire that connects two Au1415 clusters. Upon heating, the atoms of the nanofilament gradually aggregate towards the clusters, causing the middle of the wire to gradually thin and then suddenly break. Most of the system remains crystalline during this process, but the centre becomes molten. The terminal clusters increase the melting point of the nanowires by fixing them and act as recrystallisation regions. A strong dependence of the breaking temperature is found not only on the width of the nanowires but also on their length and atomic structure. Secondly, the bridge formation and thermal breaking processes between Au1415 clusters on a graphite substrate are also simulated. The bridging process, which can heal a broken filament, is driven by diffusion of gold along the graphite substrate.
The characteristic times of bridge formation are explored at elevated simulation temperatures to estimate the longer characteristic times of formation at room-temperature conditions. The width of the bridge formed has a power-law dependence on the simulation time, and the mechanism is a combination of diffusion and viscous flow. Simulations of bridge breaking are also conducted and reveal the existence of a voltage threshold that must be reached to activate the breakage. The role of the substrate in the bridge formation and breakage processes is revealed as a medium of diffusion. Lastly, to explore future potential cluster materials, the thermal behaviour of Pb-Al core-shell clusters is studied. The core and shell are found to melt separately. In fact, the core atoms of nanoclusters tend to escape their shells and partially cover them, leading to a preference for a segregated state. The melting point of the core can either be depressed or elevated, depending on the thickness of the shell, due to different mechanisms.
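The power-law dependence of bridge width on simulation time noted in this abstract can be made concrete with a short sketch. Assuming the form w(t) = a · t^b (the symbols a and b, the function name, and the data are illustrative assumptions; the abstract gives no exponent), the parameters can be estimated by a least-squares line fit in log-log space.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(times: Sequence[float], widths: Sequence[float]) -> Tuple[float, float]:
    """Fit w = a * t**b by linear least squares on (log t, log w)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(w) for w in widths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)    # intercept gives the prefactor
    return a, b


# Synthetic, noise-free data generated from a known law (not thesis data).
times = [1.0, 2.0, 4.0, 8.0]
widths = [0.5 * t ** 0.25 for t in times]
a, b = fit_power_law(times, widths)
```

On real simulation output the fitted exponent would carry uncertainty, and its value is what would help distinguish diffusion-dominated from viscous-flow-dominated growth.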