850 research outputs found
Asynchronous Execution of Python Code on Task Based Runtime Systems
Despite advancements in the areas of parallel and distributed computing, the
complexity of programming on High Performance Computing (HPC) resources has
deterred many domain experts, especially in the areas of machine learning and
artificial intelligence (AI), from utilizing performance benefits of such
systems. Researchers and scientists favor high-productivity languages to avoid
the inconvenience of programming in low-level languages and costs of acquiring
the necessary skills required for programming at this level. In recent years,
Python, with the support of linear algebra libraries like NumPy, has gained
popularity despite facing limitations which prevent this code from distributed
runs. Here we present a solution which maintains both high level programming
abstractions as well as parallel and distributed efficiency. Phylanx, is an
asynchronous array processing toolkit which transforms Python and NumPy
operations into code which can be executed in parallel on HPC resources by
mapping Python and NumPy functions and variables into a dependency tree
executed by HPX, a general purpose, parallel, task-based runtime system written
in C++. Phylanx additionally provides introspection and visualization
capabilities for debugging and performance analysis. We have tested the
foundations of our approach by comparing our implementation of widely used
machine learning algorithms to accepted NumPy standards
A Short Survey on Data Clustering Algorithms
With rapidly increasing data, clustering algorithms are important tools for
data analytics in modern research. They have been successfully applied to a
wide range of domains; for instance, bioinformatics, speech recognition, and
financial analysis. Formally speaking, given a set of data instances, a
clustering algorithm is expected to divide the set of data instances into the
subsets which maximize the intra-subset similarity and inter-subset
dissimilarity, where a similarity measure is defined beforehand. In this work,
the state-of-the-arts clustering algorithms are reviewed from design concept to
methodology; Different clustering paradigms are discussed. Advanced clustering
algorithms are also discussed. After that, the existing clustering evaluation
metrics are reviewed. A summary with future insights is provided at the end
An Object-Oriented Framework for Explicit-State Model Checking
This paper presents a conceptual architecture for an object-oriented framework to support the development of formal veriļ¬cation tools (i.e. model checkers). The objective of the architecture is to support the reuse of algorithms and to encourage a modular design of tools. The conceptual framework is accompanied by a C++ implementation which provides reusable algorithms for the simulation and veriļ¬cation of explicit-state models as well as a model representation for simple models based on guard-based process descriptions. The framework has been successfully used to develop a model checker for a subset of PROMELA
A Simple and Flexible Way of Computing Small Unsatisfiable Cores in SAT Modulo Theories
Finding small unsatisfiable cores for SAT problems has recently received a lot of interest, mostly for its applications in formal verification. However, propositional logic is often not expressive enough for representing many interesting verification problems, which can be more naturally addressed in the framework of Satisfiability Modulo Theories, SMT. Surprisingly, the problem of finding unsatisfiable cores in SMT has received very little attention in the literature; in particular, we are not aware of any work aiming at producing small unsatisfiable cores in SMT. In this paper we present a novel approach to this problem. The main idea is to combine an SMT solver with an external propositional core extractor: the SMT solver produces the theory lemmas found during the search; the core extractor is then called on the boolean abstraction of the original SMT problem and of the theory lemmas. This results in an unsatisfiable core for the original SMT problem, once the remaining theory lemmas have been removed. The approach is conceptually interesting, since the SMT solver is used to dynamically lift the suitable amount of theory information to the boolean level, and it also has several advantages in practice. In fact, it is extremely simple to implement and to update, and it can be interfaced with every propositional core extractor in a plug-and-play manner, so that to benefit for free of all unsat-core reduction techniques which have been or will be made available. We have evaluated our approach by an extensive empirical test on SMT-LIB benchmarks, which confirms the validity and potential of this approach
Separation logic for high-level synthesis
High-level synthesis (HLS) promises a significant shortening of the digital hardware design cycle by raising the abstraction level of the design entry to high-level languages such as C/C++. However, applications using dynamic, pointer-based data structures remain difficult to implement well, yet such constructs are widely used in software. Automated optimisations that leverage the memory bandwidth of dedicated hardware implementations by distributing the application data over separate on-chip memories and parallelise the implementation are often ineffective in the presence of dynamic data structures, due to the lack of an automated analysis that disambiguates pointer-based memory accesses. This thesis takes a step towards closing this gap. We explore recent advances in separation logic, a rigorous mathematical framework that enables formal reasoning about the memory access of heap-manipulating programs. We develop a static analysis that automatically splits heap-allocated data structures into provably disjoint regions. Our algorithm focuses on dynamic data structures accessed in loops and is accompanied by automated source-to-source transformations which enable loop parallelisation and physical memory partitioning by off-the-shelf HLS tools.
We then extend the scope of our technique to pointer-based memory-intensive implementations that require access to an off-chip memory. The extended HLS design aid generates parallel on-chip multi-cache architectures. It uses the disjointness property of memory accesses to support non-overlapping memory regions by private caches. It also identifies regions which are shared after parallelisation and which are supported by parallel caches with a coherency mechanism and synchronisation, resulting in automatically specialised memory systems. We show up to 15x acceleration from heap partitioning, parallelisation and the insertion of the custom cache system in demonstrably practical applications.Open Acces
SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training
In an era where symbolic mathematical equations are indispensable for
modeling complex natural phenomena, scientific inquiry often involves
collecting observations and translating them into mathematical expressions.
Recently, deep learning has emerged as a powerful tool for extracting insights
from data. However, existing models typically specialize in either numeric or
symbolic domains, and are usually trained in a supervised manner tailored to
specific tasks. This approach neglects the substantial benefits that could
arise from a task-agnostic unified understanding between symbolic equations and
their numeric counterparts. To bridge the gap, we introduce SNIP, a
Symbolic-Numeric Integrated Pre-training, which employs joint contrastive
learning between symbolic and numeric domains, enhancing their mutual
similarities in the pre-trained embeddings. By performing latent space
analysis, we observe that SNIP provides cross-domain insights into the
representations, revealing that symbolic supervision enhances the embeddings of
numeric data and vice versa. We evaluate SNIP across diverse tasks, including
symbolic-to-numeric mathematical property prediction and numeric-to-symbolic
equation discovery, commonly known as symbolic regression. Results show that
SNIP effectively transfers to various tasks, consistently outperforming fully
supervised baselines and competing strongly with established task-specific
methods, especially in few-shot learning scenarios where available data is
limited
Verification and synthesis of asynchronous control circuits using petri net unfoldings
PhD ThesisDesign of asynchronous control circuits has traditionally been associated with application of
formal methods. Event-based models, such as Petri nets, provide a compact and easy to
understand way of specifying asynchronous behaviour. However, analysis of their behavioural
properties is often hindered by the problem of exponential growth of reachable state space.
This work proposes a new method for analysis of asynchronous circuit models based on Petri
nets. The new approach is called PN-unfolding segment. It extends and improves existing
Petri nets unfolding approaches. In addition, this thesis proposes a new analysis technique
for Signal Transition Graphs along with an efficient verification technique which is also based
on the Petri net unfolding. The former is called Full State Graph, the latter - STG-unfolding
segment. The boolean logic synthesis is an integral part of the asynchronous circuit design
process. In many cases, even if the verification of an asynchronous circuit specification has
been performed successfully, it is impossible to obtain its implementation using existing methods
because they are based on the reachability analysis. A new approach is proposed here
for automated synthesis of speed-independent circuits based on the STG-unfolding segment
constructed during the verification of the circuit's specification. Finally, this work presents
experimental results showing the need for the new Petri net unfolding techniques and confirming
the advantages of application of partial order approach to analysis, verification and
synthesis of asynchronous circuits.The Research Committee, Newcastle University:
Overseas Research Studentship Award
Exploring Spin-transfer-torque devices and memristors for logic and memory applications
As scaling CMOS devices is approaching its physical limits, researchers have begun exploring newer devices and architectures to replace CMOS.
Due to their non-volatility and high density, Spin Transfer Torque (STT) devices are among the most prominent candidates for logic and memory applications. In this research, we first considered a new logic style called All Spin Logic (ASL). Despite its advantages, ASL consumes a large amount of static power; thus, several optimizations can be performed to address this issue. We developed a systematic methodology to perform the optimizations to ensure stable operation of ASL.
Second, we investigated reliable design of STT-MRAM bit-cells and addressed the conflicting read and write requirements, which results in overdesign of the bit-cells. Further, a Device/Circuit/Architecture co-design framework was developed to optimize the STT-MRAM devices by exploring the design space through jointly considering yield enhancement techniques at different levels of abstraction.
Recent advancements in the development of memristive devices have opened new opportunities for hardware implementation of non-Boolean computing. To this end, the suitability of memristive devices for swarm intelligence algorithms has enabled researchers to solve a maze in hardware. In this research, we utilized swarm intelligence of memristive networks to perform image edge detection. First, we proposed a hardware-friendly algorithm for image edge detection based on ant colony. Next, we designed the image edge detection algorithm using memristive networks
Obtaining Real-World Benchmark Programs from Open-Source Repositories Through Abstract-Semantics Preserving Transformations
Benchmark programs are an integral part of program analysis research. Researchers use benchmark programs to evaluate existing techniques and test the feasibility of new approaches. The larger and more realistic the set of benchmarks, the more confident a researcher can be about the correctness and reproducibility of their results. However, obtaining an adequate set of benchmark programs has been a long-standing challenge in the program analysis community.
In this thesis, we present the APT tool, a framework we designed and implemented to automate the generation of realistic benchmark programs suitable for program analysis evaluations. Our tool targets intra-procedural analyses that operate on an integer domain, specifically symbolic execution. The framework is composed of three main stages. In the first stage, the tool extracts potential benchmark programs from open-source repositories suitable for symbolic execution. In the second stage, the tool transforms the extracted programs into compilable, stand-alone benchmarks by removing external dependencies and nonlinear expressions. In the third stage, the benchmarks are verified and made available for the user.
We have designed our transformation algorithms to remove program dependencies and nonlinear expressions while preserving their semantics-equivalence in the abstraction of symbolic analysis. That is, we want the information the analysis computes on the original program and its transformed version to be equivalent. Our work provides static analysis researchers with concise, compilable benchmark programs that are relevant to symbolic execution, allowing them to focus their efforts on advancing analysis techniques. Furthermore, our work benefits the software engineering community by enabling static analysis researchers to perform benchmarking with a large, realistic set of programs, thus strengthening the empirical evidence of the advancements in static program analysis
StaticFixer: From Static Analysis to Static Repair
Static analysis tools are traditionally used to detect and flag programs that
violate properties. We show that static analysis tools can also be used to
perturb programs that satisfy a property to construct variants that violate the
property. Using this insight we can construct paired data sets of unsafe-safe
program pairs, and learn strategies to automatically repair property
violations. We present a system called \sysname, which automatically repairs
information flow vulnerabilities using this approach. Since information flow
properties are non-local (both to check and repair), \sysname also introduces a
novel domain specific language (DSL) and strategy learning algorithms for
synthesizing non-local repairs. We use \sysname to synthesize strategies for
repairing two types of information flow vulnerabilities, unvalidated dynamic
calls and cross-site scripting, and show that \sysname successfully repairs
several hundred vulnerabilities from open source {\sc JavaScript} repositories,
outperforming neural baselines built using {\sc CodeT5} and {\sc Codex}. Our
datasets can be downloaded from \url{http://aka.ms/StaticFixer}
- ā¦