850 research outputs found

    Asynchronous Execution of Python Code on Task Based Runtime Systems

    Get PDF
    Despite advancements in the areas of parallel and distributed computing, the complexity of programming on High Performance Computing (HPC) resources has deterred many domain experts, especially in the areas of machine learning and artificial intelligence (AI), from utilizing the performance benefits of such systems. Researchers and scientists favor high-productivity languages to avoid the inconvenience of programming in low-level languages and the cost of acquiring the skills required for programming at this level. In recent years, Python, with the support of linear algebra libraries like NumPy, has gained popularity despite limitations that prevent such code from being run in distributed settings. Here we present a solution which maintains both high-level programming abstractions and parallel and distributed efficiency. Phylanx is an asynchronous array processing toolkit which transforms Python and NumPy operations into code that can be executed in parallel on HPC resources, by mapping Python and NumPy functions and variables into a dependency tree executed by HPX, a general-purpose, parallel, task-based runtime system written in C++. Phylanx additionally provides introspection and visualization capabilities for debugging and performance analysis. We have tested the foundations of our approach by comparing our implementation of widely used machine learning algorithms to accepted NumPy standards.
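    The dependency-tree idea can be sketched compactly. The toy Node/Const classes and evaluate function below are illustrative assumptions, not the Phylanx API: array expressions first build a tree without doing any work, and evaluation then walks the tree, so a runtime such as HPX could dispatch independent subtrees as concurrent tasks.

```python
import numpy as np

# Minimal sketch of Phylanx-style execution: expressions build a
# dependency tree first; evaluation walks the tree afterwards.
# Node/Const/evaluate are illustrative names, not the Phylanx API.

class Node:
    def __init__(self, op, *children):
        self.op, self.children = op, children

    def __add__(self, other):
        return Node(np.add, self, other)

    def __matmul__(self, other):
        return Node(np.matmul, self, other)

class Const(Node):
    def __init__(self, value):
        super().__init__(None)
        self.value = np.asarray(value)

def evaluate(node):
    if isinstance(node, Const):
        return node.value
    # The children are independent subtrees: a task-based runtime
    # could evaluate them concurrently; here we recurse sequentially.
    args = [evaluate(c) for c in node.children]
    return node.op(*args)

a, b = Const(np.ones((2, 2))), Const(np.eye(2))
tree = (a @ b) + a      # builds the dependency tree, no work yet
print(evaluate(tree))   # executes the tree
```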

    A Short Survey on Data Clustering Algorithms

    Full text link
    With rapidly increasing data, clustering algorithms have become important tools for data analytics in modern research. They have been successfully applied to a wide range of domains, for instance bioinformatics, speech recognition, and financial analysis. Formally speaking, given a set of data instances, a clustering algorithm is expected to divide them into subsets that maximize intra-subset similarity and inter-subset dissimilarity, where the similarity measure is defined beforehand. In this work, state-of-the-art clustering algorithms are reviewed from design concept to methodology: different clustering paradigms are discussed, as are advanced clustering algorithms. After that, existing clustering evaluation metrics are reviewed. A summary with future insights is provided at the end.
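    As a concrete instance of the formal objective above, the sketch below implements k-means, a representative centroid-based algorithm, with Euclidean distance as the (dis)similarity measure; it is a generic illustration, not code from the survey.

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Minimal k-means: iteratively assign points to the nearest
    centroid, then recompute centroids, lowering intra-cluster
    distance. Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None] - centers, axis=2)
        labels = dists.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new, centers):   # converged
            break
        centers = new
    return labels, centers

# Two well-separated Gaussian blobs as toy data.
X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
labels, centers = kmeans(X, k=2)
```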

    An Object-Oriented Framework for Explicit-State Model Checking

    Get PDF
    This paper presents a conceptual architecture for an object-oriented framework to support the development of formal verification tools (i.e. model checkers). The objective of the architecture is to support the reuse of algorithms and to encourage a modular design of tools. The conceptual framework is accompanied by a C++ implementation which provides reusable algorithms for the simulation and verification of explicit-state models as well as a model representation for simple models based on guard-based process descriptions. The framework has been successfully used to develop a model checker for a subset of PROMELA.
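    The core reusable algorithm in such a framework is explicit-state reachability over a guard-based process description. The sketch below shows that pattern in miniature (in Python rather than the framework's C++, with illustrative names): transitions are (guard, action) pairs, and a breadth-first search checks an invariant over all reachable states.

```python
from collections import deque

# Guarded transitions over integer-valued states: each entry is
# (guard, action). Illustrative toy model, not the framework's API.
transitions = [
    (lambda s: s < 10, lambda s: s + 3),       # guarded increment
    (lambda s: s % 2 == 0, lambda s: s // 2),  # guarded halving
]

def check_invariant(init, invariant):
    """Explicit-state BFS: explore every reachable state, reporting
    the first state that violates the invariant, if any."""
    seen, frontier = {init}, deque([init])
    while frontier:
        s = frontier.popleft()
        if not invariant(s):
            return False, s      # counterexample state
        for guard, action in transitions:
            if guard(s):
                t = action(s)
                if t not in seen:
                    seen.add(t)
                    frontier.append(t)
    return True, None

print(check_invariant(1, lambda s: s >= 0))   # (True, None)
```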

    A Simple and Flexible Way of Computing Small Unsatisfiable Cores in SAT Modulo Theories

    Get PDF
    Finding small unsatisfiable cores for SAT problems has recently received a lot of interest, mostly for its applications in formal verification. However, propositional logic is often not expressive enough for representing many interesting verification problems, which can be more naturally addressed in the framework of Satisfiability Modulo Theories, SMT. Surprisingly, the problem of finding unsatisfiable cores in SMT has received very little attention in the literature; in particular, we are not aware of any work aiming at producing small unsatisfiable cores in SMT. In this paper we present a novel approach to this problem. The main idea is to combine an SMT solver with an external propositional core extractor: the SMT solver produces the theory lemmas found during the search; the core extractor is then called on the boolean abstraction of the original SMT problem and of the theory lemmas. This results in an unsatisfiable core for the original SMT problem, once the remaining theory lemmas have been removed. The approach is conceptually interesting, since the SMT solver is used to dynamically lift the suitable amount of theory information to the boolean level, and it also has several advantages in practice. In fact, it is extremely simple to implement and to update, and it can be interfaced with any propositional core extractor in a plug-and-play manner, so as to benefit for free from all unsat-core reduction techniques which have been or will be made available. We have evaluated our approach with an extensive empirical test on SMT-LIB benchmarks, which confirms the validity and potential of this approach.
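    What an SMT unsatisfiable core looks like can be shown with the z3-solver Python package. This uses Z3's built-in assumption tracking rather than the paper's solver-plus-external-extractor architecture, so it only illustrates the notion of a core, not the proposed method.

```python
from z3 import Ints, Solver, sat

# A small SMT problem over linear integer arithmetic with three
# named assertions, only two of which conflict.
x, y = Ints('x y')
s = Solver()
s.assert_and_track(x > 10, 'a1')
s.assert_and_track(x < 5, 'a2')    # conflicts with a1
s.assert_and_track(y == 0, 'a3')   # irrelevant to the conflict

assert s.check() != sat
# A small core keeps only the conflicting assertions: [a1, a2].
print(s.unsat_core())
```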

    Separation logic for high-level synthesis

    Get PDF
    High-level synthesis (HLS) promises a significant shortening of the digital hardware design cycle by raising the abstraction level of the design entry to high-level languages such as C/C++. However, applications using dynamic, pointer-based data structures remain difficult to implement well, yet such constructs are widely used in software. Automated optimisations that leverage the memory bandwidth of dedicated hardware implementations by distributing the application data over separate on-chip memories and parallelise the implementation are often ineffective in the presence of dynamic data structures, due to the lack of an automated analysis that disambiguates pointer-based memory accesses. This thesis takes a step towards closing this gap. We explore recent advances in separation logic, a rigorous mathematical framework that enables formal reasoning about the memory access of heap-manipulating programs. We develop a static analysis that automatically splits heap-allocated data structures into provably disjoint regions. Our algorithm focuses on dynamic data structures accessed in loops and is accompanied by automated source-to-source transformations which enable loop parallelisation and physical memory partitioning by off-the-shelf HLS tools. We then extend the scope of our technique to pointer-based memory-intensive implementations that require access to an off-chip memory. The extended HLS design aid generates parallel on-chip multi-cache architectures. It uses the disjointness property of memory accesses to support non-overlapping memory regions by private caches. It also identifies regions which are shared after parallelisation and which are supported by parallel caches with a coherency mechanism and synchronisation, resulting in automatically specialised memory systems. We show up to 15x acceleration from heap partitioning, parallelisation and the insertion of the custom cache system in demonstrably practical applications.
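    The payoff of proven disjointness can be illustrated in miniature. In the sketch below, plain Python stands in for the HLS setting and all names are illustrative: two heap-allocated linked lists that an analysis could prove never alias are traversed in parallel, which is exactly the licence the thesis's analysis grants for partitioning memories or assigning private caches.

```python
from concurrent.futures import ProcessPoolExecutor

# Two heap regions that provably never alias can be processed in
# parallel and, in hardware, backed by separate on-chip memories.
# Plain Python sketch; names and structure are illustrative.

def build_list(values):
    head = None
    for v in reversed(values):
        head = {'val': v, 'next': head}   # heap-allocated cells
    return head

def sum_list(head):
    total = 0
    while head is not None:
        total += head['val']
        head = head['next']
    return total

if __name__ == '__main__':
    region_a = build_list(range(100))        # disjoint from region_b
    region_b = build_list(range(100, 200))
    with ProcessPoolExecutor(max_workers=2) as pool:
        print(sum(pool.map(sum_list, [region_a, region_b])))  # 19900
```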

    SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training

    Full text link
    In an era where symbolic mathematical equations are indispensable for modeling complex natural phenomena, scientific inquiry often involves collecting observations and translating them into mathematical expressions. Recently, deep learning has emerged as a powerful tool for extracting insights from data. However, existing models typically specialize in either numeric or symbolic domains, and are usually trained in a supervised manner tailored to specific tasks. This approach neglects the substantial benefits that could arise from a task-agnostic, unified understanding of symbolic equations and their numeric counterparts. To bridge this gap, we introduce SNIP, a Symbolic-Numeric Integrated Pre-training method, which employs joint contrastive learning between the symbolic and numeric domains, enhancing their mutual similarities in the pre-trained embeddings. By performing latent space analysis, we observe that SNIP provides cross-domain insights into the representations, revealing that symbolic supervision enhances the embeddings of numeric data and vice versa. We evaluate SNIP across diverse tasks, including symbolic-to-numeric mathematical property prediction and numeric-to-symbolic equation discovery, commonly known as symbolic regression. Results show that SNIP effectively transfers to various tasks, consistently outperforming fully supervised baselines and competing strongly with established task-specific methods, especially in few-shot learning scenarios where available data is limited.
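    The joint contrastive objective can be sketched as a symmetric InfoNCE-style loss over paired embeddings; the exact formulation, temperature, and encoder outputs below are assumptions for illustration, not the paper's specification.

```python
import numpy as np

# Symmetric contrastive loss: the i-th symbolic and numeric
# embeddings form a positive pair; all mismatched pairs are
# negatives. Illustrative sketch of the SNIP-style objective.

def contrastive_loss(sym, num, temperature=0.1):
    sym = sym / np.linalg.norm(sym, axis=1, keepdims=True)
    num = num / np.linalg.norm(num, axis=1, keepdims=True)
    logits = sym @ num.T / temperature     # pairwise similarities
    targets = np.arange(len(sym))          # i-th pairs match

    def xent(l):
        # Cross-entropy with the matching pair as the target class.
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[targets, targets].mean()

    # Average both directions: symbolic->numeric and numeric->symbolic.
    return (xent(logits) + xent(logits.T)) / 2

rng = np.random.default_rng(0)
sym_emb = rng.normal(size=(8, 16))                # encoder stand-ins
num_emb = sym_emb + 0.01 * rng.normal(size=(8, 16))
print(contrastive_loss(sym_emb, num_emb))         # small: pairs align
```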

    Verification and synthesis of asynchronous control circuits using petri net unfoldings

    Get PDF
    PhD Thesis. Design of asynchronous control circuits has traditionally been associated with the application of formal methods. Event-based models, such as Petri nets, provide a compact and easy-to-understand way of specifying asynchronous behaviour. However, analysis of their behavioural properties is often hindered by the problem of exponential growth of the reachable state space. This work proposes a new method for the analysis of asynchronous circuit models based on Petri nets. The new approach is called the PN-unfolding segment. It extends and improves existing Petri net unfolding approaches. In addition, this thesis proposes a new analysis technique for Signal Transition Graphs along with an efficient verification technique which is also based on the Petri net unfolding. The former is called the Full State Graph, the latter the STG-unfolding segment. Boolean logic synthesis is an integral part of the asynchronous circuit design process. In many cases, even if the verification of an asynchronous circuit specification has been performed successfully, it is impossible to obtain its implementation using existing methods because they are based on reachability analysis. A new approach is proposed here for the automated synthesis of speed-independent circuits based on the STG-unfolding segment constructed during the verification of the circuit's specification. Finally, this work presents experimental results showing the need for the new Petri net unfolding techniques and confirming the advantages of applying the partial-order approach to the analysis, verification and synthesis of asynchronous circuits. Funding: The Research Committee, Newcastle University: Overseas Research Studentship Award.
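    For readers new to the model, the basic Petri net firing rule underlying all of this can be sketched in a few lines. Note that the marking-by-marking exploration shown here is precisely the state-explosion-prone baseline that unfolding-based analysis avoids; the net itself is a toy with illustrative names.

```python
# A Petri net as transitions over markings (place -> token count).
# Each transition is (consumes, produces); it is enabled when every
# consumed place holds enough tokens. Illustrative toy net.

transitions = {
    't1': ({'p1': 1}, {'p2': 1}),
    't2': ({'p2': 1}, {'p1': 1, 'p3': 1}),
}

def enabled(marking, consumes):
    return all(marking.get(p, 0) >= n for p, n in consumes.items())

def fire(marking, consumes, produces):
    m = dict(marking)
    for p, n in consumes.items():
        m[p] -= n
    for p, n in produces.items():
        m[p] = m.get(p, 0) + n
    return m

m = {'p1': 1}
for name, (c, p) in transitions.items():
    if enabled(m, c):
        m = fire(m, c, p)
        print(f'fired {name}: {m}')
# Enumerating every reachable marking this way blows up
# exponentially; unfoldings represent the behaviour partially ordered.
```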

    Exploring Spin-transfer-torque devices and memristors for logic and memory applications

    Get PDF
    As CMOS device scaling approaches its physical limits, researchers have begun exploring newer devices and architectures to replace CMOS. Due to their non-volatility and high density, Spin Transfer Torque (STT) devices are among the most prominent candidates for logic and memory applications. In this research, we first considered a new logic style called All Spin Logic (ASL). Despite its advantages, ASL consumes a large amount of static power; thus, several optimizations can be performed to address this issue. We developed a systematic methodology to perform these optimizations while ensuring stable operation of ASL. Second, we investigated the reliable design of STT-MRAM bit-cells and addressed the conflicting read and write requirements, which result in overdesign of the bit-cells. Further, a Device/Circuit/Architecture co-design framework was developed to optimize STT-MRAM devices by exploring the design space and jointly considering yield enhancement techniques at different levels of abstraction. Recent advancements in the development of memristive devices have opened new opportunities for hardware implementation of non-Boolean computing. To this end, the suitability of memristive devices for swarm intelligence algorithms has enabled researchers to solve a maze in hardware. In this research, we utilized the swarm intelligence of memristive networks to perform image edge detection. First, we proposed a hardware-friendly algorithm for image edge detection based on ant colony optimization. Next, we designed the image edge detection algorithm using memristive networks.
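    The ant-colony edge-detection idea can be sketched in software before any memristive mapping: ants favour pixels with a high intensity gradient and deposit pheromone there, so pheromone accumulates along edges. All parameters and update rules below are illustrative assumptions, not the thesis's algorithm.

```python
import numpy as np

# Ant-colony edge detection sketch: heuristic = gradient magnitude,
# ants random-walk biased by heuristic * pheromone, pheromone
# evaporates each sweep. Illustrative parameters throughout.

rng = np.random.default_rng(0)
img = np.zeros((32, 32)); img[:, 16:] = 1.0        # step edge at col 16
grad = np.abs(np.gradient(img)[1]) + 1e-6          # heuristic info
pheromone = np.full(img.shape, 0.1)

ants = rng.integers(0, 32, size=(64, 2))           # (row, col) per ant
for _ in range(50):
    for i, (r, c) in enumerate(ants):
        # Candidate moves: the in-bounds 8-neighbourhood.
        nbrs = [(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if (dr or dc) and 0 <= r + dr < 32 and 0 <= c + dc < 32]
        weights = np.array([grad[p] * pheromone[p] for p in nbrs])
        r, c = nbrs[rng.choice(len(nbrs), p=weights / weights.sum())]
        pheromone[r, c] += grad[r, c]              # deposit on move
        ants[i] = r, c
    pheromone *= 0.9                               # evaporation

edges = pheromone > pheromone.mean() + pheromone.std()
```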

    Obtaining Real-World Benchmark Programs from Open-Source Repositories Through Abstract-Semantics Preserving Transformations

    Get PDF
    Benchmark programs are an integral part of program analysis research. Researchers use benchmark programs to evaluate existing techniques and test the feasibility of new approaches. The larger and more realistic the set of benchmarks, the more confident a researcher can be about the correctness and reproducibility of their results. However, obtaining an adequate set of benchmark programs has been a long-standing challenge in the program analysis community. In this thesis, we present the APT tool, a framework we designed and implemented to automate the generation of realistic benchmark programs suitable for program analysis evaluations. Our tool targets intra-procedural analyses that operate on an integer domain, specifically symbolic execution. The framework is composed of three main stages. In the first stage, the tool extracts potential benchmark programs suitable for symbolic execution from open-source repositories. In the second stage, the tool transforms the extracted programs into compilable, stand-alone benchmarks by removing external dependencies and nonlinear expressions. In the third stage, the benchmarks are verified and made available to the user. We have designed our transformation algorithms to remove program dependencies and nonlinear expressions while preserving semantic equivalence under the abstraction of symbolic analysis. That is, we want the information the analysis computes on the original program and on its transformed version to be equivalent. Our work provides static analysis researchers with concise, compilable benchmark programs that are relevant to symbolic execution, allowing them to focus their efforts on advancing analysis techniques. Furthermore, our work benefits the software engineering community by enabling static analysis researchers to perform benchmarking with a large, realistic set of programs, thus strengthening the empirical evidence of the advancements in static program analysis.
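    The flavour of an abstract-semantics preserving transformation can be sketched briefly: for a symbolic-execution engine limited to linear arithmetic, replacing a nonlinear subexpression with a fresh unconstrained input is an equivalent over-approximation. The toy below uses Python's ast module on Python source purely for illustration; the APT tool itself operates on compilable benchmarks, and the names here are hypothetical.

```python
import ast

# Replace products of two variables (nonlinear for a linear solver)
# with fresh symbolic inputs, leaving linear structure intact.
# Illustrative sketch, not the APT tool's transformation.

class LinearizeProducts(ast.NodeTransformer):
    def __init__(self):
        self.fresh = 0

    def visit_BinOp(self, node):
        self.generic_visit(node)   # transform subexpressions first
        if (isinstance(node.op, ast.Mult)
                and isinstance(node.left, ast.Name)
                and isinstance(node.right, ast.Name)):
            self.fresh += 1
            return ast.Name(id=f'sym_input_{self.fresh}', ctx=ast.Load())
        return node

src = 'r = x * y + 2 * 3'
tree = LinearizeProducts().visit(ast.parse(src))
print(ast.unparse(ast.fix_missing_locations(tree)))
# -> r = sym_input_1 + 2 * 3   (var*var replaced; constants kept)
```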

    StaticFixer: From Static Analysis to Static Repair

    Full text link
    Static analysis tools are traditionally used to detect and flag programs that violate properties. We show that static analysis tools can also be used to perturb programs that satisfy a property to construct variants that violate the property. Using this insight we can construct paired data sets of unsafe-safe program pairs, and learn strategies to automatically repair property violations. We present a system called StaticFixer, which automatically repairs information flow vulnerabilities using this approach. Since information flow properties are non-local (both to check and repair), StaticFixer also introduces a novel domain specific language (DSL) and strategy learning algorithms for synthesizing non-local repairs. We use StaticFixer to synthesize strategies for repairing two types of information flow vulnerabilities, unvalidated dynamic calls and cross-site scripting, and show that StaticFixer successfully repairs several hundred vulnerabilities from open source JavaScript repositories, outperforming neural baselines built using CodeT5 and Codex. Our datasets can be downloaded from http://aka.ms/StaticFixer
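    A toy version of a template-style repair for a cross-site-scripting sink conveys the flavour: route the tainted value through a sanitizer before it reaches innerHTML. StaticFixer learns such strategies in a DSL and handles non-local flows; the single local regex rewrite below, including the hypothetical sanitize helper, is only an illustrative stand-in.

```python
import re

# Rewrite `X.innerHTML = EXPR;` to `X.innerHTML = sanitize(EXPR);`
# unless the value is already sanitized. `sanitize` is a hypothetical
# escaping helper assumed to exist in the target codebase.
UNSAFE = re.compile(r'\.innerHTML\s*=\s*(?!sanitize\()([^;]+);')

def repair(js_source):
    return UNSAFE.sub(
        lambda m: f'.innerHTML = sanitize({m.group(1).strip()});',
        js_source)

snippet = 'node.innerHTML = params.get("q");'
print(repair(snippet))
# -> node.innerHTML = sanitize(params.get("q"));
```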