45,529 research outputs found

    High-Performance Architecture for Binary-Tree-Based Finite State Machines

    Get PDF
    A binary-tree-based finite state machine (BT-FSM) is a state machine with a 1-bit input signal whose state transition graph is a binary tree. BT-FSMs are useful in those application areas where searching in a binary tree is required, such as computer networks, compression, automatic control, or cryptography. This paper presents a new architecture for implementing BT-FSMs which is based on the model finite virtual state machine (FVSM). The proposed architecture has been compared with the general FVSM and conventional approaches by using both synthetic test benches and very large BT-FSMs obtained from a real application. In synthetic test benches, the average speed improvement of the proposed architecture respect to the best results of the other approaches achieves 41% (there are some cases in which the speed is more than double). In the case of the real application, the average speed improvement achieves 155%

    Using Graph Properties to Speed-up GPU-based Graph Traversal: A Model-driven Approach

    Get PDF
    While it is well-known and acknowledged that the performance of graph algorithms is heavily dependent on the input data, there has been surprisingly little research to quantify and predict the impact the graph structure has on performance. Parallel graph algorithms, running on many-core systems such as GPUs, are no exception: most research has focused on how to efficiently implement and tune different graph operations on a specific GPU. However, the performance impact of the input graph has only been taken into account indirectly as a result of the graphs used to benchmark the system. In this work, we present a case study investigating how to use the properties of the input graph to improve the performance of the breadth-first search (BFS) graph traversal. To do so, we first study the performance variation of 15 different BFS implementations across 248 graphs. Using this performance data, we show that significant speed-up can be achieved by combining the best implementation for each level of the traversal. To make use of this data-dependent optimization, we must correctly predict the relative performance of algorithms per graph level, and enable dynamic switching to the optimal algorithm for each level at runtime. We use the collected performance data to train a binary decision tree, to enable high-accuracy predictions and fast switching. We demonstrate empirically that our decision tree is both fast enough to allow dynamic switching between implementations, without noticeable overhead, and accurate enough in its prediction to enable significant BFS speedup. We conclude that our model-driven approach (1) enables BFS to outperform state of the art GPU algorithms, and (2) can be adapted for other BFS variants, other algorithms, or more specific datasets

    Hardware-based Security for Virtual Trusted Platform Modules

    Full text link
    Virtual Trusted Platform modules (TPMs) were proposed as a software-based alternative to the hardware-based TPMs to allow the use of their cryptographic functionalities in scenarios where multiple TPMs are required in a single platform, such as in virtualized environments. However, virtualizing TPMs, especially virutalizing the Platform Configuration Registers (PCRs), strikes against one of the core principles of Trusted Computing, namely the need for a hardware-based root of trust. In this paper we show how strength of hardware-based security can be gained in virtual PCRs by binding them to their corresponding hardware PCRs. We propose two approaches for such a binding. For this purpose, the first variant uses binary hash trees, whereas the other variant uses incremental hashing. In addition, we present an FPGA-based implementation of both variants and evaluate their performance

    CloudTree: A Library to Extend Cloud Services for Trees

    Full text link
    In this work, we propose a library that enables on a cloud the creation and management of tree data structures from a cloud client. As a proof of concept, we implement a new cloud service CloudTree. With CloudTree, users are able to organize big data into tree data structures of their choice that are physically stored in a cloud. We use caching, prefetching, and aggregation techniques in the design and implementation of CloudTree to enhance performance. We have implemented the services of Binary Search Trees (BST) and Prefix Trees as current members in CloudTree and have benchmarked their performance using the Amazon Cloud. The idea and techniques in the design and implementation of a BST and prefix tree is generic and thus can also be used for other types of trees such as B-tree, and other link-based data structures such as linked lists and graphs. Preliminary experimental results show that CloudTree is useful and efficient for various big data applications

    HERO: Heterogeneous Embedded Research Platform for Exploring RISC-V Manycore Accelerators on FPGA

    Full text link
    Heterogeneous embedded systems on chip (HESoCs) co-integrate a standard host processor with programmable manycore accelerators (PMCAs) to combine general-purpose computing with domain-specific, efficient processing capabilities. While leading companies successfully advance their HESoC products, research lags behind due to the challenges of building a prototyping platform that unites an industry-standard host processor with an open research PMCA architecture. In this work we introduce HERO, an FPGA-based research platform that combines a PMCA composed of clusters of RISC-V cores, implemented as soft cores on an FPGA fabric, with a hard ARM Cortex-A multicore host processor. The PMCA architecture mapped on the FPGA is silicon-proven, scalable, configurable, and fully modifiable. HERO includes a complete software stack that consists of a heterogeneous cross-compilation toolchain with support for OpenMP accelerator programming, a Linux driver, and runtime libraries for both host and PMCA. HERO is designed to facilitate rapid exploration on all software and hardware layers: run-time behavior can be accurately analyzed by tracing events, and modifications can be validated through fully automated hard ware and software builds and executed tests. We demonstrate the usefulness of HERO by means of case studies from our research

    Identifying Native Applications with High Assurance

    Get PDF
    The work described in this paper investigates the problem of identifying and deterring stealthy malicious processes on a host. We point out the lack of strong application iden- tication in main stream operating systems. We solve the application identication problem by proposing a novel iden- tication model in which user-level applications are required to present identication proofs at run time to be authenti- cated by the kernel using an embedded secret key. The se- cret key of an application is registered with a trusted kernel using a key registrar and is used to uniquely authenticate and authorize the application. We present a protocol for secure authentication of applications. Additionally, we de- velop a system call monitoring architecture that uses our model to verify the identity of applications when making critical system calls. Our system call monitoring can be integrated with existing policy specication frameworks to enforce application-level access rights. We implement and evaluate a prototype of our monitoring architecture in Linux as device drivers with nearly no modication of the ker- nel. The results from our extensive performance evaluation shows that our prototype incurs low overhead, indicating the feasibility of our model
    corecore