    On Designing Multicore-aware Simulators for Biological Systems

    The stochastic simulation of biological systems is an increasingly popular technique in bioinformatics. It is often enlightening, but it can be computationally expensive. We discuss the main opportunities to speed it up on multi-core platforms, which pose new challenges for parallelisation techniques. These opportunities are developed in two general families of solutions, involving both the single simulation and a bulk of independent simulations (either replicas or derived from a parameter sweep). The proposed solutions are tested on the parallelisation of the CWC (Calculus of Wrapped Compartments) simulator, carried out by way of the FastFlow programming framework, which makes possible fast development and efficient execution on multi-cores. Comment: 19 pages + cover page
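
The "bulk of independent simulations" idea can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: a generic Gillespie-style birth-death model stands in for a real biological system, the names `gillespie_birth_death` and `run_replicas` are hypothetical, and none of it reflects the CWC simulator or FastFlow. Each replica gets its own seed, so replicas share no state and can be mapped onto cores independently.

```python
import random
from functools import partial


def gillespie_birth_death(seed, k_birth=10.0, k_death=0.1, x0=0, t_end=50.0):
    """One stochastic trajectory of a birth-death process.

    Births occur at constant rate k_birth, deaths at rate k_death * x.
    Returns the population at time t_end. (Illustrative model only.)
    """
    rng = random.Random(seed)  # private generator: replicas stay independent
    t, x = 0.0, x0
    while True:
        a_birth = k_birth
        a_death = k_death * x
        a_total = a_birth + a_death
        if a_total == 0.0:
            return x  # no reaction can fire any more
        t += rng.expovariate(a_total)  # waiting time to the next reaction
        if t > t_end:
            return x
        # Pick the reaction with probability proportional to its rate.
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1


def run_replicas(seeds, map_fn=map, **params):
    """Run one simulation per seed; map_fn decides how they are scheduled."""
    return list(map_fn(partial(gillespie_birth_death, **params), seeds))


if __name__ == "__main__":
    finals = run_replicas(range(20))
    print(sum(finals) / len(finals))
```

Because `run_replicas` only needs a map-like callable, a parallel map (for example the `map` method of a `concurrent.futures.ProcessPoolExecutor`) could be passed as `map_fn` without touching the simulation code. A parameter sweep follows the same pattern, with parameter values rather than seeds being mapped.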

    Dynamic Vision Sensor integration on FPGA-based CNN accelerators for high-speed visual classification

    Deep learning is a cutting-edge theory that is being applied to many fields. For vision applications, Convolutional Neural Networks (CNNs) achieve significant accuracy in classification tasks. Numerous hardware accelerators have appeared in recent years to improve on CPU- or GPU-based solutions. This technology is commonly prototyped and tested on FPGAs before being considered for ASIC fabrication for mass production. The use of typical commercial cameras (30 fps) limits the capabilities of these systems for high-speed applications. Dynamic vision sensors (DVS), which emulate the behavior of a biological retina, are becoming increasingly important for these applications because of their nature: the information is represented by a continuous stream of spikes, and the frames to be processed by the CNN are constructed by collecting a fixed number of these spikes (called events). The faster an object moves, the more events the DVS produces, and so the higher the equivalent frame rate. Using a DVS therefore allows a frame to be computed at the maximum speed a CNN accelerator can offer. In this paper we present a VHDL/HLS description of a pipelined FPGA design able to collect events from an Address-Event-Representation (AER) DVS retina and obtain a normalized histogram to be used by a particular CNN accelerator, called NullHop. VHDL is used to describe the circuit, and HLS for the computation blocks, which perform the normalization of a frame needed by the CNN. Results outperform previous implementations of frame collection and normalization using ARM processors running at 800 MHz on a Zynq7100 in both latency and power consumption. A measured 67% speedup factor is presented for a Roshambo CNN real-time experiment running at 160 fps peak rate. Comment: 7 pages
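
The frame-construction idea described above (collect a fixed number of events into a per-pixel histogram, then normalize) can be sketched in software as follows. This is a plain Python illustration under stated assumptions, not the VHDL/HLS design: events are assumed to be simple `(x, y)` pixel addresses, the function names are hypothetical, and dividing by the peak count is an assumed normalization, since the abstract does not specify the one used for NullHop.

```python
def normalize(frame):
    """Scale per-pixel event counts into [0, 1] by the peak count (assumed scheme)."""
    peak = max(max(row) for row in frame)
    if peak == 0:
        return [[0.0] * len(row) for row in frame]
    return [[count / peak for count in row] for row in frame]


def frames_from_events(events, width, height, events_per_frame):
    """Yield one normalized histogram frame per events_per_frame events.

    events: iterable of (x, y) pixel addresses, as in an AER stream.
    A trailing group with fewer than events_per_frame events is dropped.
    """
    frame = [[0] * width for _ in range(height)]
    collected = 0
    for x, y in events:
        frame[y][x] += 1
        collected += 1
        if collected == events_per_frame:
            yield normalize(frame)
            frame = [[0] * width for _ in range(height)]
            collected = 0


def equivalent_frame_rate(event_rate_hz, events_per_frame):
    """Frames per second implied by a given event rate: more events, more frames."""
    return event_rate_hz / events_per_frame
```

Since a frame is emitted every `events_per_frame` events rather than every fixed time interval, the frame rate follows the event rate, which is the relationship between object speed and equivalent frame rate that the abstract describes.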

    Tree-structured small-world connected wireless network-on-chip with adaptive routing

    Traditional Network-on-Chip (NoC) systems comprising many cores suffer from debilitating latency bottlenecks and significant power dissipation due to the overhead inherent in multi-hop communication. In addition, these systems remain vulnerable to malicious circuitry incorporated into the design by untrustworthy vendors in a world where complex multi-stage design and manufacturing processes require the collective specialized services of a variety of contractors. This thesis proposes a novel small-world tree-based network-on-chip (SWTNoC) structure designed for high throughput, acceptable energy consumption, and resiliency to attacks and node failures resulting from the insertion of hardware Trojans. This tree-based implementation was devised as a means of reducing average network hop count, providing a large degree of local connectivity, and providing effective long-range connectivity by means of a novel wireless link approach based on carbon nanotube (CNT) antenna design. Network resiliency is achieved by means of an adaptive routing algorithm implemented to work with TRAIN (Tree-based Routing Architecture for Irregular Networks). Comparisons are drawn with benchmark architectures whose wireless link placement is optimized by means of the simulated annealing (SA) metaheuristic. Experimental results demonstrate a 21% throughput improvement and a 23% reduction in dissipated energy per packet over the closest competing architecture. Similar trends are observed at increasing system sizes. In addition, the SWTNoC maintains this throughput and energy advantage in the presence of a fault introduced into the system. By designing a hierarchical topology and assigning a higher level of importance to a subset of the nodes, much higher network throughput can be attained while simultaneously guaranteeing deadlock freedom as well as a high degree of resiliency and fault tolerance.

    Bounding inconsistency using a novel threshold metric for dead reckoning update packet generation

    Human-to-human interaction across distributed applications requires that sufficient consistency be maintained among participants in the face of network characteristics such as latency and limited bandwidth. The level of inconsistency arising from the network is proportional to the network delay, and thus a function of bandwidth consumption. Distributed simulation has often used a bandwidth reduction technique known as dead reckoning that combines approximation and estimation in the communication of entity movement to reduce network traffic, and thus improve consistency. However, unless carefully tuned to application and network characteristics, such an approach can introduce more inconsistency than it avoids. The key tuning metric is the distance threshold. This paper questions the suitability of the standard distance threshold as a metric for use in the dead reckoning scheme. Using a model relating entity path curvature and inconsistency, a major performance-related limitation of the distance threshold technique is highlighted. We then propose an alternative time-space threshold criterion. The time-space threshold is demonstrated, through simulation, to perform better for low curvature movement. However, it too has a limitation. Based on this, we further propose a novel hybrid scheme. Through simulation and live trials, this scheme is shown to perform well across a range of curvature values, and places bounds on both the spatial and absolute inconsistency arising from dead reckoning.
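
The threshold criteria discussed above can be sketched as a small Python simulation. The abstract does not define the time-space threshold or the hybrid scheme, so this sketch makes assumptions that may differ from the paper: the time-space criterion is taken to be the spatial error accumulated over time since the last update, the hybrid rule sends an update when either threshold is exceeded, and the remote model is a first-order (position plus velocity) extrapolation. The function name and parameters are hypothetical.

```python
import math


def dead_reckoning_updates(path, dt, distance_threshold=None,
                           time_space_threshold=None):
    """Return the sample indices at which an update packet would be sent.

    path: list of (x, y) true positions sampled every dt seconds.
    distance_threshold: send when the instantaneous error exceeds this value.
    time_space_threshold: send when error accumulated over time exceeds this
        value (an assumed reading of the time-space criterion).
    Supplying both gives the assumed hybrid rule (either one triggers).
    """
    updates = [0]                      # the initial state is always sent
    last_pos, last_vel, last_idx = path[0], (0.0, 0.0), 0
    accumulated = 0.0
    for i in range(1, len(path)):
        elapsed = (i - last_idx) * dt
        # Remote side extrapolates from the last update it received.
        predicted = (last_pos[0] + last_vel[0] * elapsed,
                     last_pos[1] + last_vel[1] * elapsed)
        error = math.dist(path[i], predicted)
        accumulated += error * dt
        over_distance = (distance_threshold is not None
                         and error > distance_threshold)
        over_time_space = (time_space_threshold is not None
                           and accumulated > time_space_threshold)
        if over_distance or over_time_space:
            updates.append(i)
            last_pos = path[i]
            last_vel = ((path[i][0] - path[i - 1][0]) / dt,
                        (path[i][1] - path[i - 1][1]) / dt)
            last_idx = i
            accumulated = 0.0
    return updates
```

In this sketch, an entity that settles at a small constant offset below the distance threshold never triggers a distance-based update, so its error persists indefinitely, whereas the accumulated criterion eventually fires. That is one way to see why bounding only the spatial error does not bound the inconsistency over time.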

    An FPGA implementation of a sleep enabled PON system

    Owing to the growing demand for bandwidth-hungry video-on-demand applications, the Passive Optical Network (PON) has been widely considered one of the most promising solutions for broadband access. Environmental concerns have motivated network designers to lower the energy consumption of optical access networks. A well-known approach to reducing energy consumption is to allow network elements to switch to sleep mode. In this framework, an improved Optical Network Unit (ONU) architecture in TDM-PON is proposed to reduce the handover time of status switching. The energy-saving performance of the current and improved architectures is compared in different scenarios. The simulation results show that, by applying a proper sleep mode mechanism, the improved architecture can effectively reduce the ONU energy consumption. We further implement the cycle sleep scheme on a multi-ONU testbed based on the improved ONU architecture. The experimental results have substantiated the viability of the improved ONU architecture.

    Simulation of Mixed Critical In-vehicular Networks

    Future automotive applications, ranging from advanced driver assistance to autonomous driving, will greatly increase demands on in-vehicular networks. Data flows with high-bandwidth or low-latency requirements, and in particular many additional communication relations, will introduce a new level of complexity to the in-car communication system. It is expected that future communication backbones, which interconnect sensors and actuators with ECUs in cars, will be built on Ethernet technologies. However, signalling from different application domains demands network services with tailored attributes, including real-time transmission protocols as defined in the TSN Ethernet extensions. These QoS constraints will increase network complexity even further. Event-based simulation is a key technology for mastering the challenges of in-car network design. This chapter introduces the domain-specific aspects and simulation models for in-vehicular networks and presents an overview of the car-centric network design process. Starting from a domain-specific description language, we cover the corresponding simulation models with their workflows and apply our approach to a related case study for an in-car network of a premium car.