2,907 research outputs found
On Designing Multicore-aware Simulators for Biological Systems
The stochastic simulation of biological systems is an increasingly popular
technique in bioinformatics. It often is an enlightening technique, which may
however result in being computational expensive. We discuss the main
opportunities to speed it up on multi-core platforms, which pose new challenges
for parallelisation techniques. These opportunities are developed in two
general families of solutions involving both the single simulation and a bulk
of independent simulations (either replicas of derived from parameter sweep).
Proposed solutions are tested on the parallelisation of the CWC simulator
(Calculus of Wrapped Compartments) that is carried out according to proposed
solutions by way of the FastFlow programming framework making possible fast
development and efficient execution on multi-cores.Comment: 19 pages + cover pag
Dynamic Vision Sensor integration on FPGA-based CNN accelerators for high-speed visual classification
Deep-learning is a cutting edge theory that is being applied to many fields.
For vision applications the Convolutional Neural Networks (CNN) are demanding
significant accuracy for classification tasks. Numerous hardware accelerators
have populated during the last years to improve CPU or GPU based solutions.
This technology is commonly prototyped and tested over FPGAs before being
considered for ASIC fabrication for mass production. The use of commercial
typical cameras (30fps) limits the capabilities of these systems for high speed
applications. The use of dynamic vision sensors (DVS) that emulate the behavior
of a biological retina is taking an incremental importance to improve this
applications due to its nature, where the information is represented by a
continuous stream of spikes and the frames to be processed by the CNN are
constructed collecting a fixed number of these spikes (called events). The
faster an object is, the more events are produced by DVS, so the higher is the
equivalent frame rate. Therefore, these DVS utilization allows to compute a
frame at the maximum speed a CNN accelerator can offer. In this paper we
present a VHDL/HLS description of a pipelined design for FPGA able to collect
events from an Address-Event-Representation (AER) DVS retina to obtain a
normalized histogram to be used by a particular CNN accelerator, called
NullHop. VHDL is used to describe the circuit, and HLS for computation blocks,
which are used to perform the normalization of a frame needed for the CNN.
Results outperform previous implementations of frames collection and
normalization using ARM processors running at 800MHz on a Zynq7100 in both
latency and power consumption. A measured 67% speedup factor is presented for a
Roshambo CNN real-time experiment running at 160fps peak rate.Comment: 7 page
Tree-structured small-world connected wireless network-on-chip with adaptive routing
Traditional Network-on-Chip (NoC) systems comprised of many cores suffer from debilitating bottlenecks of latency and significant power dissipation due to the overhead inherent in multi-hop communication. In addition, these systems remain vulnerable to malicious circuitry incorporated into the design by untrustworthy vendors in a world where complex multi-stage design and manufacturing processes require the collective specialized services of a variety of contractors. This thesis proposes a novel small-world tree-based network-on-chip (SWTNoC) structure designed for high throughput, acceptable energy consumption, and resiliency to attacks and node failures resulting from the insertion of hardware Trojans. This tree-based implementation was devised as a means of reducing average network hop count, providing a large degree of local connectivity, and effective long-range connectivity by means of a novel wireless link approach based on carbon nanotube (CNT) antenna design. Network resiliency is achieved by means of a devised adaptive routing algorithm implemented to work with TRAIN (Tree-based Routing Architecture for Irregular Networks). Comparisons are drawn with benchmark architectures with optimized wireless link placement by means of the simulated annealing (SA) metaheuristic. Experimental results demonstrate a 21% throughput improvement and a 23% reduction in dissipated energy per packet over the closest competing architecture. Similar trends are observed at increasing system sizes. In addition, the SWTNoC maintains this throughput and energy advantage in the presence of a fault introduced into the system. By designing a hierarchical topology and designating a higher level of importance on a subset of the nodes, much higher network throughput can be attained while simultaneously guaranteeing deadlock freedom as well as a high degree of resiliency and fault-tolerance
Bounding inconsistency using a novel threshold metric for dead reckoning update packet generation
Human-to-human interaction across distributed applications requires that sufficient consistency be maintained among participants in the face of network characteristics such as latency and limited bandwidth. The level of inconsistency arising from the network is proportional to the network delay, and thus a function of bandwidth consumption. Distributed simulation has often used a bandwidth reduction technique known as dead reckoning that combines approximation and estimation in the communication of entity movement to reduce network traffic, and thus improve consistency. However, unless carefully tuned to application and network characteristics, such an approach can introduce more inconsistency than it avoids. The key tuning metric is the distance threshold. This paper questions the suitability of the standard distance threshold as a metric for use in the dead reckoning scheme. Using a model relating entity path curvature and inconsistency, a major performance related limitation of the distance threshold technique is highlighted. We then propose an alternative time—space threshold criterion. The time—space threshold is demonstrated, through simulation, to perform better for low curvature movement. However, it too has a limitation. Based on this, we further propose a novel hybrid scheme. Through simulation and live trials, this scheme is shown to perform well across a range of curvature values, and places bounds on both the spatial and absolute inconsistency arising from dead reckoning
An FPGA implementation of a sleep enabled PON system
Owing to the growing demand for bandwidth-hungry video-on-demand applications, Passive Optical Network (PON) has been widely considered as one of the most promising solutions for broadband access. Environmental concerns motivated network designers to lower energy consumption of optical access networks. A well-known approach to reduce energy consumption is to allow network elements to switch to the sleep mode.
In this framework, an improved Optical network Unit (ONU) architecture in TDM-PON is proposed to reduce the handover time of status switching. Energy-saving performances of current and improved architectures are compared in different scenarios. The simulation results show that by applying a proper sleep mode mechanism, the improved architecture can effectively reduce the ONU energy consumption. We further implement the cycle sleep scheme on a multi-ONU testbed based on the improved ONU architecture. The experimental results have substantiated the viability of the improved ONU architecture
Simulation of Mixed Critical In-vehicular Networks
Future automotive applications ranging from advanced driver assistance to
autonomous driving will largely increase demands on in-vehicular networks. Data
flows of high bandwidth or low latency requirements, but in particular many
additional communication relations will introduce a new level of complexity to
the in-car communication system. It is expected that future communication
backbones which interconnect sensors and actuators with ECU in cars will be
built on Ethernet technologies. However, signalling from different application
domains demands for network services of tailored attributes, including
real-time transmission protocols as defined in the TSN Ethernet extensions.
These QoS constraints will increase network complexity even further.
Event-based simulation is a key technology to master the challenges of an
in-car network design. This chapter introduces the domain-specific aspects and
simulation models for in-vehicular networks and presents an overview of the
car-centric network design process. Starting from a domain specific description
language, we cover the corresponding simulation models with their workflows and
apply our approach to a related case study for an in-car network of a premium
car
- …