127 research outputs found
Real-Time Big Data: the JUNIPER Approach
REACTION 2014. 3rd International Workshop on Real-time and Distributed Computing in Emerging Applications. Rome, Italy. December 2nd, 2014.Cloud computing offers the possibility for Cyber-Physical Systems (CPS) to offload computation and utilise large stored data sets in order to increase the overall system utility. However, for cloud platforms and applications to be effective for CPS, they need to exhibit real-time behaviour so that some level of performance can be guaranteed to the CPS. This paper considers the infrastructure developed by the EU JUNIPER project for enabling real-time big data systems to be built so that appropriate guarantees can be given to the CPS components. The technologies developed include a real-time Java programming approach, hardware acceleration to provide performance, and operating system resource manage-ment (time and disk) based upon resource reservation in order to enhance timeliness.This work is partially funded by the European Union’s Seventh Framework Programme under grant agreement FP7-ICT-611731Publicad
Improving the predictability of distributed stream processors
Next generation real-time applications demand big-data infrastructures to process huge and continuous data volumes under complex computational constraints. This type of application raises new issues on current big-data processing infrastructures. The first issue to be considered is that most of current infrastructures for big-data processing were defined for general purpose applications. Thus, they set aside real-time performance, which is in some cases an implicit requirement. A second important limitation is the lack of clear computational models that could be supported by current big-data frameworks. In an effort to reduce this gap, this article contributes along several lines. First, it provides a set of improvements to a computational model called distributed stream processing in order to formalize it as a real-time infrastructure. Second, it proposes some extensions to Storm, one of the most popular stream processors. These extensions are designed to gain an extra control over the resources used by the application in order to improve its predictability. Lastly, the article presents some empirical evidences on the performance that can be expected from this type of infrastructure.This work has been partially supported by HERMES (Healthy and Efficient Routes in Massive open-data basEd Smart cities). It has been also partially financed by Distributed Java Infrastructure for Real-Time Big Data (CAS14/00118). It has been also partially funded by eMadrid (S2013/ICE-2715) and by European Union’s 7th Framework Programme ​under Grant Agreement FP7-IC6-318763
Recommended from our members
Addressing resource contention and timing predictability for multi-core architectures with shared memory interconnects
Multi-core architectures are increasingly being used in real-time embedded systems. In general, such systems have more processors than the shared memory modules, potentially causing severe interference over memory accesses. This resource contention could lead to substantial variation on memory access latencies, and thus wide fluctuation in the overall system performance, which is highly undesirable especially for the time-critical applications. In this paper, we address resource contention and timing predictability for multi-core architectures with distributed memory interconnects. We focus on the locally arbitrated interconnect constructed by pipelined multiplexing stages with local arbitration, while the globally arbitrated interconnect employing global scheduling to the same architecture potentially suffers synchronisation issue and requires strict coordination. Our contributions are mainly threefold: (i) We analyse the resource contention across the memory access data path, and report the accurate calculational method to bound the worst-case behaviour. (ii) We compare the average-case behaviour of the locally arbitrated and the globally arbitrated architectures with experiments, demonstrating varying memory latencies caused by the resource sharing issue. (iii) We propose an architectural modification to smooth resource sharing. Evaluations on simulators and FPGA implementations with synthetic memory workload show that the latency variation is significantly reduced, contributing towards timing predictability of multi-core systems
Recommended from our members
A distributed stream library for Java 8
Java 8 has introduced new capabilities such as lambda expressions and streams which simplify data-parallel computing. However, as a base language for Big Data systems, it still lacks a number of important capabilities such as processing very large datasets and distributing the computation over multiple machines. This paper gives an overview of the Java 8 Streams API and proposes extensions to allow its use in Big Data systems. It also shows how the API can be used to implement a range of standard Big Data paradigms. Finally, it compares performance with that of Hadoop and Spark. Despite being a proof-of-concept implementation, results indicate that it is a lightweight and efficient framework, comparable in performance to Hadoop and Spark, and is up to 5 times faster for the largest input sizes tested
Recommended from our members
Dynamic and static task allocation for hard real-time video stream decoding on NoCs
Hard real-time (HRT) video systems require admission control decisions that rely on two factors. Firstly, schedulability analysis of the data-dependent, communicating tasks within the application need to be carried out in order to guarantee timing and predictability. Secondly, the allocation of the tasks to multi-core processing elements would generate different results in the schedulability analysis. Due to the conservative nature of the state-of-the-art schedulability analysis of tasks and message lows, and the unpredictability in the application, the system resources are often under-utilised. In this paper we propose two blocking-aware dynamic task allocation techniques that exploit application and platform characteristics, in order to increase the number of simultaneous, fully schedulable, video streams handled by the system. A novel, worst-case response time aware, search-based, static hard real-time task mapper is introduced to act as an upper-baseline to the proposed techniques. Further evaluations are carried out against existing heuristic-based dynamic mappers. Improvements to the admission rates and the system utilisation under a range of different workloads and platform sizes are explored
Schedulability-Driven Frame Packing for Multi-Cluster Distributed Embedded Systems
We present an approach to frame packing for multi-cluster distributed embedded systems consisting of time-triggered and event-triggered clusters, interconnected via gateways. In our approach, the application messages are packed into frames such that the application is schedulable. Thus, we have also proposed a schedulability analysis for applications consisting of mixed event-triggered and time-triggered processes and messages, and a worst case queuing delay analysis for the gateways, responsible for routing inter-cluster traffic. Optimization heuristics for frame packing aiming at producing a schedulable system have been proposed. Extensive experiments and a real-life example show the efficiency of our frame-packing approach
Recommended from our members
A Globally Arbitrated Memory Tree for Mixed-Time-Criticality Systems
Embedded systems are increasingly based on multi-core platforms to accommodate a growing number of applications, some of which have real-time requirements. Resources, such as off-chip DRAM, are typically shared between the applications using memory interconnects with different arbitration polices to cater to diverse bandwidth and latency requirements. However, traditional centralized interconnects are not scalable as the number of clients increase. Similarly, current distributed interconnects either cannot satisfy the diverse requirements or have decoupled arbitration stages, resulting in larger area, power and worst-case latency. The four main contributions of this article are: 1) a Globally Arbitrated Memory Tree (GAMT) with a distributed architecture that scales well with the number of cores, 2) an RTL-level implementation that can be configured with five arbitration policies (three distinct and two as special cases), 3) the concept of mixed arbitration policies that allows the policy to be selected individually per core, and 4) a worst-case analysis for a mixed arbitration policy that combines TDM and FBSP arbitration.We compare the performance of GAMT with centralized implementations and show that it can run up to four times faster and have over 51 and 37 percent reduction in area and power consumption, respectively, for a given bandwidth
Ada and cc-NUMA Architectures What can be achieved with Ada 2005?
Abstract Real-time systems are finding it difficult to make the shift from single processor systems to multiprocessors because of the lack of support from programming platforms for multiprocessors. Although, Ada provides some support for SMPs, it's goal is to hide the complexity of the architectures so that the programmers are not distracted by low-level architectural issues. This paper argues that programmer should be given enough visibility to use the underlying architecture predictably and efficiently. We focus on the issue of memory management and memory accesses on a cc-NUMA architecture. A cc-NUMA architecture is chosen, as we believe it to be more scalable than SMP systems
Peptidergic control in a fruit crop pest: The spotted-wing drosophila, Drosophila suzukii
Neuropeptides play an important role in the regulation of feeding in insects and offer potential targets for the development of new chemicals to control insect pests. A pest that has attracted much recent attention is the highly invasive Drosophila suzukii, a polyphagous pest that can cause serious economic damage to soft fruits. Previously we showed by mass spectrometry the presence of the neuropeptide myosuppressin (TDVDHVFLRFamide) in the nerve bundle suggesting that this peptide is involved in regulating the function of the crop, which in adult dipteran insects has important roles in the processing of food, the storage of carbohydrates and the movement of food into the midgut for digestion. In the present study antibodies that recognise the C-terminal RFamide epitope of myosuppressin stain axons in the crop nerve bundle and reveal peptidergic fibres covering the surface of the crop. We also show using an in vitro bioassay that the neuropeptide is a potent inhibitor (EC50 of 2.3 nM) of crop contractions and that this inhibition is mimicked by the non-peptide myosuppressin agonist, benzethonium chloride (Bztc). Myosuppressin also inhibited the peristaltic contractions of the adult midgut, but was a much weaker agonist (EC50 = 5.7 ÎĽM). The oral administration of Bztc (5 mM) in a sucrose diet to adult female D. suzukii over 4 hours resulted in less feeding and longer exposure to dietary Bztc led to early mortality. We therefore suggest that myosuppressin and its cognate receptors are potential targets for disrupting feeding behaviour of adult D. suzukii
An extensible framework for multicore response time analysis
In this paper, we introduce a multicore response time analysis (MRTA) framework, which decouples response time analysis from a reliance on context independent WCET values. Instead, the analysis formulates response times directly from the demands placed on different hardware resources. The MRTA framework is extensible to different multicore architectures, with a variety of arbitration policies for the common interconnects, and different types and arrangements of local memory. We instantiate the framework for single level local data and instruction memories (cache or scratchpads), for a variety of memory bus arbitration policies, including: Round-Robin, FIFO, Fixed-Priority, Processor-Priority, and TDMA, and account for DRAM refreshes. The MRTA framework provides a general approach to timing verification for multicore systems that is parametric in the hardware configuration and so can be used at the architectural design stage to compare the guaranteed levels of real-time performance that can be obtained with different hardware configurations. We use the framework in this way to evaluate the performance of multicore systems with a variety of different architectural components and policies. These results are then used to compose a predictable architecture, which is compared against a reference architecture designed for good average-case behaviour. This comparison shows that the predictable architecture has substantially better guaranteed real-time performance, with the precision of the analysis verified using cycle-accurate simulation
- …