80,793 research outputs found

    Run Time Approximation of Non-blocking Service Rates for Streaming Systems

    Full text link
    Stream processing is a compute paradigm that promises safe and efficient parallelism. Modern big-data problems are often well suited for stream processing's throughput-oriented nature. Realization of efficient stream processing requires monitoring and optimization of multiple communications links. Most techniques to optimize these links use queueing network models or network flow models, which require some idea of the actual execution rate of each independent compute kernel within the system. What we want to know is how fast can each kernel process data independent of other communicating kernels. This is known as the "service rate" of the kernel within the queueing literature. Current approaches to divining service rates are static. Modern workloads, however, are often dynamic. Shared cloud systems also present applications with highly dynamic execution environments (multiple users, hardware migration, etc.). It is therefore desirable to continuously re-tune an application during run time (online) in response to changing conditions. Our approach enables online service rate monitoring under most conditions, obviating the need for reliance on steady state predictions for what are probably non-steady state phenomena. First, some of the difficulties associated with online service rate determination are examined. Second, the algorithm to approximate the online non-blocking service rate is described. Lastly, the algorithm is implemented within the open source RaftLib framework for validation using a simple microbenchmark as well as two full streaming applications.Comment: technical repor

    Resource optimization of edge servers dealing with priority-based workloads by utilizing service level objective-aware virtual rebalancing

    Get PDF
    IoT enables profitable communication between sensor/actuator devices and the cloud. Slow network causing Edge data to lack Cloud analytics hinders real-time analytics adoption. VRebalance solves priority-based workload performance for stream processing at the Edge. BO is used in VRebalance to prioritize workloads and find optimal resource configurations for efficient resource management. Apache Storm platform was used with RIoTBench IoT benchmark tool for real-time stream processing. Tools were used to evaluate VRebalance. Study shows VRebalance is more effective than traditional methods, meeting SLO targets despite system changes. VRebalance decreased SLO violation rates by almost 30% for static priority-based workloads and 52.2% for dynamic priority-based workloads compared to hill climbing algorithm. Using VRebalance decreased SLO violations by 66.1% compared to Apache Storm\u27s default allocation

    Combining edge and cloud computing for mobility analytics

    Full text link
    Mobility analytics using data generated from the Internet of Mobile Things (IoMT) is facing many challenges which range from the ingestion of data streams coming from a vast number of fog nodes and IoMT devices to avoiding overflowing the cloud with useless massive data streams that can trigger bottlenecks [1]. Managing data flow is becoming an important part of the IoMT because it will dictate in which platform analytical tasks should run in the future. Data flows are usually a sequence of out-of-order tuples with a high data input rate, and mobility analytics requires a real-time flow of data in both directions, from the edge to the cloud, and vice-versa. Before pulling the data streams to the cloud, edge data stream processing is needed for detecting missing, broken, and duplicated tuples in addition to recognize tuples whose arrival time is out of order. Analytical tasks such as data filtering, data cleaning and low-level data contextualization can be executed at the edge of a network. In contrast, more complex analytical tasks such as graph processing can be deployed in the cloud, and the results of ad-hoc queries and streaming graph analytics can be pushed to the edge as needed by a user application. Graphs are efficient representations used in mobility analytics because they unify knowledge about connectivity, proximity and interaction among moving things. This poster describes the preliminary results from our experimental prototype developed for supporting transit systems, in which edge and cloud computing are combined to process transit data streams forwarded from fog nodes into a cloud. The motivation of this research is to understand how to perform meaningfulness mobility analytics on transit feeds by combining cloud and fog computing architectures in order to improve fleet management, mass transit and remote asset monitoringComment: Edge Computing, Cloud Computing, Mobility Analytics, Internet of Mobile Things, Edge Fog Fabri

    Fog Computing: Issues, Challenges and Future Directions

    Get PDF
    In Cloud Computing, all the processing of the data collected by the node is done in the central server. This involves a lot of time as data has to be transferred from the node to central server before the processing of data can be done in the server. Also it is not practical to stream terabytes of data from the node to the cloud and back. To overcome these disadvantages, an extension of cloud computing, known as fog computing, is introduced. In this, the processing of data is done completely in the node if the data does not require higher computing power and is done partially if the data requires high computing power, after which the data is transferred to the central server for the remaining computations. This greatly reduces the time involved in the process and is more efficient as the central server is not overloaded. Fog is quite useful in geographically dispersed areas where connectivity can be irregular. The ideal use case requires intelligence near the edge where ultra-low latency is critical, and is promised by fog computing. The concepts of cloud computing and fog computing will be explored and their features will be contrasted to understand which is more efficient and better suited for real-time application

    The Intersection of Function-as-a-Service and Stream Computing

    Get PDF
    With recent advancements in the field of computing including the emergence of cloud computing, the consumption and accessibility of computational resources have increased drastically. Although there have been significant movements towards more sustainable computing, there are many more steps to be taken to decrease the amount of energy consumed and greenhouse gases released from the computing sector. Historically, the switch from on-premises computing to cloud computing has led to less energy consumption through the design of efficient data centers. By releasing direct control of the hardware that their software is run on, an organization can also increase efficiency and reduce costs. A new development in cloud computing has been serverless computing. Even though the term "serverless" is a misnomer because all applications are still executed on servers, serverless lets an organization resign another level of control, managing instances of virtual machines, to their cloud provider in order to reduce their cost. The cloud provider then provisions resources on-demand enabling less idle time. This reduction of idle time is a direct reduction of computing resources used, therefore resulting in a decrease in energy consumption. One form of serverless computing, Function-as-a-Service (FaaS), may have a promising future replacing some stream computing applications in order to increase efficiency and reduce waste. To explore these possibilities, the development of a stream processing application using traditional methods through Kafka Streams and FaaS through AWS Lambda was completed in order to demonstrate that FaaS can be used for stateless stream processing

    Curracurrong: a stream processing system for distributed environments

    Get PDF
    Advances in technology have given rise to applications that are deployed on wireless sensor networks (WSNs), the cloud, and the Internet of things. There are many emerging applications, some of which include sensor-based monitoring, web traffic processing, and network monitoring. These applications collect large amount of data as an unbounded sequence of events and process them to generate a new sequences of events. Such applications need an adequate programming model that can process large amount of data with minimal latency; for this purpose, stream programming, among other paradigms, is ideal. However, stream programming needs to be adapted to meet the challenges inherent in running it in distributed environments. These challenges include the need for modern domain specific language (DSL), the placement of computations in the network to minimise energy costs, and timeliness in real-time applications. To overcome these challenges we developed a stream programming model that achieves easy-to-use programming interface, energy-efficient actor placement, and timeliness. This thesis presents Curracurrong, a stream data processing system for distributed environments. In Curracurrong, a query is represented as a stream graph of stream operators and communication channels. Curracurrong provides an extensible stream operator library and adapts to a wide range of applications. It uses an energy-efficient placement algorithm that optimises communication and computation. We extend the placement problem to support dynamically changing networks, and develop a dynamic program with polynomially bounded runtime to solve the placement problem. In many stream-based applications, real-time data processing is essential. We propose an approach that measures time delays in stream query processing; this model measures the total computational time from input to output of a query, i.e., end-to-end delay

    Curracurrong: a stream processing system for distributed environments

    Get PDF
    Advances in technology have given rise to applications that are deployed on wireless sensor networks (WSNs), the cloud, and the Internet of things. There are many emerging applications, some of which include sensor-based monitoring, web traffic processing, and network monitoring. These applications collect large amount of data as an unbounded sequence of events and process them to generate a new sequences of events. Such applications need an adequate programming model that can process large amount of data with minimal latency; for this purpose, stream programming, among other paradigms, is ideal. However, stream programming needs to be adapted to meet the challenges inherent in running it in distributed environments. These challenges include the need for modern domain specific language (DSL), the placement of computations in the network to minimise energy costs, and timeliness in real-time applications. To overcome these challenges we developed a stream programming model that achieves easy-to-use programming interface, energy-efficient actor placement, and timeliness. This thesis presents Curracurrong, a stream data processing system for distributed environments. In Curracurrong, a query is represented as a stream graph of stream operators and communication channels. Curracurrong provides an extensible stream operator library and adapts to a wide range of applications. It uses an energy-efficient placement algorithm that optimises communication and computation. We extend the placement problem to support dynamically changing networks, and develop a dynamic program with polynomially bounded runtime to solve the placement problem. In many stream-based applications, real-time data processing is essential. We propose an approach that measures time delays in stream query processing; this model measures the total computational time from input to output of a query, i.e., end-to-end delay

    Decentralized Control of Distributed Cloud Networks with Generalized Network Flows

    Full text link
    Emerging distributed cloud architectures, e.g., fog and mobile edge computing, are playing an increasingly important role in the efficient delivery of real-time stream-processing applications such as augmented reality, multiplayer gaming, and industrial automation. While such applications require processed streams to be shared and simultaneously consumed by multiple users/devices, existing technologies lack efficient mechanisms to deal with their inherent multicast nature, leading to unnecessary traffic redundancy and network congestion. In this paper, we establish a unified framework for distributed cloud network control with generalized (mixed-cast) traffic flows that allows optimizing the distributed execution of the required packet processing, forwarding, and replication operations. We first characterize the enlarged multicast network stability region under the new control framework (with respect to its unicast counterpart). We then design a novel queuing system that allows scheduling data packets according to their current destination sets, and leverage Lyapunov drift-plus-penalty theory to develop the first fully decentralized, throughput- and cost-optimal algorithm for multicast cloud network flow control. Numerical experiments validate analytical results and demonstrate the performance gain of the proposed design over existing cloud network control techniques

    Incremental Processing and Optimization of Update Streams

    Get PDF
    Over the recent years, we have seen an increasing number of applications in networking, sensor networks, cloud computing, and environmental monitoring, which monitor, plan, control, and make decisions over data streams from multiple sources. We are interested in extending traditional stream processing techniques to meet the new challenges of these applications. Generally, in order to support genuine continuous query optimization and processing over data streams, we need to systematically understand how to address incremental optimization and processing of update streams for a rich class of queries commonly used in the applications. Our general thesis is that efficient incremental processing and re-optimization of update streams can be achieved by various incremental view maintenance techniques if we cast the problems as incremental view maintenance problems over data streams. We focus on two incremental processing of update streams challenges currently not addressed in existing work on stream query processing: incremental processing of transitive closure queries over data streams, and incremental re-optimization of queries. In addition to addressing these specific challenges, we also develop a working prototype system Aspen, which serves as an end-to-end stream processing system that has been deployed as the foundation for a case study of our SmartCIS application. We validate our solutions both analytically and empirically on top of our prototype system Aspen, over a variety of benchmark workloads such as TPC-H and LinearRoad Benchmarks
    • …
    corecore