
    Scalable and responsive real time event processing using cloud computing

    PhD Thesis. Cloud computing provides the potential for scalability and adaptability in a cost-effective manner. However, achieving scalability for real-time applications requires keeping response time low. Many applications require good performance and low response time, which must be matched by dynamic resource allocation. Real-time processing requirements are further characterized by unpredictable rates of incoming data streams and dynamic outbursts of data, which raises the issue of processing the data streams across multiple cloud computing nodes. This research analyzes methodologies for processing real-time data in which applications are structured as multiple event processing networks and partitioned over the set of available cloud nodes. The approach applies queuing theory principles to cloud computing. The transformation of raw data into useful outputs occurs in various stages of the processing networks, which are distributed across multiple computing nodes in a cloud. A set of valid options is created to capture the response time requirements of each application, and under a given valid set of conditions that meet the response time criteria, multiple instances of event processing networks are distributed across the cloud nodes. A generic methodology for scaling the event processing networks up and down in accordance with the response time criteria is defined. Real-time applications that support sophisticated decision support mechanisms must meet response time criteria over interdependent dataflow paradigms, which makes performance harder to improve. Consideration is given to ways of reducing latency and improving the response time and throughput of real-time applications by distributing the event processing networks over multiple computing nodes.
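    As a minimal sketch of the queuing-theoretic sizing idea (my illustration, not the thesis's model: it assumes each event processing network instance behaves as an M/M/1 queue with Poisson arrivals split evenly across instances, and all names and rates below are invented):

    import math

    def instances_for_response_time(lam, mu, target_rt):
        """Smallest instance count n such that the per-instance M/M/1 mean
        response time 1 / (mu - lam/n) meets target_rt (seconds).
        lam: total arrival rate (events/s); mu: service rate per instance."""
        if target_rt <= 1.0 / mu:
            raise ValueError("target below the 1/mu service-time floor")
        n = math.ceil(lam / mu) + 1          # start from a stable configuration
        while 1.0 / (mu - lam / n) > target_rt:
            n += 1
        return n

    # Scale-up/scale-down rule: re-evaluate whenever the observed arrival
    # rate changes, e.g. 900 events/s, 250 events/s per node, 20 ms target.
    print(instances_for_response_time(lam=900.0, mu=250.0, target_rt=0.02))

    A scale-down decision follows from the same computation: if the returned n is below the currently deployed instance count, the surplus instances can be released.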

    Performance management of event processing systems

    This thesis is a study of the performance management of Complex Event Processing (CEP) systems. CEP systems have characteristics distinct from other well-studied computer systems, such as batch and online transaction processing systems and database-centric applications, and these characteristics introduce new challenges and opportunities for performance management. The methodologies used to benchmark CEP systems in many performance studies focus on scaling the load injection but do not consider the impact of the functional capabilities of CEP systems. This thesis proposes evaluating the performance of CEP engines' functional behaviours on events, and develops a benchmark platform for CEP systems, CEPBen, to explore the fundamental functional performance of event processing systems: filtering, transformation and event pattern detection. CEPBen is also designed to provide a flexible environment for exploring new metrics and influential factors for CEP systems and evaluating their performance. Studies on factors and new metrics are carried out using the CEPBen platform on Esper. Different measurement points for response time in the performance management of CEP systems are discussed, and the response time of a targeted event is proposed as a quality-of-service metric to be used alongside traditional response time in CEP systems. Maximum query load is proposed as a capacity indicator with respect to query complexity, and the number of live objects in memory as a performance indicator with respect to memory management. Query depth is studied as a factor that influences CEP system performance.
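    As an illustrative sketch of the targeted-event metric (my example; CEPBen's actual harness and the Esper API are not reproduced here), one event in the injected stream is tagged as the target, and its individual delay through a stand-in filtering stage is measured alongside the traditional mean response time:

    import time

    def filter_stage(event):
        # Stand-in for an EPL-style filter such as
        # "select * from StockTick where price > 100" (illustrative query).
        return event["price"] > 100

    events = [{"id": i, "price": 90 + i} for i in range(50)]
    target_id = 25                        # the targeted event

    t_inject, t_detect = {}, {}
    for ev in events:
        t_inject[ev["id"]] = time.perf_counter()
        if filter_stage(ev):
            t_detect[ev["id"]] = time.perf_counter()

    # Targeted-event response time vs. the traditional mean response time.
    rt_target = t_detect[target_id] - t_inject[target_id]
    rt_mean = sum(t_detect[i] - t_inject[i] for i in t_detect) / len(t_detect)
    print(f"targeted: {rt_target:.6f}s  mean: {rt_mean:.6f}s")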

    On the cloud deployment of a session abstraction for service/data aggregation

    Dissertation submitted for the Master's degree in Informatics Engineering. The global cyber-infrastructure comprises a growing number of resources spanning several abstraction layers. These resources, which can include wireless sensor devices or mobile networks, share common requirements such as richer interconnection capabilities and increasing data consumption demands. Additionally, the service model is now widely adopted, supporting the development and execution of distributed applications. In this context, new challenges are emerging around the "big data" topic. These challenges include service access optimizations, such as data-access context sharing, more efficient data filtering/aggregation mechanisms, and adaptable service access models that can respond to context changes. Service access characteristics can be aggregated to capture specific interaction models. Moreover, ubiquitous service access is a growing requirement, particularly for mobile clients such as tablets and smartphones. The Session concept aggregates the service access characteristics, creating specific interaction models which can then be reused in similar contexts. Existing Session abstraction implementations also allow dynamic reconfiguration of these interaction models, so that a model can adapt to context changes based on service, client or underlying communication medium variables. Cloud computing, on the other hand, provides ubiquitous access along with large-scale data persistence and processing services. This thesis proposes a Session abstraction implementation deployed on a Cloud platform in the form of a middleware. The middleware captures rich, dynamic interaction models between users with similar interests and provides a generic mechanism for interacting with datasources based on multiple protocols. Such an abstraction contextualizes service/user interactions and can be reused by other users in similar contexts. The implementation also provides data persistence by saving all data in transit in a Cloud-based repository. The middleware thus delivers richer datasource-access interaction models and dynamic reconfiguration, and allows the integration of heterogeneous datasources. The solution also provides ubiquitous access, allowing client connections from standard Web browsers or Android-based mobile devices.
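    As a purely illustrative sketch (the class and method names below are invented, not the thesis's API), a Session abstraction of this kind might aggregate service-access characteristics and support dynamic reconfiguration as follows:

    from dataclasses import dataclass, field

    @dataclass
    class Session:
        """Captures one interaction model: protocol, filters, persistence."""
        protocol: str                                  # e.g. "http", "mqtt"
        filters: list = field(default_factory=list)    # data filters to apply
        persist: bool = True                           # mirror data in transit
        log: list = field(default_factory=list)        # stand-in repository

        def reconfigure(self, **changes):
            # Adapt the interaction model to a context change (client,
            # service, or communication medium).
            for key, value in changes.items():
                setattr(self, key, value)

        def fetch(self, raw_events):
            data = [e for e in raw_events if all(f(e) for f in self.filters)]
            if self.persist:
                self.log.extend(data)   # persist all data in transit
            return data

    # A session tuned for a mobile client on a constrained link, later
    # reconfigured when the context changes:
    s = Session(protocol="mqtt", filters=[lambda e: e["priority"] == "high"])
    print(s.fetch([{"priority": "high"}, {"priority": "low"}]))
    s.reconfigure(protocol="http", persist=False)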

    Workload Management for Data-Intensive Services

    Data-intensive web services are typically composed of three tiers: i) a display tier that interacts with users and serves rich content to them, ii) a storage tier that stores the user-generated or machine-generated data used to create this content, and iii) an analytics tier that runs data analysis tasks in order to create and optimize new content. Each tier has different workloads and requirements that result in a diverse set of systems being used in modern data-intensive web services.

    Servers are provisioned dynamically in the display tier to ensure that interactive client requests are served as per the latency and throughput requirements. The challenge is not only deciding automatically how many servers to provision but also when to provision them, while ensuring stable system performance and high resource utilization. To address these challenges, we have developed a new control policy for provisioning resources dynamically in coarse-grained units (e.g., adding or removing servers or virtual machines in cloud platforms). Our new policy, called proportional thresholding, converts a user-specified performance target value into a target range in order to account for the relative effect of provisioning a server on the overall workload performance.

    The storage tier is similar to the display tier in some respects, but poses the additional challenge of needing redistribution of stored data when new storage nodes are added or removed. Thus, there will be some delay before the effects of changing a resource allocation will appear. Moreover, redistributing data can cause some interference to the current workload because it uses resources that can otherwise be used for processing requests. We have developed a system, called Elastore, that addresses the new challenges found in the storage tier. Elastore not only coordinates resource allocation and data redistribution to preserve stability during dynamic resource provisioning, but it also finds the best tradeoff between workload interference and data redistribution time.

    The workload in the analytics tier consists of data-parallel workflows that can either be run in a batch fashion or continuously as new data becomes available. Each workflow is composed of smaller units that have producer-consumer relationships based on data. These workflows are often generated from declarative specifications in languages like SQL, so there is a need for a cost-based optimizer that can generate an efficient execution plan for a given workflow. There are a number of challenges when building a cost-based optimizer for data-parallel workflows, which include characterizing the large execution plan space, developing cost models to estimate the execution costs, and efficiently searching for the best execution plan. We have built two cost-based optimizers: Stubby for batch data-parallel workflows running on MapReduce systems, and Cyclops for continuous data-parallel workflows where the choice of execution system is made a part of the execution plan space.

    We have conducted a comprehensive evaluation that shows the effectiveness of each tier's automated workload management solution.
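    A minimal sketch of the proportional-thresholding idea (my illustration; it assumes the controlled metric is per-server utilization with load spread evenly, and derives the range width from the relative effect of one server on a fleet of n):

    def target_range(upper, n):
        """Return (low, high) for fleet size n, given the user's single
        target value `upper`. The low bound is placed so that removing
        one server cannot push utilization past `upper`."""
        low = upper * (n - 1) / n if n > 1 else 0.0
        return low, upper

    def control_step(utilization, n, upper=0.7):
        low, high = target_range(upper, n)
        if utilization > high:
            return n + 1        # scale out
        if utilization < low:
            return n - 1        # scale in; new load is about util * n/(n-1)
        return n                # inside the range: hold steady

    n = 4
    for util in (0.75, 0.40, 0.60):
        n = control_step(util, n)
        print(f"util={util:.2f} -> {n} servers")

    Using a range rather than a single setpoint avoids oscillation: with a fixed point target, every coarse-grained step of one whole server would overshoot or undershoot it.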

    Real-Time QoS Monitoring and Anomaly Detection on Microservice-based Applications in Cloud-Edge Infrastructure

    Ph.D. Thesis. Microservices have emerged as a new approach for developing and deploying cloud applications that require higher levels of agility, scale, and reliability. A microservice-based cloud application architecture advocates decomposition of monolithic application components into independent software components called "microservices". As the independent microservices can be developed, deployed, and updated independently of each other, this leads to complex run-time performance monitoring and management challenges. The deployment environment for microservices in multi-cloud environments is very complex, as there are numerous components running in heterogeneous environments (VM/container) and communicating frequently with each other using REST-based/REST-less APIs. In some cases, multiple components can also be executed inside a single VM/container, making failure or anomaly detection very complicated. It is necessary to monitor the performance variation of all the service components to detect any cause of failure. Microservice and container architecture makes it possible to design loosely coupled services and run them in a lightweight runtime environment for more efficient scaling. Thus, container-based microservice deployment is now the standard model for hosting cloud applications across industries. Although the strong scalability of this model opens the door to further optimizations in both application structure and performance, it adds an additional level of complexity to monitoring application performance: a monitoring system that cannot quickly and reliably detect failures and localize their causes can lead to severe application outages. Machine learning-based techniques have been applied to detect anomalies in microservice-based cloud applications, and existing research has used different tracking algorithms to search for the root causes of observed anomalous behaviour. However, linking the observed failures of an application with their root causes using these techniques is still an open research problem (a minimal detection sketch follows the contribution list below). Osmotic computing is a new IoT application programming paradigm driven by the significant increase in resource capacity/capability at the network edge, along with support for data transfer protocols that enable such resources to interact more seamlessly with cloud-based services. Much of the difficulty in Quality of Service (QoS) and performance monitoring of IoT applications in an osmotic computing environment is due to the massive scale and heterogeneity (IoT + edge + cloud) of the computing environments. To handle monitoring and anomaly detection of microservices in cloud and edge datacenters, this thesis presents multilateral research towards monitoring and anomaly detection of microservice-based application performance in cloud-edge infrastructure. The key contributions of this thesis are as follows:
    • It introduces a novel system, Multi-microservices Multi-virtualization Multi-cloud monitoring (M3), that provides a holistic approach to monitoring the performance of microservice-based application stacks deployed across multiple cloud data centers.
    • A framework for a Monitoring, Anomaly Detection and Localization System (MADLS), which utilizes a simplified approach that depends on commonly available metrics, offering a simplified deployment environment for the developer.
    • Developing a unified monitoring model for cloud-edge that provides an IoT application administrator with detailed QoS information related to microservices deployed across cloud and edge datacenters.
    Funding: Royal Embassy of Saudi Arabia Cultural Bureau in London, Government of Saudi Arabia.
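    As a minimal, illustrative sketch of metric-based anomaly detection on microservice telemetry (my example using scikit-learn's IsolationForest; this is not the thesis's MADLS pipeline), a model trained on commonly available metrics can flag observations for further root-cause localization:

    import numpy as np
    from sklearn.ensemble import IsolationForest

    # Train on healthy telemetry: response time (ms) and CPU utilization (%).
    rng = np.random.default_rng(0)
    normal = np.column_stack([rng.normal(20, 3, 500),     # response time
                              rng.normal(35, 5, 500)])    # CPU utilization
    model = IsolationForest(contamination=0.01, random_state=0).fit(normal)

    # Score fresh observations; -1 marks an anomaly for root-cause analysis.
    fresh = np.array([[21.0, 36.0],      # healthy
                      [180.0, 95.0]])    # latency spike + CPU saturation
    print(model.predict(fresh))           # expected: [ 1 -1 ]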

    Automating Computational Placement for the Internet of Things

    PhD Thesis. The PATH2iot platform presents a new approach to distributed data analytics for Internet of Things applications. It automatically partitions and deploys stream-processing computations over the available infrastructure (e.g. sensors, field gateways, clouds and the networks that connect them) so as to meet non-functional requirements including network limitations and energy. To enable this, the user gives a high-level declarative description of the computation as a set of Event Processing Language queries. These are compiled, optimised, and partitioned using a combination of distributed query processing techniques that optimise the computation, and cost models that enable PATH2iot to select the best deployment plan given the non-functional requirements. This thesis describes the resulting PATH2iot system, illustrated with two real-world use cases. The first is a digital healthcare analytics system in which sensor battery life is the main non-functional requirement to be optimised. It shows that the tool can automatically partition and distribute the computation across a healthcare wearable, a mobile phone and the cloud, increasing the battery life of the smart watch by 453% compared to other possible allocations. The energy cost of sending messages over a wireless network is a key component of the cost model, and we show how this can be modelled; the uncertainty of the model is addressed with two alternative approaches, one frequentist and one Bayesian. The second use case is one in which an acoustic data analytics computation for transport monitoring is automatically distributed so as to enable it to run over a low-bandwidth LoRa network connecting the sensor to the cloud. Overall, the thesis shows how the PATH2iot system can automatically bring the benefits of edge computing to the increasing set of IoT applications that perform distributed data analytics.
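    To illustrate the kind of cost-model-driven plan selection described above (a sketch under my own assumptions: the linear per-message energy model and all constants are invented, not PATH2iot's calibrated model):

    # Assumed wireless-energy constants (illustrative only).
    E_WAKE_MJ = 2.0        # radio wake-up cost per message (mJ)
    E_PER_BYTE_MJ = 0.01   # marginal cost per byte sent (mJ)

    def plan_energy(msgs_per_hour, bytes_per_msg):
        """Hourly transmission energy of one candidate deployment plan."""
        return msgs_per_hour * (E_WAKE_MJ + E_PER_BYTE_MJ * bytes_per_msg)

    # Where the filtering/aggregation operators are placed changes how many
    # messages cross the watch -> phone wireless hop per hour.
    plans = {
        "send raw samples":          plan_energy(3600 * 50, 24),
        "filter on watch":           plan_energy(3600 * 5, 24),
        "filter+aggregate on watch": plan_energy(60, 48),
    }
    best = min(plans, key=plans.get)
    print(best, f"{plans[best]:.1f} mJ/h")

    Fewer, larger messages amortise the fixed wake-up cost, which is why pushing operators towards the sensor tends to win under this kind of model.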

    Intelligent IoT and Dynamic Network Semantic Maps for more Trustworthy Systems

    As technology evolves, the Internet of Things (IoT) is gaining importance as a foundation for optimal connectivity between people and things. For this to happen, and to ease the integration of sensors and other devices into these technological environments (or networks), configuration is a key process: it promotes interoperability between heterogeneous devices and provides strategies and processes that enhance the network's capabilities. Optimizing this important step towards a truly dynamic network must rest on models that standardize the communication patterns, protocols and technologies used between sensors. Although this stands as a major trend today, many obstacles still arise when implementing an intelligent dynamic network: existing models are not as widely adopted as expected, and semantics are often not properly represented, resulting in complex and unacceptably long configuration times. This work therefore aims to identify suitable models and ontologies for building such architectures and semantic maps, which allow management and redundancy based on information from the whole network without compromising performance, and to develop a competent configuration of sensors for integration into a typical contemporary industrial dynamic network.

    Semantic IoT for reasoning and BigData analytics

    Recent developments in the IoT industries have led to an increase in data availability that is starting to weigh heavily on the traditional idea of pushing data to the Cloud. This study focuses on identifying tasks that can be pulled from the Cloud in a semantic stream processing context.

    Innovative techniques for deployment of microservices in cloud-edge environment

    PhD Thesis. The evolution of microservice architecture allows complex applications to be structured into independent modular components (microservices), making them easier to develop and manage. Complemented with containers, microservices can be deployed across any cloud and edge environment. Although containerized microservices are gaining popularity in industry, comparatively little research is available, especially in the areas of performance characterization and optimized deployment of microservices. Depending on the application type (e.g. web, streaming) and the provided functionalities (e.g. filtering, encryption/decryption, storage), microservices are heterogeneous, with specific functional and Quality of Service (QoS) requirements. Further, cloud and edge environments are themselves complex, with a huge number of cloud providers and edge devices along with their host configurations. Due to these complexities, finding a suitable deployment solution for microservices becomes challenging. To handle the deployment of microservices in cloud and edge environments, this thesis presents multilateral research towards microservice performance characterization, run-time evaluation and system orchestration. Considering a variety of applications, numerous algorithms and policies have been proposed, implemented and prototyped. The main contributions of this thesis are given below (a placement sketch follows the list):
    • Characterizes the performance of containerized microservices considering various types of interference in the cloud environment.
    • Proposes and models an orchestrator, SDBO, for benchmarking simple web-application microservices in a multi-cloud environment. SDBO is validated using an e-commerce test web-application.
    • Proposes and models an advanced orchestrator, GeoBench, for the deployment of complex web-application microservices in a multi-cloud environment. GeoBench is validated using a geo-distributed test web-application.
    • Proposes and models a run-time deployment framework for distributed streaming application microservices in a hybrid cloud-edge environment. The model is validated using a real-world healthcare analytics use case for human activity recognition.
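    As a minimal sketch of the deployment-matching problem described above (entirely illustrative; the greedy scoring and all names are mine, not the thesis's SDBO/GeoBench algorithms), a placer can match each microservice's QoS requirements against candidate cloud/edge hosts:

    hosts = [  # name, available CPU cores, network latency (ms), $/hour
        {"name": "edge-gw",  "cpu": 2,  "latency": 5,  "cost": 0.02},
        {"name": "cloud-s",  "cpu": 4,  "latency": 40, "cost": 0.05},
        {"name": "cloud-xl", "cpu": 16, "latency": 40, "cost": 0.40},
    ]
    services = [  # name, CPU cores needed, max tolerable latency (ms)
        {"name": "sensor-filter", "cpu": 1, "max_latency": 10},
        {"name": "encryptor",     "cpu": 2, "max_latency": 50},
        {"name": "storage-api",   "cpu": 8, "max_latency": 100},
    ]

    placement = {}
    for svc in services:
        feasible = [h for h in hosts
                    if h["cpu"] >= svc["cpu"] and h["latency"] <= svc["max_latency"]]
        if not feasible:
            raise RuntimeError(f"no host satisfies {svc['name']}")
        best = min(feasible, key=lambda h: h["cost"])  # cheapest feasible host
        best["cpu"] -= svc["cpu"]                      # reserve capacity
        placement[svc["name"]] = best["name"]

    print(placement)  # e.g. sensor-filter -> edge-gw, encryptor -> cloud-s, ...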