125 research outputs found

    Big Data Application and System Co-optimization in Cloud and HPC Environment

    The emergence of big data requires powerful computational resources and memory subsystems that can be scaled efficiently to accommodate its demands. The cloud is a well-established computing paradigm that can offer customized computing and memory resources to meet the scalable demands of big data applications. In addition, the flexible pay-as-you-go pricing model offers opportunities to use large-scale resources at low cost and with no infrastructure maintenance burden. High performance computing (HPC), on the other hand, also has powerful infrastructure with the potential to support big data applications. In this dissertation, we explore application and system co-optimization opportunities to support big data in both cloud and HPC environments. Specifically, we exploit the unique features of both application and system to uncover overlooked optimization opportunities and to tackle challenges that are difficult to address by looking at the application or the system alone. To derive optimized deployment and runtime schemes from the characteristics of the workloads and their underlying systems, we divide the workloads into four categories: 1) memory intensive applications; 2) compute intensive applications; 3) both memory and compute intensive applications; 4) I/O intensive applications.

    When deploying memory intensive big data applications to public clouds, one important yet challenging problem is selecting a specific instance type whose memory capacity is large enough to prevent out-of-memory errors while the cost is minimized without violating performance requirements. We propose two techniques for efficient deployment of big data applications with dynamic and intensive memory footprints in the cloud. The first approach builds a performance-cost model that can accurately predict how, and by how much, virtual memory size would slow down the application and, consequently, impact the overall monetary cost. The second approach employs a lightweight memory usage prediction methodology based on dynamic meta-models adjusted by the application's own traits. The key idea is to eliminate periodic checkpointing and migrate the application only when the predicted memory usage exceeds the physical allocation.

    When moving compute intensive applications to the cloud, it is critical to make them scalable so that they can benefit from the massive cloud resources. We first use Kirchhoff's law, one of the most widely used physical laws in many engineering disciplines, as an example workload for our study. The key challenge of applying Kirchhoff's law to real-world applications at scale lies in the high, if not prohibitive, computational cost of solving a large number of nonlinear equations. We propose a high-performance deep-learning-based approach for Kirchhoff analysis, namely HDK. HDK employs two techniques to improve performance: (i) early pruning of unqualified input candidates, which simplifies the equations and selects a meaningful input data range; and (ii) parallelization of forward labelling, which executes steps of the problem in parallel.

    For applications in the cloud that are both memory and compute intensive, we use a blockchain system as a benchmark. Existing blockchain frameworks present a technical barrier for many users who want to modify or test new research ideas in blockchains. To make matters worse, many advantages of blockchain systems can be demonstrated only at large scales, which are not always available to researchers. We develop an accurate and efficient emulation system to replay the execution of large-scale blockchain systems on tens of thousands of nodes in the cloud.

    For I/O intensive applications, we observe one important yet often neglected side effect of lossy scientific data compression. Lossy compression techniques have demonstrated promising results in significantly reducing scientific data size while guaranteeing compression error bounds, but the compressed data sizes are often highly skewed and thus impact the performance of parallel I/O. We therefore believe it is critical to pay more attention to the unbalanced parallel I/O caused by lossy scientific data compression.
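    The instance-selection problem described above can be viewed as cost minimization under a predicted-slowdown constraint. A minimal sketch, where the instance catalog, the slowdown model, and all numbers are hypothetical illustrations rather than the dissertation's actual model:

```python
# Hypothetical instance catalog: (name, memory in GiB, price per hour in USD).
INSTANCES = [
    ("small",  8,  0.10),
    ("medium", 16, 0.20),
    ("large",  32, 0.40),
]

def predicted_slowdown(app_mem_gib, instance_mem_gib):
    """Toy stand-in for a performance-cost model: assume execution slows
    in proportion to the memory the application must spill to virtual
    memory (1.0 means no slowdown; the 0.5 factor is invented)."""
    spill = max(0.0, app_mem_gib - instance_mem_gib)
    return 1.0 + 0.5 * spill

def cheapest_instance(app_mem_gib, base_runtime_h, max_runtime_h):
    """Pick the instance that minimizes monetary cost while keeping the
    predicted runtime within the performance requirement."""
    best = None
    for name, mem, price in INSTANCES:
        runtime = base_runtime_h * predicted_slowdown(app_mem_gib, mem)
        if runtime > max_runtime_h:
            continue  # violates the performance requirement
        cost = runtime * price
        if best is None or cost < best[1]:
            best = (name, cost)
    return best

print(cheapest_instance(app_mem_gib=12, base_runtime_h=2.0, max_runtime_h=3.0))
```

    Under these invented numbers, the 8 GiB instance is rejected because spilling 4 GiB pushes the predicted runtime past the limit, and the 16 GiB instance wins on cost.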

    Robust, Resilient and Reliable Architecture for V2X Communication

    The new developments in mobile edge computing (MEC) and vehicle-to-everything (V2X) communications have positioned 5G and beyond to answer the market need for future intelligent transportation systems and smart city applications. The major attractive features of V2X communication are its inherent ability to adapt to any type of network, device, or data, and to ensure robustness, resilience and reliability of the network, which is challenging to realize. In this work, we drive these features further by proposing a novel robust, resilient and reliable architecture for V2X communication that harnesses MEC and blockchain technology. A three-stage computing service is proposed. Firstly, a hierarchical computing architecture is deployed spanning the vehicular network, comprising cloud computing (CC), edge computing (EC) and fog computing (FC) nodes. Resources and databases can migrate from the high-capacity cloud services (furthest from the individual nodes of the network) to the edge (intermediate) and low-level fog nodes, according to computing service requirements. Secondly, resource allocation filters the data according to its significance, ranks the nodes according to their usability, and selects the network technology according to its physical channel characteristics. Thirdly, we propose a blockchain-based transaction service that ensures reliability. We discuss two use cases for experimental analysis: plug-in electric vehicles in smart grid scenarios, and massive IoT data services for autonomous cars. The results show that car connectivity prediction is accurate 98% of the time, and that 92% more data blocks are added using the micro-blockchain solution compared to a public blockchain; the solution reduces the time to sign and compute the proof-of-work (PoW) and delivers a low-overhead proof-of-stake (PoS) consensus mechanism.
    This approach can be considered a strong candidate architecture for future V2X and, more generally, for everything-to-everything (X2X) communications.
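    The three-tier migration idea above (cloud, edge, fog) can be illustrated with a simple placement rule; the latency and capacity figures below are invented for illustration and do not come from the paper:

```python
# Hypothetical tiers: (name, round-trip latency in ms, compute capacity units),
# ordered from closest to the vehicle (least capacity) to furthest (most).
TIERS = [
    ("fog",   5,   10),
    ("edge",  20,  100),
    ("cloud", 100, 10000),
]

def place_task(latency_budget_ms, compute_units):
    """Assign a task to the lowest tier satisfying both the latency
    budget and the compute requirement, mirroring the idea of migrating
    work from the cloud toward fog when service requirements allow."""
    for name, latency, capacity in TIERS:
        if latency <= latency_budget_ms and compute_units <= capacity:
            return name
    return None  # no tier can serve this task

print(place_task(latency_budget_ms=10, compute_units=5))    # light, latency-critical
print(place_task(latency_budget_ms=200, compute_units=500)) # heavy, latency-tolerant
```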

    Report from GI-Dagstuhl Seminar 16394: Software Performance Engineering in the DevOps World

    This report documents the program and the outcomes of GI-Dagstuhl Seminar 16394 "Software Performance Engineering in the DevOps World". The seminar addressed the problem of performance-aware DevOps. Both DevOps and performance engineering have been growing trends over the past one to two years, in no small part due to the rising importance of identifying performance anomalies in the operations (Ops) of cloud and big data systems and feeding these back to development (Dev). However, so far, the research community has treated software engineering, performance engineering, and cloud computing mostly as individual research areas. We aimed to identify cross-community collaboration and to set the path for long-lasting collaborations towards performance-aware DevOps. The main goal of the seminar was to bring together young researchers (PhD students in a later stage of their PhD, as well as postdocs or junior professors) in the areas of (i) software engineering, (ii) performance engineering, and (iii) cloud computing and big data to present their current research projects, to exchange experience and expertise, to discuss research challenges, and to develop ideas for future collaborations.

    Enhancing data privacy and security in Internet of Things through decentralized models and services

    exploits a Byzantine Fault Tolerant (BFT) blockchain, in order to perform collaborative and dynamic botnet detection by collecting and auditing IoT devices' network traffic flows as blockchain transactions. Secondly, we take on the challenge of decentralizing IoT and design a hybrid blockchain architecture for IoT, proposing Hybrid-IoT. In Hybrid-IoT, subgroups of IoT devices form PoW blockchains, referred to as PoW sub-blockchains. Connection among the PoW sub-blockchains employs a BFT inter-connector framework. We focus on PoW sub-blockchain formation, guided by a set of guidelines based on a set of dimensions, metrics and bounds.
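    The metric-and-bound-guided sub-blockchain formation might be sketched as a simple grouping rule; the devices, the chosen metric (aggregate transaction throughput), and the bound here are all hypothetical, not Hybrid-IoT's actual guidelines:

```python
def form_subblockchains(device_throughputs, max_group_throughput):
    """Greedily partition IoT devices into PoW sub-blockchains so that
    each group's aggregate transaction throughput stays within a bound
    (a toy stand-in for dimension/metric/bound-guided formation)."""
    groups, current, total = [], [], 0
    for device, throughput in device_throughputs:
        # Close the current group when adding this device would exceed the bound.
        if current and total + throughput > max_group_throughput:
            groups.append(current)
            current, total = [], 0
        current.append(device)
        total += throughput
    if current:
        groups.append(current)
    return groups

devices = [("d1", 30), ("d2", 40), ("d3", 50), ("d4", 20), ("d5", 60)]
print(form_subblockchains(devices, max_group_throughput=100))
```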

    A study on ĐApps characteristics

    Repositories are important indicators of liveness and maturity in software development communities. They host user-facing applications or re-usable artefacts used to build such applications. While rarely decentralised themselves, they are important for hosting the code of decentralised applications. In this study, we investigate public repositories dedicated to decentralised applications, or DApps, executing on heterogeneous blockchain platforms. The study is the first to report aggregated metrics on repository-level and application-level characteristics, including DApp metadata, the composition of associated smart contracts, and inconsistencies between repositories in both schema and content. The main contributions are data acquisition tools and an evolving public dataset, along with an initial analysis to derive key metrics in a reproducible way. The insights provided encompass the dominance of Ethereum, the absence of smart contracts for a significant portion of applications, and unused application advertisement potential through absence from popular repositories. These insights can be exploited by developers to build high-quality and highly popular applications and to set up corresponding quality checks.
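    The cross-repository inconsistency analysis could be sketched as a comparison of two listings keyed by application name; the repository names and contents below are invented examples, not the study's dataset:

```python
def compare_repositories(repo_a, repo_b):
    """Report content-level differences between two DApp listings
    (app name -> declared blockchain platform): apps present in only
    one listing, and apps whose declared platforms conflict."""
    only_a = sorted(set(repo_a) - set(repo_b))
    only_b = sorted(set(repo_b) - set(repo_a))
    conflicts = {
        name: (repo_a[name], repo_b[name])
        for name in set(repo_a) & set(repo_b)
        if repo_a[name] != repo_b[name]
    }
    return {"only_a": only_a, "only_b": only_b, "conflicts": conflicts}

# Hypothetical listings from two DApp repositories.
repo_a_listing = {"AppX": "Ethereum", "AppY": "EOS"}
repo_b_listing = {"AppX": "Ethereum", "AppY": "Ethereum", "AppZ": "Steem"}
print(compare_repositories(repo_a_listing, repo_b_listing))
```

    In this toy case, AppZ is advertised in only one repository and the two listings disagree on AppY's platform, the kind of schema- and content-level inconsistency the study reports on.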
