
    Capturing Topology in Graph Pattern Matching

    Graph pattern matching is often defined in terms of subgraph isomorphism, an NP-complete problem. To lower its complexity, various extensions of graph simulation have been considered instead. These extensions allow pattern matching to be conducted in cubic time. However, they fall short of capturing the topology of data graphs, i.e., graphs may have a structure drastically different from pattern graphs they match, and the matches found are often too large to understand and analyze. To rectify these problems, this paper proposes a notion of strong simulation, a revision of graph simulation, for graph pattern matching. (1) We identify a set of criteria for preserving the topology of graphs matched. We show that strong simulation preserves the topology of data graphs and finds a bounded number of matches. (2) We show that strong simulation retains the same complexity as earlier extensions of simulation, by providing a cubic-time algorithm for computing strong simulation. (3) We present the locality property of strong simulation, which allows us to effectively conduct pattern matching on distributed graphs. (4) We experimentally verify the effectiveness and efficiency of these algorithms, using real-life data and synthetic data. Comment: VLDB201
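
    The topology problem the abstract describes can be seen with plain graph simulation, the relation that strong simulation revises. The sketch below is a minimal fixpoint computation of plain simulation only; it is not the paper's strong-simulation algorithm, and the function name, graph encoding, and example graphs are assumptions made for illustration.

```python
def graph_simulation(p_nodes, p_edges, g_nodes, g_edges):
    """Plain graph simulation (illustrative sketch, not strong simulation).

    p_nodes / g_nodes: dict mapping node id -> label.
    p_edges / g_edges: iterable of directed (source, target) pairs.
    Returns a dict mapping each pattern node to the set of data nodes
    that simulate it, or all-empty sets if some pattern node has no match.
    """
    # Successor sets of the data graph.
    g_succ = {v: set() for v in g_nodes}
    for a, b in g_edges:
        g_succ[a].add(b)

    # Start from label-compatible candidates.
    sim = {u: {v for v in g_nodes if g_nodes[v] == p_nodes[u]} for u in p_nodes}

    # Remove candidates that cannot follow a pattern edge, until stable.
    changed = True
    while changed:
        changed = False
        for u, u2 in p_edges:
            keep = {v for v in sim[u] if g_succ[v] & sim[u2]}
            if keep != sim[u]:
                sim[u] = keep
                changed = True

    if any(not s for s in sim.values()):
        return {u: set() for u in p_nodes}
    return sim


# Hypothetical example: the pattern is a 2-cycle A <-> B,
# the data graph is a 4-cycle a1 -> b1 -> a2 -> b2 -> a1.
pattern_nodes = {"A": "A", "B": "B"}
pattern_edges = [("A", "B"), ("B", "A")]
data_nodes = {"a1": "A", "b1": "B", "a2": "A", "b2": "B"}
data_edges = [("a1", "b1"), ("b1", "a2"), ("a2", "b2"), ("b2", "a1")]

result = graph_simulation(pattern_nodes, pattern_edges, data_nodes, data_edges)
```

    In this example every data node ends up in the match relation, even though the data graph is a cycle of length four and the pattern is a cycle of length two. This is the kind of structural mismatch the abstract refers to.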

    An Analysis of Performance Interference Effects on Energy-Efficiency of Virtualized Cloud Environments

    Co-allocated workloads in a virtualized computing environment often have to compete for resources, thereby suffering from performance interference. While this phenomenon has a direct impact on the Quality of Service provided to customers, it also changes the patterns of resource utilization and reduces the amount of work per Watt consumed. Unfortunately, there has been only limited research into how performance interference affects energy-efficiency of servers in such environments. In reality, there is a highly dynamic and complicated correlation among resource utilization, performance interference and energy-efficiency. This paper presents a comprehensive analysis that quantifies the negative impact of performance interference on the energy-efficiency of virtualized servers. Our analysis methodology takes into account the heterogeneous workload characteristics identified from a real Cloud environment. In particular, we investigate the impact due to different workload type combinations and develop a method for approximating the levels of performance interference and energy-efficiency degradation. The proposed method is based on profiles of pair combinations of existing workload types and the patterns derived from the analysis. Our experimental results reveal a non-linear relationship between the increase in interference and the reduction in energy-efficiency as well as an average precision within +/-5% of error margin for the estimation of both parameters. These findings provide vital information for research into dynamic trade-offs between resource utilization, performance, and energy-efficiency of a data center

    Cider: a Rapid Docker Container Deployment System through Sharing Network Storage

    Container technology is prevalent and widely adopted in production environments because of its benefits for application packaging, deployment and management. However, deployment is relatively slow with conventional approaches. In large-scale concurrent deployments, resource contention on the central image repository aggravates the situation. In fact, the image pulling operation is mainly responsible for the degraded performance. To this end, we propose Cider, a novel deployment system that enables rapid container deployment in a highly concurrent and scalable manner. Firstly, on-demand image data loading is proposed by altering the local Docker storage of worker nodes into all-nodes-sharing network storage. Also, the local copy-on-write layer for containers allows Cider to scale whilst improving cost-effectiveness during the holistic deployment. Experimental results reveal that Cider can shorten the overall deployment time by 85% and 62% on average when deploying one container and 100 concurrent containers respectively.

    Improved energy-efficiency in cloud datacenters with interference-aware virtual machine placement

    Virtualization is one of the main technologies used for improving resource efficiency in datacenters; it allows the deployment of co-existing computing environments over the same hardware infrastructure. However, the co-existence of environments, along with management inefficiencies, often creates scenarios of high competition for resources between running workloads, leading to performance degradation. This phenomenon is known as Performance Interference, and introduces a non-negligible overhead that affects both a datacenter's Quality of Service and its energy-efficiency. This paper introduces a novel approach to workload allocation that improves energy-efficiency in Cloud datacenters by taking into account their workload heterogeneity. We analyze the impact of performance interference on energy-efficiency using workload characteristics identified from a real Cloud environment, and develop a model that implements various decision-making techniques intelligently to select the best workload host according to its internal interference level. Our experimental results show reductions in interference by 27.5% and increased energy-efficiency up to 15% in contrast to current mechanisms for workload allocation.

    D^2PS: A Dependable Data Provisioning Service in Multi-tenant Cloud Environment

    Software as a Service (SaaS) is a software delivery and business model widely used by Cloud computing. Instead of purchasing and maintaining a software suite permanently, customers only need to lease the software on-demand. The domain of high assurance distributed systems has focused greatly on the areas of fault tolerance and dependability. In a multi-tenant context, it is particularly important to store, manage and provision data services to customers in a highly efficient and dependable manner due to a large number of file operations involved in running such services. It is also desirable to allow a user group to share and cooperate (e.g., co-edit) on some specific data. In this paper we present a dependable data provisioning service in a multi-tenant Cloud environment. We describe a metadata management approach and leverage multiple replicated metadata caching to shorten the file access time, with the improved efficiency of data sharing. In order to reduce frequent data transmission and data access latency, we introduce a distributed cooperative disk cache mechanism that supports effective cache placement and pull-push cache synchronization. In addition, we use efficient component failover to enhance the service dependability whilst avoiding negative impact from system failures. Our experimental results show that our system can significantly reduce both unused data transmission and response latency. Specifically, over 50% network transmission and operational latency can be saved for random reads while 28.24% network traffic and 25% response latency can be reduced for random write operations. We believe that these findings are demonstrating positive results along the right direction of resolving storage-related challenges in a multi-tenant Cloud environment

    Perphon: a ML-based Agent for Workload Co-location via Performance Prediction and Resource Inference

    Cluster administrators are facing great pressures to improve cluster utilization through workload co-location. Guaranteeing performance of long-running applications (LRAs), however, is far from settled as unpredictable interference across applications is catastrophic to QoS [2]. Current solutions such as [1] usually employ sandboxed and offline profiling for different workload combinations and leverage them to predict incoming interference. However, the time complexity restricts the applicability to complex co-locations. Hence, this issue entails a new framework to harness runtime performance and mitigate the time cost with machine intelligence: i) It is desirable to explore a quantitative relationship between allocated resources and consequent workload performance, not relying on analyzing interference derived from different workload combinations. The majority of works, however, depend on offline profiling and training, which may lead to the model aging problem. Moreover, multi-resource dimensions (e.g., LLC contention) that are not completely included by existing works but have an impact on performance interference need to be considered [3]. ii) Workload co-location also necessitates fine-grained isolation and access control mechanisms. Once performance degradation is detected, dynamic resource adjustment will be enforced and the application will be assigned access to specific slices of each resource. Inferring a "just enough" amount of resource adjustment ensures the application performance can be secured whilst improving cluster utilization. We present Perphon, a runtime agent on a per node basis, that decouples ML-based performance prediction and resource inference from the centralized scheduler. Figure 1 outlines the proposed architecture. We initially exploit sensitivity of applications to multi-resources to establish performance prediction.
To achieve this, Metric Monitor aggregates application fingerprint and system-level performance metrics including CPU, memory, Last Level Cache (LLC), memory bandwidth (MBW) and number of running threads, etc. They are enabled by Intel-RDT and precisely obtained from the resource group manager. Perphon employs an Online Gradient Boost Regression Tree (OGBRT) approach to resolve the model aging problem. Res-Perf Model warms up via offline learning that merely relies on a small volume of profiling in the early stage, but evolves with the arrival of workloads. Consequently, parameters will be automatically updated and synchronized among agents. Anomaly Detector can promptly pinpoint performance degradation via LSTM time-series analysis and determine when and which application needs to be re-allocated resources. Once an abnormal performance counter or load is detected, Resource Inferer conducts a gradient ascent based inference to work out a proper slice of resources, towards dynamically recovering targeted performance. Upon receiving an updated re-allocation, Access Controller re-assigns a specific portion of the node resources to the affected application. Eventually, Isolation Executor enforces resource manipulation and ensures performance isolation across applications. Specifically, we use cgroup cpuset and memory subsystem to control usage of CPU and memory while leveraging Intel-RDT technology to underpin the manipulation of LLC and MBW. For fine-granularity management, we create different groups for LRA and batch jobs when the agent starts. Our prototype integration with Node Manager of Apache YARN shows that throughput of Kafka data-streaming application in Perphon is 2.0x and 1.82x that of isolation execution schemes in native YARN and pure cgroup cpu subsystem.

    The contribution of pre-symptomatic infection to the transmission dynamics of COVID-2019.

    Background: Pre-symptomatic transmission can be a key determinant of the effectiveness of containment and mitigation strategies for infectious diseases, particularly if interventions rely on syndromic case finding. For COVID-19, infections in the absence of apparent symptoms have been reported frequently alongside circumstantial evidence for asymptomatic or pre-symptomatic transmission. We estimated the potential contribution of pre-symptomatic cases to COVID-19 transmission. Methods: Using the probability for symptom onset on a given day inferred from the incubation period, we attributed the serial interval reported from Shenzhen, China, into likely pre-symptomatic and symptomatic transmission. We used the serial interval derived for cases isolated more than 6 days after symptom onset as the no active case finding scenario and the unrestricted serial interval as the active case finding scenario. We reported the estimate assuming no correlation between the incubation period and the serial interval alongside a range indicating alternative assumptions of positive and negative correlation. Results: We estimated that 23% (range accounting for correlation: 12 - 28%) of transmissions in Shenzhen may have originated from pre-symptomatic infections. Through accelerated case isolation following symptom onset, this percentage increased to 46% (21 - 46%), implying that about 35% of secondary infections among symptomatic cases have been prevented. These results were robust to using reported incubation periods and serial intervals from other settings. Conclusions: Pre-symptomatic transmission may be essential to consider for containment and mitigation strategies for COVID-19.

    ScalaRDF: A Distributed, Elastic and Scalable In-Memory RDF Triple Store

    The Resource Description Framework (RDF) and SPARQL query language are gaining increasing popularity and acceptance. The ever-increasing RDF data has reached a billion scale of triples, resulting in the proliferation of distributed RDF store systems within the Semantic Web community. However, the elasticity and performance issues are still far from settled in face of data volume explosion and workload spike. In addition, providers face great pressures to provision uninterrupted reliable storage service whilst reducing the operational costs due to a variety of system failures. Therefore, how to efficiently realize system fault tolerance remains an intractable problem. In this paper, we introduce ScalaRDF, a distributed and elastic in-memory RDF triple store to provision a fault-tolerant and scalable RDF store and query mechanism. Specifically, we describe a consistent hashing protocol that optimizes the RDF data placement, data operations (especially for online RDF triple update operations) and achieves an autonomously elastic data re-distribution in the event of cluster node joining or departing, avoiding the holistic oscillation of data storage. In addition, the data store is able to realize a rapid and transparent failover through replication mechanism which stores in-memory data replica in the next hash hop. The experiments demonstrate that query time and update time are reduced by 87% and 90% respectively compared to other approaches. For an 18G source dataset, the data redistribution takes at most 60 seconds when system scales out and at most 100 seconds for recovery when nodes undergo crash-stop failures.
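
    The placement idea in this abstract, consistent hashing with a replica kept on the next hash hop, can be sketched as follows. This is a generic consistent-hashing ring under assumed names (`HashRing`, `owners`), with MD5 as an arbitrary hash choice and no virtual nodes; the abstract does not give ScalaRDF's actual protocol details, so none of this should be read as its implementation.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Minimal consistent-hashing ring with a replica on the next hop."""

    def __init__(self, nodes):
        # Sorted list of (position, node name) pairs.
        self._ring = sorted((_hash(n), n) for n in nodes)

    def add(self, node):
        bisect.insort(self._ring, (_hash(node), node))

    def remove(self, node):
        self._ring.remove((_hash(node), node))

    def owners(self, key):
        """Return (primary, replica) for a key.

        The primary is the first node clockwise from the key's position;
        the replica is the node one hop further along the ring.
        """
        positions = [pos for pos, _ in self._ring]
        i = bisect.bisect_right(positions, _hash(key)) % len(self._ring)
        primary = self._ring[i][1]
        replica = self._ring[(i + 1) % len(self._ring)][1]
        return primary, replica


# Hypothetical cluster and keys, for illustration only.
ring = HashRing(["node-1", "node-2", "node-3", "node-4"])
keys = ["triple-%d" % i for i in range(200)]
before = {k: ring.owners(k) for k in keys}

# Simulate a crash-stop failure of one node.
ring.remove("node-2")
after = {k: ring.owners(k) for k in keys}
```

    With this layout, a key whose primary fails is served next by the node that already held its replica, and keys owned by other nodes do not move. That is one way to read the abstract's claim of avoiding a holistic oscillation of data storage.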

    Performance-Aware Speculative Resource Oversubscription for Large-Scale Clusters

    It is a long-standing challenge to achieve a high degree of resource utilization in cluster scheduling. Resource oversubscription has become a common practice in improving resource utilization and cost reduction. However, current centralized approaches to oversubscription suffer from the issue with resource mismatch and fail to take into account other performance requirements, e.g., tail latency. In this article we present ROSE, a new resource management platform capable of conducting performance-aware resource oversubscription. ROSE allows latency-sensitive long-running applications (LRAs) to co-exist with computation-intensive batch jobs. Instead of waiting for resource allocation to be confirmed by the centralized scheduler, job managers in ROSE can independently request to launch speculative tasks within specific machines according to their suitability for oversubscription. Node agents of those machines can however, avoid any excessive resource oversubscription by means of a mechanism for admission control using multi-resource threshold control and performance-aware resource throttle. Experiments show that in case of mixed co-location of batch jobs and latency-sensitive LRAs, the CPU utilization and the disk utilization can reach 56.34 and 43.49 percent, respectively, but the 95th percentile of read latency in YCSB workloads only increases by 5.4 percent against the case of executing the LRAs alone
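
    The node-side admission control mentioned above can be pictured as a per-resource threshold check. The sketch below is an assumption-laden illustration: the function name `admit_speculative_task`, the resource names, and the threshold values are all hypothetical, and the abstract does not describe ROSE's actual admission rules or its performance-aware throttle.

```python
def admit_speculative_task(current_usage, task_demand, thresholds):
    """Admit a speculative task only if every resource stays within its threshold.

    All arguments are dicts keyed by resource name, with values expressed
    as fractions of node capacity (hypothetical units for this sketch).
    """
    for resource, limit in thresholds.items():
        projected = current_usage.get(resource, 0.0) + task_demand.get(resource, 0.0)
        if projected > limit:
            return False  # Oversubscribing this resource would be excessive.
    return True


# Hypothetical thresholds and node state.
thresholds = {"cpu": 0.8, "memory": 0.9, "disk_io": 0.7}
node_usage = {"cpu": 0.5, "memory": 0.6, "disk_io": 0.3}

small_task = {"cpu": 0.2, "memory": 0.1, "disk_io": 0.1}
large_task = {"cpu": 0.4, "memory": 0.1, "disk_io": 0.1}

small_ok = admit_speculative_task(node_usage, small_task, thresholds)
large_ok = admit_speculative_task(node_usage, large_task, thresholds)
```

    Here the small task is admitted and the large task is rejected because its projected CPU usage exceeds the assumed CPU threshold.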

    Chronic cough and esomeprazole: A double-blind placebo-controlled parallel study

    Background and objective: Gastro-oesophageal reflux has been implicated in the pathogenesis of chronic cough. Guidelines on management suggest a therapeutic trial of anti-reflux medication. Esomeprazole is a proton pump inhibitor licensed for the long-term treatment of acid reflux in adults and we compared the effects of esomeprazole and placebo on patients with chronic cough. Methods: This was a prospective, single-centre, randomized, double-blind, placebo-controlled, parallel group study conducted over 8 weeks. Fifty adult non-smokers with chronic cough and normal spirometry were randomized. Patients completed cough-related quality-of-life and symptom questionnaires and subjective scores of cough frequency and severity at the beginning and end of the study. They also kept a daily diary of symptom scores. Citric acid cough challenge and laryngoscopic examination were performed at baseline and the end of the study. The primary outcome was improvement in cough score. Results: There were no differences in cough scores in the placebo and treatment arms of the study although some significant improvements were noted when compared to baseline. In the cough diary scores there was a trend towards greater improvement in the treatment arm in patients with dyspepsia. Conclusions: Esomeprazole did not have a clinically important effect greater than placebo in patients with cough. It suggests a marked placebo effect in the treatment of cough. There is paucity of evidence on which to base the treatment of reflux-associated cough. We demonstrate that acid suppressive therapy does not lead to a significant clinical effect in these patients. There may be some improvement in those with coexisting dyspeptic symptoms and therapy should be restricted to this group. © 2011 Asian Pacific Society of Respirology