
    Fault Tolerance for Stream Programs on Parallel Platforms

    A distributed system is a collection of autonomous computers connected by a network, together with distributed software that lets users see the system as a single entity providing computing facilities. Distributed systems with centralised control have a distinguished control node, called the leader node, whose main role is to distribute and manage shared resources in a resource-efficient manner. A distributed system with centralised control can use stream processing networks for communication. In a stream processing system, applications typically act as continuous queries: they ingest data continuously, analyse and correlate it, and generate a stream of results. Fault tolerance is the ability of a system to keep processing information even when failures or anomalies occur. It has become an important requirement for distributed systems, because the probability of failure rises with the number of nodes and the runtime of applications. It is therefore important to add fault tolerance mechanisms that preserve the execution of tasks despite the occurrence of faults. If the leader in a centralised control system fails, a new leader must be elected. While leader election has received a lot of attention in message-passing systems, very few solutions have been proposed for shared-memory systems such as the one we target. In addition, rollback-recovery strategies are important fault tolerance mechanisms for distributed systems: information is stored in stable storage during failure-free operation, and when a failure affects a node, the stored information is used to restore the node to its pre-failure state. This thesis develops two fault tolerance mechanisms for distributed systems with centralised control that use stream processing for communication: leader election and log-based rollback-recovery, both implemented using LPEL. The proposed leader election method is based on the atomic Compare-And-Swap (CAS) instruction, which is directly available on many processors. It works with idle nodes: only non-busy nodes compete to become the new leader, while busy nodes continue with their tasks and update their leader reference later. The method also has short completion time and low space complexity. The proposed log-based rollback-recovery method for distributed systems with stream processing networks is a novel approach that is free from the domino effect and generates no orphan messages, satisfying the always-no-orphans consistency condition. It also imposes lower overhead on the system than comparable approaches and scales well, because it is insensitive to the number of nodes in the system.
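
    As an illustration of the election step described above, here is a minimal Python sketch of CAS-based leader election among idle nodes. Python has no hardware compare-and-swap, so the atomic cell is emulated with a lock; the thesis's actual mechanism is implemented in LPEL, and all names below (AtomicRef, try_become_leader, FAILED) are illustrative assumptions, not the author's code.

```python
import threading

class AtomicRef:
    """Emulates an atomic compare-and-swap cell. A real implementation
    would use the processor's CAS instruction; Python needs a lock."""
    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()

    def compare_and_swap(self, expected, new):
        """Atomically set the cell to `new` iff its current value equals
        `expected`; return True on success."""
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False

    def get(self):
        with self._lock:
            return self._value

FAILED = "FAILED"            # marks a leader known to have crashed
leader = AtomicRef(FAILED)   # shared leader slot

def try_become_leader(node_id, is_idle):
    """Only idle nodes compete; the first successful CAS wins. Busy
    nodes skip the election and re-read the slot once they are free."""
    if is_idle and leader.compare_and_swap(FAILED, node_id):
        return True   # this node is now the leader
    return False      # keep working; call leader.get() when idle

print(try_become_leader("node-3", is_idle=True))   # True: slot was FAILED
print(leader.get())                                # node-3
```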

    Real-time power cycling in video on demand data centres using online Bayesian prediction

    Energy usage in data centres continues to be a major and growing concern as an increasing number of everyday services depend on these facilities. Research in this area has examined topics including power smoothing using batteries and deep learning to control cooling systems, in addition to optimisation techniques for the software running inside data centres. We present a novel real-time power-cycling architecture, supported by a media distribution approach and an online prediction model, to automatically determine when servers are needed based on demand. We demonstrate with experimental evaluation that this approach can save up to 31% of server energy in a cluster. Our evaluation is conducted on typical rack-mount servers in a data centre testbed and uses a recent real-world workload trace from the BBC iPlayer, an extremely popular video-on-demand service in the UK.
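
    The abstract does not give the prediction model's details; the sketch below shows one plausible shape of online Bayesian demand prediction driving power cycling, using a conjugate Gamma-Poisson model of request arrivals. The model choice, parameters, and numbers are assumptions for illustration, not the paper's implementation.

```python
import math

class OnlineGammaPoisson:
    """Online Bayesian demand model: Poisson request arrivals with a
    Gamma(alpha, beta) prior on the arrival rate."""
    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha = alpha   # prior shape
        self.beta = beta     # prior rate (in observation intervals)

    def observe(self, requests):
        """Conjugate update after one interval with `requests` arrivals."""
        self.alpha += requests
        self.beta += 1.0

    def predicted_demand(self, safety=2.0):
        """Posterior mean plus `safety` posterior standard deviations,
        so that under-provisioning is unlikely."""
        mean = self.alpha / self.beta
        std = math.sqrt(self.alpha) / self.beta
        return mean + safety * std

def servers_needed(model, per_server_capacity):
    """Power on only enough servers to cover the predicted demand."""
    return math.ceil(model.predicted_demand() / per_server_capacity)

model = OnlineGammaPoisson()
for requests in [120, 150, 90, 200]:   # requests seen per interval
    model.observe(requests)
print(servers_needed(model, per_server_capacity=50))   # -> 3
```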

    Improving Spark Application Throughput Via Memory Aware Task Co-location: A Mixture of Experts Approach

    Data analytic applications built upon big data processing frameworks such as Apache Spark are an important class of applications. Many of these applications are not latency-sensitive and thus can run as batch jobs in data centers. By running multiple applications on a computing host, task co-location can significantly improve server utilization and system throughput. However, effective task co-location is a non-trivial task, as it requires an understanding of the computing resource requirements of the co-running applications in order to determine what tasks, and how many of them, can be co-located. State-of-the-art co-location schemes either require the user to supply the resource demands, which are often far beyond what is needed, or use a one-size-fits-all function to estimate the requirement, which is unlikely to capture the diverse behaviors of applications. In this paper, we present a mixture-of-experts approach to model the memory behavior of Spark applications. We achieve this by learning, off-line, a range of specialized memory models on a range of typical applications; we then determine at runtime which of the memory models, or experts, best describes the memory behavior of the target application. We show that by accurately estimating the resource level that is needed, a co-location scheme can effectively determine how many applications can be co-located on the same host to improve system throughput, taking into consideration the memory and CPU requirements of co-running application tasks. Our technique is applied to a set of representative data analytic applications built upon the Apache Spark framework. We evaluated our approach for system throughput and average normalized turnaround time on a multi-core cluster. Our approach achieves over 83.9% of the performance delivered using an ideal memory predictor. We obtain, on average, an 8.69x improvement in system throughput and a 49% reduction in turnaround time over executing application tasks in isolation, which translates to a 1.28x and a 1.68x improvement over a state-of-the-art co-location scheme for system throughput and turnaround time, respectively.
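
    To make the mixture-of-experts idea concrete, here is a small Python sketch: several memory models ("experts"), each fitted off-line to one class of workload, a gate that picks the expert whose profile best matches a runtime probe of the target application, and a co-location count derived from the predicted per-task memory. All workload classes, profiles, and numbers are invented for illustration; the paper's models and gating function are not reproduced here.

```python
import numpy as np

class Expert:
    """One memory model, specialised off-line for one class of Spark
    workload: a linear fit of peak memory (GB) vs input size (GB)."""
    def __init__(self, sizes, peak_mem):
        self.coef = np.polyfit(sizes, peak_mem, deg=1)
        # crude signature used by the gate: memory per GB of input
        self.profile = np.mean(peak_mem) / np.mean(sizes)

    def predict(self, size_gb):
        return np.polyval(self.coef, size_gb)

experts = {
    "etl":       Expert(sizes=[1, 2, 4, 8], peak_mem=[2.0, 3.5, 7.0, 13.0]),
    "iterative": Expert(sizes=[1, 2, 4, 8], peak_mem=[4.0, 8.0, 15.0, 31.0]),
}

def gate(runtime_profile):
    """Pick the expert whose memory-per-input ratio best matches a short
    runtime probe of the target application (the MoE gating step)."""
    return min(experts.values(),
               key=lambda e: abs(e.profile - runtime_profile))

def colocation_count(host_mem_gb, input_size_gb, runtime_profile):
    """How many instances of this task fit on one host, memory-wise."""
    per_task = gate(runtime_profile).predict(input_size_gb)
    return int(host_mem_gb // per_task)

# A probe showing ~3.8 GB of memory per GB of input selects "iterative".
print(colocation_count(host_mem_gb=64, input_size_gb=4, runtime_profile=3.8))
```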

    An Ontological Architecture for Principled and Automated System of Systems Composition

    A distributed system's functionality must continuously evolve, especially when the environmental context changes. Such required evolution imposes unbearable complexity on system development. An alternative is to make systems able to self-adapt by opportunistically composing at runtime to generate systems of systems (SoSs) that offer value-added functionality. The success of such an approach calls for abstracting the heterogeneity of systems and enabling the programmatic construction of SoSs with minimal developer intervention. We propose a general ontology-based approach to describe distributed systems, seeking to achieve abstraction and enable runtime reasoning between systems. We also propose an architecture for systems that utilize such ontologies to enable systems to discover and 'understand' each other, and potentially compose, all at runtime. We detail features of the ontology and the architecture through two contrasting case studies. We also quantitatively evaluate the scalability and validity of our approach through experiments and simulations. Our approach enables system developers to focus on high-level SoS composition without being tied down by deployment-specific implementation details.
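
    A toy Python sketch of the runtime matching step may help: system descriptions list provided and required capabilities, a hand-written is-a hierarchy stands in for the ontology, and two systems are composable when every requirement is met by some provided capability. The real architecture uses full ontologies and reasoning; every name below is an illustrative assumption.

```python
# Tiny stand-in for the ontology: child capability -> parent capability.
IS_A = {
    "temperature_sensing": "sensing",
    "humidity_sensing": "sensing",
    "sensing": "capability",
}

def subsumes(general, specific):
    """True if `specific` equals `general` or is a descendant of it."""
    while specific is not None:
        if specific == general:
            return True
        specific = IS_A.get(specific)
    return False

def composable(provider, consumer):
    """Each of the consumer's requirements must be met by some provided
    capability; this is the runtime 'discover and understand' step."""
    return all(any(subsumes(req, prov) for prov in provider["provides"])
               for req in consumer["requires"])

thermostat = {"provides": ["temperature_sensing"], "requires": []}
hvac_ctrl  = {"provides": [], "requires": ["sensing"]}
print(composable(thermostat, hvac_ctrl))  # True: temperature_sensing is-a sensing
```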

    Adaptive Deep Learning Model Selection on Embedded Systems

    The recent ground-breaking advances in deep neural networks (DNNs) make them attractive for embedded systems. However, it can take a long time for DNNs to make an inference on resource-limited embedded devices. Offloading the computation into the cloud is often infeasible due to privacy concerns, high latency, or the lack of connectivity. As such, there is a critical need for a way to effectively execute DNN models locally on the devices. This paper presents an adaptive scheme to determine which DNN model to use for a given input, by considering the desired accuracy and inference time. Our approach employs machine learning to develop a predictive model that quickly selects a pre-trained DNN to use for a given input and optimization constraint. We achieve this by first training the predictive model off-line, and then using the learnt model to select a DNN model for new, unseen inputs. We apply our approach to the image classification task and evaluate it on a Jetson TX2 embedded deep learning platform using the ImageNet ILSVRC 2012 validation dataset. We consider a range of influential DNN models. Experimental results show that our approach achieves a 7.52% improvement in inference accuracy and a 1.8x reduction in inference time over the most capable single DNN model.
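
    The selection step can be sketched as follows: a cheap "premodel" (here a k-nearest-neighbour classifier over simple input features) picks which pre-trained DNN to run for each input. The features, training data, and model list below are hypothetical; the paper's actual premodel and feature set are not reproduced.

```python
import numpy as np

# Candidate DNNs ordered cheapest-first; the list is illustrative.
MODELS = ["mobilenet", "resnet50", "inception_v4"]

class Premodel:
    """k-NN sketch of a premodel: from cheap input features (e.g. image
    brightness, edge density), predict the cheapest DNN that will
    classify this input correctly. Training data is hypothetical."""
    def __init__(self, features, best_model_idx, k=3):
        self.X = np.asarray(features, dtype=float)
        self.y = np.asarray(best_model_idx)
        self.k = k

    def select(self, feature_vec):
        d = np.linalg.norm(self.X - np.asarray(feature_vec, float), axis=1)
        votes = self.y[np.argsort(d)[:self.k]]
        # Be conservative: take the costliest model among the neighbours,
        # so the accuracy constraint is more likely to be met.
        return MODELS[int(votes.max())]

# Feature rows: [brightness, edge_density]; labels: index of the cheapest
# adequate model observed off-line for similar inputs.
premodel = Premodel(
    features=[[0.9, 0.2], [0.8, 0.3], [0.3, 0.7],
              [0.2, 0.8], [0.5, 0.5], [0.4, 0.6]],
    best_model_idx=[0, 0, 2, 2, 1, 1],
)
print(premodel.select([0.35, 0.65]))   # a hard input -> a bigger model
```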

    Artificial Neural Network (ANN) as a tool to reduce human-animal interaction improves Senegalese sole production

    This article belongs to the special issue Big Data Analysis in Biomolecular Research, Bioinformatics, and Systems Biology with Complex Networks and Multi-Label Machine Learning Models. Manipulation is usually required for biomass calculation and food estimation for optimal fish growth in production facilities. However, advances in computer-based systems have opened up a new range of applied possibilities. In this study we used image analysis and a neural network algorithm to obtain highly accurate biomass data. The developed system allowed us to compare the effects of reduced human-animal interaction on the culture of adult Senegalese sole (Solea senegalensis) in terms of body-weight gain. For this purpose, 30 adult fish were split into two homogeneous groups formed by three replicates (n = 5) each: a control group (CTRL), which was handled in the standard way, and an experimental group (EXP), which was maintained under lower human-animal interaction using our system for biomass calculation. Visible implant elastomer was applied, for the first time, as a tagging technology for tracking soles during the four-month experiment. The experimental group achieved a statistically significant weight gain (p < 0.0100), whereas the CTRL animals showed no significant before-after weight increase. Individual body-weight increment was lower (p < 0.0100) in standard-handled animals. In conclusion, our experimental approach provides evidence that the developed system for biomass calculation, which implies lower human-animal interaction, improves biomass gain in Senegalese sole individuals over a short period of time.
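
    As a rough illustration of the image-to-biomass idea, the sketch below trains a tiny neural network (one hidden layer, plain gradient descent in NumPy) to map image-derived features such as silhouette area and length to body weight. The data, features, architecture, and units are all invented for the example; the paper's network is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training set: [area_px/1e4, length_px/1e2] -> weight (g),
# with targets scaled to keep the tanh network well-conditioned.
X = np.array([[1.2, 2.0], [1.5, 2.3], [2.0, 2.8], [2.4, 3.0], [3.0, 3.4]])
y = np.array([[180.0], [220.0], [300.0], [350.0], [430.0]]) / 500.0

W1 = rng.normal(0, 0.5, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)

for _ in range(5000):                       # plain gradient descent
    h = np.tanh(X @ W1 + b1)                # hidden layer
    pred = h @ W2 + b2                      # linear output
    err = pred - y
    gW2 = h.T @ err / len(X); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h**2)          # backprop through tanh
    gW1 = X.T @ dh / len(X); gb1 = dh.mean(0)
    for p, g in ((W1, gW1), (b1, gb1), (W2, gW2), (b2, gb2)):
        p -= 0.1 * g

def estimate_weight(area_px, length_px):
    """Biomass for one fish, in grams, without handling the animal."""
    h = np.tanh(np.array([area_px / 1e4, length_px / 1e2]) @ W1 + b1)
    return float((h @ W2 + b2) * 500.0)

print(round(estimate_weight(22000, 290)))   # interpolated weight estimate
```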

    Principled and automated system of systems composition using an ontological architecture

    A distributed system’s functionality must continuously evolve, especially when the environmental context changes. Such required evolution imposes unbearable complexity on system development. An alternative is to make systems able to self-adapt by opportunistically composing at runtime to generate systems of systems (SoSs) that offer value-added functionality. The success of such an approach calls for abstracting the heterogeneity of systems and enabling the programmatic construction of SoSs with minimal developer intervention. We propose a general ontology-based approach to describe distributed systems, seeking to achieve abstraction and enable runtime reasoning between systems. We also propose an architecture for systems that utilize such ontologies to enable systems to discover and ‘understand’ each other, and potentially compose, all at runtime. We detail features of the ontology and the architecture through three contrasting case studies: one on controlling multiple systems in a smart home environment, another on the management of dynamic computing clusters, and a third on the autonomic connection of rescue teams. We also quantitatively evaluate the scalability and validity of our approach through experiments and simulations. Our approach enables system developers to focus on high-level SoS composition without being constrained by deployment-specific implementation details. We demonstrate the feasibility of our approach to raise the level of abstraction of SoS construction through reasoned composition at runtime. Our architecture presents a strong foundation for further work due to its generality and extensibility.

    Very low frequency Syndromes

    Dysmorphology, Cytogenetics and Clinical Aspects: Results of studies on ECEMC data. The aim of this chapter is to summarize updated knowledge about the clinical characteristics, etiology, genetic and molecular aspects, and mechanisms involved in syndromes of very low frequency, in order to promote their better recognition. During the last five years, a total of 30 syndromes have been published in this chapter of the Boletín del ECEMC. This issue includes the following selected syndromes: Crouzon, Pfeiffer, Apert, Saethre-Chotzen, Carpenter and Muenke. All share craniosynostosis as the main clinical feature but also present with other birth defects, the most important being limb malformations, especially syndactyly and polydactyly. Over 100 syndromes with craniosynostosis have been described, usually involving multiple sutures, and several of them are associated with limb malformations. The clinical overlap between these syndromes makes it difficult to reach a neonatal diagnosis based on clinical findings alone. However, molecular genetic testing, specifically of the FGFR1-3 and TWIST1 genes, can help to establish the diagnosis of some of them. Early diagnosis is important for establishing the most suitable treatment for each patient, as well as for offering accurate genetic counselling and the possibility of preimplantation and/or prenatal diagnosis.

    Ruxolitinib for Glucocorticoid-Refractory Acute Graft-versus-Host Disease

    BACKGROUND: Acute graft-versus-host disease (GVHD) remains a major limitation of allogeneic stem-cell transplantation; not all patients have a response to standard glucocorticoid treatment. In a phase 2 trial, ruxolitinib, a selective Janus kinase (JAK1 and JAK2) inhibitor, showed potential efficacy in patients with glucocorticoid-refractory acute GVHD. METHODS: We conducted a multicenter, randomized, open-label, phase 3 trial comparing the efficacy and safety of oral ruxolitinib (10 mg twice daily) with the investigator's choice of therapy from a list of nine commonly used options (control) in patients 12 years of age or older who had glucocorticoid-refractory acute GVHD after allogeneic stem-cell transplantation. The primary end point was overall response (complete response or partial response) at day 28. The key secondary end point was durable overall response at day 56. RESULTS: A total of 309 patients underwent randomization; 154 patients were assigned to the ruxolitinib group and 155 to the control group. Overall response at day 28 was higher in the ruxolitinib group than in the control group (62% [96 patients] vs. 39% [61]; odds ratio, 2.64; 95% confidence interval [CI], 1.65 to 4.22; P<0.001). Durable overall response at day 56 was higher in the ruxolitinib group than in the control group (40% [61 patients] vs. 22% [34]; odds ratio, 2.38; 95% CI, 1.43 to 3.94; P<0.001). The estimated cumulative incidence of loss of response at 6 months was 10% in the ruxolitinib group and 39% in the control group. The median failure-free survival was considerably longer with ruxolitinib than with control (5.0 months vs. 1.0 month; hazard ratio for relapse or progression of hematologic disease, non-relapse-related death, or addition of new systemic therapy for acute GVHD, 0.46; 95% CI, 0.35 to 0.60). The median overall survival was 11.1 months in the ruxolitinib group and 6.5 months in the control group (hazard ratio for death, 0.83; 95% CI, 0.60 to 1.15). The most common adverse events up to day 28 were thrombocytopenia (in 50 of 152 patients [33%] in the ruxolitinib group and 27 of 150 [18%] in the control group), anemia (in 46 [30%] and 42 [28%], respectively), and cytomegalovirus infection (in 39 [26%] and 31 [21%]). CONCLUSIONS: Ruxolitinib therapy led to significant improvements in efficacy outcomes, with a higher incidence of thrombocytopenia, the most frequent toxic effect, than that observed with control therapy.

    AICO, Artificial Intelligent COach

    Choosing effective strategies before playing against an opponent team is a laborious task and one of the main challenges that American football coaches have to cope with. For this reason, we have developed an artificial intelligent American football coach (AICO), a novel system that helps coaches decide the best defensive strategies to use against an opponent. Just as coaches prepare a winning game plan based on their vast experience and previously gathered opponent statistics, AICO uses the power of machine learning and video analysis. By tracking every player in the opponent team's most recently recorded matches, AICO learns the strategies they use and then calculates how successfully its own defensive strategies will perform against them. In our experiments with 7350 videos, AICO recognized the opponent's strategies with about 93% accuracy and predicted the success rate of each defensive strategy against them with 94% accuracy.
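
    The two steps the abstract describes, recognising the opponent's strategy from tracked player positions and scoring each defensive counter, might look like the following Python sketch. Formations, coordinates, and success rates are invented for illustration and are not AICO's actual data or model.

```python
import numpy as np

# Flattened (x, y) positions of 5 key offensive players, normalised to
# the field; each vector is a known formation template.
FORMATIONS = {
    "shotgun": np.array([0.5, 0.2, 0.3, 0.1, 0.7, 0.1, 0.1, 0.0, 0.9, 0.0]),
    "i_form":  np.array([0.5, 0.2, 0.5, 0.3, 0.5, 0.4, 0.3, 0.0, 0.7, 0.0]),
}

# Hypothetical success rates of our defensive strategies against each
# formation, learned off-line from the opponent's game video.
SUCCESS_RATE = {
    "shotgun": {"cover_2": 0.61, "blitz": 0.48},
    "i_form":  {"cover_2": 0.44, "blitz": 0.66},
}

def recognise(positions):
    """Nearest-centroid match of tracked positions to known formations."""
    return min(FORMATIONS,
               key=lambda f: np.linalg.norm(FORMATIONS[f] - positions))

def best_defense(positions):
    """Return the highest-rated defensive strategy and all its rates."""
    rates = SUCCESS_RATE[recognise(positions)]
    return max(rates, key=rates.get), rates

tracked = np.array([0.5, 0.21, 0.52, 0.31, 0.5, 0.39, 0.28, 0.0, 0.72, 0.0])
print(best_defense(tracked))   # ('blitz', ...) against an I-formation look
```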