2,052 research outputs found

    DRAGON: Decentralized fault tolerance in edge federations

    Edge Federation is a new computing paradigm that seamlessly interconnects the resources of multiple edge service providers. A key challenge in such systems is the deployment of latency-critical, resource-intensive AI-based applications on constrained devices. To address this challenge, we propose a novel memory-efficient deep-learning-based model, namely generative optimization networks (GON). Unlike GANs, GONs use a single network to both discriminate input and generate samples, significantly reducing their memory footprint. Leveraging the low memory footprint of GONs, we propose a decentralized fault-tolerance method called DRAGON that runs simulations (as per a digital modeling twin) to quickly predict and optimize the performance of the edge federation. Extensive experiments with real-world edge computing benchmarks on multiple Raspberry Pi-based federated edge configurations show that DRAGON can outperform the baseline methods in fault-detection and Quality of Service (QoS) metrics. Specifically, the proposed method gives higher F1 scores for fault detection than the best deep learning (DL) method, while consuming less memory than the heuristic methods. This allows for improvement in energy consumption, response time and service level agreement violations by up to 74, 63 and 82 percent, respectively.

    An SOA-Based Framework of Computational Offloading for Mobile Cloud Computing

    Mobile computing is a technology that allows transmission of audio, video, and other types of data via a computer or any other wireless-enabled device without having to be connected to a fixed physical link. Despite the increasing usage of mobile computing, exploiting its full potential is difficult due to inherent problems such as resource scarcity, connection instability, and limited computational power. In particular, connecting mobile devices to the internet offers the possibility of offloading computation- and data-intensive tasks from mobile devices to remote cloud servers for efficient execution. The proposed thesis develops an algorithm that uses an objective function to adaptively decide strategies for computational offloading according to changing context information. Following the style of Service-Oriented Architecture (SOA), the proposed framework brings cloud computing to mobile devices so that mobile applications can benefit from remote execution of tasks in the cloud. This research discusses the algorithm and framework, along with the results of experiments with a newly developed system for self-driving vehicles, and points out the anticipated advantages of Adaptive Computational Offloading.
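The abstract does not give the objective function itself, so the following is only a minimal sketch of how a context-dependent offloading decision could look. The `Context` fields, the weighted time/energy cost, and every parameter name are assumptions made for illustration; they are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class Context:
    """Hypothetical snapshot of the device's current context."""
    bandwidth_mbps: float    # current uplink bandwidth (megabits per second)
    local_speed_mips: float  # processing speed of the mobile device
    cloud_speed_mips: float  # processing speed of the remote server
    tx_power_w: float        # radio power while transmitting
    cpu_power_w: float       # CPU power while computing locally
    idle_power_w: float      # power drawn while waiting for the cloud


def cost(time_s: float, energy_j: float, w_time: float = 0.5) -> float:
    """Assumed objective: weighted sum of time and energy, lower is better."""
    return w_time * time_s + (1.0 - w_time) * energy_j


def should_offload(instructions_mi: float, data_mb: float, ctx: Context,
                   w_time: float = 0.5) -> bool:
    """Return True when remote execution has the lower objective value."""
    # Option 1: run the task on the device itself.
    local_time = instructions_mi / ctx.local_speed_mips
    local_energy = local_time * ctx.cpu_power_w

    # Option 2: upload the input data, then wait for the cloud to compute.
    transfer_time = data_mb * 8.0 / ctx.bandwidth_mbps  # megabytes to megabits
    remote_time = instructions_mi / ctx.cloud_speed_mips
    offload_time = transfer_time + remote_time
    offload_energy = (transfer_time * ctx.tx_power_w
                      + remote_time * ctx.idle_power_w)

    return (cost(offload_time, offload_energy, w_time)
            < cost(local_time, local_energy, w_time))
```

Because the decision is re-evaluated against the current `Context`, the same task can be offloaded on a fast link and kept local when bandwidth drops, which is the adaptive behaviour the abstract describes. Adding seconds and joules in one weighted sum is a simplification chosen for brevity.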

    Architecture for Fault Tolerance in Mobile Cloud Computing using Disease Resistance Approach

    Mobile cloud computing (MCC) is one of the emerging fields in distributed computing. MCC is an integration of mobile computing and cloud computing. Mobile devices are limited in storage, battery and processing capability. These limitations can be effectively handled with the introduction of cloud computing. The increasing functionality of the cloud and the complexity of applications cause resource failures in cloud computing, which reduce the overall performance of the MCC environment. Moreover, the existing approaches for resource scheduling in MCC propose several architectures, but they concentrate only on the allocation of resources. The existing architectures lack a fault tolerance mechanism to handle faulty resources. To overcome the issues stated above, this paper proposes an architecture for fault tolerance in MCC using a Disease Resistance approach (DRFT). The main aim of the DRFT approach is to effectively handle the faulty VMs in the MCC. The DRFT approach uses the human disease resistance mechanism as the basis of the proposed model. DRFT is capable of identifying faulty virtual machines and rescheduling their tasks to suitable virtual machines. This procedure ultimately minimizes the makespan and improves the overall performance of the scheduling process. To validate the effectiveness of the proposed approach, a series of simulations has been carried out using the CloudSim simulator. The performance of the proposed DRFT approach is compared with the Dynamic group based fault tolerance approach (DGFT-approach). The makespan value of DRFT is reduced to 7% and the performance of DRFT is increased when compared to the DGFT approach. The experimental results show the effectiveness of the proposed approach.
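To make the terms concrete, here is a small sketch of what makespan means and of moving tasks off faulty VMs. It does not reproduce the DRFT mechanism, which the abstract does not detail. The greedy least-loaded placement, the function names, and the data layout (VM name mapped to a list of task lengths in seconds) are all assumptions chosen for illustration.

```python
from typing import Dict, List, Set


def makespan(assignment: Dict[str, List[float]]) -> float:
    """Makespan: the total task time on the busiest VM."""
    return max((sum(tasks) for tasks in assignment.values()), default=0.0)


def reschedule(assignment: Dict[str, List[float]],
               faulty: Set[str]) -> Dict[str, List[float]]:
    """Move tasks from faulty VMs to healthy ones (illustrative greedy rule)."""
    # Copy the task lists of healthy VMs so the input is left unchanged.
    healthy = {vm: list(tasks) for vm, tasks in assignment.items()
               if vm not in faulty}
    if not healthy:
        raise ValueError("no healthy VM available")

    # Collect every task that was assigned to a faulty VM.
    orphaned = [t for vm in faulty for t in assignment.get(vm, [])]

    # Place the longest tasks first, each on the least-loaded healthy VM.
    for task in sorted(orphaned, reverse=True):
        target = min(healthy, key=lambda vm: sum(healthy[vm]))
        healthy[target].append(task)
    return healthy
```

For example, with `{"vm1": [4, 2], "vm2": [3], "vm3": [5, 1]}` and `vm3` marked faulty, the two orphaned tasks go to `vm2` and `vm1` respectively and the resulting makespan is 8 seconds.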

    Developing Real-Time Emergency Management Applications: Methodology for a Novel Programming Model Approach

    Gabriele Mencagli and Marco Vanneschi, Department of Computer Science, University of Pisa, L. Bruno Pontecorvo, Pisa, Italy
    Recent years have been characterized by the rise of highly distributed computing platforms composed of heterogeneous computing and communication resources, including centralized high-performance computing architectures (e.g. clusters or large shared-memory machines) as well as multi-/many-core components also integrated into mobile nodes and network facilities. The emergence of computational paradigms such as Grid and Cloud Computing provides potential solutions to integrate such platforms with data systems, natural phenomena simulations, knowledge discovery and decision support systems responding to a dynamic demand for remote computing and communication resources and services. In this context time-critical applications, notably emergency management systems, are composed of complex sets of application components specialized for executing specific computations, which are able to cooperate in such a way as to achieve a global goal in a distributed manner. In recent years the scientific community has been addressing the programming issues of distributed systems, aiming at the definition of applications featuring an increasing complexity in the number of distributed components, in the spatial distribution and cooperation between interested parties and in their degree of heterogeneity. Over the last decade the research trend in distributed computing has been focused on a crucial objective. The wide-ranging composition of distributed platforms in terms of different classes of computing nodes and network technologies, together with the strong diffusion of applications that require real-time and online compute-intensive processing, as in the case of emergency management systems, leads to a pronounced tendency of systems towards properties like self-management, self-organization, self-control and, strictly speaking, adaptivity.
Adaptivity implies the development, deployment, execution and management of applications that, in general, are dynamic in nature. Dynamicity concerns the number and the specific identification of cooperating components, the deployment and composition of the most suitable versions of software components on processing and networking resources and services, i.e., both the quantity and the quality of the application components needed to achieve the required Quality of Service (QoS). In time-critical applications the QoS specification can vary dynamically during the execution, according to the user intentions and the information produced by sensors and services, as well as according to the monitored state and performance of networks and nodes. The general reference point for this kind of system is the Grid paradigm which, by definition, aims to enable the access, selection and aggregation of a variety of distributed and heterogeneous resources and services. However, though notable advancements have been achieved in recent years, current Grid technology is not yet able to supply software tools with the high adaptivity, ubiquity, proactivity, self-organization, scalability and performance, interoperability, as well as fault tolerance and security, required by the emerging applications. For this reason in this chapter we will study a methodology for designing high-performance computations able to exploit the heterogeneity and dynamicity of distributed environments by expressing adaptivity and QoS-awareness directly at the application level. An effective approach needs to address issues like QoS predictability of different application configurations as well as the predictability of reconfiguration costs.
Moreover, adaptation strategies need to be developed that assure properties like the stability degree of a reconfiguration decision and execution optimality (i.e. selecting reconfigurations that account for proper trade-offs among different QoS objectives). In this chapter we will present the basic points of a novel approach that lays the foundations for future programming model environments for time-critical applications such as emergency management systems. The organization of this chapter is the following. In Section 2 we will compare the existing research works for developing adaptive systems in critical environments, highlighting their drawbacks and inefficiencies. In Section 3, in order to clarify the application scenarios that we are considering, we will present an emergency management system in which the run-time selection of proper application configuration parameters is of great importance for meeting the desired QoS constraints. In Section 4 we will describe the basic points of our approach in terms of how compute-intensive operations can be programmed, how they can be dynamically modified and how adaptation strategies can be expressed. In Section 5 our approach will be contextualized to the definition of an adaptive parallel module, which is a building block for composing complex and distributed adaptive computations. Finally, in Section 6 we will describe a set of experimental results that show the viability of our approach and in Section 7 we will give the concluding remarks of this chapter.