Search CORE

279 research outputs found

The Architecture of an Autonomic, Resource-Aware, Workstation-Based Distributed Database System

Author: Macdonald Angus
Publication venue
Publication date: 19/07/2012
Field of study

Distributed software systems that are designed to run over workstation machines within organisations are termed workstation-based. Workstation-based systems are characterised by dynamically changing sets of machines that are used primarily for other, user-centric tasks. They must be able to adapt to and utilize spare capacity when and where it is available, and ensure that the non-availability of an individual machine does not affect the availability of the system. This thesis focuses on the requirements and design of a workstation-based database system, which is motivated by an analysis of existing database architectures that are typically run over static, specially provisioned sets of machines. A typical clustered database system -- one that is run over a number of specially provisioned machines -- executes queries interactively, returning a synchronous response to applications, with its data made durable and resilient to the failure of machines. There are no existing workstation-based databases. Furthermore, other workstation-based systems do not attempt to achieve the requirements of interactivity and durability, because they are typically used to execute asynchronous batch processing jobs that tolerate data loss -- results can be re-computed. These systems use external servers to store the final results of computations rather than workstation machines. This thesis describes the design and implementation of a workstation-based database system and investigates its viability by evaluating its performance against existing clustered database systems and testing its availability during machine failures.Comment: Ph.D. Thesi

arXiv.org e-Print Archive

St Andrews Research Repository

Boosting big data streaming applications in clouds with burstFlow

Author: Anjos Julio Cesar Santos dos
Freitas Edison Pignaton de
Geyer Claudio Fernando Resin
Leithardt Valderi Reis Quietinho
Matteussi Kassiano José
Murciego Álvaro Lozano
Souza Junior Paulo Ricardo Rodrigues de
Veith Alexandre da Silva
Zanchetta Breno Fanchiotti
Publication venue
Publication date: 01/01/2020
Field of study

The rapid growth of stream applications in financial markets, health care, education, social media, and sensor networks represents a remarkable milestone for data processing and analytic in recent years, leading to new challenges to handle Big Data in real-time. Traditionally, a single cloud infrastructure often holds the deployment of Stream Processing applications because it has extensive and adaptative virtual computing resources. Hence, data sources send data from distant and different locations of the cloud infrastructure, increasing the application latency. The cloud infrastructure may be geographically distributed and it requires to run a set of frameworks to handle communication. These frameworks often comprise a Message Queue System and a Stream Processing Framework. The frameworks explore Multi-Cloud deploying each service in a different cloud and communication via high latency network links. This creates challenges to meet real-time application requirements because the data streams have different and unpredictable latencies forcing cloud providers' communication systems to adjust to the environment changes continually. Previous works explore static micro-batch demonstrating its potential to overcome communication issues. This paper introduces BurstFlow, a tool for enhancing communication across data sources located at the edges of the Internet and Big Data Stream Processing applications located in cloud infrastructures. BurstFlow introduces a strategy for adjusting the micro-batch sizes dynamically according to the time required for communication and computation. BurstFlow also presents an adaptive data partition policy for distributing incoming streams across available machines by considering memory and CPU capacities. The experiments use a real-world multi-cloud deployment showing that BurstFlow can reduce the execution time up to 77% when compared to the state-of-the-art solutions, improving CPU efficiency by up to 49%

Lume 5.8

Un environnement pour le calcul intensif pair à pair

Author: Nguyen The Tung
Publication venue: INPT
Publication date: 16/11/2011
Field of study

Le concept de pair à pair (P2P) a connu récemment de grands développements dans les domaines du partage de fichiers, du streaming vidéo et des bases de données distribuées. Le développement du concept de parallélisme dans les architectures de microprocesseurs et les avancées en matière de réseaux à haut débit permettent d'envisager de nouvelles applications telles que le calcul intensif distribué. Cependant, la mise en oeuvre de ce nouveau type d'application sur des réseaux P2P pose de nombreux défis comme l'hétérogénéité des machines, le passage à l'échelle et la robustesse. Par ailleurs, les protocoles de transport existants comme TCP et UDP ne sont pas bien adaptés à ce nouveau type d'application. Ce mémoire de thèse a pour objectif de présenter un environnement décentralisé pour la mise en oeuvre de calculs intensifs sur des réseaux pair à pair. Nous nous intéressons à des applications dans les domaines de la simulation numérique et de l'optimisation qui font appel à des modèles de type parallélisme de tâches et qui sont résolues au moyen d'algorithmes itératifs distribués or parallèles. Contrairement aux solutions existantes, notre environnement permet des communications directes et fréquentes entre les pairs. L'environnement est conçu à partir d'un protocole de communication auto-adaptatif qui peut se reconfigurer en adoptant le mode de communication le plus approprié entre les pairs en fonction de choix algorithmiques relevant de la couche application ou d'éléments de contexte comme la topologie au niveau de la couche réseau. Nous présentons et analysons des résultats expérimentaux obtenus sur diverses plateformes comme GRID'5000 et PlanetLab pour le problème de l'obstacle et des problèmes non linéaires de flots dans les réseaux. ABSTRACT : The concept of peer-to-peer (P2P) has known great developments these years in the domains of file sharing, video streaming or distributed databases. Recent advances in microprocessors architecture and networks permit one to consider new applications like distributed high performance computing. However, the implementation of this new type of application on P2P networks gives raise to numerous challenges like heterogeneity, scalability and robustness. In addition, existing transport protocols like TCP and UDP are not well suited to this new type of application. This thesis aims at designing a decentralized and robust environment for the implementation of high performance computing applications on peer-to-peer networks. We are interested in applications in the domains of numerical simulation and optimization that rely on tasks parallel models and that are solved via parallel or distributed iterative algorithms. Unlike existing solutions, our environment allows frequent direct communications between peers. The environment is based on a self adaptive communication protocol that can reconfigure itself dynamically by choosing the most appropriate communication mode between any peers according to decisions concerning algorithmic choice made at the application level or elements of context at transport level, like topology. We present and analyze computational results obtained on several testeds like GRID’5000 and PlanetLab for the obstacle problem and nonlinear network flow problems

Thèses en Ligne

Scientific Publications of the University of Toulouse II Le Mirail

Institut National Polytechnique de Toulouse (Theses)

HAL-INSA Toulouse

A Novel Data Replication and Management Protocol for Mobile Computing Systems

Author
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2006
Field of study

Crossref

Recommended from our members

Research and development of accounting system in grid environment

Author: Chen Xiaoyu
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2010
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.The Grid has been recognised as the next-generation distributed computing paradigm by seamlessly integrating heterogeneous resources across administrative domains as a single virtual system. There are an increasing number of scientific and business projects that employ Grid computing technologies for large-scale resource sharing and collaborations. Early adoptions of Grid computing technologies have custom middleware implemented to bridge gaps between heterogeneous computing backbones. These custom solutions form the basis to the emerging Open Grid Service Architecture (OGSA), which aims at addressing common concerns of Grid systems by defining a set of interoperable and reusable Grid services. One of common concerns as defined in OGSA is the Grid accounting service. The main objective of the Grid accounting service is to ensure resources to be shared within a Grid environment in an accountable manner by metering and logging accurate resource usage information. This thesis discusses the origins and fundamentals of Grid computing and accounting service in the context of OGSA profile. A prototype was developed and evaluated based on OGSA accounting-related standards enabling sharing accounting data in a multi-Grid environment, the World-wide Large Hadron Collider Grid (WLCG). Based on this prototype and lessons learned, a generic middleware solution was also implemented as a toolkit that eases migration of existing accounting system to be standard compatible.Engineering and Physical Sciences Research Council (EPSRC), Stanford Universit

Brunel University Research Archive

P3-Mobile Parallel Peer-to-Peer computing on mobile devices

Author: Daniel Filipe Pereira Moreira da Silva
Publication venue
Publication date: 26/07/2016
Field of study

Repositório Aberto da Universidade do Porto

Distributed computing and farm management with application to the search for heavy gauge bosons using the ATLAS experiment at the LHC (CERN)

Author: Lopez-Perez Juan Antonio
Publication venue: University of Valencia
Publication date: 01/01/2008
Field of study

The Standard Model of particle physics describes the strong, weak, and electromagnetic forces between the fundamental particles of ordinary matter. However, it presents several problems and some questions remain unanswered so it cannot be considered a complete theory of fundamental interactions. Many extensions have been proposed in order to address these problems. Some important recent extensions are the Extra Dimensions theories. In the context of some models with Extra Dimensions of size about

1 TeV^{-}1

, in particular in the ADD model with only fermions confined to a D-brane, heavy Kaluza-Klein excitations are expected, with the same properties as SM gauge bosons but more massive. In this work, three hadronic decay modes of some of such massive gauge bosons, Z* and W*, are investigated using the ATLAS experiment at the Large Hadron Collider (LHC), presently under construction at CERN. These hadronic modes are more difficult to detect than the leptonic ones, but they should allow a measurement of the couplings between heavy gauge bosons and quarks. The events were generated using the ATLAS fast simulation and reconstruction MC program Atlfast coupled to the Monte Carlo generator PYTHIA. We found that for an integrated luminosity of

3 × 10^{5} pb^{-}1

and a heavy gauge boson mass of 2 TeV, the channels Z*->bb and Z*->tt would be difficult to detect because the signal would be very small compared with the expected backgrou nd, although the significance in the case of Z*->tt is larger. In the channel W*->tb , the decay might yield a signal separable from the background and a significance larger than 5 so we conclude that it would be possible to detect this particular mode at the LHC. The analysis was also performed for masses of 1 TeV and we conclude that the observability decreases with the mass. In particular, a significance higher than 5 may be achieved below approximately 1.4, 1.9 and 2.2 TeV for Z*->bb , Z*->tt and W*->tb respectively. The LHC will start to operate in 2008 and collect data in 2009. It will produce roughly 15 Petabytes of data per year. Access to this experimental data has to be provided for some 5,000 scientists working in 500 research institutes and universities. In addition, all data need to be available over the estimated 15-year lifetime of the LHC. The analysis of the data, including comparison with theoretical simulations, requires an enormous computing power. The computing challenges that scientists have to face are the huge amount of data, calculations to perform and collaborators. The Grid has been proposed as a solution for those challenges. The LHC Computing Grid project (LCG) is the Grid used by ATLAS and the other LHC experiments and it is analised in depth with the aim of studying the possible complementary use of it with another Grid project. That is the Berkeley Open Infrastructure for Network C omputing middle-ware (BOINC) developed for the SETI@home project, a Grid specialised in high CPU requirements and in using volunteer computing resources. Several important packages of physics software used by ATLAS and other LHC experiments have been successfully adapted/ported to be used with this platform with the aim of integrating them into the LHC@home project at CERN: Atlfast, PYTHIA, Geant4 and Garfield. The events used in our physics analysis with Atlfast were reproduced using BOINC obtaining exactly the same results. The LCG software, in particular SEAL, ROOT and the external software, was ported to the Solaris/sparc platform to study it's portability in general as well. A testbed was performed including a big number of heterogeneous hardware and software that involves a farm of 100 computers at CERN's computing center (lxboinc) together with 30 PCs from CIEMAT and 45 from schools from Extremadura (Spain). That required a preliminary study, development and creation of components of the Quattor software and configuration management tool to install and manage the lxboinc farm and it also involved the set up of a collaboration between the Spanish research centers and government and CERN. The testbed was successful and 26,597 Grid jobs were delivered, executed and received successfully. We conclude that BOINC and LCG are complementary and useful kinds of Grid that can be used by ATLAS and the other LHC experiments. LCG has very good data distribution, management and storage capabilities that BOINC does not have. In the other hand, BOINC does not need high bandwidth or Internet speed and it also can provide a huge and inexpensive amount of computing power coming from volunteers. In addition, it is possible to send jobs from LCG to BOINC and vice versa. So, possible complementary cases are to use volunteer BOINC nodes when the LCG nodes have too many jobs to do or to use BOINC for high CPU tasks like event generators or reconstructions while concentrating LCG for data analysis

CERN Document Server

A distributed platform for the volunteer execution of workflows on a local area network

Author: Silva Jaquilino Lopes
Publication venue: Faculdade de Ciências e Tecnologia
Publication date: 01/01/2014
Field of study

Thesis submitted in fulfilment of the requirements for the Degree of Master of Science in Computer ScienceAlbatroz Engineering has developed a framework for over-head power lines inspection data acquisition and analysis, which includes hardware and software. The framework’s software components include inspection data analysis and reporting tools, commonly known as PLMI2 application/platform. In PLMI2, the analysis of over-head power line maintenance inspection data consists of a sequence of Automatic Tasks (ATs) interleaved with Manual Tasks (MTs). An AT consists of a set of algorithms that receives as input one or more datasets, processes them and returns new datasets. In turn, an MT enables human supervisors (also known as lines inspection operators) to correct, improve and validate the results of ATs. ATs run faster than MTs and in the overall work cycle, ATs take less than 10% of total processing time, but still take a few minutes. There is data flow dependency among tasks, which can be modelled with a workflow and even if MTs are omitted from this workflow, it is possible to carry the sequence of ATs, postponing MTs. In fact, if the computing cost and waiting time are negligible, it may be advantageous to run ATs earlier in the workflow, prior to validation. To address this opportunity, Albatroz Engineering has invested in a new procedure to stream the data through all ATs fully unattended. Considering these scenarios, it could be useful to have a system capable of detecting available workstations at a given instant and subsequently distribute the ATs to them. In this way, operators could schedule the execution of future ATs for a given inspection data, while they are performing MTs of another. The requirements of the system to implement fall within the field Volunteer Computing Systems and we will address some of the challenges posed by these kinds of systems, namely the hosts volatility and failures. Volunteer Computing is a type of distributed computing which exploits idle CPU cycles from computing resources donated by volunteers and connected through the Internet/Intranet to compute large-scale simulations. This thesis proposes and designs a new distributed task scheduling system in the context of Volunteer Computing Systems, able to schedule the ATs of PLMI2 and exploit idle CPU cycles from workstations within the company’s local area network (LAN) to accelerate the data analysis, being aware of data flow interdependencies. To evaluate the proposed system, a prototype has been implemented, and the simulations results have shown that it is scalable and supports fault-tolerance of tasks execution, by employing the rescheduling mechanism

Repositório da Universidade Nova de Lisboa

Reliable and energy efficient resource provisioning in cloud computing systems

Author: Sharma Yogesh
Publication venue: 'American Psychological Association (APA)'
Publication date: 01/01/2018
Field of study

Cloud Computing has revolutionized the Information Technology sector by giving computing a perspective of service. The services of cloud computing can be accessed by users not knowing about the underlying system with easy-to-use portals. To provide such an abstract view, cloud computing systems have to perform many complex operations besides managing a large underlying infrastructure. Such complex operations confront service providers with many challenges such as security, sustainability, reliability, energy consumption and resource management. Among all the challenges, reliability and energy consumption are two key challenges focused on in this thesis because of their conflicting nature. Current solutions either focused on reliability techniques or energy efficiency methods. But it has been observed that mechanisms providing reliability in cloud computing systems can deteriorate the energy consumption. Adding backup resources and running replicated systems provide strong fault tolerance but also increase energy consumption. Reducing energy consumption by running resources on low power scaling levels or by reducing the number of active but idle sitting resources such as backup resources reduces the system reliability. This creates a critical trade-off between these two metrics that are investigated in this thesis. To address this problem, this thesis presents novel resource management policies which target the provisioning of best resources in terms of reliability and energy efficiency and allocate them to suitable virtual machines. A mathematical framework showing interplay between reliability and energy consumption is also proposed in this thesis. A formal method to calculate the finishing time of tasks running in a cloud computing environment impacted with independent and correlated failures is also provided. The proposed policies adopted various fault tolerance mechanisms while satisfying the constraints such as task deadlines and utility values. This thesis also provides a novel failure-aware VM consolidation method, which takes the failure characteristics of resources into consideration before performing VM consolidation. All the proposed resource management methods are evaluated by using real failure traces collected from various distributed computing sites. In order to perform the evaluation, a cloud computing framework, 'ReliableCloudSim' capable of simulating failure-prone cloud computing systems is developed. The key research findings and contributions of this thesis are: 1. If the emphasis is given only to energy optimization without considering reliability in a failure prone cloud computing environment, the results can be contrary to the intuitive expectations. Rather than reducing energy consumption, a system ends up consuming more energy due to the energy losses incurred because of failure overheads. 2. While performing VM consolidation in a failure prone cloud computing environment, a significant improvement in terms of energy efficiency and reliability can be achieved by considering failure characteristics of physical resources. 3. By considering correlated occurrence of failures during resource provisioning and VM allocation, the service downtime or interruption is reduced significantly by 34% in comparison to the environments with the assumption of independent occurrence of failures. Moreover, measured by our mathematical model, the ratio of reliability and energy consumption is improved by 14%

Western Sydney ResearchDirect