170 research outputs found

Policy-based techniques for self-managing parallel applications

Author: Anthony Richard
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/09/2006

This paper presents an empirical investigation of policy-based self-management techniques for parallel applications executing in loosely-coupled environments. The dynamic and heterogeneous nature of these environments is discussed and the special considerations for parallel applications are identified. An adaptive strategy for the run-time deployment of tasks of parallel applications is presented. The strategy is based on embedding numerous policies which are informed by contextual and environmental inputs. The policies govern various aspects of behaviour, enhancing flexibility so that the goals of efficiency and performance are achieved despite high levels of environmental variability. A prototype self-managing parallel application is used as a vehicle to explore the feasibility and benefits of the strategy. In particular, several aspects of stability are investigated. The implementation and behaviour of three policies are discussed and sample results examined.

Crossref

Greenwich Academic Literature Archive

Toward Broad-Spectrum Autonomic Management

Author: Anderson Paul
Smith Edmund
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2007

Edinburgh Research Explorer

Towards Operator-less Data Centers Through Data-Driven, Predictive, Proactive Autonomics

Author: Babaoglu Ozalp
Sîrbu Alina
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Continued reliance on human operators for managing data centers is a major impediment to their ever reaching extreme dimensions. Large computer systems in general, and data centers in particular, will ultimately be managed using predictive computational and executable models obtained through data-science tools, and at that point, the intervention of humans will be limited to setting high-level goals and policies rather than performing low-level operations. Data-driven autonomics, where management and control are based on holistic predictive models that are built and updated using live data, opens one possible path towards limiting the role of operators in data centers. In this paper, we present a data-science study of a public Google dataset collected in a 12K-node cluster with the goal of building and evaluating predictive models for node failures. Our results support the practicality of a data-driven approach by showing the effectiveness of predictive models based on data found in typical data center logs. We use BigQuery, the big data SQL platform from the Google Cloud suite, to process massive amounts of data and generate a rich feature set characterizing node state over time. We describe how an ensemble classifier can be built out of many Random Forest classifiers, each trained on these features, to predict if nodes will fail in a future 24-hour window. Our evaluation reveals that if we limit false positive rates to 5%, we can achieve true positive rates between 27% and 88% with precision varying between 50% and 72%. This level of performance allows us to recover a large fraction of jobs' executions (by redirecting them to other nodes when a failure of the present node is predicted) that would otherwise have been wasted due to failures. [...

arXiv.org e-Print Archive

Archivio della Ricerca - Università di Pisa

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Towards Data-Driven Autonomics in Data Centers

Author: Babaoglu Ozalp
Sîrbu Alina
Publication date: 01/01/2015

Continued reliance on human operators for managing data centers is a major impediment to their ever reaching extreme dimensions. Large computer systems in general, and data centers in particular, will ultimately be managed using predictive computational and executable models obtained through data-science tools, and at that point, the intervention of humans will be limited to setting high-level goals and policies rather than performing low-level operations. Data-driven autonomics, where management and control are based on holistic predictive models that are built and updated using generated data, opens one possible path towards limiting the role of operators in data centers. In this paper, we present a data-science study of a public Google dataset collected in a 12K-node cluster with the goal of building and evaluating a predictive model for node failures. We use BigQuery, the big data SQL platform from the Google Cloud suite, to process massive amounts of data and generate a rich feature set characterizing machine state over time. We describe how an ensemble classifier can be built out of many Random Forest classifiers, each trained on these features, to predict if machines will fail in a future 24-hour window. Our evaluation reveals that if we limit false positive rates to 5%, we can achieve true positive rates between 27% and 88% with precision varying between 50% and 72%. We discuss the practicality of including our predictive model as the central component of a data-driven autonomic manager and operating it on-line with live data streams (rather than off-line on data logs). All of the scripts used for BigQuery and classification analyses are publicly available from the authors' website. Comment: 12 pages, 6 figures

arXiv.org e-Print Archive

Crossref

Archivio della Ricerca - Università di Pisa

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna
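
The evaluation quoted in the abstract above (true positive rate and precision with the false positive rate capped at 5%) can be illustrated with a minimal sketch. This is not the authors' code: the scores and labels below are made-up toy values, and the helper names `threshold_for_fpr` and `rates_at` are hypothetical. It only shows how a decision threshold might be chosen so that the false positive rate stays within a cap, and how the other two metrics are then read off.

```python
# Toy illustration only: scores and labels are invented, not taken from the paper.

def threshold_for_fpr(scores, labels, max_fpr):
    """Return the lowest threshold whose false positive rate is <= max_fpr.

    A sample is predicted positive ("will fail") when score >= threshold.
    """
    negatives = sum(1 for y in labels if y == 0)
    if negatives == 0:
        raise ValueError("need at least one negative sample")
    best = float("inf")  # predicting no positives always satisfies the cap
    # Lowering the threshold can only add false positives, so stop at the
    # first threshold that exceeds the cap.
    for t in sorted(set(scores), reverse=True):
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        if fp / negatives > max_fpr:
            break
        best = t
    return best


def rates_at(scores, labels, threshold):
    """Return (true positive rate, false positive rate, precision)."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    tn = sum(1 for s, y in pairs if s < threshold and y == 0)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return tpr, fpr, precision


# 4 failing nodes (label 1) and 20 healthy nodes (label 0), hypothetical scores.
scores = [0.9, 0.8, 0.6, 0.3] + [0.7, 0.5] + [0.1] * 18
labels = [1, 1, 1, 1] + [0] * 20

threshold = threshold_for_fpr(scores, labels, max_fpr=0.05)
tpr, fpr, precision = rates_at(scores, labels, threshold)
print(threshold, tpr, fpr, precision)
```

On this toy data the cap allows at most one false positive among the 20 healthy nodes, which fixes the threshold and therefore the true positive rate and precision reported alongside it.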

Autonomicity – An Antidote for Complexity?

Author: Hinchey MH
Sterritt Roy
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2005

Crossref

Ulster University's Research Portal

Autonomic Management of Application Workflows on Hybrid Computing Infrastructure

Publication venue: 'Hindawi Limited'
Publication date: 01/01/2011

Crossref

Semantic SOA - IT Catalyst for Business Transformation

Author: Kumar Rajat
Raichura Bhavin
Publication venue: AIS Electronic Library (AISeL)
Publication date: 31/12/2007

AIS Electronic Library (AISeL)

PACT: Personal Autonomic Computing Tools

Author: Bradley M
Sterritt Roy
Symth B
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2005

Ulster University's Research Portal

Autonomics at the edge: resource orchestration for edge native applications

Author: Balouek-Thomert Daniel
Bittencourt Luiz Fernando
Parashar Manish
Petri Ioan
Rana Omer
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 20/11/2020

With the increasing availability of edge computing resources, there is a need to develop edge orchestration and resource management techniques to support application resilience and performance. Similar to the use of containers and microservices for cloud environments, it is important to understand the key attributes that characterise “edge native” applications. As edge devices increase in their autonomy and intelligence, orchestration techniques are needed to respond to changes in device properties, availability, security credentials, migration and network connectivity protocols. Implementing autonomics techniques for edge computing can increase the resilience of the interaction between devices and applications, reducing execution time and cost. The use of autonomics at the network edge can address the complexity requirement of industrial workflows to overcome execution latency, data privacy and reliability constraints.

Online Research @ Cardiff