
    Graph based liability analysis for the microservice architecture

    In this work, we present the Graph Based Liability Analysis Framework (GRALAF) for root cause analysis (RCA) of microservices. In this Proof-of-Concept (PoC) tool, we track performance metrics of microservices, such as service response time and CPU usage, to detect anomalies. By injecting faults into the services, we construct a Causal Bayesian Network (CBN) which represents the relation between service faults and metrics. The constructed CBN is used to predict the fault probability of services under given metrics, which are assigned discrete values according to their anomaly states.
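    The abstract does not give implementation details; the following is a minimal sketch of the idea, assuming pgmpy as the Bayesian-network library and using invented conditional probabilities in place of those learned via fault injection. Metrics are discretized to anomaly states (0 = normal, 1 = anomalous) and the CBN is queried for a service's fault probability given those states.

        from pgmpy.models import BayesianNetwork
        from pgmpy.factors.discrete import TabularCPD
        from pgmpy.inference import VariableElimination

        # Edges: a fault in the service influences its observed metrics.
        model = BayesianNetwork([("svc_fault", "resp_time"), ("svc_fault", "cpu")])

        # Prior fault probability (hypothetical; would come from fault-injection runs).
        cpd_fault = TabularCPD("svc_fault", 2, [[0.95], [0.05]])  # 0 = healthy, 1 = faulty

        # P(metric anomaly state | fault); columns correspond to svc_fault = 0, 1.
        cpd_rt = TabularCPD("resp_time", 2, [[0.9, 0.2], [0.1, 0.8]],
                            evidence=["svc_fault"], evidence_card=[2])
        cpd_cpu = TabularCPD("cpu", 2, [[0.85, 0.3], [0.15, 0.7]],
                             evidence=["svc_fault"], evidence_card=[2])

        model.add_cpds(cpd_fault, cpd_rt, cpd_cpu)
        infer = VariableElimination(model)

        # Given an anomalous response time and normal CPU, estimate the fault probability.
        posterior = infer.query(["svc_fault"], evidence={"resp_time": 1, "cpu": 0})
        print(posterior)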

    Root cause and liability analysis in the microservices architecture for edge IoT services

    In this work, we present a liability analysis framework for root cause analysis (RCA) in the microservices architecture with IoT-oriented containerized network services. We track performance metrics of microservices, such as service response time, memory usage and availability, to detect anomalies. By injecting faults into the services, we construct a Causal Bayesian Network (CBN) which represents the relation between service faults and metrics. Service Level Agreement (SLA) data obtained from a descriptor named TRAILS (sTakeholder Responsibility, AccountabIlity and Liability deScriptor) is also used to flag service providers that have failed their commitments. In the case of an SLA violation, the constructed CBN is used to predict the fault probability of services under given metric readings and to identify the root cause.
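    As a hedged illustration of the SLA-flagging step only, the snippet below stands in a hypothetical dictionary for a real TRAILS descriptor, with made-up thresholds and readings; it marks providers whose observed metrics breach their commitments, after which a CBN query (as in the previous sketch) would estimate the likely root cause.

        # Hypothetical TRAILS-like SLA entries; the descriptor format itself is not
        # reproduced here, only the idea of per-provider commitments.
        sla = {
            "svc_a": {"provider": "ProviderX", "max_resp_ms": 200, "min_availability": 0.999},
            "svc_b": {"provider": "ProviderY", "max_resp_ms": 150, "min_availability": 0.995},
        }

        # Hypothetical observed metric readings for the same services.
        observed = {
            "svc_a": {"resp_ms": 450, "availability": 0.991},
            "svc_b": {"resp_ms": 120, "availability": 0.999},
        }

        def flag_violations(sla, observed):
            """Return (provider, service) pairs whose readings breach the SLA."""
            flagged = []
            for svc, limits in sla.items():
                m = observed[svc]
                if m["resp_ms"] > limits["max_resp_ms"] or m["availability"] < limits["min_availability"]:
                    flagged.append((limits["provider"], svc))
            return flagged

        print(flag_violations(sla, observed))  # [('ProviderX', 'svc_a')]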

    Performance Analysis Tool for HPC and Big Data Applications on Scientific Clusters

    Big data is prevalent in high-performance computing (HPC). Many HPC projects rely on complex workflows to analyze terabytes or petabytes of data. These workflows often run over thousands of CPU cores and perform simultaneous data accesses, data movements, and computations. Analyzing the performance of such executions is challenging, because it involves terabytes or petabytes of workflow and measurement data from complex workflows spanning a large number of nodes and many parallel task executions. To help identify performance bottlenecks and debug performance issues in large-scale scientific applications and scientific clusters, we have developed a performance analysis framework using state-of-the-art open-source big data processing tools. Our tool can ingest system logs and application performance measurements to extract key performance features, and apply sophisticated statistical and data mining methods to the performance data. It uses an efficient data processing engine that lets users interactively analyze large amounts of logs and measurements of different types. To illustrate the functionality of the framework, we conduct case studies on workflows from an astronomy project known as the Palomar Transient Factory (PTF) and on job logs from the genome analysis scientific cluster.
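    The abstract does not name the processing engine; the sketch below assumes Apache Spark and a hypothetical JSON log schema (host, start_ts, end_ts, bytes_read) to show the kind of feature extraction described: per-host timing and throughput aggregates that help surface bottleneck candidates.

        from pyspark.sql import SparkSession, functions as F

        spark = SparkSession.builder.appName("perf-analysis").getOrCreate()

        # Hypothetical input: one JSON record per workflow task with timing and I/O fields.
        logs = spark.read.json("hdfs:///workflow_task_logs/*.json")

        # Derive per-task features, then aggregate them per host.
        features = (
            logs.withColumn("duration_s", F.col("end_ts") - F.col("start_ts"))
                .withColumn("read_mb_per_s", F.col("bytes_read") / F.col("duration_s") / 1e6)
                .groupBy("host")
                .agg(
                    F.avg("duration_s").alias("avg_duration_s"),
                    F.expr("percentile_approx(duration_s, 0.95)").alias("p95_duration_s"),
                    F.avg("read_mb_per_s").alias("avg_read_mb_per_s"),
                )
        )

        # Hosts whose tail task durations stand out are candidate performance bottlenecks.
        features.orderBy(F.desc("p95_duration_s")).show(20)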