
    Distributed Load Testing by Modeling and Simulating User Behavior

    Modern human-machine systems such as microservices rely upon agile engineering practices which require changes to be tested and released more frequently than classically engineered systems. A critical step in the testing of such systems is the generation of realistic workloads, or load testing. Generated workload emulates the expected behaviors of users and machines within a system under test in order to find potentially unknown failure states. Typical testing tools rely on static testing artifacts to generate realistic workload conditions. Such artifacts can be cumbersome and costly to maintain; moreover, even model-based alternatives can prevent adaptation to changes in a system or its usage. Lack of adaptation can prevent the integration of load testing into system quality assurance, leading to an incomplete evaluation of system quality. The goal of this research is to improve the state of software engineering by addressing open challenges in load testing of human-machine systems with a novel process that a) models and classifies user behavior from streaming and aggregated log data, b) adapts to changes in system and user behavior, and c) generates distributed workload by realistically simulating user behavior. This research contributes a Learning, Online, Distributed Engine for Simulation and Testing based on the Operational Norms of Entities within a system (LODESTONE): a novel approach to distributed load testing that models and simulates user behavior. We specify LODESTONE within the context of a human-machine system to illustrate distributed adaptation and execution in load testing processes. LODESTONE uses log data to generate and update user behavior models, cluster them into similar behavior profiles, and instantiate distributed workload on software systems. We analyze user behavioral data with differing characteristics to replicate human-machine interactions in a modern microservice environment. 
We discuss tools, algorithms, software design, and implementation in two different computational environments: client-server and cloud-based microservices. We illustrate the advantages of LODESTONE through a qualitative comparison of key feature parameters and through experimentation based on shared data and models. LODESTONE continuously adapts to changes in the system under test, which allows load testing to be integrated into the quality assurance process for cloud-based microservices.
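The idea of simulating users from learned behavior models can be sketched, under assumptions, as a first-order Markov chain over service endpoints. The states and transition probabilities below are hypothetical stand-ins for models that would be learned from logs; this is an illustration of the general technique, not LODESTONE's implementation:

```python
import random

# Hypothetical user-behavior model: a first-order Markov chain over
# service endpoints, with transition probabilities learned from logs.
TRANSITIONS = {
    "login":       [("browse", 0.8), ("logout", 0.2)],
    "browse":      [("browse", 0.5), ("add_to_cart", 0.3), ("logout", 0.2)],
    "add_to_cart": [("checkout", 0.6), ("browse", 0.4)],
    "checkout":    [("logout", 1.0)],
}

def simulate_session(start="login", max_steps=20):
    """Walk the chain from `start`, emitting one simulated request per step."""
    state, trace = start, []
    for _ in range(max_steps):
        trace.append(state)
        if state == "logout":
            break
        targets, weights = zip(*TRANSITIONS[state])
        state = random.choices(targets, weights=weights)[0]
    return trace

session = simulate_session()
print(session)
```

A load generator would replay each simulated step as a real request against the system under test; updating `TRANSITIONS` from fresh log data is what makes such a generator adaptive.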

    An attribute oriented induction based methodology to aid in predictive maintenance: anomaly detection, root cause analysis and remaining useful life

    Predictive Maintenance is the maintenance methodology that provides the best performance to industrial organisations in terms of time, equipment effectiveness and economic savings. Thanks to recent advances in technology, capturing process data from machines and the sensors attached to them is no longer a challenging task, and such data can be used to perform complex analyses that help with maintenance requirements. In addition, the knowledge of domain experts can be combined with information extracted from the machines’ assets to provide a better understanding of the underlying phenomena. This thesis proposes a methodology to address the different requirements of Predictive Maintenance: (i) Anomaly Detection (AD), (ii) Root Cause Analysis (RCA) and (iii) estimation of Remaining Useful Life (RUL). Multiple machine learning techniques and algorithms can be found in the literature for computing these requirements. In this thesis, the Attribute Oriented Induction (AOI) algorithm has been adopted and adapted to the needs of the Predictive Maintenance methodology. AOI has the capability of performing RCA, but also the possibility of being used as an AD system. For the purpose of performing Predictive Maintenance, a variant, Repetitive Weighted Attribute Oriented Induction (ReWAOI), is proposed. ReWAOI combines information extracted from the machine with the knowledge of experts in the field to describe its behaviour and derive the Predictive Maintenance requirements. Through the use of ReWAOI, a one-dimensional quantification function can be obtained from multidimensional data. This function is correlated with the evolution of the machine’s wear over time, and thus estimation of AD and RUL can be accomplished. In addition, ReWAOI helps in the description of failure root causes. 
The proposed contributions of the thesis have been validated in different scenarios, including both emulated and real industrial case studies.
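The core of attribute-oriented induction, on which ReWAOI builds, can be illustrated with a minimal sketch: attribute values climb a concept hierarchy until few enough distinct generalized tuples remain, each with a support count. The concept hierarchy, sensor readings, and threshold below are hypothetical, and the real ReWAOI algorithm additionally applies weighting and repetition:

```python
from collections import Counter

# Hypothetical concept hierarchy: each value maps to a more general concept.
HIERARCHY = {
    "22C": "warm", "24C": "warm", "35C": "hot", "40C": "hot",
    "warm": "any_temp", "hot": "any_temp",
    "0.2bar": "low", "0.3bar": "low", "0.9bar": "high",
    "low": "any_pressure", "high": "any_pressure",
}

def generalize(tuples, threshold=2):
    """Climb the hierarchy until at most `threshold` distinct tuples remain."""
    while len(set(tuples)) > threshold:
        climbed = [tuple(HIERARCHY.get(v, v) for v in t) for t in tuples]
        if climbed == tuples:   # fully generalized, cannot climb further
            break
        tuples = climbed
    # Each generalized tuple is reported with its support count.
    return Counter(tuples)

readings = [("22C", "0.2bar"), ("24C", "0.3bar"), ("35C", "0.9bar"), ("40C", "0.9bar")]
summary = generalize(readings)
print(summary)
```

The resulting compact rules ("warm temperature with low pressure occurs twice", and so on) are what make AOI usable for describing normal behaviour and, by extension, root causes of deviations.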

    Contributions to High-Throughput Computing Based on the Peer-to-Peer Paradigm

    This dissertation focuses on High Throughput Computing (HTC) systems and how to build a working HTC system using Peer-to-Peer (P2P) technologies. Traditional HTC systems, designed to process the largest possible number of tasks per unit of time, revolve around a central node that implements a queue used to store and manage submitted tasks. This central node limits the scalability and fault tolerance of the HTC system. A usual solution involves the use of replicas of the master node that can replace it; this solution is, however, limited by the number of replicas used. In this thesis, we propose an alternative solution that follows the P2P philosophy: a completely distributed system in which all worker nodes participate in the scheduling tasks, with a physically distributed task queue implemented on top of a P2P storage system. The fault tolerance and scalability of this proposal are therefore limited only by the number of nodes in the system. The proper operation and scalability of our proposal have been validated through experimentation with a real system. The data availability provided by Cassandra, the P2P data management framework used in our proposal, is analysed by means of several stochastic models. These models can be used to make predictions about the availability of any Cassandra deployment, as well as to select the best possible configuration of any Cassandra system. To validate the proposed models, experiments with real Cassandra clusters were carried out, showing that our models are good descriptors of Cassandra's availability. Finally, we propose a set of scheduling policies that try to solve a common problem of HTC systems: re-execution of tasks due to a failure of the node where a task was running, without additional resource misspending. To reduce the number of re-executions, our proposals try to find good fits between the reliability of nodes and the estimated length of each task. 
Extensive simulation-based experimentation shows that our policies are capable of reducing the number of re-executions, improving system performance and the utilization of nodes.
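The idea of fitting node reliability to estimated task length can be sketched as a simple matchmaking policy. The exponential failure model, the `mtbf` values, and the heuristic of reserving highly reliable nodes for long tasks are illustrative assumptions, not the thesis's actual policies:

```python
import math

# Assumed exponential failure model: a task of estimated length t finishes
# on a node with mean time between failures `mtbf` with probability exp(-t/mtbf).
def survival(task_len, mtbf):
    return math.exp(-task_len / mtbf)

def pick_node(task_len, nodes, target=0.9):
    """Prefer the least reliable node that still meets the success target,
    keeping highly reliable nodes free for long tasks."""
    ok = [n for n in nodes if survival(task_len, n["mtbf"]) >= target]
    if ok:
        return min(ok, key=lambda n: n["mtbf"])
    return max(nodes, key=lambda n: n["mtbf"])  # fall back to the safest node

nodes = [{"id": "a", "mtbf": 50.0}, {"id": "b", "mtbf": 500.0}]
print(pick_node(5.0, nodes)["id"])    # short task: node "a" suffices
print(pick_node(200.0, nodes)["id"])  # long task: needs the reliable node
```

A policy of this shape reduces re-executions because long tasks, the expensive ones to repeat, are steered away from failure-prone nodes.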

    Automatically classifying test results by semi-supervised learning

    A key component of software testing is deciding whether a test case has passed or failed: an expensive and error-prone manual activity. We present an approach to automatically classify passing and failing executions using semi-supervised learning on dynamic execution data (test inputs/outputs and execution traces). A small proportion of the test data is labelled as passing or failing and used in conjunction with the unlabelled data to build a classifier which labels the remaining outputs (classifying them as passing or failing tests). A range of learning algorithms is investigated using several faulty versions of three systems along with varying types of data (inputs/outputs alone, or in combination with execution traces) and different labelling strategies (both failing and passing tests, or passing tests alone). The results show that in many cases labelling just a small proportion of the test cases – as low as 10% – is sufficient to build a classifier that correctly categorises the large majority of the remaining test cases. This has important practical potential: when checking the test results from a system, a developer need only examine a small proportion of them and use this information to train a learning algorithm to classify the remainder automatically.
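The self-training flavour of semi-supervised learning described above can be sketched as follows. The one-dimensional features (one summary statistic per execution), the nearest-centroid base classifier, and the fixed confidence margin are illustrative simplifications, not the paper's actual algorithms:

```python
def centroid(xs):
    return sum(xs) / len(xs)

def self_train(labelled, unlabelled, rounds=3):
    """Grow the labelled set by adopting the classifier's confident predictions."""
    labelled = dict(labelled)  # feature -> "pass" / "fail"
    for _ in range(rounds):
        passes = [x for x, y in labelled.items() if y == "pass"]
        fails = [x for x, y in labelled.items() if y == "fail"]
        c_pass, c_fail = centroid(passes), centroid(fails)
        for x in list(unlabelled):
            d_pass, d_fail = abs(x - c_pass), abs(x - c_fail)
            # Adopt only confident predictions (clearly closer to one centroid).
            if abs(d_pass - d_fail) > 0.5:
                labelled[x] = "pass" if d_pass < d_fail else "fail"
                unlabelled.remove(x)
    return labelled

# Hypothetical data: 3 manually labelled executions seed the classifier,
# which then labels the remaining 4 on its own.
seed = {1.0: "pass", 1.2: "pass", 5.0: "fail"}
result = self_train(seed, [1.1, 4.8, 5.2, 1.3])
print(result)
```

The practical appeal mirrors the abstract's claim: a developer labels the small `seed` set by hand, and the loop propagates those labels to the rest.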

    CONFPROFITT: A CONFIGURATION-AWARE PERFORMANCE PROFILING, TESTING, AND TUNING FRAMEWORK

    Modern computer software systems are complicated. Developers can change the behavior of a software system through software configurations, and the large number of configuration options and their interactions make the tasks of software tuning, testing, and debugging very challenging. Performance is one of the key non-functional qualities: performance bugs can cause significant performance degradation and lead to poor user experience. However, performance bugs are difficult to expose, primarily because detecting them requires specific inputs as well as specific configurations. While researchers have developed techniques to analyze, quantify, detect, and fix performance bugs, many of these techniques are not effective in highly-configurable systems. To improve the non-functional qualities of configurable software systems, testing engineers need to be able to understand the performance influence of configuration options, adjust the performance of a system under different configurations, and detect configuration-related performance bugs. This research provides an automated framework that allows engineers to effectively analyze performance-influencing configuration options, detect performance bugs in highly-configurable software systems, and adjust configuration options to achieve higher long-term performance gains. To understand real-world performance bugs in highly-configurable software systems, we first perform a study of performance bug characteristics in three large-scale open-source projects. Many researchers have studied the characteristics of performance bugs from bug reports, but few have reported on the experience of replicating confirmed performance bugs from the perspective of non-domain experts such as researchers. This study reports the challenges and potential workarounds involved in replicating confirmed performance bugs. 
We also share a performance benchmark that provides real-world performance bugs for evaluating future performance testing techniques. Inspired by our performance bug study, we propose a performance profiling approach that helps developers understand how configuration options and their interactions influence the performance of a system. The approach uses a combination of dynamic analysis and machine learning techniques, together with configuration sampling techniques, to profile the program execution and analyze the configuration options relevant to performance. Next, the framework leverages natural language processing and information retrieval techniques to automatically generate test inputs and configurations that expose performance bugs. Finally, the framework combines reinforcement learning and dynamic state reduction techniques to guide the subject application towards higher long-term performance gains.
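The idea of estimating each option's performance influence from sampled configurations can be sketched as below. The option names, the synthetic cost model standing in for real timed executions, and the exhaustive on/off sampling (real samplers cover only a subset) are all hypothetical simplifications:

```python
from itertools import product

OPTIONS = ["cache", "compress", "verify"]

def measure(config):
    """Stand-in for running and timing the system under one configuration."""
    t = 10.0
    if not config["cache"]:
        t += 5.0          # in this toy model, caching dominates performance
    if config["compress"]:
        t += 1.0          # compression costs a little
    return t

def influence():
    """Score each option by the mean runtime difference between
    configurations that enable it and configurations that disable it."""
    runs = []
    for bits in product([False, True], repeat=len(OPTIONS)):
        cfg = dict(zip(OPTIONS, bits))
        runs.append((cfg, measure(cfg)))
    scores = {}
    for opt in OPTIONS:
        on = [t for c, t in runs if c[opt]]
        off = [t for c, t in runs if not c[opt]]
        scores[opt] = sum(on) / len(on) - sum(off) / len(off)
    return scores

print(influence())
```

A strongly negative score marks an option that speeds the system up when enabled; a near-zero score marks one that is performance-irrelevant and can be ignored during tuning.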

    Predictive Process Monitoring for Lead-to-Contract Process Optimization

    Business processes today are supported by enterprise systems such as Enterprise Resource Planning systems. These systems store large amounts of process execution log data that can be used to improve business processes across the organization. Process mining methods, capable of extracting process models from such logs, have been developed to analyze them. These methods, in turn, have been applied in conjunction with predictive monitoring methods for early differentiation of desired and undesired outcomes. Although predictive monitoring has recently attracted attention and found application in recommendation engines, which suggest actions to improve business process outcomes, there is little research on how contextual data, such as clients' financial indicators and other external data, may improve the quality of recommendations. This thesis examines whether including such data alongside the event data positively affects the accuracy of predictive monitoring for early predictions. The experiments reveal that external context data had an adverse effect on the performance of the learned models, whereas internal context data collected during the process had a positive effect. Furthermore, the study indicates that the first three events from the event log, together with internal data, are sufficient to predict the outcome of an opportunity in the sales funnel.
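The finding that three-event prefixes suffice can be illustrated with a minimal prefix-based predictor. The event names, outcome labels, and majority-vote scheme below are hypothetical, not the thesis's actual models:

```python
from collections import Counter, defaultdict

def train(cases):
    """Count, for each 3-event prefix in the training log, how often each
    outcome was observed."""
    outcomes = defaultdict(Counter)
    for events, outcome in cases:
        outcomes[tuple(events[:3])][outcome] += 1
    return outcomes

def predict(model, events, default="lost"):
    """Predict the majority outcome seen for this case's 3-event prefix."""
    prefix = tuple(events[:3])
    if prefix in model:
        return model[prefix].most_common(1)[0][0]
    return default

# Hypothetical sales-funnel event log: (event sequence, final outcome).
log = [
    (["created", "qualified", "offer_sent", "negotiation"], "won"),
    (["created", "qualified", "offer_sent"], "won"),
    (["created", "contacted", "no_response"], "lost"),
]
model = train(log)
print(predict(model, ["created", "qualified", "offer_sent", "meeting"]))
```

A real predictive monitor would replace the frequency table with a trained classifier over prefix features plus internal context data, but the prefix-truncation step is the same.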