
    Efficient Aggregated Deliveries with Strong Guarantees in an Event-based Distributed System

    A popular approach to designing large-scale distributed systems is to follow an event-based approach, in which a set of software components interact by producing and consuming events. The event-based model decouples software components, allowing distributed systems to scale to a large number of components. Event correlation enables higher-order reasoning over events by constructing complex events from single, consumable events. In many cases, event correlation applications rely on centralized setups or broker overlay networks. Centralized setups offer stronger guarantees for complex event delivery, but they create performance bottlenecks and single points of failure. Broker overlays improve performance and fault tolerance, but at the cost of weaker guarantees.
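    For illustration, a minimal sketch (assumed names, not the paper's system) of how a correlator might aggregate primitive events into a complex event once a pattern is complete within a time window:

```python
# A sketch of event correlation: a correlator consumes primitive events and
# emits a complex event once every type in a pattern has been observed
# within a time window. The names (Event, Correlator) and the pattern are
# illustrative and do not come from the paper.
import time
from collections import namedtuple

Event = namedtuple("Event", ["etype", "payload", "timestamp"])

class Correlator:
    def __init__(self, pattern, window_seconds):
        self.pattern = set(pattern)       # event types forming the complex event
        self.window = window_seconds      # correlation window in seconds
        self.buffer = []                  # primitive events seen so far

    def consume(self, event):
        cutoff = event.timestamp - self.window
        self.buffer = [e for e in self.buffer if e.timestamp >= cutoff]  # drop stale events
        self.buffer.append(event)
        if self.pattern <= {e.etype for e in self.buffer}:
            complex_event = Event("complex", list(self.buffer), event.timestamp)
            self.buffer.clear()           # each primitive event is consumed only once
            return complex_event
        return None

corr = Correlator(pattern={"order_placed", "payment_received"}, window_seconds=60)
corr.consume(Event("order_placed", {"order": 1}, time.time()))
print(corr.consume(Event("payment_received", {"order": 1}, time.time())))
```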

    Anatomy of a Native XML Base Management System

    Several alternatives for managing large XML document collections exist, ranging from file systems through relational or other database systems to specifically tailored XML repositories. In this paper we give a tour of Natix, a database management system designed from scratch for storing and processing XML data. Contrary to the common belief that managing XML data is just another application for traditional databases such as relational systems, we illustrate how almost every component in a database system is affected in terms of adequacy and performance. We show how to design and optimize areas such as storage, transaction management (comprising recovery and multi-user synchronisation), and query processing for XML.
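    As a rough illustration of the general idea behind native tree storage (not Natix's actual record layout or page organisation), the following sketch flattens an XML document into node records that keep parent links:

```python
# A sketch of the general idea of native tree storage: the document tree is
# decomposed into node records that keep parent links, rather than being
# shredded into relational tables. Illustration only; it does not reflect
# Natix's actual record layout or page organisation.
import xml.etree.ElementTree as ET

def to_records(xml_text):
    """Flatten an XML document into (node_id, parent_id, tag, text) records."""
    root = ET.fromstring(xml_text)
    records, counter = [], [0]

    def visit(node, parent_id):
        node_id = counter[0]
        counter[0] += 1
        records.append((node_id, parent_id, node.tag, (node.text or "").strip()))
        for child in node:
            visit(child, node_id)

    visit(root, None)
    return records

for record in to_records("<book><title>Natix</title><year>2002</year></book>"):
    print(record)
```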

    Research, development and evaluation of a practical model for sentiment analysis

    Sentiment Analysis is the task of extracting subjective information from input sources produced by a speaker or writer. Usually it refers to identifying whether a text carries a positive or negative polarity. The main approaches to Sentiment Analysis are lexicon- or dictionary-based methods and machine-learning schemes. Lexicon-based models make use of a predefined set of words, where each word in the set has an associated polarity. Document polarity then depends on the feature selection method and on how the individual scores are combined. Machine-learning approaches usually rely on supervised classifiers. Although classifiers offer adaptability to specific contexts, they need to be trained with large amounts of labelled data, which may not be available, especially for emerging topics. This project, contrary to most research in this field, aims to go further in emotion detection and puts its effort into identifying the actual sentiment of documents, instead of focusing on whether they have a positive or negative connotation. The set of sentiments used for this approach has been extracted from Plutchik's wheel of emotions, which defines eight basic bipolar sentiments and another eight advanced emotions, each composed of two basic ones. Moreover, in this project we have created a new scheme for Sentiment Analysis that combines a lexicon-based model for obtaining term emotions with a statistical approach that identifies the most relevant topics in the document, which are the targets of the sentiments. By taking this approach we have tried to overcome the disadvantages of simple bag-of-words models, which make no distinction between parts of speech (POS) and weight all words uniformly using the tf-idf scheme, thereby overweighting the most frequently used words. Furthermore, to improve its knowledge, this project presents a heuristic learning method that refines the initial knowledge by converging towards human-like sensitivity. To test the proposed scheme's performance, an Android application for mobile devices has been developed. The app lets users take photos and enter descriptions, which are processed and classified with emotions; the classification may then be corrected by the user so that system-performance statistics can be extracted.
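    A minimal sketch of the lexicon-based term-emotion step over Plutchik's eight basic emotions; the tiny lexicon and weights below are invented for illustration and do not come from the project:

```python
# Lexicon-based emotion scoring over Plutchik's eight basic emotions.
# The lexicon entries and weights are hypothetical; the project's actual
# lexicon, POS handling, and topic weighting are richer.
from collections import Counter

PLUTCHIK = ["joy", "trust", "fear", "surprise", "sadness", "disgust", "anger", "anticipation"]

LEXICON = {                      # hypothetical term -> emotion associations
    "happy": {"joy": 1.0},
    "terrified": {"fear": 1.0, "surprise": 0.3},
    "betrayed": {"anger": 0.7, "sadness": 0.6},
}

def score_document(text):
    """Sum term-level emotion weights into a document-level emotion profile."""
    scores = Counter()
    for token in text.lower().split():
        for emotion, weight in LEXICON.get(token, {}).items():
            scores[emotion] += weight
    return {e: scores.get(e, 0.0) for e in PLUTCHIK}

print(score_document("I was happy but then felt betrayed"))
```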

    On the Limits and Practice of Automatically Designing Self-Stabilization

    A protocol is said to be self-stabilizing when the distributed system executing it is guaranteed to recover from any fault that does not cause permanent damage. Designing such protocols is hard since they must recover from all possible states; we therefore investigate how feasible it is to synthesize them automatically. We show that synthesizing stabilization on a fixed topology is NP-complete in the number of system states. When a solution is found, we further show that verifying its correctness on a general topology (with any number of processes) is undecidable, even for very simple unidirectional rings. Despite these negative results, we develop an algorithm to synthesize a self-stabilizing protocol given its desired topology, legitimate states, and behavior. By analogy to shadow puppetry, where a puppeteer may design a complex puppet to cast a desired shadow, a protocol may need to be designed in a complex way that does not even resemble its specification. Our shadow/puppet synthesis algorithm addresses this concern and, using a complete backtracking search, has automatically designed four new self-stabilizing protocols with minimal process space requirements: 2-state maximal matching on bidirectional rings, 5-state token passing on unidirectional rings, 3-state token passing on bidirectional chains, and 4-state orientation on daisy chains.
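    As a concrete example of what self-stabilization means (this is Dijkstra's classic K-state token ring, not one of the protocols synthesized in the thesis), the following sketch converges from any initial state to a configuration with exactly one circulating token:

```python
# Dijkstra's K-state token ring, a classic self-stabilizing protocol.
# From an arbitrary, possibly corrupted initial state, the ring converges
# to exactly one circulating token.
import random

def enabled(x, i):
    """Process i holds a token under Dijkstra's rules (unidirectional ring)."""
    return x[0] == x[-1] if i == 0 else x[i] != x[i - 1]

def fire(x, i, K):
    """Execute the move of the enabled process i."""
    if i == 0:
        x[0] = (x[0] + 1) % K
    else:
        x[i] = x[i - 1]

def run(n=5, K=6, max_steps=200):
    x = [random.randrange(K) for _ in range(n)]          # arbitrary starting state
    for t in range(max_steps):
        tokens = [i for i in range(n) if enabled(x, i)]  # at least one is always enabled
        if len(tokens) == 1:                             # legitimate state: a single token
            print(f"stabilized after {t} steps: token at process {tokens[0]}, state {x}")
            return
        fire(x, random.choice(tokens), K)                # a central daemon fires one process
    print("did not stabilize within", max_steps, "steps")

run()
```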

    Rule-Based Dynamic Modification of Workflows in a Medical Domain

    A major limitation of current workflow systems is their lack of support for dynamic workflow modifications. However, this functionality is a major requirement for next-generation systems in order to provide sufficient flexibility to cope with unexpected situations and failures. For example, our experience with data-intensive medical domains such as cancer therapy shows that the large number of medical exceptions is hard for domain experts to manage. We have therefore developed a rule-based approach for the partially automated management of semantic exceptions during workflow instance execution. When an exception occurs, we automatically determine which running workflow instances, and which regions within them, are affected, and we adjust the control flow. Rules are used to detect semantic exceptions and to decide which activities have to be dropped or added. For the dynamic modification of an affected workflow instance, we provide two algorithms (the drcd- and p-algorithms) which locate appropriate deletion or insertion points and carry out the dynamic change of control flow.
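    A minimal sketch of a rule-based reaction to a semantic exception; the rule format, activity names, and threshold are hypothetical, and the drcd- and p-algorithms in the paper are considerably richer:

```python
# Each rule maps an exception predicate to activities to drop or insert in
# the not-yet-executed part of a workflow instance. Rule format, activity
# names, and the threshold are hypothetical illustrations.
RULES = [
    {
        "exception": lambda ctx: ctx.get("white_cell_count", 1e9) < 1000,
        "drop": ["administer_chemotherapy"],
        "insert": ["order_growth_factor", "reschedule_cycle"],
    },
]

def adapt(workflow, executed, ctx):
    """Return a modified activity list for the still-pending part of the workflow."""
    pending = [a for a in workflow if a not in executed]
    for rule in RULES:
        if rule["exception"](ctx):
            pending = [a for a in pending if a not in rule["drop"]]   # dynamic deletion
            pending = rule["insert"] + pending                        # dynamic insertion
    return executed + pending

wf = ["take_blood_sample", "administer_chemotherapy", "follow_up"]
print(adapt(wf, executed=["take_blood_sample"], ctx={"white_cell_count": 600}))
```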

    Garbage Collection for General Graphs

    Garbage collection is moving from being a utility to being a requirement of every modern programming language. With multi-core and distributed systems, most programs written today are heavily multi-threaded and distributed. Distributed and multi-threaded programs are called concurrent programs. Manual memory management is cumbersome and difficult in concurrent programs. Concurrent programming is characterized by multiple independent processes/threads, communication between processes/threads, and uncertainty in the order of concurrent operations. This uncertainty in the order of operations makes manual memory management of concurrent programs difficult. A popular alternative to garbage collection in concurrent programs is to use smart pointers. Smart pointers can collect all garbage only if the developer identifies the cycles being created in the reference graph; their use does not guarantee protection from memory leaks unless cycles can be detected as processes/threads create them. General garbage collectors, on the other hand, can avoid memory leaks, dangling pointers, and double-deletion problems in any programming environment without help from the programmer. Concurrent programming is used in shared-memory and distributed-memory systems. State-of-the-art shared-memory systems use a single concurrent garbage collector thread that processes the reference graph. Distributed-memory systems have very few complete garbage collection algorithms, and those that exist use global barriers, are centralized, and do not scale well. This thesis focuses on designing garbage collection algorithms for shared-memory and distributed-memory systems that satisfy the following properties: concurrent, parallel, scalable, localized (decentralized), low pause time, high promptness, no global synchronization, safe, complete, and operating in linear time.
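    To illustrate why tracing collectors handle cycles that smart pointers miss, here is a minimal stop-the-world mark-and-sweep over a general object graph; it is only an illustration of tracing, not the concurrent, decentralized collectors designed in the thesis:

```python
# Tracing from the roots reclaims cyclic garbage that naive reference
# counting (smart pointers) cannot.
class Obj:
    def __init__(self, name):
        self.name, self.refs, self.marked = name, [], False

def mark(roots):
    """Mark every object reachable from the roots."""
    stack = list(roots)
    while stack:
        o = stack.pop()
        if not o.marked:
            o.marked = True
            stack.extend(o.refs)

def sweep(heap):
    """Return the survivors; unmarked objects would be reclaimed."""
    live = [o for o in heap if o.marked]
    for o in live:
        o.marked = False                                   # reset marks for the next collection
    return live

a, b, c = Obj("a"), Obj("b"), Obj("c")
a.refs.append(b); b.refs.append(c); c.refs.append(b)       # b and c form a cycle
heap = [a, b, c]
mark([a]); print([o.name for o in sweep(heap)])            # ['a', 'b', 'c']: all reachable
mark([]);  print([o.name for o in sweep(heap)])            # []: the cycle is collected too
```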

    From software failure to explanation

    “Why does my program crash?”—This ever-recurring question drives the developer both when trying to reconstruct a failure that happened in the field and during the analysis and debugging of the test case that captures the failure. This is the question this thesis attempts to answer. To that end I present two approaches which, when combined, start off with only a dump of the memory at the moment of the crash (a core dump) and eventually give a full explanation of the failure in terms of the important runtime features of the program, such as critical branches, state predicates, or any other execution aspect that is deemed helpful for understanding the underlying problem. The first approach (called RECORE) takes a core dump of a crash and, by means of search-based test case generation, comes up with a small, self-contained, and easy-to-understand unit test that resembles the tests attached to bug reports and reproduces the failure. This test case can serve as a starting point for analysis and manual debugging. Our evaluation shows that in five out of seven real cases, the resulting test captures the essence of the failure. The failing test case can also serve as the starting point for the second approach (called BUGEX). BUGEX is a universal debugging framework that applies the scientific method and can be implemented for arbitrary runtime features (called facts). First it observes these facts during the execution of the failing test case. Using state-of-the-art statistical debugging, the facts are then correlated with the failure, forming a hypothesis. Then it performs experiments: it generates additional executions to challenge these facts and, from these additional observations, refines the hypothesis. The result is a correlation of critical execution aspects with the failure with unprecedented accuracy, which instantaneously points the developer to the problem. This general debugging framework can be implemented for any runtime aspect; for evaluation purposes I implemented it for branches and state predicates. The evaluation shows that in six out of seven real cases, the resulting facts pinpoint the failure. Both approaches are independent from one another, and each automates a tedious and error-prone task. When combined, they automate a large part of the debugging process, where the remaining manual task—fixing the defect—can never be fully automated.
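    A minimal sketch of the statistical-debugging step that forms a hypothesis: observed facts (here, branch outcomes) are ranked by how strongly they correlate with failing runs, using the Ochiai metric. This illustrates the general idea, not BUGEX's actual implementation:

```python
# Rank facts by their correlation with failure (Ochiai suspiciousness).
import math
from collections import defaultdict

def rank_facts(runs):
    """runs: list of (facts_observed, failed) pairs; returns facts sorted by suspiciousness."""
    fail_cnt, pass_cnt = defaultdict(int), defaultdict(int)
    total_failed = sum(1 for _, failed in runs if failed)
    for facts, failed in runs:
        for f in facts:
            (fail_cnt if failed else pass_cnt)[f] += 1

    def ochiai(f):
        denom = math.sqrt(total_failed * (fail_cnt[f] + pass_cnt[f]))
        return fail_cnt[f] / denom if denom else 0.0

    facts = set(fail_cnt) | set(pass_cnt)
    return sorted(facts, key=ochiai, reverse=True)

runs = [({"branch_A_taken", "branch_B_taken"}, True),
        ({"branch_A_taken"}, False),
        ({"branch_B_taken"}, True),
        ({"branch_A_taken"}, False)]
print(rank_facts(runs))   # branch_B_taken correlates most strongly with failure
```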

    Complete Model-Based Testing Applied to the Railway Domain

    Testing is the most important verification technique to assert the correctness of an embedded system. Model-based testing (MBT) is a popular approach that generates test cases from models automatically. For the verification of safety-critical systems, complete MBT strategies are most promising. Complete testing strategies can guarantee that all errors of a certain kind are revealed by the generated test suite, given that the system under test fulfils several hypotheses. This work presents a complete testing strategy based on equivalence class abstraction. Using this approach, reactive systems with a potentially infinite input domain but finitely many internal states can be abstracted to finite-state machines. This allows for the generation of finite test suites providing completeness. However, for a system under test it is hard to prove the validity of the hypotheses that justify the completeness of the applied testing strategy. Therefore, we experimentally evaluate the fault-detection capabilities of our equivalence class testing strategy in this work. We use a novel mutation-analysis strategy which introduces artificial errors into a SystemC model to mimic typical HW/SW integration errors. We provide experimental results that show the adequacy of our approach using case studies from the railway domain (i.e., a speed-monitoring function and an interlocking-system controller) and from the automotive domain (i.e., an airbag controller). Furthermore, we present extensions to the equivalence class testing strategy. We show that a combination with randomisation and boundary-value selection is able to significantly increase the probability of detecting HW/SW integration errors.
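    A minimal sketch of combining equivalence classes, boundary values, and randomisation when selecting inputs for a hypothetical speed-monitoring guard; the thresholds and the class partition are invented, whereas the thesis derives the classes from the model:

```python
# Select test inputs from each input equivalence class: its boundary values
# plus a few random members. Thresholds (warn above 80 km/h, brake above
# 90 km/h) and the partition are hypothetical.
import random

CLASSES = [(0.0, 80.0), (80.0, 90.0), (90.0, 200.0)]   # normal / warning / braking ranges

def select_inputs(per_class_random=2, epsilon=0.1):
    inputs = []
    for lo, hi in CLASSES:
        inputs += [lo, hi - epsilon]                                         # boundary values
        inputs += [random.uniform(lo, hi) for _ in range(per_class_random)]  # random members
    return inputs

def monitor(speed):
    return "brake" if speed >= 90 else "warn" if speed >= 80 else "ok"

for v in select_inputs():
    print(f"{v:7.2f} -> {monitor(v)}")
```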