688 research outputs found

    Avoiding Unnecessary Information Loss: Correct and Efficient Model Synchronization Based on Triple Graph Grammars

    Full text link
    Model synchronization, i.e., the task of restoring consistency between two interrelated models after a model change, is a challenging task. Triple Graph Grammars (TGGs) specify model consistency by means of rules that describe how to create consistent pairs of models. These rules can be used to automatically derive further rules, which describe how to propagate changes from one model to the other or how to change one model in such a way that propagation is guaranteed to be possible. Restricting model synchronization to these derived rules, however, may lead to unnecessary deletion and recreation of model elements during change propagation. This is inefficient and may cause unnecessary information loss, i.e., when deleted elements contain information that is not represented in the second model, this information cannot be recovered easily. Short-cut rules have recently been developed to avoid unnecessary information loss by reusing existing model elements. In this paper, we show how to automatically derive (short-cut) repair rules from short-cut rules to propagate changes such that information loss is avoided and model synchronization is accelerated. The key ingredients of our rule-based model synchronization process are these repair rules and an incremental pattern matcher informing about suitable applications of them. We prove the termination and the correctness of this synchronization process and discuss its completeness. As a proof of concept, we have implemented this synchronization process in eMoflon, a state-of-the-art model transformation tool with inherent support of bidirectionality. Our evaluation shows that repair processes based on (short-cut) repair rules have considerably decreased information loss and improved performance compared to former model synchronization processes based on TGGs.Comment: 33 pages, 20 figures, 3 table

    Formal Foundations for Information-Preserving Model Synchronization Processes Based on Triple Graph Grammars

    Get PDF
    Zwischen verschiedenen Artefakten, die Informationen teilen, wieder Konsistenz herzustellen, nachdem eines von ihnen geändert wurde, ist ein wichtiges Problem, das in verschiedenen Bereichen der Informatik auftaucht. Mit dieser Dissertation legen wir eine Lösung für das grundlegende Modellsynchronisationsproblem vor. Bei diesem Problem ist ein Paar solcher Artefakte (Modelle) gegeben, von denen eines geändert wurde; Aufgabe ist die Wiederherstellung der Konsistenz. Tripelgraphgrammatiken (TGGs) sind ein etablierter und geeigneter Formalismus, um dieses und verwandte Probleme anzugehen. Da sie auf der algebraischen Theorie der Graphtransformation und dem (Double-)Pushout Zugang zu Ersetzungssystemen basieren, sind sie besonders geeignet, um Lösungen zu entwickeln, deren Eigenschaften formal bewiesen werden können. Doch obwohl TGG-basierte Ansätze etabliert sind, leiden viele von ihnen unter dem Problem des Informationsverlustes. Wenn ein Modell geändert wurde, können während eines Synchronisationsprozesses Informationen verloren gehen, die nur im zweiten Modell vorliegen. Das liegt daran, dass solche Synchronisationsprozesse darauf zurückfallen Konsistenz dadurch wiederherzustellen, dass sie das geänderte Modell (bzw. große Teile von ihm) neu übersetzen. Wir schlagen einen TGG-basierten Ansatz vor, der fortgeschrittene Features von TGGs unterstützt (Attribute und negative Constraints), durchgängig formalisiert ist, implementiert und inkrementell in dem Sinne ist, dass er den Informationsverlust im Vergleich mit vorherigen Ansätzen drastisch reduziert. Bisher gibt es keinen TGG-basierten Ansatz mit vergleichbaren Eigenschaften. Zentraler Beitrag dieser Dissertation ist es, diesen Ansatz formal auszuarbeiten und seine wesentlichen Eigenschaften, nämlich Korrektheit, Vollständigkeit und Termination, zu beweisen. Die entscheidende neue Idee unseres Ansatzes ist es, Reparaturregeln anzuwenden. Dies sind spezielle Regeln, die es erlauben, Änderungen an einem Modell direkt zu propagieren anstatt auf Neuübersetzung zurückzugreifen. Um diese Reparaturregeln erstellen und anwenden zu können, entwickeln wir grundlegende Beiträge zur Theorie der algebraischen Graphtransformation. Zunächst entwickeln wir eine neue Art der sequentiellen Komposition von Regeln. Im Gegensatz zur gewöhnlichen Komposition, die zu Regeln führt, die Elemente löschen und dann wieder neu erzeugen, können wir Regeln herleiten, die solche Elemente stattdessen bewahren. Technisch gesehen findet der Synchronisationsprozess, den wir entwickeln, außerdem in der Kategorie der partiellen Tripelgraphen statt und nicht in der der normalen Tripelgraphen. Daher müssen wir sicherstellen, dass die für Double-Pushout-Ersetzungssysteme ausgearbeitete Theorie immer noch gültig ist. Dazu entwickeln wir eine (kategorientheoretische) Konstruktion neuer Kategorien aus gegebenen und zeigen, dass (i) diese Konstruktion die Axiome erhält, die nötig sind, um die Theorie für Double-Pushout-Ersetzungssysteme zu entwickeln, und (ii) partielle Tripelgraphen als eine solche Kategorie konstruiert werden können. Zusammen ermöglichen diese beiden grundsätzlichen Beiträge es uns, unsere Lösung für das grundlegende Modellsynchronisationsproblem vollständig formal auszuarbeiten und ihre zentralen Eigenschaften zu beweisen.Restoring consistency between different information-sharing artifacts after one of them has been changed is an important problem that arises in several areas of computer science. In this thesis, we provide a solution to the basic model synchronization problem. There, a pair of such artifacts (models), one of which has been changed, is given and consistency shall be restored. Triple graph grammars (TGGs) are an established and suitable formalism to address this and related problems. Being based on the algebraic theory of graph transformation and (double-)pushout rewriting, they are especially suited to develop solutions whose properties can be formally proven. Despite being established, many TGG-based solutions do not satisfactorily deal with the problem of information loss. When one model is changed, in the process of restoring consistency such solutions may lose information that is only present in the second model because the synchronization process resorts to restoring consistency by re-translating (large parts of) the updated model. We introduce a TGG-based approach that supports advanced features of TGGs (attributes and negative constraints), is comprehensively formalized, implemented, and is incremental in the sense that it drastically reduces the amount of information loss compared to former approaches. Up to now, a TGG-based approach with these characteristics is not available. The central contribution of this thesis is to formally develop that approach and to prove its essential properties, namely correctness, completeness, and termination. The crucial new idea in our approach is the use of repair rules, which are special rules that allow one to directly propagate changes from one model to the other instead of resorting to re-translation. To be able to construct and apply these repair rules, we contribute more fundamentally to the theory of algebraic graph transformation. First, we develop a new kind of sequential rule composition. Whereas the conventional composition of rules leads to rules that delete and re-create elements, we can compute rules that preserve such elements instead. Furthermore, technically the setting in which the synchronization process we develop takes place is the category of partial triple graphs and not the one of ordinary triple graphs. Hence, we have to ensure that the elaborate theory of double-pushout rewriting still applies. Therefore, we develop a (category-theoretic) construction of new categories from given ones and show that (i) this construction preserves the axioms that are necessary to develop the theory of double-pushout rewriting and (ii) partial triple graphs can be constructed as such a category. Together, those two more fundamental contributions enable us to develop our solution to the basic model synchronization problem in a fully formal manner and to prove its central properties

    Mathematics learning & technology – A structural topic modelling

    Get PDF

    Kostenbewertung

    Get PDF

    Consistency and identifiability of the polymorphism-aware phylogenetic models

    Get PDF
    Funding: Vienna Science and Technology Fund (WWTF) [MA16-061].Polymorphism-aware phylogenetic models (PoMo) constitute an alternative approach for species tree estimation from genome-wide data. PoMo builds on the standard substitution models of DNA evolution but expands the classic alphabet of the four nucleotide bases to include polymorphic states. By doing so, PoMo accounts for ancestral and current intra-population variation, while also accommodating population-level processes ruling the substitution process (e.g. genetic drift, mutations, allelic selection). PoMo has shown to be a valuable tool in several phylogenetic applications but a proof of statistical consistency (and identifiability, a necessary condition for consistency) is lacking. Here, we prove that PoMo is identifiable and, using this result, we further show that the maximum a posteriori (MAP) tree estimator of PoMo is a consistent estimator of the species tree. We complement our theoretical results with a simulated data set mimicking the diversity observed in natural populations exhibiting incomplete lineage sorting. We implemented PoMo in a Bayesian framework and show that the MAP tree easily recovers the true tree for typical numbers of sites that are sampled in genome-wide analyses.Publisher PDFPeer reviewe

    Kosten, fixe und variable

    Get PDF

    The life cycle of Drosophila orphan genes

    Get PDF
    Funding: Austrian Science Funds (FWF) (P22834).Orphans are genes restricted to a single phylogenetic lineage and emerge at high rates. While this predicts an accumulation of genes, the gene number has remained remarkably constant through evolution. This paradox has not yet been resolved. Because orphan genes have been mainly analyzed over long evolutionary time scales, orphan loss has remained unexplored. Here we study the patterns of orphan turnover among close relatives in the Drosophila obscura group. We show that orphans are not only emerging at a high rate, but that they are also rapidly lost. Interestingly, recently emerged orphans are more likely to be lost than older ones. Furthermore, highly expressed orphans with a strong male-bias are more likely to be retained. Since both lost and retained orphans show similar evolutionary signatures of functional conservation, we propose that orphan loss is not driven by high rates of sequence evolution, but reflects lineage-specific functional requirements.Publisher PDFPeer reviewe

    PoMo : an allele frequency-based approach for species tree estimation

    Get PDF
    This work was supported by a grant from the Austrian Science Fund (FWF, P24551-B25 to C.K.). N.D.M. and D.S. were members of the Vienna Graduate School of Population Genetics which is supported by a grant of the Austrian Science Fund (FWF, W1225-B20). N.D.M. was partially supported by the Institute for Emerging Infections, funded by the Oxford Martin School.Incomplete lineage sorting can cause incongruencies of the overall species-level phylogenetic tree with the phylogenetic trees for individual genes or genomic segments. If these incongruencies are not accounted for, it is possible to incur several biases in species tree estimation. Here, we present a simple maximum likelihood approach that accounts for ancestral variation and incomplete lineage sorting. We use a POlymorphisms-aware phylogenetic MOdel (PoMo) that we have recently shown to efficiently estimate mutation rates and fixation biases from within and between-species variation data. We extend this model to perform efficient estimation of species trees. We test the performance of PoMo in several different scenarios of incomplete lineage sorting using simulations and compare it with existing methods both in accuracy and computational speed. In contrast to other approaches, our model does not use coalescent theory but is allele frequency based. We show that PoMo is well suited for genome-wide species tree estimation and that on such data it is more accurate than previous approaches.Publisher PDFPeer reviewe
    corecore