
    Effective Strategic Programming for Java Developers

    In object-oriented programming languages, the Visitor design pattern allows the separation of algorithms and data structures. When applying this pattern to tree-like structures, programmers are regularly confronted with the difficulty of making their code evolve. One reason is that the code implementing the algorithm is interwoven with the code implementing the traversal inside the visitor. When implementing algorithms such as data analyses or transformations, encoding the traversal directly into the algorithm turns out to be cumbersome, as this type of algorithm only focuses on a small part of the data-structure model (e.g., program optimization). Unfortunately, typed programming languages like Java do not offer simple solutions for expressing generic traversals. Rewrite-based languages like ELAN or Stratego have introduced the notion of strategies to express both generic traversal and rule application control in a declarative way. Starting from this approach, our goal was to make strategic programming available in a widely used language such as Java, and thus to offer generic traversals over typed Java structures. In this paper, we present the strategy language SL, which provides programming support for strategies in Java.
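
    The separation SL aims at can be hinted at in a few lines of plain Java. The sketch below illustrates the Stratego-style combinator idea (the traversal is a first-class value, independent of any rewrite rule); it is an invented illustration, not the SL language itself, and all names are assumptions:

        import java.util.List;
        import java.util.stream.Collectors;

        interface Node {
            List<Node> children();
            Node withChildren(List<Node> kids);
        }

        @FunctionalInterface
        interface Strategy {
            Node apply(Node n); // the rewrite rule itself, traversal-free

            // "all" combinator: apply this strategy to every immediate child.
            default Strategy all() {
                return n -> n.withChildren(n.children().stream()
                        .map(this::apply).collect(Collectors.toList()));
            }

            // Generic top-down traversal: rewrite the node, then recurse.
            static Strategy topDown(Strategy rule) {
                return n -> topDown(rule).all().apply(rule.apply(n));
            }
        }

        // Usage (hypothetical): Strategy.topDown(rule).apply(tree) rewrites a
        // whole tree while the rule only handles the node kinds it cares about.

    Because the traversal combinators (topDown, all) and the rule are separate values, the rule no longer has to change when the data-structure model evolves, which is exactly the maintenance problem the paper targets.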

    An Algebraic Approach to XQuery Optimization

    As more data is stored in XML and more applications need to process this data, XML query optimization becomes performance-critical. While optimization techniques for relational databases have been developed over the last thirty years, the optimization of XML queries poses new challenges. Query optimizers for XQuery, the standard query language for XML data, need to consider both document order and sequence order. Nevertheless, algebraic optimization has proved powerful in query optimizers for relational and object-oriented databases. This dissertation therefore presents an algebraic approach to XQuery optimization. An algebra over sequences is presented that allows for a simple translation of XQuery into this algebra, and the formal definitions of the operators in this algebra allow us to reason formally about algebraic optimizations. The thesis leverages the power of this formalism when unnesting nested XQuery expressions: in almost all cases, unnesting nested XQuery queries reduces query execution times from hours to seconds or milliseconds. Moreover, the dissertation presents three basic algebraic patterns of nested queries; for every basic pattern, a decision tree is developed to select the most effective unnesting equivalence for a given query. Query unnesting extends the search space that can be considered during cost-based optimization of XQuery, so substantially more efficient query execution plans may be found. The thesis presents two further cases where enlarging the set of plan alternatives leads to substantially shorter query execution times: join ordering and the reordering of location steps in path expressions. Our algebraic framework detects cases where document order or sequence order is destroyed; however, state-of-the-art techniques for order optimization in cost-based query optimizers have efficient mechanisms to repair order in these cases. The results obtained for query unnesting and cost-based optimization of XQuery underline the need for an algebraic approach to XQuery optimization for efficient XML query processing. Moreover, they are applicable to optimization in relational databases wherever order semantics are considered.
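
    The payoff of unnesting can be made concrete outside XQuery. The sketch below, in plain Java with invented names, contrasts a correlated nested evaluation (the subquery is re-evaluated per outer item) with its unnested hash-join equivalent; an algebraic unnesting equivalence performs an analogous rewrite on the operator tree:

        import java.util.*;
        import java.util.stream.*;

        record Dept(int id, String name) {}
        record Emp(int deptId, String name) {}

        class Unnesting {
            // Nested form: re-scans emps for every dept, O(|depts| * |emps|).
            static Map<String, List<String>> nested(List<Dept> depts, List<Emp> emps) {
                Map<String, List<String>> out = new LinkedHashMap<>(); // keeps outer order
                for (Dept d : depts)
                    out.put(d.name(), emps.stream()
                            .filter(e -> e.deptId() == d.id())
                            .map(Emp::name).collect(Collectors.toList()));
                return out;
            }

            // Unnested form: one grouping pass plus O(1) probes, O(|depts| + |emps|).
            static Map<String, List<String>> unnested(List<Dept> depts, List<Emp> emps) {
                Map<Integer, List<String>> byDept = emps.stream().collect(
                        Collectors.groupingBy(Emp::deptId,
                                Collectors.mapping(Emp::name, Collectors.toList())));
                Map<String, List<String>> out = new LinkedHashMap<>();
                for (Dept d : depts)
                    out.put(d.name(), byDept.getOrDefault(d.id(), List.of()));
                return out;
            }
        }

    Both forms preserve the order of the outer sequence and the encounter order within each group, mirroring the document-order and sequence-order obligations the dissertation's operators must respect.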

    Democratizing machine learning

    Machine learning models are increasingly embedded in society, often in the form of automated decision-making processes. One major reason for this, along with methodological improvements, is the increasing accessibility of data, but also of machine learning toolkits that open machine learning methodology to non-experts. The core focus of this thesis is exactly this: democratizing access to machine learning in order to enable a wider audience to benefit from its potential.
    Contributions in this manuscript stem from several areas within this broad field. A major part is dedicated to automated machine learning (AutoML) and hyperparameter optimization, with the goal of abstracting away the tedious task of obtaining an optimal predictive model for a given dataset: the user only selects the appropriate performance metric(s) and validates the resulting models, while the search for the optimal model is automated. This process can be improved or sped up by learning from previous experiments. Three such methods are presented in this thesis: one aims to obtain a fixed set of hyperparameter configurations that likely contains good solutions for any new dataset, and two use dataset characteristics to propose new configurations. The thesis furthermore presents a collection of the required experiment metadata and shows how such metadata can be used for the development of, and as a test bed for, new hyperparameter optimization methods. The pervasiveness of ML-derived models in many aspects of society simultaneously calls for increased scrutiny of how such models shape society and of the biases they may exhibit. This thesis therefore presents an AutoML tool that allows fairness considerations to be incorporated into the search for an optimal model. The requirement for fairness in turn raises the question of whether a model's fairness can be reliably estimated, which is studied in a further contribution. Since access to machine learning methods also heavily depends on access to software and toolboxes, several contributions in the form of software are part of this thesis. The mlr3pipelines R package allows models to be embedded in so-called machine learning pipelines that include the pre- and postprocessing steps often required in machine learning and AutoML. The mlr3fairness R package enables users to audit models for potential biases and to reduce those biases through different debiasing techniques. One such technique, multi-calibration, is published as a separate software package, mcboost.
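
    The portfolio idea (a fixed set of configurations learned from prior experiments) reads naturally as a tiny search loop. The sketch below is a hedged illustration in Java with an invented two-parameter search space and invented names; it is not code from the thesis or the mlr3 ecosystem:

        import java.util.*;
        import java.util.function.ToDoubleFunction;

        record Config(double learningRate, int maxDepth) {}

        class PortfolioSearch {
            // A portfolio learned offline from experiment metadata (assumed given):
            // configurations that scored well across many previous datasets.
            static final List<Config> PORTFOLIO = List.of(
                    new Config(0.10, 6), new Config(0.01, 12), new Config(0.30, 3));

            // On a new dataset, evaluate the portfolio and keep the best
            // configuration according to the user's validation metric.
            static Config best(ToDoubleFunction<Config> validationScore) {
                return PORTFOLIO.stream()
                        .max(Comparator.comparingDouble(validationScore))
                        .orElseThrow();
            }
        }

    The dataset-characteristics variants would replace the fixed list with configurations proposed from meta-features, but the evaluate-and-select loop stays the same.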

    An expert system for integrated structural analysis and design optimization for aerospace structures

    The results of a research study on the development of an expert system for integrated structural analysis and design optimization are presented. An Object Representation Language (ORL) was developed first, in conjunction with a rule-based system. This ORL/AI shell was then used to develop expert systems that provide assistance with a variety of structural analysis and design optimization tasks, in conjunction with procedural modules for finite element structural analysis and design optimization. The main goal of the study was to provide expertise, judgment, and reasoning capabilities in the aerospace structural design process, allowing engineers performing structural analysis and design, even those without extensive experience in the field, to develop error-free, efficient, and reliable structural designs rapidly and cost-effectively. This would not only improve the productivity of design engineers and analysts but also significantly reduce the time to completion of structural designs. An extensive literature survey covering structural analysis, design optimization, artificial intelligence, and database management systems, and their application to the structural design process, was performed first. A feasibility study followed, after which the architecture and conceptual design for the integrated 'intelligent' structural analysis and design optimization software were developed. The ORL, in conjunction with a rule-based system, was then implemented in C++. This approach improves the expressiveness of knowledge representation (especially for structural analysis and design applications), makes it possible to build very large and practical expert systems, and provides an efficient way of storing knowledge. Functional specifications for the expert systems were then developed, and the ORL/AI shell was used to build expert-system modules for the modeling, finite element analysis, and design optimization tasks in the integrated aerospace structural design process. These expert systems work in conjunction with procedural finite element structural analysis and design optimization modules developed in-house at SAT, Inc. The complete software, AutoDesign, can be used for integrated 'intelligent' structural analysis and design optimization. It was beta-tested at a variety of companies by engineers with different levels of background and expertise; conclusions based on their feedback are provided.
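
    The report does not publish the ORL/AI shell itself (it was implemented in C++ at SAT, Inc.). As a hedged illustration of the forward-chaining style such a rule-based shell embodies, here is a toy rule engine in Java; every name is invented:

        import java.util.*;
        import java.util.function.Predicate;

        record Rule(String name, Predicate<Set<String>> condition, String conclusion) {}

        class TinyShell {
            // Forward chaining: fire rules until no rule adds a new fact.
            static Set<String> infer(Set<String> facts, List<Rule> rules) {
                Set<String> known = new HashSet<>(facts);
                boolean changed = true;
                while (changed) {
                    changed = false;
                    for (Rule r : rules)
                        if (r.condition().test(known) && known.add(r.conclusion()))
                            changed = true;
                }
                return known;
            }
        }

        // Usage (invented facts): a rule such as
        //   new Rule("buckling",
        //            f -> f.containsAll(Set.of("thin_walled", "axial_compression")),
        //            "run_buckling_check")
        // adds "run_buckling_check" whenever both trigger facts are known.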

    Agent Based Modeling in Land-Use and Land-Cover Change Studies

    Agent-based models (ABM) for land-use and land-cover change (LUCC) hold the promise of providing new insight into the processes and patterns of human and biophysical interactions in ways that have never been explored. Advances in computer technology make it possible to run vast numbers of simulations with multiple heterogeneous actors that interact reciprocally, both vertically and horizontally, across various levels. Based upon an extensive literature review, the basic components for such exercises are explored and discussed, resulting in a systematic representation consisting of: (1) spatial static input data; (2) actor and actor-group static input data; (3) spatial dynamic input data; (4) actor and actor-group dynamic input data; (5) the model with its decision rules; (6) spatial static output; (7) actor and actor-group static output; (8) dynamic output of actor behaviour changes; (9) dynamic output of actor-group behaviour changes; (10) dynamic output of spatial patterns; (11) dynamic output of temporal patterns. This representation proves to be epistemologically useful in the analysis of the relationships between the ABM LUCC components. In this paper, the representation is also used to enumerate the strengths and limitations of agent-based modelling in LUCC.
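
    Read as software, components (1)-(11) imply a loop in which static inputs configure the world, actor decisions update the landscape each tick, and spatial, actor, and temporal outputs are recorded along the way. The sketch below is a hedged, invented-name illustration of that loop in Java, not any specific ABM LUCC framework:

        import java.util.*;

        interface Actor {
            void step(int[][] landUse, Random rng); // decision rules mutate the map
        }

        class LuccSimulation {
            static void run(int[][] landUse, List<Actor> actors, int ticks, long seed) {
                Random rng = new Random(seed); // static inputs: initial map and actors
                for (int t = 0; t < ticks; t++) {
                    for (Actor a : actors) a.step(landUse, rng); // dynamic actor input
                    recordOutputs(t, landUse, actors);           // dynamic outputs per tick
                }
            }

            static void recordOutputs(int tick, int[][] landUse, List<Actor> actors) {
                // Placeholder: emit spatial patterns, actor and actor-group behaviour
                // changes, and temporal patterns here (components 8-11).
            }
        }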

    Java Grande Forum Report: Making Java Work for High-End Computing

    This document describes the Java Grande Forum and includes its initial deliverables. These are reports that convey a succinct set of recommendations from this forum to Sun Microsystems and other purveyors of Java™ technology that will enable Grande Applications to be developed with the Java programming language.

    Law Informs Code: A Legal Informatics Approach to Aligning Artificial Intelligence with Humans

    We are currently unable to specify human goals and societal values in a way that reliably directs AI behavior. Law-making and legal interpretation form a computational engine that converts opaque human values into legible directives. "Law Informs Code" is the research agenda embedding legal knowledge and reasoning in AI. Just as parties to a legal contract cannot foresee every potential contingency of their future relationship, and legislators cannot predict all the circumstances under which their proposed bills will be applied, we cannot ex ante specify rules that provably direct good AI behavior. Legal theory and practice have developed arrays of tools to address these specification problems. For instance, legal standards allow humans to develop shared understandings and adapt them to novel situations. In contrast to more prosaic uses of the law (e.g., as a deterrent of bad behavior through the threat of sanction), when leveraged as an expression of how humans communicate their goals and what society values, Law Informs Code. We describe how data generated by legal processes (methods of law-making, statutory interpretation, contract drafting, applications of legal standards, legal reasoning, etc.) can facilitate the robust specification of inherently vague human goals. This increases human-AI alignment and the local usefulness of AI. Toward society-AI alignment, we present a framework for understanding law as the applied philosophy of multi-agent alignment. Although law is partly a reflection of historically contingent political power, and thus not a perfect aggregation of citizen preferences, if properly parsed, its distillation offers the most legitimate computational comprehension of societal values available. If law eventually informs powerful AI, engaging in the deliberative political process to improve law takes on even more meaning. Forthcoming in Northwestern Journal of Technology and Intellectual Property, Volume 2.