7 research outputs found

    Simplifying Dependent Reductions in the Polyhedral Model

    A reduction -- an accumulation over a set of values, using an associative and commutative operator -- is a common computation in many numerical applications, including scientific computations, machine learning, computer vision, and financial analytics. Contemporary polyhedral-based compilation techniques make it possible to optimize reductions, such as prefix sums, in which each component of the reduction's output potentially shares computation with another component in the reduction. Therefore, an optimizing compiler can identify the computation shared between multiple components and generate code that computes the shared computation only once. These techniques, however, do not support reductions that -- when phrased in the language of the polyhedral model -- span multiple dependent statements. In such cases, existing approaches can generate incorrect code that violates the data dependences of the original, unoptimized program. In this work, we identify and formalize the optimization of dependent reductions as an integer bilinear program. We present a heuristic optimization algorithm that uses an affine sequential schedule of the program to determine how to simplify reductions yet still preserve the program's dependences. We demonstrate that the algorithm provides optimal complexity for a set of benchmark programs from the literature on probabilistic inference algorithms, whose performance critically relies on simplifying these reductions. The complexities for 10 of the 11 programs improve significantly, by factors of at least the size of the input data, which is in the range of 10^4 to 10^6 for typical real application inputs. We also confirm the significance of the improvement by showing speedups in wall-clock time that range from 1.1x to over 10^6x.
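
The prefix-sum case mentioned in the abstract can be illustrated with a minimal Python sketch. It is only an illustration of shared computation between reduction outputs, not the paper's algorithm; the function names are hypothetical and chosen for this example.

```python
def prefix_sums_naive(xs):
    # Each output element recomputes its full reduction from scratch,
    # so the total work is quadratic in len(xs).
    return [sum(xs[: i + 1]) for i in range(len(xs))]


def prefix_sums_simplified(xs):
    # Output i shares all but one term with output i - 1, so the shared
    # part is computed once and reused: out[i] = out[i - 1] + xs[i].
    out = []
    running = 0
    for x in xs:
        running += x
        out.append(running)
    return out
```

Both functions return the same list; the second does a linear number of additions. The reuse is valid here because addition is associative and commutative, which matches the definition of a reduction given above.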

    Practical synthesis from real-world oracles

    As software systems become increasingly heterogeneous, the ability of compilers to reason about an entire system has decreased. When components of a system are not implemented as traditional programs, but rather as specialised hardware, optimised architecture-specific libraries, or network services, the compiler is unable to cross these abstraction barriers and analyse the system as a whole. If these components could be modelled or understood as programs, then the compiler would be able to reason about their behaviour without concern for their internal implementation details: a homogeneous view of the entire system would be afforded. However, such components rarely correspond to an original program. This means that to facilitate this homogeneous analysis, programmatic models of component behaviour must be learned or constructed automatically. Constructing these models is an inductive program synthesis problem, albeit a challenging one that is largely beyond the ability of existing implementations. In order for the problem to be made tractable, information provided by the underlying context (i.e. the real component behaviour to be matched) must be integrated. This thesis presents three program synthesis approaches that integrate contextual information to synthesise programmatic models for real, existing components. The first, Annote, exploits informally-encoded information about a component's interface (e.g. from documentation) by weaving that information into an extended type-and-attribute system for component interfaces. The second, Presyn, learns a pair of cooperating probabilistic models from prior syntheses that aim to predict likely program structure based on a component's interface. Finally, Haze uses observations of common side-effects of component executions to bias the search for programs.
These approaches are each evaluated against comparable synthesisers from the literature, on a set of benchmark problems derived from real components. Learning models for component behaviour is only a partial solution; the compiler must also have some mechanism to use those models for program analysis and transformation. This thesis additionally proposes a novel mechanism for context-sensitive automatic API migration based on synthesised programmatic models, and evaluates the effectiveness of doing so on real application code. In summary, this thesis proposes a new framing for program synthesis problems that target the behaviour of real components, and demonstrates three different potential approaches to synthesis in this spirit. The success of these approaches is evaluated against implementations from the literature, and their results are used to drive a novel API migration technique.

    Interactive Narrative for Adaptive Educational Games: Architecture and an Application to Character Education

    This thesis presents AEINS, the Adaptive Educational Interactive Narrative System, which supports teaching ethics to 8-12 year old children. AEINS is designed based on Keller's and Gagné's learning theories. The idea is centered around involving students in moral dilemmas (called teaching moments) within which the Socratic Method is used as the teaching pedagogy. The distinctive aspect of AEINS is that it combines four features that have each been shown to increase the effectiveness of edugame environments but have not been integrated in past research: a student model, a dynamically generated narrative, a scripted branched narrative, and evolving non-player characters. The student model aims to provide adaptation. The dynamically generated narrative forms a continuous story that glues the scripted teaching moments together. The evolving agents increase the realism and believability of the environment and perform a recognized pedagogical role by helping to support the educational process. AEINS has been evaluated intrinsically and empirically according to the following themes: architecture and implementation, social aspects, and educational achievements. The intrinsic evaluation checked the implicit goals embodied by the design aspects and made a value judgment about these goals. In the empirical evaluation, twenty participants were assigned to use AEINS over a number of games. The evaluation showed positive results: the participants appreciated the social characteristics of the system and were able to recognize the genuine social aspects and the realism represented in the game. Finally, the evaluation indicated that some participants developed new lines of thinking, to the extent that some of them were ready to carry the experience forward to the real world. However, the evaluation also suggested possible improvements, such as the use of a 3D interface and free-text natural language input.

    Definite Description Processing in Unrestricted Text

    Institute for Communicating and Collaborative Systems
    Noun phrases with the definite article the, that we call DEFINITE DESCRIPTIONS, following (Russell, 1905), are one of the most common constructs in English, and have been extensively studied by linguists, philosophers, psychologists, and computational linguists. In this dissertation we present an implemented model of definite description processing that is based on extensive empirical studies of definite description use and whose performance can be quantitatively measured. In almost all approaches to discourse processing and discourse representation, definite descriptions have been regarded as anaphoric; and the models of definite description processing proposed in the literature tend to emphasise the role of common-sense inference mechanisms. Recent work on discourse interpretation (Carletta, 1996; Carletta et al., 1997; Walker and Moore, 1997) has claimed that the judgements on which a theory is based should be shared by more than one subject. On the basis of previous linguistics and corpus linguistics work, we developed several annotation schemes and ran two experiments in which subjects were asked to annotate the uses of definite descriptions in newspaper articles. We compared their annotations and used them to develop our system and to evaluate its performance. Quantitative evaluation has become an issue in other language engineering tasks such as parsing, and has shown its usefulness also for theoretical developments. Recently, evaluation techniques have been introduced for semantic interpretation as well, as is the case for the Sixth Message Understanding Conference (MUC-6) (Sundheim, 1995). However, in this case, the emphasis was on the engineering aspects rather than on a careful study of the phenomena. Our goal has been to develop methods whose performance could be evaluated, but that were based on a careful study of linguistic evidence.
The empirical studies we present are evidence that definite descriptions are not primarily anaphoric; they are often used to introduce a new entity in the discourse. Therefore, in the model of definite description processing that we propose, recognising discourse-new descriptions plays a role as important as identifying the antecedent of those used anaphorically. Unlike most previous models, our system does not make use of specific hand-coded knowledge or common-sense reasoning techniques; the only lexical source we use is WordNet (Miller et al., 1993). As a consequence, our system can process definite descriptions in any domain; a drawback is that our coverage is limited. Nevertheless, our studies serve to reveal the kind of knowledge that is needed for resolving definite descriptions, especially the bridging cases. The system resulting from this work can be useful in applications such as semi-automatic coreference annotation in unrestricted domains.

    First Annual Workshop on Space Operations Automation and Robotics (SOAR 87)

    Several topics related to automation and robotics technology are discussed. Automation of checkout, ground support, and logistics; automated software development; man-machine interfaces; neural networks; systems engineering and distributed/parallel processing architectures; and artificial intelligence/expert systems are among the topics covered.

    Knowledge acquisition for coreference resolution

    This thesis addresses the problem of statistical coreference resolution. Theoretical studies describe coreference as a complex linguistic phenomenon, affected by various different factors. State-of-the-art statistical approaches, by contrast, rely on rather simple knowledge-poor modeling. This thesis aims at bridging the gap between the theory and the practice. We use insights from linguistic theory to identify relevant linguistic parameters of co-referring descriptions. We consider different types of information, from the most shallow name-matching measures to deeper syntactic, semantic, and discourse knowledge. We empirically assess the validity of the investigated theoretic predictions for the corpus data. Our data-driven evaluation experiments confirm that various linguistic parameters, suggested by theoretical studies, interact with coreference and may therefore provide valuable information for resolution systems. At the same time, our study raises several issues concerning the coverage of theoretic claims. It thus brings feedback to linguistic theory. We use the investigated knowledge sources to build a linguistically informed statistical coreference resolution engine. This framework allows us to combine the flexibility and robustness of a machine learning-based approach with a wide variety of data from different levels of linguistic description.
Our evaluation experiments with different machine learners show that our linguistically informed model, on the one hand, outperforms algorithms based on a single knowledge source and, on the other hand, yields the best result on the MUC-7 data reported in the literature (F-score of 65.4% with the SVM-light learning algorithm). The learning curves for our classifiers show no signs of convergence. This suggests that our approach makes a good basis for further experimentation: one can obtain even better results by annotating more material or by using the existing data more intelligently. Our study shows that statistical approaches to the coreference resolution task may and should benefit from linguistic theories: even imperfect knowledge, extracted from raw text data with off-the-shelf error-prone NLP modules, helps achieve significant improvements.