
    Automatic Population of Structured Reports from Narrative Pathology Reports

    Structured pathology reports offer several advantages: they help ensure the accuracy and completeness of pathology reporting, and they make it easier for referring doctors to glean pertinent information. The goal of this thesis is to extract pertinent information from free-text pathology reports, automatically populate structured reports for cancers, and identify the commonalities and differences in processing principles needed to obtain maximum accuracy. Three pathology corpora were annotated with entities and the relationships between them: the melanoma corpus, the colorectal cancer corpus and the lymphoma corpus. A supervised machine-learning-based approach, utilising conditional random fields learners, was developed to recognise medical entities in the corpora. Feature engineering yielded the best feature configurations, which boosted the F-scores significantly from 4.2% to 6.8% on the training sets. Without proper negation and uncertainty detection, the quality of the structured reports would be diminished, so negation and uncertainty detection modules were built to handle this problem. The modules obtained overall F-scores ranging from 76.6% to 91.0% on the test sets. A relation extraction system was developed to extract four relations from the lymphoma corpus. The system achieved very good performance on the training set, with a 100% F-score obtained by the rule-based module and a 97.2% F-score attained by the support vector machines classifier. Rule-based approaches were used to generate the structured outputs and populate predefined templates with them. The rule-based system attained F-scores of over 97% on the training sets. A pipeline system assembling all the components described above was implemented. It achieved promising results in the end-to-end evaluations, with F-scores of 86.5%, 84.2% and 78.9% on the melanoma, colorectal cancer and lymphoma test sets respectively.

    Irish dependency treebanking and parsing

    Despite enjoying the status of an official EU language, Irish is considered a minority language. As with most minority languages, it is a 'low-density' language, which means it lacks important linguistic and Natural Language Processing (NLP) resources. Relative to better-resourced languages such as English or French, little research has been carried out on computational analysis or processing of Irish. Parsing is the method of analysing the linguistic structure of text, and it is an invaluable processing step required by many different types of language technology applications. As a verb-initial language, Irish has several features that are uncharacteristic of many languages previously studied in parsing research. Our work broadens the application of NLP methods to less studied language structures and provides a basis on which future work in Irish NLP is possible. We report on the development of a dependency treebank that serves as training data for the first full Irish dependency parser. We discuss the linguistic structures of Irish and the motivation behind the design of our annotation scheme. Our work also examines various semi-automated approaches to treebank development. With these approaches we overcome the relatively small pool of linguistic and technological resources available for the Irish language, and show that even in the early stages of development, parsing results for Irish are promising. What counts as a sufficient number of trees for training a parser varies from language to language. Through empirical methods, we explore the impact that our treebank's size and content have on parsing accuracy for Irish. We also discuss our work in crosslingual studies through converting our treebank to a universal annotation scheme. Finally, we extend our Irish NLP work to the unstructured user-generated text of Irish tweets. We report on the creation of a POS-tagged corpus of Irish tweets and the training of statistical POS-tagging models. We show how existing resources can be leveraged for this domain-adapted resource development.

    Preface


    31st International Conference on Information Modelling and Knowledge Bases

    Information modelling is becoming an increasingly important topic for researchers, designers, and users of information systems. The amount and complexity of information itself, the number of abstraction levels of information, and the size of databases and knowledge bases are continuously growing. Conceptual modelling is one of the sub-areas of information modelling. The aim of this conference is to bring together experts from different areas of computer science and other disciplines who have a common interest in understanding and solving problems in information modelling and knowledge bases, as well as in applying the results of research to practice. We also aim to recognize and study new areas of modelling and knowledge bases to which more attention should be paid. Philosophy and logic, cognitive science, knowledge management, linguistics and management science are therefore relevant areas, too. The conference has three categories of presentations: full papers, short papers and position papers.

    Proceedings of the 2010 Annual Conference of the Gesellschaft für Semantik

    Sinn & Bedeutung, the annual conference of the Gesellschaft für Semantik, aims to bring together both established researchers and new blood working on current issues in natural language semantics, pragmatics, the syntax-semantics interface, or the philosophy of language, or carrying out psycholinguistic studies related to meaning. Every year, the conference moves to a different location in Europe. The 2010 conference, Sinn & Bedeutung 15, took place on September 9-11 at Saarland University, Saarbrücken, organized by the Department for German Studies.

    Generative Mesh Modeling

    Generative Modeling is an alternative approach to the description of three-dimensional shape. The basic idea is to represent a model not, as usual, by an agglomeration of geometric primitives (triangles, point clouds, NURBS patches), but by functions. The paradigm change from objects to geometry-generating operations allows for a procedural representation of procedural shapes, such as most man-made objects. Instead of storing only the result of a 3D construction, the construction process itself is stored in a model file. The generative approach opens truly new perspectives in many ways, among others for 3D knowledge management. It permits, for instance, resorting to a repository of already solved modeling problems in order to re-use this knowledge in similar but slightly varied situations. The construction knowledge can be collected in digital libraries containing domain-specific parametric modeling tools. A concrete realization of this approach is a new general description language for 3D models, the "Generative Modeling Language" GML. As a Turing-complete "shape programming language", it is a true generalization of existing primitive-based 3D model formats. Together with its runtime engine, the GML makes it possible to store highly complex 3D models in a compact form, to evaluate the description within fractions of a second, to adaptively tessellate and interactively display the model, and even to change the model's high-level parameters at runtime.
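    The idea of storing a construction process rather than its result can be illustrated with a minimal sketch. It is written in Python, not in GML syntax, and the names `make_prism`, `model` and `evaluate` are hypothetical, introduced only for this illustration. The "model" here is just an operation plus a few high-level parameters; the mesh is produced on demand and can be regenerated with changed parameters.

```python
import math

def make_prism(n_sides, radius, height):
    """Generate a closed prism mesh (vertices, faces) from three parameters."""
    # Bottom ring of vertices first, then the top ring.
    verts = []
    for z in (0.0, height):
        for i in range(n_sides):
            angle = 2.0 * math.pi * i / n_sides
            verts.append((radius * math.cos(angle), radius * math.sin(angle), z))
    # One quad per side, then the bottom and top caps as n-gons.
    faces = []
    for i in range(n_sides):
        j = (i + 1) % n_sides
        faces.append((i, j, n_sides + j, n_sides + i))
    faces.append(tuple(reversed(range(n_sides))))
    faces.append(tuple(range(n_sides, 2 * n_sides)))
    return verts, faces

# The stored "model" is the operation and its parameters, not the geometry.
model = {"op": make_prism, "params": {"n_sides": 6, "radius": 1.0, "height": 2.0}}

def evaluate(model, **overrides):
    """Run the stored construction, optionally overriding parameters."""
    params = {**model["params"], **overrides}
    return model["op"](**params)

verts, faces = evaluate(model)                    # hexagonal prism
verts32, faces32 = evaluate(model, n_sides=32)    # same model, finer shape
```

    The stored description stays the same size however finely the shape is evaluated; only the generated vertex and face lists grow.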

    Proceedings of the 11th Workshop on Nonmonotonic Reasoning

    These are the proceedings of the 11th Nonmonotonic Reasoning Workshop. The aim of this series is to bring together active researchers in the broad area of nonmonotonic reasoning (NMR), including belief revision, reasoning about actions, planning, logic programming, argumentation, causality, probabilistic and possibilistic approaches to knowledge representation (KR), and other related topics. As part of the program of the 11th workshop, we have assessed the status of the field and discussed issues such as: significant recent achievements in the theory and automation of NMR; critical short- and long-term goals for NMR; emerging new research directions in NMR; practical applications of NMR; and the significance of NMR to knowledge representation and AI in general.