7 research outputs found

    Terminology extraction from medical texts in Polish


    Gamifying Language Resource Acquisition

    PhD Thesis. Natural Language Processing is an important collection of methods for processing the vast amounts of natural language text we continually produce. These methods make use of supervised learning, an approach that learns from large amounts of annotated data. As humans, we’re able to provide information about text that such systems can learn from. Historically, this annotation was carried out by small groups of experts, which did not scale. This led to various crowdsourcing approaches that used large pools of non-experts. The traditional form of crowdsourcing was to pay users small amounts of money to complete tasks. As time progressed, gamification approaches such as GWAPs showed various benefits over the micro-payment methods used before. These included cost savings, worker training opportunities, increased worker engagement and the potential to far exceed the scale of paid crowdsourcing. While these approaches were successful in domains such as image labelling, they struggled in the domain of text annotation, which wasn’t such a natural fit. Despite many challenges, there were also clearly many opportunities and benefits to applying this approach to text annotation. Many of these are demonstrated by Phrase Detectives. Based on lessons learned from Phrase Detectives and investigations into other GWAPs, in this work we attempt to create full GWAPs for NLP, extracting the benefits of the methodology. These include training, high-quality output from non-experts and a truly game-like GWAP design that players are happy to play voluntarily.

    Tune your brown clustering, please

    Brown clustering, an unsupervised hierarchical clustering technique based on n-gram mutual information, has proven useful in many NLP applications. However, most uses of Brown clustering employ the same default configuration; the appropriateness of this configuration has gone predominantly unexplored. Accordingly, we present information for practitioners on the behaviour of Brown clustering in order to assist hyper-parameter tuning, in the form of a theoretical model of Brown clustering utility. This model is then evaluated empirically in two sequence labelling tasks over two text types. We explore the dynamic between the input corpus size, the chosen number of classes, and the quality of the resulting clusters, which has an impact on any approach using Brown clustering. In every scenario that we examine, our results reveal that the values most commonly used for the clustering are sub-optimal.

    Functional Ceramic Coatings

    Ceramic materials in the form of coatings can significantly improve the functionality and applications of other engineering materials. Due to a wide range of controllable features and various deposition methods, it is possible to create tailored substrate–coating systems that meet the requirements of modern technologies. Therefore, it is crucial to understand the relationships between the structure, morphology and properties of ceramic coatings and to expand the base of scientific knowledge about them. This book contains a series of fourteen articles which present research on the production and properties of ceramic coatings designed to improve functionality for advanced applications.

    Study on open science: The general state of the play in Open Science principles and practices at European life sciences institutes

    Open science is currently a hot topic at all levels and is also one of the priorities of the European Research Area. Components that are commonly associated with open science are open access, open data, open methodology, open source, open peer review, open science policies and citizen science. Open science may have great potential to connect and influence the practices of researchers, funding institutions and the public. In this paper, we evaluate the level of openness based on public surveys at four European life sciences institutes.

    Optimisation Approach to the Construction of the Polish Morphological Guesser
