1,186 research outputs found

    A case study in tagging case in German: an assessment of statistical approaches

    In this study, we assess the performance of purely statistical approaches using supervised machine learning for predicting case in German (nominative, accusative, dative, genitive, n/a). We experiment with two different treebanks containing morphological annotations: TIGER and TUEBA. An evaluation with 10-fold cross-validation serves as the basis for systematic comparisons of the optimal parametrizations of different approaches. We test taggers based on Hidden Markov Models (HMM), Decision Trees, and Conditional Random Fields (CRF). The CRF approach based on our hand-crafted feature model achieves an accuracy of about 94%. This outperforms all other approaches and results in an improvement of 11% compared to a baseline HMM trigram tagger and an improvement of 2% compared to a state-of-the-art tagger for rich morphological tagsets. Moreover, we investigate the effect of additional (morphological) categories (gender, number, person, part of speech) in the internal tagset used for the training. Rich internal tagsets improve results for all tested approaches

    Enhancing FunGramKB: Further Verbs of Feeling in English

    The present dissertation analyzes linguistic aspects of the lexical, semantic and syntactic behaviour of a number of verbs of FEELING in English whose lexical, grammatical and idiosyncratic properties have been entered into the FunGramKB Editor, applying the theoretical assumptions propounded by the Lexical-Constructional Model. Analysis and subsequent input of data have been assessed against the background of some of the 20th-century trends in linguistics which find their expression in the first decade of this century, and the role of semantics in a world in which increasing priority is given to probabilistic, machine-learned output in lexicographic work. From this stance, the generic features contained in the FunGramKB meaning postulates and thematic frames as outlined in the Lexical-Constructional Model bring hope for a more faithful rendering of the semantic relationships established within human expression, while making provisions for a semanticist's contribution to refinement and storage of both thorough and extensive knowledge.

    The impact of digital therapeutics on the pharmaceutical industry in the treatment of mental health disorders

    Digital alternatives to traditional treatment methods for mental health disorders are increasing. These digital therapeutics pose a threat to the traditional pharmaceutical company, as they could potentially replace some medications in treatment. While the digital therapeutics sector is growing, the economic implications of the technology for pharmaceutical companies are rarely discussed in academic literature. This dissertation aims to close the knowledge gap by providing ex-ante predictions of the disruption potential of digital therapeutics in the treatment of mental health disorders. A mixed research approach consisting of both expert interviews and a survey was used to collect market insights from both the digital therapeutics and pharmaceutical industries. In combination with knowledge gathered from current literature, predictions about the disruption potential of digital therapeutics in the mental health care market were made. The findings indicated a high disruption potential for digital therapeutics within the mental health care market. A complete displacement of pharmaceuticals is unlikely. For future research, additional medical specialties or competitive capabilities of pharmaceutical companies can be examined as the market environment constantly changes.

    The Effects of Instruction on Intermediate JLEs’ Prepositional Accuracy: An Exploratory Study

    Neural Techniques for German Dependency Parsing

    Syntactic parsing is the task of analyzing the structure of a sentence based on some predefined formal assumption. It is a key component in many natural language processing (NLP) pipelines and is of great benefit for natural language understanding (NLU) tasks such as information retrieval or sentiment analysis. Despite achieving very high results with neural network techniques, most syntactic parsing research pays attention to only a few prominent languages (such as English or Chinese) or language-agnostic settings. Thus, we still lack studies that focus on just one language and design specific parsing strategies for that language with regard to its linguistic properties. In this thesis, we take German as the language of interest and develop more accurate methods for German dependency parsing by combining state-of-the-art neural network methods with techniques that address the specific challenges posed by the language-specific properties of German. Compared to English, German has richer morphology, semi-free word order, and case syncretism. It is the combination of those characteristics that makes parsing German an interesting and challenging task. Because syntactic parsing is a task that requires many levels of language understanding, we propose to study and improve the knowledge of parsing models at each level in order to improve syntactic parsing for German. These levels are: (sub)word level, syntactic level, semantic level, and sentence level. At the (sub)word level, we look into a surge in out-of-vocabulary words in German data caused by compounding. We propose a new type of embedding for compounds that is a compositional model of the embeddings of individual components. Our experiments show that character-based embeddings are superior to word and compound embeddings in dependency parsing, and compound embeddings only outperform word embeddings when the part-of-speech (POS) information is unavailable. 
Thus, we conclude that it is the morpho-syntactic information of unknown compounds, not the semantic one, that is crucial for parsing German. At the syntactic level, we investigate challenges for local grammatical function labelers that are caused by case syncretism. In detail, we augment the grammatical function labeling component in a neural dependency parser that labels each head-dependent pair independently with a new labeler that includes a decision history, using Long Short-Term Memory networks (LSTMs). All our proposed models significantly outperformed the baseline on three languages: English, German and Czech. However, the impact of the new models is not the same for all languages: the improvement for English is smaller than for the non-configurational languages (German and Czech). Our analysis suggests that the success of the history-based models is not due to better handling of long dependencies but that they are better at dealing with the uncertainty in head direction. We study the interaction of syntactic parsing with the semantic level via the problem of PP attachment disambiguation. Our motivation is to provide a realistic evaluation of the task where gold information is not available and compare the results of disambiguation systems against the output of a strong neural parser. To the best of our knowledge, this is the first time that PP attachment disambiguation is evaluated and compared against neural dependency parsing on predicted information. In addition, we present a novel approach for PP attachment disambiguation that uses biaffine attention and utilizes pre-trained contextualized word embeddings as semantic knowledge. Our end-to-end system outperformed the previous pipeline approach on German by a large margin simply by avoiding error propagation caused by predicted information. In the end, we show that parsing systems (with the same semantic knowledge) are in general superior to systems specialized for PP attachment disambiguation. 
Lastly, we improve dependency parsing at the sentence level using reranking techniques. So far, previous work on neural reranking has been evaluated on English and Chinese only, both languages with a configurational word order and poor morphology. We re-assess the potential of successful neural reranking models from the literature on English and on two morphologically rich(er) languages, German and Czech. In addition, we introduce a new variation of a discriminative reranker based on graph convolutional networks (GCNs). Our proposed reranker not only outperforms previous models on English but is the only model that is able to improve results over the baselines on German and Czech. Our analysis points out that the failure is due to the lower quality of the k-best lists, where the gold tree ratio and the diversity of the list play an important role.

    Design and Construction of Semantic Document Networks Using Concept Extraction

    Processing of unstructured documents according to their content is required in many disciplines, e.g., machine translation, text analysis and mining, and information extraction and retrieval. Whilst research in fields like text analysis, conceptualisation, or design of semantic networks has progressed considerably over the last years, we still observe gaps between state-of-the-art algorithms to extract concepts from documents and how these concepts are linked effectively and efficiently. This paper proposes a framework to store processed documents in a specialised semantic network database to enhance retrieval and analysis of common concepts in documents. We apply natural language reduction to calculate semantic cores for the concept-based indexing of stored documents. The developed prototype demonstrates advanced document storage as well as fast (semantic) retrieval of documents based on given key concepts.