126,768 research outputs found

    Automatic detection of accommodation steps as an indicator of knowledge maturing

    Get PDF
    Jointly working on shared digital artifacts – such as wikis – is a well-tried method of developing knowledge collectively within a group or organization. Our assumption is that such knowledge maturing is an accommodation process that can be measured by taking the writing process itself into account. This paper describes the development of a tool that detects accommodation automatically with the help of machine learning algorithms. We applied a software framework for task detection to the automatic identification of accommodation processes within a wiki. To set up the learning algorithms and test its performance, we conducted an empirical study, in which participants had to contribute to a wiki and, at the same time, identify their own tasks. Two domain experts evaluated the participants’ micro-tasks with regard to accommodation. We then applied an ontology-based task detection approach that identified accommodation with a rate of 79.12%. The potential use of our tool for measuring knowledge maturing online is discussed

    Extracting protein-protein interactions from text using rich feature vectors and feature selection

    Get PDF
    Because of the intrinsic complexity of natural language, automatically extracting accurate information from text remains a challenge. We have applied rich featurevectors derived from dependency graphs to predict protein-protein interactions using machine learning techniques. We present the first extensive analysis of applyingfeature selection in this domain, and show that it can produce more cost-effective models. For the first time, our technique was also evaluated on several large-scalecross-dataset experiments, which offers a more realistic view on model performance. During benchmarking, we encountered several fundamental problems hindering comparability with other methods. We present a set of practical guidelines to set up ameaningful evaluation. Finally, we have analysed the feature sets from our experiments before and after feature selection, and evaluated the contribution of both lexical and syntacticinformation to our method. The gained insight will be useful to develop better performing methods in this domain

    Learning Deep Latent Spaces for Multi-Label Classification

    Full text link
    Multi-label classification is a practical yet challenging task in machine learning related fields, since it requires the prediction of more than one label category for each input instance. We propose a novel deep neural networks (DNN) based model, Canonical Correlated AutoEncoder (C2AE), for solving this task. Aiming at better relating feature and label domain data for improved classification, we uniquely perform joint feature and label embedding by deriving a deep latent space, followed by the introduction of label-correlation sensitive loss function for recovering the predicted label outputs. Our C2AE is achieved by integrating the DNN architectures of canonical correlation analysis and autoencoder, which allows end-to-end learning and prediction with the ability to exploit label dependency. Moreover, our C2AE can be easily extended to address the learning problem with missing labels. Our experiments on multiple datasets with different scales confirm the effectiveness and robustness of our proposed method, which is shown to perform favorably against state-of-the-art methods for multi-label classification.Comment: published in AAAI-201
    corecore