
    A neural network for semantic labelling of structured information

    Intelligent systems rely on rich sources of information to make informed decisions. Using information from external sources requires establishing correspondences between that information and known information classes. This can be achieved with semantic labelling, which assigns known labels to structured information by classifying it according to computed features. Existing proposals have explored different sets of features without focusing on which classification techniques are used. In this paper we present three contributions: first, insights on architectural issues that arise when using neural networks for semantic labelling; second, a novel implementation of semantic labelling that uses a state-of-the-art neural network classifier, which achieves significantly better results than four other traditional classifiers; third, a comparison of the results obtained by that network when using different subsets of features, comparing textual features to structural ones, and domain-dependent features to domain-independent ones. The experiments were carried out with datasets from three real-world sources. Our results show that there is a need to develop more semantic labelling proposals with sophisticated classification techniques and large feature catalogues. Ministerio de Economía y Competitividad TIN2016-75394-

    Improving Classification When a Class Hierarchy is Available Using a Hierarchy-Based Prior

    We introduce a new method for building classification models when we have prior knowledge of how the classes can be arranged in a hierarchy, based on how easily they can be distinguished. The new method uses a Bayesian form of the multinomial logit (MNL, a.k.a. "softmax") model, with a prior that introduces correlations between the parameters for classes that are nearby in the tree. We compare the performance on simulated data of the new method, the ordinary MNL model, and a model that uses the hierarchy in a different way. We also test the new method on a document labelling problem, and find that it performs better than the other methods, particularly when the amount of training data is small.
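
One way to picture a prior that correlates the parameters of nearby classes is sketched below. It assumes, as an illustration only and not necessarily the construction used in the paper, that each class parameter is the sum of independent zero-mean Gaussian terms attached to the branches on its root-to-leaf path. The tree `PATHS`, the class names, and all function names are hypothetical.

```python
import math
import random

# Hypothetical hierarchy: each class lists the branch nodes on its root-to-leaf path.
PATHS = {
    "A1": ["A", "A1"],
    "A2": ["A", "A2"],
    "B1": ["B", "B1"],
}

def prior_covariance(paths, branch_var=1.0):
    """Prior covariance of class parameters built as sums of independent
    zero-mean Gaussian branch terms, each with variance branch_var.
    Two classes covary through the branches they share."""
    return {
        (a, b): branch_var * len(set(paths[a]) & set(paths[b]))
        for a in paths
        for b in paths
    }

def sample_class_params(paths, rng, branch_sd=1.0):
    """Draw one scalar parameter per class by summing per-branch draws."""
    nodes = sorted({n for path in paths.values() for n in path})
    draws = {n: rng.gauss(0.0, branch_sd) for n in nodes}
    return {c: sum(draws[n] for n in path) for c, path in paths.items()}

def mnl_probs(params, x):
    """Multinomial logit (softmax) class probabilities for a scalar input x."""
    scores = {c: beta * x for c, beta in params.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}

cov = prior_covariance(PATHS)
params = sample_class_params(PATHS, random.Random(0))
probs = mnl_probs(params, x=1.5)
```

Under this assumed construction, `A1` and `A2` share the branch `A` and so have positive prior covariance, while `A1` and `B1` share no branch and are uncorrelated a priori.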

    Few-shot classification in Named Entity Recognition Task

    For many natural language processing (NLP) tasks the amount of annotated data is limited. This creates a need for semi-supervised learning techniques, such as transfer learning or meta-learning. In this work we tackle the Named Entity Recognition (NER) task using a Prototypical Network, a metric learning technique. It learns intermediate representations of words that cluster well into named entity classes. This property of the model allows classifying words with an extremely limited number of training examples, and can potentially be used as a zero-shot learning method. By coupling this technique with transfer learning we achieve well-performing classifiers trained on only 20 instances of a target class. Comment: In proceedings of the 34th ACM/SIGAPP Symposium on Applied Computin
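
The classification step of a prototypical network can be sketched as follows: each class prototype is the mean of its support embeddings, and a query is assigned to the nearest prototype. This is a minimal sketch that assumes the word embeddings have already been produced by a trained encoder; the toy 2-D vectors, the labels in `SUPPORT`, and the function names are hypothetical and not taken from the paper.

```python
def prototypes(support):
    """Map each label to the mean of its support embeddings."""
    protos = {}
    for label, vecs in support.items():
        dim = len(vecs[0])
        protos[label] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return protos

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def classify(query, protos):
    """Return the label whose prototype is closest to the query embedding."""
    return min(protos, key=lambda label: sq_dist(query, protos[label]))

# Hypothetical support set: a few embedded words per entity class.
SUPPORT = {
    "PER": [[1.0, 0.0], [0.8, 0.2]],
    "LOC": [[0.0, 1.0], [0.2, 0.8]],
    "O": [[0.0, 0.0], [0.1, 0.1]],
}

protos = prototypes(SUPPORT)
pred = classify([0.9, 0.1], protos)
```

Because only class means are stored, adding a new class requires just a handful of embedded examples, which is what makes the approach attractive when training data is scarce.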

    Selective sampling for combined learning from labelled and unlabelled data

    This paper examines the problem of selecting a suitable subset of data to be labelled when building pattern classifiers from labelled and unlabelled data. The selection of a representative set is guided by clustering information, and various options for allocating samples within clusters and their distributions are investigated. The experimental results show that hybrid methods such as semi-supervised clustering with selective sampling can produce a classifier that requires much less labelled data to achieve classification performance comparable to that of classifiers built only from labelled data.
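
Cluster-guided selection can be illustrated with the sketch below, which runs a plain k-means and then asks for labels only on the points closest to each centroid. This is one possible allocation rule chosen for illustration, not necessarily one the paper evaluates; the data in `POINTS`, the initial centroids, and the function names are hypothetical.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def assign(points, centroids):
    """Group point indices by their nearest centroid."""
    clusters = [[] for _ in centroids]
    for idx, p in enumerate(points):
        nearest = min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
        clusters[nearest].append(idx)
    return clusters

def kmeans(points, centroids, iters=10):
    """Plain k-means with caller-supplied initial centroids."""
    dim = len(points[0])
    for _ in range(iters):
        clusters = assign(points, centroids)
        centroids = [
            [sum(points[i][d] for i in members) / len(members) for d in range(dim)]
            if members else centroids[c]  # keep an empty cluster's centroid
            for c, members in enumerate(clusters)
        ]
    return centroids, assign(points, centroids)

def select_to_label(points, centroids, clusters, per_cluster=1):
    """Pick the per_cluster points nearest each centroid as labelling candidates."""
    chosen = []
    for centroid, members in zip(centroids, clusters):
        ranked = sorted(members, key=lambda i: sq_dist(points[i], centroid))
        chosen.extend(ranked[:per_cluster])
    return chosen

# Hypothetical unlabelled data forming two well-separated groups.
POINTS = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [5.0, 5.0], [5.2, 4.9], [4.9, 5.3]]

centroids, clusters = kmeans(POINTS, [POINTS[0], POINTS[3]])
chosen = select_to_label(POINTS, centroids, clusters)
```

Here `chosen` holds the indices of the points to send for labelling; raising `per_cluster` spends more of the labelling budget inside each cluster.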