6,123 research outputs found

    Exploiting Transitivity in Probabilistic Models for Ontology Learning

    Get PDF
    Capturing word meaning is one of the challenges of natural language processing (NLP). Formal models of meaning such as ontologies are knowledge repositories used in a variety of applications. To be effectively used, these ontologies have to be large or, at least, adapted to specific domains. Our main goal is to contribute practically to the research on ontology learning models by covering different aspects of the task. We propose probabilistic models for learning ontologies that expands existing ontologies taking into accounts both corpus-extracted evidences and structure of the generated ontologies. The model exploits structural properties of target relations such as transitivity during learning. We then propose two extensions of our probabilistic models: a model for learning from a generic domain that can be exploited to extract new information in a specific domain and an incremental ontology learning system that put human validations in the learning loop. This latter provides a graphical user interface and a human-computer interaction workflow supporting the incremental leaning loop

    Semantic knowledge integration for learning from semantically imprecise data

    Get PDF
    Low availability of labeled training data often poses a fundamental limit to the accuracy of computer vision applications using machine learning methods. While these methods are improved continuously, e.g., through better neural network architectures, there cannot be a single methodical change that increases the accuracy on all possible tasks. This statement, known as the no free lunch theorem, suggests that we should consider aspects of machine learning other than learning algorithms for opportunities to escape the limits set by the available training data. In this thesis, we focus on two main aspects, namely the nature of the training data, where we introduce structure into the label set using concept hierarchies, and the learning paradigm, which we change in accordance with requirements of real-world applications as opposed to more academic setups.Concept hierarchies represent semantic relations, which are sets of statements such as "a bird is an animal." We propose a hierarchical classifier to integrate this domain knowledge in a pre-existing task, thereby increasing the information the classifier has access to. While the hierarchy's leaf nodes correspond to the original set of classes, the inner nodes are "new" concepts that do not exist in the original training data. However, we pose that such "imprecise" labels are valuable and should occur naturally, e.g., as an annotator's way of expressing their uncertainty. Furthermore, the increased number of concepts leads to more possible search terms when assembling a web-crawled dataset or using an image search. We propose CHILLAX, a method that learns from semantically imprecise training data, while still offering precise predictions to integrate seamlessly into a pre-existing application

    Exploiting transitivity in probabilistic models for ontology learning

    Get PDF
    Nel natural language processing (NLP) catturare il significato delle parole è una delle sfide a cui i ricercatori sono largamente interessati. Le reti semantiche di parole o concetti, che strutturano in modo formale la conoscenza, sono largamente utilizzate in molte applicazioni. Per essere effettivamente utilizzate, in particolare nei metodi automatici di apprendimento, queste reti semantiche devono essere di grandi dimensioni o almeno strutturare conoscenza di domini molto specifici. Il nostro principale obiettivo è contribuire alla ricerca di metodi di apprendimento di reti semantiche concentrandosi in differenti aspetti. Proponiamo un nuovo modello probabilistico per creare o estendere reti semantiche che prende contemporaneamente in considerazine sia le evidenze estratte nel corpus sia la struttura della rete semantiche considerata nel training. In particolare il nostro modello durante l'apprendimento sfrutta le proprietà strutturali, come la transitività, delle relazioni che legano i nodi della nostra rete. La formulazione della probabilità che una data relazione tra due istanze appartiene alla rete semantica dipenderà da due probabilità: la probabilità diretta stimata delle evidenze del corpus e la probabilità indotta che deriva delle proprietà strutturali della relazione presa in considerazione. Il modello che proponiano introduce alcune innovazioni nella stima di queste probabilità. Proponiamo anche un modello che può essere usato per apprendere conoscenza in differenti domini di interesse senza un grande effort aggiuntivo per l'adattamento. In particolare, nell'approccio che proponiamo, si apprende un modello da un dominio generico e poi si sfrutta tale modello per estrarre nuova conoscenza in un dominio specifico. Infine proponiamo Semantic Turkey Ontology Learner (ST-OL): un sistema di apprendimento di ontologie incrementale. Mediante ontology editor, ST-OL fornisce un efficiente modo di interagire con l'utente finale e inserire le decisioni di tale utente nel loop dell'apprendimento. Inoltre il modello probabilistico integrato in ST-OL permette di sfruttare la transitività delle relazioni per indurre migliori modelli di estrazione. Mediante degli esperimenti dimostriamo che tutti i modelli che proponiamo danno un reale contributo ai differenti task che consideriamo migliorando le prestazioni.Capturing word meaning is one of the challenges of natural language processing (NLP). Formal models of meaning such as semantic networks of words or concepts are knowledge repositories used in a variety of applications. To be effectively used, these networks have to be large or, at least, adapted to specific domains. Our main goal is to contribute practically to the research on semantic networks learning models by covering different aspects of the task. We propose a novel probabilistic model for learning semantic networks that expands existing semantic networks taking into accounts both corpus-extracted evidences and the structure of the generated semantic networks. The model exploits structural properties of target relations such as transitivity during learning. The probability for a given relation instance to belong to the semantic networks of words depends both on its direct probability and on the induced probability derived from the structural properties of the target relation. Our model presents some innovations in estimating these probabilities. We also propose a model that can be used in different specific knowledge domains with a small effort for its adaptation. In this approach a model is learned from a generic domain that can be exploited to extract new informations in a specific domain. Finally, we propose an incremental ontology learning system: Semantic Turkey Ontology Learner (ST-OL). ST-OL addresses two principal issues. The first issue is an efficient way to interact with final users and, then, to put the final users decisions in the learning loop. We obtain this positive interaction using an ontology editor. The second issue is a probabilistic learning semantic networks of words model that exploits transitive relations for inducing better extraction models. ST-OL provides a graphical user interface and a human- computer interaction workflow supporting the incremental leaning loop of our learning semantic networks of words
    corecore