6,123 research outputs found
Models of incremental concept formation
Given a set of observations, humans acquire concepts that organize those observations and use them in classifying future experiences. This type of concept formation can occur in the absence of a tutor, and it can take place despite irrelevant and incomplete information. A reasonable model of such human concept learning should be both incremental and capable of handling the kinds of complex experiences that people encounter in the real world. In this paper, we review three previous models of incremental concept formation and then present CLASSIT, a model that extends these earlier systems. All of the models integrate the processes of recognition and learning, and all can be viewed as carrying out search through the space of possible concept hierarchies. In an attempt to show that CLASSIT is a robust concept formation system, we also present some empirical studies of its behavior under a variety of conditions.
A survey of clustering methods
In this paper, I describe a large variety of clustering methods within a single framework. This paper unifies work across different fields, from biology (numerical taxonomy) to machine learning (concept formation). An important objective of this paper is to show that one can benefit from knowledge of research across different disciplines. After describing the task from a set of different viewpoints or paradigms, I begin by describing the similarity measures or evaluation functions that form the basis of any clustering technique. Next, I describe a number of different algorithms that use these measures, and I close with a brief discussion of ways to evaluate different approaches to clustering.
Hierarchical classification for multiple, distributed web databases
The proliferation of online information resources increases the importance of effective and efficient distributed searching. Our research aims to provide an alternative hierarchical categorization and search capability based on a Bayesian network learning algorithm. Our proposed approach, which is grounded in automatic textual analysis of the subject content of online web databases, attempts to address the database selection problem by first classifying web databases into a hierarchy of topic categories. The experimental results reported demonstrate that such a classification approach not only effectively reduces the class search space, but also helps to significantly improve the accuracy of classification performance.
Exploiting Transitivity in Probabilistic Models for Ontology Learning
Capturing word meaning is one of the challenges of natural language processing (NLP). Formal models of meaning such as ontologies are knowledge repositories used in a variety of applications. To be effectively used, these ontologies have to be large or, at least, adapted to specific domains. Our main goal is to contribute practically to the research on ontology learning models by covering different aspects of the task.
We propose probabilistic models for learning ontologies that expand existing ontologies, taking into account both corpus-extracted evidence and the structure of the generated ontologies. The model exploits structural properties of target relations, such as transitivity, during learning. We then propose two extensions of our probabilistic models: a model learned from a generic domain that can be exploited to extract new information in a specific domain, and an incremental ontology learning system that puts human validations in the learning loop. The latter provides a graphical user interface and a human-computer interaction workflow supporting the incremental learning loop.
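As a rough illustration of how transitivity can contribute evidence for a relation, the sketch below combines a direct, corpus-based probability for a pair with the support from two-step paths through an intermediate term. The noisy-or combination, the function name `combined_probability`, and the example scores are all assumptions made for illustration; the abstract does not state the model's actual formulation.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]


def combined_probability(direct: Dict[Pair, float], i: str, j: str) -> float:
    """Noisy-or of the direct score for (i, j) and every two-step path i -> k -> j.

    This is an assumed combination rule, not the formulation used in the paper.
    """
    nodes = {n for pair in direct for n in pair}
    # Probability that no source of evidence supports the relation (i, j).
    p_none = 1.0 - direct.get((i, j), 0.0)
    for k in nodes - {i, j}:
        # Support induced by transitivity through the intermediate term k.
        p_path = direct.get((i, k), 0.0) * direct.get((k, j), 0.0)
        p_none *= 1.0 - p_path
    return 1.0 - p_none


# Hypothetical direct probabilities for an is-a relation.
direct_scores = {
    ("dog", "mammal"): 0.9,
    ("mammal", "animal"): 0.8,
    ("dog", "animal"): 0.3,
}

p = combined_probability(direct_scores, "dog", "animal")
```

With these made-up scores, the weak direct evidence for ("dog", "animal") is reinforced by the strong path through "mammal", which is the intuition behind exploiting transitivity during learning.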
Semantic knowledge integration for learning from semantically imprecise data
Low availability of labeled training data often poses a fundamental limit to the accuracy of computer vision applications using machine learning methods. While these methods are improved continuously, e.g., through better neural network architectures, there cannot be a single methodical change that increases the accuracy on all possible tasks. This statement, known as the no free lunch theorem, suggests that we should consider aspects of machine learning other than learning algorithms for opportunities to escape the limits set by the available training data. In this thesis, we focus on two main aspects, namely the nature of the training data, where we introduce structure into the label set using concept hierarchies, and the learning paradigm, which we change in accordance with requirements of real-world applications as opposed to more academic setups.
Concept hierarchies represent semantic relations, which are sets of statements such as "a bird is an animal." We propose a hierarchical classifier to integrate this domain knowledge into a pre-existing task, thereby increasing the information the classifier has access to. While the hierarchy's leaf nodes correspond to the original set of classes, the inner nodes are "new" concepts that do not exist in the original training data. However, we posit that such "imprecise" labels are valuable and should occur naturally, e.g., as an annotator's way of expressing their uncertainty. Furthermore, the increased number of concepts leads to more possible search terms when assembling a web-crawled dataset or using an image search. We propose CHILLAX, a method that learns from semantically imprecise training data, while still offering precise predictions to integrate seamlessly into a pre-existing application.
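To make the idea of learning from an imprecise label concrete, here is a minimal sketch, assuming a classifier that outputs probabilities over leaf classes only. The toy hierarchy, the function names, and the choice of a negative log-likelihood over the summed leaf mass are assumptions for illustration; they are not taken from CHILLAX itself.

```python
import math
from typing import Dict, List

# Hypothetical hierarchy, child -> parent ("a bird is an animal").
PARENT = {"sparrow": "bird", "eagle": "bird", "bird": "animal", "cat": "animal"}


def ancestors(concept: str) -> List[str]:
    """Return the concept followed by all of its ancestors up to the root."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def concept_probability(leaf_probs: Dict[str, float], concept: str) -> float:
    """Probability mass of every leaf that is the concept or lies below it."""
    return sum(p for leaf, p in leaf_probs.items() if concept in ancestors(leaf))


def imprecise_label_loss(leaf_probs: Dict[str, float], label: str) -> float:
    """Negative log-likelihood of a label that may be an inner node."""
    return -math.log(concept_probability(leaf_probs, label))


# Made-up classifier output over the leaf classes.
leaf_probs = {"sparrow": 0.5, "eagle": 0.2, "cat": 0.3}

bird_mass = concept_probability(leaf_probs, "bird")
loss = imprecise_label_loss(leaf_probs, "bird")
# Predictions stay precise: the answer is always a leaf class.
prediction = max(leaf_probs, key=leaf_probs.get)
```

An annotation of "bird" constrains the model without forcing a choice between "sparrow" and "eagle", while the prediction remains one of the original leaf classes.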
Exploiting transitivity in probabilistic models for ontology learning
In natural language processing (NLP), capturing the meaning of words is one of the challenges that most interests researchers.
Semantic networks of words or concepts, which structure knowledge formally, are widely used in many applications.
To be used effectively, particularly in automatic learning methods, these semantic networks must be large or must at least structure knowledge of very specific domains.
Our main goal is to contribute to research on methods for learning semantic networks by focusing on different aspects.
We propose a new probabilistic model for creating or extending semantic networks that simultaneously takes into account both the evidence extracted from the corpus and the structure of the semantic network considered in training.
In particular, during learning our model exploits the structural properties, such as transitivity, of the relations that link the nodes of our network.
The probability that a given relation between two instances belongs to the semantic network depends on two probabilities: the direct probability estimated from the corpus evidence and the induced probability that derives from the structural properties of the relation under consideration.
The model we propose introduces some innovations in the estimation of these probabilities.
We also propose a model that can be used to learn knowledge in different domains of interest without a large additional adaptation effort.
In particular, in the approach we propose, a model is learned from a generic domain and then exploited to extract new knowledge in a specific domain.
Finally, we propose Semantic Turkey Ontology Learner (ST-OL): an incremental ontology learning system.
Through an ontology editor, ST-OL provides an efficient way to interact with the end user and to insert that user's decisions into the learning loop.
In addition, the probabilistic model integrated in ST-OL makes it possible to exploit the transitivity of relations to induce better extraction models.
Through experiments, we show that all the models we propose make a real contribution to the different tasks we consider, improving performance.
Capturing word meaning is one of the challenges of natural language processing (NLP). Formal models of meaning such as semantic networks of words or
concepts are knowledge repositories used in a variety of applications. To be effectively used, these networks have to be large or, at least, adapted to specific domains. Our main goal is to contribute practically to the research on semantic network learning models by covering different aspects of the task.
We propose a novel probabilistic model for learning semantic networks that expands existing semantic networks, taking into account both corpus-extracted evidence and the structure of the generated semantic networks. The model exploits structural properties of target relations, such as transitivity, during learning. The probability for a given relation instance to belong to the semantic network of words depends both on its direct probability and on the induced probability derived from the structural properties of the target relation. Our model presents some innovations in estimating these probabilities.
We also propose a model that can be used in different specific knowledge domains with little adaptation effort. In this approach, a model learned from a generic domain is exploited to extract new information in a specific domain.
Finally, we propose an incremental ontology learning system: Semantic Turkey Ontology Learner (ST-OL). ST-OL addresses two principal issues. The first issue is an efficient way to interact with final users and, then, to put the final users' decisions in the learning loop. We obtain this positive interaction using an ontology editor. The second issue is a probabilistic model for learning semantic networks of words that exploits transitive relations for inducing better extraction models. ST-OL provides a graphical user interface and a human-computer interaction workflow supporting the incremental learning loop of our model for learning semantic networks of words.
- …