Inductive probabilistic taxonomy learning using singular value decomposition
Capturing word meaning is one of the challenges of natural language processing (NLP).
Formal models of meaning, such as networks of words or concepts, are knowledge repositories
used in a variety of applications. To be effectively used, these networks have to be large or, at
least, adapted to specific domains. Learning word meaning from texts is then an active area
of research. Lexico-syntactic pattern methods are one of the possible solutions. Yet, these
models do not use structural properties of target semantic relations, e.g. transitivity, during
learning. In this paper, we propose a novel lexico-syntactic pattern probabilistic method
for learning taxonomies that explicitly models transitivity and naturally exploits vector space
model techniques for reducing space dimensions. We define two probabilistic models: the
direct probabilistic model and the induced probabilistic model. The first is directly estimated
on observations over text collections. The second uses transitivity on the direct probabilistic
model to induce probabilities of derived events. Within our probabilistic model, we also
propose a novel way of using singular value decomposition as an unsupervised method for
feature selection in estimating direct probabilities. We empirically show that the induced
probabilistic taxonomy learning model outperforms state-of-the-art probabilistic models and
that our unsupervised feature selection method improves performance.
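A minimal sketch of the kind of pipeline the abstract describes, under loud assumptions: the pattern-count matrix is synthetic, and the logistic squashing used to turn similarity scores into "direct" probabilities is an illustrative choice, not the paper's actual estimator. The point is only to show truncated SVD acting as unsupervised feature selection before probability estimation.

```python
# Sketch: truncated SVD as unsupervised feature reduction before
# estimating direct probabilities from lexico-syntactic pattern counts.
# Matrix contents, rank k, and the squashing are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
# Rows: candidate (word, word) pairs; columns: pattern-occurrence counts.
counts = rng.poisson(lam=1.0, size=(6, 8)).astype(float)

# Keep only the k strongest latent dimensions.
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
k = 2
reduced = U[:, :k] * s[:k]          # pair representations in k dimensions

# Illustrative direct-probability estimate: squash pairwise similarity
# scores into [0, 1] with a logistic function.
scores = reduced @ reduced.T
direct_p = 1.0 / (1.0 + np.exp(-scores))
print(direct_p.shape)               # one probability per pair of pairs
```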
Exploiting transitivity in probabilistic models for ontology learning
Capturing word meaning is one of the challenges of natural language processing (NLP). Formal models of meaning such as semantic networks of words or
concepts are knowledge repositories used in a variety of applications. To be
effectively used, these networks have to be large or, at least, adapted to specific
domains. Our main goal is to contribute practically to the research on semantic
networks learning models by covering different aspects of the task.
We propose a novel probabilistic model for learning semantic networks that
expands existing semantic networks, taking into account both corpus-extracted
evidence and the structure of the generated semantic networks. The model exploits structural properties of target relations, such as transitivity, during learning. The probability that a given relation instance belongs to the semantic
network of words depends both on its direct probability and on the induced
probability derived from the structural properties of the target relation. Our
model presents some innovations in estimating these probabilities.
We also propose a model that can be used in different specific knowledge
domains with little adaptation effort. In this approach, a model is
learned from a generic domain and then exploited to extract new information
in a specific domain.
Finally, we propose an incremental ontology learning system: Semantic
Turkey Ontology Learner (ST-OL). ST-OL addresses two principal issues. The
first issue is an efficient way to interact with end users and, in turn, to put the
end users' decisions in the learning loop. We obtain this positive interaction
using an ontology editor. The second issue is a probabilistic model for learning
semantic networks of words that exploits transitive relations to induce better
extraction models. ST-OL provides a graphical user interface and a human-computer
interaction workflow supporting the incremental learning loop of our
semantic network learning model.
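The induced probability described above can be sketched in a few lines. The toy edge probabilities and the noisy-OR rule for combining direct evidence with two-step transitive paths are illustrative assumptions, not the thesis' exact formulation.

```python
# Sketch: inducing the probability of a derived is-a edge from direct
# edge probabilities via transitivity. The noisy-OR combination over
# intermediate nodes is an illustrative choice, not the thesis' rule.
direct = {
    ("dog", "mammal"): 0.9,
    ("mammal", "animal"): 0.8,
    ("dog", "animal"): 0.3,   # weak direct corpus evidence
}
nodes = {"dog", "mammal", "animal"}

def induced(a, c):
    """Combine direct evidence with two-step transitive paths (noisy-OR)."""
    p_not = 1.0 - direct.get((a, c), 0.0)
    for b in nodes - {a, c}:
        p_path = direct.get((a, b), 0.0) * direct.get((b, c), 0.0)
        p_not *= 1.0 - p_path
    return 1.0 - p_not

print(round(induced("dog", "animal"), 3))  # → 0.804
```

Transitivity lifts the weak direct estimate (0.3) because the path through "mammal" carries strong evidence.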
Transforming Graph Representations for Statistical Relational Learning
Relational data representations have become an increasingly important topic
due to the recent proliferation of network datasets (e.g., social, biological,
information networks) and a corresponding increase in the application of
statistical relational learning (SRL) algorithms to these domains. In this
article, we examine a range of representation issues for graph-based relational
data. Since the choice of relational data representation for the nodes, links,
and features can dramatically affect the capabilities of SRL algorithms, we
survey approaches and opportunities for relational representation
transformation designed to improve the performance of these algorithms. This
leads us to introduce an intuitive taxonomy for data representation
transformations in relational domains that incorporates link transformation and
node transformation as symmetric representation tasks. In particular, the
transformation tasks for both nodes and links include (i) predicting their
existence, (ii) predicting their label or type, (iii) estimating their weight
or importance, and (iv) systematically constructing their relevant features. We
motivate our taxonomy through detailed examples and use it to survey and
compare competing approaches for each of these tasks. We also discuss general
conditions for transforming links, nodes, and features. Finally, we highlight
challenges that remain to be addressed.
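One of the link-transformation tasks in the taxonomy above, predicting link existence, can be sketched with a classic common-neighbors heuristic. The toy graph and the scoring rule are illustrative assumptions, not drawn from the article itself.

```python
# Sketch: link-existence prediction (one link-transformation task) via a
# common-neighbors score on a toy undirected graph.
from itertools import combinations

edges = {("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")}
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def common_neighbors(u, v):
    """More shared neighbors -> stronger evidence for a missing link."""
    return len(adj[u] & adj[v])

non_edges = [
    (u, v) for u, v in combinations(sorted(adj), 2)
    if (u, v) not in edges and (v, u) not in edges
]
ranked = sorted(non_edges, key=lambda e: common_neighbors(*e), reverse=True)
print(ranked[0])  # ('a', 'd'): shares neighbors b and c
```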
A mathematical theory of semantic development in deep neural networks
An extensive body of empirical research has revealed remarkable regularities
in the acquisition, organization, deployment, and neural representation of
human semantic knowledge, thereby raising a fundamental conceptual question:
what are the theoretical principles governing the ability of neural networks to
acquire, organize, and deploy abstract knowledge by integrating across many
individual experiences? We address this question by mathematically analyzing
the nonlinear dynamics of learning in deep linear networks. We find exact
solutions to this learning dynamics that yield a conceptual explanation for the
prevalence of many disparate phenomena in semantic cognition, including the
hierarchical differentiation of concepts through rapid developmental
transitions, the ubiquity of semantic illusions between such transitions, the
emergence of item typicality and category coherence as factors controlling the
speed of semantic processing, changing patterns of inductive projection over
development, and the conservation of semantic similarity in neural
representations across species. Thus, surprisingly, our simple neural model
qualitatively recapitulates many diverse regularities underlying semantic
development, while providing analytic insight into how the statistical
structure of an environment can interact with nonlinear deep learning dynamics
to give rise to these regularities.
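The central dynamical claim above can be illustrated numerically, with the caveat that the data, network sizes, and learning rate here are arbitrary choices and the exact-solution analysis lives in the paper: a two-layer linear network trained by gradient descent picks up the singular modes of the input-output correlation matrix, strongest first, and a rank-limited hidden layer cannot fit the weakest mode.

```python
# Sketch: deep linear network learning the singular modes of the
# input-output correlation matrix. Data and hyperparameters are arbitrary.
import numpy as np

rng = np.random.default_rng(1)
X = np.eye(4)                        # one-hot items
Y = rng.normal(size=(5, 4))          # item -> feature targets
U, s, Vt = np.linalg.svd(Y @ X.T)    # modes of the input-output correlation

W1 = 0.01 * rng.normal(size=(3, 4))  # two-layer linear net, small random init
W2 = 0.01 * rng.normal(size=(5, 3))
lr = 0.05
for _ in range(1000):
    E = Y - W2 @ W1 @ X              # full-batch prediction error
    W2, W1 = W2 + lr * E @ (W1 @ X).T, W1 + lr * W2.T @ E @ X.T

# Per-mode strength of the learned map, relative to the target strength:
# values near 1 are fully learned; the 4th mode cannot fit (hidden rank 3).
ratios = np.diag(U.T @ (W2 @ W1) @ Vt.T) / s
print(np.round(ratios, 2))
```

The stage-like transitions the paper analyzes correspond to each mode's strength following a sigmoidal trajectory whose timescale shrinks with its singular value.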
Exploiting Transitivity in Probabilistic Models for Ontology Learning
Capturing word meaning is one of the challenges of natural language processing (NLP). Formal models of meaning such as ontologies are knowledge repositories used in a variety of applications. To be effectively used, these ontologies have to be large or, at least, adapted to specific domains. Our main goal is to contribute practically to the research on ontology learning models by covering different aspects of the task.
We propose probabilistic models for learning ontologies that expand existing ontologies, taking into account both corpus-extracted evidence and the structure of the generated ontologies. The models exploit structural properties of target relations, such as transitivity, during learning. We then propose two extensions of our probabilistic models: a model learned from a generic domain that can be exploited to extract new information in a specific domain, and an incremental ontology learning system that puts human validation in the learning loop. The latter provides a graphical user interface and a human-computer interaction workflow supporting the incremental learning loop.
A Taxonomy of Information Retrieval Models and Tools
Information retrieval is attracting significant attention due to the exponential growth of the amount of information available in digital format. The proliferation of information retrieval objects, including algorithms, methods, technologies, and tools, makes it difficult to assess their capabilities and features and to understand the relationships that exist among them. In addition, the terminology is often confusing and misleading, as different terms are used to denote the same, or similar, tasks.
This paper proposes a taxonomy of information retrieval models and tools and provides precise definitions for the key terms. The taxonomy superimposes two views: a vertical taxonomy, which classifies IR models with respect to a set of basic features, and a horizontal taxonomy, which classifies IR systems and services with respect to the tasks they support.
The aim is to provide a framework for classifying existing information retrieval models and tools and a solid basis for assessing future developments in the field.