Exploiting Transitivity in Probabilistic Models for Ontology Learning
Capturing word meaning is one of the challenges of natural language processing (NLP). Formal models of meaning such as ontologies are knowledge repositories used in a variety of applications. To be used effectively, these ontologies have to be large or, at least, adapted to specific domains. Our main goal is to contribute practically to research on ontology learning models by covering different aspects of the task.
We propose probabilistic models for learning ontologies that expand existing ontologies, taking into account both corpus-extracted evidence and the structure of the generated ontologies. The models exploit structural properties of target relations, such as transitivity, during learning. We then propose two extensions of our probabilistic models: a model learned from a generic domain that can be exploited to extract new information in a specific domain, and an incremental ontology learning system that puts human validation in the learning loop. The latter provides a graphical user interface and a human-computer interaction workflow supporting the incremental learning loop.
Exploiting transitivity in probabilistic models for ontology learning
Capturing word meaning is one of the challenges of natural language processing (NLP). Formal models of meaning, such as semantic networks of words or concepts, are knowledge repositories used in a variety of applications. To be used effectively, particularly by automatic learning methods, these networks have to be large or, at least, adapted to specific domains. Our main goal is to contribute practically to research on models for learning semantic networks by covering different aspects of the task.
We propose a novel probabilistic model that creates or expands semantic networks, taking into account both corpus-extracted evidence and the structure of the generated networks. The model exploits structural properties of target relations, such as transitivity, during learning. The probability that a given relation instance belongs to the semantic network depends both on its direct probability, estimated from corpus evidence, and on the induced probability derived from the structural properties of the target relation. Our model introduces some innovations in estimating these probabilities.
We also propose a model that can be used in different specific knowledge domains with little adaptation effort. In this approach, a model is learned from a generic domain and then exploited to extract new information in a specific domain.
Finally, we propose an incremental ontology learning system: Semantic Turkey Ontology Learner (ST-OL). ST-OL addresses two principal issues. The first is an efficient way to interact with end users and to put their decisions in the learning loop; we obtain this interaction through an ontology editor. The second is a probabilistic model for learning semantic networks of words that exploits transitive relations to induce better extraction models. ST-OL provides a graphical user interface and a human-computer interaction workflow supporting the incremental learning loop of this model.
Through experiments, we show that all the proposed models improve performance on the tasks considered.
Enhanced services for targeted information retrieval by event extraction and data mining
Where Information Retrieval (IR) and Text Categorization deliver a set of (ranked) documents according to a query, users of large document collections would rather receive answers. Question answering from text was already the goal of the Message Understanding Conferences. Since then, the task of text understanding has been reduced to several more tractable tasks, most prominently Named Entity Recognition (NER) and Relation Extraction. Now, the pieces can be put together to form enhanced services added to an IR system. In this paper, we present a framework which combines standard IR with machine learning and (pre-)processing for NER in order to extract events from a large document collection. Some questions can already be answered by particular events. Other questions require an analysis of a set of events. Hence, the extracted events become input to another machine learning process which delivers the final answer to the user's question. Our case study is the public collection of minutes of plenary sessions of the German parliament and of petitions to the German parliament.
Effective Spoken Language Labeling with Deep Recurrent Neural Networks
Understanding spoken language is a highly complex problem, which can be
decomposed into several simpler tasks. In this paper, we focus on Spoken
Language Understanding (SLU), the module of spoken dialog systems responsible
for extracting a semantic interpretation from the user utterance. The task is
treated as a labeling problem. In the past, SLU has been performed with a wide
variety of probabilistic models. The rise of neural networks, in the last
couple of years, has opened new interesting research directions in this domain.
Recurrent Neural Networks (RNNs) in particular are able not only to represent
several pieces of information as embeddings but also, thanks to their recurrent
architecture, to encode as embeddings relatively long contexts. Such long
contexts are in general out of reach for models previously used for SLU. In
this paper we propose novel RNN architectures for SLU which outperform
previous ones. Starting from a published idea as a base block, we design new deep
RNNs achieving state-of-the-art results on two widely used corpora for SLU:
ATIS (Air Travel Information System), in English, and MEDIA (hotel
information and reservation in France), in French.
Comment: 8 pages. Rejected from IJCAI 2017, good remarks overall, but slightly
off-topic according to the global meta-reviews. Recommendations: 8, 6, 6, 4. arXiv
admin note: text overlap with arXiv:1706.0174
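The abstract above treats SLU as a labeling problem. As a minimal sketch of what that means in practice, the following Python function groups per-word labels into a semantic interpretation. The BIO tagging convention, the label names `fromloc` and `toloc`, and the example utterance are illustrative assumptions, not details taken from the abstract; the RNN that would predict the labels is not shown.

```python
def slots_from_bio(tokens, labels):
    """Group BIO-labeled tokens into (slot_name, text) pairs.

    Assumes labels of the form "B-name", "I-name" or "O" (an assumption
    made for this sketch, not a detail given in the abstract).
    """
    slots = []
    current_name, current_tokens = None, []
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            # A new slot starts; close the previous one if it is open.
            if current_name is not None:
                slots.append((current_name, " ".join(current_tokens)))
            current_name, current_tokens = label[2:], [token]
        elif label.startswith("I-") and current_name == label[2:]:
            # Continuation of the currently open slot.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent I- tag closes any open slot.
            if current_name is not None:
                slots.append((current_name, " ".join(current_tokens)))
            current_name, current_tokens = None, []
    if current_name is not None:
        slots.append((current_name, " ".join(current_tokens)))
    return slots


# Hypothetical utterance and labels, for illustration only.
tokens = "flights from boston to new york".split()
labels = ["O", "O", "B-fromloc", "O", "B-toloc", "I-toloc"]
interpretation = slots_from_bio(tokens, labels)
```

With these hypothetical inputs, `interpretation` holds one pair per slot, with the multi-word destination kept together as a single value.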
Design of an E-learning system using semantic information and cloud computing technologies
Humanity is currently suffering from many difficult problems that threaten the life and survival of the human race. It is very easy for all mankind to be affected, directly or indirectly, by these problems. Education is a key solution for most of them. In our thesis we tried to make use of current technologies to enhance and ease the learning process.
We have designed an e-learning system based on semantic information and cloud computing, in addition to many other technologies that contribute to improving the educational process and raising the level of students. The design was built after much research on useful technology, its types, and examples of actual systems that were previously discussed by other researchers.
In addition to the proposed design, an algorithm was implemented to identify topics found in large textual educational resources. It was tested and proved to be efficient compared with other methods. The algorithm can extract the main topics from textual learning resources, link related resources, and generate interactive dynamic knowledge graphs. It accomplishes these tasks accurately and efficiently even for larger books. We used Wikipedia Miner, TextRank, and Gensim within our algorithm. Our algorithm's accuracy was evaluated against Gensim and largely improved on it.
Augmenting the system design with the implemented algorithm will produce many useful services for improving the learning process, such as: automatically identifying the main topics of large textual learning resources and connecting them to other well-defined concepts from Wikipedia, enriching current learning resources with semantic information from external sources, providing students with browsable, dynamic, interactive knowledge graphs, and making use of learning groups to encourage students to share their learning experiences and feedback with other learners.
Doctoral Programme in Telematics Engineering, Universidad Carlos III de Madrid. President: Luis Sánchez Fernández. Secretary: Luis de la Fuente Valentín. Member: Norberto Fernández Garcí
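The thesis abstract above names TextRank as one component of its topic identification algorithm. The following is a minimal pure-Python sketch of TextRank-style keyword ranking (PageRank over a word co-occurrence graph). It is not the thesis's algorithm: the function name, parameters, and toy word list are hypothetical, and stop-word filtering, part-of-speech filtering, and the Wikipedia Miner and Gensim steps are omitted.

```python
from collections import defaultdict


def textrank_keywords(words, window=2, damping=0.85, iterations=50, top_k=3):
    """Rank words by PageRank over an undirected co-occurrence graph."""
    neighbors = defaultdict(set)
    for i, word in enumerate(words):
        # Link each word to the next `window` words that follow it.
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        # Each word receives an equal share of every neighbor's score.
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy input (hypothetical): "graph" co-occurs with every other word.
words = ["topic", "graph", "book", "graph", "page", "graph", "link"]
top = textrank_keywords(words, window=1, top_k=1)
```

A fixed iteration count is used instead of a convergence check to keep the sketch short; a real implementation would normally stop once the scores change by less than a tolerance.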
Inductive probabilistic taxonomy learning using singular value decomposition
Capturing word meaning is one of the challenges of natural language processing (NLP).
Formal models of meaning, such as networks of words or concepts, are knowledge repositories
used in a variety of applications. To be effectively used, these networks have to be large or, at
least, adapted to specific domains. Learning word meaning from texts is then an active area
of research. Lexico-syntactic pattern methods are one of the possible solutions. Yet, these
models do not use structural properties of target semantic relations, e.g. transitivity, during
learning. In this paper, we propose a novel lexico-syntactic pattern probabilistic method
for learning taxonomies that explicitly models transitivity and naturally exploits vector space
model techniques for reducing space dimensions. We define two probabilistic models: the
direct probabilistic model and the induced probabilistic model. The first is directly estimated
on observations over text collections. The second uses transitivity on the direct probabilistic
model to induce probabilities of derived events. Within our probabilistic model, we also
propose a novel way of using singular value decomposition as an unsupervised method for
feature selection in estimating direct probabilities. We empirically show that the induced
probabilistic taxonomy learning model outperforms state-of-the-art probabilistic models and
that our unsupervised feature selection method improves performance.
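The abstract above distinguishes a direct probabilistic model from an induced one that applies transitivity to the direct estimates. The sketch below shows one way such a combination could work, assuming a noisy-or over the direct edge and every two-step path, each treated as an independent source of evidence. That combination rule, the function name, and the toy probabilities are assumptions made for illustration; the paper's actual formulation may differ.

```python
from itertools import product


def induced_probabilities(direct, nodes):
    """Combine direct evidence with one-step transitive evidence (noisy-or).

    direct: dict mapping (i, j) to the direct probability that i R j holds.
    Assumption of this sketch: the direct edge and each two-step path
    i -> k -> j are independent sources of evidence for i R j.
    """
    induced = {}
    for i, j in product(nodes, repeat=2):
        if i == j:
            continue
        # Probability that no source of evidence supports i R j.
        p_none = 1.0 - direct.get((i, j), 0.0)
        for k in nodes:
            if k in (i, j):
                continue
            p_path = direct.get((i, k), 0.0) * direct.get((k, j), 0.0)
            p_none *= 1.0 - p_path
        induced[(i, j)] = 1.0 - p_none
    return induced


# Toy direct probabilities (hypothetical values, not results from the paper).
nodes = ["dog", "mammal", "animal"]
direct = {
    ("dog", "mammal"): 0.9,
    ("mammal", "animal"): 0.8,
    ("dog", "animal"): 0.3,
}
induced = induced_probabilities(direct, nodes)
```

In this toy case the weak direct evidence for (dog, animal) is reinforced by the path through "mammal": the induced value is 1 - (1 - 0.3)(1 - 0.9 × 0.8) = 0.804, while pairs with no supporting path keep their direct probability.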
Ontology-based personalisation of e-learning resources for disabled students
Students with disabilities are often expected to use e-learning systems to access learning materials, but most systems do not provide appropriate adaptation or personalisation to meet their needs.
The difficulties related to the inadaptability of current learning environments can now be resolved using semantic web technologies such as web ontologies, which have been successfully used to drive e-learning personalisation. Nevertheless, e-learning personalisation for students with disabilities has mainly targeted those with single disabilities such as dyslexia or visual impairment, often neglecting those with multiple disabilities due to the difficulty of designing for a combination of disabilities.
This thesis argues that it is possible to personalise learning materials for learners with disabilities, including those with multiple disabilities. This is achieved by developing a model that allows the learning environment to present the student with learning materials in suitable formats while considering their disability and learning needs through an ontology-driven and disability-aware personalised e-learning system model (ONTODAPS). A disability ontology known as the Abilities and Disabilities Ontology for Online LEarning and Services (ADOOLES) is developed and used to drive this model. To test the above hypothesis, case studies are employed to show how the model functions for various individuals with and without disabilities, and the implemented visual interface is then evaluated experimentally by eighteen students with disabilities and heuristically by ten lecturers. The results are collected and statistically analysed.
The results obtained confirm the above hypothesis and suggest that ONTODAPS can be effectively employed to personalise learning and to manage learning resources. The student participants found that ONTODAPS could aid their learning experience, and all agreed that they would like to use this functionality in an existing learning environment.
The results also suggest that ONTODAPS provides a platform where students with disabilities can have a learning experience equivalent to that of their peers without disabilities. For the results to be generalised, this study could be extended through further experiments with more diverse groups of students with disabilities and across multiple educational institutions.
Analyzing the Semantic Relatedness of Paper Abstracts: An Application to the Educational Research Field
Each domain, along with its knowledge base, changes over time, and every timeframe is centered on specific topics that emerge from different ongoing research projects. As searching for relevant resources is a time-consuming process, the automatic extraction of the most important and relevant articles from a domain becomes essential in supporting researchers in their day-to-day activities. The proposed analysis extends previous research focused on extracting co-citations between papers, with the purpose of comparing their overall importance within the domain from a semantic perspective. Our method focuses on the semantic analysis of paper abstracts by using Natural Language Processing (NLP) techniques such as Latent Semantic Analysis, Latent Dirichlet Allocation, or specific ontology distances, i.e., WordNet. Moreover, the defined mechanisms are applied to two different subdomains from the corpora generated around the keywords "e-learning" and "computer". Graph visual representations are used to highlight the keywords of each subdomain, links among concepts and between articles, as well as specific document similarity views, or scores reflecting the keyword-abstract overlaps. Finally, conclusions and future improvements are presented, emphasizing the key elements of our research support framework.
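The abstract above compares paper abstracts semantically using LSA, LDA, or WordNet-based distances. As a much simpler stand-in for those techniques, the sketch below scores two abstracts by the cosine similarity of their raw term-frequency vectors. The function names and the tokenization rule are assumptions made for this illustration; the sketch does not reproduce the paper's method and captures only literal word overlap, not latent topics or ontology distances.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercase the text and count alphabetic tokens (a crude tokenizer)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical mini-abstracts, for illustration only.
first = term_vector("semantic analysis of e-learning abstracts")
second = term_vector("semantic analysis of computer abstracts")
third = term_vector("parliament petitions")
overlap_score = cosine_similarity(first, second)
disjoint_score = cosine_similarity(first, third)
```

Replacing the raw counts with vectors from an LSA or LDA model would leave `cosine_similarity` unchanged; only the construction of the vectors would differ.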