23,482 research outputs found
Unsupervised Terminological Ontology Learning based on Hierarchical Topic Modeling
In this paper, we present hierarchical relationbased latent Dirichlet
allocation (hrLDA), a data-driven hierarchical topic model for extracting
terminological ontologies from a large number of heterogeneous documents. In
contrast to traditional topic models, hrLDA relies on noun phrases instead of
unigrams, considers syntax and document structures, and enriches topic
hierarchies with topic relations. Through a series of experiments, we
demonstrate the superiority of hrLDA over existing topic models, especially for
building hierarchies. Furthermore, we illustrate the robustness of hrLDA in the
settings of noisy data sets, which are likely to occur in many practical
scenarios. Our ontology evaluation results show that ontologies extracted from
hrLDA are very competitive with the ontologies created by domain experts
An experiment with ontology mapping using concept similarity
This paper describes a system for automatically mapping between concepts in different ontologies. The motivation for the research stems from the Diogene project, in which the project's own ontology covering the ICT domain is mapped to external ontologies, in order that their associated content can automatically be included in the Diogene system. An approach involving measuring the similarity of concepts is introduced, in which standard Information Retrieval indexing techniques are applied to concept descriptions. A matrix representing the similarity of concepts in two ontologies is generated, and a mapping is performed based on two parameters: the domain coverage of the ontologies, and their levels of granularity. Finally, some initial experimentation is presented which suggests that our approach meets the project's unique set of requirements
Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of www and digital libraries ; and (iv) evaluation of NLP systems
XL-NBT: A Cross-lingual Neural Belief Tracking Framework
Task-oriented dialog systems are becoming pervasive, and many companies
heavily rely on them to complement human agents for customer service in call
centers. With globalization, the need for providing cross-lingual customer
support becomes more urgent than ever. However, cross-lingual support poses
great challenges---it requires a large amount of additional annotated data from
native speakers. In order to bypass the expensive human annotation and achieve
the first step towards the ultimate goal of building a universal dialog system,
we set out to build a cross-lingual state tracking framework. Specifically, we
assume that there exists a source language with dialog belief tracking
annotations while the target languages have no annotated dialog data of any
form. Then, we pre-train a state tracker for the source language as a teacher,
which is able to exploit easy-to-access parallel data. We then distill and
transfer its own knowledge to the student state tracker in target languages. We
specifically discuss two types of common parallel resources: bilingual corpus
and bilingual dictionary, and design different transfer learning strategies
accordingly. Experimentally, we successfully use English state tracker as the
teacher to transfer its knowledge to both Italian and German trackers and
achieve promising results.Comment: 13 pages, 5 figures, 3 tables, accepted to EMNLP 2018 conferenc
Comparing human and automatic thesaurus mapping approaches in the agricultural domain
Knowledge organization systems (KOS), like thesauri and other controlled
vocabularies, are used to provide subject access to information systems across
the web. Due to the heterogeneity of these systems, mapping between
vocabularies becomes crucial for retrieving relevant information. However,
mapping thesauri is a laborious task, and thus big efforts are being made to
automate the mapping process. This paper examines two mapping approaches
involving the agricultural thesaurus AGROVOC, one machine-created and one human
created. We are addressing the basic question "What are the pros and cons of
human and automatic mapping and how can they complement each other?" By
pointing out the difficulties in specific cases or groups of cases and grouping
the sample into simple and difficult types of mappings, we show the limitations
of current automatic methods and come up with some basic recommendations on
what approach to use when.Comment: 10 pages, Int'l Conf. on Dublin Core and Metadata Applications 200
From Biological to Synthetic Neurorobotics Approaches to Understanding the Structure Essential to Consciousness (Part 3)
This third paper locates the synthetic neurorobotics research reviewed in the second paper in terms of themes introduced in the first paper. It begins with biological non-reductionism as understood by Searle. It emphasizes the role of synthetic neurorobotics studies in accessing the dynamic structure essential to consciousness with a focus on system criticality and self, develops a distinction between simulated and formal consciousness based on this emphasis, reviews Tani and colleagues' work in light of this distinction, and ends by forecasting the increasing importance of synthetic neurorobotics studies for cognitive science and philosophy of mind going forward, finally in regards to most- and myth-consciousness
- …