Ontology extraction for index generation
The administration of electronic publication in the Information Era congregates old and new problems,
especially those related with Information Retrieval and Automatic Knowledge Extraction. This article
presents an Information Retrieval System that uses Natural Language Processing and Ontology to
index a collection's texts. We describe a system that constructs a domain-specific ontology, starting
from the syntactic and semantic analyses of the texts that compose the collection. First the texts are
tokenized; then a robust syntactic analysis is performed; finally, the semantic analysis is carried out
in conformity with a knowledge representation metalanguage, based on a basic ontology composed
of 47 classes. The automatically extracted ontology yields richer domain-specific knowledge.
Through its semantic net, it gives the user the right conditions to find, with greater efficiency
and agility, the terms suited to querying the texts. A prototype of this system was built
and used for the indexation of a collection of 221 electronic texts of Information Science written in
Portuguese from Brazil. Instead of being based on statistical theories, we propose a robust Information Retrieval System that uses cognitive theories, allowing greater efficiency in answering users' queries
Applying semantic web technologies to knowledge sharing in aerospace engineering
This paper details an integrated methodology to optimise Knowledge reuse and sharing, illustrated with a use case in the aeronautics domain. It uses Ontologies as a central modelling strategy for the Capture of Knowledge from legacy documents via automated means, or directly in systems interfacing with Knowledge workers, via user-defined, web-based forms. The domain ontologies used for Knowledge Capture also guide the retrieval of the Knowledge extracted from the data using a Semantic Search System that provides support for multiple modalities during search. This approach has been applied and evaluated successfully within the aerospace domain, and is currently being extended for use in other domains on an increasingly large scale
Intelligent multimedia indexing and retrieval through multi-source information extraction and merging
This paper reports work on automated meta-data
creation for multimedia content. The approach results
in the generation of a conceptual index of
the content which may then be searched via semantic
categories instead of keywords. The novelty
of the work is to exploit multiple sources of
information relating to video content (in this case
the rich range of sources covering important sports
events). News, commentaries and web reports covering
international football games in multiple languages
and multiple modalities are analysed and the
resultant data merged. This merging process leads
to increased accuracy relative to individual sources
Unsupervised Terminological Ontology Learning based on Hierarchical Topic Modeling
In this paper, we present hierarchical relation-based latent Dirichlet
allocation (hrLDA), a data-driven hierarchical topic model for extracting
terminological ontologies from a large number of heterogeneous documents. In
contrast to traditional topic models, hrLDA relies on noun phrases instead of
unigrams, considers syntax and document structures, and enriches topic
hierarchies with topic relations. Through a series of experiments, we
demonstrate the superiority of hrLDA over existing topic models, especially for
building hierarchies. Furthermore, we illustrate the robustness of hrLDA in the
settings of noisy data sets, which are likely to occur in many practical
scenarios. Our ontology evaluation results show that ontologies extracted from
hrLDA are very competitive with the ontologies created by domain experts
Generating adaptive hypertext content from the semantic web
Accessing and extracting knowledge from online documents is crucial for the realisation of the Semantic Web and the provision of advanced knowledge services.
The Artequakt project is an ongoing investigation tackling these issues to facilitate the creation of tailored biographies from information harvested from the web.
In this paper we will present the methods we currently use to model, consolidate and store knowledge extracted from the web so that it can be re-purposed as adaptive content. We look at how Semantic Web technology could be used within this process and also how such techniques might be used to provide content to be published via the Semantic Web