1,426 research outputs found

Knowledge Portability with Semantic Expansion of Ontology Labels

Author: Arcan Mihael
Buitelaar Paul
Turchi Marco
Publication venue: The Association for Computer Linguistics

Our research focuses on the multilingual enhancement of ontologies that, often represented only in English, need to be translated into different languages to enable knowledge access across languages. Ontology translation is a rather different task than classic document translation, because ontologies contain highly specific vocabulary and lack contextual information. For these reasons, to improve automatic ontology translation, we first focus on identifying relevant unambiguous and domain-specific sentences from a large set of generic parallel corpora. Then, we leverage Linked Open Data resources, such as DBpedia, to isolate ontology-specific bilingual lexical knowledge. In both cases, we take advantage of the semantic information of the labels to select relevant bilingual data with the aim of building an ontology-specific statistical machine translation system. We evaluate our approach on the translation of a medical ontology, translating from English into German. Our experiment shows a significant improvement of around 3 BLEU points compared to a generic as well as a domain-specific translation approach.

Archivio della ricerca - Fondazione Bruno Kessler

Knowledge Representation in the Context of E-business Applications

Author: Simona Elena Varlan

The article emphasizes the theoretical principles of knowledge representation. The paper also tries to show how to represent knowledge in the context of e-business applications by creating a tagging platform for economic knowledge using the SKOS language. Keywords: Knowledge Representation, Semantic Web, E-business, SKOS

Research Papers in Economics

The usability of semantic search tools: a review

Author: Athanasis
Bamba
Bernstein
Buscaldi
Catarci
Cimiano
Ding
ENRICO MOTTA
Euzenat
HAIMING LIU
Hyvonen
Lee
Lei
Lopez
MARINA GIORDANINO
Mihalcea
Motta
Shneiderman
Sinha
Stuckenschmidt
Vallet
VANESSA LOPEZ
VICTORIA UREN
Wu
YUANGUI LEI
Publication venue: Cambridge University Press (CUP)
Publication date: 01/01/2007

The goal of semantic search is to improve on traditional search methods by exploiting semantic metadata. In this paper, we argue that supporting iterative and exploratory search modes is important to the usability of all search systems. We also identify the types of semantic queries that users need to make, the issues concerning the search environment and the problems that are intrinsic to semantic search in particular. We then review the four modes of user interaction in existing semantic search systems, namely keyword-based, form-based, view-based and natural language-based systems. Future development should focus on multimodal search systems, which exploit the advantages of more than one mode of interaction, and on developing search systems that can search heterogeneous semantic metadata on the open Semantic Web.

Crossref

Open Research Online (The Open University)

Optimisation Method for Training Deep Neural Networks in Classification of Non-functional Requirements

Author: Sabir M.
Publication venue: London South Bank University
Publication date: 01/01/2022

Non-functional requirements (NFRs) are regarded as critical to a software system's success. The majority of NFR detection and classification solutions have relied on supervised machine learning models, which are hindered by the lack of labelled data for training and necessitate a significant amount of time spent on feature engineering. In this work we explore emerging deep learning techniques to reduce the burden of feature engineering. The goal of this study is to develop an autonomous system that can classify NFRs into multiple classes based on a labelled corpus. In the first section of the thesis, we standardise the NFRs ontology and annotations to produce a corpus based on five attributes: usability, reliability, efficiency, maintainability, and portability. In the second section, the design and implementation of four neural networks (an artificial neural network, a convolutional neural network, a long short-term memory network, and a gated recurrent unit network) are examined for classifying NFRs. These models necessitate a large corpus. To overcome this limitation, we propose a new paradigm for data augmentation. This method uses a sort-and-concatenate strategy to combine two phrases from the same class, resulting in a two-fold increase in data size while keeping the domain vocabulary intact. We compared our method to a baseline (no augmentation) and an existing approach, Easy Data Augmentation (EDA), with pre-trained word embeddings. All training was performed under two experimental settings: augmentation of the entire data set before the train/validation split, and augmentation of the training set only. Our findings show that, compared to EDA and the baseline, the NFR classification models improved greatly, and the CNN outperformed the other models, when trained using our suggested technique in the first setting. However, we saw only a slight boost in the second experimental setting, with training-set-only augmentation.
As a result, we can determine that augmentation of the validation set is required in order to achieve acceptable results with our proposed approach. We hope that our ideas will inspire new data augmentation techniques, whether generic or task-specific. Furthermore, it would also be useful to apply this strategy to other languages.
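
The sort-and-concatenate idea described in this abstract can be illustrated with a minimal sketch. The abstract does not say how phrases are sorted or paired, so the lexicographic sort and the "pair each phrase with its next neighbour" rule below are assumptions, and the function name `sort_and_concat_augment` is hypothetical rather than taken from the thesis.

```python
from collections import defaultdict


def sort_and_concat_augment(samples):
    """Return the original (text, label) pairs plus one synthetic pair per original.

    Assumption: phrases of a class are sorted lexicographically and each one is
    concatenated with its next neighbour (wrapping around at the end).
    """
    by_label = defaultdict(list)
    for text, label in samples:
        by_label[label].append(text)

    augmented = list(samples)
    for label, texts in by_label.items():
        ordered = sorted(texts)
        if len(ordered) < 2:
            continue  # nothing to pair with; the class keeps its single phrase
        for i, text in enumerate(ordered):
            partner = ordered[(i + 1) % len(ordered)]
            # Both phrases come from the same class, so the vocabulary is unchanged.
            augmented.append((f"{text} {partner}", label))
    return augmented


corpus = [
    ("the system shall be easy to learn", "usability"),
    ("menus shall be consistent", "usability"),
    ("the system shall recover from failure", "reliability"),
    ("uptime shall exceed the agreed level", "reliability"),
]
augmented_corpus = sort_and_concat_augment(corpus)
print(len(corpus), len(augmented_corpus))
```

When every class has at least two phrases, the result is twice the size of the input. The two settings compared in the abstract correspond to calling the function on the whole corpus before splitting, or on the training portion only after splitting.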

LSBU Research Open

PowerAqua: Open Question Answering on the Semantic Web

Author: Lopez Vanessa
Publication date: 01/01/2011

With the rapid growth of semantic information on the Web, the processes of searching and querying these very large amounts of heterogeneous content have become increasingly challenging. This research tackles the problem of supporting users in querying and exploring information across multiple and heterogeneous Semantic Web (SW) sources. A review of literature on ontology-based Question Answering reveals the limitations of existing technology. Our approach is based on providing a natural language Question Answering interface for the SW, PowerAqua. The realization of PowerAqua represents a considerable advance with respect to other systems, which restrict their scope to an ontology-specific or homogeneous fraction of the publicly available SW content. To our knowledge, PowerAqua is the only system that is able to take advantage of the semantic data available on the Web to interpret and answer user queries posed in natural language. In particular, PowerAqua is uniquely able to answer queries by combining and aggregating information, which can be distributed across heterogeneous semantic resources. Here, we provide a complete overview of our work on PowerAqua, including: the research challenges it addresses; its architecture; the techniques we have realised to map queries to semantic data, to integrate partial answers drawn from different semantic resources and to rank alternative answers; and the evaluation studies we have performed to assess the performance of PowerAqua. We believe our experiences can be extrapolated to a variety of end-user applications that wish to open up to large-scale and heterogeneous structured datasets, to be able to exploit effectively what possibly is the greatest wealth of data in the history of Artificial Intelligence.

Open Research Online (The Open University)

OpenGrey Repository

Results of the Ontology Alignment Evaluation Initiative 2011 (Final)

Author: Euzenat Jérôme
Ferrara Alfio
Hage Willem Robert van
Hollink Laura
Meilicke Christian
Nikolov Andriy
Scharffe Francois
Shvaiko Pavel
Stuckenschmidt Heiner
Svab-Zamazal Ondrej
Trojahn Cássia
Publication venue: RWTH
Publication date: 01/01/2011

MAnnheim DOCument Server

Schema Label Normalization for Improving Schema Matching

Author: Bergamaschi Sonia
Gawinecki Maciej
Po Laura
Sorrentino Serena
Publication venue: Elsevier BV
Publication date: 01/01/2010

Schema matching is the problem of finding relationships among concepts across data sources that are heterogeneous in format and in structure. Starting from the “hidden meaning” associated with schema labels (i.e. class/attribute names) it is possible to discover relationships among the elements of different schemata. Lexical annotation (i.e. annotation w.r.t. a thesaurus/lexical resource) helps in associating a “meaning” to schema labels. However, the performance of semi-automatic lexical annotation methods on real-world schemata suffers from the abundance of non-dictionary words such as compound nouns, abbreviations, and acronyms. We address this problem by proposing a method to perform schema label normalization which increases the number of comparable labels. The method semi-automatically expands abbreviations/acronyms and annotates compound nouns, with minimal manual effort. We empirically prove that our normalization method helps in the identification of similarities among schema elements of different data sources, thus improving schema matching results.
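
To make the notion of label normalization concrete, here is a minimal sketch of one way it could look. It is not the authors' method: the abbreviation table `ABBREVIATIONS`, the function names, and the regex-based splitting of compound labels are all illustrative assumptions, and the lexical annotation step against a thesaurus is left out.

```python
import re

# Hypothetical abbreviation table; in a semi-automatic setting a user would
# confirm or correct these expansions.
ABBREVIATIONS = {
    "qty": "quantity",
    "amt": "amount",
    "cust": "customer",
    "no": "number",
}

# Matches acronym runs ("XML"), capitalised or lowercase words, and digit runs.
TOKEN_PATTERN = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def tokenize_label(label):
    """Split a schema label on separators and camelCase boundaries."""
    tokens = []
    for part in re.split(r"[_\-\s]+", label):
        tokens.extend(TOKEN_PATTERN.findall(part))
    return [token.lower() for token in tokens]


def normalize_label(label):
    """Expand known abbreviations so that labels become comparable strings."""
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokenize_label(label))


print(normalize_label("CustQty"))
print(normalize_label("order_no"))
print(normalize_label("CustQty") == normalize_label("customer_quantity"))
```

After normalization, labels from different schemata that differ only in abbreviation or casing style map to the same string, which is the sense in which normalization "increases the number of comparable labels".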

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia