Development of an ontology construction component for the OBCIE (ontology-based components for information extraction) approach
Information extraction systems identify and retrieve specific types of information from natural language text. A recent development in the field is the emergence of ontology-based information extraction as a sub-field, in which ontologies are used both to guide the extraction process and to represent the extracted information.
One of the challenges faced by the fields of information extraction and ontology-based information extraction is the difficulty of reusing prior work when developing new systems. A component-based approach named OBCIE (Ontology-Based Components for Information Extraction) was previously developed to address this issue. This paper presents progress in developing an ontology construction component for the OBCIE approach, which identifies classes and relationships for a given domain. It is centered on discovering the information contained within the loose structure of Wikipedia pages.
Lexical Enrichment and Sense Disambiguation of Ontology Concepts
This paper presents a model to measure semantic similarity between custom ontology concepts and the WordNet taxonomy, and introduces a new ontology concept similarity measure. The similarity measure is based on a weighted overlap of the semantic cotopy of a concept in two taxonomies. The model can be applied to automatically enhance the vocabulary of terms in ontologies, embedding equivalence classes of terms and other linguistic information directly in the ontology. The model is applied to the products and services domain, where a Product Ontology is lexically enhanced and the effectiveness of the model is evaluated.
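The cotopy-overlap idea from the abstract can be illustrated in a few lines. The sketch below is a minimal, unweighted variant: the semantic cotopy of a concept (the concept plus all of its ancestors and descendants) is computed in each taxonomy and compared by Jaccard overlap. The taxonomies and the plain overlap are illustrative assumptions; the paper's actual weighting scheme is not reproduced here.

```python
# Minimal sketch of cotopy-based concept similarity across two taxonomies.
# Data and the unweighted Jaccard overlap are hypothetical; the paper
# uses a weighted overlap measure.

def semantic_cotopy(concept, parents, children):
    """Return the concept together with all its ancestors and descendants."""
    cotopy = {concept}
    # Walk upward through the child -> parent map to collect ancestors.
    node = concept
    while node in parents:
        node = parents[node]
        cotopy.add(node)
    # Walk downward through the parent -> children map to collect descendants.
    stack = [concept]
    while stack:
        for child in children.get(stack.pop(), []):
            cotopy.add(child)
            stack.append(child)
    return cotopy

def cotopy_similarity(concept, parents_a, children_a, parents_b, children_b):
    """Jaccard overlap of the concept's cotopies in two taxonomies."""
    sc_a = semantic_cotopy(concept, parents_a, children_a)
    sc_b = semantic_cotopy(concept, parents_b, children_b)
    return len(sc_a & sc_b) / len(sc_a | sc_b)

# Toy taxonomies: a custom ontology vs. a WordNet-like hierarchy.
parents_a = {"laptop": "computer", "computer": "device"}
children_a = {"device": ["computer"], "computer": ["laptop"]}
parents_b = {"laptop": "computer", "computer": "machine"}
children_b = {"machine": ["computer"], "computer": ["laptop"]}

print(cotopy_similarity("laptop", parents_a, children_a, parents_b, children_b))
```

The two cotopies here share "laptop" and "computer" but differ on the top concept, so the overlap is 0.5; a weighted variant would additionally discount matches that are far from the concept itself.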
Ontology-based information extraction and reservoir computing for topic detection from blogosphere's content: a case study about BBC Backstage
This research study aims at detecting topics and extracting themes (subtopics) from the blogosphere's content while bridging the gap between the Social Web and the Semantic Web. The goal is to detect certain types of information from collections of blog and microblog narratives that lack explicit semantics. The work introduces a novel approach that blends together two young paradigms: Ontology-Based Information Extraction (OBIE) and Reservoir Computing (RC). The novelty lies in integrating ontologies with RC, as well as in the pioneering use of RC with social media data. Experiments with retrospective data from blogs and Twitter microblogs provide valuable insights into the BBC Backstage initiative and demonstrate the viability of the approach in terms of scalability, computational complexity, and performance.
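Reservoir Computing is the less familiar of the two paradigms. Its core is a fixed, randomly connected recurrent network (a "reservoir") whose states are driven by the input; only a linear readout is trained. The pure-Python sketch below shows one leaky-integrator echo state network update with hypothetical sizes, weights, and inputs; the study's actual reservoir configuration, spectral-radius scaling, and readout training are not shown.

```python
import math
import random

# Minimal echo state network (ESN) reservoir, the standard Reservoir
# Computing model. All sizes, weight ranges, and inputs here are
# hypothetical; a real ESN rescales W to a spectral radius below 1 and
# trains a linear readout (e.g. ridge regression) on the states.
random.seed(0)
N_IN, N_RES = 3, 20   # input and reservoir dimensions
LEAK = 0.3            # leaking rate of the neurons

W_in = [[random.uniform(-0.5, 0.5) for _ in range(N_IN)] for _ in range(N_RES)]
W = [[random.uniform(-0.1, 0.1) for _ in range(N_RES)] for _ in range(N_RES)]

def step(x, u):
    """One leaky update: x <- (1 - a) * x + a * tanh(W x + W_in u)."""
    return [(1 - LEAK) * x[i]
            + LEAK * math.tanh(sum(W[i][j] * x[j] for j in range(N_RES))
                               + sum(W_in[i][k] * u[k] for k in range(N_IN)))
            for i in range(N_RES)]

# Drive the reservoir with a toy input sequence and collect its states;
# a topic detector would train a readout on these state vectors.
x = [0.0] * N_RES
states = []
for u in ([1, 0, 0], [0, 1, 0], [0, 0, 1]):
    x = step(x, u)
    states.append(x)
```

Because the reservoir itself is never trained, RC sidesteps the cost of backpropagation through time, which is one reason it scales well to streams such as blog and microblog content.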
Building a WordNet for Sinhala
Sinhala is one of the official languages of Sri Lanka and is used by over 19 million people. It belongs to the Indo-Aryan branch of the Indo-European languages, and its origins date back at least 2,000 years. It has developed into its current form over a long period of time with influences from a wide variety of languages, including Tamil, Portuguese, and English. As for any other language, a WordNet is extremely important for taking Sinhala into the digital era. This paper is based on the project to develop a WordNet for Sinhala based on the English (Princeton) WordNet. It describes how we overcame the challenges of adding Sinhala-specific characteristics, deemed important by Sinhala language experts, to the WordNet while keeping the structure of the original English WordNet. It also presents the details of the crowdsourcing system we developed as part of the project, consisting of a NoSQL database in the backend and a web-based frontend. We conclude by discussing the possibility of adapting this architecture for other languages and the road ahead for the Sinhala WordNet and Sinhala NLP.
SkipCor: Skip-Mention Coreference Resolution Using Linear-Chain Conditional Random Fields
Applying Information Extraction for Abstracting and Automating CLI-Based Configuration of Network Devices in Heterogeneous Environments
A framework for automatic population of ontology-based digital libraries
Maintaining up-to-date ontology-based digital libraries faces two main issues. First, documents are often unstructured and stored in heterogeneous data formats, which makes it difficult to extract information from them and to search within them. Second, manual ontology population is time-consuming, so automatic methods are needed to support this process. In this paper, we present an ontology-based framework for populating ontologies. In particular, we propose an approach for triplet extraction from heterogeneous and unstructured documents in order to automatically populate ontology-based digital libraries. Finally, we evaluate the proposed framework on a real-world case study.
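The triplet-extraction step named in the abstract maps text to (subject, predicate, object) facts that can be asserted into an ontology. The sketch below is only a toy illustration under strong assumptions (a fixed predicate list and simple subject-verb-object sentences); the paper's framework handles heterogeneous, unstructured documents with far more sophistication.

```python
import re

# Naive subject-verb-object triplet extraction. The predicate list and
# sample text are hypothetical; real ontology population would use
# linguistic analysis rather than a single regular expression.
SVO = re.compile(r"^(\w+) (wrote|contains|cites) (.+?)\.?$")

def extract_triplets(text):
    """Return (subject, predicate, object) triples from simple sentences."""
    triples = []
    for sentence in text.split(". "):
        m = SVO.match(sentence.strip())
        if m:
            triples.append(m.groups())
    return triples

doc = "Alice wrote ThesisX. ThesisX cites PaperY."
print(extract_triplets(doc))
```

Each extracted triple would then be checked against the ontology's classes and properties before being inserted as an instance assertion.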