4,480 research outputs found

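Web Page Information Extraction Using a Bootstrapping Approach in Ontology-Based Information Extraction (original title: Ekstraksi Informasi Halaman Web Menggunakan Pendekatan Bootstrapping pada Ontology-Based Information Extraction)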

Author: Mustofa Khabib
Susanti Erma
Publication venue: 'Universitas Gadjah Mada'
Publication date: 01/07/2015

Abstract: Information extraction is a field of natural language processing that converts unstructured text into structured information. Much of the information on the Internet is delivered in unstructured form through websites, which creates a need for technology that analyzes text and finds relevant knowledge as structured information. One example of unstructured information is the main information in the content of a web page. Researchers have developed various approaches to information extraction, both manual and automatic, but their performance still needs improvement in accuracy and extraction speed. This research proposes an information extraction approach that combines bootstrapping with Ontology-Based Information Extraction (OBIE). The bootstrapping approach uses a small seed of labelled data to minimize human intervention in the extraction process, while an ontology guides the extraction of classes, properties and instances to provide semantic content for the semantic web. Combining the two approaches is expected to increase both the speed of the extraction process and the accuracy of its results. The case study applies the information extraction system to the “LonelyPlanet” dataset.
Keywords: information extraction, ontology, bootstrapping, Ontology-Based Information Extraction, OBIE, performance

Directory of Open Access Journals

IJCCS (Indonesian Journal of Computing and Cybernetics Systems)

Towards a relation extraction framework for cyber-security concepts

Author: Bridges R. A.
Brin S.
Carlson A.
de Lacalle O. L.
Jones R.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 16/04/2015

To assist security analysts in obtaining information pertaining to their network, such as novel vulnerabilities, exploits, or patches, information retrieval methods tailored to the security domain are needed. As labeled text data is scarce and expensive, we follow developments in semi-supervised Natural Language Processing and implement a bootstrapping algorithm for extracting security entities and their relationships from text. The algorithm requires little input data, specifically, a few relations or patterns (heuristics for identifying relations), and incorporates an active learning component which queries the user on the most important decisions to prevent drifting from the desired relations. Preliminary testing on a small corpus shows promising results, obtaining a precision of 0.82.
Comment: 4 pages in Cyber & Information Security Research Conference 2015, AC
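The bootstrapping loop this abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the example sentences, and the `confirm` callback (a stand-in for the active learning step, which here asks about every candidate rather than only the most important ones) are all assumptions made for illustration.

```python
import re
from typing import Callable, Iterable

Pair = tuple[str, str]

def learn_patterns(sentences: Iterable[str], pairs: set[Pair]) -> set[str]:
    """Collect the text that appears between a known (head, tail) pair."""
    patterns = set()
    for sentence in sentences:
        for head, tail in pairs:
            match = re.search(re.escape(head) + r"(.+?)" + re.escape(tail), sentence)
            if match:
                patterns.add(match.group(1))
    return patterns

def apply_patterns(sentences: Iterable[str], patterns: set[str]) -> set[Pair]:
    """Find new (head, tail) pairs joined by any learned pattern."""
    pairs = set()
    for sentence in sentences:
        for pattern in patterns:
            match = re.search(r"(\S+)" + re.escape(pattern) + r"(\S+)", sentence)
            if match:
                # Strip trailing punctuation picked up by the \S+ group.
                pairs.add((match.group(1), match.group(2).rstrip(".,;")))
    return pairs

def bootstrap(sentences: list[str], seeds: set[Pair],
              confirm: Callable[[Pair], bool], rounds: int = 2) -> set[Pair]:
    """Grow the set of relations from a few seeds, asking `confirm` before
    accepting a candidate so that the relation set does not drift."""
    known = set(seeds)
    for _ in range(rounds):
        patterns = learn_patterns(sentences, known)
        candidates = apply_patterns(sentences, patterns) - known
        accepted = {pair for pair in candidates if confirm(pair)}
        if not accepted:
            break  # nothing new was accepted, so further rounds cannot help
        known |= accepted
    return known

# Hypothetical corpus and seed, for illustration only.
corpus = [
    "CVE-2014-0160 is a vulnerability in OpenSSL.",
    "CVE-2014-6271 is a vulnerability in Bash.",
    "Bash is maintained by volunteers.",
]
seeds = {("CVE-2014-0160", "OpenSSL")}
relations = bootstrap(corpus, seeds, confirm=lambda pair: True)
```

In an interactive setting the `confirm` callback would prompt the analyst. Passing a function that always returns `False` leaves the seed set unchanged, which shows how the user check stops drift.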

arXiv.org e-Print Archive

Crossref

Initiating organizational memories using ontology-based network analysis as a bootstrapping tool

Author: Alani Harith
Kalfoglou Yannis
O'Hara Kieron
Shadbolt Nigel
Publication venue: BCS SGAI
Publication date: 01/01/2002

An important problem for many kinds of knowledge systems is their initial set-up. It is difficult to choose the right information to include in such systems, and the right information is also a prerequisite for maximizing the uptake and relevance. To tackle this problem, most developers adopt heavyweight solutions and rely on a faithful continuous interaction with users to create and improve content. In this paper, we explore the use of an automatic, lightweight ontology-based solution to the bootstrapping problem, in which domain-describing ontologies are analysed to uncover significant yet implicit relationships between instances. We illustrate the approach by using such an analysis to provide content automatically for the initial set-up of an organizational memory
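One way to picture the ontology-based network analysis is as a weighted traversal over instance relations. The sketch below is an assumption for illustration, not the method of the paper. The relation names, the weights, the hop limit, and the function `rank_related` are all hypothetical.

```python
from collections import defaultdict, deque

def rank_related(triples, weights, start, max_hops=2):
    """Rank instances by how strongly they connect to `start`.

    triples: iterable of (subject, relation, object) instance links.
    weights: assumed importance of each relation type, between 0 and 1.
    Activation starts at 1.0 and is multiplied by the relation weight
    on every hop, so distant instances score lower.
    """
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        weight = weights.get(rel, 0.0)
        graph[subj].append((obj, weight))
        graph[obj].append((subj, weight))  # relations are treated as undirected

    scores = defaultdict(float)
    frontier = deque([(start, 1.0, 0)])
    while frontier:
        node, activation, hops = frontier.popleft()
        if hops == max_hops:
            continue  # the hop limit keeps the traversal finite
        for neighbour, weight in graph[node]:
            if neighbour == start or weight == 0.0:
                continue
            passed = activation * weight
            scores[neighbour] += passed
            frontier.append((neighbour, passed, hops + 1))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical instance data, for illustration only.
triples = [
    ("alice", "authorOf", "paper1"),
    ("bob", "authorOf", "paper1"),
    ("alice", "memberOf", "projectX"),
    ("carol", "memberOf", "projectX"),
]
weights = {"authorOf": 0.8, "memberOf": 0.5}
ranking = rank_related(triples, weights, start="alice")
```

With these assumed weights, a co-author reached through `authorOf` ranks above a project colleague reached through `memberOf`. This is the kind of implicit relationship such an analysis aims to surface.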

CiteSeerX

Southampton (e-Prints Soton)

Open Research Online (The Open University)

Using Neural Networks for Relation Extraction from Biomedical Literature

Author: A Koike
A Lamurias
A Singhal
AV Aho
B Xu
CD Manning
CH Alves
D Westergaard
D Zhou
E Guresen
F Rinaldi
HC Wang
HM Müller
J Hastings
L Aroyo
M Ashburner
MY Kim
N Ma
N Peng
P Goyal
P Zweigenbaum
PN Robinson
Q Li
QL Nguyen
S Haykin
S Hochreiter
TR Gruber
W Wang
WWM Fleuren
Y Hao
Y Luo
Y Xu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/09/2020

Using different sources of information to support the automated extraction of relations between biomedical concepts contributes to the development of our understanding of biological systems. The primary comprehensive source of these relations is biomedical literature. Several relation extraction approaches have been proposed to identify relations between concepts in biomedical literature, notably those using neural network algorithms. The use of multichannel architectures composed of multiple data representations, as in deep neural networks, is leading to state-of-the-art results. The right combination of data representations can eventually lead to even higher evaluation scores in relation extraction tasks. Here, biomedical ontologies play a fundamental role by providing semantic and ancestry information about an entity. The incorporation of biomedical ontologies has already been shown to improve on previous state-of-the-art results.
Comment: Artificial Neural Networks book (Springer) - Chapter 1

arXiv.org e-Print Archive

Crossref

Requirements for Information Extraction for Knowledge Management

Author: Cimiano Philipp
Ciravegna Fabio
Domingue John
Handschuh Siegfried
Lavelli Alberto
Staab Steffen
Stevenson Mark
Publication date: 01/01/2003

Knowledge Management (KM) systems inherently suffer from the knowledge acquisition bottleneck - the difficulty of modeling and formalizing knowledge relevant for specific domains. A potential solution to this problem is Information Extraction (IE) technology. However, IE was originally developed for database population and there is a mismatch between what is required to successfully perform KM and what current IE technology provides. In this paper we begin to address this issue by outlining requirements for IE based KM

Archivio della ricerca - Fondazione Bruno Kessler

Open Research Online (The Open University)

Web based knowledge extraction and consolidation for automatic ontology instantiation

Author: Alani Harith
Hall Wendy
Kim Sanghee
Lewis Paul H.
Millard David E.
Shadbolt Nigel
Weal Mark J.
Publication date: 01/01/2003

The Web is probably the largest and richest information repository available today. Search engines are the common access routes to this valuable source. However, the role of these search engines is often limited to the retrieval of lists of potentially relevant documents. The burden of analysing the returned documents and identifying the knowledge of interest is therefore left to the user. The Artequakt system aims to deploy natural language tools to automatically extract and consolidate knowledge from web documents and instantiate a given ontology, which dictates the type and form of knowledge to extract. Artequakt focuses on the domain of artists, and uses the harvested knowledge to generate tailored biographies. This paper describes the latest developments of the system and discusses the problem of knowledge consolidation

CiteSeerX

Southampton (e-Prints Soton)

Open Research Online (The Open University)

Generating and visualizing a soccer knowledge base

Author: Buitelaar Paul
Cimiano Philipp
Eigner Thomas
Gulrajani Greg
Ladwig Günter
Mantel Matthias
Schutz Alexander
Siegel Melanie
Weber Nicolas
Zhu Honggang
Publication date: 01/01/2006

This demo abstract describes the SmartWeb Ontology-based Information Extraction System (SOBIE). A key feature of SOBIE is that all information is extracted and stored with respect to the SmartWeb ontology. In this way, other components of the system that use the same ontology can access this information in a straightforward way. We will show how information extracted by SOBIE is visualized within its original context, thus enhancing the browsing experience of the end user

Hochschulschriftenserver - Universität Frankfurt am Main