
163 research outputs found

Requirements for Information Extraction for Knowledge Management

Author: Cimiano Philipp; Ciravegna Fabio; Domingue John; Handschuh Siegfried; Lavelli Alberto; Staab Steffen; Stevenson Mark
Publication date: 01/01/2003

Knowledge Management (KM) systems inherently suffer from the knowledge acquisition bottleneck: the difficulty of modeling and formalizing knowledge relevant for specific domains. A potential solution to this problem is Information Extraction (IE) technology. However, IE was originally developed for database population, and there is a mismatch between what is required to successfully perform KM and what current IE technology provides. In this paper we begin to address this issue by outlining requirements for IE-based KM.

Archivio della ricerca - Fondazione Bruno Kessler

Open Research Online (The Open University)

A Literature Survey on Web Content Mining

Author: V. David Martin, Dr. T. N. Ravi
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/10/2016

The Web is a collection of interrelated documents on one or more web servers, and web mining means extracting important data from web databases. Web mining is an area of data mining in which data mining methods are used to extract data from web servers. Web data includes web pages, web links, web queries, and web logs. Web mining is used to understand user behavior and to assess a specific site based on the data stored in web log files. It relies on data mining techniques, specifically association rules, classification, and clustering. Its application areas include electronic trade, e-learning, e-government, e-arrangements, e-majority rules system, electronic business, security, crime examination, and computerized libraries. Retrieving the required web page efficiently and effectively is challenging because the Web consists of unstructured information, which carries a large volume of data and increases the complexity of handling data from different web service providers. The accumulated data makes it hard to find, extract, filter, or assess the information that matters to users. This paper covers the basic concepts of web mining, its classification, processes, and issues. In addition, it analyzes the research challenges in web mining.
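
The abstract names association rules as one of the techniques applied to web logs. As a minimal sketch of what such a rule measures, the Python below computes support and confidence for a rule of the form "visitors of page A also visit page B". The session data and page paths are invented for illustration and do not come from the paper.

```python
def support(sessions, pages):
    """Fraction of sessions that contain every page in `pages`."""
    pages = set(pages)
    return sum(pages <= s for s in sessions) / len(sessions)


def confidence(sessions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the sessions."""
    base = support(sessions, antecedent)
    if base == 0:
        return 0.0  # the rule never fires, so report zero confidence
    return support(sessions, set(antecedent) | set(consequent)) / base


# Hypothetical web-log sessions: each set holds the pages one visitor requested.
sessions = [
    {"/home", "/products", "/cart"},
    {"/home", "/products"},
    {"/home", "/about"},
    {"/products", "/cart"},
]

rule_support = support(sessions, {"/products", "/cart"})
rule_confidence = confidence(sessions, {"/products"}, {"/cart"})
```

Here the rule "/products implies /cart" holds in two of the four sessions (support 0.5) and in two of the three sessions that contain "/products" (confidence about 0.67).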

International Journal on Recent and Innovation Trends in Computing and Communication

Bayesian Information Extraction Network

Author: Peshkin Leonid; Pfeffer Avi
Publication date: 01/01/2003

Dynamic Bayesian networks (DBNs) offer an elegant way to integrate various aspects of language in one model. Many existing algorithms developed for learning and inference in DBNs are applicable to probabilistic language modeling. To demonstrate the potential of DBNs for natural language processing, we employ a DBN in an information extraction task. We show how to assemble a wealth of emerging linguistic instruments for shallow parsing, syntactic and semantic tagging, morphological decomposition, named entity recognition, etc., in order to incrementally build a robust information extraction system. Our method outperforms previously published results on an established benchmark domain.
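
To make the idea of tagging tokens with a DBN concrete, here is a minimal sketch using a hidden Markov model, which is the simplest DBN (one hidden variable per time slice). This is not the authors' network, which combines many more linguistic features; the state names and all probabilities below are invented toy values.

```python
import math


def viterbi(tokens, states, start_p, trans_p, emit_p, unk=1e-6):
    """Return the most likely state sequence for `tokens` under an HMM.

    `unk` is a small assumed probability for tokens a state never emitted.
    """
    # best[t][s] = (log-probability of the best path ending in s at t, previous state)
    best = [{
        s: (math.log(start_p[s]) + math.log(emit_p[s].get(tokens[0], unk)), None)
        for s in states
    }]
    for tok in tokens[1:]:
        row = {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p][0] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            row[s] = (score + math.log(emit_p[s].get(tok, unk)), prev)
        best.append(row)

    # Backtrack from the best final state to recover the full path.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for row in reversed(best[1:]):
        state = row[state][1]
        path.append(state)
    return path[::-1]


# Toy model (all numbers are illustrative assumptions).
states = ["OTHER", "NAME"]
start_p = {"OTHER": 0.8, "NAME": 0.2}
trans_p = {
    "OTHER": {"OTHER": 0.7, "NAME": 0.3},
    "NAME": {"OTHER": 0.4, "NAME": 0.6},
}
emit_p = {
    "OTHER": {"talk": 0.4, "by": 0.4, "on": 0.2},
    "NAME": {"alice": 0.5, "smith": 0.5},
}

tags = viterbi(["talk", "by", "alice", "smith"], states, start_p, trans_p, emit_p)
# tags == ["OTHER", "OTHER", "NAME", "NAME"]
```

The tokens tagged NAME form the extracted field. A richer DBN would add further variables per time slice (for example part-of-speech or phrase tags) alongside the single hidden state used here.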

arXiv.org e-Print Archive

CiteSeerX

Rule-Based Extraction of Titles and Abstracts from Scientific Articles (Ekstraksi Judul dan Abstrak Artikel Ilmiah Berbasis Rule)

Author: Soekamto Yosua Setyawan
Publication venue: Institut Sains dan Teknologi Terpadu Surabaya (d/h Sekolah Tinggi Teknik Surabaya)
Publication date: 31/03/2020

As research grows and more research papers are published across journals, researchers and journal managers find selection and referencing increasingly difficult. In a research paper, the title and abstract carry the main idea and a summary of the study, including the methods used. Title and abstract extraction has therefore been widely studied with a variety of methods, most of which are limited to the language and writing style of particular journals. In this study, title and abstract extraction uses association rules built on common intuitions about how research papers are written. The study uses two datasets of research paper layouts: one-column and two-column. This research will greatly help journal managers and researchers, allowing both to automate referencing and making it easier to select papers for online journal publication. The rules target commonly used research paper writing styles, so they can be applied to many kinds of papers in many languages. Examples of the rules used are: “The paper title is a sentence (phrase) set in the largest text size”, “The paper title is written at the beginning of the first page”, “The paper title is mostly written in bold”, “The paper title is followed by the author names”, and “A paper title that appears on the second and later pages as a header or footer has an unusual position compared to the paper body (or lies in the page margin)”.
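
The title rules quoted in this abstract can be sketched as a simple scoring function. The Python below is an illustration only: the `TextLine` structure, the score weights, and the "top three lines" cutoff are assumptions, not the paper's implementation, and the "followed by author names" rule is left out because it would need a name detector.

```python
from dataclasses import dataclass


@dataclass
class TextLine:
    """One extracted line of a paper (hypothetical structure)."""
    text: str
    font_size: float
    bold: bool
    page: int    # 1-based page number
    y_rank: int  # 0 = topmost line on its page


def score_title_candidate(line, max_font_size):
    """Score a line against three of the quoted title rules (assumed weights)."""
    score = 0
    if line.font_size == max_font_size:
        score += 2  # rule: the title uses the largest text size
    if line.page == 1 and line.y_rank <= 2:
        score += 1  # rule: the title is at the beginning of the first page
    if line.bold:
        score += 1  # rule: the title is mostly written in bold
    return score


def extract_title(lines):
    """Return the text of the highest-scoring line, or None if there are no lines."""
    if not lines:
        return None
    max_font = max(line.font_size for line in lines)
    return max(lines, key=lambda l: score_title_candidate(l, max_font)).text


# Invented sample: a journal header, the title, an author line,
# and the title repeated as a running header on page 2.
sample = [
    TextLine("Journal of Examples, Vol. 1", 9.0, False, 1, 0),
    TextLine("A Study of Layout Rules", 18.0, True, 1, 1),
    TextLine("Jane Doe", 11.0, False, 1, 2),
    TextLine("A Study of Layout Rules", 8.0, False, 2, 0),
]

title = extract_title(sample)
```

The repeated title on page 2 scores zero because it is small, not bold, and not on the first page, which mirrors the quoted rule about headers and footers on later pages.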

Jurnal Sekolah Tinggi Teknik Surabaya

Jurnal LPPM iSTTS

Journal of Information System, Graphics, Hospitality and Technology

Extracting semantics for information extraction

Author: Al Fawareh Hejab M.; Jusoh Shaidah
Publication date: 24/06/2009

Text documents are one of the means of storing information. These documents can be found on personal desktop computers, on intranets, and on the Web. Valuable knowledge is thus embedded in an unstructured form. An automated system that can extract information from these texts is very desirable. However, the major challenge in developing such a system is that natural language is not free from ambiguity and uncertainty. Semantic extraction therefore remains a challenging task for researchers in this area. In this paper, a new framework to extract semantics for information extraction is proposed, in which possibility theory, fuzzy sets, and knowledge about the subject and the preceding sentence are used as the key to resolving ambiguity and uncertainty.

UUM Repository