80,828 research outputs found

    Improving User Experience In Information Retrieval Using Semantic Web And Other Technologies

    The need to find, access and extract information has motivated many different fields of research in the past few years. Fields such as Machine Learning, Question Answering Systems and the Semantic Web each try to cover parts of this problem. Each of these fields has introduced many tools and approaches, which in many cases are multi-disciplinary, spanning more than one field to provide solutions. At the same time, the expansion of the Web through Web 2.0 has given researchers many new tools to help users extract and find information faster and more easily. The scale of e-commerce and online shopping, the extended use of search engines for different purposes and the amount of collaborative content creation on the Web present possibilities and challenges, some of which we address here.

    RULIE: rule unification for learning information extraction

    In this paper we present RULIE (Rule Unification for Learning Information Extraction), an adaptive information extraction algorithm which employs a hybrid technique of Rule Learning and Rule Unification in order to extract relevant information from all types of documents found and used in the semantic web. The algorithm combines techniques from the LP2 and BWI algorithms for improved performance. We also present experimental results for this algorithm and the corresponding details of the evaluation. The evaluation compares RULIE to other information extraction algorithms on their respective performance measures, and in almost all cases RULIE outperforms the other algorithms, namely LP2, BWI, RAPIER, SRV and WHISK. This technique would aid current linked data techniques and would eventually lead to a fuller realisation of the semantic web. Peer-reviewed.
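
    The abstract describes extraction rules in the LP2/BWI tradition, where a field is located by learned left- and right-context boundaries. The small Python sketch below only illustrates that general idea with made-up rules and sample text; it is not the RULIE algorithm or its rule-unification step.

        # Hypothetical sketch of boundary-style extraction rules (in the spirit of
        # LP2/BWI-style extractors); NOT the RULIE algorithm, just an illustration
        # of applying learned left/right context rules to text.
        import re

        # A "rule" pairs a left-context pattern with a right-context pattern; the
        # text captured between them is the extracted field. Patterns are assumptions.
        RULES = {
            "speaker": (r"Speaker:\s*", r"\s*\n"),
            "location": (r"Location:\s*", r"\s*\n"),
        }

        def apply_rules(text, rules):
            """Return {field: [values]} found by each (left, right) boundary rule."""
            extracted = {}
            for field, (left, right) in rules.items():
                pattern = re.compile(left + r"(.+?)" + right)
                extracted[field] = pattern.findall(text)
            return extracted

        sample = "Speaker: Dr. Jane Doe\nLocation: Room 3.14\n"
        print(apply_rules(sample, RULES))
        # {'speaker': ['Dr. Jane Doe'], 'location': ['Room 3.14']}

    A learning system in this family induces the left/right patterns from annotated examples rather than hand-writing them as above.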

    Finding Person Relations in Image Data of the Internet Archive

    The multimedia content in the World Wide Web is rapidly growing and contains valuable information for many applications in different domains. For this reason, the Internet Archive initiative has been gathering billions of time-versioned web pages since the mid-nineties. However, the huge amount of data is rarely labeled with appropriate metadata, and automatic approaches are required to enable semantic search. Normally, the textual content of the Internet Archive is used to extract entities and their possible relations across domains such as politics and entertainment, whereas image and video content is usually neglected. In this paper, we introduce a system for person recognition in image content of web news stored in the Internet Archive. Thus, the system complements entity recognition in text and allows researchers and analysts to track media coverage and relations of persons more precisely. Based on a deep learning face recognition approach, we suggest a system that automatically detects persons of interest and gathers sample material, which is subsequently used to identify them in the image data of the Internet Archive. We evaluate the performance of the face recognition system on an appropriate standard benchmark dataset and demonstrate the feasibility of the approach with two use cases.
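
    The described pipeline gathers sample material for persons of interest and then identifies them in archive images via deep face recognition. The sketch below assumes face embeddings have already been produced by some model and only illustrates the nearest-match step against a gallery with a distance threshold; the vectors, dimensionality and threshold are invented for illustration and are not the authors' pipeline.

        # Illustrative sketch only: matching face embeddings of persons of interest
        # against embeddings extracted from archive images.
        import numpy as np

        def identify(query_embeddings, gallery, threshold=0.6):
            """For each query face embedding, return the closest known person
            whose Euclidean distance is below the threshold, else None."""
            results = []
            for q in query_embeddings:
                best_name, best_dist = None, threshold
                for name, refs in gallery.items():
                    dist = min(np.linalg.norm(q - r) for r in refs)
                    if dist < best_dist:
                        best_name, best_dist = name, dist
                results.append(best_name)
            return results

        # Toy 4-dimensional "embeddings"; real systems use e.g. 128- or 512-d vectors.
        gallery = {"person_a": [np.array([0.1, 0.9, 0.2, 0.0])]}
        queries = [np.array([0.12, 0.88, 0.21, 0.02]), np.array([0.9, 0.1, 0.0, 0.7])]
        print(identify(queries, gallery))  # ['person_a', None]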

    Tree pattern inference and matching for wrapper induction on the World Wide Web

    Thesis (M.Eng.) -- Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 103-106). We develop a method for learning patterns from a set of positive examples to retrieve semantic content from tree-structured data. Specifically, we focus on HTML documents on the World Wide Web, which contain a wealth of semantic information and have a useful underlying tree structure. A user provides examples of relevant data they wish to extract from a web site through a simple user interface in a web browser. To construct patterns, we use the notion of the edit distance between the subtrees represented by these examples to distill them into a more general pattern. This pattern may then be used to retrieve other instances of the selected data from the same page or other similar pages. By linking patterns and their components with semantic labels using RDF, we can create semantic "overlays" for Web information, which are useful in such projects as the Semantic Web and the Haystack information management environment. By Andrew William Hogue, M.Eng.
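
    The thesis generalises positive example subtrees into a pattern (guided by tree edit distance) and then matches that pattern against the same or similar pages. The toy sketch below, with trees represented as (tag, children) tuples, shows a much simpler position-wise generalisation and matching step; it is an assumption-laden stand-in, not the thesis's edit-distance algorithm.

        # Simplified illustration (not the thesis's method): generalise two example
        # HTML subtrees, given as (tag, [children]) tuples, into a pattern by keeping
        # matching tags and replacing mismatches with a wildcard "*".
        def generalize(a, b):
            tag_a, kids_a = a
            tag_b, kids_b = b
            tag = tag_a if tag_a == tag_b else "*"
            kids = [generalize(x, y) for x, y in zip(kids_a, kids_b)]
            return (tag, kids)

        def matches(pattern, tree):
            """True if the tree conforms to the pattern (wildcards match any tag)."""
            ptag, pkids = pattern
            ttag, tkids = tree
            if ptag not in ("*", ttag) or len(pkids) > len(tkids):
                return False
            return all(matches(p, t) for p, t in zip(pkids, tkids))

        ex1 = ("tr", [("td", []), ("td", [("b", [])])])
        ex2 = ("tr", [("td", []), ("td", [("i", [])])])
        pattern = generalize(ex1, ex2)   # ('tr', [('td', []), ('td', [('*', [])])])
        print(matches(pattern, ex1), matches(pattern, ex2))  # True True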

    Sensitivity of Semantic Signatures in Text Mining

    The rapid development of the Internet and the ability to store data relatively inexpensively have contributed to an information explosion that did not exist a few years ago. Just a few keystrokes on a search engine on any given subject will return more web pages than ever before. Because the amount of data available to us is so overwhelming, the ability to extract relevant information from it remains a challenge. Since 80% of the data stored worldwide is text, we need advanced techniques to process this textual data and extract useful information. Text mining is one such process for addressing the information explosion problem; it employs techniques such as natural language processing, information retrieval, machine learning algorithms and knowledge management. In text mining, the subject text undergoes a transformation in which essential attributes of the text are derived. The attributes that form interesting patterns are chosen, and machine learning algorithms are used to find similar patterns in the desired corpora. At the end, the resulting texts are evaluated and interpreted. In this thesis we develop a new framework for the text mining process. An investigator chooses target content from training files, which is captured in semantic signatures. Semantic signatures characterize the target content, derived from training files, that we are looking for in testing files (whose content is unknown). The semantic signatures work as attributes to fetch and/or categorize the target content from a test corpus. A proof-of-concept software package, consisting of tools that aid an investigator in mining text data, is developed using Visual Studio, C# and the .NET framework. Choosing keywords plays a major role in designing semantic signatures; careful selection of keywords leads to a more accurate analysis, especially in English, which is sensitive to semantics. It is interesting to note that when words appear in different contexts they carry different meanings. We have incorporated stemming within the framework and demonstrate its effectiveness using a large corpus. We have conducted experiments to demonstrate the sensitivity of semantic signatures to subtle content differences between closely related documents. These experiments show that the newly developed framework can identify such subtle semantic differences effectively.
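
    The tool described in the abstract is a C#/.NET package; the small Python sketch below only illustrates the underlying idea of a keyword-based semantic signature combined with (very naive) stemming and a simple match score against a test document. The keywords, the crude stemmer and the scoring rule are all assumptions made for illustration.

        # Rough sketch of a keyword "signature" with naive stemming; not the
        # thesis's framework, just the general idea.
        import re

        def naive_stem(word):
            # Deliberately crude suffix stripping; a real system would use a proper stemmer.
            for suffix in ("ing", "ed", "es", "s"):
                if word.endswith(suffix) and len(word) > len(suffix) + 2:
                    return word[: -len(suffix)]
            return word

        def signature(keywords):
            return {naive_stem(w.lower()) for w in keywords}

        def score(document, sig):
            """Fraction of signature terms that occur (stemmed) in the document."""
            tokens = {naive_stem(t) for t in re.findall(r"[a-z]+", document.lower())}
            return len(sig & tokens) / len(sig)

        sig = signature(["mining", "semantics", "signatures"])
        print(score("Text mining with semantic signatures", sig))  # 1.0
        print(score("A report on quarterly earnings", sig))        # 0.0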

    SWA-KMDLS: An Enhanced e-Learning Management System Using Semantic Web and Knowledge Management Technology

    In this era of the knowledge economy, in which knowledge has become the most precious resource, surveys have shown that e-Learning is on an increasing trend in various organizations, including, among others, education and corporate settings. The use of e-Learning aims not only to acquire knowledge but also to maintain competitiveness and advantages for individuals and organizations. However, the early promise of e-Learning has yet to be fully realized, as it has often been no more than handouts published online, coupled with simple multiple-choice quizzes. The emergence of e-Learning 2.0, empowered by Web 2.0 technology, still hardly overcomes common problems such as information overload and poor content aggregation across the rapidly increasing number of learning objects in an e-Learning Management System (LMS) environment. The aim of this research is to exploit Semantic Web (SW) and Knowledge Management (KM) technology, two emerging and promising technologies, to enhance the existing LMS. The proposed system is named the Semantic Web Aware-Knowledge Management Driven e-Learning System (SWA-KMDLS). An Ontology approach, the backbone of SW and KM, is introduced for managing knowledge, especially from learning objects, and for developing an automated question answering system (Aquas) with an expert locator in SWA-KMDLS. The METHONTOLOGY methodology is selected to develop the Ontology in this work. The potential of SW and KM technology identified in the research findings will benefit e-Learning developers building e-Learning systems, especially those following a social constructivist pedagogical approach, from the point of view of a KM framework and SW environment. The (semi-)automatic ontological knowledge base construction system (SAOKBCS) contributes to semi-automatic knowledge extraction from learning objects, whilst Aquas with the expert locator facilitates knowledge retrieval that encourages knowledge sharing in an e-Learning environment. The experiments conducted show that SAOKBCS can extract concepts, the main component of an Ontology, from text learning objects with a precision of 86.67%, thus saving experts the time and effort of building the Ontology manually. Additionally, the experiment on Aquas shows that more than 80% of users are satisfied with the answers provided by the system. The expert locator framework can also improve the performance of Aquas in future use. Keywords: Semantic Web Aware-Knowledge Management Driven e-Learning System (SWA-KMDLS), semi-automatic ontological knowledge base construction system (SAOKBCS), automated question answering system (Aquas), Ontology, expert locator
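
    For the reported 86.67% concept-extraction precision, a minimal worked example of the metric may help: precision is the fraction of extracted concepts that are correct (86.67% would correspond to, for instance, 13 correct concepts out of 15 extracted). The concept lists in the sketch below are invented for illustration and are not from the paper.

        # Illustrative precision calculation for concept extraction; the sets are made up.
        def precision(extracted, gold):
            """Share of extracted concepts that also appear in the gold-standard set."""
            extracted, gold = set(extracted), set(gold)
            return len(extracted & gold) / len(extracted) if extracted else 0.0

        extracted = ["ontology", "learning object", "metadata", "semantic web", "quiz"]
        gold = ["ontology", "learning object", "semantic web", "knowledge base"]
        print(f"{precision(extracted, gold):.2%}")  # 60.00%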