The RDFa Content Editor - From WYSIWYG to WYSIWYM

Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

WebBANC: Building Semantically-Rich Annotated Corpora from Web User Annotations of Minority Languages

Author: Breimyer Paul
Green Nathan
Kumar Vinay
Samatova Nagiza F
Publication date: 13/05/2009

Proceedings of the 17th Nordic Conference of Computational Linguistics NODALIDA 2009. Editors: Kristiina Jokinen and Eckhard Bick. NEALT Proceedings Series, Vol. 4 (2009), 48-56. © 2009 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/9206

DSpace at Tartu University Library

Information Extraction in Illicit Domains

Author: Banko M.
Bauer F.
Chakrabarti S.
Kushmerick N.
Mikolov T.
Sahlgren M.
Wick M.
Zouaq A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 08/03/2017

Extracting useful entities and attribute values from illicit domains such as human trafficking is a challenging problem with the potential for widespread social impact. Such domains employ atypical language models, have 'long tails' and suffer from the problem of concept drift. In this paper, we propose a lightweight, feature-agnostic Information Extraction (IE) paradigm specifically designed for such domains. Our approach uses raw, unlabeled text from an initial corpus, and a few (12-120) seed annotations per domain-specific attribute, to learn robust IE models for unobserved pages and websites. Empirically, we demonstrate that our approach can outperform feature-centric Conditional Random Field baselines by over 18% F-Measure on five annotated sets of real-world human trafficking datasets in both low-supervision and high-supervision settings. We also show that our approach is demonstrably robust to concept drift, and can be efficiently bootstrapped even in a serial computing environment. Comment: 10 pages, ACM WWW 201

arXiv.org e-Print Archive

Crossref

Features for Killer Apps from a Semantic Web Perspective

Author: Alani Harith
Kalfoglou Yannis
O'Hara Kieron
Shadbolt Nigel
Publication venue: Information Science Reference
Publication date: 01/01/2008

There are certain features that distinguish killer apps from other ordinary applications. This chapter examines those features in the context of the semantic web, in the hope that a better understanding of the characteristics of killer apps might encourage their consideration when developing semantic web applications. Killer apps are highly transformative technologies that create new e-commerce venues and widespread patterns of behaviour. Information technology, generally, and the Web, in particular, have benefited from killer apps to create new networks of users and increase their value. The semantic web community, on the other hand, is still awaiting a killer app that proves the superiority of its technologies. The authors hope that this chapter will help to highlight some of the common ingredients of killer apps in e-commerce, and discuss how such applications might emerge in the semantic web.

Southampton (e-Prints Soton)

Open Research Online (The Open University)

Large-Scale Pattern-Based Information Extraction from the World Wide Web

Author: Blohm Sebastian
Publication venue: KIT Scientific Publishing
Publication date: 30/07/2019

Extracting information from text is the task of obtaining structured, machine-processable facts from information that is mentioned in an unstructured manner. It thus allows systems to automatically aggregate information for further analysis, efficient retrieval, automatic validation, or appropriate visualization. This work explores the potential of using textual patterns for Information Extraction from the World Wide Web

Directory of Open Access Books (DOAB)

Automatic Annotating Search Results with Relevance Feedback for User Search Goals

Author: Ms. Ashwini Dere
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/07/2015

Information is retrieved from web databases, which contain data in HTML format. To better understand user needs, the HTML pages must be extracted and labeled, which means data alignment is needed for the data units of the HTML documents. Then, each group is annotated from different aspects, and the different annotations are aggregated to predict a final annotation label for it. An annotation wrapper for the search site is automatically constructed and can be used to annotate new result pages from the same web database. Users search with accuracy and speed goals is to study law. This method limits the conditions suffered in the search accuracy and speed. Currently the main aim for more improvements and approaches to Web user satisfaction of search is the basis for the goals. Users search for goals different methods literature review to present the new framework and proposed methods and insightful analysis algorithms and evaluate its performance. First, we propose a framework for automatically annotating retrieved documents by clustering documents with the same content and assigning data units to each cluster. Feedback sessions are constructed from user click-through logs and can efficiently reflect the information needs of users. Finally, we propose a new criterion, "Classified Average Precision (CAP)", to evaluate the performance of inferring user search goals. Experimental results are presented using user click-through logs from a commercial search engine to validate the effectiveness of our proposed methods. DOI: 10.17762/ijritcc2321-8169.15076

International Journal on Recent and Innovation Trends in Computing and Communication

Integrating institutional repositories into the Semantic Web

Author: Mason Harry Jon
Publication date: 01/01/2008

The Web has changed the face of scientific communication; and the Semantic Web promises new ways of adding value to research material by making it more accessible to automatic discovery, linking, and analysis. Institutional repositories contain a wealth of information which could benefit from the application of this technology. In this thesis I describe the problems inherent in the informality of traditional repository metadata, and propose a data model based on the Semantic Web which will support more efficient use of this data, with the aim of streamlining scientific communication and promoting efficient use of institutional research output

Southampton (e-Prints Soton)

OpenGrey Repository

Semantic Web meets Web 2.0 (and vice versa): The Value of the Mundane for the Semantic Web

Author: Russell A.
schraefel m.c.
Smith D.A.
Wilson M.L.
Publication venue: s.n.
Publication date: 01/01/2006

Web 2.0, not the Semantic Web, has become the face of “the next generation Web” among the tech-literate set, and even among many in the various research communities involved in the Web. Perceptions in these communities of what the Semantic Web is (and who is involved in it) are often misinformed if not misguided. In this paper we identify opportunities for Semantic Web activities to connect with the Web 2.0 community; we explore why this connection is of significant benefit to both groups, and identify how these connections open valuable research opportunities “in the real” for the Semantic Web effort

Southampton (e-Prints Soton)