930 research outputs found

    The RDFa Content Editor - From WYSIWYG to WYSIWYM

    WebBANC: Building Semantically-Rich Annotated Corpora from Web User Annotations of Minority Languages

    Proceedings of the 17th Nordic Conference of Computational Linguistics NODALIDA 2009. Editors: Kristiina Jokinen and Eckhard Bick. NEALT Proceedings Series, Vol. 4 (2009), 48-56. © 2009 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/9206

    Information Extraction in Illicit Domains

    Extracting useful entities and attribute values from illicit domains such as human trafficking is a challenging problem with the potential for widespread social impact. Such domains employ atypical language models, have "long tails" and suffer from the problem of concept drift. In this paper, we propose a lightweight, feature-agnostic Information Extraction (IE) paradigm specifically designed for such domains. Our approach uses raw, unlabeled text from an initial corpus, and a few (12-120) seed annotations per domain-specific attribute, to learn robust IE models for unobserved pages and websites. Empirically, we demonstrate that our approach can outperform feature-centric Conditional Random Field baselines by over 18% F-Measure on five annotated sets of real-world human trafficking datasets in both low-supervision and high-supervision settings. We also show that our approach is demonstrably robust to concept drift, and can be efficiently bootstrapped even in a serial computing environment. Comment: 10 pages, ACM WWW 201

    Features for Killer Apps from a Semantic Web Perspective

    There are certain features that distinguish killer apps from other ordinary applications. This chapter examines those features in the context of the semantic web, in the hope that a better understanding of the characteristics of killer apps might encourage their consideration when developing semantic web applications. Killer apps are highly transformative technologies that create new e-commerce venues and widespread patterns of behaviour. Information technology, generally, and the Web, in particular, have benefited from killer apps to create new networks of users and increase their value. The semantic web community, on the other hand, is still awaiting a killer app that proves the superiority of its technologies. The authors hope that this chapter will help to highlight some of the common ingredients of killer apps in e-commerce, and discuss how such applications might emerge in the semantic web

    Large-Scale Pattern-Based Information Extraction from the World Wide Web

    Extracting information from text is the task of obtaining structured, machine-processable facts from information that is mentioned in an unstructured manner. It thus allows systems to automatically aggregate information for further analysis, efficient retrieval, automatic validation, or appropriate visualization. This work explores the potential of using textual patterns for Information Extraction from the World Wide Web

    Automatic Annotating Search Results with Relevance Feedback for User Search Goals

    Information retrieved from web databases is typically embedded in HTML pages. To better understand user needs, the data units on these pages must be extracted, aligned, and assigned labels. Then, each aligned group is annotated from different aspects, and the different annotations are aggregated to predict a final annotation label for it. An annotation wrapper for the search site is automatically constructed and can be used to annotate new result pages from the same web database. The work also studies user search goals with respect to accuracy and speed, and the method limits the conditions that degrade search accuracy and speed. Inferring these goals is currently a basis for further improving web users' satisfaction with search. After a literature review of methods for inferring user search goals, we present a new framework, the proposed methods, and an analysis of the algorithms, and we evaluate its performance. First, we propose a framework for automatically annotating retrieved documents by clustering documents with similar content and assigning data units to each cluster. Feedback sessions are constructed from user click-through logs and can efficiently reflect the information needs of users. Finally, we propose a new criterion, “Classified Average Precision (CAP)”, to evaluate the performance of inferring user search goals. Experimental results are presented using user click-through logs from a commercial search engine to validate the effectiveness of our proposed methods. DOI: 10.17762/ijritcc2321-8169.15076

    Integrating institutional repositories into the Semantic Web

    The Web has changed the face of scientific communication; and the Semantic Web promises new ways of adding value to research material by making it more accessible to automatic discovery, linking, and analysis. Institutional repositories contain a wealth of information which could benefit from the application of this technology. In this thesis I describe the problems inherent in the informality of traditional repository metadata, and propose a data model based on the Semantic Web which will support more efficient use of this data, with the aim of streamlining scientific communication and promoting efficient use of institutional research output

    Semantic Web meets Web 2.0 (and vice versa): The Value of the Mundane for the Semantic Web

    Web 2.0, not the Semantic Web, has become the face of “the next generation Web” among the tech-literate set, and even among many in the various research communities involved in the Web. Perceptions in these communities of what the Semantic Web is (and who is involved in it) are often misinformed if not misguided. In this paper we identify opportunities for Semantic Web activities to connect with the Web 2.0 community; we explore why this connection is of significant benefit to both groups, and identify how these connections open valuable research opportunities “in the real” for the Semantic Web effort