1,093 research outputs found

    Ontology Driven Web Extraction from Semi-structured and Unstructured Data for B2B Market Analysis

    The Market Blended Insight project aims to improve UK business-to-business marketing performance using semantic web technologies. In this project, we are implementing an ontology-driven web extraction and translation framework to supplement our backend triple store of UK companies, people and geographical information. It handles both semi-structured data and unstructured text on the web, annotating the extracted data and then translating it according to the backend schema.

    Local search engine with global content based on domain specific knowledge

    With the growing need for information, we have come to rely on search engines. The use of large-scale search engines, such as Google, is as common as surfing the World Wide Web. We are impressed with the capabilities of these search engines, but there is still a need for improvement. A common problem with searching is the ambiguity of words: their meaning often depends on the context in which they are used or varies across specific domains. To resolve this, we propose a domain-specific search engine that is globally oriented. We intend to provide content classification according to the target domain concepts, access to privileged information, personalization and custom ranking functions. Domain-specific concepts have been formalized in the form of an ontology. The paper describes our approach to a centralized search service for domain-specific content. The approach uses automated indexing for various content sources, which can take the form of a relational database, web service, web portal or page, various document formats and other structured or unstructured data. The gathered data is tagged using various approaches and classified against the domain classification. The indexed data is accessible through a highly optimized and personalized search service.

    Towards a Cloud-Based Service for Maintaining and Analyzing Data About Scientific Events

    We propose the new cloud-based service OpenResearch for managing and analyzing data about scientific events such as conferences and workshops in a persistent and reliable way. This includes data about scientific articles, participants, acceptance rates, submission numbers, impact values as well as organizational details such as program committees, chairs, fees and sponsors. OpenResearch is a centralized repository for scientific events and supports researchers in collecting, organizing, sharing and disseminating information about scientific events in a structured way. An additional feature currently under development is the possibility to archive web pages along with the extracted semantic data in order to lift the burden of maintaining new and old conference web sites from public research institutions. However, the main advantage is that this cloud-based repository enables a comprehensive analysis of conference data. Based on extracted semantic data, it is possible to determine quality estimations, scientific communities and research trends, as well as the development of acceptance rates, fees, and number of participants in a continuous way, complemented by projections into the future. Furthermore, data about research articles can be systematically explored using a content-based analysis as well as citation linkage. All data maintained in this crowd-sourcing platform is made freely available through an open SPARQL endpoint, which allows for analytical queries in a flexible and user-defined way. Comment: A completed version of this paper was accepted at the SAVE-SD 2017 workshop at the WWW conference.

    On the evolution of hyperlinking

    Across time, the hyperlink object has supported different applications and studies. This is one perspective on the evolution of the hyperlinking concept, its context and related behaviors. Through a spectrum of hyperlinking applications and practices, the article contrasts the status quo with its related, broader, conceptual roots; it also bridges to some theorized and prototyped hyperlink variations, namely "stigmergic hyperlinks", to make the case that the ubiquitousness of some objects and certain usage patterns can obfuscate opportunities to (re)think them. In trying to contribute an answer to "what has the common hyperlink (such an apparently simple object) done to society, and what has society done to it?", the article identifies situations that have become so embedded in the daily routine that it is now hard to think of hyperlinking alternatives.

    Desk Set: Ready Reference on the Web

    Contexts and Contributions: Building the Distributed Library

    This report updates and expands on A Survey of Digital Library Aggregation Services, originally commissioned by the DLF as an internal report in summer 2003, and released to the public later that year. It highlights major developments affecting the ecosystem of scholarly communications and digital libraries since the last survey and provides an analysis of OAI implementation demographics, based on a comparative review of repository registries and cross-archive search services. Secondly, it reviews the state of practice for a cohort of digital library aggregation services, grouping them in the context of the problem space to which they most closely adhere. Based in part on responses collected in fall 2005 from an online survey distributed to the original core services, the report investigates the purpose, function and challenges of next-generation aggregation services. The advances in each service are of interest on a case-by-case basis, but the report also attempts to situate these services in a larger context and to understand how they fit into a multi-dimensional and interdependent ecosystem supporting the worldwide community of scholars. Finally, the report summarizes the contributions of these services thus far and identifies obstacles requiring further attention to realize the goal of an open, distributed digital library system.

    Using RSS to Improve Web Harvest Results for News Web Sites

    In the last several years, the Library of Congress Web archiving program has grown to include large sites that publish news. Over more than a year, we learned that they present serious challenges. After thinking through the use cases for archived online news sites, we realized that completeness of harvest was paramount. As we developed our understanding of deficiencies in the completeness of harvests for these kinds of sites, we began to test the use of RSS feeds to build customized seed lists for shallow crawls as the primary way these sites are crawled. Over time we discovered that, while completeness of harvest was greatly improved, we had a new problem with the ability to browse to all harvested content. This article is a case study describing these iterative experiences, which are a work in progress.
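
The idea of building a crawl seed list from an RSS feed can be illustrated with a minimal sketch. This is not the Library of Congress workflow, which the abstract does not describe in detail; the function name `seed_urls_from_rss`, the sample feed and the `news.example.org` URLs are all hypothetical, and the code assumes a plain RSS 2.0 document in which each story appears as a `<link>` inside an `<item>`.

```python
import xml.etree.ElementTree as ET


def seed_urls_from_rss(rss_xml: str) -> list[str]:
    """Return the unique item links from an RSS 2.0 document, in feed order.

    Hypothetical helper for illustration only; a real harvest would also
    fetch the feed over HTTP and handle Atom feeds and malformed XML.
    """
    root = ET.fromstring(rss_xml)
    seen = set()
    seeds = []
    # RSS 2.0 places each story at <rss><channel><item><link>.
    for link in root.iterfind("./channel/item/link"):
        url = (link.text or "").strip()
        if url and url not in seen:
            seen.add(url)
            seeds.append(url)
    return seeds


# Made-up feed used only to exercise the function.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>https://news.example.org/</link>
    <item><title>Story A</title><link>https://news.example.org/a</link></item>
    <item><title>Story B</title><link>https://news.example.org/b</link></item>
    <item><title>Story A again</title><link>https://news.example.org/a</link></item>
  </channel>
</rss>"""

if __name__ == "__main__":
    for url in seed_urls_from_rss(SAMPLE_FEED):
        print(url)
```

Each returned URL would then be handed to the crawler as a seed for a shallow crawl. Deduplicating while preserving feed order keeps the seed list small when a feed repeats a story.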

    SEARCH ENGINE OPTIMIZATION: A REVIEW

    Because a huge number of websites are available, search engines play a critical role in presenting the correct pages to the user. Search engines such as Google use a page ranking algorithm to rate web pages according to the nature of their content and their presence on the World Wide Web. Search engine optimization (SEO) can be characterized as the methodology used to promote a site so that it achieves a high rank, i.e., appears among the top results. In this paper, the authors present the main search engines targeted by optimization (Google, Bing, MSN, Yahoo, etc.) and compare them by search engine optimization performance. The authors also present the benefits, limitations, challenges, and business applications of search engine optimization.