1,093 research outputs found
Ontology Driven Web Extraction from Semi-structured and Unstructured Data for B2B Market Analysis
The Market Blended Insight project aims to improve UK business-to-business marketing performance using semantic web technologies. In this project, we are implementing an ontology-driven web extraction and translation framework to supplement our backend triple store of UK companies, people and geographical information. It handles both semi-structured data and unstructured text on the web, annotating the extracted data and then translating it according to the backend schema.
Local search engine with global content based on domain specific knowledge
With the growing need for information, we have come to rely on search engines. The use of large-scale search engines, such as Google, is as common as surfing the World Wide Web. We are impressed with the capabilities of these search engines, but there is still a need for improvement. A common problem with searching is the ambiguity of words. Their meaning often depends on the context in which they are used or varies across specific domains. To resolve this, we propose a domain-specific search engine that is globally oriented. We intend to provide content classification according to the target domain concepts, access to privileged information, personalization and custom ranking functions. Domain-specific concepts have been formalized in the form of an ontology. The paper describes our approach to a centralized search service for domain-specific content. The approach uses automated indexing for various content sources, which can take the form of a relational database, web service, web portal or page, various document formats and other structured or unstructured data. The gathered data is tagged using various approaches and classified against the domain classification. The indexed data is accessible through a highly optimized and personalized search service.
Towards a Cloud-Based Service for Maintaining and Analyzing Data About Scientific Events
We propose the new cloud-based service OpenResearch for managing and
analyzing data about scientific events such as conferences and workshops in a
persistent and reliable way. This includes data about scientific articles,
participants, acceptance rates, submission numbers, impact values as well as
organizational details such as program committees, chairs, fees and sponsors.
OpenResearch is a centralized repository for scientific events and supports
researchers in collecting, organizing, sharing and disseminating information
about scientific events in a structured way. An additional feature currently
under development is the possibility to archive web pages along with the
extracted semantic data in order to lift the burden of maintaining new and old
conference web sites from public research institutions. However, the main
advantage is that this cloud-based repository enables a comprehensive analysis
of conference data. Based on extracted semantic data, it is possible to
determine quality estimations, scientific communities, research trends as well
as the development of acceptance rates, fees, and number of participants in a
continuous way complemented by projections into the future. Furthermore, data
about research articles can be systematically explored using a content-based
analysis as well as citation linkage. All data maintained in this
crowd-sourcing platform is made freely available through an open SPARQL
endpoint, which allows for analytical queries in a flexible and user-defined
way.
Comment: A completed version of this paper had been accepted in SAVE-SD
workshop 2017 at WWW conference
On the evolution of hyperlinking
Across time, the hyperlink object has supported different applications and studies. This is one perspective on the evolution of the hyperlinking concept, its context and related behaviors. Through a spectrum of hyperlinking applications and practices, the article contrasts the status quo with its related, broader, conceptual roots; it also bridges to some theorized and prototyped hyperlink variations, namely "stigmergic hyperlinks", to make the case that the ubiquitousness of some objects and certain usage patterns can obfuscate opportunities to (re)think them. In trying to contribute an answer to "what has the common hyperlink (such an apparently simple object) done to society, and what has society done to it?", the article identifies situations that have become so embedded in the daily routine, that it is now hard to think of hyperlinking alternatives.
Contexts and Contributions: Building the Distributed Library
This report updates and expands on A Survey of Digital Library Aggregation Services, originally commissioned by the DLF as an internal report in summer 2003, and released to the public later that year. It highlights major developments affecting the ecosystem of scholarly communications and digital libraries since the last survey and provides an analysis of OAI implementation demographics, based on a comparative review of repository registries and cross-archive search services. Secondly, it reviews the state-of-practice for a cohort of digital library aggregation services, grouping them in the context of the problem space to which they most closely adhere. Based in part on responses collected in fall 2005 from an online survey distributed to the original core services, the report investigates the purpose, function and challenges of next-generation aggregation services. On a case-by-case basis, the advances in each service are of interest in isolation from each other, but the report also attempts to situate these services in a larger context and to understand how they fit into a multi-dimensional and interdependent ecosystem supporting the worldwide community of scholars. Finally, the report summarizes the contributions of these services thus far and identifies obstacles requiring further attention to realize the goal of an open, distributed digital library system
Using RSS to Improve Web Harvest Results for News Web Sites
In the last several years, the Library of Congress Web archiving program has grown to include large sites that publish news. Over more than a year, we learned that these sites present serious challenges. After thinking through the use cases for archived online news sites, we realized that completeness of harvest was paramount. As we developed our understanding of the deficiencies in the completeness of harvests of these kinds of sites, we began to test the use of RSS feeds to build customized seed lists for shallow crawls as the primary way these sites are crawled. Over time we discovered that while completeness of harvest was greatly improved, we had a new problem with the ability to browse to all harvested content. This article is a case study describing these iterative experiences, which are a work in progress.
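The idea of building a seed list from an RSS feed can be illustrated with a minimal sketch. This is not the Library of Congress workflow, which the abstract does not detail; it only assumes an RSS 2.0 feed whose items carry a link element. The function name seed_urls_from_rss, the sample feed, and the example.org URLs are all hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List


def seed_urls_from_rss(rss_xml: str) -> List[str]:
    """Return the unique <item><link> URLs of an RSS 2.0 document, in feed order.

    Hypothetical helper for illustration: each returned URL would serve as a
    seed for a shallow crawl.
    """
    root = ET.fromstring(rss_xml)
    seeds = []
    seen = set()
    # Only <item> elements are inspected, so the channel's own <link> is skipped.
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        if link and link not in seen:
            seen.add(link)
            seeds.append(link)
    return seeds


# Made-up feed used only to exercise the function.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
<title>Example News</title>
<link>https://news.example.org/</link>
<item><title>A</title><link>https://news.example.org/a</link></item>
<item><title>B</title><link> https://news.example.org/b </link></item>
<item><title>A again</title><link>https://news.example.org/a</link></item>
</channel></rss>"""

if __name__ == "__main__":
    for url in seed_urls_from_rss(SAMPLE_FEED):
        print(url)
```

Deduplicating while preserving feed order keeps the seed list small and stable when a feed repeats a story. A real harvest would also need to poll the feed on a schedule, since feeds only expose recent items.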
SEARCH ENGINE OPTIMIZATION: A REVIEW
Because of the huge number of websites available, the search engine has a critical role in presenting the correct pages to the user. Search engines such as Google use the Page Ranking Algorithm to rate web pages according to the nature of their content and their presence on the World Wide Web. SEO can be characterized as a methodology used to promote a site with the end goal of achieving a high rank, i.e., a top result. In this paper, the authors present search engine optimization for the major search engines (Google, Bing, MSN, Yahoo, etc.) and compare its performance across them. The authors also present the benefits, limitations, challenges, and business applications of search engine optimization.
Web Archiving Environmental Scan
Environmental scan of Web archiving activities at university libraries around the United States