3,437 research outputs found

    Digitometric Services for Open Archives Environments

    We describe “digitometric” services and tools that add value to open-access eprint archives using the Open Archives Initiative (OAI) Protocol for Metadata Harvesting. Celestial is an OAI cache and gateway tool. Citebase Search enhances OAI-harvested metadata with linked references harvested from the full-text to provide a web service for citation navigation and research impact analysis. Digitometrics builds on data harvested using OAI to provide advanced visualisation and hypertext navigation for the research community. Together these services provide a modular, distributed architecture for building a “semantic web” for the research literature

    Understanding and Improving the Performance of Web Page Loads

    The web is vital to our daily lives, yet web pages are often slow to load. The inefficiency and complexity of loading web pages can be attributed to the dependencies between resources within a web page, which also leads to underutilization of the CPU and network on client devices. My thesis research seeks solutions that enable better use of the client-side CPU and network during page loads. Such solutions can be categorized into three types of approaches: 1) leveraging a proxy to optimize web page loads, 2) modifying the end-to-end interaction between client browsers and web servers, and 3) rewriting web pages. Each approach offers various benefits and trade-offs. This dissertation explores three specific solutions. First, CASPR is a proxy-based solution that enables clients to offload JavaScript computations to proxies. CASPR loads web pages on behalf of clients and transforms every page into a version that is simpler for clients to process, leading to a 1.7s median improvement in web page rendering for popular CASPR web pages. Second, Vroom rethinks how page loads work; in order to minimize dependencies between resources, it enables web servers to provide resource hints to clients and ensures that resources are loaded with proper prioritization. As a result, Vroom halves the median load times for popular news and sports websites. Finally, I conducted a longitudinal study to understand how web pages have changed over time and how these changes have affected performance.
    PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/163157/1/vaspol_1.pd

    Federating Heterogeneous Digital Libraries by Metadata Harvesting

    This dissertation studies the challenges and issues faced in federating heterogeneous digital libraries (DLs) by metadata harvesting. The objective of federation is to provide high-level services (e.g. transparent search across all DLs) on the collective metadata from different digital libraries. There are two main approaches to federating DLs: the distributed searching approach and the harvesting approach. Because the distributed searching approach relies on executing queries against digital libraries in real time, it has problems with scalability. The difficulty of creating a distributed searching service for a large federation is the motivation behind the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). OAI-PMH supports both data providers (repositories, archives) and service providers. Service providers develop value-added services based on the information collected from data providers. Data providers are simply collections of harvestable metadata. This dissertation examines the application of the metadata harvesting approach in DL federations. It addresses the following problems: (1) Whether or not metadata harvesting provides a realistic and scalable solution for DL federation. (2) What is the status of and problems with current data provider implementations, and how to solve these problems. (3) How to synchronize data providers and service providers. (4) How to build different types of federation services over harvested metadata. (5) How to create a scalable and reliable infrastructure to support federation services. The work done in this dissertation is based on OAI-PMH, and the results have influenced the evolution of OAI-PMH. However, the results are not limited to the scope of OAI-PMH. Our approach is to design and build key services for metadata harvesting and to deploy them on the Web. Implementing a publicly available service allows us to demonstrate that these approaches are practical. The problems posed above are evaluated by performing experiments over these services. To summarize the results of this thesis, we conclude that the metadata harvesting approach is a realistic and scalable approach to federating heterogeneous DLs. We present two models of building federation services: a centralized model and a replicated model. Our experiments also demonstrate that the repository synchronization problem can be addressed by push, pull, and hybrid push/pull models; each model has its strengths and weaknesses and fits a specific scenario. Finally, we present a scalable and reliable infrastructure to support the applications of metadata harvesting.
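As a rough illustration of the pull side of metadata harvesting described in this abstract, the sketch below builds an OAI-PMH `ListRecords` request URL and parses a response for identifiers, titles, and the resumption token. It is not the dissertation's implementation: the function names, the `example.org` base URL, and the sample response are assumptions made for illustration, and a real harvester would also need HTTP fetching and error handling.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Standard OAI-PMH 2.0 and Dublin Core namespaces.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, resumption_token=None):
    """Build a ListRecords request (hypothetical helper, not a library API)."""
    if resumption_token:
        # A resumption token is sent on its own, without other arguments.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # incremental harvest since this date
    return base_url + "?" + urlencode(params)

def parse_list_records(xml_text):
    """Return ([(identifier, title), ...], resumption_token_or_None)."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        title = rec.findtext(".//dc:title", namespaces=NS)
        records.append((identifier, title))
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)

# Made-up response from an assumed data provider.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>token-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, token = parse_list_records(SAMPLE)
next_url = list_records_url("http://example.org/oai", resumption_token=token)
```

A service provider following the pull model would repeat the request with each returned resumption token until none is returned, then store the date of the harvest for the next incremental run.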

    Towards a biodiversity knowledge graph

    One way to think about "core" biodiversity data is as a network of connected entities, such as taxa, taxonomic names, publications, people, species, sequences, images, and collections that form the "biodiversity knowledge graph". Many questions in biodiversity informatics can be framed as paths in this graph. This article explores this further, and sketches a set of services and tools we would need in order to construct the graph.
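To make "questions framed as paths" concrete, here is a minimal sketch that treats the graph as an adjacency list and answers "how is this specimen connected to this person?" with a breadth-first search. The node identifiers and links are invented for illustration and do not come from the article.

```python
from collections import deque

# Toy graph: every node and edge here is a made-up example.
GRAPH = {
    "specimen:1": ["taxon:A", "collection:X"],
    "taxon:A": ["name:Aus bus", "sequence:S1"],
    "name:Aus bus": ["publication:P1"],
    "publication:P1": ["person:Smith"],
    "sequence:S1": [],
    "collection:X": [],
    "person:Smith": [],
}

def find_path(graph, start, goal):
    """Return a shortest path from start to goal as a list of nodes, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None

path = find_path(GRAPH, "specimen:1", "person:Smith")
```

Here the path runs from the specimen through its taxon, the taxonomic name, and the publication of that name to its author, which is the kind of chained lookup the abstract has in mind.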