
    High-performance integrated virtual environment (HIVE) tools and applications for big data analysis

    The High-performance Integrated Virtual Environment (HIVE) is a high-throughput, cloud-based infrastructure developed for the storage and analysis of genomic and associated biological data. HIVE consists of a web-accessible interface for authorized users to deposit, retrieve, share, annotate, compute and visualize Next-generation Sequencing (NGS) data in a scalable and highly efficient fashion. The platform contains a distributed storage library and a distributed computational powerhouse linked seamlessly. Resources available through the interface include algorithms, tools and applications developed exclusively for the HIVE platform, as well as commonly used external tools adapted to operate within the parallel architecture of the system. HIVE is composed of a flexible infrastructure, which allows for simple implementation of new algorithms and tools. Currently available HIVE tools include sequence alignment and nucleotide variation profiling tools, metagenomic analyzers, phylogenetic tree-building tools using NGS data, clone discovery algorithms, and recombination analysis algorithms. In addition to tools, HIVE also provides knowledgebases that can be used in conjunction with the tools for NGS sequence and metadata analysis.
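The abstract mentions adapting external tools to HIVE's parallel architecture. A common pattern for that kind of adaptation, sketched here with purely illustrative names (this is not HIVE's actual API), is to shard the NGS reads and fan the shards out to workers, then merge the per-shard results:

```python
# Illustrative fan-out/merge sketch; align_chunk stands in for an
# external aligner invoked on one shard of reads.
from concurrent.futures import ThreadPoolExecutor

def align_chunk(reads):
    """Fake per-read 'alignment score' for demonstration purposes."""
    return {r: len(r) for r in reads}

def parallel_align(reads, workers=4):
    # Round-robin sharding keeps each worker's load roughly even.
    chunks = [reads[i::workers] for i in range(workers)]
    merged = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for result in pool.map(align_chunk, chunks):
            merged.update(result)
    return merged

reads = ["ACGT", "TTAGC", "GGC", "ACGTTA"]
print(len(parallel_align(reads)))  # one result per read: 4
```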

    Functional adaptivity for digital library services in e-infrastructures: the gCube approach

    We consider the problem of e-Infrastructures that wish to reconcile the generality of their services with the bespoke requirements of diverse user communities. We motivate the requirement of functional adaptivity in the context of gCube, a service-based system that integrates Grid and Digital Library technologies to deploy, operate, and monitor Virtual Research Environments defined over infrastructural resources. We argue that adaptivity requires mapping service interfaces onto multiple implementations, truly alternative interpretations of the same functionality. We then analyse two design solutions in which the alternative implementations are, respectively, full-fledged services and local components of a single service. We associate the latter with lower development costs and increased binding flexibility, and outline a strategy to deploy them dynamically as the payload of service plugins. The result is an infrastructure in which services exhibit multiple behaviours, know how to select the most appropriate behaviour, and can seamlessly learn new behaviours.
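The plugin-based design the abstract describes, one service interface with multiple locally deployed implementations ("behaviours") selected at call time, can be sketched roughly as follows. All class and method names here are illustrative assumptions, not gCube's actual API:

```python
class SearchBehaviour:
    """One alternative implementation of the same service interface."""
    def __init__(self, name, supports):
        self.name = name
        self.supports = supports  # predicate: can this behaviour serve the context?

    def execute(self, query):
        return f"{self.name} handled {query!r}"

class AdaptiveService:
    """Service that hosts plugins and selects the most appropriate one."""
    def __init__(self):
        self._behaviours = []

    def deploy_plugin(self, behaviour):
        # Dynamic deployment: new behaviours can be "learned" at runtime.
        self._behaviours.append(behaviour)

    def invoke(self, query, context):
        for b in self._behaviours:
            if b.supports(context):
                return b.execute(query)
        raise LookupError("no behaviour matches this context")

svc = AdaptiveService()
svc.deploy_plugin(SearchBehaviour("grid-search", lambda c: c == "grid"))
svc.deploy_plugin(SearchBehaviour("local-index", lambda c: c == "local"))
print(svc.invoke("metadata", "local"))  # → local-index handled 'metadata'
```

Compared with full-fledged alternative services, local plugins like these share the host service's lifecycle, which is the lower-cost, more flexible binding the authors favour.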

    Sharing a conceptual model of grid resources and services

    Grid technologies aim to enable coordinated resource-sharing and problem-solving capabilities over local and wide area networks, spanning locations, organizations, machine architectures and software boundaries. The heterogeneity of the resources involved and the need for interoperability among different grid middlewares require the sharing of a common information model. Abstractions of different flavors of resources and services, and conceptual schemas of domain-specific entities, require a collaborative effort in order to enable coherent cooperation among information services. With this paper, we present the results of our experience in modelling grid resources and services, carried out within the Grid Laboratory Uniform Environment (GLUE) effort, a joint collaboration of US and EU High Energy Physics projects working towards grid interoperability. The first implementation-neutral agreements on services such as batch computing and the storage manager, and on resources such as the cluster, sub-cluster and host hierarchy and the storage library, are presented. Design guidelines and operational results are described, together with open issues and future evolutions. Comment: 4 pages, 0 figures, CHEP 200
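The cluster, sub-cluster and host hierarchy the abstract mentions lends itself to a small, implementation-neutral data model. The attribute names below are illustrative assumptions, not the GLUE schema's actual element names:

```python
# Hypothetical sketch of a cluster -> sub-cluster -> host hierarchy.
from dataclasses import dataclass, field

@dataclass
class Host:
    name: str
    cpus: int

@dataclass
class SubCluster:
    name: str
    hosts: list = field(default_factory=list)

@dataclass
class Cluster:
    name: str
    sub_clusters: list = field(default_factory=list)

    def total_cpus(self):
        # Aggregate a property up the hierarchy, as an information
        # service would when publishing cluster-level capacity.
        return sum(h.cpus for sc in self.sub_clusters for h in sc.hosts)

farm = Cluster("farm", [SubCluster("sc1", [Host("n1", 8), Host("n2", 8)])])
print(farm.total_cpus())  # 16
```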

    Servicing the federation: the case for metadata harvesting

    The paper presents a comparative analysis of metadata harvesting and distributed computing as complementary models of service delivery within large-scale federated digital libraries. Informed by requirements of flexibility and scalability of federated services, the analysis focuses on the identification and assessment of model invariants. In particular, it abstracts over application domains, services, and protocol implementations. The analytical evidence produced shows that the harvesting model offers stronger guarantees of satisfying the identified requirements. In addition, it suggests a first characterisation of services based on their suitability to either model, and thus indicates how they could be integrated in the context of a single federated digital library.
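The harvesting model the paper argues for is typified by OAI-PMH-style selective harvesting: a federation service periodically pulls only the records changed since its last visit, rather than querying providers live. A minimal sketch of that incremental pull, with record layout and function names assumed for illustration:

```python
# Illustrative in-memory sketch of datestamp-based selective harvesting.
from datetime import date

# A provider's records: (identifier, datestamp, metadata)
provider = [
    ("oai:lib:1", date(2004, 1, 10), {"title": "Collection descriptions"}),
    ("oai:lib:2", date(2004, 3, 5), {"title": "Metadata harvesting"}),
    ("oai:lib:3", date(2004, 6, 20), {"title": "Terminology services"}),
]

def harvest(records, since):
    """Return records added or changed on/after the last harvest date."""
    return [(i, d, m) for (i, d, m) in records if d >= since]

# The first harvest takes everything; later harvests are incremental,
# which is what makes the model scale across a large federation.
full = harvest(provider, date(2004, 1, 1))
incremental = harvest(provider, date(2004, 4, 1))
print(len(full), len(incremental))  # 3 1
```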

    Collection-level description: metadata of the future?

    The potential for digital library growth has recently called into question the ability of users to navigate large, distributed and heterogeneous collections. This paper attempts to summarise some of the potential benefits to be derived through the implementation of collection-level descriptions, for both user resource discovery and institutional collection management. In particular, the concept of "functional granularity" is introduced and some related issues are briefly explored.

    SPEIR: Scottish Portals for Education, Information and Research. Final Project Report: Elements and Future Development Requirements of a Common Information Environment for Scotland

    The SPEIR (Scottish Portals for Education, Information and Research) project was funded by the Scottish Library and Information Council (SLIC). It ran from February 2003 to September 2004, slightly longer than the 18 months originally scheduled, and was managed by the Centre for Digital Library Research (CDLR). With SLIC's agreement, community stakeholders were represented in the project by the Confederation of Scottish Mini-Cooperatives (CoSMiC), an organisation whose members include SLIC, the National Library of Scotland (NLS), the Scottish Further Education Unit (SFEU), the Scottish Confederation of University and Research Libraries (SCURL), regional cooperatives such as the Ayrshire Libraries Forum (ALF), and representatives from the Museums and Archives communities in Scotland. Aims: a common information environment for Scotland. The aims of the project were to:
    - conduct basic research into the distributed information infrastructure requirements of the Scottish Cultural Portal pilot and the public library CAIRNS integration proposal;
    - develop associated pilot facilities by enhancing existing facilities or developing new ones;
    - ensure that both infrastructure proposals and pilot facilities were sufficiently generic to be utilised in support of other portals developed by the Scottish information community;
    - ensure the interoperability of infrastructural elements beyond Scotland through adherence to established or developing national and international standards.
    Since the Scottish information landscape is taken by CoSMiC members to encompass relevant activities in Archives, Libraries, Museums, and related domains, the project was, in essence, concerned with identifying, researching, and developing the elements of an internationally interoperable common information environment for Scotland, and with determining the best path for future progress.

    Enabling Interactive Analytics of Secure Data using Cloud Kotta

    Research, especially in the social sciences and humanities, is increasingly reliant on the application of data science methods to analyze large amounts of (often private) data. Secure data enclaves provide a solution for managing and analyzing private data. However, such enclaves do not readily support discovery science: a form of exploratory or interactive analysis by which researchers execute a range of (sometimes large) analyses in an iterative and collaborative manner. The batch computing model offered by many data enclaves is well suited to executing large compute tasks; however, it is far from ideal for day-to-day discovery science. As researchers must submit jobs to queues and wait for results, the high latencies inherent in queue-based, batch computing systems hinder interactive analysis. In this paper we describe how we have augmented the Cloud Kotta secure data enclave to support collaborative and interactive analysis of sensitive data. Our model uses Jupyter notebooks as a flexible analysis environment and Python language constructs to support the execution of arbitrary functions on private data within this secure framework. Comment: To appear in Proceedings of Workshop on Scientific Cloud Computing, Washington, DC USA, June 2017 (ScienceCloud 2017), 7 page
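The idea of "Python language constructs to support the execution of arbitrary functions on private data" can be sketched with a decorator that stands in for remote execution inside the enclave. This is a rough illustration of the pattern, not Cloud Kotta's actual API; the data, names and decorator are all assumptions:

```python
# Hypothetical sketch: ship the analysis to the data, return only results.
import functools

PRIVATE_DATA = {"incomes": [41000, 38500, 72000, 55250]}  # never exported

def run_in_enclave(func):
    """Stand-in for remote execution next to the private data."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # In the real system this step would serialize func, execute it
        # on enclave workers, and release only the derived result.
        return func(PRIVATE_DATA, *args, **kwargs)
    return wrapper

@run_in_enclave
def mean_income(data):
    incomes = data["incomes"]
    return sum(incomes) / len(incomes)

print(mean_income())  # the aggregate leaves the enclave; raw rows do not
```

Because the researcher writes an ordinary function in a notebook and the framework handles placement, the interaction stays interactive rather than queue-and-wait.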

    Collaborative tagging as a knowledge organisation and resource discovery tool

    The purpose of the paper is to provide an overview of the collaborative tagging phenomenon and explore some of the reasons for its emergence. Design/methodology/approach - The paper reviews the related literature and discusses some of the problems associated with, and the potential of, collaborative tagging approaches for knowledge organisation and general resource discovery. A definition of controlled vocabularies is proposed and used to assess the efficacy of collaborative tagging. An exposition of the collaborative tagging model is provided and a review of the major contributions to the tagging literature is presented. Findings - There are numerous difficulties with collaborative tagging systems (e.g. low precision, lack of collocation, etc.) that originate from the absence of properties that characterise controlled vocabularies. However, such systems cannot be dismissed. Librarians and information professionals have lessons to learn from the interactive and social aspects exemplified by collaborative tagging systems, as well as from their success in engaging users with information management. The future co-existence of controlled vocabularies and collaborative tagging is predicted, with each appropriate for use within distinct information contexts: formal and informal. Research limitations/implications - Librarians and information professional researchers should be playing a leading role in research aimed at assessing the efficacy of collaborative tagging in relation to information storage, organisation, and retrieval, and in influencing the future development of collaborative tagging systems. Practical implications - The paper indicates clear areas where digital libraries and repositories could innovate in order to better engage users with information. Originality/value - At the time of writing there were no literature reviews summarising the main contributions to collaborative tagging research or the surrounding debate.

    Terminology server for improved resource discovery: analysis of model and functions

    This paper considers the potential to improve distributed information retrieval via a terminology server. The restriction upon effective resource discovery caused by the use of disparate terminologies across services and collections is outlined, before considering a DDC spine-based approach involving inter-scheme mapping as a possible solution. The developing HILT model is discussed alongside other existing models and alternative approaches to solving the terminologies problem. Results from the current HILT pilot are presented to illustrate functionality, and suggestions are made for further research and development.
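The spine-based inter-scheme mapping the abstract describes can be sketched as two lookup steps: a term in one scheme maps onto a shared DDC notation (the "spine"), and the notation maps back out to preferred terms in another scheme. The mappings below are invented for illustration, not HILT's actual data:

```python
# Hypothetical DDC-spine mapping tables.
# scheme -> term -> DDC notation (the spine)
TO_SPINE = {
    "LCSH": {"Ornithology": "598", "Birds": "598"},
    "MeSH": {"Birds": "598"},
}
# DDC notation -> scheme -> preferred terms
FROM_SPINE = {
    "598": {"LCSH": ["Birds", "Ornithology"], "MeSH": ["Birds"]},
}

def translate(term, source_scheme, target_scheme):
    """Map a query term to the target scheme via the DDC spine."""
    notation = TO_SPINE.get(source_scheme, {}).get(term)
    if notation is None:
        return []  # no mapping known for this term
    return FROM_SPINE.get(notation, {}).get(target_scheme, [])

print(translate("Ornithology", "LCSH", "MeSH"))  # ['Birds']
```

Routing every pair of schemes through one spine keeps the number of mappings linear in the number of schemes, rather than quadratic in scheme pairs, which is the usual argument for a spine-based terminology server.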