
    Impliance: A Next Generation Information Management Appliance

    ably successful in building a large market and adapting to the changes of the last three decades, its impact on the broader market of information management is surprisingly limited. If we were to design an information management system from scratch, based upon today's requirements and hardware capabilities, would it look anything like today's database systems?" In this paper, we introduce Impliance, a next-generation information management system consisting of hardware and software components integrated to form an easy-to-administer appliance that can store, retrieve, and analyze all types of structured, semi-structured, and unstructured information. We first summarize the trends that will shape information management for the foreseeable future. Those trends imply three major requirements for Impliance: (1) to be able to store, manage, and uniformly query all data, not just structured records; (2) to be able to scale out as the volume of this data grows; and (3) to be simple and robust in operation. We then describe four key ideas that are uniquely combined in Impliance to address these requirements, namely: (a) integrating software and off-the-shelf hardware into a generic information appliance; (b) automatically discovering, organizing, and managing all data - unstructured as well as structured - in a uniform way; (c) achieving scale-out by exploiting simple, massively parallel processing; and (d) virtualizing compute and storage resources to unify, simplify, and streamline the management of Impliance. Impliance is an ambitious, long-term effort to define simpler, more robust, and more scalable information systems for tomorrow's enterprises. Comment: This article is published under a Creative Commons License Agreement (http://creativecommons.org/licenses/by/2.5/). You may copy, distribute, display, and perform the work, make derivative works, and make commercial use of the work, but you must attribute the work to the author and CIDR 2007, 3rd Biennial Conference on Innovative Data Systems Research (CIDR), January 7-10, 2007, Asilomar, California, USA.
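
    The abstract's requirement (1), uniformly querying structured and unstructured data alike, can be illustrated with a small sketch. The code below is not Impliance's design; it is a toy, with all class and field names invented, showing what "one query form over all data kinds" could look like.

```python
# Illustrative sketch only: a toy "uniform query" layer over heterogeneous records.
# All names (Item, UniformStore, etc.) are hypothetical, not from the paper.
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Item:
    """A single stored object: a relational row, an XML fragment, or free text."""
    kind: str                                   # "structured", "semi-structured", "unstructured"
    fields: Dict[str, Any] = field(default_factory=dict)
    text: str = ""


class UniformStore:
    """Stores all item kinds together and answers one query form over all of them."""

    def __init__(self) -> None:
        self._items: List[Item] = []

    def add(self, item: Item) -> None:
        self._items.append(item)

    def query(self, keyword: str) -> List[Item]:
        # One predicate applied uniformly: match attribute values or raw text.
        kw = keyword.lower()
        return [
            it for it in self._items
            if kw in it.text.lower()
            or any(kw in str(v).lower() for v in it.fields.values())
        ]


store = UniformStore()
store.add(Item(kind="structured", fields={"customer": "ACME", "order": 42}))
store.add(Item(kind="unstructured", text="Meeting notes: ACME renewal discussed."))
print([it.kind for it in store.query("acme")])   # both kinds are returned
```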

    Smart Environmental Data Infrastructures: Bridging the Gap between Earth Sciences and Citizens

    The monitoring and forecasting of environmental conditions is a task to which much effort and many resources are devoted by the scientific community and the relevant authorities. Representative examples arise in meteorology, oceanography, and environmental engineering. As a consequence, high volumes of data are generated, including data produced by earth observation systems and by different kinds of models. Specific data models, formats, vocabularies, and data access infrastructures have been developed and are currently used by the scientific community. Because of this, discovering, accessing, and analyzing environmental datasets requires very specific skills, which is an important barrier to their reuse in many other application domains. This paper reviews earth science data representation and access standards and technologies, and identifies the main challenges to overcome in order to enable their integration in semantic open data infrastructures. This would allow non-scientific information technology practitioners to devise new end-user solutions for citizen problems in new application domains. This research was co-funded by (i) the TRAFAIR project (2017-EU-IA-0167), co-financed by the Connecting Europe Facility of the European Union; (ii) the RADAR-ON-RAIA project (0461_RADAR_ON_RAIA_1_E), co-financed by the European Regional Development Fund (ERDF) through the Interreg V-A Spain-Portugal programme (POCTEP) 2014-2020; and (iii) the Consellería de Educación, Universidade e Formación Profesional of the regional government of Galicia (Spain), through the support for research groups with growth potential (ED431B 2018/28).
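
    The standards-based access the paper reviews typically starts from a service capabilities document. The sketch below shows one plausible way to ask an OGC-style web service what it offers; the endpoint URL is a placeholder and the namespace handling is deliberately simplified, so treat it as an assumption-laden example rather than a recipe from the paper.

```python
# Illustrative sketch only: fetching an OGC-style GetCapabilities document.
# The endpoint is hypothetical; substitute a real service before running.
import requests
import xml.etree.ElementTree as ET

ENDPOINT = "https://example.org/ogc/wfs"   # hypothetical service, not from the paper

resp = requests.get(
    ENDPOINT,
    params={"service": "WFS", "request": "GetCapabilities"},
    timeout=30,
)
resp.raise_for_status()

root = ET.fromstring(resp.content)
# Collect every element whose tag ends in "Title", ignoring namespaces, to get a
# rough list of what the service offers.
titles = [el.text for el in root.iter() if el.tag.endswith("Title") and el.text]
print("\n".join(titles))
```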

    OpenKnowledge at work: exploring centralized and decentralized information gathering in emergency contexts

    Real-world experience teaches us that efficient crisis-response coordination is crucial to managing emergencies. ICT infrastructures are effective in supporting the people involved in such contexts by enabling effective ways of interaction, and they should also provide innovative means of communication and information management. At present, centralized architectures are mostly used for this purpose; however, alternative infrastructures based on the use of distributed information sources are currently being explored, studied, and analyzed. This paper investigates the capability of a novel approach (developed within the European project OpenKnowledge) to support centralized as well as decentralized architectures for information gathering. For this purpose we developed an agent-based e-Response simulation environment, fully integrated with the OpenKnowledge infrastructure, through which existing emergency plans are modelled and simulated. Preliminary results show that OpenKnowledge can support the two aforementioned architectures and, under ideal assumptions, delivers comparable performance in both cases.
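
    To make the centralized/decentralized comparison concrete, here is a toy sketch of the two gathering patterns. It is far simpler than the agent-based e-Response simulator described above; the peer counts, gossip rule, and coverage metric are all invented for illustration.

```python
# Illustrative sketch only: toy centralized vs. decentralized information gathering.
import random

random.seed(0)
N_PEERS, N_OBSERVATIONS = 20, 200

def centralized(n_obs):
    """Every observation is reported to a single coordinator, which sees everything."""
    coordinator = set(range(n_obs))
    return len(coordinator)

def decentralized(n_obs, n_peers=N_PEERS, gossip_rounds=1):
    """Observations are scattered across peers; each round, every peer forwards
    what it knows to two randomly chosen peers."""
    known = [set() for _ in range(n_peers)]
    for i in range(n_obs):
        known[i % n_peers].add(i)
    for _ in range(gossip_rounds):
        for i in range(n_peers):
            for j in random.sample(range(n_peers), 2):
                known[j] |= known[i]
    return max(len(k) for k in known)   # coverage of the best-informed peer

print("centralized coverage:  ", centralized(N_OBSERVATIONS))
print("decentralized coverage:", decentralized(N_OBSERVATIONS))
```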

    CHAIN-REDS DART Challenge

    CHAIN-REDS (Coordination and Harmonisation of Advanced e-Infrastructures for Research and Education Data Sharing) is an EU project focused on promoting and supporting technological and scientific collaboration across different communities established on various continents. Nowadays, one of the most challenging scenarios scientists and scientific communities face is the huge amount of data emerging from vast networks of sensors and from computational simulations performed on a diversity of computing architectures and e-infrastructures. The new knowledge arising from the interpretation of these datasets, reported in the scholarly literature, is increasingly difficult to reproduce because of the difficulty of accessing the repositories of measured data and/or the computational applications that generate synthetic data through computer simulations. This paper presents the CHAIN-REDS approach: several tools and services, based on the adoption of standards, aimed at providing easy and seamless access to datasets, data repositories, open access document repositories, and the applications that can make use of them. All these tools and services are enclosed in what we have called the Data Accessibility, Reproducibility and Trustworthiness (DART) challenge. This initiative allows researchers to easily find data of interest and directly use them in code running by means of a Science Gateway (SG) that provides access to cluster, Grid, and Cloud infrastructures worldwide. In this scenario, the datasets are found by means of either the CHAIN-REDS Knowledge Base (KB) or the Semantic Search Engine (SSE), and the applications are run on the CHAIN-REDS SG, accessible through an Identity Federation. Each dataset can be identified by a Persistent Identifier (PID) and assigned a unique numeric ID. Scientists can then access the data and the corresponding application in order to either reproduce and extend the results of a given study or start a new investigation. The new data (and the new paper, if any) are stored on the Data Infrastructure and can easily be found by people working in the same domain, making it possible to start the cycle again.
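
    The workflow above hinges on resolving a Persistent Identifier (PID) to the dataset it names. One plausible way to do this, sketched below, is through the public Handle.Net proxy REST API; the handle value is a placeholder, not an actual CHAIN-REDS dataset, and the paper does not prescribe this exact call.

```python
# Illustrative sketch only: resolving a PID via the Handle.Net proxy REST API.
import requests

HANDLE = "11500/EXAMPLE-PID"          # hypothetical PID; substitute a real one
url = f"https://hdl.handle.net/api/handles/{HANDLE}"

resp = requests.get(url, timeout=30)
resp.raise_for_status()
record = resp.json()

# A handle record is a list of typed values; the URL value points at the data.
for value in record.get("values", []):
    if value.get("type") == "URL":
        print("dataset location:", value["data"]["value"])
```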

    Deliverable JRA1.1: Evaluation of current network control and management planes for multi-domain network infrastructure

    This deliverable includes a compilation and evaluation of the available control and management architectures and protocols applicable to a multilayer infrastructure in a multi-domain Virtual Network environment. The scope of this deliverable is mainly focused on the virtualisation of the resources within a network and at processing nodes. The virtualisation of the FEDERICA infrastructure allows its available resources to be provisioned to users by means of FEDERICA slices. A slice is seen by the user as a real physical network under his/her control; however, it maps to a logical partition (a virtual instance) of the physical FEDERICA resources. A slice is built to exhibit, to the highest degree, all the principles applicable to a physical network (isolation, reproducibility, manageability, ...). Currently, there are no standard definitions available for network virtualisation or its associated architectures. Therefore, this deliverable proposes the Virtual Network layer architecture and evaluates a set of management and control planes that can be used for the partitioning and virtualisation of the FEDERICA network resources. This evaluation has been performed taking into account an initial set of FEDERICA requirements; a possible extension of the selected tools will be evaluated in future deliverables. The studies described in this deliverable define the virtual architecture of the FEDERICA infrastructure. During this activity, the need was recognised to establish a new set of basic definitions (a taxonomy) for the building blocks that compose the so-called slice, i.e. the virtual network instantiation (which is virtual with regard to the abstracted view made of the building blocks of the FEDERICA infrastructure) and its architectural plane representation. These definitions will be established as a common nomenclature for the FEDERICA project. Another important aspect when defining a new architecture is the set of user requirements: it is crucial that the resulting architecture fits the demands that users may have. Since this deliverable has been produced at the same time as the process of contacting users, carried out by the project activities related to the Use Case definitions, JRA1 has proposed a set of basic Use Cases to be considered as a starting point for its internal studies. When researchers want to experiment with their developments, they need not only network resources on their slices, but also a slice of the processing resources. These processing resources are understood as virtual machine instances that users can configure to behave as software routers or end nodes, onto which they can download the software protocols or applications they have produced and want to assess in a realistic environment. Hence, this deliverable also studies the APIs of several virtual machine management software products in order to identify which best suits FEDERICA's needs.
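
    The slice concept above (virtual nodes and links mapped onto a logical partition of the physical substrate) can be sketched as a simple data structure. The field names and example values below are invented for illustration and are not the deliverable's taxonomy.

```python
# Illustrative sketch only: one possible in-memory representation of a "slice",
# i.e. a virtual network mapped onto physical substrate resources.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class VirtualNode:
    name: str
    kind: str                 # e.g. "software-router" or "end-node" virtual machine
    physical_host: str        # substrate node that hosts this virtual instance


@dataclass
class VirtualLink:
    endpoints: Tuple[str, str]   # pair of VirtualNode names
    capacity_mbps: int


@dataclass
class Slice:
    """A user-facing virtual network: isolated, reproducible, manageable."""
    owner: str
    nodes: List[VirtualNode] = field(default_factory=list)
    links: List[VirtualLink] = field(default_factory=list)

    def substrate_usage(self) -> Dict[str, int]:
        """How many virtual instances each physical host carries for this slice."""
        usage: Dict[str, int] = {}
        for n in self.nodes:
            usage[n.physical_host] = usage.get(n.physical_host, 0) + 1
        return usage


s = Slice(owner="researcher-A")
s.nodes.append(VirtualNode("vr1", "software-router", physical_host="substrate-host-1"))
s.nodes.append(VirtualNode("vm1", "end-node", physical_host="substrate-host-1"))
s.links.append(VirtualLink(("vr1", "vm1"), capacity_mbps=100))
print(s.substrate_usage())     # {'substrate-host-1': 2}
```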

    Public Commons for Geospatial Data: A Conceptual Model

    A wide variety of spatial data collection efforts are ongoing throughout local, state and federal agencies, private firms, and non-profit organizations. Each effort is established for a different purpose, but organizations and individuals often collect and maintain the same or similar information. The United States federal government has undertaken many initiatives, such as the National Spatial Data Infrastructure, the National Map, and Geospatial One-Stop, to reduce duplicative spatial data collection and promote the coordinated use, sharing, and dissemination of spatial data nationwide. A key premise of most of these initiatives is that no national government will be able to gather and maintain more than a small percentage of the geographic data that users want. Thus, national initiatives typically depend on the cooperation of those already gathering spatial data and those using GIS to meet specific needs to help construct and maintain these spatial data infrastructures and geo-libraries for their nations (Onsrud 2001). Some of the impediments to widespread spatial data sharing are well known from directly asking GIS data producers why they are not currently involved in creating datasets that are of common or compatible formats, documenting their datasets in a standardized metadata format, or making their datasets more readily available to others through data clearinghouses or geo-libraries. The research described in this thesis addresses the impediments to wide-scale spatial data sharing faced by GIS data producers and explores a new conceptual data-sharing approach, the Public Commons for Geospatial Data, that supports user-friendly metadata creation, open access licenses, archival services, and documentation of parent lineage for the contributors and value-adders of digital spatial data sets.
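
    A small sketch of what "parent lineage plus open licensing" could look like in a commons record is given below. The field names, license identifier, and example datasets are invented for illustration; they are not the thesis's formal conceptual model.

```python
# Illustrative sketch only: a minimal commons metadata record with license and lineage.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CommonsRecord:
    title: str
    producer: str
    license: str                                        # open-access license identifier
    parents: List[str] = field(default_factory=list)    # IDs of the datasets it derives from

    def lineage(self, registry: Dict[str, "CommonsRecord"], depth: int = 0) -> None:
        """Print this record and, recursively, the records it was derived from."""
        print("  " * depth + f"{self.title} ({self.producer}, {self.license})")
        for pid in self.parents:
            registry[pid].lineage(registry, depth + 1)


registry = {
    "roads-v1": CommonsRecord("County roads", "County GIS office", "CC-BY-4.0"),
    "roads-v2": CommonsRecord("Roads + bike lanes", "Local non-profit", "CC-BY-4.0",
                              parents=["roads-v1"]),
}
registry["roads-v2"].lineage(registry)   # value-added dataset, then its parent
```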

    Web system for creating and managing virtual high performance computing environments

    Integrated Master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia, Universidade do Porto. 201