
    archivist: An R Package for Managing, Recording and Restoring Data Analysis Results

    Everything that exists in R is an object [Chambers2016]. This article examines what would be possible if we kept copies of all R objects that have ever been created: not only the objects, but also their properties, meta-data, relations with other objects, and information about the context in which they were created. We introduce archivist, an R package designed to improve the management of results of data analysis. Key functionalities of this package include: (i) management of local and remote repositories which contain R objects and their meta-data (objects' properties and relations between them); (ii) archiving R objects to repositories; (iii) sharing and retrieving objects (and their pedigree) by their unique hooks; (iv) searching for objects with specific properties or relations to other objects; (v) verification of an object's identity and the context of its creation. The presented archivist package extends, in combination with packages such as knitr and Sweave, the reproducible research paradigm by creating new ways to retrieve and validate previously calculated objects. These new features give a variety of opportunities such as: sharing R objects within reports or articles; adding hooks to R objects in table or figure captions; interactive exploration of object repositories; caching function calls with their results; retrieving an object's pedigree (information about how the object was created); automated tracking of the performance of considered models; and restoring R libraries to the state in which the object was archived. Comment: Submitted to JSS in 2015, conditionally accepted
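
    The idea of archiving objects and retrieving them by a unique hook can be sketched in a few lines of Python. This is an illustrative toy, not the archivist API: the class `Repository`, its methods `save`, `load`, `search` and `verify`, the tag strings, and the use of an MD5 digest of the serialized object as the hook are all assumptions made for this sketch.

```python
import hashlib
import pickle


class Repository:
    """Toy in-memory repository: each object is stored under a hash of its content."""

    def __init__(self):
        self._objects = {}  # hook -> serialized object
        self._tags = {}     # hook -> set of free-text tags (meta-data)

    def save(self, obj, tags=()):
        """Archive an object and return its hook (hypothetical: MD5 of the pickle)."""
        payload = pickle.dumps(obj)
        hook = hashlib.md5(payload).hexdigest()
        self._objects[hook] = payload
        self._tags.setdefault(hook, set()).update(tags)
        return hook

    def load(self, hook):
        """Restore an archived object from its hook."""
        return pickle.loads(self._objects[hook])

    def search(self, tag):
        """Return the hooks of all objects carrying the given tag."""
        return sorted(h for h, t in self._tags.items() if tag in t)

    def verify(self, hook):
        """Check that the stored bytes still hash to the hook."""
        return hashlib.md5(self._objects[hook]).hexdigest() == hook


repo = Repository()
model = {"formula": "y ~ x", "coef": [1.5, 0.25]}  # stand-in for a fitted model
hook = repo.save(model, tags={"class:lm", "name:model"})
restored = repo.load(hook)
```

    Because the hook is derived from the content, sharing the hook is enough to let someone else fetch the object and check that it has not changed.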

    Hypermedia-based discovery for source selection using low-cost linked data interfaces

    Evaluating federated Linked Data queries requires consulting multiple sources on the Web. Before a client can execute queries, it must discover data sources and determine which ones are relevant. Federated query execution research focuses on the actual execution, while data source discovery is often only marginally discussed, even though it has a strong impact on selecting sources that contribute to the query results. Therefore, the authors introduce a discovery approach for Linked Data interfaces based on hypermedia links and controls, and apply it to federated query execution with Triple Pattern Fragments. In addition, the authors identify quantitative metrics to evaluate this discovery approach. This article describes generic evaluation measures and results for their concrete approach. With low-cost data summaries as seed, interfaces to eight large real-world datasets can discover each other within 7 minutes. Hypermedia-based client-side querying shows a promising gain of up to 50% in execution time, but demands algorithms that visit a higher number of interfaces to improve result completeness.
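
    Link-based discovery of this kind can be pictured as a graph traversal that starts from a few seed interfaces and follows the links each one advertises. The sketch below is a plain breadth-first traversal over an in-memory link table, not the authors' algorithm: the function `discover`, the `LINKS` table and the `example.org` URLs are hypothetical, and a real client would fetch each interface over HTTP and read its hypermedia controls instead.

```python
from collections import deque


def discover(seeds, links):
    """Return interfaces reachable from the seeds, in breadth-first visiting order.

    `links` maps an interface URL to the URLs it advertises (hypothetical data).
    """
    seen = set()
    queue = deque()
    for seed in seeds:
        if seed not in seen:
            seen.add(seed)
            queue.append(seed)

    visited = []
    while queue:
        current = queue.popleft()
        visited.append(current)
        for target in links.get(current, []):
            if target not in seen:  # visit every interface at most once
                seen.add(target)
                queue.append(target)
    return visited


# Hypothetical link table standing in for hypermedia links between interfaces.
LINKS = {
    "http://example.org/a": ["http://example.org/b", "http://example.org/c"],
    "http://example.org/b": ["http://example.org/a"],
    "http://example.org/c": ["http://example.org/d"],
    "http://example.org/d": [],
}

found = discover(["http://example.org/a"], LINKS)
```

    The number of interfaces visited grows with the traversal depth, which matches the trade-off the abstract mentions between execution time and result completeness.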

    Terrestrial applications: An intelligent Earth-sensing information system

    For Abstract see A82-2214

    EPSILOD: efficient parallel skeleton for generic iterative stencil computations in distributed GPUs

    Iterative stencil computations are widely used in numerical simulations. They present a high degree of parallelism, high locality and mostly-coalesced memory access patterns. Therefore, GPUs are good candidates to speed up their computation. However, the development of stencil programs that can work with huge grids in distributed systems with multiple GPUs is not straightforward, since it requires solving problems related to the partition of the grid across nodes and devices, and the synchronization and data movement across remote GPUs. In this work, we present EPSILOD, a high-productivity parallel programming skeleton for iterative stencil computations on distributed multi-GPUs, of the same or different vendors, that supports any type of n-dimensional geometric stencils of any order. It uses an abstract specification of the stencil pattern (neighbors and weights) to internally derive the data partition, synchronizations and communications. Computation is split to better overlap with communications. This paper describes the underlying architecture of EPSILOD, its main components, and presents an experimental evaluation to show the benefits of our approach, including a comparison with another state-of-the-art solution. The experimental results show that EPSILOD is faster and shows good strong and weak scalability for platforms with both homogeneous and heterogeneous types of GPU. Funding: Junta de Castilla y León, Ministerio de Economía, Industria y Competitividad, and the European Regional Development Fund (FEDER): Project PCAS (TIN2017-88614-R) and Project PROPHET-2 (VA226P20). Ministerio de Ciencia e Innovación, Agencia Estatal de Investigación and "European Union NextGenerationEU/PRTR": (MCIN/AEI/10.13039/501100011033) - grant TED2021-130367B-I00. CTE-POWER and Minotauro and the technical support provided by Barcelona Supercomputing Center (RES-IM-2021-2-0005, RES-IM-2021-3-0024, RES-IM-2022-1-0014). Open-access publication funded by the Consorcio de Bibliotecas Universitarias de Castilla y León (BUCLE), charged to Operational Programme 2014ES16RFOP009 FEDER 2014-2020 DE CASTILLA Y LEÓN, Action: 20007-CL - Apoyo Consorcio BUCL
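
    To make "an abstract specification of the stencil pattern (neighbors and weights)" concrete, the sketch below describes a stencil as a mapping from neighbor offsets to weights and applies it repeatedly to a small 2D grid. It is a sequential, single-process Python illustration only: it does no partitioning, communication or GPU execution, and the names `stencil_step`, `iterate` and `JACOBI_4PT` are hypothetical and not part of EPSILOD.

```python
def stencil_step(grid, stencil):
    """Apply one stencil iteration to a 2D grid, keeping the border cells fixed.

    `stencil` maps (row_offset, col_offset) -> weight.
    """
    rows, cols = len(grid), len(grid[0])
    # The border width equals the largest offset used by the stencil.
    radius = max(max(abs(di), abs(dj)) for di, dj in stencil)
    new = [row[:] for row in grid]  # copy so every update reads old values
    for i in range(radius, rows - radius):
        for j in range(radius, cols - radius):
            new[i][j] = sum(
                weight * grid[i + di][j + dj]
                for (di, dj), weight in stencil.items()
            )
    return new


def iterate(grid, stencil, steps):
    """Run the stencil for a fixed number of iterations."""
    for _ in range(steps):
        grid = stencil_step(grid, stencil)
    return grid


# 4-point Jacobi stencil: each interior cell becomes the mean of its neighbors.
JACOBI_4PT = {(-1, 0): 0.25, (1, 0): 0.25, (0, -1): 0.25, (0, 1): 0.25}

# 4x4 grid with a "hot" top border (1.0) and everything else at 0.0.
grid0 = [[1.0] * 4] + [[0.0] * 4 for _ in range(3)]
grid1 = iterate(grid0, JACOBI_4PT, steps=1)
```

    In a distributed setting, the same offsets tell a framework how wide the halo exchanged between neighboring partitions must be, which is the kind of information the abstract says is derived from the stencil specification.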

    Capturing natural-colour 3D models of insects for species discovery

    Collections of biological specimens are fundamental to scientific understanding and characterization of natural diversity. This paper presents a system for liberating useful information from physical collections by bringing specimens into the digital domain so they can be more readily shared, analyzed, annotated and compared. It focuses on insects and is strongly motivated by the desire to accelerate and augment current practices in insect taxonomy, which predominantly use text, 2D diagrams and images to describe and characterize species. While these traditional kinds of descriptions are informative and useful, they cannot cover insect specimens "from all angles", and precious specimens are still exchanged between researchers and collections for this reason. Furthermore, insects can be complex in structure and pose many challenges to computer vision systems. We present a new prototype for a practical, cost-effective system of off-the-shelf components to acquire natural-colour 3D models of insects from around 3 mm to 30 mm in length. Colour images are captured from different angles and focal depths using a digital single lens reflex (DSLR) camera rig and two-axis turntable. These 2D images are processed into 3D reconstructions using software based on a visual hull algorithm. The resulting models are compact (around 10 megabytes), afford excellent optical resolution, and can be readily embedded into documents and web pages, as well as viewed on mobile devices. The system is portable, safe, relatively affordable, and complements the sort of volumetric data that can be acquired by computed tomography. This system provides a new way to augment the description and documentation of insect species holotypes, reducing the need to handle or ship specimens. It opens up new opportunities to collect data for research, education, art, entertainment, biodiversity assessment and biosecurity control. Comment: 24 pages, 17 figures, PLOS ONE journal

    An assessment of NASA master directory/catalog interoperability for interdisciplinary study of the global water cycle

    The most important issue facing science is understanding global change: the causes, the processes involved and their consequences. Success in this massive Earth science research effort will depend on efficient identification of, and access to, the most data available across the atmospheric, oceanographic, and land sciences. Current mechanisms used by Earth scientists for accessing these data fall far short of meeting this need. As a result, scientists must frequently rely on a priori knowledge and informal person-to-person networks to find relevant data. The Master Directory/Catalog Interoperability Program (MD/CI) undertaken by NASA is an important step in overcoming these problems. The stated goal of the MD project is to enable researchers to efficiently identify, locate, and obtain access to space and Earth science data.

    Technology assessment of advanced automation for space missions

    Six general classes of technology requirements derived during the mission definition phase of the study were identified as having maximum importance and urgency: autonomous world-model-based information systems, learning and hypothesis formation, natural language and other man-machine communication, space manufacturing, teleoperators and robot systems, and computer science and technology.