185,557 research outputs found

    Citizen Science 2.0 : Data Management Principles to Harness the Power of the Crowd

    Get PDF
    Citizen science refers to voluntary participation by the general public in scientific endeavors. Although citizen science has a long tradition, the rise of online communities and user-generated web content has the potential to greatly expand its scope and contributions. Citizens spread across a large area will collect more information than an individual researcher can. Because citizen scientists tend to make observations about areas they know well, data are likely to be very detailed. Although the potential for engaging citizen scientists is extensive, there are challenges as well. In this paper we consider one such challenge – creating an environment in which non-experts in a scientific domain can provide appropriate and accurate data regarding their observations. We describe the problem in the context of a research project that includes the development of a website to collect citizen-generated data on the distribution of plants and animals in a geographic region. We propose an approach that can improve the quantity and quality of data collected in such projects by organizing data using instance-based data structures. Potential implications of this approach are discussed and plans for future research to validate the design are described

    On the selection of secondary indices in relational databases

    Get PDF
    An important problem in the physical design of databases is the selection of secondary indices. In general, this problem cannot be solved in an optimal way due to the complexity of the selection process. Often use is made of heuristics such as the well-known ADD and DROP algorithms. In this paper it will be shown that frequently used cost functions can be classified as super- or submodular functions. For these functions several mathematical properties have been derived which reduce the complexity of the index selection problem. These properties will be used to develop a tool for physical database design and also give a mathematical foundation for the success of the before-mentioned ADD and DROP algorithms

    PynPoint: a modular pipeline architecture for processing and analysis of high-contrast imaging data

    Full text link
    The direct detection and characterization of planetary and substellar companions at small angular separations is a rapidly advancing field. Dedicated high-contrast imaging instruments deliver unprecedented sensitivity, enabling detailed insights into the atmospheres of young low-mass companions. In addition, improvements in data reduction and PSF subtraction algorithms are equally relevant for maximizing the scientific yield, both from new and archival data sets. We aim at developing a generic and modular data reduction pipeline for processing and analysis of high-contrast imaging data obtained with pupil-stabilized observations. The package should be scalable and robust for future implementations and in particular well suitable for the 3-5 micron wavelength range where typically (ten) thousands of frames have to be processed and an accurate subtraction of the thermal background emission is critical. PynPoint is written in Python 2.7 and applies various image processing techniques, as well as statistical tools for analyzing the data, building on open-source Python packages. The current version of PynPoint has evolved from an earlier version that was developed as a PSF subtraction tool based on PCA. The architecture of PynPoint has been redesigned with the core functionalities decoupled from the pipeline modules. Modules have been implemented for dedicated processing and analysis steps, including background subtraction, frame registration, PSF subtraction, photometric and astrometric measurements, and estimation of detection limits. The pipeline package enables end-to-end data reduction of pupil-stabilized data and supports classical dithering and coronagraphic data sets. As an example, we processed archival VLT/NACO L' and M' data of beta Pic b and reassessed the planet's brightness and position with an MCMC analysis, and we provide a derivation of the photometric error budget.Comment: 16 pages, 9 figures, accepted for publication in A&A, PynPoint is available at https://github.com/PynPoint/PynPoin

    Relational Approach to Knowledge Engineering for POMDP-based Assistance Systems as a Translation of a Psychological Model

    Get PDF
    Assistive systems for persons with cognitive disabilities (e.g. dementia) are difficult to build due to the wide range of different approaches people can take to accomplishing the same task, and the significant uncertainties that arise from both the unpredictability of client's behaviours and from noise in sensor readings. Partially observable Markov decision process (POMDP) models have been used successfully as the reasoning engine behind such assistive systems for small multi-step tasks such as hand washing. POMDP models are a powerful, yet flexible framework for modelling assistance that can deal with uncertainty and utility. Unfortunately, POMDPs usually require a very labour intensive, manual procedure for their definition and construction. Our previous work has described a knowledge driven method for automatically generating POMDP activity recognition and context sensitive prompting systems for complex tasks. We call the resulting POMDP a SNAP (SyNdetic Assistance Process). The spreadsheet-like result of the analysis does not correspond to the POMDP model directly and the translation to a formal POMDP representation is required. To date, this translation had to be performed manually by a trained POMDP expert. In this paper, we formalise and automate this translation process using a probabilistic relational model (PRM) encoded in a relational database. We demonstrate the method by eliciting three assistance tasks from non-experts. We validate the resulting POMDP models using case-based simulations to show that they are reasonable for the domains. We also show a complete case study of a designer specifying one database, including an evaluation in a real-life experiment with a human actor

    The Design and Operation of The Keck Observatory Archive

    Get PDF
    The Infrared Processing and Analysis Center (IPAC) and the W. M. Keck Observatory (WMKO) operate an archive for the Keck Observatory. At the end of 2013, KOA completed the ingestion of data from all eight active observatory instruments. KOA will continue to ingest all newly obtained observations, at an anticipated volume of 4 TB per year. The data are transmitted electronically from WMKO to IPAC for storage and curation. Access to data is governed by a data use policy, and approximately two-thirds of the data in the archive are public.Comment: 12 pages, 4 figs, 4 tables. Presented at Software and Cyberinfrastructure for Astronomy III, SPIE Astronomical Telescopes + Instrumentation 2014. June 2014, Montreal, Canad

    Using concept lattices to mine functional dependencies

    Get PDF
    Concept Lattices have been proved to be a valuable tool to represent the knowlegde in a database. In this paper we show how functional dependencies in databases can be extracted using Concept Lattices, not preprocessing the original database, but providing a new closure operator. We also prove that this method generalizes the previous methods and closure operators that are being used to find association rules in binary databases.Postprint (published version
    • …
    corecore