17 research outputs found

    Data Mining the SDSS SkyServer Database

    An earlier paper (Szalay et al., "Designing and Mining MultiTerabyte Astronomy Archives: The Sloan Digital Sky Survey," ACM SIGMOD 2000) described the Sloan Digital Sky Survey's (SDSS) data management needs by defining twenty database queries and twelve data visualization tasks that a good data management system should support. We built a database and interfaces to support both the query load and a website for ad-hoc access. This paper reports on the database design, describes the data loading pipeline, and reports on the query implementation and performance. The queries typically translated to a single SQL statement. Most queries run in less than 20 seconds, allowing scientists to interactively explore the database. This paper is an in-depth tour of those queries. Readers should first have studied the companion overview paper, Szalay et al., "The SDSS SkyServer, Public Access to the Sloan Digital Sky Server Data," ACM SIGMOD 2002. Comment: 40 pages, Original source is at http://research.microsoft.com/~gray/Papers/MSR_TR_O2_01_20_queries.do

    Online Scientific Data Curation, Publication, and Archiving

    Science projects are data publishers. The scale and complexity of current and future science data changes the nature of the publication process. Publication is becoming a major project component. At a minimum, a project must preserve the ephemeral data it gathers. Derived data can be reconstructed from metadata, but metadata is ephemeral. Longer term, a project should expect some archive to preserve the data. We observe that published scientific data needs to be available forever -- this gives rise to the data pyramid of versions and to data inflation where the derived data volumes explode. As an example, this article describes the Sloan Digital Sky Survey (SDSS) strategies for data publication, data access, curation, and preservation. Comment: original at http://research.microsoft.com/scripts/pubs/view.asp?TR_ID=MSR-TR-2002-7

    The Eighteenth Data Release of the Sloan Digital Sky Surveys: Targeting and First Spectra from SDSS-V

    The eighteenth data release of the Sloan Digital Sky Surveys (SDSS) is the first one for SDSS-V, the fifth generation of the survey. SDSS-V comprises three primary scientific programs, or "Mappers": Milky Way Mapper (MWM), Black Hole Mapper (BHM), and Local Volume Mapper (LVM). This data release contains extensive targeting information for the two multi-object spectroscopy programs (MWM and BHM), including input catalogs and selection functions for their numerous scientific objectives. We describe the production of the targeting databases and their calibration- and scientifically-focused components. DR18 also includes ~25,000 new SDSS spectra and supplemental information for X-ray sources identified by eROSITA in its eFEDS field. We present updates to some of the SDSS software pipelines and preview changes anticipated for DR19. We also describe three value-added catalogs (VACs) based on SDSS-IV data that have been published since DR17, and one VAC based on the SDSS-V data in the eFEDS field. Comment: Accepted to ApJ

    The eighteenth data release of the Sloan Digital Sky Surveys : targeting and first spectra from SDSS-V

    The eighteenth data release of the Sloan Digital Sky Surveys (SDSS) is the first one for SDSS-V, the fifth generation of the survey. SDSS-V comprises three primary scientific programs, or "Mappers": Milky Way Mapper (MWM), Black Hole Mapper (BHM), and Local Volume Mapper (LVM). This data release contains extensive targeting information for the two multi-object spectroscopy programs (MWM and BHM), including input catalogs and selection functions for their numerous scientific objectives. We describe the production of the targeting databases and their calibration- and scientifically-focused components. DR18 also includes ~25,000 new SDSS spectra and supplemental information for X-ray sources identified by eROSITA in its eFEDS field. We present updates to some of the SDSS software pipelines and preview changes anticipated for DR19. We also describe three value-added catalogs (VACs) based on SDSS-IV data that have been published since DR17, and one VAC based on the SDSS-V data in the eFEDS field. Publisher PDF; peer reviewed.

    The Seventeenth Data Release of the Sloan Digital Sky Surveys: Complete Release of MaNGA, MaStar and APOGEE-2 Data

    This paper documents the seventeenth data release (DR17) from the Sloan Digital Sky Surveys, the fifth and final release from the fourth phase (SDSS-IV). DR17 contains the complete release of the Mapping Nearby Galaxies at Apache Point Observatory (MaNGA) survey, which reached its goal of surveying over 10,000 nearby galaxies. The complete release of the MaNGA Stellar Library (MaStar) accompanies this data, providing observations of almost 30,000 stars through the MaNGA instrument during bright time. DR17 also contains the complete release of the Apache Point Observatory Galactic Evolution Experiment 2 (APOGEE-2) survey, which publicly releases infra-red spectra of over 650,000 stars. The main sample from the Extended Baryon Oscillation Spectroscopic Survey (eBOSS), as well as the sub-survey Time Domain Spectroscopic Survey (TDSS) data, were fully released in DR16. New single-fiber optical spectroscopy released in DR17 is from the SPectroscopic IDentification of ERosita Survey (SPIDERS) sub-survey and the eBOSS-RM program. Along with the primary data sets, DR17 includes 25 new or updated Value Added Catalogs (VACs). This paper concludes the release of SDSS-IV survey data. SDSS continues into its fifth phase with observations already underway for the Milky Way Mapper (MWM), Local Volume Mapper (LVM) and Black Hole Mapper (BHM) surveys.

    Lessons Learned from the SDSS Catalog Archive Server

    Jim Gray, Microsoft Research

    Science projects are data publishers. The scale and complexity of current and future science data changes the nature of the publication process. Publication is becoming a major project component. At a minimum, a project must preserve the ephemeral data it gathers. Derived data can be reconstructed from metadata, but metadata is ephemeral. Longer term, a project should expect some archive to preserve the data. We observe that published scientific data needs to be available forever -- this gives rise to the data pyramid of versions and to data inflation where the derived data volumes explode. As an example, this article describes the Sloan Digital Sky Survey (SDSS) strategies for data publication, data access, curation, and preservation.

    The Catalog Archive Server Database Management System
