217 research outputs found

    Robust Machine Learning Applied to Astronomical Datasets I: Star-Galaxy Classification of the SDSS DR3 Using Decision Trees

    We provide classifications for all 143 million non-repeat photometric objects in the Third Data Release of the Sloan Digital Sky Survey (SDSS) using decision trees trained on 477,068 objects with SDSS spectroscopic data. We demonstrate that these star/galaxy classifications are expected to be reliable for approximately 22 million objects with r < ~20. The general machine learning environment Data-to-Knowledge and supercomputing resources enabled extensive investigation of the decision tree parameter space. This work presents the first public release of objects classified in this way for an entire SDSS data release. The objects are classified as either galaxy, star or nsng (neither star nor galaxy), with an associated probability for each class. To demonstrate how to effectively make use of these classifications, we perform several important tests. First, we detail selection criteria within the probability space defined by the three classes to extract samples of stars and galaxies to a given completeness and efficiency. Second, we investigate the efficacy of the classifications and the effect of extrapolating from the spectroscopic regime by performing blind tests on objects in the SDSS, 2dF Galaxy Redshift and 2dF QSO Redshift (2QZ) surveys. Given the photometric limits of our spectroscopic training data, we effectively begin to extrapolate past our star-galaxy training set at r ~ 18. By comparing the number counts of our training sample with the classified sources, however, we find that our efficiencies appear to remain robust to r ~ 20. As a result, we expect our classifications to be accurate for 900,000 galaxies and 6.7 million stars, and remain robust via extrapolation for a total of 8.0 million galaxies and 13.9 million stars. [Abridged] Comment: 27 pages, 12 figures, to be published in ApJ, uses emulateapj.cl
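The selection step described in this abstract (cutting on class probability to reach a given completeness and efficiency) can be illustrated with a minimal sketch. It assumes the usual definitions: completeness is the fraction of true objects of a class that pass the cut, and efficiency is the fraction of selected objects that truly belong to the class. The function name, the single-threshold cut, and the toy data are hypothetical and are not taken from the paper.

```python
def completeness_efficiency(probs, labels, target, threshold):
    """Return (completeness, efficiency) for a cut P(target) >= threshold.

    probs  : probability assigned to the target class for each object
    labels : true class of each object (e.g. from spectroscopy)
    Assumed definitions, not quoted from the paper:
      completeness = selected true targets / all true targets
      efficiency   = selected true targets / all selected objects
    """
    selected = [lab for p, lab in zip(probs, labels) if p >= threshold]
    n_true = sum(1 for lab in labels if lab == target)
    n_selected_true = sum(1 for lab in selected if lab == target)
    completeness = n_selected_true / n_true if n_true else 0.0
    efficiency = n_selected_true / len(selected) if selected else 0.0
    return completeness, efficiency


# Toy data (hypothetical): galaxy probabilities and true classes.
p_galaxy = [0.95, 0.80, 0.60, 0.40, 0.10]
truth = ["galaxy", "galaxy", "star", "galaxy", "star"]

loose = completeness_efficiency(p_galaxy, truth, "galaxy", 0.5)
strict = completeness_efficiency(p_galaxy, truth, "galaxy", 0.7)
print(loose, strict)
```

Raising the threshold in this toy case removes the misclassified star, so efficiency rises while completeness stays the same; on real data the two usually trade off against each other.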

    Robust Machine Learning Applied to Astronomical Datasets III: Probabilistic Photometric Redshifts for Galaxies and Quasars in the SDSS and GALEX

    We apply machine learning in the form of a nearest neighbor instance-based algorithm (NN) to generate full photometric redshift probability density functions (PDFs) for objects in the Fifth Data Release of the Sloan Digital Sky Survey (SDSS DR5). We use a conceptually simple but novel application of NN to generate the PDFs - perturbing the object colors by their measurement error - and using the resulting instances of nearest neighbor distributions to generate numerous individual redshifts. When the redshifts are compared to existing SDSS spectroscopic data, we find that the mean value of each PDF has a dispersion between the photometric and spectroscopic redshift consistent with other machine learning techniques, being sigma = 0.0207 +/- 0.0001 for main sample galaxies to r < 17.77 mag, sigma = 0.0243 +/- 0.0002 for luminous red galaxies to r < ~19.2 mag, and sigma = 0.343 +/- 0.005 for quasars to i < 20.3 mag. The PDFs allow the selection of subsets with improved statistics. For quasars, the improvement is dramatic: for those with a single peak in their probability distribution, the dispersion is reduced from 0.343 to sigma = 0.117 +/- 0.010, and the photometric redshift is within 0.3 of the spectroscopic redshift for 99.3 +/- 0.1% of the objects. Thus, for this optical quasar sample, we can virtually eliminate 'catastrophic' photometric redshift estimates. In addition to the SDSS sample, we incorporate ultraviolet photometry from the Third Data Release of the Galaxy Evolution Explorer All-Sky Imaging Survey (GALEX AIS GR3) to create PDFs for objects seen in both surveys. For quasars, the increased coverage of the observed frame UV of the SED results in significant improvement over the full SDSS sample, with sigma = 0.234 +/- 0.010. We demonstrate that this improvement is genuine. [Abridged] Comment: Accepted to ApJ, 10 pages, 12 figures, uses emulateapj.cl
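The perturbation idea in this abstract can be sketched roughly as follows: draw many perturbed copies of an object's colors, look up the nearest training object for each copy, and treat the collected redshifts as samples of the PDF. This is only an illustration under stated assumptions: Gaussian errors, Euclidean distance in color space, a single nearest neighbor, and a tiny made-up training set. The function names are hypothetical and none of these choices is confirmed by the abstract.

```python
import random


def nearest_redshift(colors, training):
    """Redshift of the training object closest in color space.

    training is a list of (colors, redshift) pairs; Euclidean distance
    is an assumption made for this sketch.
    """
    best = min(
        training,
        key=lambda row: sum((c - t) ** 2 for c, t in zip(colors, row[0])),
    )
    return best[1]


def photoz_samples(colors, errors, training, n_draws=100, seed=0):
    """Sample redshifts by perturbing colors within their errors.

    Each draw adds Gaussian noise (assumed error model) to every color
    and records the redshift of the nearest training object.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_draws):
        perturbed = [rng.gauss(c, e) for c, e in zip(colors, errors)]
        samples.append(nearest_redshift(perturbed, training))
    return samples


# Hypothetical training set: (two colors, spectroscopic redshift).
training = [((0.2, 0.1), 0.05), ((0.8, 0.4), 0.15), ((1.5, 0.9), 0.30)]

samples = photoz_samples((0.75, 0.45), (0.05, 0.05), training, n_draws=200)
mean_z = sum(samples) / len(samples)  # point estimate from the sampled PDF
print(mean_z)
```

A histogram of `samples` plays the role of the PDF here; a multi-peaked histogram would flag the kind of ambiguous object that the abstract describes filtering out for quasars.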

    Clinical outcomes after detection of elevated cardiac enzymes in patients undergoing percutaneous intervention

    Objectives. We examined the relations of elevated creatine kinase (CK) and its myocardial band isoenzyme (CK-MB) to clinical outcomes after percutaneous coronary intervention (PCI) in patients enrolled in Integrilin (eptifibatide) to Minimize Platelet Aggregation and Coronary Thrombosis-II (trial) (IMPACT-II), a trial of the platelet glycoprotein IIb/IIIa inhibitor eptifibatide. Background. Elevation of cardiac enzymes often occurs after PCI, but its clinical implications are uncertain. Methods. Patients undergoing elective, scheduled PCI for any indication were analyzed. Parallel analyses investigated CK (n = 3,535) and CK-MB (n = 2,341) levels after PCI (within 4 to 20 h). Clinical outcomes at 30 days and 6 months were stratified by postprocedure CK and CK-MB (multiple of the site’s upper normal limit). Results. Overall, 1,779 patients (76%) had no CK-MB elevation; CK-MB levels were elevated to 1 to 3 times the upper normal limit in 323 patients (13.8%), to 3 to 5 times normal in 84 (3.6%), to 5 to 10 times normal in 86 (3.7%), and to >10 times normal in 69 patients (2.9%). Elevated CK-MB was associated with an increased risk of death, reinfarction, or emergency revascularization at 30 days, and of death, reinfarction, or surgical revascularization at 6 months. Elevated total CK to above three times normal was less frequent, but its prognostic significance paralleled that seen for CK-MB. The degree of risk correlated with the rise in CK or CK-MB, even for patients with successful procedures not complicated by abrupt closure. Conclusions. Elevations in cardiac enzymes, including small increases (between one and three times normal) often not considered an infarction, are associated with an increased risk for short-term adverse clinical outcomes after successful or unsuccessful PCI.
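As a quick arithmetic check on the CK-MB strata reported in the Results, the patient counts can be summed and converted to percentages of the CK-MB cohort. The counts are taken from the abstract; the dictionary keys are shorthand labels introduced only for this sketch.

```python
# Patient counts per CK-MB stratum, as reported in the abstract.
# Keys are shorthand labels (multiples of the upper normal limit).
strata = {
    "no elevation": 1779,
    "1-3x": 323,
    "3-5x": 84,
    "5-10x": 86,
    ">10x": 69,
}

total = sum(strata.values())  # should match the CK-MB cohort, n = 2,341

# Percentage of the cohort in each stratum, rounded to one decimal place.
percent = {name: round(100 * count / total, 1) for name, count in strata.items()}

for name, value in percent.items():
    print(f"{name}: {value}%")
```

The strata add up to the stated cohort size and the recomputed percentages agree with the reported ones, so the five categories appear to cover every patient in the CK-MB analysis.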