371 research outputs found

    Robust Machine Learning Applied to Astronomical Datasets I: Star-Galaxy Classification of the SDSS DR3 Using Decision Trees

    We provide classifications for all 143 million non-repeat photometric objects in the Third Data Release of the Sloan Digital Sky Survey (SDSS) using decision trees trained on 477,068 objects with SDSS spectroscopic data. We demonstrate that these star/galaxy classifications are expected to be reliable for approximately 22 million objects with r < ~20. The general machine learning environment Data-to-Knowledge and supercomputing resources enabled extensive investigation of the decision tree parameter space. This work presents the first public release of objects classified in this way for an entire SDSS data release. The objects are classified as either galaxy, star or nsng (neither star nor galaxy), with an associated probability for each class. To demonstrate how to effectively make use of these classifications, we perform several important tests. First, we detail selection criteria within the probability space defined by the three classes to extract samples of stars and galaxies to a given completeness and efficiency. Second, we investigate the efficacy of the classifications and the effect of extrapolating from the spectroscopic regime by performing blind tests on objects in the SDSS, 2dF Galaxy Redshift and 2dF QSO Redshift (2QZ) surveys. Given the photometric limits of our spectroscopic training data, we effectively begin to extrapolate past our star-galaxy training set at r ~ 18. By comparing the number counts of our training sample with the classified sources, however, we find that our efficiencies appear to remain robust to r ~ 20. As a result, we expect our classifications to be accurate for 900,000 galaxies and 6.7 million stars, and remain robust via extrapolation for a total of 8.0 million galaxies and 13.9 million stars. [Abridged] Comment: 27 pages, 12 figures, to be published in ApJ, uses emulateapj.cl
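The abstract above mentions selecting samples "to a given completeness and efficiency" from class probabilities. As a rough illustration only, the sketch below assumes the usual definitions (completeness: the fraction of true galaxies that pass the cut; efficiency: the fraction of selected objects that are true galaxies) and a single threshold on a galaxy probability. The function name, the one-dimensional cut, and the toy values are assumptions made for this example; the paper's actual criteria are defined in the three-class probability space and are not reproduced here.

```python
from typing import Sequence, Tuple


def completeness_and_efficiency(
    p_galaxy: Sequence[float],
    is_galaxy: Sequence[bool],
    threshold: float,
) -> Tuple[float, float]:
    """Return (completeness, efficiency) for the cut p_galaxy >= threshold.

    Hypothetical helper: assumes the standard definitions, not the paper's code.
    """
    # An object is selected as a galaxy when its probability passes the cut.
    selected = [p >= threshold for p in p_galaxy]
    # Selected objects that really are galaxies.
    true_positives = sum(s and g for s, g in zip(selected, is_galaxy))
    n_true = sum(is_galaxy)
    n_selected = sum(selected)
    # Guard against empty denominators instead of raising ZeroDivisionError.
    completeness = true_positives / n_true if n_true else 0.0
    efficiency = true_positives / n_selected if n_selected else 0.0
    return completeness, efficiency


# Toy, made-up values purely for illustration (not SDSS data).
p_galaxy = [0.95, 0.80, 0.60, 0.40, 0.10]
is_galaxy = [True, True, False, True, False]

for cut in (0.5, 0.9):
    c, e = completeness_and_efficiency(p_galaxy, is_galaxy, cut)
    print(f"cut={cut}: completeness={c:.2f}, efficiency={e:.2f}")
```

Raising the threshold typically trades completeness for efficiency, which is why a selection is usually quoted with both numbers.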

    A Home-Based Telerehabilitation Program for Patients with Stroke

    Background. Although rehabilitation therapy is commonly provided after stroke, many patients do not derive maximal benefit because of access, cost, and compliance. A telerehabilitation-based program may overcome these barriers. We designed, then evaluated a home-based telerehabilitation system in patients with chronic hemiparetic stroke. Methods. Patients were 3 to 24 months poststroke with stable arm motor deficits. Each received 28 days of telerehabilitation using a system delivered to their home. Each day consisted of 1 structured hour focused on individualized exercises and games, stroke education, and an hour of free play. Results. Enrollees (n = 12) had baseline Fugl-Meyer (FM) scores of 39 ± 12 (mean ± SD). Compliance was excellent: participants engaged in therapy on 329/336 (97.9%) assigned days. Arm repetitions across the 28 days averaged 24,607 ± 9934 per participant. Arm motor status showed significant gains (FM change 4.8 ± 3.8 points, P = .0015), with half of the participants exceeding the minimal clinically important difference. Although scores on tests of computer literacy declined with age (r = −0.92; P < .0001), neither the motor gains nor the amount of system use varied with computer literacy. Daily stroke education via the telerehabilitation system was associated with a 39% increase in stroke prevention knowledge (P = .0007). Depression scores obtained in person correlated with scores obtained via the telerehabilitation system 16 days later (r = 0.88; P = .0001). In-person blood pressure values closely matched those obtained via this system (r = 0.99; P < .0001). Conclusions. This home-based system was effective in providing telerehabilitation, education, and secondary stroke prevention to participants. Use of a computer-based interface offers many opportunities to monitor and improve the health of patients after stroke.

    Prediction of survival probabilities with Bayesian Decision Trees

    Practitioners use Trauma and Injury Severity Score (TRISS) models for predicting the survival probability of an injured patient. The accuracy of TRISS predictions is acceptable for patients with up to three typical injuries, but unacceptable for patients with a larger number of injuries or with atypical injuries. Based on a regression model, the TRISS methodology does not provide the predictive density required for accurate assessment of risk. Moreover, the regression model is difficult to interpret. We therefore consider Bayesian inference for estimating the predictive distribution of survival. The inference is based on decision tree models, which recursively split data along explanatory variables and are therefore easy for practitioners to understand. We propose a Bayesian method for estimating the predictive density and show that it outperforms the TRISS method in terms of both goodness-of-fit and classification accuracy. The developed method has been made available for evaluation purposes as a stand-alone application.

    The empirical replicability of task-based fMRI as a function of sample size

    Replicating results (i.e. obtaining consistent results using a new independent dataset) is an essential part of good science. As replicability has consequences for theories derived from empirical studies, it is of utmost importance to better understand the underlying mechanisms influencing it. A popular tool for non-invasive neuroimaging studies is functional magnetic resonance imaging (fMRI). While the effect of underpowered studies is well documented, the empirical assessment of the interplay between sample size and replicability of results for task-based fMRI studies remains limited. In this work, we extend existing work on this assessment in two ways. Firstly, we use a large database of 1400 subjects performing four types of tasks from the IMAGEN project to subsample a series of independent samples of increasing size. Secondly, replicability is evaluated using a multi-dimensional framework consisting of 3 different measures: (un)conditional test-retest reliability, coherence and stability. We demonstrate not only a positive effect of sample size, but also a trade-off between spatial resolution and replicability. When replicability is assessed voxelwise or when observing small areas of activation, a larger sample size than typically used in fMRI is required to replicate results. On the other hand, when focussing on clusters of voxels, we observe a higher replicability. In addition, we observe variability in the size of clusters of activation between experimental paradigms or contrasts of parameter estimates within these

    VAST: An ASKAP Survey for Variables and Slow Transients

    The Australian Square Kilometre Array Pathfinder (ASKAP) will give us an unprecedented opportunity to investigate the transient sky at radio wavelengths. In this paper we present VAST, an ASKAP survey for Variables and Slow Transients. VAST will exploit the wide-field survey capabilities of ASKAP to enable the discovery and investigation of variable and transient phenomena from the local to the cosmological, including flare stars, intermittent pulsars, X-ray binaries, magnetars, extreme scattering events, interstellar scintillation, radio supernovae and orphan afterglows of gamma ray bursts. In addition, it will allow us to probe unexplored regions of parameter space where new classes of transient sources may be detected. In this paper we review the known radio transient and variable populations and the current results from blind radio surveys. We outline a comprehensive program based on a multi-tiered survey strategy to characterise the radio transient sky through detection and monitoring of transient and variable sources on the ASKAP imaging timescales of five seconds and greater. We also present an analysis of the expected source populations that we will be able to detect with VAST. Comment: 29 pages, 8 figures. Submitted for publication in Pub. Astron. Soc. Australi

    TagCleaner: Identification and removal of tag sequences from genomic and metagenomic datasets

    Background. Sequencing metagenomes that were pre-amplified with primer-based methods requires the removal of the additional tag sequences from the datasets. The sequenced reads can contain deletions or insertions due to sequencing limitations, and the primer sequence may contain ambiguous bases. Furthermore, the tag sequence may be unavailable or incorrectly reported. Because of the potential for downstream inaccuracies introduced by unwanted sequence contaminations, it is important to use reliable tools for pre-processing sequence data. Results. TagCleaner is a web application developed to automatically identify and remove known or unknown tag sequences allowing insertions and deletions in the dataset. TagCleaner is designed to filter the trimmed reads for duplicates, short reads, and reads with high rates of ambiguous sequences. An additional screening for and splitting of fragment-to-fragment concatenations that gave rise to artificial concatenated sequences can increase the quality of the dataset. Users may modify the different filter parameters according to their own preferences. Conclusions. TagCleaner is a publicly available web application that is able to automatically detect and efficiently remove tag sequences from metagenomic datasets. It is easily configurable and provides a user-friendly interface. The interactive web interface facilitates export functionality for subsequent data processing, and is available at http://edwards.sdsu.edu/tagcleaner.

    Spire, an Actin Nucleation Factor, Regulates Cell Division during Drosophila Heart Development

    The Drosophila dorsal vessel is a beneficial model system for studying the regulation of early heart development. Spire (Spir), an actin-nucleation factor, regulates actin dynamics in many developmental processes, such as cell shape determination, intracellular transport, and locomotion. Through protein expression pattern analysis, we demonstrate that the absence of spir function affects cell division in Myocyte enhancer factor 2-, Tinman (Tin)-, Even-skipped- and Seven up (Svp)-positive heart cells. In addition, genetic interaction analysis shows that spir functionally interacts with Dorsocross, tin, and pannier to properly specify the cardiac fate. Furthermore, through visualization of double heterozygous embryos, we determine that spir cooperates with CycA for heart cell specification and division. Finally, when comparing the spir mutant phenotype with that of a CycA mutant, the results suggest that most Svp-positive progenitors in spir mutant embryos cannot undergo full cell division at cell cycle 15, and that Tin-positive progenitors are arrested at cell cycle 16 as double-nucleated cells. We conclude that Spir plays a crucial role in controlling dorsal vessel formation and has a function in cell division during heart tube morphogenesis.