96 research outputs found

    INTERACTING WITH LOCAL AND REMOTE DATA REPOSITORIES USING THE stashR PACKAGE

    The stashR package (a Set of Tools for Administering SHared Repositories) for R implements a simple key-value style database where character string keys are associated with data values. The key-value databases can be either stored locally on the user's computer or accessed remotely via the Internet. Methods specific to the stashR package allow users to share data repositories or access previously created remote data repositories. In particular, methods are available for the S4 classes localDB and remoteDB to insert, retrieve, or delete data from the database as well as to synchronize local copies of the data to the remote version of the database. Users efficiently access information from a remote database by retrieving only the data files indexed by user-specified keys and caching these data in a local copy of the remote database. The local and remote counterparts of the stashR package offer the potential to enhance reproducible research by allowing users of Sweave to cache their R computations for a research paper in a localDB database. This database can then be stored on the Internet as a remoteDB database. When readers of the research paper wish to reproduce the computations involved in creating a specific figure or calculating a specific numeric value, they can access the remoteDB database and obtain the R objects involved in the computation
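
The caching behaviour this abstract describes (fetch a value by key from a remote repository once, then serve it from a local copy) can be sketched roughly as follows. This is a Python illustration of the idea, not the stashR API: the class names `LocalStore` and `CachingRemoteStore`, the `download` callable, and the in-memory dictionary standing in for a remote server are all assumptions made for the example.

```python
import pickle
import tempfile
from pathlib import Path


class LocalStore:
    """File-backed key-value store: one pickled file per key (illustrative only)."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, key):
        # Hex-encode the key so that any string maps to a safe file name.
        return self.directory / ("k" + key.encode("utf-8").hex())

    def insert(self, key, value):
        with open(self._path(key), "wb") as fh:
            pickle.dump(value, fh)

    def exists(self, key):
        return self._path(key).exists()

    def fetch(self, key):
        if not self.exists(key):
            raise KeyError(key)
        with open(self._path(key), "rb") as fh:
            return pickle.load(fh)

    def delete(self, key):
        self._path(key).unlink()


class CachingRemoteStore:
    """Read-only view of a remote store that caches fetched values locally."""

    def __init__(self, download, cache_dir):
        self.download = download          # hypothetical callable: key -> value
        self.cache = LocalStore(cache_dir)
        self.downloads = 0                # counts actual remote retrievals

    def fetch(self, key):
        # Only contact the remote side when the key is not cached yet.
        if not self.cache.exists(key):
            self.cache.insert(key, self.download(key))
            self.downloads += 1
        return self.cache.fetch(key)


# A plain dictionary stands in for the remote repository (assumption).
remote_data = {"fig1": [1.0, 2.5], "table2": {"n": 50}}

with tempfile.TemporaryDirectory() as tmp:
    db = CachingRemoteStore(remote_data.__getitem__, tmp)
    first = db.fetch("fig1")    # downloaded, then cached
    second = db.fetch("fig1")   # served from the local cache
    download_count = db.downloads
```

Only the keys a reader actually requests are transferred, which is the efficiency argument the abstract makes; a real implementation would also need synchronisation when the remote values change.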

    DISTRIBUTED REPRODUCIBLE RESEARCH USING CACHED COMPUTATIONS

    The ability to make scientific findings reproducible is increasingly important in areas where substantive results are the product of complex statistical computations. Reproducibility can allow others to verify the published findings and conduct alternate analyses of the same data. A question that arises naturally is how can one conduct and distribute reproducible research? This question is relevant from the point of view of both the authors who want to make their research reproducible and readers who want to reproduce relevant findings reported in the scientific literature. We present a framework in which reproducible research can be conducted and distributed via cached computations and describe specific tools for both authors and readers. As a prototype implementation we introduce three software packages written in the R language. The cacheSweave and stashR packages together provide tools for caching computational results in a key-value style database which can be published to a public repository for readers to download. The SRPM package provides tools for generating and interacting with shared reproducibility packages (SRPs) which can facilitate the distribution of the data and code. As a case study we demonstrate the use of the toolkit on a national study of air pollution exposure and mortality

    IDENTIFYING EFFECT MODIFIERS IN AIR POLLUTION TIME-SERIES STUDIES USING A TWO-STAGE ANALYSIS

    Studies of the health effects of air pollution such as the National Morbidity and Mortality Air Pollution Study (NMMAPS) relate changes in daily pollution to daily deaths in a sample of cities and calendar years. Generally, city-specific estimates are combined into regional and national estimates using two-stage models. Our two-stage analysis identifies effect modifiers of the relation between single-day lagged PM10 and daily mortality in people age 65 and older from the 50 largest NMMAPS cities. We build on the standard approach by fractionating city-specific analyses to produce month-year-city specific estimated air pollution effects (slopes) in Stage I. In Stage II, we identify potential effect modifiers via weighted regression and weighted regression trees with the estimated slopes as dependent variables and predictors such as temperature, relative humidity, CO, NO2, O3, SO2, season, year, and other city-specific characteristics
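
The two-stage structure can be sketched in a few lines of Python. This is a toy under stated assumptions, not the authors' analysis: ordinary least squares stands in for the Stage I models, a single numeric modifier stands in for the Stage II predictors, inverse-variance weights are assumed, regression trees are omitted, and the data and function names (`ols_slope`, `weighted_line`) are invented for illustration.

```python
def ols_slope(x, y):
    """Stage I: least-squares slope of y on x and its estimated variance."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, (sse / (n - 2)) / sxx


def weighted_line(z, b, weights):
    """Stage II: weighted least-squares fit of b = gamma0 + gamma1 * z."""
    total = sum(weights)
    z_bar = sum(w * zi for w, zi in zip(weights, z)) / total
    b_bar = sum(w * bi for w, bi in zip(weights, b)) / total
    szz = sum(w * (zi - z_bar) ** 2 for w, zi in zip(weights, z))
    szb = sum(w * (zi - z_bar) * (bi - b_bar)
              for w, zi, bi in zip(weights, z, b))
    gamma1 = szb / szz
    return b_bar - gamma1 * z_bar, gamma1


# Made-up strata: (modifier value, noise scale). The noise pattern sums to
# zero and is orthogonal to the exposure, so each Stage I slope is recovered
# exactly while its variance estimate stays positive.
exposure = [0.0, 1.0, 2.0, 3.0]
pattern = [1.0, -1.0, -1.0, 1.0]
strata = [(10.0, 0.1), (20.0, 0.2), (30.0, 0.3)]

slopes, variances, modifiers = [], [], []
for modifier, scale in strata:
    true_slope = 0.5 + 0.1 * modifier   # slope depends linearly on the modifier
    outcome = [2.0 + true_slope * x + scale * e
               for x, e in zip(exposure, pattern)]
    slope, variance = ols_slope(exposure, outcome)
    slopes.append(slope)
    variances.append(variance)
    modifiers.append(modifier)

# Noisier strata receive less weight in Stage II.
weights = [1.0 / v for v in variances]
gamma0, gamma1 = weighted_line(modifiers, slopes, weights)
```

A nonzero `gamma1` is what would flag the modifier as changing the pollution effect across strata.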

    SURROGATE SCREENING MODELS FOR THE LOW PHYSICAL ACTIVITY CRITERION OF FRAILTY

    Background and Aims. Low physical activity, one of five criteria in a validated clinical phenotype of frailty, is assessed by a standardized questionnaire on up to 20 leisure time activities. Because of the time demanded to collect the interview data, it has been challenging to translate to studies other than the Cardiovascular Health Study (CHS), for which it was developed. Considering subsets of activities, we identified and evaluated streamlined surrogate assessment methods and compared them to one implemented in the Women’s Health and Aging Study (WHAS). Methods. Using data on men and women ages 65 and older from the CHS, we applied logistic regression models to rank activities by “relative influence” in predicting low physical activity. We considered subsets of the most influential activities as inputs to potential surrogate models (logistic regressions). We evaluated predictive accuracy and predictive validity using the area under receiver operating characteristic curves and assessed criterion validity using proportional hazards models relating frailty status (defined using the surrogate) to mortality. Results. Walking for exercise and moderately strenuous household chores were highly influential for both genders. Women required fewer activities than men for accurate classification. The WHAS model (8 CHS activities) was an effective surrogate, but a surrogate using 6 activities (walking, chores, gardening, general exercise, mowing and golfing) was also highly predictive. Conclusions. We recommend a 6 activity questionnaire to assess physical activity for men and women. If efficiency is essential and the study involves only women, fewer activities can be included

    MODIFICATION BY FRAILTY STATUS OF AMBIENT AIR POLLUTION EFFECTS ON LUNG FUNCTION IN OLDER ADULTS IN THE CARDIOVASCULAR HEALTH STUDY

    Older adult susceptibility to air pollution health effects is well-recognized. Advanced age may act as a partial surrogate for conditions associated with aging. The authors investigated whether gerontologic frailty (a clinical health status metric) modified the effects of ambient ozone or particulate matter (PM10) air pollution on lung function in 3382 older adults using 7 years of follow-up data from the Cardiovascular Health Study (CHS) and the CHS Environmental Factors Ancillary Study. Monthly average pollution and annual frailty assessments were related to up to 3 repeated measurements of lung function using novel cumulative summaries of pollution and frailty histories that account for duration as well as concentration. Frailty history was found to modify long-term pollution effects on Forced Vital Capacity (FVC). For example, the decrease in FVC associated with a 70 ppb-month increase in the cumulative sum of monthly average O3 exposure was 8.8 mL (95% confidence interval (CI): 7.4, 10.1) for a woman who had spent the prior 7 years prefrail or frail compared to 3.3 mL (95% CI: 2.7, 4.0) for a similar not frail woman (interaction P < 0.001)

    Restructuring of amygdala subregion apportion across adolescence

    Total amygdala volumes develop in association with sex and puberty, and postmortem studies find neuronal numbers increase in a nuclei specific fashion across development. Thus, amygdala subregions and composition may evolve with age. Our goal was to examine if amygdala subregion absolute volumes and/or relative proportion varies as a function of age, sex, or puberty in a large sample of typically developing adolescents (N = 408, 43 % female, 10–17 years). Utilizing the in vivo CIT168 atlas, we quantified 9 subregions and implemented Generalized Additive Mixed Models to capture potential non-linear associations with age and pubertal status between sexes. Only males showed significant age associations with the basolateral ventral and paralaminar subdivision (BLVPL), central nucleus (CEN), and amygdala transition area (ATA). Again, only males showed relative differences in the proportion of the BLVPL, CEN, ATA, along with lateral (LA) and amygdalostriatal transition area (ASTA), with age. Using a best-fit modeling approach, age, and not puberty, was found to drive these associations. The results suggest that amygdala subregions show unique variations with age in males across adolescence. Future research is warranted to determine if our findings may contribute to sex differences in mental health that emerge across adolescence

    Prenatal metal mixtures and child blood pressure in the Rhea mother-child cohort in Greece

    Background: Child blood pressure (BP) is predictive of future cardiovascular risk. Prenatal exposure to metals has been associated with higher BP in childhood, but most studies have evaluated elements individually and measured BP at a single time point. We investigated impacts of prenatal metal mixture exposures on longitudinal changes in BP during childhood and elevated BP at 11 years of age. Methods: The current study included 176 mother-child pairs from the Rhea Study in Heraklion, Greece and focused on eight elements (antimony, arsenic, cadmium, cobalt, lead, magnesium, molybdenum, selenium) measured in maternal urine samples collected during pregnancy (median gestational age at collection: 12 weeks). BP was measured at approximately 4, 6, and 11 years of age. Covariate-adjusted Bayesian Varying Coefficient Kernel Machine Regression and Bayesian Kernel Machine Regression (BKMR) were used to evaluate metal mixture impacts on baseline and longitudinal changes in BP (from ages 4 to 11) and the development of elevated BP at age 11, respectively. BKMR results were compared using static versus percentile-based cutoffs to define elevated BP. Results: Molybdenum and lead were the mixture components most consistently associated with BP. J-shaped relationships were observed between molybdenum and both systolic and diastolic BP at age 4. Similar associations were identified for both molybdenum and lead in relation to elevated BP at age 11. For molybdenum concentrations above the inflection points (~40–80 μg/L), positive associations with BP at age 4 were stronger at high levels of lead. Lead was positively associated with BP measures at age 4, but only at high levels of molybdenum. Potential interactions between molybdenum and lead were also identified for BP at age 11, but were sensitive to the cutoffs used to define elevated BP. Conclusions: Prenatal exposure to high levels of molybdenum and lead, particularly in combination, may contribute to higher BP at age 4. These early effects appear to persist throughout childhood, contributing to elevated BP in adolescence. Future studies are needed to identify the major sources of molybdenum and lead in this population

    On the importance of statistics in breath analysis--hope or curse?

    As we saw at the 2013 Breath Analysis Summit, breath analysis is a rapidly evolving field. Increasingly sophisticated technology is producing huge amounts of complex data. A major barrier now faced by the breath research community is the analysis of these data. Emerging breath data require sophisticated, modern statistical methods to allow for a careful and robust deduction of real-world conclusions

    Understanding the importance of key risk factors in predicting chronic bronchitic symptoms using a machine learning approach

    Background: Chronic respiratory symptoms involving bronchitis, cough and phlegm in children are underappreciated but pose a significant public health burden. Efforts for prevention and management could be supported by an understanding of the relative importance of determinants, including environmental exposures. Thus, we aim to develop a prediction model for bronchitic symptoms. Methods: Schoolchildren from the population-based southern California Children’s Health Study were visited annually from 2003 to 2012. Bronchitic symptoms over the prior 12 months were assessed by questionnaire. A gradient boosting model was fit using groups of risk factors (including traffic/air pollution exposures) for all children and by asthma status. Training data consisted of one observation per participant in a random study year (for 50% of participants). Validation data consisted of: (1) a random (later) year in the same participants (within-participant); (2) a random year in participants excluded from the training data (across-participant). Results: At baseline, 13.2% of children had asthma and 18.1% reported bronchitic symptoms. Models performed similarly within- and across-participant. Previous year symptoms/medication use provided much of the predictive ability (across-participant area under the receiver operating characteristic curve (AUC): 0.76 vs 0.78 for all risk factors, in all participants). Traffic/air pollution exposures added modestly to prediction as did body mass index percentile, age and parent stress. Conclusions: Regardless of asthma status, previous symptoms were the most important predictors of current symptoms. Traffic/air pollution variables contribute modest predictive information, but impact large populations. Methods proposed here could be generalized to personalized exacerbation predictions in future longitudinal studies to support targeted prevention efforts