
    Current approaches used in epidemiologic studies to examine short-term multipollutant air pollution exposures

    Air pollution epidemiology traditionally focuses on the relationship between individual air pollutants and health outcomes (e.g., mortality). To account for potential copollutant confounding, individual pollutant associations are often estimated by adjusting or controlling for the other pollutants in the mixture. Recently, the need to characterize the relationship between health outcomes and the larger multipollutant mixture has been emphasized, in an attempt to better protect public health and inform more sustainable air quality management decisions.
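
As a toy illustration of the single-pollutant-with-copollutant-adjustment approach described above, the sketch below fits a Poisson time-series regression of daily mortality on PM2.5 while adjusting for a copollutant (ozone) and temperature. The simulated data, pollutant names, and effect sizes are illustrative assumptions, not results from any study.

```python
# Hedged sketch: copollutant-adjusted Poisson regression of daily mortality.
# All data below are simulated for illustration only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n_days = 730
pm25 = rng.gamma(shape=4.0, scale=3.0, size=n_days)    # fine particles (ug/m3)
ozone = rng.gamma(shape=5.0, scale=8.0, size=n_days)   # copollutant (ppb)
temp = 15 + 10 * np.sin(2 * np.pi * np.arange(n_days) / 365) + rng.normal(0, 2, n_days)

# Simulate daily death counts with small pollutant effects.
log_mu = np.log(20) + 0.004 * pm25 + 0.001 * ozone - 0.005 * temp
deaths = rng.poisson(np.exp(log_mu))

# Estimate the PM2.5 association while controlling for the copollutant and temperature.
X = sm.add_constant(np.column_stack([pm25, ozone, temp]))
fit = sm.GLM(deaths, X, family=sm.families.Poisson()).fit()
print(dict(zip(["const", "pm25", "ozone", "temp"], np.round(fit.params, 4))))
```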

    Improving Pure-Tone Audiometry Using Probabilistic Machine Learning Classification

    Hearing loss is a critical public health concern, affecting hundreds of millions of people worldwide and dramatically impacting quality of life for affected individuals. While treatment techniques have evolved in recent years, methods for assessing hearing ability have remained relatively unchanged for decades. The standard clinical procedure is the modified Hughson-Westlake procedure, an adaptive pure-tone detection task that is typically performed manually by audiologists, costing millions of collective hours annually among healthcare professionals. In addition to the high burden of labor, the technique provides limited detail about an individual’s hearing ability, estimating only detection thresholds at a handful of pre-defined pure-tone frequencies (a threshold audiogram). An efficient technique that produces a detailed estimate of the audiometric function, including threshold and spread, could allow for better characterization of particular hearing pathologies and provide more diagnostic value. Parametric techniques exist to efficiently estimate multidimensional psychometric functions, but they are ill-suited for estimating audiometric functions because these functions cannot be easily parameterized. The Gaussian process is a compelling machine learning technique for inference of nonparametric multidimensional functions from binary data. The work described in this thesis uses Gaussian process classification to build an automated framework for efficient, high-resolution estimation of the full audiometric function, which we call the machine learning audiogram (MLAG). This Bayesian technique iteratively computes a posterior distribution describing its current belief about detection probability given the current set of observed pure tones and detection responses. The posterior distribution can be used to provide a current point estimate of the psychometric function as well as to select an informative query point for the next stimulus to be presented to the listener. The Gaussian process covariance function encodes correlations between variables, reflecting prior beliefs about the system; MLAG uses a composite linear/squared-exponential covariance function that enforces monotonicity of the audiometric function with respect to intensity but only smoothness with respect to frequency. This framework was initially evaluated in human subjects for threshold audiogram estimation: two repetitions of MLAG and one repetition of manual clinical audiometry were conducted in each of 21 participants. Results indicated that MLAG agreed with clinical estimates and exhibited test-retest reliability within accepted clinical standards, while requiring significantly fewer tone deliveries than clinical methods and providing an effectively continuous threshold estimate along frequency. The framework’s ability to estimate full psychometric functions was then evaluated using simulated experiments. As a feasibility check, performance for estimating unidimensional psychometric functions was assessed and compared directly to inference using standard maximum-likelihood probit regression; results indicated that the two methods exhibited nearly identical performance for estimating threshold and spread. MLAG was then used to estimate two-dimensional audiometric functions constructed from existing audiogram phenotypes.
Results showed that this framework could estimate both threshold and spread of the full audiometric function with high accuracy and reliability given a sufficient sample count; non-active sampling using the Halton set required between 50 and 100 queries to reach clinical reliability, while active sampling strategies reduced the required number to around 20-30, with Bayesian active learning by disagreement exhibiting the best performance of the tested methods. Overall, MLAG’s accuracy, reliability, and high degree of detail make it a promising method for estimating threshold audiograms and audiometric functions, and the framework’s flexibility enables it to be easily extended to other psychophysical domains.
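
The sketch below is a simplified illustration of the kind of Gaussian process classification with active query selection this abstract describes; it is not the thesis's MLAG implementation. It substitutes an anisotropic squared-exponential kernel for the composite linear/squared-exponential covariance, uses simple uncertainty sampling in place of Bayesian active learning by disagreement, and relies on a hypothetical simulated listener.

```python
# Hedged sketch: GP classification over (frequency, level) with uncertainty sampling.
# Kernel, listener model, and grids are illustrative assumptions only.
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

def simulated_listener(freq_octave, level_db, threshold_db=40.0, spread_db=6.0):
    """Toy sigmoidal psychometric function returning a random detection outcome."""
    p = 1.0 / (1.0 + np.exp(-(level_db - threshold_db) / spread_db))
    return rng.random() < p

# Candidate stimulus grid: frequency (octaves re 125 Hz) x intensity (dB HL).
freqs = np.linspace(0, 6, 25)
levels = np.linspace(-10, 100, 23)
grid = np.array([[f, l] for f in freqs for l in levels])

# Seed with tones at very low and very high levels so both response classes appear.
X = np.array([[1.0, -10.0], [1.0, 100.0], [4.0, -10.0], [4.0, 100.0]])
y = np.array([int(simulated_listener(f, l)) for f, l in X])

kernel = RBF(length_scale=[2.0, 10.0])  # smooth in frequency and in level
for _ in range(30):  # iterative active sampling loop
    gpc = GaussianProcessClassifier(kernel=kernel).fit(X, y)
    p = gpc.predict_proba(grid)[:, 1]
    # Uncertainty sampling: query the grid point whose predicted detection
    # probability is closest to 0.5 (a crude stand-in for BALD).
    nxt = grid[np.argmin(np.abs(p - 0.5))]
    X = np.vstack([X, nxt])
    y = np.append(y, int(simulated_listener(*nxt)))

# The fitted posterior p(detect | frequency, level) is a continuous estimate of the
# audiometric function; its 0.5 contour approximates the threshold audiogram.
```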

    A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning

    We present a tutorial on Bayesian optimization, a method for finding the maximum of expensive cost functions. Bayesian optimization employs the Bayesian technique of setting a prior over the objective function and combining it with evidence to get a posterior function. This permits a utility-based selection of the next observation to make on the objective function, which must take into account both exploration (sampling from areas of high uncertainty) and exploitation (sampling areas likely to offer improvement over the current best observation). We also present two detailed extensions of Bayesian optimization, with experiments---active user modelling with preferences, and hierarchical reinforcement learning---and a discussion of the pros and cons of Bayesian optimization based on our experiences.
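
A minimal sketch of the loop the tutorial describes: a Gaussian process surrogate is fit to the observations gathered so far, and an expected-improvement acquisition function selects the next query, trading off exploration against exploitation. The objective function, kernel, and candidate grid below are illustrative assumptions, not the authors' experimental setup.

```python
# Hedged sketch: Bayesian optimization with a GP surrogate and expected improvement.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def expensive_objective(x):
    """Toy stand-in for an expensive cost function to be maximized."""
    return -np.sin(3 * x) - x**2 + 0.7 * x

candidates = np.linspace(-2.0, 2.0, 400).reshape(-1, 1)  # search grid
X = np.array([[-1.5], [0.0], [1.5]])                     # initial design points
y = expensive_objective(X).ravel()

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

for _ in range(15):
    gp.fit(X, y)
    mu, sigma = gp.predict(candidates, return_std=True)
    best = y.max()
    # Expected improvement: favors points with high predicted mean (exploitation)
    # or high predictive uncertainty (exploration).
    with np.errstate(divide="ignore", invalid="ignore"):
        z = (mu - best) / sigma
        ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)
        ei[sigma == 0.0] = 0.0
    x_next = candidates[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, expensive_objective(x_next[0]))

print("approximate maximizer:", X[np.argmax(y)].item(), "value:", y.max())
```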

    Change-point Problem and Regression: An Annotated Bibliography

    The problems of identifying changes at unknown times and of estimating the location of changes in stochastic processes are referred to as the change-point problem or, in the Eastern literature, as "disorder". The change-point problem, first introduced in the quality control context, has since developed into a fundamental problem in the areas of statistical control theory, stationarity of a stochastic process, estimation of the current position of a time series, testing and estimation of change in the patterns of a regression model, and most recently in the comparison and matching of DNA sequences in microarray data analysis. Numerous methodological approaches have been implemented in examining change-point models. Maximum-likelihood estimation, Bayesian estimation, isotonic regression, piecewise regression, quasi-likelihood and non-parametric regression are among the methods which have been applied to resolving challenges in change-point problems. Grid-searching approaches have also been used to examine the change-point problem. Statistical analysis of change-point problems depends on the method of data collection. If the data collection is ongoing until some random time, then the appropriate statistical procedure is called sequential. If, however, a large finite set of data is collected with the purpose of determining whether at least one change-point occurred, then this may be referred to as non-sequential. Not surprisingly, both the former and the latter have a rich literature, with much of the earlier work focusing on sequential methods inspired by applications in quality control for industrial processes. In the regression literature, the change-point model is also referred to as two- or multiple-phase regression, switching regression, segmented regression, two-stage least squares (Shaban, 1980), or broken-line regression. The change-point problem has been the subject of intensive research in the past half-century. The subject has evolved considerably and found applications in many different areas. It is scarcely possible to summarize all of the research carried out over the past 50 years on the change-point problem. We have therefore confined ourselves to those articles on change-point problems which pertain to regression. The important branch of sequential procedures in change-point problems has been left out entirely; we refer the readers to the seminal review papers by Lai (1995, 2001). The so-called structural change models, which occupy a considerable portion of the research in the area of change-point, particularly among econometricians, have not been fully considered; we refer the reader to Perron (2005) for an updated review in this area. Articles on change-point in time series are considered only if the methodologies presented pertain to regression analysis.
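
As a concrete illustration of one of the approaches mentioned above, the sketch below estimates a single change point in a two-phase (broken-line) regression by grid search over candidate breakpoints, choosing the one that minimizes the residual sum of squares. The simulated data and candidate grid are assumptions for illustration only.

```python
# Hedged sketch: grid-search estimation of one change point in broken-line regression.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 200)
true_tau = 6.0
# Continuous two-phase line: slope changes by +1.5 at the change point.
y = 1.0 + 0.5 * x + 1.5 * np.maximum(x - true_tau, 0) + rng.normal(0, 0.3, x.size)

def sse_for_breakpoint(tau):
    """Fit a continuous broken-line model with a hinge at tau; return residual SSE."""
    X = np.column_stack([np.ones_like(x), x, np.maximum(x - tau, 0)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

candidates = np.linspace(1.0, 9.0, 161)  # grid of candidate change points
tau_hat = candidates[np.argmin([sse_for_breakpoint(t) for t in candidates])]
print(f"estimated change point: {tau_hat:.2f} (true value {true_tau})")
```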

    Towards the Efficient Probabilistic Characterization of Tropical Cyclone-Generated Storm Surge Hazards Under Stationary and Nonstationary Conditions

    The scarcity of observations at any single location confounds the probabilistic characterization of tropical cyclone-generated storm surge hazards using annual maxima and peaks-over-threshold methods. The empirical simulation technique (EST; Borgman et al., 1992) and the joint probability method (JPM; Myers, 1970) are indirect approaches that estimate the probability distribution of the response variable of interest (i.e., storm surge) from the probability distributions of predictor variables (e.g., storm size, storm intensity). In the first part of this work, the relative performance of the EST and the JPM is evaluated via stochastic simulation methods. It is shown that the JPM has greater predictive capability for estimating the frequency of tropical cyclone winds, an efficient proxy for storm surge. The traditional attractions of the EST have been its economy and ease of implementation; however, more efficient numerical approximation schemes such as Bayesian quadrature now exist, allowing more cost-effective implementation of the JPM. In addition, typical enhancements of the original EST approach, such as the introduction of synthetic storms to complement the historical sample, are largely ineffective. These observations indicate that the EST should no longer be considered a practical approach for the robust and reliable estimation of the exceedance probabilities of storm surge levels, as required for actuarial purposes, engineering design, and flood risk management in tropical cyclone-prone regions. The JPM is, however, not applicable to extratropical storm-prone regions or to nonstationary phenomena. Additionally, the JPM requires the evaluation of a multidimensional integral composed of the product of marginal and conditional probability distributions of storm descriptors. This integral is typically approximated as a weighted summation of discrete function evaluations in each dimension, extended to D dimensions by tensor-product rules. To adequately capture the dynamics of the underlying physical process (storm surge driven by tropical cyclone wind fields), one must maintain a large number of explanatory variables in the integral. The complexity and cost of the joint probability problem, however, increase exponentially with dimension, precluding the inclusion of more than a few (≤4) stochastic variables. In the second part of the work, we extend stochastic simulation approaches to the classical joint probability problem. The successful implementation of stochastic simulation for the storm surge frequency problem requires the introduction of a new paradigm: the use of a regression function constructed by the careful selection of an optimal training set from the storm sample space, such that the growth of support nodes required for efficient interpolation remains nonexponential while preserving the performance of an equivalent product grid. Apart from retaining the predictive capability of the JPM, the stochastic simulation approach also allows nonstationary phenomena, such as the effects of climate change on tropical cyclone activity, to be modeled efficiently. A great utility of the stochastic approach is that the random sampling scheme is readily modified to conduct empirical simulation, if required, in place of parametric simulation.
The enhanced empirical simulation technique attains predictive capabilities comparable with those of the JPM and the parametric simulation approach, while retaining the suitability of empirical methods for situations that confound parametric methods, such as application to extratropical cyclones and complexly distributed data. Together, the parametric and empirical simulation techniques will enable seamless flood hazard estimation for the entire coastline of the United States, with simple elaborations where needed to allow for the joint occurrence of tropical and extratropical storms as compound stochastic processes. The stochastic approaches proposed hold great promise for the efficient probabilistic modeling of other multi-parameter systems such as earthquakes and riverine floods.
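
A minimal sketch of the joint probability calculation described above: the exceedance probability of a surge level is computed as a weighted sum over a tensor-product grid of storm descriptors, with weights taken from assumed marginal distributions. The surge response function, the parameter distributions, and the annual storm rate are illustrative assumptions, not values from this work.

```python
# Hedged sketch: JPM-style exceedance estimate via a 2-D tensor-product rule.
import numpy as np
from scipy.stats import lognorm, norm

# Discretize two storm descriptors: central pressure deficit (hPa) and radius
# to maximum winds (km), with probability weights from assumed marginals.
dp_nodes = np.linspace(20, 120, 21)
rmw_nodes = np.linspace(15, 90, 16)
dp_weights = lognorm.pdf(dp_nodes, s=0.5, scale=50.0)
rmw_weights = norm.pdf(rmw_nodes, loc=40.0, scale=15.0)
dp_weights /= dp_weights.sum()
rmw_weights /= rmw_weights.sum()

def surge_response(dp, rmw):
    """Toy surge model (m) standing in for a hydrodynamic simulation."""
    return 0.03 * dp + 0.01 * rmw

annual_rate = 0.35   # assumed mean number of surge-producing storms per year
threshold = 2.5      # surge level of interest (m)

# Tensor-product rule: P(surge > threshold | storm) as a weighted double sum
# over the discretized storm parameter space.
DP, RMW = np.meshgrid(dp_nodes, rmw_nodes, indexing="ij")
W = np.outer(dp_weights, rmw_weights)
p_exceed_given_storm = np.sum(W * (surge_response(DP, RMW) > threshold))

# Assuming Poisson storm arrivals, convert to an annual exceedance probability.
annual_exceedance = 1.0 - np.exp(-annual_rate * p_exceed_given_storm)
print(f"annual exceedance probability of {threshold} m surge: {annual_exceedance:.4f}")
```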

    Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

    The present paper explores the technical efficiency of four hotels from the Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking of these four hotel units, all located in Portugal, is established using stochastic frontier analysis (SFA). This methodology makes it possible to distinguish measurement error from systematic inefficiency in the estimation process, enabling investigation of the main causes of inefficiency. Several suggestions for improving efficiency are offered for each hotel studied.
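
For readers unfamiliar with SFA, the sketch below estimates a Cobb-Douglas stochastic production frontier with a half-normal inefficiency term (the classic Aigner-Lovell-Schmidt specification) by maximum likelihood; the composed error is what allows symmetric noise to be separated from one-sided inefficiency. The simulated hotel data and variable names are assumptions, not the paper's dataset or exact specification.

```python
# Hedged sketch: ML estimation of a stochastic production frontier (half-normal inefficiency).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(2)
n = 200
log_labor = rng.normal(4.0, 0.5, n)
log_capital = rng.normal(6.0, 0.5, n)
v = rng.normal(0, 0.15, n)            # symmetric measurement noise
u = np.abs(rng.normal(0, 0.30, n))    # one-sided technical inefficiency
log_output = 1.0 + 0.6 * log_labor + 0.3 * log_capital + v - u

X = np.column_stack([np.ones(n), log_labor, log_capital])
y = log_output

def neg_loglik(theta):
    beta, log_sv, log_su = theta[:3], theta[3], theta[4]
    sv, su = np.exp(log_sv), np.exp(log_su)
    sigma = np.sqrt(sv**2 + su**2)
    lam = su / sv
    eps = y - X @ beta
    # Aigner-Lovell-Schmidt log-likelihood for the composed error eps = v - u.
    ll = (np.log(2) - np.log(sigma)
          + norm.logpdf(eps / sigma)
          + norm.logcdf(-eps * lam / sigma))
    return -ll.sum()

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]          # OLS starting values
start = np.concatenate([beta_ols, [np.log(0.2), np.log(0.2)]])
res = minimize(neg_loglik, start, method="Nelder-Mead",
               options={"maxiter": 20000, "xatol": 1e-8})
print("frontier coefficients:", np.round(res.x[:3], 3))
print("sigma_v, sigma_u:", np.round(np.exp(res.x[3:]), 3))
```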

    Modeling bivariate longitudinal hormone profiles by hierarchical state space models

    The hypothalamic-pituitary-adrenal (HPA) axis is crucial in coping with stress and maintaining homeostasis. Hormones produced by the HPA axis exhibit both complex univariate longitudinal profiles and complex relationships among different hormones. Consequently, modeling these multivariate longitudinal hormone profiles is a challenging task. In this paper, we propose a bivariate hierarchical state space model, in which each hormone profile is modeled by a hierarchical state space model, with both population-average and subject-specific components. The bivariate model is constructed by concatenating the univariate models based on the hypothesized relationship. Because of the flexibility of the state space form, the resulting models can not only handle complex individual profiles but also incorporate complex relationships between the two hormones, including both concurrent and feedback relationships. Estimation and inference are based on the marginal likelihood and on posterior means and variances. Computationally efficient Kalman filtering and smoothing algorithms are used for implementation. Application of the proposed method to a study of chronic fatigue syndrome and fibromyalgia reveals that the relationships between adrenocorticotropic hormone and cortisol in the patient group are weaker than in healthy controls.
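
The sketch below illustrates the basic machinery involved: a bivariate state space model in which the latent level of one hormone feeds into the level of a second, filtered with the standard Kalman recursions. The coupling structure, parameter values, and simulated data are illustrative assumptions, not the model proposed in the paper.

```python
# Hedged sketch: Kalman filtering of a toy bivariate (coupled) local-level model.
import numpy as np

rng = np.random.default_rng(3)
T_steps = 120

# State: latent levels [acth, cortisol]. The transition matrix includes a one-way
# coupling from ACTH to cortisol as a crude stand-in for a concurrent relationship.
T = np.array([[0.9, 0.0],
              [0.3, 0.8]])
Q = np.diag([0.05, 0.02])   # state innovation covariance
Z = np.eye(2)               # both latent levels observed with noise
H = np.diag([0.10, 0.10])   # measurement noise covariance

# Simulate observations from the model.
x = np.zeros(2)
obs = np.zeros((T_steps, 2))
for t in range(T_steps):
    x = T @ x + rng.multivariate_normal(np.zeros(2), Q)
    obs[t] = Z @ x + rng.multivariate_normal(np.zeros(2), H)

# Kalman filter: predict/update recursion for the latent hormone levels.
a = np.zeros(2)             # filtered state mean
P = np.eye(2)               # filtered state covariance
filtered = np.zeros((T_steps, 2))
for t in range(T_steps):
    a_pred = T @ a
    P_pred = T @ P @ T.T + Q
    v = obs[t] - Z @ a_pred                  # innovation
    F = Z @ P_pred @ Z.T + H                 # innovation covariance
    K = P_pred @ Z.T @ np.linalg.inv(F)      # Kalman gain
    a = a_pred + K @ v
    P = (np.eye(2) - K @ Z) @ P_pred
    filtered[t] = a

print("last filtered levels (ACTH, cortisol):", np.round(filtered[-1], 3))
```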