    An autoregressive approach to house price modeling

    A statistical model for predicting individual house prices and constructing a house price index is proposed, utilizing information regarding sale price, time of sale and location (ZIP code). This model is composed of a fixed time effect and a random ZIP (postal) code effect combined with an autoregressive component. The former two components are applied to all home sales, while the latter is applied only to homes sold repeatedly. The time effect can be converted into a house price index. To evaluate the proposed model and the resulting index, single-family home sales for twenty US metropolitan areas from July 1985 through September 2004 are analyzed. The model is shown to have better predictive abilities than the benchmark S&P/Case-Shiller model, which is a repeat sales model, and a conventional mixed effects model. Finally, Los Angeles, CA, is used to illustrate a historical housing market downturn. Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/10-AOAS380 by the Institute of Mathematical Statistics (http://www.imstat.org

    Minimax Linear Estimation in a White Noise Problem

    Linear estimation of f(x) at a point in a white noise model is considered. The exact linear minimax estimator of f(0) is found for the family of f(x) in which f′(x) is Lip(M). The resulting estimator is then used to verify a conjecture of Sacks and Ylvisaker concerning the near optimality of the Epanechnikov kernel.

    Bayesian Aspects of Some Nonparametric Problems

    We study the Bayesian approach to nonparametric function estimation problems such as nonparametric regression and signal estimation. We consider the asymptotic properties of Bayes procedures for conjugate (= Gaussian) priors. We show that so long as the prior puts nonzero measure on the very large parameter set of interest, the Bayes estimators are not satisfactory. More specifically, we show that these estimators do not achieve the correct minimax rate over norm bounded sets in the parameter space. Thus all Bayes estimators for proper Gaussian priors have zero asymptotic efficiency in this minimax sense. We then present a class of priors whose Bayes procedures attain the optimal minimax rate of convergence. These priors may be viewed as compound, or hierarchical, mixtures of suitable Gaussian distributions.

    Bayesian Nonparametric Point Estimation Under a Conjugate Prior

    Estimation of a nonparametric regression function at a point is considered. The function is assumed to lie in a Sobolev space, S_q, of order q. The asymptotic squared-error performance of Bayes estimators corresponding to Gaussian priors is investigated as the sample size, n, increases. It is shown that for any such fixed prior on S_q the Bayes procedures do not attain the optimal minimax rate over balls in S_q. This result complements that in Zhao (Ann. Statist. 28 (2000) 532) for estimating the entire regression function, but the proof is rather different.

    A Geometrical Explanation of Stein Shrinkage

    Shrinkage estimation has become a basic tool in the analysis of high-dimensional data. Historically and conceptually a key development toward this was the discovery of the inadmissibility of the usual estimator of a multivariate normal mean. This article develops a geometrical explanation for this inadmissibility. By exploiting the spherical symmetry of the problem it is possible to effectively conceptualize the multidimensional setting in a two-dimensional framework that can be easily plotted and geometrically analyzed. We begin with the heuristic explanation for inadmissibility that was given by Stein [In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, 1954–1955, Vol. I (1956) 197–206, Univ. California Press]. Some geometric figures are included to make this reasoning more tangible. It is also explained why Stein's argument falls short of yielding a proof of inadmissibility, even when the dimension, p, is much larger than p = 3. We then extend the geometric idea to yield increasingly persuasive arguments for inadmissibility when p ≥ 3, albeit at the cost of increased geometric and computational detail.

    Are Large Language Models Ready for Healthcare? A Comparative Study on Clinical Language Understanding

    Large language models (LLMs) have made significant progress in various domains, including healthcare. However, the specialized nature of clinical language understanding tasks presents unique challenges and limitations that warrant further investigation. In this study, we conduct a comprehensive evaluation of state-of-the-art LLMs, namely GPT-3.5, GPT-4, and Bard, within the realm of clinical language understanding tasks. These tasks span a diverse range, including named entity recognition, relation extraction, natural language inference, semantic textual similarity, document classification, and question-answering. We also introduce a novel prompting strategy, self-questioning prompting (SQP), tailored to enhance LLMs' performance by eliciting informative questions and answers pertinent to the clinical scenarios at hand. Our evaluation underscores the significance of task-specific learning strategies and prompting techniques for improving LLMs' effectiveness in healthcare-related tasks. Additionally, our in-depth error analysis on the challenging relation extraction task offers valuable insights into error distribution and potential avenues for improvement using SQP. Our study sheds light on the practical implications of employing LLMs in the specialized domain of healthcare, serving as a foundation for future research and the development of potential applications in healthcare settings. Comment: 19 pages, preprint

    Statistical Analysis of a Telephone Call Center: A Queueing-Science Perspective

    A call center is a service network in which agents provide telephone-based services. Customers that seek these services are delayed in tele-queues. This paper summarizes an analysis of a unique record of call center operations. The data comprise a complete operational history of a small banking call center, call by call, over a full year. Taking the perspective of queueing theory, we decompose the service process into three fundamental components: arrivals, customer abandonment behavior and service durations. Each component involves different basic mathematical structures and requires a different style of statistical analysis. Some of the key empirical results are sketched, along with descriptions of the varied techniques required. Several statistical techniques are developed for analysis of the basic components. One of these is a test that a point process is a Poisson process. Another involves estimation of the mean function in a nonparametric regression with lognormal errors. A new graphical technique is introduced for nonparametric hazard rate estimation with censored data. Models are developed and implemented for forecasting of Poisson arrival rates. We then survey how the characteristics deduced from the statistical analyses form the building blocks for theoretically interesting and practically useful mathematical models for call center operations. Key Words: call centers, queueing theory, lognormal distribution, inhomogeneous Poisson process, censored data, human patience, prediction of Poisson rates, Khintchine-Pollaczek formula, service times, arrival rate, abandonment rate, multiserver queues.