    Conditional Transformation Models

    The ultimate goal of regression analysis is to obtain information about the conditional distribution of a response given a set of explanatory variables. This goal is, however, seldom achieved because most established regression models only estimate the conditional mean as a function of the explanatory variables and assume that higher moments are not affected by the regressors. The underlying reason for such a restriction is the assumption of additivity of signal and noise. We propose to relax this common assumption in the framework of transformation models. The novel class of semiparametric regression models proposed herein allows transformation functions to depend on explanatory variables. These transformation functions are estimated by regularised optimisation of scoring rules for probabilistic forecasts, e.g. the continuous ranked probability score. The corresponding estimated conditional distribution functions are consistent. Conditional transformation models are potentially useful for describing possible heteroscedasticity, comparing spatially varying distributions, identifying extreme events, deriving prediction intervals and selecting variables beyond mean regression effects. An empirical investigation based on a heteroscedastic varying coefficient simulation model demonstrates that semiparametric estimation of conditional distribution functions can be more beneficial than kernel-based non-parametric approaches or parametric generalised additive models for location, scale and shape.
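The abstract names the continuous ranked probability score (CRPS) as one scoring rule that can be optimised. As a rough illustration only, and not the authors' estimator, the sketch below evaluates the CRPS of an ensemble (sample-based) forecast against a single observation, using the standard identity CRPS(F, y) = E|X - y| - 0.5 E|X - X'|. The function name `crps_ensemble` and its arguments are hypothetical.

```python
def crps_ensemble(samples, y):
    """CRPS of the empirical distribution of `samples` for observation `y`.

    Assumption: the forecast is represented by a finite sample, so the
    expectations in CRPS(F, y) = E|X - y| - 0.5 * E|X - X'| become averages.
    Lower values indicate a better probabilistic forecast.
    """
    m = len(samples)
    if m == 0:
        raise ValueError("need at least one sample")
    # Average distance between forecast draws and the observation.
    term_obs = sum(abs(x - y) for x in samples) / m
    # Average distance between all pairs of forecast draws (spread term).
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (m * m)
    return term_obs - 0.5 * term_spread


if __name__ == "__main__":
    # A single-point forecast reduces to the absolute error.
    print(crps_ensemble([1.0], 3.0))
    # A two-point forecast straddling the observation.
    print(crps_ensemble([0.0, 2.0], 1.0))
```

The pairwise spread term costs O(m^2) time, which is fine for small ensembles; a sorted-sample formulation would be preferable for large ones.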

    A Simple Baseline for Travel Time Estimation using Large-Scale Trip Data

    The increased availability of large-scale trajectory data around the world provides rich information for the study of urban dynamics. For example, the New York City Taxi and Limousine Commission regularly releases source-destination information about trips in the taxis it regulates. Taxi data provide information about traffic patterns, and thus enable the study of urban flow -- what will traffic between two locations look like at a certain date and time in the future? Existing big data methods try to outdo each other in terms of complexity and algorithmic sophistication. In the spirit of "big data beats algorithms", we present a very simple baseline which outperforms state-of-the-art approaches, including Bing Maps and Baidu Maps (whose APIs permit large-scale experimentation). Such a travel time estimation baseline has several important uses, such as navigation (fast travel time estimates can serve as approximate heuristics for A* search variants for path finding) and trip planning (which uses operating hours for popular destinations along with travel time estimates to create an itinerary).
    Comment: 12 pages
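The abstract mentions that fast travel time estimates can act as approximate heuristics for A* search. The sketch below shows one way that could look, under stated assumptions: the road network is a hypothetical adjacency dictionary with edge travel times, and `estimate(u, v)` is a placeholder for whatever travel time estimator is available. It is not the paper's baseline itself.

```python
import heapq


def a_star(graph, start, goal, estimate):
    """Return the travel time of the route found from `start` to `goal`.

    graph:    dict mapping node -> list of (neighbour, travel_time) pairs
              (hypothetical representation of a road network).
    estimate: callable (node, goal) -> estimated remaining travel time.
              If it can overestimate, the route found may be suboptimal,
              which is why such a heuristic is only "approximate".
    """
    # Heap entries are (priority, cost so far, node).
    frontier = [(estimate(start, goal), 0.0, start)]
    best = {start: 0.0}
    while frontier:
        _, cost, node = heapq.heappop(frontier)
        if node == goal:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale entry; a cheaper path to `node` was found later
        for nxt, travel_time in graph.get(node, []):
            new_cost = cost + travel_time
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                priority = new_cost + estimate(nxt, goal)
                heapq.heappush(frontier, (priority, new_cost, nxt))
    return None  # goal not reachable


if __name__ == "__main__":
    toy_graph = {
        "a": [("b", 4.0), ("c", 1.0)],
        "c": [("b", 2.0)],
        "b": [],
    }
    # A zero heuristic turns A* into Dijkstra's algorithm.
    print(a_star(toy_graph, "a", "b", lambda u, v: 0.0))
```

With a zero heuristic the search is exact; plugging in a data-driven estimate trades some optimality guarantees for fewer expanded nodes.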