Optimal rates of convergence for persistence diagrams in Topological Data Analysis
Computational topology has recently seen important developments toward
data analysis, giving birth to the field of topological data analysis.
Topological persistence, or persistent homology, appears as a fundamental tool
in this field. In this paper, we study topological persistence in general
metric spaces, with a statistical approach. We show that the use of persistent
homology can be naturally considered in general statistical frameworks and
persistence diagrams can be used as statistics with interesting convergence
properties. Some numerical experiments are performed in various contexts to
illustrate our results.
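As a rough illustration of the kind of statistic the paper studies, the sketch below builds persistence diagrams from two independent point-cloud samples and compares them in the bottleneck distance. It assumes the gudhi Python package is available; the noisy-circle data and parameter choices are purely hypothetical and are not taken from the paper.

```python
import numpy as np
import gudhi  # assumed available; provides Rips complexes and bottleneck distance

rng = np.random.default_rng(0)

def sample_circle(n):
    """Hypothetical data: n noisy samples from the unit circle."""
    t = rng.uniform(0.0, 2.0 * np.pi, n)
    return np.column_stack([np.cos(t), np.sin(t)]) + 0.05 * rng.normal(size=(n, 2))

def h1_diagram(points):
    """Degree-1 persistence diagram of the Vietoris-Rips filtration on the sample."""
    rips = gudhi.RipsComplex(points=points, max_edge_length=2.0)
    st = rips.create_simplex_tree(max_dimension=2)
    st.persistence()
    return st.persistence_intervals_in_dimension(1)

dgm_a = h1_diagram(sample_circle(120))
dgm_b = h1_diagram(sample_circle(120))

# Diagrams of independent samples from the same space are close in bottleneck
# distance, the metric in which convergence rates of this kind are stated.
print(gudhi.bottleneck_distance(dgm_a, dgm_b))
```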
Methodological and empirical challenges in modelling residential location choices
The modelling of residential locations is a key element in land use and transport planning. There are significant empirical and methodological challenges inherent in such modelling, however, despite recent advances both in the availability of spatial datasets and in computational and choice modelling techniques.
One of the most important of these challenges concerns spatial aggregation. The housing market is characterised by the fact that it offers spatially and functionally heterogeneous products; as a result, if residential alternatives are represented as aggregated spatial units (as in conventional residential location models), the variability of dwelling attributes is lost, which may limit the predictive ability and policy sensitivity of the model. This thesis presents a modelling framework for residential location choice that addresses three key challenges: (i) the development of models at the dwelling-unit level, (ii) the treatment of spatial structure effects in such dwelling-unit level models, and (iii) problems associated with estimation in such modelling frameworks in the absence of disaggregated dwelling unit supply data. The proposed framework is applied to the residential location choice context in London.
Another important challenge in the modelling of residential locations is the choice set formation problem. Most models of residential location choices have been developed based on the assumption that households consider all available alternatives when they are making location choices. Due to the high search costs associated with the housing market, however, and the limited capacity of households to process information, the validity of this assumption has been the subject of ongoing debate among researchers. There have been some attempts in the literature to incorporate the cognitive capacities of households within discrete choice models of residential location: for instance, by modelling households’ choice sets exogenously based on simplifying assumptions regarding their spatial search behaviour (e.g., an anchor-based search strategy) and their characteristics. By undertaking an empirical comparison of alternative models within the context of residential location choice in the Greater London area, this thesis investigates the feasibility and practicality of applying deterministic choice set formation approaches to capture the underlying search process of households. The thesis also investigates the uncertainty of choice sets in residential location choice modelling and proposes a simplified probabilistic choice set formation approach to model choice sets and choices simultaneously.
The dwelling-level modelling framework proposed in this research is practice-ready and can be used to estimate residential location choice models at the level of dwelling units without requiring independent and disaggregated dwelling supply data. The empirical comparison of alternative exogenous choice set formation approaches provides a guideline for modellers and land use planners to avoid inappropriate choice set formation approaches in practice. Finally, the proposed simplified choice set formation model can be applied to model the behaviour of households in online real estate environments.
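As a minimal sketch of the discrete choice machinery underlying such models, the code below computes multinomial logit probabilities over a restricted choice set. The utilities, the availability mask, and the choice set rule are hypothetical stand-ins and are not the thesis's estimated model.

```python
import numpy as np

def mnl_probabilities(V, available):
    """Multinomial logit choice probabilities over a restricted choice set.

    V         : (n_alternatives,) systematic utilities of the alternatives
    available : boolean mask marking the alternatives in the household's choice set
    """
    # Shift by the max available utility for numerical stability, zero out excluded alternatives.
    expV = np.where(available, np.exp(V - V[available].max()), 0.0)
    return expV / expV.sum()

# Hypothetical example: 5 dwellings, only 3 of which survive an exogenous
# choice set formation rule (e.g., lying within a search anchor's radius).
V = np.array([1.2, 0.4, -0.3, 0.8, 0.1])
in_set = np.array([True, False, True, True, False])
print(mnl_probabilities(V, in_set))
```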
Minimum Distance Estimation of Milky Way Model Parameters and Related Inference
We propose a method based on the Hellinger distance to estimate the location
of the Sun in the disk of the Milky Way, and we construct confidence sets on
our estimate of the unknown location using a bootstrap-based
method. Assuming the Galactic disk to be two-dimensional, the sought solar
location then reduces to the radial distance separating the Sun from the
Galactic center and the angular separation of the Galactic center-to-Sun line
from a pre-fixed line on the disk. On astronomical scales, the unknown solar
location is equivalent to the location of us earthlings who observe the
velocities of a sample of stars in the neighborhood of the Sun. This unknown
location is estimated by undertaking pairwise comparisons of the estimated
density of the observed set of velocities of the sampled stars, with densities
estimated using synthetic stellar velocity data sets generated at chosen
locations in the Milky Way disk according to four base astrophysical models.
The "match" between the pair of estimated densities is parameterized by the
affinity measure based on the familiar Hellinger distance. We perform a novel
cross-validation procedure to establish a desirable "consistency" property of
the proposed method.
Comment: 25 pages, 10 figures. This version incorporates the suggestions made by the referees. To appear in SIAM/ASA Journal on Uncertainty Quantification.
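The core matching step can be sketched as follows: kernel density estimates of an observed and a synthetic velocity sample are compared through the Hellinger affinity. This is a schematic one-dimensional version with made-up velocity data, assuming numpy and scipy; the paper's actual astrophysical models and data are not reproduced here.

```python
import numpy as np
from scipy.stats import gaussian_kde

def hellinger_affinity(sample_a, sample_b, grid):
    """Hellinger affinity between kernel density estimates of two 1-D velocity samples.

    The affinity integral of sqrt(p * q) equals 1 - H^2, where H is the Hellinger
    distance, so larger values mean a better match between the estimated densities.
    """
    p = gaussian_kde(sample_a)(grid)
    q = gaussian_kde(sample_b)(grid)
    dx = grid[1] - grid[0]
    return np.sum(np.sqrt(p * q)) * dx  # Riemann sum on the uniform grid

# Hypothetical observed velocities vs. synthetic velocities generated at a trial solar location.
rng = np.random.default_rng(1)
observed = rng.normal(220.0, 20.0, size=500)
synthetic = rng.normal(215.0, 22.0, size=500)
grid = np.linspace(100.0, 350.0, 2000)
print(hellinger_affinity(observed, synthetic, grid))  # close to 1 for well-matched densities
```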
Techniques for the Fast Simulation of Models of Highly Dependable Systems
With the ever-increasing complexity and requirements of highly dependable systems, their evaluation during design and operation is becoming more crucial. Realistic models of such systems are often not amenable to analysis using conventional analytic or numerical methods. Therefore, analysts and designers turn to simulation to evaluate these models. However, accurate estimation of dependability measures of these models requires that the simulation frequently observes system failures, which are rare events in highly dependable systems. This renders ordinary simulation impractical for evaluating such systems. To overcome this problem, simulation techniques based on importance sampling have been developed, and are very effective in certain settings. When importance sampling works well, simulation run lengths can be reduced by several orders of magnitude when estimating transient as well as steady-state dependability measures. This paper reviews some of the importance-sampling techniques that have been developed in recent years to estimate dependability measures efficiently in Markov and non-Markov models of highly dependable systems.
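The idea behind importance sampling can be sketched on a toy rare-event problem: sample from a proposal under which the event is frequent and reweight by the likelihood ratio. The example below is a generic Gaussian mean shift, not one of the specific failure-biasing schemes reviewed in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
threshold = 6.0  # we estimate P(X > 6) for X ~ N(0, 1), roughly 9.9e-10

# Naive Monte Carlo almost never observes the event at this sample size.
naive = (rng.normal(0.0, 1.0, n) > threshold).mean()

# Importance sampling: draw from the shifted proposal N(6, 1) and reweight
# each sample by the likelihood ratio p(x) / q(x) of target over proposal.
x = rng.normal(threshold, 1.0, n)
log_ratio = -0.5 * x**2 + 0.5 * (x - threshold) ** 2  # log N(0,1) - log N(6,1); constants cancel
estimate = np.mean((x > threshold) * np.exp(log_ratio))

print(naive, estimate)  # the IS estimate is close to the true tail probability
```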
A Nonparametric Bayesian Approach to Copula Estimation
We propose a novel Dirichlet-based Pólya tree (D-P tree) prior on the
copula and, based on this prior, a nonparametric Bayesian inference
procedure. Through theoretical analysis and simulations, we show that the
flexibility of the D-P tree prior ensures its consistency in copula
estimation, allowing it to detect more subtle and complex copula structures
than earlier nonparametric Bayesian models, such as a Gaussian copula mixture.
Further, the continuity of the imposed D-P tree prior leads to a more favorable
smoothing effect in copula estimation over classic frequentist methods,
especially with small sets of observations. We also apply our method to the
copula prediction between the S&P 500 index and the IBM stock prices during
the 2007-08 financial crisis, finding that D-P tree-based methods enjoy strong
robustness and flexibility over classic methods under such irregular market
behaviors.
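For context, the classical frequentist starting point for copula estimation maps each margin to its scaled ranks (pseudo-observations); the sketch below does this for two simulated return series standing in for the S&P 500 and IBM data. It is not the D-P tree construction that the paper develops, and the data are made up.

```python
import numpy as np
from scipy.stats import rankdata

def pseudo_observations(x, y):
    """Map two samples to the unit square by scaled ranks (the empirical copula sample)."""
    n = len(x)
    u = rankdata(x) / (n + 1)
    v = rankdata(y) / (n + 1)
    return np.column_stack([u, v])

# Hypothetical daily returns standing in for the S&P 500 index and IBM prices.
rng = np.random.default_rng(3)
z = rng.normal(size=(1000, 2))
sp500 = z[:, 0]
ibm = 0.6 * z[:, 0] + 0.8 * z[:, 1]        # correlated with the index
uv = pseudo_observations(sp500, ibm)       # points in [0,1]^2 whose law is the copula
```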
Robust Geometry Estimation using the Generalized Voronoi Covariance Measure
The Voronoi Covariance Measure of a compact set K of R^d is a tensor-valued
measure that encodes geometric information on K and which is known to be
resilient to Hausdorff noise but sensitive to outliers. In this article, we
generalize this notion to any distance-like function delta and define the
delta-VCM. We show that the delta-VCM is resilient to Hausdorff noise and to
outliers, thus providing a tool to robustly estimate normals from a point cloud
approximation. We present experiments showing the robustness of our approach
for normal and curvature estimation and sharp feature detection.
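For comparison, a standard covariance-based normal estimator (local PCA over k nearest neighbours) can be sketched as below. This baseline is not the (delta-)VCM of the paper and, unlike it, offers no particular robustness guarantee against outliers; the point cloud and neighbourhood size are hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree

def pca_normals(points, k=20):
    """Classical local-PCA normal estimation: the normal at each point is the
    eigenvector of the smallest eigenvalue of its neighbourhood covariance."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k)
    normals = np.empty_like(points)
    for i, nb in enumerate(idx):
        cov = np.cov(points[nb].T)
        eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
        normals[i] = eigvecs[:, 0]
    return normals

# Hypothetical usage: noisy samples near the z = 0 plane, whose normals should be ~(0, 0, +-1).
rng = np.random.default_rng(5)
pts = np.column_stack([rng.uniform(-1, 1, 500), rng.uniform(-1, 1, 500), 0.01 * rng.normal(size=500)])
print(pca_normals(pts)[:3])
```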
What is the best risk measure in practice? A comparison of standard measures
Expected Shortfall (ES) has been widely accepted as a risk measure that is
conceptually superior to Value-at-Risk (VaR). At the same time, however, it has
been criticised for issues relating to backtesting. In particular, ES has been
found not to be elicitable, which means that backtesting for ES is less
straightforward than, e.g., backtesting for VaR. Expectiles have been suggested
as potentially better alternatives to both ES and VaR. In this paper, we
revisit commonly accepted desirable properties of risk measures like coherence,
comonotonic additivity, robustness and elicitability. We check VaR, ES and
Expectiles with regard to whether or not they enjoy these properties, with
particular emphasis on Expectiles. We also consider their impact on capital
allocation, an important issue in risk management. We find that, despite the
caveats that apply to the estimation and backtesting of ES, it can be
considered a good risk measure. As a consequence, there is no sufficient
evidence to justify an all-inclusive replacement of ES by Expectiles in
applications. For backtesting ES, we propose an empirical approach that
consists of replacing ES with a set of four quantiles, which should make it
possible to use backtesting methods for VaR.
Keywords: backtesting; capital allocation; coherence; diversification; elicitability; expected shortfall; expectile; forecasts; probability integral transform (PIT); risk measure; risk management; robustness; value-at-risk
Comment: 27 pages, 1 table
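The quantities being compared can be sketched empirically: the code below computes the sample VaR and ES at a given level and solves the first-order condition defining the tau-expectile. The profit-and-loss sample is made up, and this is a generic illustration rather than the paper's backtesting procedure.

```python
import numpy as np
from scipy.optimize import brentq

def var_es(losses, alpha=0.975):
    """Empirical Value-at-Risk and Expected Shortfall at level alpha (losses as positive numbers)."""
    var = np.quantile(losses, alpha)
    es = losses[losses >= var].mean()   # mean loss beyond the VaR threshold
    return var, es

def expectile(losses, tau=0.975):
    """tau-expectile: root of the asymmetric-least-squares first-order condition
    tau * E[(X - e)_+] = (1 - tau) * E[(e - X)_+]."""
    def foc(e):
        return tau * np.mean(np.maximum(losses - e, 0)) - (1 - tau) * np.mean(np.maximum(e - losses, 0))
    return brentq(foc, losses.min(), losses.max())

# Hypothetical heavy-tailed profit-and-loss sample.
rng = np.random.default_rng(4)
losses = rng.standard_t(df=4, size=10_000)
print(var_es(losses), expectile(losses))
```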