Improving PSF modelling for weak gravitational lensing using new methods in model selection
A simple theoretical framework for the description and interpretation of
spatially correlated modelling residuals is presented, and the resulting tools
are found to provide a useful aid to model selection in the context of weak
gravitational lensing. The description is focused upon the specific problem of
modelling the spatial variation of a telescope point spread function (PSF)
across the instrument field of view, a crucial stage in lensing data analysis,
but the technique may be used to rank competing models wherever data are
described empirically. As such it may, with further development, provide useful
extra information when used in combination with existing model selection
techniques such as the Akaike and Bayesian Information Criteria, or the
Bayesian evidence. Two independent diagnostic correlation functions are
described and the interpretation of these functions demonstrated using a
simulated PSF anisotropy field. The efficacy of these diagnostic functions as
an aid to the correct choice of empirical model is then demonstrated by
analyzing results for a suite of Monte Carlo simulations of random PSF fields
with varying degrees of spatial structure, and it is shown how the diagnostic
functions can be related to requirements for precision cosmic shear
measurement. The limitations of the technique, and opportunities for
improvements and applications to fields other than weak gravitational lensing,
are discussed.
Comment: 18 pages, 12 figures. Modified to match the version accepted for publication in MNRAS.
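The core diagnostic idea lends itself to a short illustration. The sketch below is a simplified stand-in rather than the paper's exact estimator: it computes a binned two-point correlation function of PSF ellipticity residuals over the field of view, with all data and function names invented for the example. Spatially structured residuals (a sign of underfitting) produce positive correlation at small separations, while pure noise stays near zero.

```python
# A minimal sketch (not the paper's exact estimator): a binned two-point
# correlation function of PSF ellipticity residuals across the field of view.
# Underfitting leaves spatial structure, which shows up as positive
# correlation at small separations; a good model yields ~zero everywhere.
import numpy as np

def residual_correlation(x, y, e1_res, e2_res, r_bins):
    """Binned correlation xi(r) = <e_res(a) . e_res(b)> over star pairs."""
    dx = x[:, None] - x[None, :]
    dy = y[:, None] - y[None, :]
    r = np.hypot(dx, dy)
    dot = e1_res[:, None] * e1_res[None, :] + e2_res[:, None] * e2_res[None, :]
    iu = np.triu_indices(len(x), k=1)          # unique pairs only
    r, dot = r[iu], dot[iu]
    xi = np.empty(len(r_bins) - 1)
    for i in range(len(xi)):
        sel = (r >= r_bins[i]) & (r < r_bins[i + 1])
        xi[i] = dot[sel].mean() if sel.any() else np.nan
    return xi

# Toy usage: white-noise residuals should give xi(r) consistent with zero.
rng = np.random.default_rng(0)
x, y = rng.uniform(0, 1, 500), rng.uniform(0, 1, 500)
e1, e2 = 0.01 * rng.standard_normal((2, 500))
print(residual_correlation(x, y, e1, e2, np.linspace(0.01, 0.5, 8)))
```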
A renewal cluster model for the inter-arrival times of rainfall events
A statistical model, based on a renewal cluster point process, is proposed and used to infer the
distributional properties of dry periods in a continuous-time record. The model incorporates a mixed
probability distribution in which inter-arrival times are classified into two distinct types, representing
cyclonic and anticyclonic weather. This results in rainfall events being clustered in time, and enables
objective probabilistic statements to be made about storm properties, e.g. the expected number of events
in a storm cluster. The model is fitted to data taken from a gauge near Wellington, New Zealand, by
maximising the likelihood function with respect to the parameters. The Akaike Information Criterion is
used to select the best fitting distributions from a range of candidates. The log-Normal distribution is
found to provide the best fit to the times between successive storm clusters, whilst the Weibull
distribution is found to provide the best fit to the times between successive events in the same storm
cluster. Harmonic curves are used to provide a parsimonious parameterisation, allowing for the seasonal
variation in precipitation. Under the fitted model, the interval series is transformed into a residual series,
which is assessed to determine overall goodness of fit.
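As a rough illustration of the two-type inter-arrival idea, the sketch below fits a simple log-Normal/Weibull mixture to simulated inter-arrival times by maximum likelihood. It ignores the renewal cluster structure, seasonality, and the classification of individual gaps, and all parameter values are invented for the example.

```python
# A minimal sketch, not the paper's full renewal cluster process: a two-type
# mixture for inter-arrival times, log-Normal for gaps between storm clusters
# and Weibull for gaps within a cluster, fitted by maximum likelihood.
import numpy as np
from scipy import stats, optimize

def neg_log_lik(theta, t):
    p, mu, sig, k, lam = theta          # mixing weight + component parameters
    f = (p * stats.lognorm.pdf(t, s=sig, scale=np.exp(mu))
         + (1 - p) * stats.weibull_min.pdf(t, c=k, scale=lam))
    return -np.sum(np.log(f + 1e-300))

# Simulated inter-arrival times (hours) from a known mixture, then refit;
# in practice AIC would be compared across candidate component distributions.
rng = np.random.default_rng(1)
t = np.concatenate([
    stats.lognorm.rvs(s=0.8, scale=np.exp(4.0), size=300, random_state=rng),
    stats.weibull_min.rvs(c=0.9, scale=6.0, size=700, random_state=rng)])
res = optimize.minimize(neg_log_lik, x0=[0.3, 4.0, 1.0, 1.0, 5.0], args=(t,),
                        bounds=[(0.01, 0.99), (0, 8), (0.1, 3),
                                (0.1, 5), (0.1, 50)],
                        method="L-BFGS-B")
print(res.x)
```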
Quantifying correlations between galaxy emission lines and stellar continua
We analyse the correlations between continuum properties and emission line
equivalent widths of star-forming and active galaxies from the Sloan Digital
Sky Survey. Since upcoming large sky surveys will make broad-band observations
only, incorporating strong emission lines into theoretical modelling of spectra
will be essential to estimate physical properties of photometric galaxies. We
show that emission line equivalent widths can be fairly well reconstructed from
the stellar continuum using local multiple linear regression in the continuum
principal component analysis (PCA) space. Line reconstruction is good for
star-forming galaxies and reasonable for galaxies with active nuclei. We
propose a practical method to combine stellar population synthesis models with
empirical modelling of emission lines. The technique will help generate more
accurate model spectra and mock catalogues of galaxies to fit observations of
the new surveys. More accurate modelling of emission lines is also expected to
improve template-based photometric redshift estimation methods. We also show
that, by combining PCA coefficients from the pure continuum and the emission
lines, automatic distinction between hosts of weak active galactic nuclei
(AGNs) and quiescent star-forming galaxies can be made. The classification
method is based on a training set consisting of high-confidence starburst
galaxies and AGNs, and separates active and star-forming galaxies similarly to
the empirical curve found by Kauffmann et al. We
demonstrate the use of three important machine learning algorithms in the
paper: k-nearest neighbour finding, k-means clustering and support vector
machines.
Comment: 14 pages, 14 figures. Accepted by MNRAS on 2015 December 22. The paper's website with data and code is at http://www.vo.elte.hu/papers/2015/emissionlines
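The reconstruction scheme can be sketched in a few lines. The example below uses invented stand-in data rather than SDSS spectra, and simplifies the method to its core: local multiple linear regression in the continuum PCA space, fitting a linear model to each galaxy's nearest neighbours to predict an equivalent width.

```python
# A minimal sketch with hypothetical data: predict an emission line
# equivalent width (EW) from the stellar continuum using local multiple
# linear regression in the PCA space of the continuum.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
latent = rng.standard_normal((2000, 5))          # hidden continuum parameters
basis = rng.standard_normal((5, 300))
spectra = latent @ basis + 0.05 * rng.standard_normal((2000, 300))
ew = np.sin(latent[:, 0]) + 0.5 * latent[:, 1] \
     + 0.05 * rng.standard_normal(2000)          # nonlinear stand-in EWs

pcs = PCA(n_components=5).fit_transform(spectra)  # continuum PCA coefficients
train, test = pcs[:1500], pcs[1500:]
ew_train = ew[:1500]

nn = NearestNeighbors(n_neighbors=50).fit(train)
pred = np.empty(len(test))
for i, point in enumerate(test):
    _, idx = nn.kneighbors(point[None, :])        # local neighbourhood
    local = LinearRegression().fit(train[idx[0]], ew_train[idx[0]])
    pred[i] = local.predict(point[None, :])[0]
print(np.corrcoef(pred, ew[1500:])[0, 1])         # reconstruction quality
```

The locality is what handles nonlinearity here: a single global linear fit would miss the sinusoidal dependence, while many small linear fits track it.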
Approximate Bayesian Computation by Modelling Summary Statistics in a Quasi-likelihood Framework
Approximate Bayesian Computation (ABC) is a useful class of methods for
Bayesian inference when the likelihood function is computationally intractable.
In practice, the basic ABC algorithm may be inefficient in the presence of
discrepancy between prior and posterior. Therefore, more elaborate methods,
such as ABC with the Markov chain Monte Carlo algorithm (ABC-MCMC), should be
used. However, the elaboration of a proposal density for MCMC is a sensitive
issue and very difficult in the ABC setting, where the likelihood is
intractable. We discuss an automatic proposal distribution useful for ABC-MCMC
algorithms. This proposal is inspired by the theory of quasi-likelihood (QL)
functions and is obtained by modelling the distribution of the summary
statistics as a function of the parameters. Essentially, given a real-valued
vector of summary statistics, we reparametrize the model by means of a
regression function of the statistics on parameters, obtained by sampling from
the original model in a pilot-run simulation study. The QL theory is well
established for a scalar parameter, and it is shown that when the conditional
variance of the summary statistic is assumed constant, the QL has a closed-form
normal density. This idea of constructing proposal distributions is extended to
non-constant variance and to real-valued parameter vectors. The method is
illustrated by several examples and by an application to a real problem in
population genetics.
Comment: Published at http://dx.doi.org/10.1214/14-BA921 in Bayesian Analysis (http://projecteuclid.org/euclid.ba) by the International Society of Bayesian Analysis (http://bayesian.org/).
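A toy version of the proposal construction is easy to write down. The sketch below is a heavily simplified stand-in for the quasi-likelihood machinery: it uses a pilot run to regress the summary statistic on a scalar parameter and derives only a random-walk proposal scale from the fitted slope and residual spread. The model (Gaussian with known variance, flat prior) and all numbers are invented for illustration.

```python
# A minimal sketch (toy model, simplified): ABC-MCMC where the random-walk
# proposal scale comes from a pilot-run regression of the summary statistic
# on the parameter, in the spirit of the quasi-likelihood construction.
# Model: x ~ N(theta, 1), summary = sample mean, flat prior on theta.
import numpy as np

rng = np.random.default_rng(3)
n, eps = 50, 0.05
s_obs = 2.0                                      # observed summary statistic

def summary(theta):
    return rng.normal(theta, 1.0, n).mean()

# Pilot run: regress the summary on the parameter to calibrate the proposal.
grid = np.linspace(-5, 5, 200)
s_pilot = np.array([summary(t) for t in grid])
b, a = np.polyfit(grid, s_pilot, 1)              # s ~ a + b * theta
sigma_s = np.std(s_pilot - (a + b * grid))
step = sigma_s / abs(b)                          # statistic scale -> theta scale

# ABC-MCMC with the calibrated symmetric proposal (flat prior cancels in
# the acceptance ratio); start from a pilot-based point estimate.
theta, chain = s_obs, []
for _ in range(5000):
    prop = theta + rng.normal(0.0, step)
    if abs(summary(prop) - s_obs) < eps:
        theta = prop
    chain.append(theta)
print(np.mean(chain[1000:]))                     # posterior mean near 2.0
```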
A general class of zero-or-one inflated beta regression models
This paper proposes a general class of regression models for continuous
proportions when the data contain zeros or ones. The proposed class of models
assumes that the response variable has a mixed continuous-discrete distribution
with probability mass at zero or one. The beta distribution is used to describe
the continuous component of the model, since its density has a wide range of
different shapes depending on the values of the two parameters that index the
distribution. We use a suitable parameterization of the beta law in terms of
its mean and a precision parameter. The parameters of the mixture distribution
are modeled as functions of regression parameters. We provide inference,
diagnostic, and model selection tools for this class of models. A practical
application that employs real data is presented.
Comment: 21 pages, 3 figures, 5 tables. Computational Statistics and Data Analysis, 17 October 2011, ISSN 0167-9473 (http://www.sciencedirect.com/science/article/pii/S0167947311003628).
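The mixture likelihood is compact enough to show directly. The sketch below assumes the mean/precision parameterisation described in the abstract and writes down the zero-inflated case without covariates; in the full regression model the three parameters would each be linked to regressors. Data and starting values are invented.

```python
# A minimal sketch, assuming the mean/precision parameterisation: the
# log-likelihood of a zero-inflated beta model, with point mass alpha at
# zero and a Beta(mu*phi, (1-mu)*phi) continuous component on (0, 1).
import numpy as np
from scipy import stats, optimize, special

def neg_log_lik(theta, y):
    alpha = special.expit(theta[0])              # P(y = 0)
    mu = special.expit(theta[1])                 # mean of the beta component
    phi = np.exp(theta[2])                       # precision parameter
    ll = np.where(y == 0, np.log(alpha),
                  np.log1p(-alpha)
                  + stats.beta.logpdf(np.clip(y, 1e-12, 1 - 1e-12),
                                      mu * phi, (1 - mu) * phi))
    return -ll.sum()

# Simulate from a known zero-inflated beta and recover the parameters.
rng = np.random.default_rng(4)
y = np.where(rng.uniform(size=2000) < 0.2, 0.0,
             rng.beta(0.6 * 30, 0.4 * 30, size=2000))
res = optimize.minimize(neg_log_lik, x0=[0.0, 0.0, 1.0], args=(y,))
print(special.expit(res.x[:2]), np.exp(res.x[2]))   # alpha, mu, phi
```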
Automated model selection in finance: General-to-specific modelling of the mean and volatility specifications
General-to-Specific (GETS) modelling has witnessed major advances over the last decade thanks to the automation of multi-path GETS specification search. However, several scholars have argued that the estimation complexity associated with financial models constitutes an obstacle to multi-path GETS modelling in finance. Making use of a recent result on log-GARCH models, we provide and study simple but general and flexible methods that automate financial multi-path GETS modelling. Starting from a general model where the mean specification can contain autoregressive (AR) terms and explanatory variables, and where the exponential volatility specification can include log-ARCH terms, asymmetry terms, volatility proxies and other explanatory variables, the algorithm we propose returns parsimonious mean and volatility specifications. The finite sample properties of the methods are studied by means of extensive Monte Carlo simulations, and two empirical applications suggest the methods are very useful in practice.
Keywords: general-to-specific; specification search; model selection; finance; volatility
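A single-path caricature of the specification search conveys the flavour. The sketch below is far simpler than the paper's multi-path algorithm and ignores the volatility equation entirely: it repeatedly drops the least significant regressor from a general OLS mean model until every survivor is significant. Data, coefficients, and the significance threshold are invented.

```python
# A minimal single-path sketch of the GETS idea (the real algorithm searches
# multiple deletion paths and also covers the volatility specification):
# start from a general OLS mean model and delete the least significant
# regressor until all remaining ones pass the significance threshold.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 500
X = rng.standard_normal((n, 8))                  # candidate regressors
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.standard_normal(n)

keep = list(range(X.shape[1]))
while keep:
    fit = sm.OLS(y, sm.add_constant(X[:, keep])).fit()
    pvals = fit.pvalues[1:]                      # skip the constant
    worst = int(np.argmax(pvals))
    if pvals[worst] < 0.05:                      # all survivors significant
        break
    del keep[worst]
print("retained regressors:", keep, "AIC:", fit.aic)
```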
General-to-specific modelling of exchange rate volatility: a forecast evaluation
The general-to-specific (GETS) methodology is widely employed in the modelling of
economic series, but less so in financial volatility modelling due to computational
complexity when many explanatory variables are involved. This study proposes a
simple way of avoiding this problem when the conditional mean can appropriately be
restricted to zero, and undertakes an out-of-sample forecast evaluation of the
methodology applied to the modelling of weekly exchange rate volatility. Our findings
suggest that GETS specifications perform comparatively well in both ex post and ex
ante forecasting as long as sufficient care is taken with respect to functional form and
with respect to how the conditioning information is used. Also, our forecast
comparison provides an example of a discrete-time explanatory model being more
accurate ex post than realised volatility in 1-step forecasting.
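One reading of the zero-mean device is illustrated below: with the conditional mean restricted to zero, log squared returns equal the log conditional variance plus an i.i.d. error, so an exponential (log-ARCH type) volatility specification can be estimated by ordinary least squares, sidestepping GARCH-style numerical optimisation. The simulation and parameter values are invented, and the intercept would need the usual E[log z^2] bias correction.

```python
# A minimal sketch of the zero-mean device: if the conditional mean is zero,
# log r_t^2 = log sigma_t^2 + log z_t^2, so a log-ARCH(1) volatility
# specification can be estimated by OLS on log squared returns.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 1000
r = np.empty(n)
r[0] = 1.0
for t in range(1, n):                            # simulate a log-ARCH(1)
    logsig2 = -0.2 + 0.6 * np.log(r[t - 1] ** 2 + 1e-12)
    r[t] = np.exp(0.5 * logsig2) * rng.standard_normal()

ly2 = np.log(r[100:] ** 2 + 1e-12)               # log squared returns, burn-in dropped
fit = sm.OLS(ly2[1:], sm.add_constant(ly2[:-1])).fit()
# The slope estimates the log-ARCH coefficient (near 0.6 here); the intercept
# absorbs E[log z_t^2] (about -1.27 for Gaussian z) and needs correcting
# before the fitted values can be read as log conditional variances.
print(fit.params)
```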