Some Aspects of Measurement Error in Linear Regression of Astronomical Data
I describe a Bayesian method to account for measurement errors in linear
regression of astronomical data. The method allows for heteroscedastic and
possibly correlated measurement errors, and intrinsic scatter in the regression
relationship. The method is based on deriving a likelihood function for the
measured data, and I focus on the case when the intrinsic distribution of the
independent variables can be approximated using a mixture of Gaussians. I
generalize the method to incorporate multiple independent variables,
non-detections, and selection effects (e.g., Malmquist bias). A Gibbs sampler
is described for simulating random draws from the probability distribution of
the parameters, given the observed data. I use simulation to compare the method
with other common estimators. The simulations illustrate that the Gaussian
mixture model outperforms these estimators and can effectively constrain the
regression parameters, even when the measurement errors dominate the observed
scatter, the source detection fraction is low, or the
intrinsic distribution of the independent variables is not a mixture of
Gaussians. I conclude by using this method to fit the X-ray spectral slope as a
function of Eddington ratio using a sample of 39 z < 0.8 radio-quiet quasars. I
confirm the correlation seen by other authors between the radio-quiet quasar
X-ray spectral slope and the Eddington ratio, where the X-ray spectral slope
softens as the Eddington ratio increases.

Comment: 39 pages, 11 figures, 1 table, accepted by ApJ. IDL routines
(linmix_err.pro) for performing the Markov Chain Monte Carlo are available at
the IDL astronomy user's library, http://idlastro.gsfc.nasa.gov/homepage.htm
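As a rough illustration of the hierarchical model sketched in this abstract (and not of the linmix_err.pro routine itself), the following Python snippet simulates data from a Gaussian-mixture distribution of the independent variable, adds intrinsic scatter and heteroscedastic measurement errors, and shows how a naive least-squares fit attenuates the slope; all parameter values and variable names are illustrative.

# Sketch of the generative model described above (not the linmix_err.pro code):
# true covariates xi come from a Gaussian mixture, the true response eta follows
# a linear relation with intrinsic scatter, and only noisy (heteroscedastic)
# versions x, y are observed.
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Gaussian-mixture prior on the true independent variable xi (2 components, illustrative).
weights, means, sigmas = [0.6, 0.4], [0.0, 3.0], [1.0, 0.5]
comp = rng.choice(2, size=n, p=weights)
xi = rng.normal(np.array(means)[comp], np.array(sigmas)[comp])

# Linear regression relation with intrinsic scatter.
alpha, beta, intrinsic_sigma = 1.0, 0.5, 0.3
eta = alpha + beta * xi + rng.normal(0.0, intrinsic_sigma, size=n)

# Heteroscedastic measurement errors on both variables.
sx = rng.uniform(0.2, 0.8, size=n)   # per-point error bars on x
sy = rng.uniform(0.2, 0.8, size=n)   # per-point error bars on y
x = xi + rng.normal(0.0, sx)
y = eta + rng.normal(0.0, sy)

# A naive least-squares fit ignores the x errors and biases the slope toward zero.
beta_ols = np.polyfit(x, y, 1)[0]
print("true beta =", beta, " naive least-squares beta ≈", round(beta_ols, 3))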
Designing a Belief Function-Based Accessibility Indicator to Improve Web Browsing for Disabled People
The purpose of this study is to provide an accessibility measure of web pages,
in order to draw disabled users to the pages that have been designed to be
accessible to them. Our approach is based on the theory of belief functions,
using data supplied by reports produced by automatic web content assessors that
test the validity of criteria defined by the WCAG 2.0 guidelines proposed by
the World Wide Web Consortium (W3C). These tools detect errors with gradual
degrees of certainty and their results do not always converge. For these
reasons, to fuse the information coming from the reports, we choose an
information fusion framework that can take into account the uncertainty and
imprecision of information as well as divergences between sources. Our
accessibility indicator covers four categories of deficiencies. To validate the
theoretical approach in this context, we propose an evaluation on a corpus of
the 100 most visited French news websites, using two evaluation tools. The
results obtained illustrate the value of our accessibility indicator.
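As a minimal, self-contained illustration of the kind of fusion used here (not the paper's actual indicator), the sketch below applies Dempster's rule of combination to two hypothetical assessor reports expressed as mass functions on the frame {accessible, not accessible}; the mass values are invented for the example.

# Minimal Dempster's rule of combination over the frame {A, N} (accessible / not accessible).
# Mass values are illustrative; the paper's indicator fuses WCAG 2.0 criterion reports
# from several assessors, per deficiency category.
from itertools import product

FRAME = frozenset({"A", "N"})

def combine(m1, m2):
    """Dempster's rule: conjunctive combination followed by normalization."""
    out, conflict = {}, 0.0
    for (s1, v1), (s2, v2) in product(m1.items(), m2.items()):
        inter = s1 & s2
        if inter:
            out[inter] = out.get(inter, 0.0) + v1 * v2
        else:
            conflict += v1 * v2
    return {s: v / (1.0 - conflict) for s, v in out.items()}

# Two assessors report on the same page with different certainty (illustrative masses).
m_tool1 = {frozenset({"A"}): 0.6, frozenset({"N"}): 0.1, FRAME: 0.3}
m_tool2 = {frozenset({"A"}): 0.4, frozenset({"N"}): 0.3, FRAME: 0.3}
print(combine(m_tool1, m_tool2))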
Condition monitoring of an advanced gas-cooled nuclear reactor core
A critical component of an advanced gas-cooled reactor station is the graphite core. As a station ages, the graphite bricks that comprise the core can distort and may eventually crack. Since the core cannot be replaced, its integrity ultimately determines the station life. Monitoring these distortions is usually restricted to the routine outages, which occur every few years, as this is the only time the reactor core can be accessed by external sensing equipment. This paper presents a model-based monitoring module that uses measurements obtained during the refuelling process. A fault detection and isolation filter based on unknown input observer techniques is developed. The role of this filter is to estimate the friction force produced by the interaction between the wall of the fuel channel and the fuel assembly supporting brushes. This allows an estimate to be made of the shape of the graphite bricks that comprise the core and, therefore, any distortion in them to be monitored.
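The paper's filter is an unknown input observer built on a model of the refuelling process; as a much simpler stand-in for the underlying idea, the sketch below treats an unknown friction force as a slowly varying state and estimates it from a noisy load measurement with a scalar Kalman filter. The model, noise levels, and numbers are all illustrative assumptions.

# Treat the unknown channel friction force as a random-walk state and estimate it
# from a noisy load measurement with a scalar Kalman filter. Everything here is a
# toy model, not the unknown-input-observer filter of the paper.
import numpy as np

rng = np.random.default_rng(1)
n_steps = 400

true_friction = np.concatenate([np.full(200, 2.0), np.full(200, 3.5)])  # step change (illustrative)
known_load = 10.0                                                        # known weight component
meas = known_load + true_friction + rng.normal(0.0, 0.5, n_steps)        # noisy load measurement

q, r = 1e-3, 0.25        # process and measurement noise variances (tuning parameters)
f_hat, p = 0.0, 1.0      # friction estimate and its variance
estimates = []
for z in meas:
    p += q                                   # predict: random-walk friction model
    k = p / (p + r)                          # Kalman gain
    f_hat += k * ((z - known_load) - f_hat)  # update with the friction-related residual
    p *= (1.0 - k)
    estimates.append(f_hat)

print("final friction estimate ≈", round(estimates[-1], 2), "(true value 3.5)")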
Application of Monte Carlo Algorithms to the Bayesian Analysis of the Cosmic Microwave Background
Power spectrum estimation and evaluation of associated errors in the presence
of incomplete sky coverage; non-homogeneous, correlated instrumental noise; and
foreground emission is a problem of central importance for the extraction of
cosmological information from the cosmic microwave background. We develop a
Monte Carlo approach for the maximum likelihood estimation of the power
spectrum. The method is based on an identity for the Bayesian posterior as a
marginalization over unknowns. Maximization of the posterior involves the
computation of expectation values as a sample average from maps of the cosmic
microwave background and foregrounds given some current estimate of the power
spectrum or cosmological model, and some assumed statistical characterization
of the foregrounds. Maps of the CMB are sampled by a linear transform of a
Gaussian white noise process, implemented numerically with conjugate gradient
descent. For time series data with N_{t} samples, and N pixels on the sphere,
the method has a computational expense $KO[N^{2} + N_{t} \log N_{t}]$,
where K is a prefactor determined by the convergence rate of conjugate gradient
descent. Preconditioners for conjugate gradient descent are given for scans
close to great circle paths, and the method allows partial sky coverage for
these cases by numerically marginalizing over the unobserved, or removed,
region.

Comment: submitted to Ap
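A toy version of the constrained-realization step described above, under simplifying assumptions (a small one-dimensional "map", a smooth toy signal covariance S, and a diagonal noise covariance N rather than the CMB setting of the paper): a sample of the signal is drawn by solving (S^-1 + N^-1) s = N^-1 d + S^-1/2 w1 + N^-1/2 w2 with conjugate gradients, i.e. as a linear transform of white noise.

# Toy constrained-realization sampler: draw s ~ p(s | d, S, N) by solving a linear
# system with conjugate gradients. Covariances, grid, and sizes are illustrative.
import numpy as np
from scipy.sparse.linalg import cg

rng = np.random.default_rng(2)
npix = 64
x = np.arange(npix)

# Toy signal covariance (smooth correlations) and diagonal, inhomogeneous noise covariance.
S = np.exp(-0.5 * ((x[:, None] - x[None, :]) / 5.0) ** 2) + 1e-2 * np.eye(npix)
N = np.diag(rng.uniform(0.05, 0.5, npix))

def inv_sqrt(M):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(M)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

S_inv, N_inv = np.linalg.inv(S), np.linalg.inv(N)

# Simulated data: signal plus noise.
d = np.linalg.cholesky(S) @ rng.normal(size=npix) + np.linalg.cholesky(N) @ rng.normal(size=npix)

# Right-hand side: data term plus two white-noise fluctuation terms.
b = N_inv @ d + inv_sqrt(S) @ rng.normal(size=npix) + inv_sqrt(N) @ rng.normal(size=npix)
A = S_inv + N_inv

s_sample, info = cg(A, b)    # conjugate-gradient solve; info == 0 means convergence
print("CG converged:", info == 0, " sample mean:", round(s_sample.mean(), 3))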
Clustering as an example of optimizing arbitrarily chosen objective functions
This paper is a reflection on the common practice of solving various types of learning problems by optimizing arbitrarily chosen criteria, in the hope that they are well correlated with the criterion actually used to assess the results. We investigate this issue using clustering as an example, and first propose a unified view of clustering as an optimization problem. This view stems from the belief that typical design choices in clustering, such as the number of clusters or the similarity measure, can be, and often are, suboptimal also with respect to the clustering quality measures later used for algorithm comparison and ranking. To illustrate this point, we propose a generalized clustering framework and provide a proof of concept using standard benchmark datasets and two popular clustering methods for comparison.
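A small illustration of the mismatch the paper discusses, using an assumed setup (scikit-learn's k-means on synthetic blobs) rather than the paper's own framework or benchmark protocol: the criterion optimized by the algorithm (inertia) keeps improving as k grows, while an assessment criterion (silhouette score) does not.

# The optimized criterion (k-means inertia) and the assessment criterion
# (silhouette score) need not agree on the best number of clusters.
# Dataset and score choices are illustrative.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=600, centers=4, cluster_std=1.5, random_state=0)

for k in range(2, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(f"k={k}  inertia={km.inertia_:10.1f}  silhouette={silhouette_score(X, km.labels_):.3f}")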
Reconstruction of photon statistics using low performance photon counters
The output of a photodetector consists of a current pulse whose charge has
the statistical distribution of the actual photon numbers convolved with a
Bernoulli distribution. Photodetectors are characterized by a nonunit quantum
efficiency, i.e. not all the photons lead to a charge, and by a finite
resolution, i.e. different numbers of detected photons lead to
discriminable values of the charge only up to a maximum value. We present a
detailed comparison, based on Monte Carlo simulated experiments and real data,
among the performances of detectors with different upper limits of counting
capability. In our scheme the inversion of Bernoulli convolution is performed
by maximum-likelihood methods assisted by measurements taken at different
quantum efficiencies. We show that detectors that are only able to discriminate
between zero, one, and more than one detected photon are generally enough to
provide a reliable reconstruction of the photon statistics for single-peaked
distributions, while detectors with higher resolution limits do not lead to
further improvements. In addition, we demonstrate that, for semiclassical
states, even on/off detectors are enough to provide a good reconstruction.
Finally, we show that a reliable reconstruction of multi-peaked distributions
requires either a higher quantum efficiency or a better capability of
discriminating high numbers of detected photons.

Comment: 8 pages, 3 figures
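To make the Bernoulli-convolution forward model concrete, the sketch below builds the binomial loss matrices for a few assumed quantum efficiencies, applies them to a single-peaked (coherent-state) photon distribution, and inverts the stacked, noise-free system with non-negative least squares. This is a simplified stand-in for the maximum-likelihood reconstruction used in the paper, and all values are illustrative.

# Bernoulli (binomial-loss) forward model plus a simple least-squares inversion
# using click statistics recorded at several quantum efficiencies (illustrative).
import numpy as np
from scipy.special import comb
from scipy.optimize import nnls
from scipy.stats import poisson

n_max = 15                      # photon-number cutoff
n = np.arange(n_max + 1)

def loss_matrix(eta):
    """B[m, n] = P(m detected | n incident) for a detector of quantum efficiency eta."""
    m = n[:, None]
    return comb(n[None, :], m) * eta**m * (1 - eta)**(n[None, :] - m) * (m <= n[None, :])

# "True" photon statistics: a coherent state with mean photon number 3 (single peaked).
p_true = poisson.pmf(n, 3.0)

etas = [0.3, 0.5, 0.7]                           # measurements at different efficiencies
A = np.vstack([loss_matrix(eta) for eta in etas])
b = A @ p_true                                   # ideal (noise-free) detected distributions

p_rec, _ = nnls(A, b)                            # non-negative least-squares inversion
p_rec /= p_rec.sum()
print("max reconstruction error:", np.abs(p_rec - p_true).max())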
The Time Machine: A Simulation Approach for Stochastic Trees
In the following paper we consider a simulation technique for stochastic
trees. One of the most important areas in computational genetics is the
calculation and subsequent maximization of the likelihood function associated
with such models. This typically involves importance sampling (IS) and
sequential Monte Carlo (SMC) techniques. The approach proceeds by simulating
the tree, backward in time from observed data, to a most recent common ancestor
(MRCA). However, in many cases the computational time and the variance of the
estimators are too high for standard approaches to be useful. In this paper
we propose to stop the simulation early, which yields biased estimates of
the likelihood surface. The bias is investigated from a theoretical point of
view. Results from simulation studies are also given to investigate the balance
between loss of accuracy, savings in computing time, and variance reduction.

Comment: 22 pages, 5 figures
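As a toy picture of the backward-in-time simulation and the early-stopping idea (not the paper's IS/SMC machinery), the sketch below simulates a neutral Kingman coalescent from a sample either all the way to the MRCA or stopped after a fixed number of coalescent events; the sample size and stopping rule are illustrative.

# Neutral Kingman coalescent simulated backward in time, optionally stopped early.
import numpy as np

rng = np.random.default_rng(3)

def coalescent_height(n_sample, max_events=None):
    """Simulate coalescence times backward in time; optionally stop early."""
    k, t, events = n_sample, 0.0, 0
    while k > 1:
        if max_events is not None and events >= max_events:
            break                                   # stopped before reaching the MRCA
        t += rng.exponential(2.0 / (k * (k - 1)))   # Exp(k choose 2) waiting time
        k -= 1
        events += 1
    return t, k                                     # elapsed time and remaining lineages

full = np.mean([coalescent_height(20)[0] for _ in range(2000)])
stopped = np.mean([coalescent_height(20, max_events=10)[0] for _ in range(2000)])
print(f"mean tree height to MRCA ≈ {full:.3f}, stopped after 10 events ≈ {stopped:.3f}")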
The Weibull-Geometric distribution
In this paper we introduce, for the first time, the Weibull-Geometric
distribution which generalizes the exponential-geometric distribution proposed
by Adamidis and Loukas (1998). The hazard function of the latter distribution is
monotone decreasing but the hazard function of the new distribution can take
more general forms. Unlike the Weibull distribution, the proposed distribution
is useful for modeling unimodal failure rates. We derive the cumulative
distribution and hazard functions and the density of the order statistics, and
we calculate expressions for the moments of the distribution and of the order
statistics. We give expressions for the Rényi and Shannon entropies. The
maximum likelihood estimation procedure is discussed, and an EM algorithm
(Dempster et al., 1977; McLachlan and Krishnan, 1997) is provided for
estimating the parameters. We obtain the information matrix and discuss
inference. Applications to real data sets are given to show the flexibility and
potential of the proposed distribution.
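Assuming the standard compounding construction behind this family (the minimum of a geometric number of i.i.d. Weibull variables, in analogy with the exponential-geometric case of Adamidis and Loukas), which may use a different parametrization than the paper's, the sketch below checks the implied closed-form survival function against Monte Carlo draws.

# Compounding construction: X = min(W_1, ..., W_N) with W_i i.i.d. Weibull and N geometric.
# Under this construction the survival function is
#   S(x) = (1 - p) * exp(-(b*x)**a) / (1 - p * exp(-(b*x)**a)).
# Parametrization and values are illustrative.
import numpy as np

rng = np.random.default_rng(4)
a, b, p = 1.8, 1.0, 0.6          # Weibull shape, Weibull scale parameter, geometric parameter

# Monte Carlo draws from the compound construction, P(N = n) = (1 - p) * p**(n - 1).
N = rng.geometric(1 - p, size=100_000)
samples = np.array([(rng.weibull(a, size=k) / b).min() for k in N])

# Compare the empirical survival with the closed-form survival implied by compounding.
x = np.linspace(0.1, 2.5, 6)
sw = np.exp(-(b * x) ** a)                        # plain Weibull survival
S_closed = (1 - p) * sw / (1 - p * sw)
S_emp = np.array([(samples > xi).mean() for xi in x])
print("closed form:", np.round(S_closed, 3))
print("Monte Carlo:", np.round(S_emp, 3))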