A likelihood method to cross-calibrate air-shower detectors
We present a detailed statistical treatment of the energy calibration of
hybrid air-shower detectors, which combine a surface detector array and a
fluorescence detector, to obtain an unbiased estimate of the calibration curve.
The special features of calibration data from air showers lead to biased
results if a standard least-squares fit is applied to the problem. We develop
a general maximum-likelihood approach, based on the detailed statistical model,
to solve the problem. Our approach was developed for the Pierre Auger
Observatory, but the applied principles are general and can be transferred to
other air-shower experiments, even to the cross-calibration of other
observables. Since our general likelihood function is expensive to compute, we
derive two approximations with significantly smaller computational cost. In
recent years both have been used to calibrate data of the Pierre Auger
Observatory. We demonstrate that these approximations introduce negligible bias
when they are applied to simulated toy experiments, which mimic realistic
experimental conditions.
Comment: 10 pages, 7 figures
An apple-to-apple comparison of Learning-to-rank algorithms in terms of Normalized Discounted Cumulative Gain
The Normalized Discounted Cumulative Gain (NDCG) is a widely used evaluation metric for learning-to-rank (LTR) systems. NDCG is designed for ranking tasks with more than one relevance level. There are many freely available, open source tools for computing the NDCG score for a ranked result list. Even though the definition of NDCG is unambiguous, the various tools can produce different scores for ranked lists with certain properties, which undermines the empirical tests in many published papers and makes results published in different studies difficult to compare. In this study, first, we identify the major differences between the various publicly available NDCG evaluation tools. Second, based on a set of comparative experiments using a common benchmark dataset in LTR research and 6 different LTR algorithms, we demonstrate how these differences affect the overall performance of different algorithms and the final scores that are used to compare different systems.
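To illustrate how two NDCG implementations can disagree on the same ranked list, here is a minimal Python sketch. It is not taken from the paper or from any of the tools it compares. The function names `dcg` and `ndcg` and the parameters `exponential_gain` and `empty_value` are hypothetical. The two conventions shown (linear versus exponential gain, and the score assigned when the ideal DCG is zero) are assumed examples of implementation choices; the abstract does not say which properties the study actually identifies.

```python
import math


def dcg(relevances, k=None, exponential_gain=True):
    """Discounted cumulative gain of a list of graded relevance labels."""
    rels = relevances[:k] if k is not None else relevances
    total = 0.0
    for rank, rel in enumerate(rels, start=1):
        # Assumed convention switch: gain 2^rel - 1 versus plain rel.
        gain = (2 ** rel - 1) if exponential_gain else rel
        total += gain / math.log2(rank + 1)
    return total


def ndcg(relevances, k=None, exponential_gain=True, empty_value=0.0):
    """DCG divided by the DCG of the ideally sorted list.

    empty_value is the score returned when the ideal DCG is zero
    (no relevant document for the query); this is a convention,
    not something the formula itself fixes.
    """
    ideal = sorted(relevances, reverse=True)
    idcg = dcg(ideal, k, exponential_gain)
    if idcg == 0.0:
        return empty_value
    return dcg(relevances, k, exponential_gain) / idcg


if __name__ == "__main__":
    ranking = [1, 2, 0]  # relevance labels in ranked order
    print(ndcg(ranking, exponential_gain=True))
    print(ndcg(ranking, exponential_gain=False))
    print(ndcg([0, 0, 0], empty_value=0.0), ndcg([0, 0, 0], empty_value=1.0))
```

Averaging over a query set that contains queries without relevant documents gives different system-level scores depending on `empty_value`, which is one way published numbers could become hard to compare.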
Theoretical insights into the electronic structure of nickel(0)-diphosphine-carbon dioxide complexes
The coordination properties of carbon dioxide bound to Ni(0) with various phosphines have been investigated by means of DFT calculations. A reasonable linear correlation has been found between Tolman’s electronic parameters (TEPs) and the asymmetric stretching frequency of the coordinated CO2. Two descriptors from EDA-NOCV calculations, namely the interaction energy and the Hirshfeld charge associated with the back donation component, also gave an acceptable linear correlation with the TEPs. The coordination strength, as well as the C=O bond order in coordinated carbon dioxide, can be tuned by varying the substituents on phosphorus: in the presence of electron withdrawing groups the C=O bond remains stronger and the Ni-C interaction is weaker; moreover, a new Ni-O bond path is formed. For more basic diphosphines, the Ni-C bond order is higher and the coordinated carbon dioxide possesses a weaker C=O bond.
Elastic principal manifolds and their practical applications
Principal manifolds serve as a useful tool for many practical applications.
These manifolds are defined as lines or surfaces passing through "the middle"
of the data distribution. We propose an algorithm for fast construction of grid
approximations of principal manifolds with a given topology. It is based on
an analogy between a principal manifold and an elastic membrane. The first
advantage of this method is that the functional to be minimized becomes
quadratic at the vertex-position refinement step. This makes the algorithm very
efficient, especially for parallel implementations. Another advantage is that
the same algorithmic kernel is applied to construct principal manifolds of
different dimensions and topologies. We demonstrate how the flexibility of the
approach allows numerous adaptive strategies, such as principal graph
construction. The algorithm is implemented as a C++ package elmap and as part
of the stand-alone data visualization tool VidaExpert, available on the web. We
describe the approach and provide several examples of its application with
speed performance characteristics.
Comment: 26 pages, 10 figures, edited final version
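The remark that the functional is quadratic in the node positions at the refinement step can be illustrated with a small Python sketch for a one-dimensional chain of nodes. The abstract does not give the functional, so the form used here (mean squared distance of points to their assigned nodes, plus a stretching penalty on edges weighted by `lam` and a bending penalty on consecutive node triples weighted by `mu`) is an assumption. All names (`nearest_node`, `elastic_energy`, `refine_nodes`) are hypothetical and are not the elmap or VidaExpert API. NumPy is assumed to be available.

```python
import numpy as np


def nearest_node(X, Y):
    """Index of the closest node in Y for every data point in X."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)


def elastic_energy(X, Y, assign, lam, mu):
    """Assumed energy: approximation + stretching + bending terms."""
    approx = np.mean(np.sum((X - Y[assign]) ** 2, axis=1))
    stretch = np.sum(np.diff(Y, axis=0) ** 2)       # edge lengths squared
    bend = np.sum(np.diff(Y, n=2, axis=0) ** 2)     # second differences squared
    return approx + lam * stretch + mu * bend


def refine_nodes(X, assign, m, lam, mu):
    """Minimize the energy over node positions with the assignment fixed.

    With assignments fixed the energy is quadratic in Y, so the minimum
    is the solution of one linear system A @ Y = B (requires lam > 0).
    """
    n = X.shape[0]
    E = np.diff(np.eye(m), axis=0)        # first-difference operator
    R = np.diff(np.eye(m), n=2, axis=0)   # second-difference operator
    counts = np.bincount(assign, minlength=m)
    A = np.diag(counts / n) + lam * E.T @ E + mu * R.T @ R
    B = np.zeros((m, X.shape[1]))
    np.add.at(B, assign, X / n)           # per-node sums of assigned points
    return np.linalg.solve(A, B)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    Y = np.linspace([-1.0, 0.0], [1.0, 0.0], 8)   # initial straight chain
    for _ in range(10):                            # alternate the two steps
        assign = nearest_node(X, Y)
        Y = refine_nodes(X, assign, len(Y), lam=0.05, mu=0.5)
    print(elastic_energy(X, Y, assign, 0.05, 0.5))
```

Alternating the assignment step with the linear solve resembles k-means with a smoothness penalty. Because each refinement is one symmetric linear system shared by all coordinates, this step would be a natural candidate for parallel solvers.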
Elastic Maps and Nets for Approximating Principal Manifolds and Their Application to Microarray Data Visualization
Principal manifolds are defined as lines or surfaces passing through ``the
middle'' of data distribution. Linear principal manifolds (Principal Components
Analysis) are routinely used for dimension reduction, noise filtering and data
visualization. Recently, methods for constructing non-linear principal
manifolds were proposed, including our elastic maps approach which is based on
a physical analogy with elastic membranes. We have developed a general
geometric framework for constructing ``principal objects'' of various
dimensions and topologies with the simplest quadratic form of the smoothness
penalty which allows very effective parallel implementations. Our approach is
implemented in three programming languages (C++, Java and Delphi) with two
graphical user interfaces (VidaExpert
http://bioinfo.curie.fr/projects/vidaexpert and ViMiDa
http://bioinfo-out.curie.fr/projects/vimida applications). In this paper we
overview the method of elastic maps and present in detail one of its major
applications: the visualization of microarray data in bioinformatics. We show
that the method of elastic maps outperforms linear PCA in terms of data
approximation, representation of between-point distance structure, preservation
of local point neighborhood and representing point classes in low-dimensional
spaces.
Comment: 35 pages, 10 figures
Representing complex data using localized principal components with application to astronomical data
Often the relation between the variables constituting a multivariate data
space might be characterized by one or more of the terms: ``nonlinear'',
``branched'', ``disconnected'', ``bended'', ``curved'', ``heterogeneous'', or,
more generally, ``complex''. In these cases, simple principal component analysis
(PCA) as a tool for dimension reduction can fail badly. Of the many alternative
approaches proposed so far, local approximations of PCA are among the most
promising. This paper will give a short review of localized versions of PCA,
focusing on local principal curves and local partitioning algorithms.
Furthermore we discuss projections other than the local principal components.
When performing local dimension reduction for regression or classification
problems it is important to focus not only on the manifold structure of the
covariates, but also on the response variable(s). Local principal components
only achieve the former, whereas localized regression approaches concentrate on
the latter. Local projection directions derived from the partial least squares
(PLS) algorithm offer an interesting trade-off between these two objectives. We
apply these methods to several real data sets. In particular, we consider
simulated astrophysical data from the future Galactic survey mission Gaia.
Comment: 25 pages. In "Principal Manifolds for Data Visualization and
Dimension Reduction", A. Gorban, B. Kegl, D. Wunsch, and A. Zinovyev (eds),
Lecture Notes in Computational Science and Engineering, Springer, 2007, pp.
180--204,
http://www.springer.com/dal/home/generic/search/results?SGWID=1-40109-22-173750210-
Operations of and Future Plans for the Pierre Auger Observatory
Technical reports on operations and features of the Pierre Auger Observatory,
including ongoing and planned enhancements and the status of the future
northern hemisphere portion of the Observatory. Contributions to the 31st
International Cosmic Ray Conference, Lodz, Poland, July 2009.
Comment: Contributions to the 31st ICRC, Lodz, Poland, July 2009
PCA Beyond The Concept of Manifolds: Principal Trees, Metro Maps, and Elastic Cubic Complexes
Multidimensional data distributions can have complex topologies and variable
local dimensions. To approximate complex data, we propose a new type of
low-dimensional ``principal object'': a principal cubic complex. This complex
is a generalization of linear and non-linear principal manifolds and includes
them as a particular case. To construct such an object, we combine a method of
topological grammars with the minimization of an elastic energy defined for its
embedment into multidimensional data space. The whole complex is presented as a
system of nodes and springs and as a product of one-dimensional continua
(represented by graphs), and the grammars describe how these continua transform
during the process of optimal complex construction. The simplest case of a
topological grammar (``add a node'', ``bisect an edge'') is equivalent to the
construction of ``principal trees'', an object useful in many practical
applications. We demonstrate how it can be applied to the analysis of bacterial
genomes and for visualization of cDNA microarray data using the ``metro map''
representation. The preprint is supplemented by animation: ``How the
topological grammar constructs branching principal components
(AnimatedBranchingPCA.gif)''.
Comment: 19 pages, 8 figures
A search for point sources of EeV photons
Measurements of air showers made using the hybrid technique developed with
the fluorescence and surface detectors of the Pierre Auger Observatory allow a
sensitive search for point sources of EeV photons anywhere in the exposed sky.
A multivariate analysis reduces the background of hadronic cosmic rays. The
search is sensitive to a declination band from -85° to +20°, in an
energy range from 10^17.3 eV to 10^18.5 eV. No photon point source has been
detected. An upper limit on the photon flux has been derived for every
direction. The mean value of the energy flux limit that results from this,
assuming a photon spectral index of -2, is 0.06 eV cm^-2 s^-1, and no celestial
direction exceeds 0.25 eV cm^-2 s^-1. These upper limits constrain scenarios in
which EeV cosmic ray protons are emitted by non-transient sources in the
Galaxy.
Comment: 28 pages, 10 figures, accepted for publication in The Astrophysical
Journal