A review of domain adaptation without target labels
Domain adaptation has become a prominent problem setting in machine learning
and related fields. This review asks the question: how can a classifier learn
from a source domain and generalize to a target domain? We present a
categorization of approaches, divided into what we refer to as sample-based,
feature-based and inference-based methods. Sample-based methods focus on
weighting individual observations during training based on their importance to
the target domain. Feature-based methods revolve around mapping, projecting
and representing features such that a source classifier performs well on the
target domain. Inference-based methods incorporate adaptation into the
parameter estimation procedure, for instance through constraints on the
optimization procedure. Additionally, we review a number of conditions that
allow for formulating bounds on the cross-domain generalization error. Our
categorization highlights recurring ideas and raises questions important to
further research.
Comment: 20 pages, 5 figure
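As a rough illustration of the sample-based idea described in this abstract, the sketch below reweights source samples by the density ratio w(x) = p_target(x) / p_source(x). It is not taken from the review. It assumes a covariate-shift setting with known one-dimensional Gaussian densities, whereas in practice the densities are unknown and the weights have to be estimated. All names (`gaussian_pdf`, `importance_weights`, `weighted_mean`, `SRC`, `TGT`) are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weights(xs, src, tgt):
    """w(x) = p_target(x) / p_source(x); both densities are ASSUMED known here."""
    return [gaussian_pdf(x, *tgt) / gaussian_pdf(x, *src) for x in xs]


def weighted_mean(values, weights):
    """Self-normalised weighted average of per-sample values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


random.seed(0)
SRC = (0.0, 1.0)  # hypothetical source domain: mean 0, std 1
TGT = (1.0, 1.0)  # hypothetical target domain: mean 1, std 1

xs = [random.gauss(*SRC) for _ in range(20000)]  # samples from the source only
losses = [x * x for x in xs]                     # stand-in per-sample loss

weights = importance_weights(xs, SRC, TGT)
plain = sum(losses) / len(losses)           # estimates the source expectation (1)
reweighted = weighted_mean(losses, weights)  # estimates the target expectation (2)
```

With these assumed densities the target expectation of x² is 2 (variance plus squared mean), so the reweighted average moves from the source value towards the target value using source samples alone.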
A Bayesian approach to the study of white dwarf binaries in LISA data: The application of a reversible jump Markov chain Monte Carlo method
The Laser Interferometer Space Antenna (LISA) defines new demands on data
analysis efforts in its all-sky gravitational wave survey, recording
simultaneously thousands of galactic compact object binary foreground sources
and tens to hundreds of background sources like binary black hole mergers and
extreme mass ratio inspirals. We approach this problem with an adaptive and
fully automatic Reversible Jump Markov Chain Monte Carlo sampler, able to
sample from the joint posterior density function (as established by Bayes'
theorem) for a given mixture of signals "out of the box", handling the total
number of signals as an additional unknown parameter beside the unknown
parameters of each individual source and the noise floor. We show in examples
from the LISA Mock Data Challenge implementing the full response of LISA in its
TDI description that this sampler is able to extract monochromatic Double White
Dwarf signals out of colored instrumental noise and additional foreground and
background noise successfully in a global fitting approach. We introduce two
examples with a fixed number of signals (MCMC sampling), and one example with
an unknown number of signals (RJ-MCMC), the latter further promoting the idea
behind an experimental adaptation of the model indicator proposal densities in
the main sampling stage. We note that the experienced runtimes and degeneracies
in parameter extraction limit the shown examples to the extraction of a low but
realistic number of signals.
Comment: 18 pages, 9 figures, 3 tables, accepted for publication in PRD,
revised versio
Data Deluge in Astrophysics: Photometric Redshifts as a Template Use Case
Astronomy has entered the big data era and Machine Learning based methods
have found widespread use in a large variety of astronomical applications. This
is demonstrated by the recent huge increase in the number of publications
making use of this new approach. The usage of machine learning methods, however,
is still far from trivial and many problems still need to be solved. Using the
evaluation of photometric redshifts as a case study, we outline the main
problems and some ongoing efforts to solve them.
Comment: 13 pages, 3 figures, Springer's Communications in Computer and
Information Science (CCIS), Vol. 82
Image formation in synthetic aperture radio telescopes
Next generation radio telescopes will be much larger, more sensitive, have
much larger observation bandwidth and will be capable of pointing multiple
beams simultaneously. Obtaining the sensitivity, resolution and dynamic range
supported by the receivers requires the development of new signal processing
techniques for array and atmospheric calibration as well as new imaging
techniques that are both more accurate and computationally efficient since data
volumes will be much larger. This paper provides a tutorial overview of
existing image formation techniques and outlines some of the future directions
needed for information extraction from future radio telescopes. We describe the
imaging process from the measurement equation through deconvolution, both as a
Fourier inversion problem and as an array processing estimation problem. The
latter formulation enables the development of more advanced techniques based on
state of the art array processing. We demonstrate the techniques on simulated
and measured radio telescope data.
Comment: 12 page
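To make the "Fourier inversion" view of imaging mentioned in this abstract concrete, the sketch below forms a one-dimensional dirty image by direct Fourier inversion of simulated visibilities. It is not the paper's method. It assumes ideal point sources, noiseless and perfectly calibrated visibilities, integer baselines in wavelengths, and the sign convention V(u) = Σ flux · exp(-2πi·u·l). Deconvolution is not attempted. All function and variable names are hypothetical.

```python
import cmath
import math


def visibilities(baselines, sources):
    """Simulate V(u) = sum_k flux_k * exp(-2*pi*i*u*l_k) for (flux, l) point sources."""
    return [
        sum(flux * cmath.exp(-2j * math.pi * u * l) for flux, l in sources)
        for u in baselines
    ]


def dirty_image(baselines, vis, grid):
    """Direct Fourier inversion: I_D(l) = mean over u of Re[V(u) * exp(+2*pi*i*u*l)]."""
    n = len(baselines)
    return [
        sum((v * cmath.exp(2j * math.pi * u * l)).real
            for u, v in zip(baselines, vis)) / n
        for l in grid
    ]


# Hypothetical setup: 40 baselines and one unit-flux source at l = 0.2.
baselines = [float(u) for u in range(1, 41)]
sources = [(1.0, 0.2)]
grid = [i / 100.0 - 0.5 for i in range(101)]  # direction cosines from -0.5 to 0.5

vis = visibilities(baselines, sources)
image = dirty_image(baselines, vis, grid)

# The brightest pixel of the dirty image marks the recovered source position.
peak_l = grid[max(range(len(grid)), key=image.__getitem__)]
```

The sidelobes around the peak come from the finite set of sampled baselines, and removing them is the job of the deconvolution step that this sketch leaves out.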
Data Mining and Machine Learning in Astronomy
We review the current state of data mining and machine learning in astronomy.
'Data Mining' can have a somewhat mixed connotation from the point of view of a
researcher in this field. If used correctly, it can be a powerful approach,
holding the potential to fully exploit the exponentially increasing amount of
available data, promising great scientific advance. However, if misused, it can
be little more than the black-box application of complex computing algorithms
that may give little physical insight, and provide questionable results. Here,
we give an overview of the entire data mining process, from data collection
through to the interpretation of results. We cover common machine learning
algorithms, such as artificial neural networks and support vector machines,
applications from a broad range of astronomy, emphasizing those where data
mining techniques directly resulted in improved science, and important current
and future directions, including probability density functions, parallel
algorithms, petascale computing, and the time domain. We conclude that, so long
as one carefully selects an appropriate algorithm, and is guided by the
astronomical problem at hand, data mining can be very much the powerful tool,
and not the questionable black box.
Comment: Published in IJMPD. 61 pages, uses ws-ijmpd.cls. Several extra
figures, some minor additions to the tex