The Data Big Bang and the Expanding Digital Universe: High-Dimensional, Complex and Massive Data Sets in an Inflationary Epoch
Recent and forthcoming advances in instrumentation, and giant new surveys,
are creating astronomical data sets that are not amenable to the methods of
analysis familiar to astronomers. Traditional methods are often inadequate not
merely because of the size in bytes of the data sets, but also because of the
complexity of modern data sets. Mathematical limitations of familiar algorithms
and techniques in dealing with such data sets create a critical need for new
paradigms for the representation, analysis and scientific visualization (as
opposed to illustrative visualization) of heterogeneous, multiresolution data
across application domains. Some of the problems presented by the new data sets
have been addressed by other disciplines such as applied mathematics,
statistics and machine learning and have been utilized by other sciences such
as space-based geosciences. Unfortunately, valuable results pertaining to these
problems are mostly to be found only in publications outside of astronomy. Here
we offer brief overviews of a number of concepts, techniques and developments,
some "old" and some new. These are generally unknown to most of the
astronomical community, but are vital to the analysis and visualization of
complex datasets and images. In order for astronomers to take advantage of the
richness and complexity of the new era of data, and to be able to identify,
adopt, and apply new solutions, the astronomical community needs a certain
degree of awareness and understanding of the new concepts. One of the goals of
this paper is to help bridge the gap between applied mathematics, artificial
intelligence and computer science on the one side and astronomy on the other.
Comment: 24 pages, 8 figures, 1 table. Accepted for publication in "Advances in
Astronomy", special issue "Robotic Astronomy".
Data Mining and Machine Learning in Astronomy
We review the current state of data mining and machine learning in astronomy.
'Data Mining' can have a somewhat mixed connotation from the point of view of a
researcher in this field. If used correctly, it can be a powerful approach,
holding the potential to fully exploit the exponentially increasing amount of
available data, promising great scientific advance. However, if misused, it can
be little more than the black-box application of complex computing algorithms
that may give little physical insight, and provide questionable results. Here,
we give an overview of the entire data mining process, from data collection
through to the interpretation of results. We cover common machine learning
algorithms, such as artificial neural networks and support vector machines,
applications from a broad range of astronomy, emphasizing those where data
mining techniques directly resulted in improved science, and important current
and future directions, including probability density functions, parallel
algorithms, petascale computing, and the time domain. We conclude that, so long
as one carefully selects an appropriate algorithm, and is guided by the
astronomical problem at hand, data mining can be very much the powerful tool,
and not the questionable black box.
Comment: Published in IJMPD. 61 pages, uses ws-ijmpd.cls. Several extra
figures, some minor additions to the tex
On the application of machine learning approaches in astronomy: Exploring novel representations of high-dimensional and complex astronomical data
The goal of the presented work is the application of data-driven methods to complex and
high-dimensional astronomical databases. The focus of the work is the exploration of novel data
representations in order to enable the use of statistical learning approaches in the analysis of
data. With the help of diverse science cases, the advantages of the introduced approaches for
classification, visualization and regression tasks are shown by applying the developed methodology
to astronomical data.
In the first part, an alternative approach for estimating redshifts of spectra by using the
knowledge about the redshifts provided by the SDSS pipeline is presented. A novel data
representation is employed which contains only information relevant for estimating the redshift and
the detection of multiple redshift systems. Subsequently, a novel data representation for
regularly sampled light curves based on recurrent networks is presented. This allows an explorative
investigation of huge databases with unlabeled data. Finally, a new way of representing the static
part of irregularly sampled light curves by a mixture of Gaussians is discussed. This
representation is more general than the extraction of features, as it allows the inclusion of photometric
uncertainties and avoids the introduction of observational biases.
Spectral variability studies in Active Galactic Nuclei: Exploring continuum and emission line regions in the age of LSST and JWST
The investigation of emission line regions within active galaxies (AGNs) has
a rich and extensive history, now extending to the use of AGNs and quasars as
"standardizable" cosmological indicators, shedding light on the evolution of
our universe. As we enter the era of advanced observatories, such as the
successful launch of JWST and the forthcoming Vera C. Rubin Observatory's
Legacy Survey of Space and Time (LSST), the landscape of AGN exploration across
cosmic epochs is poised for exciting advancements. In this work, we delve into
recent developments in AGN variability research, anticipating the substantial
influx of data facilitated by LSST. The article highlights recent strides made
by the AGN Polish Consortium in their contributions to LSST. The piece
emphasizes the role of quasars in cosmology, dissecting the intricacies of
their calibration as standard candles. The primary focus centers on the
relationship between the broad-line region size and luminosity, showcasing
recent breakthroughs that enhance our comprehension of this correlation. These
breakthroughs encompass a range of perspectives, including spectroscopic
analyses, photoionization modeling, and collaborative investigations with other
cosmological tools. The study further touches on select studies, underlining
how the synergy of theoretical insights and advancements in observational
capabilities has yielded deeper insights into these captivating cosmic
entities.
Comment: 34 pages, 5 figures, accepted for publication as a review in
MDPI/Univers
Planetary Science Informatics and Data Analytics Conference : April 24–26, 2018, St. Louis, Missouri
The PSIDA conference provides a forum to discuss approaches, challenges, and applications of informatics and data analytics technologies and capabilities in planetary science.
Institutional Support: NASA Planetary Data System Geosciences, Lunar and Planetary Institute.
Chairs: Tom Stein, Washington University, St. Louis, USA; Dan Crichton, Jet Propulsion Laboratory, Pasadena, USA.
Program Committee: Alphan Altinok, Jet Propulsion Laboratory, Pasadena, USA … [and 8 others]
PARTIAL CONTENTS: ESA Planetary Science Archive Architecture and Data Management -- SPICE for ESA Planetary Missions -- VESPA: Enlarging the Virtual Observatory to Planetary Science -- SeaBIRD: A Flexible and Intuitive Planetary Datamining Infrastructure -- Model-Driven Development for PDS4 Software and Services -- The Need for a Planetary Spatial Data Clearinghouse -- The Relationship Between Planetary Spatial Data Infrastructure and the Planetary Data System -- Update on the NASA-USGS Planetary Spatial Data Infrastructure Inter-Agency Agreement -- MoonDB - A Data System for Analytical Data of Lunar Samples -- Large-Scale Numerical Simulations of Planetary Interiors -- Scalable Data Processing with the LROC Processing Pipelines -- PACKMAN-Net: A Distributed, Open-Access, and Scalable Network of User-Friendly Space Weather Stations