Astrophysical Data Analytics based on Neural Gas Models, using the Classification of Globular Clusters as Playground
In Astrophysics, the identification of candidate Globular Clusters through
deep, wide-field, single band HST images, is a typical data analytics problem,
where methods based on Machine Learning have revealed a high efficiency and
reliability, demonstrating the capability to improve the traditional
approaches. Here we experimented with some variants of the known Neural Gas model,
exploring both supervised and unsupervised paradigms of Machine Learning, on
the classification of Globular Clusters, extracted from the NGC1399 HST data.
The main focus of this work was to use a well-tested playground to scientifically
validate this kind of model for further extended experiments in astrophysics,
using other standard Machine Learning methods (for instance Random Forest
and Multi Layer Perceptron neural network) for a comparison of performance in
terms of purity and completeness.
Comment: Proceedings of the XIX International Conference "Data Analytics and
Management in Data Intensive Domains" (DAMDID/RCDL 2017), Moscow, Russia,
October 10-13, 2017, 8 pages, 4 figures
The Statistical Approach to Quantifying Galaxy Evolution
Studies of the distribution and evolution of galaxies are of fundamental
importance to modern cosmology; these studies, however, are hampered by the
complexity of the competing effects of spectral and density evolution.
Constructing a spectroscopic sample that is able to unambiguously disentangle
these processes is currently prohibitive because of the observational
requirements. This paper extends and applies an alternative approach that
relies on statistical estimates for both distance (z) and spectral type to a
deep multi-band dataset that was obtained for this exact purpose.
These statistical estimates are extracted directly from the photometric data
by capitalizing on the inherent relationships between flux, redshift, and
spectral type. These relationships are encapsulated in the empirical
photometric redshift relation which we extend to z ~ 1.2, with an intrinsic
dispersion of dz = 0.06. We also develop realistic estimates for the
photometric redshift error for individual objects, and introduce the
utilization of the galaxy ensemble as a tool for quantifying both a
cosmological parameter and its measured error. We present deep, multi-band,
optical number counts as a demonstration of the integrity of our sample. Using
the photometric redshift and the corresponding redshift error, we can divide
our data into different redshift intervals and spectral types. As an example
application, we present the number redshift distribution as a function of
spectral type.
Comment: 40 pages (LaTeX), 21 figures, requires aasms4.sty; accepted by the
Astrophysical Journal
The detection of globular clusters in galaxies as a data mining problem
We present an application of self-adaptive supervised learning classifiers
derived from the Machine Learning paradigm, to the identification of candidate
Globular Clusters in deep, wide-field, single band HST images. Several methods
provided by the DAME (Data Mining & Exploration) web application, were tested
and compared on the NGC1399 HST data described in Paolillo 2011. The best
results were obtained using a Multi Layer Perceptron with Quasi Newton learning
rule which achieved a classification accuracy of 98.3%, with a completeness of
97.8% and 1.6% of contamination. An extensive set of experiments revealed that
the use of accurate structural parameters (effective radius, central surface
brightness) does improve the final result, but only by 5%. It is also shown
that the method is capable of retrieving extreme sources (for instance, very
extended objects) which are missed by more traditional approaches.
Comment: Accepted 2011 December 12; Received 2011 November 28; in original
form 2011 October 1
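The accuracy, completeness and contamination figures quoted in the abstract above can be illustrated with a minimal Python sketch. It assumes the common definitions (completeness = TP / (TP + FN), contamination = FP / (TP + FP), purity = 1 - contamination); the paper's exact definitions may differ, and the function name and toy labels below are hypothetical, not taken from the paper or from DAME.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, completeness, contamination and purity.

    Assumed convention: label 1 = globular cluster, label 0 = other source.
    These are common textbook definitions, not necessarily the paper's.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # clusters recovered
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # clusters missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false candidates
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correct rejections

    accuracy = (tp + tn) / len(pairs)
    completeness = tp / (tp + fn) if (tp + fn) else 0.0
    contamination = fp / (tp + fp) if (tp + fp) else 0.0
    return {
        "accuracy": accuracy,
        "completeness": completeness,
        "contamination": contamination,
        "purity": 1.0 - contamination,
    }


# Toy labels, purely illustrative (not data from NGC1399).
truth = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(truth, predicted)
```

Under these definitions, high completeness with low contamination means that few true clusters are missed and few non-clusters are accepted as candidates.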
The clustering of galaxies in the SDSS-III Baryon Oscillation Spectroscopic Survey: constraints on the time variation of fundamental constants from the large-scale two-point correlation function
We obtain constraints on the variation of the fundamental constants from the
full shape of the redshift-space correlation function of a sample of luminous
galaxies drawn from Data Release 9 of the Baryon Oscillation
Spectroscopic Survey. We combine this information with data from recent CMB,
BAO and H_0 measurements. We focus on possible variations of the fine structure
constant \alpha and the electron mass m_e in the early universe, and study the
degeneracies between these constants and other cosmological parameters, such as
the dark energy equation of state parameter w_DE, the massive neutrinos
fraction f_\nu, the effective number of relativistic species N_eff, and the
primordial helium abundance Y_He. When only one of the fundamental constants is
varied, our final bounds are \alpha / \alpha_0 = 0.9957_{-0.0042}^{+0.0041} and
m_e /(m_e)_0 = 1.006_{-0.013}^{+0.014}. For their joint variation, our results
are \alpha / \alpha_0 = 0.9901_{-0.0054}^{+0.0055} and m_e /(m_e)_0 = 1.028 +/-
0.019. Although when m_e is allowed to vary our constraints on w_DE are
consistent with a cosmological constant, when \alpha is treated as a free
parameter we find w_DE = -1.20 +/- 0.13; more than 1 \sigma away from its
standard value. When f_\nu and \alpha are allowed to vary simultaneously, we
find f_\nu < 0.043 (95% CL), implying a limit of \sum m_\nu < 0.46 eV (95% CL),
while for m_e variation, we obtain f_\nu < 0.086 (95% CL), which implies \sum
m_\nu < 1.1 eV (95% CL). When N_eff or Y_He are considered as free parameters,
their simultaneous variation with \alpha provides constraints close to their
standard values (when the H_0 prior is not included in the analysis), while
when m_e is allowed to vary, their preferred values are significantly higher.
In all cases, our results are consistent with no variations of \alpha or m_e at
the 1 or 2 \sigma level.
Comment: 18 pages, 16 figures. Submitted to MNRAS
On the effect of climate change on European summer blocking
Atmospheric blocking events are regularly observed mid-latitude weather patterns, which obstruct the usual path of the jet streams. However, there is no well-defined historical dataset of blocking events and the effect of climate change on atmospheric blocking is uncertain. In this thesis, I explore how climate change influences European summer blocking (ESB). I develop a new algorithm to identify regional blocking events (the SOM-BI index), combining supervised and unsupervised learning. This is compared to other methods and a new ground truth dataset. I find the SOM-BI has an improved detection skill over other methods, particularly for climate models.
I apply the SOM-BI to study ESB in the abrupt-4xCO2 experiments from phases 5 and 6 of the Coupled Model Intercomparison Project. These runs maximise the forcing and have not previously been used to study atmospheric blocking. I identify a strong negative correlation between the historical occurrence of ESB and the change in occurrence of ESB. This enables a prediction of the ESB climate response from the historical model bias.
Further, I identify the two main physical mechanisms which affect the ESB climate response: the poleward shift of the North Atlantic jet; and the propagation of Rossby waves across the North Pacific from diabatic heating in the tropical Pacific. I develop an informed physical understanding of these mechanisms, which have not been discussed in the literature as positive influences on the ESB climate response. I then define two metrics as proxies for these physical mechanisms and estimate a positive climate feedback on ESB: 0.22±0.35 days / °C.
My thesis demonstrates the potential for machine learning in studying atmospheric blocking, highlights the importance of tropical forcing in influencing the climate feedback on ESB, and identifies new mechanisms that can be further explored to develop our understanding of how climate change will influence atmospheric blocking.
Open Access
The Local Ly-alpha Forest IV: STIS G140M Spectra and Results on the Distribution and Baryon Content of HI Absorbers
We present HST STIS/G140M spectra of 15 extragalactic targets, which we
combine with GHRS/G160M data to examine the statistical properties of the low-z
Ly-alpha forest. We evaluate the physical properties of these Ly-alpha
absorbers and compare them to their high-z counterparts. We determine that the
warm, photoionized IGM contains 29+/-4% of the total baryon inventory at z = 0.
We derive the distribution in column density, N_HI^(1.65+/-0.07) for 12.5 < log
[N_HI] < 14.5. The slowing
of the number density evolution of high-W Ly-alpha clouds is not as great as
previously measured, and the break to slower evolution may occur later than
previously suggested (z~1.0 rather than 1.6). We find a 7.2sigma excess in the
two-point correlation function (TPCF) of Ly-alpha absorbers for velocity
separations less than 260 km/s, which is exclusively due to the higher column
density clouds. From our previous result that higher column density Ly-alpha
clouds cluster more strongly with galaxies, this TPCF suggests a physical
difference between the higher and lower column density clouds in our sample.
Comment: 71 pages, 6 tables, 26 EPS figures, to appear in ApJ Supplement
SampleHST: Efficient On-the-Fly Selection of Distributed Traces
Since only a small number of the traces generated by distributed tracing help in troubleshooting, the storage requirement can be significantly reduced by biasing the selection towards anomalous traces. To aid in this scenario, we propose SampleHST, a novel approach to sample on-the-fly from a stream of traces in an unsupervised manner. SampleHST adjusts the storage quota of normal and anomalous traces depending on the size of its budget. Initially, it utilizes a forest of Half Space Trees (HSTs) for trace scoring. This is based on the distribution of the mass scores across the trees, which characterizes the probability of observing different traces. The mass distribution from HSTs is subsequently used to cluster the traces online, leveraging a variant of the mean-shift algorithm. This trace-cluster association eventually drives the sampling decision. We have compared the performance of SampleHST with a recently suggested method using data from a cloud data center and demonstrated that SampleHST improves sampling performance by up to 9.5×
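The mass-based scoring step mentioned in the abstract above can be sketched in Python. This is a simplified illustration of the general Half-Space Trees idea, not the SampleHST implementation: it assumes traces are already encoded as feature vectors in [0, 1]^d, uses a fixed workspace split at midpoints, scores only at leaves, and omits the mean-shift clustering and budget logic entirely. All class and function names are hypothetical.

```python
import random


class HSTNode:
    """One node of a half-space tree over an axis-aligned box [lo, hi]."""

    def __init__(self, depth, max_depth, lo, hi, rng):
        self.depth = depth
        self.mass = 0  # number of reference points that passed through
        self.left = self.right = None
        if depth < max_depth:
            # Pick a random dimension and halve the box along it.
            self.dim = rng.randrange(len(lo))
            self.split = (lo[self.dim] + hi[self.dim]) / 2.0
            left_hi = list(hi)
            left_hi[self.dim] = self.split
            right_lo = list(lo)
            right_lo[self.dim] = self.split
            self.left = HSTNode(depth + 1, max_depth, lo, left_hi, rng)
            self.right = HSTNode(depth + 1, max_depth, right_lo, hi, rng)

    def update(self, x):
        """Record one reference point along its root-to-leaf path."""
        self.mass += 1
        if self.left is not None:
            child = self.left if x[self.dim] < self.split else self.right
            child.update(x)

    def score(self, x):
        """Leaf mass scaled by 2**depth; low values suggest an anomaly."""
        node = self
        while node.left is not None:
            node = node.left if x[node.dim] < node.split else node.right
        return node.mass * (2 ** node.depth)


def build_forest(n_trees, max_depth, n_dims, seed=0):
    rng = random.Random(seed)
    return [HSTNode(0, max_depth, [0.0] * n_dims, [1.0] * n_dims, rng)
            for _ in range(n_trees)]


def fit(forest, points):
    for tree in forest:
        for x in points:
            tree.update(x)


def forest_score(forest, x):
    return sum(tree.score(x) for tree in forest)


# Toy data: a tight cluster of "normal" points and one distant point.
forest = build_forest(n_trees=10, max_depth=4, n_dims=2)
normal = [(0.2 + 0.01 * i, 0.2 + 0.01 * j) for i in range(5) for j in range(5)]
fit(forest, normal)
normal_score = forest_score(forest, (0.2, 0.2))
outlier_score = forest_score(forest, (0.9, 0.9))
```

In this sketch a point that falls in a sparsely populated region receives a low score, which is the property a sampler could use to favour anomalous traces when storage is limited.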