A probabilistic approach to emission-line galaxy classification
We invoke a Gaussian mixture model (GMM) to jointly analyse two traditional
emission-line classification schemes of galaxy ionization sources: the
Baldwin-Phillips-Terlevich (BPT) and EW(Hα) vs. [NII]/Hα
(WHAN) diagrams, using spectroscopic data from the Sloan Digital Sky Survey
Data Release 7 and SEAGal/STARLIGHT datasets. We apply a GMM to empirically
define classes of galaxies in a three-dimensional space spanned by the
[OIII]/Hβ, [NII]/Hα, and EW(Hα) optical
parameters. The best-fit GMM based on several statistical criteria suggests a
solution around four Gaussian components (GCs), which can explain up
to 97 per cent of the data variance. Using elements of information theory, we
compare each GC to its respective astronomical counterpart. GC1 and GC4 are
associated with star-forming galaxies, suggesting the need to define a new
starburst subgroup. GC2 is associated with BPT's Active Galactic Nuclei (AGN)
class and WHAN's weak AGN class. GC3 is associated with BPT's composite class
and WHAN's strong AGN class. Conversely, there is no statistical evidence --
based on four GCs -- for the existence of a Seyfert/LINER dichotomy in our
sample. Notwithstanding, the inclusion of an additional GC5 unravels it. The
GC5 appears associated with the LINER and Passive galaxies on the BPT and WHAN
diagrams respectively. Subtleties aside, we demonstrate the potential of our
methodology to recover/unravel different objects inside the wilderness of
astronomical datasets, while retaining the ability to convey physically
interpretable results. The probabilistic classifications from the GMM analysis
are publicly available within the COINtoolbox
(https://cointoolbox.github.io/GMM_Catalogue/).
Comment: Accepted for publication in MNRA
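The probabilistic classification described in this abstract can be illustrated with a minimal sketch. It computes, for an already fitted GMM, the posterior probability that each galaxy belongs to each Gaussian component. The two-component parameters, the use of logarithmic axes, and the function names are hypothetical and chosen only for illustration; they are not the fitted values or the code from the paper.

```python
import numpy as np

def log_gaussian(X, mean, cov):
    """Log density of a multivariate normal evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, diff.T)            # whitened residuals, shape (d, n)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + np.sum(z**2, axis=0))

def gmm_responsibilities(X, weights, means, covs):
    """Posterior probability of each component for each row of X."""
    log_p = np.stack(
        [np.log(w) + log_gaussian(X, m, c) for w, m, c in zip(weights, means, covs)],
        axis=1,
    )
    log_p -= log_p.max(axis=1, keepdims=True)    # stabilise before exponentiating
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

# Hypothetical two-component model in a three-dimensional line-ratio space
# (assumed axes: log [OIII]/Hb, log [NII]/Ha, log EW(Ha)); values are made up.
weights = np.array([0.6, 0.4])
means = np.array([[-0.3, -0.5, 1.5], [0.4, 0.0, 0.5]])
covs = np.array([0.05 * np.eye(3), 0.05 * np.eye(3)])

X = np.array([[-0.3, -0.5, 1.5], [0.4, 0.0, 0.5]])  # two made-up galaxies
resp = gmm_responsibilities(X, weights, means, covs)
print(resp.round(3))
```

Each row of `resp` sums to one, so a galaxy near a class boundary receives a split membership rather than a hard label.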
A Uniformly Selected Sample of Low-mass Black Holes in Seyfert 1 Galaxies
We have conducted a systematic search for low-mass black holes (BHs) in active
galactic nuclei (AGNs) with broad Halpha emission lines, aiming at building a
homogeneous sample that is more complete than previous ones for fainter, less
highly accreting sources. For this purpose, we developed a set of elaborate,
automated selection procedures and applied them uniformly to the Fourth Data
Release of the Sloan Digital Sky Survey. Special attention is given to
AGN--galaxy spectral decomposition and emission-line deblending. We define a
sample of 309 type 1 AGNs with BH masses in the range -- \msun (with a median of solar mass), using the
virial mass estimator based on the broad Halpha line. About half of our sample
of low-mass BHs differs from that of Greene & Ho, with 61 of them discovered
here for the first time. Our new sample picks up more AGNs with low accretion
rates: the Eddington ratios of the present sample range from to ~1,
with 30% below 0.1. This suggests that a significant fraction of low-mass BHs
in the local Universe are accreting at low rates. The host galaxies of the
low-mass BHs have luminosities similar to those of field galaxies,
optical colors of Sbc spirals, and stellar spectral features consistent with a
continuous star formation history with a mean stellar age of less than 1 Gyr.
Comment: Accepted for publication in Ap
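For readers unfamiliar with the Eddington ratio quoted in this abstract, the sketch below shows the arithmetic. It assumes the standard Eddington luminosity for ionized hydrogen, roughly 1.26e38 erg/s per solar mass. The abstract does not say how the bolometric luminosities were estimated, so the input values and the function name are hypothetical.

```python
L_EDD_PER_MSUN = 1.26e38  # Eddington luminosity per solar mass, in erg/s

def eddington_ratio(l_bol_erg_s, m_bh_msun):
    """Ratio of bolometric luminosity to the Eddington luminosity."""
    return l_bol_erg_s / (L_EDD_PER_MSUN * m_bh_msun)

# Hypothetical source: a 1e6 solar-mass BH radiating 1.26e43 erg/s
ratio = eddington_ratio(1.26e43, 1.0e6)
print(f"Eddington ratio: {ratio:.2f}")
```

A source like this hypothetical one would sit at the 0.1 threshold mentioned in the abstract.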
Predicting Daily Probability Distributions Of S&P500 Returns
Most approaches in forecasting merely try to predict the next value of the time series.
In contrast, this paper presents a framework to predict the full probability distribution. It
is expressed as a mixture model: the dynamics of the individual states is modeled with so-called
"experts" (potentially nonlinear neural networks), and the dynamics between the states is modeled
using a hidden Markov approach. The full density predictions are obtained by a weighted superposition
of the individual densities of each expert. This model class is called "hidden Markov experts".
Results are presented for daily S&P500 data. While the predictive accuracy of the mean does
not improve over simpler models, evaluating the prediction of the full density shows a clear out-of-sample
improvement both over a simple GARCH(1,1) model (which assumes Gaussian distributed
returns) and over a "gated experts" model (which expresses the weighting for each state non-recursively
as a function of external inputs). Several interpretations are given: the blending of
supervised and unsupervised learning, the discovery of hidden states, the combination of forecasts,
the specialization of experts, the removal of outliers, and the persistence of volatility.
Information Systems Working Papers Serie
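The weighted superposition of expert densities can be sketched as follows. This is a simplified illustration, not the model from the paper: each "expert" is reduced to a fixed Gaussian (the paper allows nonlinear neural networks), and the transition matrix, means, and standard deviations are made-up values.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Univariate normal density."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def predict_density(x_grid, state_probs, transition, means, stds):
    """One-step-ahead density as a weighted sum of the expert densities."""
    next_probs = state_probs @ transition        # propagate hidden-state beliefs
    expert_pdfs = np.stack([gaussian_pdf(x_grid, m, s) for m, s in zip(means, stds)])
    return next_probs @ expert_pdfs, next_probs

def update_state_probs(next_probs, observed, means, stds):
    """Bayes update of the state beliefs once the return has been observed."""
    lik = gaussian_pdf(observed, np.asarray(means), np.asarray(stds))
    post = next_probs * lik
    return post / post.sum()

# Hypothetical two-state setup: state 0 is calm, state 1 is volatile.
transition = np.array([[0.95, 0.05],             # transition[i, j] = P(next=j | current=i)
                       [0.10, 0.90]])
means, stds = [0.0005, -0.0005], [0.005, 0.02]
state_probs = np.array([0.8, 0.2])

x_grid = np.linspace(-0.1, 0.1, 2001)
density, next_probs = predict_density(x_grid, state_probs, transition, means, stds)
posterior = update_state_probs(next_probs, observed=-0.05, means=means, stds=stds)
```

After a large observed move, the updated state probabilities shift strongly toward the volatile state, which is how this kind of model captures the persistence of volatility.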
Learning and Using Taxonomies For Fast Visual Categorization
The computational complexity of current visual categorization algorithms scales linearly at best with the number of categories. The goal of simultaneously classifying N_(cat) = 10^4 - 10^5 visual categories requires sub-linear classification costs. We explore algorithms for automatically building classification trees which have, in principle, log N_(cat) complexity. We find that a greedy algorithm that recursively splits the set of categories into the two minimally confused subsets achieves 5- to 20-fold speedups at a small cost in classification performance. Our approach is independent of the specific classification algorithm used. A welcome by-product of our algorithm is a very reasonable taxonomy of the Caltech-256 dataset.
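The recursive splitting idea can be sketched on a toy confusion matrix. The abstract does not specify how each split is found, so this sketch substitutes an exhaustive search over bipartitions, which is only feasible for a handful of categories. It also minimises raw cross-subset confusion with no balancing term, which is an assumption. The function names and the matrix are hypothetical.

```python
from itertools import combinations
import numpy as np

def best_split(cats, confusion):
    """Exhaustively find the bipartition with the least cross-subset confusion."""
    best = None
    first, rest = cats[0], cats[1:]
    # The left side always contains `first`; k < len(rest) keeps the right side non-empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            left = [first, *extra]
            right = [c for c in rest if c not in extra]
            cross = sum(confusion[i, j] + confusion[j, i] for i in left for j in right)
            if best is None or cross < best[0]:
                best = (cross, left, right)
    return best[1], best[2]

def build_tree(cats, confusion):
    """Recursively split categories until each leaf holds a single category."""
    if len(cats) == 1:
        return cats[0]
    left, right = best_split(cats, confusion)
    return (build_tree(left, confusion), build_tree(right, confusion))

# Toy confusion counts: categories 0/1 are often confused, as are 2/3.
confusion = np.array([[0, 5, 1, 0],
                      [5, 0, 0, 1],
                      [1, 0, 0, 6],
                      [0, 1, 6, 0]])
tree = build_tree([0, 1, 2, 3], confusion)
print(tree)
```

In the resulting nested tuple, categories that are frequently confused stay together until the deepest splits, so a classifier walking the tree defers the hardest decisions to the leaves.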
Robust Event Detection and Retrieval in Surveillance Video
We developed a robust event detection and retrieval system for surveillance video. The proposed system offers vision-based capabilities for the detection and tracking of various objects of interest, and can recognize events such as: 1. a person with certain attributes being present in the scene; 2. two people meeting; 3. people carrying bags; 4. bags being dropped; 5. bags being stolen; 6. bags being exchanged; 7. two people handshaking; 8. one person's pointing gesture. We use an improved adaptive Gaussian mixture model for background modeling and foreground detection; a connected component labeling algorithm is then employed to label the foreground pixels. A Kalman filter approach is used to build models for the entities of interest (people and bags), which is combined with color histograms for tracking. We use shape symmetry analysis and color histograms to detect people carrying bags. Our experiments demonstrate the ability to search for instances of events according to specific attributes in large video sequences.
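The connected component labeling step mentioned in this abstract can be illustrated with a minimal sketch. It assumes 4-connectivity and a breadth-first flood fill; the abstract does not say which labeling algorithm or connectivity the system uses, and the mask below is a made-up example.

```python
from collections import deque
import numpy as np

def label_components(mask):
    """Label 4-connected foreground regions; returns (label image, region count)."""
    rows, cols = mask.shape
    labels = np.zeros(mask.shape, dtype=int)     # 0 means background or unvisited
    current = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r, c] and labels[r, c] == 0:
                current += 1                     # start a new region
                labels[r, c] = current
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and mask[ny, nx] and labels[ny, nx] == 0:
                            labels[ny, nx] = current
                            queue.append((ny, nx))
    return labels, current

# Made-up foreground mask with two separate blobs
mask = np.array([[1, 1, 0, 0],
                 [0, 1, 0, 1],
                 [0, 0, 0, 1]], dtype=bool)
labels, count = label_components(mask)
print(count)
print(labels)
```

Each labeled region would then be a candidate object (a person or a bag) to hand to the tracking stage.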
Second-order Temporal Pooling for Action Recognition
Deep learning models for video-based action recognition usually generate
features for short clips (consisting of a few frames); such clip-level features
are aggregated to video-level representations by computing statistics on these
features. Typically, zeroth-order (max) or first-order (average) statistics are
used. In this paper, we explore the benefits of using second-order statistics.
Specifically, we propose a novel end-to-end learnable feature aggregation
scheme, dubbed temporal correlation pooling, that generates an action descriptor
for a video sequence by capturing the similarities between the temporal
evolution of clip-level CNN features computed across the video. Such a
descriptor, while being computationally cheap, also naturally encodes the
co-activations of multiple CNN features, thereby providing a richer
characterization of actions than their first-order counterparts. We also
propose higher-order extensions of this scheme by computing correlations after
embedding the CNN features in a reproducing kernel Hilbert space. We provide
experiments on benchmark datasets such as HMDB-51 and UCF-101, fine-grained
datasets such as MPII Cooking activities and JHMDB, as well as the recent
Kinetics-600. Our results demonstrate the advantages of higher-order pooling
schemes, which, when combined with hand-crafted features (as is standard practice),
achieve state-of-the-art accuracy.
Comment: Accepted in the International Journal of Computer Vision (IJCV
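The contrast between first-order and second-order temporal pooling can be sketched with NumPy. This is an illustration under assumptions, not the paper's layer: it uses a plain normalised correlation between the temporal trajectories of feature dimensions, it is not trained end to end, and the feature matrix is random data standing in for clip-level CNN features.

```python
import numpy as np

def average_pooling(features):
    """First-order pooling: mean of the clip-level features over time."""
    return features.mean(axis=0)

def temporal_correlation_pooling(features, eps=1e-8):
    """Second-order pooling: correlation between feature trajectories.

    features: (T, d) array, one row per clip. Returns a (d, d) descriptor.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    norms = np.linalg.norm(centered, axis=0, keepdims=True)
    unit = centered / (norms + eps)              # unit-norm trajectory per feature
    return unit.T @ unit

rng = np.random.default_rng(0)
features = rng.normal(size=(16, 4))              # stand-in for 16 clips, 4 features
first_order = average_pooling(features)          # shape (4,)
second_order = temporal_correlation_pooling(features)  # shape (4, 4)
```

The first-order descriptor has one value per feature, while the second-order descriptor has one value per pair of features, which is where the co-activation information comes from.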