199,543 research outputs found
A Deep Embedding Model for Co-occurrence Learning
Co-occurrence data is a common and important information source in many
areas, such as word co-occurrence in sentences, friend co-occurrence
in social networks, and product co-occurrence in commercial transaction data,
which contains rich correlation and clustering information about the
items. In this paper, we study co-occurrence data using a general energy-based
probabilistic model, and we analyze three different categories of energy-based
model, namely, the , and models, which are able to capture
different levels of dependency in the co-occurrence data. We also discuss how
several typical existing models are related to these three types of energy
models, including the Fully Visible Boltzmann Machine (FVBM), Matrix
Factorization, Log-BiLinear (LBL) models, and the Restricted
Boltzmann Machine (RBM) model. Then, we propose a Deep Embedding Model
(DEM) (an model) from the energy model in a \emph{principled} manner.
Furthermore, motivated by the observation that the partition function in the
energy model is intractable and the fact that the major objective of modeling
the co-occurrence data is to predict using the conditional probability, we
apply the \emph{maximum pseudo-likelihood} method to learn DEM. In consequence,
the developed model and its learning method naturally avoid the above
difficulties and can be easily used to compute the conditional probability in
prediction. Interestingly, our method is equivalent to learning a special
structured deep neural network using back-propagation and a special sampling
strategy, which makes it scalable on large-scale datasets. Finally, in the
experiments, we show that the DEM can achieve comparable or better results than
state-of-the-art methods on datasets across several application domains.
Adaptive Resonance Theory (ART) for social media analytics
This chapter presents the ART-based clustering algorithms for social media analytics in detail. Sections 3.1 and 3.2 introduce Fuzzy ART and its clustering mechanisms, respectively, which together provide a deep understanding of the base model that is used and extended for handling the social media clustering challenges. Important concepts such as the vigilance region (VR) and its properties are explained and proven. Subsequently, Sects. 3.3-3.7 illustrate five ART variants, each of which addresses the challenges in one social media analytical scenario, including automated parameter adaptation, user preference incorporation, short text clustering, heterogeneous data co-clustering and online streaming data indexing. The content of this chapter is several prior studies, including Probabilistic ART [15
Clustering of Very Red Galaxies in the Las Campanas IR Survey
We report results from the first 1000 square arc-minutes of the Las Campanas
IR survey. We have imaged 1 square degree of high latitude sky in six distinct
fields to a 5-sigma H-band depth of 20.5 (Vega). Optical imaging in the
V, R, I, and z' bands allows us to select color subsets and
photometric-redshift-defined shells. We show that the angular clustering of
faint red galaxies (18 3) is an order of magnitude stronger
than that of the complete H-selected field sample. We employ three approaches
to estimate in order to invert w(theta) to derive r_0. We find that our
n(z) is well described by a Gaussian with = 1.2, sigma(z) = 0.15. From this
we derive a value for r_0 of 7 (+2,-1) co-moving H^{-1} Mpc at = 1.2. This
is a factor of ~ 2 larger than the clustering length for Lyman break galaxies
and is similar to the expectation for early-type galaxies at this epoch.
Comment: 5 pages, 2 figures, 1 table. To appear in proceedings of the
ESO/ECF/STScI workshop "Deep Fields" held in Garching, Germany, 9-12 October
200
Multi-Object Classification and Unsupervised Scene Understanding Using Deep Learning Features and Latent Tree Probabilistic Models
Deep learning has shown state-of-the-art classification performance on datasets
such as ImageNet, which contain a single object in each image. However,
multi-object classification is far more challenging. We present a unified
framework which leverages the strengths of multiple machine learning methods,
viz. deep learning, probabilistic models, and kernel methods, to obtain
state-of-the-art performance on Microsoft COCO, which consists of non-iconic images. We
incorporate contextual information in natural images through a conditional
latent tree probabilistic model (CLTM), where the object co-occurrences are
conditioned on the extracted fc7 features from a pre-trained ImageNet CNN as
input. We learn the CLTM tree structure using conditional pairwise
probabilities for object co-occurrences, estimated through kernel methods, and
we learn its node and edge potentials by training a new 3-layer neural network,
which takes fc7 features as input. Object classification is carried out via
inference on the learnt conditional tree model, and we obtain significant gains
in precision-recall and F-measures on MS-COCO, especially for difficult object
categories. Moreover, the latent variables in the CLTM capture scene
information: the images with top activations for a latent node have common
themes such as being a grassland or a food scene, and so on. In addition, we
show that a simple k-means clustering of the inferred latent nodes alone
significantly improves scene classification performance on the MIT-Indoor
dataset, without the need for any retraining, and without using scene labels
during training. Thus, we present a unified framework for multi-object
classification and unsupervised scene understanding.
Deep observations of CO line emission from star-forming galaxies in a cluster candidate at z=1.5
We report results from a deep Jansky Very Large Array (JVLA) search for CO
1-0 line emission from galaxies in a candidate galaxy cluster at z~1.55 in the
COSMOS field. We target 4 galaxies with optical spectroscopic redshifts in the
range z=1.47-1.59. Two of these 4 galaxies, ID51613 and ID51813, are nominally
detected in CO line emission at the 3-4 sigma level. We find CO luminosities of
2.4x10^10 K km/s pc^2 and 1.3x10^10 K km/s pc^2, respectively. Taking advantage
of the clustering and 2-GHz bandwidth of the JVLA, we perform a search for
emission lines in the proximity of optical sources within the field of view of
our observations. We limit our search to galaxies with K<23.5 (AB) and
z_phot=1.2-1.8. We find 2 bright optical galaxies to be associated with
significant emission line peaks (>4 sigma) in the data cube, which we identify
with the CO line emission. To test the reliability of the line peaks found, we
performed a parallel search for line peaks using a Bayesian inference method.
Monte Carlo simulations show that such associations are statistically
significant, with probabilities of chance association of 3.5% and 10.7% for
ID51207 and ID51380, respectively. Modeling of their optical/IR SEDs indicates
that the CO-detected galaxies and candidates have stellar masses and SFRs in
the range (0.3-1.1)x10^11 M_sun and 60-160 M_sun/yr, with SFEs comparable to
those found in other star-forming galaxies at similar redshifts. By comparing
the space density of CO emitters derived from our observations with the space
density derived from previous CO detections at z~1.5, and with semi-analytic
predictions for the CO luminosity function, we suggest that the latter tend to
underestimate the number of CO galaxies detected at high redshift. Finally, we
argue for the benefits of future blind CO searches in clustered fields with
upcoming submm/radio facilities.
Comment: Accepted for publication in MNRAS. Abstract has been slightly
shortened compared to original pdf versio
The VIRMOS deep imaging survey II: CFH12K BVRI optical data for the 0226-04 deep field
(abridged) In this paper we describe in detail the reduction, preparation and
reliability of the photometric catalogues which comprise the 1.2 deg^2
CFH12K-VIRMOS deep field. The survey reaches limiting magnitudes of BAB~26.5,
VAB~26.2, RAB~25.9, and IAB~25.0 and contains 90,729 extended sources in the
magnitude range 18.0<IAB<24.0. We demonstrate that our catalogues are free from
systematic biases and are complete and reliable down to these limits. We estimate
that the upper limit on bin-to-bin systematic photometric errors for the
I-limited sample is ~10% in this magnitude range. We estimate that 68% of the
catalogue sources have absolute per co-ordinate astrometric uncertainties less
than ~0.38" and ~0.32" (alpha,delta). Our internal (filter-to-filter) per
co-ordinate astrometric uncertainties are 0.08" and 0.08" (alpha,delta). We
quantify the completeness of our survey in the joint space defined by object
total magnitude and peak surface brightness. Finally, we present numerous
comparisons between our catalogues and published literature data: galaxy and
star counts, galaxy and stellar colours, and the clustering of both point-like
and extended populations. In all cases our measurements are in excellent
agreement with literature data to IAB<24.0. This combination of depth and areal
coverage makes this multi-colour catalogue a solid foundation to select
galaxies for follow-up spectroscopy with VIMOS on the ESO-VLT and a unique
database to study the formation and evolution of the faint galaxy population to
z~1 and beyond.
Comment: 18 pages, 23 figures, accepted for publication in A&