Constrained Instance Clustering in Multi-Instance Multi-Label Learning
In multi-instance multi-label (MIML) learning, datasets are given in the form of bags, each of which contains multiple instances and is associated with multiple labels. This paper considers a novel instance clustering problem in MIML learning, where the bag labels are used as background knowledge to help group instances into clusters. The goal is to recover the class labels or to find the subclasses within each class. Prior work on constraint-based clustering focuses on pairwise constraints and cannot fully utilize the bag-level label information. We propose to encode the bag-label knowledge into soft bag constraints that can be easily incorporated into any optimization-based clustering algorithm. As a specific example, we demonstrate how the bag constraints can be incorporated into a popular spectral clustering algorithm. Empirical results on both synthetic and real-world datasets show that the proposed method achieves promising performance compared to state-of-the-art methods that use pairwise constraints.
Keywords: MIML, Bag Constraints, Instance Clustering, Spectral Clustering, Constrained Clusterin
Angular Correlations of the X-Ray Background and Clustering of Extragalactic X-Ray Sources
The information content of the autocorrelation function (ACF) of intensity
fluctuations of the X-ray background (XRB) is analyzed. The tight upper limits
set by ROSAT deep survey data on the ACF at arcmin scales imply strong
constraints on clustering properties of X-ray sources at cosmological distances
and on their contribution to the soft XRB. If quasars have a clustering radius
r_0=12-20 Mpc (H_0=50) and their two-point correlation function is constant
in comoving coordinates, as indicated by optical data, they cannot make up more
than 40-50% of the soft XRB (the maximum contribution may reach 80% in the case of
stable clustering, epsilon=0). Active Star-forming (ASF) galaxies clustered
like normal galaxies, with r_0=10-12 Mpc, can yield up to 20% or up to 40% of
the soft XRB for epsilon=-1.2 or epsilon=0, respectively. The ACF on degree
scales essentially reflects the clustering properties of local sources and is
proportional to their volume emissivity. The upper limits on scales of a few
degrees imply that hard X-ray selected AGNs have r_0<25 Mpc if epsilon=0 or
r_0<20 Mpc if epsilon=-1.2. No significant constraints are set on clustering of
ASF galaxies, due to their low local volume emissivity. The possible signal on
scales >6 deg, if real, may be due to AGNs with r_0=20 Mpc; the contribution
from clusters of galaxies with r_0~50 Mpc is a factor of 2 lower.
Comment: ApJ, in press (20 July 1993); 28 pages, TeX, ASTRPD-93-2-0
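For orientation, the epsilon quoted in the abstract above is conventionally the evolution index of a power-law two-point correlation function. The form below is a sketch under that assumption; the slope gamma is not given in the abstract, and gamma = 1.8 is assumed here only because it makes the "constant in comoving coordinates" case equal to the quoted epsilon = -1.2.

```latex
% Assumed conventional parametrization; r is the proper separation.
\xi(r, z) = \left(\frac{r}{r_0}\right)^{-\gamma} (1+z)^{-(3+\epsilon)}
% \epsilon = 0            : stable clustering (fixed in proper coordinates)
% \epsilon = \gamma - 3   : clustering constant in comoving coordinates
%                           (\gamma = 1.8 gives \epsilon = -1.2)
```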
Fronthaul-Constrained Cloud Radio Access Networks: Insights and Challenges
As a promising paradigm for fifth generation (5G) wireless communication
systems, cloud radio access networks (C-RANs) have been shown to reduce both
capital and operating expenditures, as well as to provide high spectral
efficiency (SE) and energy efficiency (EE). The fronthaul in such networks,
defined as the transmission link between a baseband unit (BBU) and a remote
radio head (RRH), requires high capacity, but is often constrained. This
article comprehensively surveys recent advances in fronthaul-constrained
C-RANs, including system architectures and key techniques. In particular, key
techniques for alleviating the impact of constrained fronthaul on SE/EE and
quality of service for users, including compression and quantization,
large-scale coordinated processing and clustering, and resource allocation
optimization, are discussed. Open issues in terms of software-defined
networking, network function virtualization, and partial centralization are
also identified.
Comment: 5 Figures, accepted by IEEE Wireless Communications. arXiv admin
note: text overlap with arXiv:1407.3855 by other author
Soft clustering analysis of galaxy morphologies: A worked example with SDSS
Context: The huge and still rapidly growing number of galaxies in modern sky
surveys creates the need for an automated and objective classification method.
Unsupervised learning algorithms are of particular interest, since they
discover classes automatically. Aims: We briefly discuss the pitfalls of
oversimplified classification methods and outline an alternative approach
called "clustering analysis". Methods: We categorise different classification
methods according to their capabilities. Based on this categorisation, we
present a probabilistic classification algorithm that automatically detects the
optimal classes preferred by the data. We explore the reliability of this
algorithm in systematic tests. Using a small sample of bright galaxies from the
SDSS, we demonstrate the performance of this algorithm in practice. We are able
to disentangle the problems of classification and parametrisation of galaxy
morphologies in this case. Results: We give physical arguments that a
probabilistic classification scheme is necessary. The algorithm we present
produces reasonable morphological classes and object-to-class assignments
without any prior assumptions. Conclusions: There are sophisticated automated
classification algorithms that meet all necessary requirements, but a lot of
work is still needed on the interpretation of the results.
Comment: 18 pages, 19 figures, 2 tables, submitted to A
Modularity functions maximization with nonnegative relaxation facilitates community detection in networks
We show here that the problem of maximizing a family of quantitative
functions, encompassing both the modularity (Q-measure) and modularity density
(D-measure), for community detection can be uniformly understood as a
combinatorial optimization involving the trace of a matrix called the modularity
Laplacian. Instead of using traditional spectral relaxation, we impose an
additional nonnegativity constraint on this graph clustering problem and design
efficient algorithms to optimize the new objective. With the explicit
nonnegative constraint, our solutions are very close to the ideal community
indicator matrix and can directly assign nodes to communities. The
near-orthogonal columns of the solution can be reformulated as the posterior
probability of the corresponding node belonging to each community. Therefore, the
proposed method can be exploited to identify the fuzzy or overlapping
communities and thus facilitates the understanding of the intrinsic structure
of networks. Experimental results show that our new algorithm consistently,
sometimes significantly, outperforms the traditional spectral relaxation
approaches
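
The trace formulation mentioned in this abstract can be illustrated with the standard Newman modularity, Q = trace(H^T B H) / (2m), where B = A - k k^T / (2m) is the usual modularity matrix and H is a hard community indicator matrix. This is a minimal sketch under that assumption: the abstract does not define its "modularity Laplacian", and the function name `modularity` and the toy graph below are hypothetical, not taken from the paper.

```python
import numpy as np

def modularity(adjacency, labels):
    """Newman modularity Q written as a trace over an indicator matrix.

    Assumes an undirected graph given as a symmetric adjacency matrix.
    This is the standard Q-measure, not the paper's relaxed objective.
    """
    A = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels)
    degrees = A.sum(axis=1)
    two_m = degrees.sum()  # twice the number of edges
    # Standard modularity matrix: observed minus expected edge weight.
    B = A - np.outer(degrees, degrees) / two_m
    # Hard indicator matrix H: H[i, c] = 1 if node i is in community c.
    communities = np.unique(labels)
    H = (labels[:, None] == communities[None, :]).astype(float)
    return np.trace(H.T @ B @ H) / two_m

# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

q_split = modularity(A, [0, 0, 0, 1, 1, 1])   # 5/14, about 0.357
q_single = modularity(A, [0, 0, 0, 0, 0, 0])  # one community gives 0
print(f"two communities: {q_split:.4f}, one community: {q_single:.4f}")
```

A nonnegative relaxation of the kind the abstract describes would replace the 0/1 matrix H with a nonnegative real-valued matrix and optimize the same trace; that step is not shown here.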