243 research outputs found
Efficient and Effective Deep Multi-view Subspace Clustering
Recent multi-view subspace clustering achieves impressive results utilizing
deep networks, where the self-expressive correlation is typically modeled by a
fully connected (FC) layer. However, these methods still suffer from two
limitations. i) The parameter scale of the FC layer is quadratic in the number
of samples, incurring time and memory costs that severely limit their
feasibility on large-scale datasets. ii) Extracting a unified representation
that simultaneously satisfies minimal sufficiency and discriminability remains
under-explored. To this end, we propose a novel deep framework, termed
Efficient and Effective deep Multi-View Subspace Clustering (EMVSC).
Instead of a parameterized FC layer, we design a Relation-Metric Net that
decouples the network parameter scale from the number of samples for greater computational
efficiency. Most importantly, the proposed method devises a multi-type
auto-encoder to explicitly decouple consistent, complementary, and superfluous
information from every view, which is supervised by a soft clustering
assignment similarity constraint. Following information bottleneck theory and
the maximal coding rate reduction principle, a sufficient yet minimal unified
representation can be obtained, while intra-cluster aggregation and
inter-cluster separability are pursued within it. Extensive experiments show
that EMVSC is competitive with existing methods and achieves state-of-the-art
performance on various types of multi-view datasets.
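The quadratic parameter scale criticized above can be seen directly in the classic self-expressive formulation that the abstract says is typically realized as an FC layer. The sketch below is an illustration of that standard formulation, not the authors' EMVSC code; all names are ours:

```python
import numpy as np

def self_expressive_params(n_samples: int) -> int:
    """Parameter count of an FC self-expressive layer over n samples.

    The layer learns an n x n coefficient matrix C with Z ~= C @ Z,
    so its parameter count grows quadratically in the sample count.
    """
    return n_samples ** 2

def self_expressive_loss(Z: np.ndarray, C: np.ndarray) -> float:
    """Standard reconstruction loss ||Z - C Z||_F^2, with diag(C) = 0
    enforced so samples cannot trivially reconstruct themselves."""
    C = C - np.diag(np.diag(C))
    return float(np.linalg.norm(Z - C @ Z) ** 2)
```

For 10,000 samples the coefficient matrix alone holds 10^8 parameters, which is the scalability bottleneck a sample-count-independent module like the proposed Relation-Metric Net is meant to avoid.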
Multiple Kernel Driven Clustering With Locally Consistent and Selfish Graph in Industrial IoT
In the cognitive computing of the intelligent industrial Internet of Things, clustering is a fundamental machine learning problem for exploiting latent data relationships. To overcome the challenge of kernel choice in nonlinear clustering tasks, multiple kernel clustering (MKC) has attracted intensive attention. However, existing graph-based MKC methods mainly aim to learn a consensus kernel as well as an affinity graph from multiple candidate kernels, which cannot fully exploit the latent graph information. In this article, we propose a novel pure graph-based MKC method. Specifically, a new graph model is proposed to preserve the local manifold structure of the data in kernel space so as to learn multiple candidate graphs. Afterward, the latent consistency and selfishness of these candidate graphs are fully considered. Furthermore, a graph connectivity constraint is introduced to avoid the need for any postprocessing clustering step. Comprehensive experimental results demonstrate the superiority of our method.
This work was supported in part by the Sichuan Science and Technology Program under Grant 2020ZDZX0014 and Grant 2019ZDZX0119, and in part by the Key Lab of Film and TV Media Technology of Zhejiang Province under Grant 2020E10015.
Ren, Z.; Mukherjee, M.; Lloret, J.; Venu, P. (2021). Multiple Kernel Driven Clustering With Locally Consistent and Selfish Graph in Industrial IoT. IEEE Transactions on Industrial Informatics. 17(4):2956-2963. https://doi.org/10.1109/TII.2020.3010357
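As background to the candidate-graph step, one standard way to preserve local manifold structure from a kernel is a k-nearest-neighbor graph under the kernel-induced distance d(i,j)^2 = K[i,i] + K[j,j] - 2 K[i,j]. The sketch below illustrates that common construction and is not the paper's exact graph model:

```python
import numpy as np

def knn_graph_from_kernel(K: np.ndarray, k: int) -> np.ndarray:
    """Build a symmetric 0/1 k-NN graph from a kernel matrix K,
    using the distance induced by the kernel feature map."""
    n = K.shape[0]
    # Squared kernel-induced distance between all pairs of samples.
    d2 = np.diag(K)[:, None] + np.diag(K)[None, :] - 2.0 * K
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(d2[i])
        nbrs = [j for j in order if j != i][:k]  # skip self
        W[i, nbrs] = 1.0
    return np.maximum(W, W.T)  # symmetrize
```

One such graph per candidate kernel yields the multiple candidate graphs whose consistency and "selfishness" the method then reconciles.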
Two-Level Text Classification Using Hybrid Machine Learning Techniques
Nowadays, documents are increasingly being associated with multi-level
category hierarchies rather than a flat category scheme. To access these
documents in real time, we need fast automatic methods to navigate these
hierarchies. Today’s vast data repositories such as the web also contain many
broad domains of data that are quite distinct from each other, e.g. medicine,
education, sports, and politics. Each domain constitutes a subspace of the data
within which the documents are similar to each other but quite distinct from the
documents in another subspace. The data within these domains is frequently
further divided into many subcategories.
Subspace Learning is a technique popular with non-text domains such as
image recognition to increase speed and accuracy. Subspace analysis lends
itself naturally to the idea of hybrid classifiers. Each subspace can be
processed by a classifier best suited to the characteristics of that particular
subspace. Instead of using the complete set of full space feature dimensions,
classifier performances can be boosted by using only a subset of the
dimensions.
This thesis presents a novel hybrid parallel architecture using separate
classifiers trained on separate subspaces to improve two-level text
classification. The classifier to be used on a particular input and the relevant
feature subset to be extracted is determined dynamically by using a novel
method based on the maximum significance value. A novel vector
representation which enhances the distinction between classes within the
subspace is also developed. This novel system, the Hybrid Parallel Classifier,
was compared against the baselines of several single classifiers such as the
Multilayer Perceptron, and was found to be faster and to achieve higher
two-level classification accuracy. The improvement in performance was even
greater when dealing with more complex category hierarchies.
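The dynamic routing idea above can be pictured as scoring an input against each subspace and dispatching it to the classifier for the best-scoring one. The thesis's "maximum significance value" is not specified here, so the sketch below uses cosine similarity to subspace centroids as a hypothetical stand-in:

```python
import numpy as np

def route_to_subspace(x: np.ndarray, centroids: list) -> int:
    """Return the index of the subspace whose centroid is most similar
    to document vector x. Cosine similarity is a stand-in for the
    thesis's significance value, which is not reproduced here."""
    sims = [np.dot(x, c) / (np.linalg.norm(x) * np.linalg.norm(c))
            for c in centroids]
    return int(np.argmax(sims))
```

The selected index would then pick both the subspace-specific classifier and the feature subset it was trained on.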
Geometric deep learning: going beyond Euclidean data
Many scientific fields study data with an underlying structure that is a
non-Euclidean space. Some examples include social networks in computational
social sciences, sensor networks in communications, functional networks in
brain imaging, regulatory networks in genetics, and meshed surfaces in computer
graphics. In many applications, such geometric data are large and complex (in
the case of social networks, on the scale of billions), and are natural targets
for machine learning techniques. In particular, we would like to use deep
neural networks, which have recently proven to be powerful tools for a broad
range of problems from computer vision, natural language processing, and audio
analysis. However, these tools have been most successful on data with an
underlying Euclidean or grid-like structure, and in cases where the invariances
of these structures are built into networks used to model them. Geometric deep
learning is an umbrella term for emerging techniques attempting to generalize
(structured) deep neural models to non-Euclidean domains such as graphs and
manifolds. The purpose of this paper is to overview different examples of
geometric deep learning problems and present available solutions, key
difficulties, applications, and future research directions in this nascent
field.
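One widely cited example of the generalization the survey covers is graph convolution in the style of Kipf and Welling, where features are propagated through the symmetrically normalized adjacency, H' = ReLU(D^{-1/2}(A + I)D^{-1/2} H W). The sketch below is an illustrative single layer, not code from the paper:

```python
import numpy as np

def gcn_layer(A: np.ndarray, H: np.ndarray, W: np.ndarray) -> np.ndarray:
    """One graph-convolution layer on adjacency A, features H, weights W."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                     # degrees of A_hat
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D^{-1/2}
    # Normalized propagation followed by a ReLU nonlinearity.
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)
```

The normalization replaces the translation-invariant convolution of grid data, which has no direct analogue on an irregular graph.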
Inverse Projection Representation and Category Contribution Rate for Robust Tumor Recognition
Sparse representation based classification (SRC) methods have achieved
remarkable results. SRC, however, still suffers from requiring sufficient
training samples, making insufficient use of test samples, and instability of representation. In
this paper, a stable inverse projection representation based classification
(IPRC) is presented to tackle these problems by effectively using test samples.
An IPR is first proposed, and its feasibility and stability are analyzed. A
classification criterion named category contribution rate is constructed to
match the IPR and complete classification. Moreover, a statistical measure is
introduced to quantify the stability of representation-based classification
methods. Based on the IPRC technique, a robust tumor recognition framework is
presented by interpreting microarray gene expression data, where a two-stage
hybrid gene selection method is introduced to select informative genes.
Finally, the functional analysis of candidate pathogenicity-related genes is
given. Extensive experiments on six public tumor microarray gene expression
datasets demonstrate that the proposed technique is competitive with
state-of-the-art methods.
Comment: 14 pages, 19 figures, 10 tables
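For background on the SRC family this work builds on: a test sample is coded over the training dictionary and assigned to the class with the smallest reconstruction residual. The sketch below shows that standard decision rule (with least squares as a stand-in for sparse coding); the paper's inverse projection, which reverses the roles of training and test samples, is not reproduced here:

```python
import numpy as np

def src_classify(y: np.ndarray, D: np.ndarray, labels: list):
    """Classify y over dictionary D (one training sample per column)
    by per-class reconstruction residual."""
    # Least-squares coding; SRC proper would use a sparse solver here.
    a, *_ = np.linalg.lstsq(D, y, rcond=None)
    residuals = {}
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        a_c = np.where(mask, a, 0.0)          # keep only class-c coefficients
        residuals[c] = np.linalg.norm(y - D @ a_c)
    return min(residuals, key=residuals.get)
```

The category contribution rate proposed in the paper plays the role of this residual criterion for the inverse projection setting.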
- …