399 research outputs found
Multi-Source Multi-View Clustering via Discrepancy Penalty
With the advance of technology, entities can be observed in multiple views.
Multiple views containing different types of features can be used for
clustering. Although multi-view clustering has been successfully applied in
many applications, the previous methods usually assume the complete instance
mapping between different views. In many real-world applications, information
can be gathered from multiple sources, while each source can contain multiple
views, which are more cohesive for learning. The views under the same source
are usually fully mapped, but they can be very heterogeneous. Moreover, the
mappings between different sources are usually incomplete and partially
observed, which makes it more difficult to integrate all the views across
different sources. In this paper, we propose MMC (Multi-source Multi-view
Clustering), which is a framework based on collective spectral clustering with
a discrepancy penalty across sources, to tackle these challenges. MMC has
several advantages compared with other existing methods. First, MMC can deal
with incomplete mapping between sources. Second, it considers the disagreements
between sources while treating views in the same source as a cohesive set.
Third, MMC also tries to infer the instance similarities across sources to
enhance the clustering performance. Extensive experiments conducted on
real-world data demonstrate the effectiveness of the proposed approach
Online Unsupervised Multi-view Feature Selection
In the era of big data, it is becoming common to have data with multiple
modalities or coming from multiple sources, known as "multi-view data".
Multi-view data are usually unlabeled and come from high-dimensional spaces
(such as language vocabularies), unsupervised multi-view feature selection is
crucial to many applications. However, it is nontrivial due to the following
challenges. First, there are too many instances or the feature dimensionality
is too large. Thus, the data may not fit in memory. How to select useful
features with limited memory space? Second, how to select features from
streaming data and handles the concept drift? Third, how to leverage the
consistent and complementary information from different views to improve the
feature selection in the situation when the data are too big or come in as
streams? To the best of our knowledge, none of the previous works can solve all
the challenges simultaneously. In this paper, we propose an Online unsupervised
Multi-View Feature Selection, OMVFS, which deals with large-scale/streaming
multi-view data in an online fashion. OMVFS embeds unsupervised feature
selection into a clustering algorithm via NMF with sparse learning. It further
incorporates the graph regularization to preserve the local structure
information and help select discriminative features. Instead of storing all the
historical data, OMVFS processes the multi-view data chunk by chunk and
aggregates all the necessary information into several small matrices. By using
the buffering technique, the proposed OMVFS can reduce the computational and
storage cost while taking advantage of the structure information. Furthermore,
OMVFS can capture the concept drifts in the data streams. Extensive experiments
on four real-world datasets show the effectiveness and efficiency of the
proposed OMVFS method. More importantly, OMVFS is about 100 times faster than
the off-line methods
Learning from Multi-View Multi-Way Data via Structural Factorization Machines
Real-world relations among entities can often be observed and determined by
different perspectives/views. For example, the decision made by a user on
whether to adopt an item relies on multiple aspects such as the contextual
information of the decision, the item's attributes, the user's profile and the
reviews given by other users. Different views may exhibit multi-way
interactions among entities and provide complementary information. In this
paper, we introduce a multi-tensor-based approach that can preserve the
underlying structure of multi-view data in a generic predictive model.
Specifically, we propose structural factorization machines (SFMs) that learn
the common latent spaces shared by multi-view tensors and automatically adjust
the importance of each view in the predictive model. Furthermore, the
complexity of SFMs is linear in the number of parameters, which make SFMs
suitable to large-scale problems. Extensive experiments on real-world datasets
demonstrate that the proposed SFMs outperform several state-of-the-art methods
in terms of prediction accuracy and computational cost.Comment: 10 page
On Spectral Graph Embedding: A Non-Backtracking Perspective and Graph Approximation
Graph embedding has been proven to be efficient and effective in facilitating
graph analysis. In this paper, we present a novel spectral framework called
NOn-Backtracking Embedding (NOBE), which offers a new perspective that
organizes graph data at a deep level by tracking the flow traversing on the
edges with backtracking prohibited. Further, by analyzing the non-backtracking
process, a technique called graph approximation is devised, which provides a
channel to transform the spectral decomposition on an edge-to-edge matrix to
that on a node-to-node matrix. Theoretical guarantees are provided by bounding
the difference between the corresponding eigenvalues of the original graph and
its graph approximation. Extensive experiments conducted on various real-world
networks demonstrate the efficacy of our methods on both macroscopic and
microscopic levels, including clustering and structural hole spanner detection.Comment: SDM 2018 (Full version including all proofs
Multi-view Graph Embedding with Hub Detection for Brain Network Analysis
Multi-view graph embedding has become a widely studied problem in the area of
graph learning. Most of the existing works on multi-view graph embedding aim to
find a shared common node embedding across all the views of the graph by
combining the different views in a specific way. Hub detection, as another
essential topic in graph mining has also drawn extensive attentions in recent
years, especially in the context of brain network analysis. Both the graph
embedding and hub detection relate to the node clustering structure of graphs.
The multi-view graph embedding usually implies the node clustering structure of
the graph based on the multiple views, while the hubs are the boundary-spanning
nodes across different node clusters in the graph and thus may potentially
influence the clustering structure of the graph. However, none of the existing
works in multi-view graph embedding considered the hubs when learning the
multi-view embeddings. In this paper, we propose to incorporate the hub
detection task into the multi-view graph embedding framework so that the two
tasks could benefit each other. Specifically, we propose an auto-weighted
framework of Multi-view Graph Embedding with Hub Detection (MVGE-HD) for brain
network analysis. The MVGE-HD framework learns a unified graph embedding across
all the views while reducing the potential influence of the hubs on blurring
the boundaries between node clusters in the graph, thus leading to a clear and
discriminative node clustering structure for the graph. We apply MVGE-HD on two
real multi-view brain network datasets (i.e., HIV and Bipolar). The
experimental results demonstrate the superior performance of the proposed
framework in brain network analysis for clinical investigation and application
- …