A Convex Relaxation for Weakly Supervised Classifiers
This paper introduces a general multi-class approach to weakly supervised
classification. Inferring the labels and learning the parameters of the model
is usually done jointly through a block-coordinate descent algorithm such as
expectation-maximization (EM), which may lead to local minima. To avoid this
problem, we propose a cost function based on a convex relaxation of the
soft-max loss. We then propose an algorithm specifically designed to
efficiently solve the corresponding semidefinite program (SDP). Empirically,
our method compares favorably to standard ones on different datasets for
multiple instance learning and semi-supervised learning as well as on
clustering tasks.
Comment: Appears in Proceedings of the 29th International Conference on
Machine Learning (ICML 2012).
Deep Clustering: A Comprehensive Survey
Cluster analysis plays an indispensable role in machine learning and data
mining. Learning a good data representation is crucial for clustering
algorithms. Recently, deep clustering, which can learn clustering-friendly
representations using deep neural networks, has been broadly applied in a wide
range of clustering tasks. Existing surveys for deep clustering mainly focus on
the single-view fields and the network architectures, ignoring the complex
application scenarios of clustering. To address this issue, we provide a
comprehensive survey of deep clustering from the perspective of data sources.
With different data sources and initial conditions, we systematically
distinguish the clustering methods in terms of methodology, prior knowledge,
and architecture. Concretely, deep clustering methods are introduced according
to four categories, i.e., traditional single-view deep clustering,
semi-supervised deep clustering, deep multi-view clustering, and deep transfer
clustering. Finally, we discuss the open challenges and potential future
opportunities in different fields of deep clustering.
Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.
DealMVC: Dual Contrastive Calibration for Multi-view Clustering
Benefiting from the strong view-consistent information mining capacity,
multi-view contrastive clustering has attracted plenty of attention in recent
years. However, we observe the following drawback, which keeps clustering
performance from improving further. The existing multi-view models mainly
focus on the consistency of the same samples in different views while ignoring
the circumstance of similar but different samples in cross-view scenarios. To
solve this problem, we propose a novel Dual contrastive calibration network for
Multi-View Clustering (DealMVC). Specifically, we first design a fusion
mechanism to obtain a global cross-view feature. Then, a global contrastive
calibration loss is proposed by aligning the view feature similarity graph and
the high-confidence pseudo-label graph. Moreover, to utilize the diversity of
multi-view information, we propose a local contrastive calibration loss to
constrain the consistency of pair-wise view features. The feature structure is
regularized by reliable class information, thus guaranteeing similar samples
have similar features in different views. During the training procedure, the
interacted cross-view feature is jointly optimized at both local and global
levels. In comparison with other state-of-the-art approaches, the comprehensive
experimental results obtained from eight benchmark datasets provide substantial
validation of the effectiveness and superiority of our algorithm. We release
the code of DealMVC at https://github.com/xihongyang1999/DealMVC.