MiniMax Entropy Network: Learning Category-Invariant Features for Domain Adaptation
How to effectively learn from unlabeled data from the target domain is
crucial for domain adaptation, as it helps reduce the large performance gap due
to domain shift or distribution change. In this paper, we propose an
easy-to-implement method dubbed MiniMax Entropy Networks (MMEN) based on
adversarial learning. Unlike most existing approaches which employ a generator
to deal with domain difference, MMEN focuses on learning the categorical
information from unlabeled target samples with the help of labeled source
samples. Specifically, we introduce an unfair multi-class classifier, termed the
categorical discriminator, which classifies source samples accurately but is
confused about the categories of target samples. The generator learns a common
subspace that aligns the unlabeled target samples based on their pseudo-labels.
We also provide a theoretical explanation for MMEN, showing that learning this
feature alignment reduces domain mismatch at the category level. Experimental
results on various benchmark datasets demonstrate the effectiveness of our
method over existing state-of-the-art baselines.
Comment: 8 pages, 6 figures
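The adversarial objective described above can be sketched numerically. The following is a minimal NumPy illustration, not the authors' implementation: the discriminator minimizes source cross-entropy while maximizing the entropy of its predictions on target samples (staying "confused"), and the generator minimizes that same target entropy to produce confident, alignable pseudo-labels. Function names and the exact loss combination are illustrative assumptions.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def categorical_entropy(logits):
    """Mean Shannon entropy of the class posteriors."""
    p = softmax(logits)
    return float(-(p * np.log(p + 1e-12)).sum(axis=1).mean())

def mmen_losses(src_logits, src_labels, tgt_logits):
    """Illustrative minimax-entropy objectives for the two players."""
    # Cross-entropy on labeled source samples (both players minimize this).
    p_src = softmax(src_logits)
    ce = float(-np.log(p_src[np.arange(len(src_labels)), src_labels] + 1e-12).mean())
    h_tgt = categorical_entropy(tgt_logits)
    d_loss = ce - h_tgt  # discriminator: accurate on source, confused on target
    g_loss = ce + h_tgt  # generator: low target entropy -> confident pseudo-labels
    return d_loss, g_loss
```

Uniform target predictions maximize the entropy term (at log K for K classes), which is exactly the state the discriminator pushes toward and the generator pushes away from.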
Contrast and Clustering: Learning Neighborhood Pair Representation for Source-free Domain Adaptation
Unsupervised domain adaptation uses source data from different distributions
to solve the problem of classifying data from unlabeled target domains.
However, conventional methods require access to the source data, which often raises
concerns about data privacy. In this paper, we consider a more practical but
challenging setting where the source domain data is unavailable and the target
domain data is unlabeled. Specifically, we address the domain discrepancy
problem from the perspective of contrastive learning. The key idea of our work
is to learn a domain-invariant feature by 1) performing clustering directly in
the original feature space with nearest neighbors; 2) constructing truly hard
negative pairs by extended neighbors without introducing additional
computational complexity; and 3) combining noise-contrastive estimation theory
to gain computational advantage. We conduct careful ablation studies and
extensive experiments on three common benchmarks: VisDA, Office-Home, and
Office-31. The results demonstrate the superiority of our method compared with
other state-of-the-art works.
Comment: Journal article
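The neighborhood-pair idea in step 1) can be sketched with an InfoNCE-style objective in which each sample's positive is its nearest neighbor in the original feature space. This is a simplified, hypothetical rendering: the paper's extended-neighbor hard negatives and noise-contrastive estimation speed-up are omitted, and the function name and temperature are illustrative.

```python
import numpy as np

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def nearest_neighbor_infonce(feats, tau=0.07):
    """InfoNCE loss where each sample's positive pair is its nearest
    neighbor in the original feature space (illustrative sketch)."""
    z = l2_normalize(feats)
    sim = z @ z.T / tau                 # temperature-scaled cosine similarity
    np.fill_diagonal(sim, -np.inf)      # a sample is never its own pair
    pos = sim.argmax(axis=1)            # nearest neighbor as the positive
    logits = sim - sim.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(len(z)), pos].mean())
```

On features that already form tight clusters, the positive sits in a high-similarity neighborhood and the loss falls well below the uniform baseline of log(N-1), which is the intuition behind clustering directly in the original feature space.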
Spatio-Temporal Multimedia Big Data Analytics Using Deep Neural Networks
With the proliferation of online services and mobile technologies, the world has stepped into a multimedia big data era, where new opportunities and challenges appear alongside highly diverse multimedia data and huge amounts of social data. Nowadays, multimedia data consisting of audio, text, images, and video has grown tremendously. With such an increase in the amount of multimedia data, the main question is how to analyze this high volume and variety of data efficiently and effectively. A vast amount of research has been done in the multimedia area, targeting different aspects of big data analytics, such as the capture, storage, indexing, mining, and retrieval of multimedia big data. However, little research provides a comprehensive framework for multimedia big data analytics and management.
To address the major challenges in this area, a new framework is proposed based on deep neural networks for multimedia semantic concept detection, with a focus on spatio-temporal information analysis and rare event detection. The proposed framework discovers patterns and knowledge in multimedia data using both static deep data representations and temporal semantics. Specifically, it is designed to handle data with skewed distributions. The proposed framework includes the following components: (1) a synthetic data generation component based on simulation and adversarial networks for data augmentation and deep learning training, (2) an automatic sampling model to overcome the imbalanced-data issue in multimedia data, (3) a deep representation learning model leveraging novel deep learning techniques to generate the most discriminative static features from multimedia data, (4) an automatic hyper-parameter learning component for faster training and convergence of the learning models, (5) a spatio-temporal deep learning model to analyze dynamic features from multimedia data, and finally (6) a multimodal deep learning fusion model to integrate different data modalities. The whole framework has been evaluated using various large-scale multimedia datasets, including a newly collected disaster-events video dataset and other public datasets.
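Of the components above, the imbalanced-data handling in component (2) is the simplest to illustrate. The sketch below is a naive random oversampler, a hedged stand-in only: the framework's actual automatic sampling model is more sophisticated, and the function name and interface here are illustrative assumptions.

```python
import numpy as np

def oversample_minority(X, y, rng=None):
    """Randomly duplicate minority-class samples until every class
    matches the majority count (naive illustrative oversampler)."""
    rng = np.random.default_rng(0) if rng is None else rng
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    idx = []
    for c in classes:
        c_idx = np.flatnonzero(y == c)
        # Draw with replacement to top the class up to the majority count.
        extra = rng.choice(c_idx, size=target - len(c_idx), replace=True)
        idx.extend(c_idx)
        idx.extend(extra)
    idx = np.array(idx)
    return X[idx], y[idx]
```

Duplicating samples is the crudest balancing strategy; the synthetic data generation component (1) plays a complementary role by producing genuinely new samples for rare classes rather than copies.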