Representation Learning with Fine-grained Patterns
With the development of computational power and techniques for data
collection, deep learning demonstrates superior performance over most
existing algorithms on benchmark data sets. Many efforts have been devoted to
studying the mechanism of deep learning. One important observation is that deep
learning can learn the discriminative patterns from raw materials directly in a
task-dependent manner. Therefore, the representations obtained by deep learning
outperform hand-crafted features significantly. However, those patterns are
often learned from super-class labels due to a limited availability of
fine-grained labels, while fine-grained patterns are desired in many real-world
applications such as visual search in online shopping. To mitigate the
challenge, we propose an algorithm to learn the fine-grained patterns
sufficiently when only super-class labels are available. The effectiveness of
our method can be guaranteed with the theoretical analysis. Extensive
experiments on real-world data sets demonstrate that the proposed method can
significantly improve the performance on target tasks corresponding to
fine-grained classes, when only super-class information is available for
training.
Intra-Modal Proxy Learning for Zero-Shot Visual Categorization with CLIP
Vision-language pre-training methods, e.g., CLIP, demonstrate an impressive
zero-shot performance on visual categorizations with the class proxy from the
text embedding of the class name. However, the modality gap between the text
and vision space can result in a sub-optimal performance. We theoretically show
that the gap cannot be reduced sufficiently by minimizing the contrastive loss
in CLIP and the optimal proxy for vision tasks may reside only in the vision
space. Therefore, given unlabeled target vision data, we propose to learn the
vision proxy directly with the help of the text proxy for zero-shot transfer.
Moreover, according to our theoretical analysis, strategies are developed to
further refine the pseudo label obtained by the text proxy to facilitate the
intra-modal proxy learning (InMaP) for vision. Experiments on extensive
downstream tasks confirm the effectiveness and efficiency of our proposal.
Concretely, InMaP can obtain the vision proxy within one minute on a single GPU
while improving the zero-shot accuracy from to on ImageNet
with ViT-L/14@336 pre-trained by CLIP. Code is available at
https://github.com/idstcv/InMaP.
Comment: accepted by NeurIPS'2
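The idea of deriving a vision-space proxy from text-proxy pseudo labels can be illustrated with a rough sketch. This is not the InMaP algorithm from the repository above: it replaces the learned proxy and the pseudo-label refinement with a single soft-weighted average of image features, and the function names, the `temperature` value, and the random inputs are all assumptions made for illustration.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit Euclidean norm."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def vision_proxies_from_text(image_feats, text_proxies, temperature=0.01):
    """Hypothetical simplification: soft pseudo labels from text proxies,
    then one vision proxy per class as a pseudo-label-weighted mean."""
    img = l2_normalize(image_feats)
    txt = l2_normalize(text_proxies)
    # Zero-shot logits: cosine similarity to each text proxy, sharpened.
    logits = img @ txt.T / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    pseudo = np.exp(logits)
    pseudo /= pseudo.sum(axis=1, keepdims=True)
    # Each vision proxy lives in the image-feature space.
    proxies = l2_normalize(pseudo.T @ img)
    return proxies, pseudo

rng = np.random.default_rng(0)
image_feats = rng.normal(size=(20, 5))   # stand-in for image embeddings
text_proxies = rng.normal(size=(3, 5))   # stand-in for class-name embeddings
proxies, pseudo = vision_proxies_from_text(image_feats, text_proxies)
predictions = (l2_normalize(image_feats) @ proxies.T).argmax(axis=1)
```

The point of the sketch is only that classification with `proxies` compares image features against vectors from the same modality, whereas the text proxies sit on the other side of the modality gap.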
Exact and Consistent Interpretation for Piecewise Linear Neural Networks: A Closed Form Solution
Strong intelligent machines powered by deep neural networks are increasingly
deployed as black boxes to make decisions in risk-sensitive domains, such as
finance and medicine. To reduce potential risk and build trust with users, it is
critical to interpret how such machines make their decisions. Existing works
interpret a pre-trained neural network by analyzing hidden neurons, mimicking
pre-trained models or approximating local predictions. However, these methods
do not provide a guarantee on the exactness and consistency of their
interpretation. In this paper, we propose an elegant closed form solution named
to compute exact and consistent interpretations for the family of
Piecewise Linear Neural Networks (PLNN). The major idea is to first transform a
PLNN into a mathematically equivalent set of linear classifiers, then interpret
each linear classifier by the features that dominate its prediction. We further
apply to demonstrate the effectiveness of non-negative and sparse
constraints on improving the interpretability of PLNNs. The extensive
experiments on both synthetic and real world data sets clearly demonstrate the
exactness and consistency of our interpretation.
Comment: KDD 201
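The transformation of a PLNN into an equivalent linear classifier can be illustrated for a single input. The sketch below is an assumption-laden illustration rather than the authors' implementation: it takes a small fully connected ReLU network with random weights, fixes the activation pattern at one input `x`, and folds the layers into one affine map `A @ z + c` that agrees with the network on every input sharing that pattern. The helper names are hypothetical.

```python
import numpy as np

def relu_net(x, weights, biases):
    """Forward pass of a ReLU MLP whose last layer is linear."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(W @ h + b, 0.0)
    return weights[-1] @ h + biases[-1]

def local_linear_form(x, weights, biases):
    """Return (A, c) such that the network equals A @ z + c on the
    linear region (activation pattern) that contains x."""
    A = np.eye(x.shape[0])
    c = np.zeros(x.shape[0])
    for W, b in zip(weights[:-1], biases[:-1]):
        A = W @ A
        c = W @ c + b
        # Pre-activations at x decide which hidden units are switched on.
        mask = (A @ x + c > 0).astype(float)
        A = mask[:, None] * A
        c = mask * c
    return weights[-1] @ A, weights[-1] @ c + biases[-1]

rng = np.random.default_rng(0)
sizes = [4, 8, 8, 3]
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=m) for m in sizes[1:]]
x = rng.normal(size=4)

A, c = local_linear_form(x, weights, biases)
predicted = int(np.argmax(A @ x + c))
# Features ranked by the magnitude of their weight for the predicted class.
ranking = np.argsort(-np.abs(A[predicted]))
```

Because `A` is exact within the region, the row of `A` for the predicted class can be read directly as feature weights, which is the sense in which each linear classifier is interpreted by the features that dominate its prediction.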
AugDMC: Data Augmentation Guided Deep Multiple Clustering
Clustering aims to group similar objects together while separating dissimilar
ones apart. Thereafter, structures hidden in data can be identified to help
understand data in an unsupervised manner. Traditional clustering methods such
as k-means provide only a single clustering for one data set. Deep clustering
methods such as auto-encoder based clustering methods have shown a better
performance, but still provide a single clustering. However, a given dataset
might have multiple clustering structures and each represents a unique
perspective of the data. Therefore, some multiple clustering methods have been
developed to discover multiple independent structures hidden in data. Although
deep multiple clustering methods provide better performance, how to efficiently
capture the alternative perspectives in data is still a problem. In this paper,
we propose AugDMC, a novel data Augmentation guided Deep Multiple Clustering
method, to tackle the challenge. Specifically, AugDMC leverages data
augmentations to automatically extract features related to a certain aspect of
the data using a self-supervised prototype-based representation learning, where
different aspects of the data can be preserved under different data
augmentations. Moreover, a stable optimization strategy is proposed to
alleviate the instability caused by different augmentations. Thereafter,
multiple clusterings based on different aspects of the data can be obtained.
Experimental results on three real-world datasets compared with
state-of-the-art methods validate the effectiveness of the proposed method.
Enhancing Peak Network Traffic Prediction via Time-Series Decomposition
For network administration and maintenance, it is critical to anticipate when
networks will receive peak volumes of traffic so that adequate resources can be
allocated to service requests made to servers. In the event that sufficient
resources are not allocated to servers, they can become prone to failure and
security breaches. Conversely, always allocating the maximum amount of
resources would be wasteful. Therefore, anticipating peak
volumes in network traffic becomes an important problem. However, popular
forecasting models such as Autoregressive Integrated Moving Average (ARIMA)
forecast time-series data in general and thus fall short in predicting peak volumes in
these time-series. More often than not, a time-series is a combination of different
features, which may include but are not limited to 1) Trend, the general
movement of the traffic volume, 2) Seasonality, the patterns repeated over some
time periods (e.g. daily and monthly), and 3) Noise, the random changes in the
data. Considering that the fluctuation of seasonality can be harmful for trend
and peak prediction, we propose to extract seasonalities to facilitate the peak
volume predictions in the time domain. The experiments on both synthetic and
real network traffic data demonstrate the effectiveness of the proposed method.
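The three components named above (trend, seasonality, noise) can be separated with a classical additive decomposition. The sketch below is a generic textbook procedure, not the method proposed in the paper; the function name, the moving-average trend estimate, and the synthetic hourly series with a 24-step period are assumptions chosen for illustration.

```python
import numpy as np

def decompose(series, period):
    """Additive decomposition: series = trend + seasonal + residual.
    Assumes len(series) >= period."""
    series = np.asarray(series, dtype=float)
    # Trend: moving average over one full period, edges padded by repetition.
    pad = period // 2
    padded = np.pad(series, (pad, period - 1 - pad), mode="edge")
    trend = np.convolve(padded, np.ones(period) / period, mode="valid")
    # Seasonality: average detrended value at each phase of the period.
    detrended = series - trend
    phase = np.arange(len(series)) % period
    profile = np.array([detrended[phase == p].mean() for p in range(period)])
    profile -= profile.mean()
    seasonal = profile[phase]
    # Whatever remains is treated as noise.
    residual = series - trend - seasonal
    return trend, seasonal, residual

rng = np.random.default_rng(0)
t = np.arange(240)
series = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 24) + rng.normal(scale=0.3, size=240)
trend, seasonal, residual = decompose(series, period=24)
deseasonalized = series - seasonal  # input for a downstream peak predictor
```

Removing `seasonal` before forecasting is one simple way to keep periodic fluctuation from masking the trend and the peaks.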
NPRL: Nightly Profile Representation Learning for Early Sepsis Onset Prediction in ICU Trauma Patients
Sepsis is a syndrome that develops in response to the presence of infection.
It is characterized by severe organ dysfunction and is one of the leading
causes of mortality in Intensive Care Units (ICUs) worldwide. These
complications can be reduced through early application of antibiotics, hence
the ability to anticipate the onset of sepsis early is crucial to the survival
and well-being of patients. Current machine learning algorithms deployed inside
medical infrastructures have demonstrated poor performance and are insufficient
for anticipating sepsis onset early. In recent years, deep learning
methodologies have been proposed to predict sepsis, but some fail to capture
the time of onset (e.g., classifying patients' entire visits as developing
sepsis or not) and others are unrealistic to deploy in medical
facilities (e.g., creating training instances using a fixed time to onset where
the time of onset needs to be known a priori). Therefore, in this paper, we
first propose a novel but realistic prediction framework that predicts each
morning whether sepsis onset will occur within the next 24 hours with the help
of the most recent data collected at night, when patient-provider ratios are higher
due to cross-coverage, resulting in limited observation of each patient.
However, as we increase the prediction frequency to daily, the number of negative
instances increases while the number of positive ones remains the same.
This creates a severe class imbalance problem, making it hard for a machine learning
model to capture rare sepsis cases. To address this problem, we propose to
perform nightly profile representation learning (NPRL) for each patient. We prove
that NPRL can theoretically alleviate the rare event problem. Our empirical
study using data from a level-1 trauma center further demonstrates the
effectiveness of our proposal.
Improved Visual Fine-tuning with Natural Language Supervision
Fine-tuning a visual pre-trained model can leverage the semantic information
from large-scale pre-training data and mitigate the over-fitting problem on
downstream vision tasks with limited training examples. While the problem of
catastrophic forgetting in the pre-trained backbone has been extensively studied
for fine-tuning, its potential bias from the corresponding pre-training task
and data has attracted less attention. In this work, we investigate this problem by
demonstrating that the obtained classifier after fine-tuning will be close to
that induced by the pre-trained model. To reduce the bias in the classifier
effectively, we introduce a reference distribution obtained from a fixed text
classifier, which can help regularize the learned vision classifier. The
proposed method, Text Supervised fine-tuning (TeS), is evaluated with diverse
pre-trained vision models including ResNet and ViT, and text encoders including
BERT and CLIP, on 11 downstream tasks. The consistent improvement with a clear
margin over distinct scenarios confirms the effectiveness of our proposal. Code
is available at https://github.com/idstcv/TeS.
Comment: accepted by ICCV'2
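Regularizing a vision classifier with a reference distribution from a fixed text classifier can be sketched as a cross-entropy loss plus a divergence penalty. The exact form of the regularizer used by TeS is not stated in the abstract, so the KL term, the weight `lam`, and the function names below are assumptions, and the logits are made-up numbers.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-shift for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def regularized_loss(vision_logits, text_logits, labels, lam=0.5):
    """Cross-entropy on labels plus lam * KL(reference || prediction).
    The reference comes from a fixed (non-trained) text classifier."""
    p = softmax(vision_logits)   # learned vision classifier
    q = softmax(text_logits)     # reference distribution, held fixed
    n = len(labels)
    ce = -np.log(p[np.arange(n), labels] + 1e-12).mean()
    kl = (q * (np.log(q + 1e-12) - np.log(p + 1e-12))).sum(axis=1).mean()
    return ce + lam * kl

vision_logits = np.array([[2.0, 0.5, -1.0], [0.1, 1.2, 0.3]])
text_logits = np.array([[1.5, 0.2, 0.0], [0.0, 2.0, 0.1]])
labels = np.array([0, 1])
loss = regularized_loss(vision_logits, text_logits, labels)
```

Setting `lam=0` recovers plain fine-tuning, while larger values pull the vision classifier toward the text-induced distribution.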
Multi-Subset Approach to Early Sepsis Prediction
Sepsis is a life-threatening organ malfunction caused by the host's inability
to fight infection, which can lead to death without proper and immediate
treatment. Therefore, early diagnosis and medical treatment of sepsis in
critically ill populations at high risk for sepsis and sepsis-associated
mortality are vital to providing the patient with rapid therapy. Studies show
that advancing sepsis detection by 6 hours leads to earlier administration of
antibiotics, which is associated with improved mortality. However, clinical
scores like Sequential Organ Failure Assessment (SOFA) are not applicable for
early prediction, while machine learning algorithms can help capture the
progressing pattern for early prediction. Therefore, we aim to develop a
machine learning algorithm that predicts sepsis onset 6 hours before it is
suspected clinically. Although some machine learning algorithms have been
applied to sepsis prediction, many of them did not consider the fact that six
hours is a substantial gap. To overcome the challenge of this large gap, we explore a
multi-subset approach in which the likelihood of sepsis occurring earlier than
6 hours is output from a previous subset and fed to the target subset as
additional features. Moreover, we use hourly sampled data, such as vital signs
in an observation window, to derive temporal trend features, which
previous studies often ignore. Our empirical study shows
that both the multi-subset approach to alleviating the 6-hour gap and the added
temporal trend features can help improve the performance of sepsis-related
early prediction.
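One simple way to turn hourly vital signs in an observation window into a trend feature is the least-squares slope of each signal over time. The abstract does not say how the trend is computed, so the slope-based feature, the function name, and the example values below are assumptions; missing measurements are not handled.

```python
import numpy as np

def trend_slopes(window):
    """window: array of shape (hours, n_signals), one row per hour.
    Returns the least-squares slope of each signal in units per hour."""
    window = np.asarray(window, dtype=float)
    hours = np.arange(window.shape[0])
    # polyfit fits every column at once; row 0 holds the slopes.
    slopes, _ = np.polyfit(hours, window, deg=1)
    return slopes

hours = np.arange(6)
heart_rate = 80 + 2.0 * hours        # rising 2 bpm per hour
temperature = np.full(6, 37.0)       # flat
window = np.column_stack([heart_rate, temperature])
features = trend_slopes(window)
```

The resulting slopes can be appended to the feature vector of the target subset alongside the likelihood passed on from the previous subset.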