Self-Supervised Texture Image Anomaly Detection By Fusing Normalizing Flow and Dictionary Learning
Anomaly detection in industrial images with textured backgrounds is a common research area in anomaly identification. Many existing models fail to detect anomalies because of interference from the texture itself and the minuteness of texture anomalies. To address these issues, we propose an anomaly detection strategy that combines dictionary learning and normalizing flows. Our method enhances an existing two-stage anomaly detection approach: to improve the baseline, we add a normalizing flow to the representation-learning stage and combine deep learning with dictionary learning. After experimental validation, the improved algorithm exceeds 95% detection accuracy on all MVTec AD texture categories and shows strong robustness. The baseline method's detection accuracy on the Carpet data was 67.9%; our improvements raise it to 99.7%.
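The dictionary-learning side of such a pipeline can be illustrated with a minimal sketch (not the paper's exact method): a dictionary is fit on normal patches only, and test patches are scored by how poorly the dictionary reconstructs them. The data here is synthetic; real inputs would be texture patches.

```python
# Illustrative sketch: score patches by reconstruction error under a
# dictionary learned on normal data. Synthetic stand-ins for real patches.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
normal = rng.normal(size=(200, 64))                          # "normal" patches
anomalous = normal[:5] + rng.normal(2.0, 1.0, size=(5, 64))  # perturbed patches

dico = MiniBatchDictionaryLearning(n_components=16, alpha=1.0, random_state=0)
dico.fit(normal)

def recon_error(X, dico):
    """Sparse-code X over the learned dictionary and return per-sample error."""
    codes = dico.transform(X)
    recon = codes @ dico.components_
    return np.linalg.norm(X - recon, axis=1)

# Anomalous patches should reconstruct worse than normal ones.
print(recon_error(normal[:5], dico).mean() < recon_error(anomalous, dico).mean())
```

The normalizing-flow stage of the paper would replace this simple error score with a learned density over the representations; the reconstruction-error rule above is only the dictionary-learning half of the idea.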
Future Frame Prediction for Anomaly Detection -- A New Baseline
Anomaly detection in videos refers to the identification of events that do
not conform to expected behavior. However, almost all existing methods tackle
the problem by minimizing the reconstruction errors of training data, which
cannot guarantee a larger reconstruction error for an abnormal event. In this
paper, we propose to tackle the anomaly detection problem within a video
prediction framework. To the best of our knowledge, this is the first work that
leverages the difference between a predicted future frame and its ground truth
to detect an abnormal event. To predict a future frame with higher quality for
normal events, other than the commonly used appearance (spatial) constraints on
intensity and gradient, we also introduce a motion (temporal) constraint in
video prediction by enforcing the optical flow between predicted frames and
ground truth frames to be consistent, and this is the first work that
introduces a temporal constraint into the video prediction task. Such spatial
and motion constraints facilitate the future frame prediction for normal
events, and consequently facilitate the identification of abnormal events that do not conform to the expectation. Extensive experiments on both a toy dataset and some publicly available datasets validate the effectiveness of our method in terms of robustness to the uncertainty in normal events and sensitivity to abnormal events.
Comment: IEEE Conference on Computer Vision and Pattern Recognition 201
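The scoring rule described above can be sketched in a few lines: a predicted frame is compared to its ground truth with PSNR, and a low PSNR (poor prediction) indicates an abnormal event. The frames below are synthetic stand-ins, not the output of a prediction network.

```python
# Sketch of prediction-error scoring: low PSNR between the predicted frame
# and ground truth signals an abnormal event.
import numpy as np

def psnr(pred, gt, max_val=1.0):
    """Peak signal-to-noise ratio between predicted and ground-truth frames."""
    mse = np.mean((pred - gt) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

rng = np.random.default_rng(0)
gt = rng.random((64, 64))
good_pred = gt + rng.normal(0, 0.01, gt.shape)  # normal event: accurate prediction
bad_pred = gt + rng.normal(0, 0.2, gt.shape)    # abnormal event: large error

print(psnr(good_pred, gt) > psnr(bad_pred, gt))  # normal frame scores higher
```

In this line of work the per-frame PSNRs of a clip are usually min-max normalized into a regularity score in [0, 1]; that step is omitted here for brevity.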
Lossy Compressive Sensing Based on Online Dictionary Learning
In this paper, a lossy compression of hyperspectral images is realized by using a novel online dictionary learning method in which three-dimensional datasets can be compressed. This online dictionary learning method and the blind compressive sensing (BCS) algorithm are combined in a hybrid lossy compression framework for the first time in the literature. According to the experimental results, the BCS algorithm has the best compression performance when the compression bit rate is higher than or equal to 0.5 bps. Apart from observing rate-distortion performance, anomaly detection performance is also tested on the reconstructed images to measure how well information is preserved.
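The core trade-off in any such lossy scheme can be illustrated with a minimal sketch (not the paper's BCS method): keep only the k largest transform coefficients of a signal and measure the distortion this introduces. An FFT stands in for the learned dictionary, and the signal is a synthetic stand-in for a spectral band.

```python
# Minimal lossy-compression sketch: zero out all but the k largest transform
# coefficients and measure reconstruction distortion. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=256))      # smooth stand-in for a spectral band

coeffs = np.fft.rfft(x)
k = 16                                   # coefficients kept (controls bit rate)
idx = np.argsort(np.abs(coeffs))[:-k]    # indices of the smallest coefficients
compressed = coeffs.copy()
compressed[idx] = 0                      # discard them
x_hat = np.fft.irfft(compressed, n=x.size)

mse = np.mean((x - x_hat) ** 2)
print(round(float(mse), 3))              # distortion at this rate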
REPRESENTATION LEARNING FOR ACTION RECOGNITION
The objective of this research work is to develop discriminative representations for human
actions. The motivation stems from the fact that there are many issues encountered while
capturing actions in videos like intra-action variations (due to actors, viewpoints, and duration),
inter-action similarity, background motion, and occlusion of actors. Hence, obtaining
a representation which can address all the variations in the same action while maintaining
discrimination with other actions is a challenging task. In the literature, actions have been represented using either low-level or high-level features. Low-level features describe
the motion and appearance in small spatio-temporal volumes extracted from a video. Due
to the limited space-time volume used for extracting low-level features, they are not able
to account for viewpoint and actor variations or variable length actions. On the other hand,
high-level features handle variations in actors, viewpoints, and duration but the resulting
representation is often high-dimensional which introduces the curse of dimensionality. In
this thesis, we propose new representations for describing actions by combining the advantages
of both low-level and high-level features. Specifically, we investigate various linear
and non-linear decomposition techniques to extract meaningful attributes in both high-level
and low-level features. In the first approach, the sparsity of high-level feature descriptors is leveraged to build
action-specific dictionaries. Each dictionary retains only the discriminative information
for a particular action and hence reduces inter-action similarity. Then, a sparsity-based
classification method is proposed to classify the low-rank representation of clips obtained
using these dictionaries. We show that this representation based on dictionary learning improves
the classification performance across actions. Also, a few of the actions consist of
rapid body deformations that hinder the extraction of local features from body movements.
Hence, we propose to use a dictionary which is trained on convolutional neural network
(CNN) features of the human body in various poses to reliably identify actors from the
background. Particularly, we demonstrate the efficacy of sparse representation in the identification
of the human body under rapid and substantial deformation.
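The sparsity-based classification step described above can be sketched as follows: a clip descriptor is coded over each action's dictionary and assigned to the action whose dictionary reconstructs it with the smallest residual. The dictionaries and descriptor here are synthetic; the thesis uses learned, action-specific ones, and a least-squares code stands in for sparse coding.

```python
# Hedged sketch of classification with class-specific dictionaries: assign a
# descriptor to the action whose dictionary yields the smallest residual.
import numpy as np

rng = np.random.default_rng(1)
d, k = 32, 8
dicts = {a: rng.normal(size=(d, k)) for a in ("wave", "jump")}  # per-action dictionaries

def classify(x, dicts):
    """Return the action whose dictionary reconstructs x with least residual."""
    residuals = {}
    for action, D in dicts.items():
        code, *_ = np.linalg.lstsq(D, x, rcond=None)  # code (sparse in the thesis)
        residuals[action] = np.linalg.norm(x - D @ code)
    return min(residuals, key=residuals.get)

x = dicts["wave"] @ rng.normal(size=k)  # descriptor lying in the "wave" subspace
print(classify(x, dicts))               # -> "wave"
```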
In the first two approaches, sparsity-based representation is developed to improve discriminability
using class-specific dictionaries that utilize action labels. However, developing
an unsupervised representation of actions is more beneficial as it can be used to both
recognize similar actions and localize actions. We propose to exploit inter-action similarity
to train a universal attribute model (UAM) in order to learn action attributes (common and
distinct) implicitly across all the actions. Using maximum a posteriori (MAP) adaptation,
a high-dimensional super action-vector (SAV) for each clip is extracted. As this SAV contains
redundant attributes of all other actions, we use factor analysis to extract a novel low-dimensional action-vector representation for each clip. Action-vectors are shown to suppress background motion and highlight actions of interest in both trimmed and untrimmed clips, which contributes to action recognition without the help of any classifiers.
It is observed during our experiments that action-vector cannot effectively discriminate
between actions which are visually similar to each other. Hence, we subject action-vectors
to supervised linear embedding using linear discriminant analysis (LDA) and probabilistic
LDA (PLDA) to enforce discrimination. Particularly, we show that leveraging complementary information across action-vectors using different local features, followed by discriminative embedding, provides the best classification performance. Further, we explore
non-linear embedding of action-vectors using Siamese networks especially for fine-grained
action recognition. A visualization of the hidden layer output in Siamese networks shows
its ability to effectively separate visually similar actions. This leads to better classification
performance than linear embedding on fine-grained action recognition.
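The supervised linear-embedding step can be sketched with LDA on synthetic data: vectors from two overlapping classes (standing in for action-vectors of visually similar actions) are projected onto a discriminative direction. Real action-vectors would come from the UAM.

```python
# Sketch of the LDA embedding step: project overlapping classes onto a
# discriminative axis. Synthetic stand-ins for action-vectors.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=(100, 20))   # "action-vectors" of action A
b = rng.normal(0.5, 1.0, size=(100, 20))   # visually similar action B
X = np.vstack([a, b])
y = np.array([0] * 100 + [1] * 100)

lda = LinearDiscriminantAnalysis(n_components=1)
z = lda.fit_transform(X, y)                # discriminative 1-D embedding
print(lda.score(X, y))                     # separability after embedding
```

The thesis's Siamese-network embedding replaces this single linear projection with a learned non-linear one, which is what helps on fine-grained actions.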
All of the above approaches are presented on large unconstrained datasets with hundreds
of examples per action. However, actions in surveillance videos like snatch thefts are
difficult to model because of the diverse variety of scenarios in which they occur and very
few labeled examples. Hence, we propose to utilize the universal attribute model (UAM)
trained on large action datasets to represent such actions. Specifically, we show that there
are similarities between certain actions in the large datasets with snatch thefts which help
in extracting a representation for snatch thefts using the attributes from the UAM. This
representation is shown to be effective in distinguishing snatch thefts from regular actions
with high accuracy.
In summary, this thesis proposes both supervised and unsupervised approaches for representing
actions which provide better discrimination than existing representations. The
first approach presents a dictionary learning based sparse representation for effective discrimination
of actions. Also, we propose a sparse representation for the human body based
on dictionaries in order to recognize actions with rapid body deformations. In the next
approach, a low-dimensional representation called action-vector for unsupervised action
recognition is presented. Further, linear and non-linear embedding of action-vectors is
proposed for addressing inter-action similarity and fine-grained action recognition, respectively.
Finally, we propose a representation for locating snatch thefts among thousands of
regular interactions in surveillance videos.
Adaptive machinery fault diagnosis based on improved shift-invariant sparse coding
In machinery fault diagnosis, one kind of fault commonly corresponds to several operating conditions, such as different loads and different speeds. When a conventional intelligent fault diagnosis method is trained on only one of these conditions, using the trained classifier to diagnose faults across all conditions yields a high error rate; this is a problem of robustness. If, instead, the classifier is trained on data from every condition, robustness improves considerably, but at a large cost in time. To balance these two seemingly contradictory aspects of fault diagnosis, a method based on shift-invariant sparse coding (SISC) was previously proposed: it learns features from one condition of a fault that remain adaptive to the other conditions, which solves the robustness problem, but the algorithm's time efficiency is low. In this paper, by improving the efficiency of shift-invariant sparse coding, we greatly reduce the time needed for feature learning. Experimental testing shows that the method proposed in this paper outperforms the original SISC algorithm.
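The core operation in SISC-style methods can be illustrated with a greedy convolutional (shift-invariant) matching pursuit: at each step, pick the atom and shift whose correlation with the residual is largest, then subtract that contribution. The atoms and signal below are synthetic; real vibration signals would replace them.

```python
# Illustrative shift-invariant matching pursuit: find the best (atom, shift)
# pair by cross-correlation and peel it off the residual.
import numpy as np

def conv_matching_pursuit(signal, atoms, n_iter=5):
    """Greedy shift-invariant decomposition of `signal` over short `atoms`."""
    residual = signal.astype(float).copy()
    events = []
    for _ in range(n_iter):
        best = None
        for k, atom in enumerate(atoms):
            corr = np.correlate(residual, atom, mode="valid")
            shift = int(np.argmax(np.abs(corr)))
            amp = corr[shift] / np.dot(atom, atom)
            if best is None or abs(corr[shift]) > abs(best[0]):
                best = (corr[shift], k, shift, amp)
        _, k, shift, amp = best
        residual[shift:shift + len(atoms[k])] -= amp * atoms[k]
        events.append((k, shift, amp))
    return events, residual

atoms = [np.hanning(16), np.sin(np.linspace(0, 4 * np.pi, 16))]
signal = np.zeros(128)
signal[40:56] += 2.0 * atoms[0]      # one "impulse" event at shift 40
events, residual = conv_matching_pursuit(signal, atoms, n_iter=1)
print(events[0][:2])                 # -> (0, 40): atom 0 detected at shift 40
```

Because the same atom is matched at every shift, features learned under one operating condition can fire on time-shifted versions of the same fault signature, which is the property the abstract relies on.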
Anomaly Detection on Time Series Data
Anomaly detection is an important problem that has been researched within diverse application domains. Detecting anomalies in the time-series domain finds extensive application in monitoring system status, malware/spam detection, credit-card fraud, etc. In this work we explore methods to detect anomalies in multivariate as well as univariate time series and propose a novel method using dictionary learning, sparse representation, singular value decomposition, and topological anomaly detection (TAD). We have tested the proposed method on real as well as synthetic data sets. Our novel method brings down the false-positive rate compared to existing methods.
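One ingredient of such a pipeline, the SVD step, can be sketched on a synthetic univariate series: embed the series into sliding windows, fit a truncated SVD basis, and flag windows with a large residual outside that basis.

```python
# Sketch of SVD-based anomaly scoring on sliding windows of a time series.
# The series and injected anomaly are synthetic.
import numpy as np

def windows(x, w):
    """All length-w sliding windows of x as rows."""
    return np.lib.stride_tricks.sliding_window_view(x, w)

rng = np.random.default_rng(0)
t = np.arange(500)
series = np.sin(2 * np.pi * t / 50) + 0.05 * rng.normal(size=t.size)
series[300:310] += 3.0                   # injected anomaly

W = windows(series, 32)
Wc = W - W.mean(0)
U, s, Vt = np.linalg.svd(Wc, full_matrices=False)
basis = Vt[:2]                           # top-2 components capture the sinusoid
err = np.linalg.norm(Wc - Wc @ basis.T @ basis, axis=1)
print(int(np.argmax(err)))               # window index near the anomaly
```

The full method in the abstract layers dictionary learning, sparse representation, and TAD on top of this kind of windowed representation to drive down false positives.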