Kervolutional Neural Networks
Convolutional neural networks (CNNs) have enabled the state-of-the-art
performance in many computer vision tasks. However, little effort has been
devoted to establishing convolution in non-linear space. Existing works mainly
leverage on the activation layers, which can only provide point-wise
non-linearity. To solve this problem, a new operation, kervolution (kernel
convolution), is introduced to approximate complex behaviors of human
perception systems leveraging on the kernel trick. It generalizes convolution,
enhances the model capacity, and captures higher order interactions of
features, via patch-wise kernel functions, but without introducing additional
parameters. Extensive experiments show that kervolutional neural networks (KNN)
achieve higher accuracy and faster convergence than baseline CNNs.
Comment: oral paper in CVPR 201
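The core idea, replacing convolution's patch-weight inner product with a patch-wise kernel function, can be sketched as follows. This is a hypothetical, minimal 1-D illustration with a degree-2 polynomial kernel; the paper operates on 2-D feature maps and discusses several kernel choices.

```python
import numpy as np

def kervolve_1d(signal, weights, kernel=lambda a, b: (a @ b + 1.0) ** 2):
    """Kervolution sketch: replace the patch-weight inner product of
    convolution with a patch-wise kernel function (here a degree-2
    polynomial kernel). Minimal 1-D version for illustration only."""
    k = len(weights)
    out = np.empty(len(signal) - k + 1)
    for i in range(len(out)):
        out[i] = kernel(signal[i:i + k], weights)  # non-linear patch response
    return out

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = np.array([0.5, -0.5, 0.5])
print(kervolve_1d(x, w))  # polynomial responses: [4., 6.25, 9.]
# a linear kernel, lambda a, b: a @ b, recovers ordinary convolution
```

Note that the non-linearity acts on the whole patch, not point-wise as an activation function would, and (as the abstract stresses) no extra parameters are introduced beyond the usual weights.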
Anomaly Detection And Classification In Time Series With Kervolutional Neural Networks
Recently, with the development of deep learning, end-to-end neural network
architectures have been increasingly applied to condition monitoring signals.
They have demonstrated superior performance for fault detection and
classification, in particular using convolutional neural networks. Even more
recently, an extension of the concept of convolution to the concept of
kervolution has been proposed with some promising results in image
classification tasks. In this paper, we explore the potential of kervolutional
neural networks applied to time series data. We demonstrate that using a
mixture of convolutional and kervolutional layers improves the model
performance. The mixed model is first applied to a time series classification
task on a benchmark dataset. Subsequently, the proposed mixed architecture
is used to detect anomalies in time series data recorded by accelerometers on
helicopters. We propose a residual-based anomaly detection approach using a
temporal auto-encoder. We demonstrate that mixing kervolutional and
convolutional layers in the encoder makes the model more sensitive to
variations in the input data and better able to detect anomalous time series.
Comment: 9 pages, 1 figure, 4 tables
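The residual-based detection scheme can be sketched as below. The trained temporal auto-encoder is not reproduced here; a simple moving-average smoother stands in for it, and the threshold is an assumed value.

```python
import numpy as np

def anomaly_scores(series, reconstruct, threshold):
    """Residual-based detection sketch: a sample is flagged anomalous when
    its reconstruction error exceeds a threshold. `reconstruct` stands in
    for the trained temporal auto-encoder, which is not reproduced here."""
    residual = np.abs(series - reconstruct(series))
    return residual, residual > threshold

# Toy stand-in "auto-encoder": a moving-average smoother that reproduces
# normal structure but not sharp, anomalous deviations.
smooth = lambda s: np.convolve(s, np.ones(3) / 3, mode="same")
t = np.linspace(0, 4 * np.pi, 200)
signal = np.sin(t)
signal[120] += 3.0                       # injected spike = anomaly
residual, flags = anomaly_scores(signal, smooth, threshold=1.0)
print(bool(flags[120]), bool(flags[60]))  # True False
```

The design choice is that the reconstructor only learns "normal" behaviour, so anomalies surface as large residuals rather than requiring labelled anomaly data.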
Analysis for Symptoms of Human Fall using Pre-Processing and Segmentation based on Deep Learning Architectures
By building sensor-based alert systems, physical therapists can not only reduce after-fall repercussions but even save lives. Older people are prone to several diseases, and falls are a regular occurrence for them. Various fall detection systems have recently been developed, with computer vision-based approaches being among the most promising and effective. Here, sensor-based data is analysed for a patient's fall symptoms. The data is pre-processed using Gaussian filtering with a kernel neural network, in which the data is normalized and trained with a neural network. The trained, normalized data is then segmented using an encoded Stacked Deconvolutional Network (EnSt-DeConvNet). We found that the suggested method predicts such fall symptoms from sensor data with the highest accuracy, although the accuracy of other algorithms is fairly close. Experiments reveal that, compared to other commonly used techniques on a multiple-camera fall dataset, the suggested technique produced reliable findings, and that our dataset, which consists of more training samples, produced even better results. Experimental results show an accuracy of 96%, precision of 94%, recall of 88%, an F1-score of 82%, and a computational time of 69%.
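The pre-processing step (Gaussian filtering followed by normalization) can be sketched as below; the kernel radius and sigma are assumed values, not taken from the paper.

```python
import numpy as np

def preprocess(sensor, sigma=1.0, radius=3):
    """Pre-processing sketch: Gaussian filtering followed by zero-mean,
    unit-variance normalisation, as described for the fall-symptom
    pipeline. The kernel radius and sigma are assumed values."""
    xs = np.arange(-radius, radius + 1)
    g = np.exp(-xs ** 2 / (2.0 * sigma ** 2))
    g /= g.sum()                                # normalised Gaussian kernel
    smoothed = np.convolve(sensor, g, mode="same")
    return (smoothed - smoothed.mean()) / smoothed.std()

accel = np.random.default_rng(0).normal(size=256)   # toy accelerometer trace
out = preprocess(accel)
print(round(float(out.mean()), 6), round(float(out.std()), 6))  # ~0.0 and 1.0
```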
Kernelized Dense Layers for Facial Expression Recognition
A fully connected layer is an essential component of Convolutional Neural
Networks (CNNs), which have demonstrated their efficiency in computer vision
tasks. The CNN process usually starts with convolution and pooling layers that
first break the input images down into features and then analyze them
independently. The result of this process feeds into a fully connected neural
network structure that drives the final classification decision. In this paper,
we propose a Kernelized Dense Layer (KDL), which captures higher-order feature
interactions instead of conventional linear relations. We apply this method to
Facial Expression Recognition (FER) and evaluate its performance on the RAF,
FER2013 and ExpW datasets. The experimental results demonstrate the benefits of
such a layer and show that our model achieves competitive results with respect
to state-of-the-art approaches.
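The kernelized-dense idea can be sketched as follows; the shapes and the degree-2 polynomial kernel are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def kernelized_dense(x, W, b, kernel=lambda x, w: (x @ w + 1.0) ** 2):
    """Kernelized Dense Layer sketch: each unit responds with a kernel
    function of the input and its weight vector (here a degree-2
    polynomial kernel) instead of the usual linear map x @ w + b, so
    higher-order feature interactions are captured."""
    return np.array([kernel(x, W[:, j]) for j in range(W.shape[1])]) + b

x = np.array([1.0, 2.0])              # flattened CNN features
W = np.array([[1.0, 0.0],
              [0.0, 1.0]])            # 2 inputs -> 2 units
b = np.zeros(2)
print(kernelized_dense(x, W, b))      # [4. 9.]
```

A linear kernel `lambda x, w: x @ w` would recover an ordinary dense layer, which makes the generalization explicit.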
Self-Organized Operational Neural Networks for Severe Image Restoration Problems
Discriminative learning based on convolutional neural networks (CNNs) aims to
perform image restoration by learning from training examples of noisy-clean
image pairs. It has become the go-to methodology for tackling image restoration
and has outperformed the traditional non-local class of methods. However, the
top-performing networks are generally composed of many convolutional layers and
hundreds of neurons, with trainable parameters in excess of several million.
We claim that this is due to the inherent linear nature of convolution-based
transformation, which is inadequate for handling severe restoration problems.
Recently, a non-linear generalization of CNNs, called operational neural
networks (ONNs), has been shown to outperform CNNs on AWGN denoising. However,
its formulation is burdened by a fixed collection of well-known nonlinear
operators and an exhaustive search to find the best possible configuration for
a given architecture, whose efficacy is further limited by a fixed output layer
operator assignment. In this study, we leverage the Taylor series-based
function approximation to propose a self-organizing variant of ONNs, Self-ONNs,
for image restoration, which synthesizes novel nodal transformations on the fly
as part of the learning process, thus eliminating the need for redundant
training runs for operator search. In addition, it enables a finer level of
operator heterogeneity by diversifying individual connections of the receptive
fields and weights. We perform a series of extensive ablation experiments
across three severe image restoration tasks. Even when a strict equivalence of
learnable parameters is imposed, Self-ONNs surpass CNNs by a considerable
margin across all problems, improving the generalization performance by up to 3
dB in terms of PSNR.
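The Taylor series-based nodal transformation can be sketched as below. The series order Q = 2 and the weights are assumed values for illustration; in a Self-ONN these coefficients are learned, so the non-linearity is synthesised during training instead of being picked from a fixed operator set.

```python
import numpy as np

def self_onn_node(patch, coeffs):
    """Self-ONN sketch: the nodal transformation is a truncated
    Taylor/Maclaurin series, sum over q of w_q * x**q, with learned
    coefficients. coeffs has shape (Q, len(patch)): one weight vector
    per power of the input."""
    powers = np.stack([patch ** (q + 1) for q in range(coeffs.shape[0])])
    return float(np.sum(coeffs * powers))

patch = np.array([0.5, -0.5])
coeffs = np.array([[1.0, 1.0],    # linear term: ordinary convolution
                   [2.0, 2.0]])   # quadratic term: learned non-linearity
print(self_onn_node(patch, coeffs))  # 0.5 - 0.5 + 0.5 + 0.5 = 1.0
```

With only the linear row, this reduces to a convolutional response; the higher-order rows are what let each node shape its own operator.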
Graph Optimization Approach to Range-based Localization
In this paper, we propose a general graph optimization based framework for
localization, which can accommodate different types of measurements with
varying measurement time intervals. Special emphasis will be on range-based
localization. Range and trajectory-smoothness constraints are constructed in a
position graph, and the robot trajectory over a sliding window is then
estimated by a graph-based optimization algorithm. Moreover, a convergence
analysis of the
algorithm is provided, and the effects of the number of iterations and window
size in the optimization on the localization accuracy are analyzed. Extensive
experiments on a quadcopter under a variety of scenarios verify the
effectiveness of the proposed algorithm and demonstrate much higher
localization accuracy than existing range-based localization methods,
especially in the altitude direction.
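The position-graph idea, range residuals plus a smoothness penalty over a sliding window, can be sketched as below. The plain gradient-descent solver and the weights are assumptions for illustration, not the paper's exact graph optimizer.

```python
import numpy as np

def localize(anchors, ranges, n_iters=2000, lr=0.02, lam=1.0):
    """Position-graph sketch: estimate a trajectory over a sliding window
    by gradient descent on range residuals plus a trajectory-smoothness
    penalty. Solver and weights are illustrative assumptions."""
    T = ranges.shape[0]
    traj = np.zeros((T, 2))                          # initial guess
    for _ in range(n_iters):
        grad = np.zeros_like(traj)
        for k in range(T):
            for a, r in zip(anchors, ranges[k]):     # range constraints
                d = np.linalg.norm(traj[k] - a) + 1e-9
                grad[k] += 2.0 * (d - r) * (traj[k] - a) / d
        diff = traj[1:] - traj[:-1]                  # smoothness constraints
        grad[1:] += 2.0 * lam * diff
        grad[:-1] -= 2.0 * lam * diff
        traj -= lr * grad
    return traj

anchors = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
true_pos = np.array([1.0, 1.0])                      # static robot, 3 steps
ranges = np.tile(np.linalg.norm(anchors - true_pos, axis=1), (3, 1))
print(np.round(localize(anchors, ranges)[0], 2))     # close to [1. 1.]
```

Increasing the window size `T` or the iteration count trades computation for accuracy, which mirrors the trade-off the abstract says is analyzed in the paper.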
SSIVD-Net: A Novel Salient Super Image Classification & Detection Technique for Weaponized Violence
Detection of violence and weaponized violence in closed-circuit television
(CCTV) footage requires a comprehensive approach. In this work, we introduce
the \emph{Smart-City CCTV Violence Detection (SCVD)} dataset, specifically
designed to facilitate the learning of weapon distribution in surveillance
videos. To tackle the complexities of analyzing 3D surveillance video for
violence recognition tasks, we propose a novel technique called
\emph{SSIVD-Net} (\textbf{S}alient-\textbf{S}uper-\textbf{I}mage for
\textbf{V}iolence \textbf{D}etection). Our method reduces 3D video data
complexity, dimensionality, and information loss while improving inference,
performance, and explainability through the use of Salient-Super-Image
representations. Considering the scalability and sustainability requirements of
futuristic smart cities, we introduce the \emph{Salient-Classifier}, a
novel architecture combining a kernelized approach with a residual learning
strategy. We evaluate variations of SSIVD-Net and Salient Classifier on our
SCVD dataset and benchmark against state-of-the-art (SOTA) models commonly
employed in violence detection. Our approach exhibits significant improvements
in detecting both weaponized and non-weaponized violence instances. By
advancing the SOTA in violence detection, our work offers a practical and
scalable solution suitable for real-world applications. The proposed
methodology not only addresses the challenges of violence detection in CCTV
footage but also contributes to the understanding of weapon distribution in
smart surveillance. Ultimately, our research findings should enable smarter and
more secure cities, as well as enhance public safety measures.
Comment: 5 tables, 3 figures
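The super-image representation, tiling video frames into one 2-D mosaic so a 2-D classifier can consume 3-D video, can be sketched as follows. The salient frame-selection and ordering steps of SSIVD-Net are omitted; only the tiling idea is shown, with assumed toy shapes.

```python
import numpy as np

def super_image(frames, grid):
    """Super-image sketch: tile T video frames into an n x m mosaic so a
    2-D classifier can consume 3-D video data. The salient frame-ordering
    step of SSIVD-Net is omitted here."""
    n, m = grid
    assert frames.shape[0] == n * m, "frame count must fill the grid"
    rows = [np.concatenate(frames[r * m:(r + 1) * m], axis=1)
            for r in range(n)]
    return np.concatenate(rows, axis=0)

clip = np.arange(4 * 2 * 3).reshape(4, 2, 3)    # four tiny 2x3 "frames"
print(super_image(clip, grid=(2, 2)).shape)     # (4, 6)
```

Collapsing the temporal axis this way reduces dimensionality before classification, which is the complexity reduction the abstract describes.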