18 research outputs found

    Kervolutional Neural Networks

    Full text link
    Convolutional neural networks (CNNs) have enabled state-of-the-art performance in many computer vision tasks. However, little effort has been devoted to establishing convolution in non-linear space. Existing works mainly rely on activation layers, which can only provide point-wise non-linearity. To solve this problem, a new operation, kervolution (kernel convolution), is introduced to approximate complex behaviors of human perception systems by leveraging the kernel trick. It generalizes convolution, enhances the model capacity, and captures higher-order interactions of features via patch-wise kernel functions, but without introducing additional parameters. Extensive experiments show that kervolutional neural networks (KNN) achieve higher accuracy and faster convergence than baseline CNNs. Comment: oral paper at CVPR 2019
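    To make the idea concrete, here is a minimal NumPy sketch of a kervolution operation: the linear inner product between an image patch and a filter is replaced by a patch-wise kernel, such as a polynomial or Gaussian kernel. The function name `kervolution2d` and the kernel parameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def kervolution2d(image, weight, kernel="polynomial", cp=1.0, dp=3, gamma=0.5):
    """Minimal 2D kervolution on a single-channel image (valid padding, stride 1).

    Instead of the linear inner product <x, w> used by convolution, each
    patch/filter pair is compared with a non-linear kernel function.
    """
    kh, kw = weight.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    w = weight.ravel()
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            x = image[i:i + kh, j:j + kw].ravel()
            if kernel == "polynomial":        # (x.w + c_p)^d_p
                out[i, j] = (x @ w + cp) ** dp
            elif kernel == "gaussian":        # exp(-gamma * ||x - w||^2)
                out[i, j] = np.exp(-gamma * np.sum((x - w) ** 2))
            else:                             # plain convolution as the linear case
                out[i, j] = x @ w
    return out
```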

    Anomaly Detection And Classification In Time Series With Kervolutional Neural Networks

    Full text link
    Recently, with the development of deep learning, end-to-end neural network architectures have been increasingly applied to condition monitoring signals. They have demonstrated superior performance for fault detection and classification, in particular using convolutional neural networks. Even more recently, an extension of the concept of convolution to the concept of kervolution has been proposed, with promising results in image classification tasks. In this paper, we explore the potential of kervolutional neural networks applied to time series data. We demonstrate that using a mixture of convolutional and kervolutional layers improves model performance. The mixed model is first applied to a time series classification task on a benchmark dataset. Subsequently, the proposed mixed architecture is used to detect anomalies in time series data recorded by accelerometers on helicopters. We propose a residual-based anomaly detection approach using a temporal auto-encoder, and demonstrate that mixing kervolutional with convolutional layers in the encoder makes the model more sensitive to variations in the input data and better able to detect anomalous time series. Comment: 9 pages, 1 figure, 4 tables
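    A minimal sketch of the residual-based anomaly detection idea is given below, assuming a plain 1D convolutional auto-encoder as a stand-in for the paper's mixed kervolutional/convolutional encoder; the class name, layer sizes, and quantile-based thresholding are illustrative, not the paper's exact method.

```python
import torch
import torch.nn as nn

class TemporalAutoEncoder(nn.Module):
    """1D convolutional auto-encoder; the paper mixes kervolutional layers
    into the encoder, here plain Conv1d is used as a stand-in."""
    def __init__(self, channels=1):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(channels, 16, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(32, 16, kernel_size=5, stride=2,
                               padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose1d(16, channels, kernel_size=5, stride=2,
                               padding=2, output_padding=1),
        )

    def forward(self, x):                       # x: (batch, channels, length)
        return self.decoder(self.encoder(x))

def anomaly_scores(model, signals):
    """Residual-based score: mean squared reconstruction error per series."""
    with torch.no_grad():
        recon = model(signals)
        return ((signals - recon) ** 2).mean(dim=(1, 2))

# Usage sketch: flag series whose score exceeds a quantile of scores on healthy data.
# threshold = torch.quantile(anomaly_scores(model, healthy_batch), 0.99)
# is_anomalous = anomaly_scores(model, test_batch) > threshold
```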

    Analysis for Symptoms of Human Fall using Pre-Processing and Segmentation based on Deep Learning Architectures

    Get PDF
    By building sensor-based alert systems, physical therapists can not only decrease after-fall repercussions but even save lives. Older people are prone to several diseases, and falls are a regular occurrence for them. Various fall detection systems have recently been developed, with computer vision-based approaches being among the most promising and effective. Here, sensor-based data is analysed for a patient's human fall symptoms. The data is pre-processed using Gaussian filtering with a kernel neural network, in which the data is normalized and trained with a neural network. The trained, normalized data is then segmented using an encoded Stacked Deconvolutional Network (EnSt-DeConvNet). We found that the suggested method predicts such fall symptoms from sensor data with the highest accuracy, although the accuracy of other algorithms is fairly close. Experiments reveal that the suggested technique, when compared to other commonly utilized techniques on a multiple-camera fall dataset, produced reliable findings, and that our dataset, which consists of more training samples, produced even better results. Experimental results show an accuracy of 96%, precision of 94%, recall of 88%, F1-score of 82%, and computational time of 69%.
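    As a rough illustration of the described pre-processing (Gaussian filtering followed by normalization of sensor windows), the sketch below uses SciPy; the function name, smoothing parameter, and z-score normalization are assumptions and not the paper's exact pipeline.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def preprocess_sensor_window(window, sigma=2.0):
    """Hypothetical pre-processing for an accelerometer window of shape
    (time_steps, channels): Gaussian smoothing along time, then per-channel
    z-score normalization before the data is fed to the network."""
    smoothed = gaussian_filter1d(window, sigma=sigma, axis=0)   # denoise along time
    mean = smoothed.mean(axis=0, keepdims=True)
    std = smoothed.std(axis=0, keepdims=True) + 1e-8
    return (smoothed - mean) / std                              # normalize per channel
```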

    Kernelized dense layers for facial expression recognition

    Full text link
    The fully connected layer is an essential component of Convolutional Neural Networks (CNNs), which have demonstrated their efficiency in computer vision tasks. The CNN pipeline usually starts with convolution and pooling layers that first break the input images down into features and then analyze them independently. The result of this process feeds into a fully connected neural network structure that drives the final classification decision. In this paper, we propose a Kernelized Dense Layer (KDL) which captures higher-order feature interactions instead of conventional linear relations. We apply this method to Facial Expression Recognition (FER) and evaluate its performance on the RAF, FER2013 and ExpW datasets. The experimental results demonstrate the benefits of such a layer and show that our model achieves competitive results with respect to state-of-the-art approaches.
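    A minimal sketch of a kernelized dense layer follows, assuming a polynomial kernel applied to each unit's response in place of the usual linear map; the class name, kernel choice, and hyper-parameters are illustrative rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn

class KernelizedDense(nn.Module):
    """Dense layer whose unit responses come from a polynomial kernel
    k(x, w) = (x.w + c)^d instead of the purely linear map W x + b,
    so higher-order feature interactions are captured without extra parameters."""
    def __init__(self, in_features, out_features, degree=2, c=1.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.degree, self.c = degree, c

    def forward(self, x):                        # x: (batch, in_features)
        linear = x @ self.weight.t()             # same parameter count as nn.Linear
        return (linear + self.c) ** self.degree  # polynomial kernel response
```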

    Self-Organized Operational Neural Networks for Severe Image Restoration Problems

    Get PDF
    Discriminative learning based on convolutional neural networks (CNNs) aims to perform image restoration by learning from training examples of noisy-clean image pairs. It has become the go-to methodology for tackling image restoration and has outperformed the traditional non-local class of methods. However, the top-performing networks are generally composed of many convolutional layers and hundreds of neurons, with trainable parameters in excess of several million. We claim that this is due to the inherent linear nature of convolution-based transformation, which is inadequate for handling severe restoration problems. Recently, a non-linear generalization of CNNs, called operational neural networks (ONNs), has been shown to outperform CNNs on AWGN denoising. However, its formulation is burdened by a fixed collection of well-known non-linear operators and an exhaustive search to find the best possible configuration for a given architecture, whose efficacy is further limited by a fixed output-layer operator assignment. In this study, we leverage Taylor series-based function approximation to propose a self-organizing variant of ONNs, Self-ONNs, for image restoration, which synthesizes novel nodal transformations on the fly as part of the learning process, thus eliminating the need for redundant training runs for operator search. In addition, it enables a finer level of operator heterogeneity by diversifying individual connections of the receptive fields and weights. We perform a series of extensive ablation experiments across three severe image restoration tasks. Even when a strict equivalence of learnable parameters is imposed, Self-ONNs surpass CNNs by a considerable margin across all problems, improving the generalization performance by up to 3 dB in terms of PSNR.
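    The sketch below illustrates the self-organizing idea under one common reading of the Taylor series-based formulation: each power of the input gets its own learnable convolution kernel, so the nodal transformation is a learned polynomial expansion. The class name, expansion order, and layer sizes are assumptions, not the paper's code.

```python
import torch
import torch.nn as nn

class SelfONNConv2d(nn.Module):
    """Generative-neuron style layer sketch: a Taylor/Maclaurin-type expansion
    where the q-th power of the input is convolved with its own learnable
    kernel, so the non-linear nodal transformation is synthesized during training."""
    def __init__(self, in_ch, out_ch, kernel_size=3, q_order=3):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, kernel_size,
                      padding=kernel_size // 2, bias=(q == 0))
            for q in range(q_order)
        ])

    def forward(self, x):
        # sum_q W_q * x^(q+1); the q = 0 branch recovers an ordinary convolution
        return sum(conv(x ** (q + 1)) for q, conv in enumerate(self.branches))
```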

    Graph Optimization Approach to Range-based Localization

    Full text link
    In this paper, we propose a general graph optimization based framework for localization, which can accommodate different types of measurements with varying measurement time intervals. Special emphasis is placed on range-based localization. Range and trajectory smoothness constraints are constructed in a position graph, and the robot trajectory over a sliding window is then estimated by a graph-based optimization algorithm. Moreover, a convergence analysis of the algorithm is provided, and the effects of the number of iterations and the window size on the localization accuracy are analyzed. Extensive experiments on a quadcopter under a variety of scenarios verify the effectiveness of the proposed algorithm and demonstrate much higher localization accuracy than existing range-based localization methods, especially in the altitude direction.
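    For illustration, a sliding-window position-graph estimate of this kind can be posed as a non-linear least-squares problem that stacks range residuals with trajectory-smoothness residuals; the sketch below uses SciPy, and the function name, weighting, and anchor-based range model are assumptions rather than the paper's implementation.

```python
import numpy as np
from scipy.optimize import least_squares

def estimate_window(anchors, ranges, p0, smooth_w=1.0):
    """Sliding-window position-graph sketch.

    anchors: (M, 3) known anchor positions
    ranges:  (T, M) measured distances to each anchor per time step
    p0:      (T, 3) initial trajectory guess for the window
    """
    T = ranges.shape[0]

    def residuals(flat):
        P = flat.reshape(T, 3)
        # range constraints: ||p_t - a_i|| - r_{t,i}
        res_range = (np.linalg.norm(P[:, None, :] - anchors[None], axis=2) - ranges).ravel()
        # smoothness constraints between consecutive positions
        res_smooth = np.sqrt(smooth_w) * np.diff(P, axis=0).ravel()
        return np.concatenate([res_range, res_smooth])

    sol = least_squares(residuals, p0.ravel())
    return sol.x.reshape(T, 3)
```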

    SSIVD-Net: A Novel Salient Super Image Classification & Detection Technique for Weaponized Violence

    Full text link
    Detection of violence and weaponized violence in closed-circuit television (CCTV) footage requires a comprehensive approach. In this work, we introduce the \emph{Smart-City CCTV Violence Detection (SCVD)} dataset, specifically designed to facilitate the learning of weapon distribution in surveillance videos. To tackle the complexities of analyzing 3D surveillance video for violence recognition tasks, we propose a novel technique called \emph{SSIVD-Net} (\textbf{S}alient-\textbf{S}uper-\textbf{I}mage for \textbf{V}iolence \textbf{D}etection). Our method reduces 3D video data complexity, dimensionality, and information loss while improving inference, performance, and explainability through the use of Salient-Super-Image representations. Considering the scalability and sustainability requirements of futuristic smart cities, we also introduce the \emph{Salient-Classifier}, a novel architecture combining a kernelized approach with a residual learning strategy. We evaluate variations of SSIVD-Net and the Salient-Classifier on our SCVD dataset and benchmark against state-of-the-art (SOTA) models commonly employed in violence detection. Our approach exhibits significant improvements in detecting both weaponized and non-weaponized violence instances. By advancing the SOTA in violence detection, our work offers a practical and scalable solution suitable for real-world applications. The proposed methodology not only addresses the challenges of violence detection in CCTV footage but also contributes to the understanding of weapon distribution in smart surveillance. Ultimately, our research findings should enable smarter and more secure cities, as well as enhance public safety measures. Comment: 5 tables, 3 figures
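    One minimal reading of the super-image idea is to tile sampled video frames into a single 2D grid so that a 2D classifier can replace a 3D video model; the sketch below shows only that tiling step, and the function name and grid layout are illustrative, not the paper's exact Salient-Super-Image construction.

```python
import numpy as np

def frames_to_super_image(frames, grid=(4, 4)):
    """Arrange T sampled video frames (T, H, W, C) into one 2D "super image"
    of shape (rows*H, cols*W, C), reducing the 3D video input to a 2D image."""
    rows, cols = grid
    T, H, W, C = frames.shape
    assert T == rows * cols, "sample exactly rows*cols frames"
    return (frames.reshape(rows, cols, H, W, C)
                  .transpose(0, 2, 1, 3, 4)     # interleave rows of frames with image rows
                  .reshape(rows * H, cols * W, C))
```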