13 research outputs found

    Learning Distributions via Monte-Carlo Marginalization

    Full text link
    We propose a novel method to learn intractable distributions from their samples. The main idea is to approximate the intractable distribution with a parametric model, such as a Gaussian Mixture Model (GMM), by minimizing the KL-divergence. Two challenges must be addressed to make this work. First, the computational cost of the KL-divergence becomes unacceptable as the dimensionality of the distribution increases; we propose Monte-Carlo Marginalization (MCMarg) to address this issue. Second, the optimization process must remain differentiable even though the target distribution is intractable; we handle this by using Kernel Density Estimation (KDE). The proposed approach is a powerful tool for learning complex distributions, and the entire process is differentiable, so it can serve as a substitute for variational inference in variational auto-encoders (VAEs). One strong piece of evidence for the benefit of our method is that distributions learned by the proposed approach generate better images even when decoded by a pre-trained VAE's decoder. Building on this observation, we devise a distribution-learning auto-encoder that outperforms the VAE under the same network architecture. Experiments on standard datasets and synthetic data demonstrate the efficiency of the proposed approach.
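
    The overall recipe described here — approximate an unknown density with a GMM by minimizing a Monte-Carlo estimate of the KL-divergence, with KDE supplying a differentiable stand-in for the intractable target — can be sketched as follows. This is a minimal 1-D illustration, not the authors' MCMarg algorithm (which marginalizes along random directions); the bimodal target, bandwidth, and optimizer settings are assumptions.

```python
import math
import torch

torch.manual_seed(0)

# Samples from an "intractable" target (a bimodal mixture, used only to draw data).
samples = torch.cat([0.5 * torch.randn(500) - 2.0, 0.8 * torch.randn(500) + 1.5])

def kde_log_density(x, data, bandwidth=0.3):
    """Differentiable Gaussian KDE estimate of log p(x) built from the samples."""
    z = (x[:, None] - data[None, :]) / bandwidth
    log_k = -0.5 * z ** 2 - 0.5 * math.log(2 * math.pi) - math.log(bandwidth)
    return torch.logsumexp(log_k, dim=1) - math.log(data.numel())

# Trainable parameters of a K-component 1-D GMM.
K = 2
means = torch.randn(K, requires_grad=True)
log_stds = torch.zeros(K, requires_grad=True)
logits = torch.zeros(K, requires_grad=True)

def gmm_log_density(x):
    log_w = torch.log_softmax(logits, dim=0)
    z = (x[:, None] - means[None, :]) / log_stds.exp()
    log_comp = -0.5 * z ** 2 - log_stds - 0.5 * math.log(2 * math.pi)
    return torch.logsumexp(log_w + log_comp, dim=1)

log_p_target = kde_log_density(samples, samples)   # fixed KDE target density
opt = torch.optim.Adam([means, log_stds, logits], lr=0.05)
for step in range(500):
    # Monte-Carlo estimate of KL(p_kde || q_gmm) over the observed samples.
    kl = (log_p_target - gmm_log_density(samples)).mean()
    opt.zero_grad()
    kl.backward()
    opt.step()
```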

    Affine-Transformation-Invariant Image Classification by Differentiable Arithmetic Distribution Module

    Full text link
    Although Convolutional Neural Networks (CNNs) have achieved promising results in image classification, they remain vulnerable to affine transformations such as rotation, translation, flipping, and shuffling. This drawback motivates us to design a module that can alleviate the impact of such transformations. In this work, we introduce a more robust substitute by incorporating distribution learning techniques, focusing in particular on learning the spatial distribution information of pixels in images. To rectify the non-differentiability of prior distribution learning methods that rely on traditional histograms, we adopt Kernel Density Estimation (KDE) to formulate differentiable histograms. On this foundation, we present a novel Differentiable Arithmetic Distribution Module (DADM), designed to extract the intrinsic probability distributions from images. The proposed approach enhances the model's robustness to affine transformations without sacrificing its feature extraction capabilities, thus bridging the gap between traditional CNNs and distribution-based learning. We validate the effectiveness of the proposed approach through an ablation study and comparative experiments with LeNet.
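
    The key building block mentioned here — a histogram made differentiable by replacing hard bin counts with Gaussian kernel weights (KDE) — can be sketched as below. This is a minimal illustration of a KDE-based soft histogram, not the full DADM; the bin count, bandwidth, and the [0, 1] intensity range are assumptions.

```python
import torch

def soft_histogram(x, bins=16, bandwidth=0.05):
    """Differentiable histogram of values in [0, 1]: each value spreads a
    Gaussian kernel weight over the bin centres instead of a hard count."""
    centers = torch.linspace(0.0, 1.0, bins)
    weights = torch.exp(-0.5 * ((x.reshape(-1, 1) - centers) / bandwidth) ** 2)
    hist = weights.sum(dim=0)
    return hist / hist.sum()

img = torch.rand(28, 28, requires_grad=True)
h_orig = soft_histogram(img)
h_flip = soft_histogram(torch.flip(img, dims=[1]))   # horizontal flip
print(torch.allclose(h_orig, h_flip))                # True: invariant to pixel positions
h_orig[0].backward()                                 # gradients flow back to the pixels
```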

    Is Deep Learning Network Necessary for Image Generation?

    Full text link
    Recently, images have come to be regarded as samples from a high-dimensional distribution, and deep learning has become almost synonymous with image generation. However, is a deep learning network truly necessary for image generation? In this paper, we investigate the possibility of generating images without a deep learning network, motivated by the goal of validating the assumption that images follow a high-dimensional distribution. Since images are assumed to be samples from such a distribution, we use a Gaussian Mixture Model (GMM) to describe it. In particular, we employ a recent distribution learning technique named Monte-Carlo Marginalization to capture the parameters of the GMM from image samples. Moreover, we use Singular Value Decomposition (SVD) for dimensionality reduction to decrease computational complexity. In our evaluation, we first attempt to model the distribution of image samples directly, to verify the assumption that images truly follow a distribution. We then apply SVD and learn the distribution of the principal components rather than the raw image data. Compared to methods relying on deep learning networks, our approach is more explainable, and its performance is promising. Experiments show that our images have a lower FID value than those generated by variational auto-encoders, demonstrating the feasibility of image generation without deep learning networks. Comment: This paper has been rejected. I am planning to combine it with another of my papers to make one stronger paper.
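
    The pipeline described above — project images onto a low-dimensional basis with SVD, learn a GMM over the principal components, sample new components, and map them back to pixel space — can be illustrated with off-the-shelf tools. This sketch uses scikit-learn's EM-based GaussianMixture on the small 8x8 digits set rather than the paper's Monte-Carlo Marginalization and datasets; the component counts are assumptions.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import TruncatedSVD
from sklearn.mixture import GaussianMixture

X = load_digits().data / 16.0                    # (1797, 64) images as vectors in [0, 1]

svd = TruncatedSVD(n_components=20, random_state=0)
Z = svd.fit_transform(X)                         # principal components, not raw pixels

gmm = GaussianMixture(n_components=30, covariance_type="full", random_state=0)
gmm.fit(Z)                                       # distribution learning in the reduced space

Z_new, _ = gmm.sample(16)                        # draw new codes from the learned GMM
X_new = svd.inverse_transform(Z_new).clip(0, 1)  # map back to pixel space
X_new = X_new.reshape(-1, 8, 8)                  # sixteen "generated" 8x8 images
```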

    Vehicle Detection Research Based on USILTP Operator

    No full text
    This paper presents a uniform SILTP (USILTP) operator based on the SILTP operator. In vehicle detection, SILTP can handle the problems caused by changes in sunlight, vehicle shadows, and noise in the surrounding environment, but its high dimensionality can lead to errors due to deviations in the texture characteristics. The USILTP operator reduces the dimensionality of the detection data while remaining robust to illumination variations and environmental noise. First, the method uses the SILTP operator to extract the texture characteristics of the vehicle image and reduce the dimensionality of the detection data; it then uses a Gaussian mixture model for background modeling and uses the texture characteristics of each new image to update the background dynamically. Finally, the vehicle is obtained by comparing the current frame against the background model. Tests on vehicles on public roads show that the detection algorithm performs well.
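
    The SILTP texture operator at the core of this pipeline compares each neighbour against a tolerance band around the centre pixel, which is what makes it robust to multiplicative illumination changes. Below is a minimal sketch over the 4-neighbourhood (the operator is commonly defined over 8 neighbours, and the uniform-pattern reduction of USILTP is not shown); the tolerance tau is an assumption.

```python
import numpy as np

def siltp(img, tau=0.05):
    """Scale Invariant Local Ternary Pattern over the 4-neighbourhood.
    Each neighbour contributes 2 bits: 01 if brighter than (1+tau)*centre,
    10 if darker than (1-tau)*centre, 00 otherwise."""
    img = img.astype(np.float32)
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    code = np.zeros((h, w), dtype=np.int64)
    shifts = [(-1, 0), (1, 0), (0, -1), (0, 1)]          # up, down, left, right
    for k, (dy, dx) in enumerate(shifts):
        neigh = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        bits = np.where(neigh > img * (1.0 + tau), 1,
                        np.where(neigh < img * (1.0 - tau), 2, 0))
        code |= bits << (2 * k)                           # pack 2 bits per neighbour
    return code
```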

    Frequency Regularization: Reducing Information Redundancy in Convolutional Neural Networks

    No full text
    Convolutional neural networks have demonstrated impressive results in many computer vision tasks. However, the increasing size of these networks raises concerns about the information redundancy carried by the large number of network parameters. In this paper, we propose Frequency Regularization to restrict the non-zero elements of the network parameters in the frequency domain. The proposed approach operates at the tensor level and can be applied to almost all network architectures. Specifically, the parameter tensors are maintained in the frequency domain, where high-frequency components can be eliminated by zeroing tensor elements in zigzag order. The inverse discrete cosine transform (IDCT) is then used to reconstruct the spatial tensors for the matrix operations during network training. Since high-frequency components of images are known to be less critical, a large proportion of these parameters can be set to zero when networks are trained with the proposed frequency regularization. Comprehensive evaluations on various state-of-the-art network architectures, including LeNet, AlexNet, VGG, ResNet, ViT, UNet, GAN, and VAE, demonstrate the effectiveness of the proposed frequency regularization. For a very small accuracy decrease (less than 2%), a LeNet-5 with 0.4M parameters can be represented by only 776 float16 numbers (over a 1100× reduction), and a UNet with 34M parameters can be represented by only 759 float16 numbers (over an 80000× reduction). In particular, the original size of the UNet model is reduced from 366 MB to 4.5 KB.
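
    The core operation described here — keep only the leading coefficients of a parameter tensor's DCT in zigzag order, zero the rest, and reconstruct the spatial tensor with the IDCT — can be sketched as follows. This illustrates only the masking and reconstruction step, not the full training procedure; the zigzag order is approximated by sorting indices on i + j, and the tensor shape and `keep` count are assumptions.

```python
import numpy as np
from scipy.fft import dctn, idctn

def zigzag_mask(shape, keep):
    """Binary mask keeping the first `keep` positions in (approximate) zigzag order."""
    order = sorted(np.ndindex(*shape), key=lambda ij: (ij[0] + ij[1], ij[0]))
    mask = np.zeros(shape)
    for ij in order[:keep]:
        mask[ij] = 1.0
    return mask

# A weight tensor is kept in the frequency domain; only `keep` low-frequency
# coefficients stay non-zero, and the IDCT reconstructs the spatial weights.
rng = np.random.default_rng(0)
freq_weights = dctn(rng.standard_normal((32, 32)), norm="ortho")
freq_weights *= zigzag_mask((32, 32), keep=64)       # 64 numbers describe a 32x32 tensor
spatial_weights = idctn(freq_weights, norm="ortho")  # used for the forward pass
```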

    Background subtraction based on random superpixels under multiple scales for video analytics

    No full text
    Background subtraction is a fundamental problem in computer vision and is usually the first step of video analytics, extracting the regions of interest. Most previously available region-based background subtraction methods ignore the similarity between pixels, so information from pixels that contribute nothing, or even contribute negatively, to understanding an image is still taken into account. A new background subtraction model based on random superpixel segmentation under multiple scales is proposed. A custom region segmentation area is replaced with a superpixel segmentation area that exploits the similarity of pixels within each superpixel. Because pixels in the same superpixel are compact, they contribute positively to understanding the image compared with pixels in a custom region. Superpixel segmentation is performed using a random simple linear iterative clustering (SLIC) method. Taking random samples during the superpixel segmentation process produces the Matthew effect, improving the robustness and efficiency of the model. Multi-scale superpixel segmentation further guarantees more accurate results. Standard benchmark experiments with the proposed approach produced encouraging results compared with those of previously available algorithms.
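
    A simplified version of the region idea — segment each frame into superpixels and flag those whose statistics deviate from a background model — can be sketched with scikit-image's SLIC. This uses plain SLIC on float RGB frames in [0, 1] and a single reference background image in place of the paper's random, multi-scale superpixel sampling and dynamically updated model; the segment counts and threshold are assumptions.

```python
import numpy as np
from skimage.segmentation import slic

def superpixel_foreground(frame, background, n_segments=300, thresh=0.1):
    """Flag superpixels of `frame` whose mean colour difference from
    `background` exceeds `thresh`; frames are float RGB images in [0, 1]."""
    labels = slic(frame, n_segments=n_segments, compactness=10, start_label=0)
    diff = np.abs(frame - background).mean(axis=2)   # per-pixel colour difference
    mask = np.zeros(frame.shape[:2], dtype=bool)
    for k in range(labels.max() + 1):
        region = labels == k
        if diff[region].mean() > thresh:
            mask[region] = True
    return mask

def multiscale_foreground(frame, background, scales=(100, 300, 900)):
    """Combine masks from several superpixel scales by majority vote."""
    votes = sum(superpixel_foreground(frame, background, n_segments=s).astype(int)
                for s in scales)
    return votes >= len(scales) // 2 + 1
```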