147 research outputs found

    Maxmin convolutional neural networks for image classification

    Convolutional neural networks (CNN) are widely used in computer vision, especially in image classification. However, the way in which information and invariance properties are encoded in deep CNN architectures is still an open question. In this paper, we propose to modify the standard convolutional block of CNN in order to transfer more information layer after layer while keeping some invariance within the network. Our main idea is to exploit both positive and negative high scores obtained in the convolution maps. This behavior is obtained by modifying the traditional activation function step before pooling. We double the maps with specific activation functions, called the MaxMin strategy, in order to achieve our pipeline. Extensive experiments on two classical datasets, MNIST and CIFAR-10, show that our deep MaxMin convolutional net outperforms standard CNN.
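
A minimal sketch of the map-doubling idea described in this abstract. The abstract does not spell out the exact activations, so this assumes one plausible reading: a ReLU applied to the convolution map and to its negation, each followed by 2x2 max pooling. The function names and the toy map are hypothetical.

```python
def maxmin_activation(feature_map):
    """Double one convolution map into a positive and a negative branch.

    Assumption: the positive branch is ReLU(x) and the negative branch is
    ReLU(-x), so strong negative responses survive the activation step.
    """
    positive = [[max(v, 0.0) for v in row] for row in feature_map]
    negative = [[max(-v, 0.0) for v in row] for row in feature_map]
    return positive, negative


def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling (assumes even height and width)."""
    pooled = []
    for r in range(0, len(feature_map), 2):
        pooled.append([
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, len(feature_map[r]), 2)
        ])
    return pooled


# Toy 4x4 convolution output containing strong negative responses.
conv_map = [[3.0, -1.0, 0.5, -4.0],
            [-2.0, 0.0, 1.0, -0.5],
            [0.2, -0.3, -6.0, 2.0],
            [-0.1, 0.4, 1.5, -2.5]]

pos, neg = maxmin_activation(conv_map)
pooled_pos = max_pool_2x2(pos)  # keeps the largest positive scores
pooled_neg = max_pool_2x2(neg)  # keeps the largest negative scores (as magnitudes)
print(pooled_pos)
print(pooled_neg)
```

With a plain ReLU followed by pooling, the responses -4.0 and -6.0 in the toy map would be discarded; in this sketch they reach the next layer through the second branch.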

    Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks

    It is a highly desirable property for deep networks to be robust against small input changes. One popular way to achieve this property is by designing networks with a small Lipschitz constant. In this work, we propose a new technique for constructing such Lipschitz networks that has a number of desirable properties: it can be applied to any linear network layer (fully-connected or convolutional), it provides formal guarantees on the Lipschitz constant, it is easy to implement and efficient to run, and it can be combined with any training objective and optimization method. In fact, our technique is the first one in the literature that achieves all of these properties simultaneously. Our main contribution is a rescaling-based weight matrix parametrization that guarantees each network layer to have a Lipschitz constant of at most 1 and results in learned weight matrices that are close to orthogonal. Hence we call such layers almost-orthogonal Lipschitz (AOL). Experiments and ablation studies in the context of image classification with certified robust accuracy confirm that AOL layers achieve results that are on par with most existing methods. Yet, they are simpler to implement and more broadly applicable, because they do not require computationally expensive matrix orthogonalization or inversion steps as part of the network architecture. We provide code at https://github.com/berndprach/AOL. Comment: Corrected the results from competitor ECO. Corrected a typo in the loss function equation.
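
A rough sketch of what a rescaling-based parametrization of this kind could look like for a fully-connected layer. The abstract does not state the formula. The sketch assumes the column rescaling d_i = (sum_j |W^T W|_ij)^(-1/2), which should be checked against the paper and the linked repository before relying on it. The helper names and the example matrix are hypothetical, and the spectral norm is estimated with plain power iteration.

```python
import math


def aol_rescale(weight):
    """Rescale the columns of `weight` (a list of rows).

    Assumed rule: column i is multiplied by (sum_j |W^T W|_ij) ** -0.5,
    or by 0 when that sum is zero.
    """
    rows, cols = len(weight), len(weight[0])
    gram = [[sum(weight[k][i] * weight[k][j] for k in range(rows))
             for j in range(cols)] for i in range(cols)]
    scale = []
    for i in range(cols):
        total = sum(abs(g) for g in gram[i])
        scale.append(1.0 / math.sqrt(total) if total > 0.0 else 0.0)
    return [[weight[k][i] * scale[i] for i in range(cols)] for k in range(rows)]


def spectral_norm(matrix, iterations=500):
    """Estimate the largest singular value by power iteration on M^T M."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 + 0.1 * i for i in range(cols)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        u = [sum(matrix[k][i] * v[i] for i in range(cols)) for k in range(rows)]
        v = [sum(matrix[k][i] * u[k] for k in range(rows)) for i in range(cols)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in v]
    u = [sum(matrix[k][i] * v[i] for i in range(cols)) for k in range(rows)]
    return math.sqrt(sum(x * x for x in u))


W = [[2.0, 1.0],
     [0.0, 3.0],
     [1.0, -1.0]]
rescaled = aol_rescale(W)
print(spectral_norm(W))         # well above 1 for this example
print(spectral_norm(rescaled))  # at most 1, up to floating-point error
```

The spectral norm of a linear layer is its Lipschitz constant with respect to the Euclidean norm, so the second printed value is the quantity the abstract's guarantee refers to. No orthogonalization or matrix inversion is needed, only one Gram matrix and a square root per column.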

    ROAD: Reality Oriented Adaptation for Semantic Segmentation of Urban Scenes

    Exploiting synthetic data to learn deep models has attracted increasing attention in recent years. However, the intrinsic domain difference between synthetic and real images usually causes a significant performance drop when applying the learned model to real world scenarios. This is mainly due to two reasons: 1) the model overfits to synthetic images, making the convolutional filters incompetent to extract informative representations for real images; 2) there is a distribution difference between synthetic and real data, which is also known as the domain adaptation problem. To this end, we propose a new reality oriented adaptation approach for urban scene semantic segmentation by learning from synthetic data. First, we propose a target guided distillation approach to learn the real image style, which is achieved by training the segmentation model to imitate a pretrained real style model using real images. Second, we further take advantage of the intrinsic spatial structure present in urban scene images, and propose a spatial-aware adaptation scheme to effectively align the distribution of the two domains. These two modules can be readily integrated with existing state-of-the-art semantic segmentation networks to improve their generalizability when adapting from synthetic to real urban scenes. We evaluate the proposed method on the Cityscapes dataset by adapting from the GTAV and SYNTHIA datasets, where the results demonstrate the effectiveness of our method. Comment: Add experiments on SYNTHIA, CVPR 2018 camera-ready version

    Popularity prediction on Instagram using machine learning

    In the last year, research into new ways of using Machine Learning for human benefit has grown exponentially. At the same time, our society is evolving toward new ways of communication and social interaction, and Instagram is one of the faces of this change. Prediction systems and Artificial Intelligence are topics that are exploding right now. Companies are investing in challenges and experts to predict which product a user will want, which music they will like, or which film they will want to watch. Social Media opens a new and challenging branch of this work: predicting the popularity of pictures, which one will be the most popular of the day, or which crop makes a selfie more popular. This project tries to put these two realities together. The goal is to predict how many likes a post will get before it is posted on Instagram, which is only possible thanks to Machine Learning. Machine Learning is the idea that we can train computers to identify patterns in data and then use those patterns to make predictions on new data. We give samples of posts to our machine so that it can find patterns and then predict a result for a new input based on the patterns it has learned. The system consists of three sections. First, the input picture is classified into one category depending on its theme, using a retrained model based on Deep Learning. In this project we take into account six categories: Animals, Food, Friends, Landscape, Quote and Selfie. Second, the input picture is compared with a set of 200 pictures from the selected category that already have a score between 0 and 1, using another retrained model based on Deep Learning. An algorithm makes the comparison and produces a histogram; the maximum point of the histogram is the computed score of the input picture. Third, we use the computed score from the second section as a variable in a regression in order to get the final prediction of likes. Other variables are taken into account in this regression as well. As the description of the system shows, without Machine Learning it would be impossible, even for a human being, to identify all the necessary patterns and predict accurately from new data. Humans can typically create one or two good models a week; machine learning can create thousands of models a week.
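
The second stage of this pipeline (comparison against scored reference pictures, then taking the histogram maximum) might be sketched as follows. The abstract does not say how the comparison fills the histogram, so this assumes each reference picture votes for its own score bin with a weight equal to its similarity to the input. The function name, the bin count, and the toy numbers are hypothetical.

```python
def histogram_peak_score(reference_scores, similarities, bins=10):
    """Return the centre of the heaviest histogram bin.

    reference_scores: scores in [0, 1] of the reference pictures.
    similarities: assumed similarity of each reference picture to the input,
        used here as the vote weight (an assumption, not the thesis method).
    """
    weights = [0.0] * bins
    for score, similarity in zip(reference_scores, similarities):
        index = min(int(score * bins), bins - 1)  # a score of 1.0 goes in the last bin
        weights[index] += similarity
    best_bin = max(range(bins), key=lambda b: weights[b])
    return (best_bin + 0.5) / bins


# Five reference pictures instead of the 200 mentioned in the abstract.
reference_scores = [0.12, 0.18, 0.55, 0.58, 0.91]
similarities = [0.2, 0.1, 0.9, 0.8, 0.3]
computed_score = histogram_peak_score(reference_scores, similarities)
print(computed_score)
```

In the described system this computed score would then enter the third-stage regression as one variable among several.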

    Frugal Satellite Image Change Detection with Deep-Net Inversion

    Change detection in satellite imagery seeks to find occurrences of targeted changes in a given scene taken at different instants. This task has several applications ranging from land-cover mapping to anthropogenic activity monitoring, as well as climate change and natural hazard damage assessment. However, change detection is highly challenging due to the acquisition conditions and also to the subjectivity of changes. In this paper, we devise a novel algorithm for change detection based on active learning. The proposed method is based on a question and answer model that probes an oracle (user) about the relevance of changes only on a small set of critical images (referred to as virtual exemplars) and, according to the oracle's responses, updates deep neural network (DNN) classifiers. The main contribution resides in a novel adversarial model that allows learning the most representative, diverse and uncertain virtual exemplars (as inverted preimages of the trained DNNs) that challenge the trained DNNs the most, and this leads to a better re-estimate of these networks in the subsequent iterations of active learning. Experiments show that our proposed deep-net inversion outperforms related work. Comment: arXiv admin note: text overlap with arXiv:2212.1397

    Adversarial Virtual Exemplar Learning for Label-Frugal Satellite Image Change Detection

    Satellite image change detection aims at finding occurrences of targeted changes in a given scene taken at different instants. This task is highly challenging due to the acquisition conditions and also to the subjectivity of changes. In this paper, we investigate satellite image change detection using active learning. Our method is interactive and relies on a question and answer model which asks the oracle (user) questions about the most informative display (dubbed virtual exemplars) and, according to the user's responses, updates change detections. The main contribution of our method consists in a novel adversarial model that allows frugally probing the oracle with only the most representative, diverse and uncertain virtual exemplars. The latter are learned to challenge the trained change decision criteria as much as possible, which ultimately leads to a better re-estimate of these criteria in the following iterations of active learning. Experiments show that our proposed adversarial display model outperforms other display strategies as well as related work. Comment: arXiv admin note: substantial text overlap with arXiv:2203.1155

    Frugal Reinforcement-based Active Learning

    Most of the existing learning models, particularly deep neural networks, are reliant on large datasets whose hand-labeling is expensive and time-consuming. A current trend is to make the learning of these models frugal and less dependent on large collections of labeled data. Among the existing solutions, deep active learning is currently attracting major interest; its purpose is to train deep networks using as few labeled samples as possible. However, the success of active learning is highly dependent on how critical these samples are when training models. In this paper, we devise a novel active learning approach for label-efficient training. The proposed method is iterative and aims at minimizing a constrained objective function that mixes diversity, representativity and uncertainty criteria. The proposed approach is probabilistic and unifies all these criteria in a single objective function whose solution models the probability of relevance of samples (i.e., how critical they are) when learning a decision function. We also introduce a novel weighting mechanism based on reinforcement learning, which adaptively balances these criteria at each training iteration, using a particular stateless Q-learning model. Extensive experiments conducted on staple image classification data, including Object-DOTA, show the effectiveness of our proposed model w.r.t. several baselines including random, uncertainty and flat as well as other work. Comment: arXiv admin note: text overlap with arXiv:2203.1156
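
A hedged sketch of what a stateless Q-learning loop for balancing criteria might look like. It is not the paper's model. It assumes that each action is one candidate weighting of the diversity, representativity and uncertainty criteria, that the reward is some per-iteration gain (hard-coded toy values here), and that a fixed round-robin exploration schedule stands in for epsilon-greedy so the run is reproducible. All names and numbers are hypothetical.

```python
def stateless_q_learning(actions, reward_fn, steps=1000, lr=0.5, gamma=0.5,
                         explore_every=5):
    """Q-learning with a single implicit state: one Q-value per action."""
    q = [0.0] * len(actions)
    for step in range(steps):
        if step % explore_every == 0:
            # Deterministic exploration: cycle through all actions in turn.
            a = (step // explore_every) % len(actions)
        else:
            # Greedy choice; ties go to the lowest index.
            a = max(range(len(actions)), key=lambda i: q[i])
        reward = reward_fn(actions[a])
        # Stateless update: the "next state" is the same state, so the
        # bootstrap term is the current maximum over all Q-values.
        q[a] += lr * (reward + gamma * max(q) - q[a])
    return q


# Hypothetical candidate weightings of the three criteria.
ACTIONS = ["diversity-heavy", "balanced", "uncertainty-heavy"]
# Toy rewards standing in for a measured gain after each training iteration.
TOY_REWARDS = {"diversity-heavy": 0.1, "balanced": 0.5, "uncertainty-heavy": 0.2}

q_values = stateless_q_learning(ACTIONS, TOY_REWARDS.get)
best_action = ACTIONS[max(range(len(ACTIONS)), key=lambda i: q_values[i])]
print(best_action, q_values)
```

With constant rewards the Q-values settle at a fixed point that preserves the reward ordering, so the greedy choice ends up on the weighting with the highest toy reward. In an active learning loop the reward would change from iteration to iteration, which is what lets the balance adapt.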

    1-Lipschitz Neural Networks are more expressive with N-Activations

    A crucial property for achieving secure, trustworthy and interpretable deep learning systems is their robustness: small changes to a system's inputs should not result in large changes to its outputs. Mathematically, this means one strives for networks with a small Lipschitz constant. Several recent works have focused on how to construct such Lipschitz networks, typically by imposing constraints on the weight matrices. In this work, we study an orthogonal aspect, namely the role of the activation function. We show that commonly used activation functions, such as MaxMin, as well as all piece-wise linear ones with two segments, unnecessarily restrict the class of representable functions, even in the simplest one-dimensional setting. We furthermore introduce the new N-activation function that is provably more expressive than currently popular activation functions. We provide code at https://github.com/berndprach/NActivation
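
For orientation, a small sketch of the MaxMin activation named in this abstract, assuming its usual definition in the Lipschitz-network literature: sort each consecutive pair of inputs into (max, min). The abstract does not define the N-activation, so it is not reproduced here. The function names and the test vectors are hypothetical.

```python
import math


def maxmin(values):
    """Apply MaxMin to consecutive pairs: (a, b) -> (max(a, b), min(a, b))."""
    if len(values) % 2 != 0:
        raise ValueError("MaxMin needs an even number of inputs")
    output = []
    for i in range(0, len(values), 2):
        a, b = values[i], values[i + 1]
        output.extend([max(a, b), min(a, b)])
    return output


def l2_distance(u, v):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


x = [0.3, -1.2, 2.0, 0.5]
y = [0.1, -0.2, 1.9, 2.5]

# 1-Lipschitz check on one pair of inputs: outputs are no further apart
# than the inputs were.
print(l2_distance(maxmin(x), maxmin(y)), "<=", l2_distance(x, y))
```

MaxMin only reorders the entries within each pair, so it preserves the norm of its input and never increases distances, which is why it pairs naturally with weight-constrained layers. The abstract's point is that this property alone does not make the activation maximally expressive.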