103 research outputs found

    Motion Invariance in Visual Environments

    Full text link
    The puzzle of computer vision might find new challenging solutions when we realize that most successful methods are working at image level, which is remarkably more difficult than processing directly visual streams, just as happens in nature. In this paper, we claim that their processing naturally leads to formulate the motion invariance principle, which enables the construction of a new theory of visual learning based on convolutional features. The theory addresses a number of intriguing questions that arise in natural vision, and offers a well-posed computational scheme for the discovery of convolutional filters over the retina. They are driven by the Euler-Lagrange differential equations derived from the principle of least cognitive action, that parallels laws of mechanics. Unlike traditional convolutional networks, which need massive supervision, the proposed theory offers a truly new scenario in which feature learning takes place by unsupervised processing of video signals. An experimental report of the theory is presented where we show that features extracted under motion invariance yield an improvement that can be assessed by measuring information-based indexes.Comment: arXiv admin note: substantial text overlap with arXiv:1801.0711

    Toward Improving the Evaluation of Visual Attention Models: a Crowdsourcing Approach

    Full text link
    Human visual attention is a complex phenomenon. A computational modeling of this phenomenon must take into account where people look in order to evaluate which are the salient locations (spatial distribution of the fixations), when they look in those locations to understand the temporal development of the exploration (temporal order of the fixations), and how they move from one location to another with respect to the dynamics of the scene and the mechanics of the eyes (dynamics). State-of-the-art models focus on learning saliency maps from human data, a process that only takes into account the spatial component of the phenomenon and ignore its temporal and dynamical counterparts. In this work we focus on the evaluation methodology of models of human visual attention. We underline the limits of the current metrics for saliency prediction and scanpath similarity, and we introduce a statistical measure for the evaluation of the dynamics of the simulated eye movements. While deep learning models achieve astonishing performance in saliency prediction, our analysis shows their limitations in capturing the dynamics of the process. We find that unsupervised gravitational models, despite of their simplicity, outperform all competitors. Finally, exploiting a crowd-sourcing platform, we present a study aimed at evaluating how strongly the scanpaths generated with the unsupervised gravitational models appear plausible to naive and expert human observers

    The KANDY Benchmark: Incremental Neuro-Symbolic Learning and Reasoning with Kandinsky Patterns

    Full text link
    Artificial intelligence is continuously seeking novel challenges and benchmarks to effectively measure performance and to advance the state-of-the-art. In this paper we introduce KANDY, a benchmarking framework that can be used to generate a variety of learning and reasoning tasks inspired by Kandinsky patterns. By creating curricula of binary classification tasks with increasing complexity and with sparse supervisions, KANDY can be used to implement benchmarks for continual and semi-supervised learning, with a specific focus on symbol compositionality. Classification rules are also provided in the ground truth to enable analysis of interpretable solutions. Together with the benchmark generation pipeline, we release two curricula, an easier and a harder one, that we propose as new challenges for the research community. With a thorough experimental evaluation, we show how both state-of-the-art neural models and purely symbolic approaches struggle with solving most of the tasks, thus calling for the application of advanced neuro-symbolic methods trained over time

    Friendly Training: Neural Networks Can Adapt Data To Make Learning Easier

    Full text link
    In the last decade, motivated by the success of Deep Learning, the scientific community proposed several approaches to make the learning procedure of Neural Networks more effective. When focussing on the way in which the training data are provided to the learning machine, we can distinguish between the classic random selection of stochastic gradient-based optimization and more involved techniques that devise curricula to organize data, and progressively increase the complexity of the training set. In this paper, we propose a novel training procedure named Friendly Training that, differently from the aforementioned approaches, involves altering the training examples in order to help the model to better fulfil its learning criterion. The model is allowed to simplify those examples that are too hard to be classified at a certain stage of the training procedure. The data transformation is controlled by a developmental plan that progressively reduces its impact during training, until it completely vanishes. In a sense, this is the opposite of what is commonly done in order to increase robustness against adversarial examples, i.e., Adversarial Training. Experiments on multiple datasets are provided, showing that Friendly Training yields improvements with respect to informed data sub-selection routines and random selection, especially in deep convolutional architectures. Results suggest that adapting the input data is a feasible way to stabilize learning and improve the generalization skills of the network.Comment: 9 pages, 5 figure

    Being Friends Instead of Adversaries: Deep Networks Learn from Data Simplified by Other Networks

    Get PDF
    Amongst a variety of approaches aimed at making the learning procedure of neural networks more effective, the scientifc community developed strategies to order the examples according to their estimated complexity, to distil knowledge from larger networks, or to exploit the principles behind adversarial machine learning. A different idea has been recently proposed, named Friendly Training, which consists in altering the input data by adding an automatically estimated perturbation, with the goal of facilitating the learning process of a neural classifer. The transformation progressively fadesout as long as training proceeds, until it completely vanishes. In this work we revisit and extend this idea, introducing a radically different and novel approach inspired by the effectiveness of neural generators in the context of Adversarial Machine Learning. We propose an auxiliary multi-layer network that is responsible of altering the input data to make them easier to be handled by the classifer at the current stage of the training procedure. The auxiliary network is trained jointly with the neural classifer, thus intrinsically increasing the “depth” of the classifer, and it is expected to spot general regularities in the data alteration process. The effect of the auxiliary network is progressively reduced up to the end of training, when it is fully dropped and the classifer is deployed for applications. We refer to this approach as Neural Friendly Training. An extended experimental procedure involving several datasets and different neural architectures shows that Neural Friendly Training overcomes the originally proposed Friendly Training technique, improving the generalization of the classifer, especially in the case of noisy data
    corecore