
    Developmental Pretraining (DPT) for Image Classification Networks

    Against the backdrop of the ever-growing data requirements of Deep Neural Networks for object recognition, we present Developmental PreTraining (DPT) as a possible solution. DPT is a curriculum-based pre-training approach designed to rival traditional, data-hungry pre-training techniques. Those techniques can also introduce unnecessary features that mislead the network in a downstream classification task where the data is scarce and sufficiently different from the pre-training data. We design the curriculum for DPT by drawing inspiration from human infant visual development. DPT employs a phased approach in which carefully selected primitive and universal features, such as edges and shapes, are taught to the network participating in our pre-training regime. A model that underwent the DPT regime is tested against models with randomised weights to evaluate the viability of DPT. Comment: 7 pages, 4 figures
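The abstract describes a phased curriculum that presents primitive features (edges, then shapes) before natural images. The paper's actual stimuli and schedule are not given here; the following is a minimal sketch, assuming synthetic half-plane edges and filled circles as stand-ins for the curriculum phases (all function names are hypothetical, not from the paper).

```python
import numpy as np

def make_edge_image(size=32, angle_deg=0.0):
    """Synthesize an oriented edge: a half-plane boundary at the given angle."""
    ys, xs = np.mgrid[0:size, 0:size] - size / 2.0
    theta = np.deg2rad(angle_deg)
    return (xs * np.cos(theta) + ys * np.sin(theta) > 0).astype(np.float32)

def make_shape_image(size=32, radius=8):
    """Synthesize a filled circle, a simple closed shape."""
    ys, xs = np.mgrid[0:size, 0:size] - size / 2.0
    return (xs**2 + ys**2 <= radius**2).astype(np.float32)

def developmental_curriculum(n_per_phase=4, size=32):
    """Yield (phase, image) pairs: all edge stimuli first, then shapes,
    mirroring a coarse 'primitive features first' schedule."""
    rng = np.random.default_rng(0)
    for _ in range(n_per_phase):
        yield "edges", make_edge_image(size, rng.uniform(0, 180))
    for _ in range(n_per_phase):
        yield "shapes", make_shape_image(size, int(rng.integers(4, 12)))

phases = [phase for phase, _ in developmental_curriculum()]
```

In a full pipeline, each phase would drive a separate training stage of the network before the downstream classification task.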

    Is Robustness To Transformations Driven by Invariant Neural Representations?

    Deep Convolutional Neural Networks (DCNNs) have demonstrated impressive robustness to recognize objects under transformations (e.g. blur or noise) when these transformations are included in the training set. A hypothesis to explain such robustness is that DCNNs develop invariant neural representations that remain unaltered when the image is transformed. Yet, to what extent this hypothesis holds true is an outstanding question, as including transformations in the training set could lead to properties different from invariance, e.g. parts of the network could be specialized to recognize either transformed or non-transformed images. In this paper, we analyze the conditions under which invariance emerges. To do so, we leverage the fact that invariant representations facilitate robustness to transformations for object categories that are not seen transformed during training. Our results with state-of-the-art DCNNs indicate that invariant representations strengthen as the number of transformed categories in the training set is increased. This is much more prominent with local transformations such as blurring and high-pass filtering, compared to geometric transformations such as rotation and thinning, which entail changes in the spatial arrangement of the object. Our results contribute to a better understanding of invariant representations in deep learning, and the conditions under which invariance spontaneously emerges.
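A common way to quantify the invariance the abstract refers to is to compare a layer's representation of a clean image with its representation of the transformed version, e.g. via cosine similarity. The paper's exact metric is not reproduced here; this is a minimal sketch of that general idea (the function names are illustrative, not the authors').

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def invariance_score(features_clean, features_transformed):
    """Mean cosine similarity between each image's clean and transformed
    representations; a score of 1.0 indicates a perfectly invariant layer."""
    return float(np.mean([cosine_similarity(c, t)
                          for c, t in zip(features_clean, features_transformed)]))

# Toy demo with random stand-in feature vectors (5 images, 16-dim features).
rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 16))
```

Applied layer by layer, such a score makes it possible to ask where in the network invariance emerges and how it grows with the number of transformed categories in training.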

    Cultural evolution creates the statistical structure of language

    Human language is unique in its structure: language is made up of parts that can be recombined in a productive way. The parts are not given but have to be discovered by learners exposed to unsegmented wholes. Across languages, the frequency distribution of those parts follows a power law. Both statistical properties—having parts and having them follow a particular distribution—facilitate learning, yet their origin is still poorly understood. Where do the parts come from and why do they follow a particular frequency distribution? Here, we show how these two core properties emerge from the process of cultural evolution with whole-to-part learning. We use an experimental analog of cultural transmission in which participants copy sets of non-linguistic sequences produced by a previous participant. This design allows us to ask if parts will emerge purely under pressure for the system to be learnable, even without meanings to convey. We show that parts emerge from initially unsegmented sequences, that their distribution becomes closer to a power law over generations, and, importantly, that these properties make the sets of sequences more learnable. We argue that these two core statistical properties of language emerge culturally both as a cause and effect of greater learnability.
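The power-law claim above is typically checked by ranking part frequencies and examining the log-log slope: a slope near -1 is the classic Zipfian signature. The study's own analysis is not shown here; this is a minimal sketch of that standard check using a simple least-squares fit (an idealized ~1/rank count list stands in for real segment frequencies).

```python
import numpy as np

def zipf_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).
    A slope near -1 indicates a Zipf-like power-law distribution."""
    freqs = np.sort(np.asarray(frequencies, dtype=float))[::-1]  # rank order
    ranks = np.arange(1, len(freqs) + 1)
    slope, _intercept = np.polyfit(np.log(ranks), np.log(freqs), 1)
    return float(slope)

# Idealized counts proportional to 1/rank, as a power law predicts.
counts = [1000, 500, 333, 250, 200, 167, 143, 125, 111, 100]
```

Tracking this slope across transmission generations is one way to quantify the abstract's claim that the part distribution "becomes closer to a power law over generations."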

    Investigating face perception in humans and DCNNs

    This thesis aims to compare the strengths and weaknesses of AI and humans performing face identification tasks, and to use recent advances in machine learning to develop new techniques for understanding face identity processing. A better understanding of the underlying processing differences between Deep Convolutional Neural Networks (DCNNs) and humans can improve the ways in which AI technology is used to support human decision-making and deepen our understanding of face identity processing in both. In Chapter 2, I test how the accuracy of humans and DCNNs is affected by image quality and find that humans and DCNNs are affected differently. This has important applied implications, for example, when identifying faces from poor-quality imagery in police investigations, and also points to different processing strategies used by humans and DCNNs. Given these diverging processing strategies, in Chapter 3, I investigate the potential for human and DCNN decisions to be combined in face identification decisions. I find a large overall benefit of 'fusing' algorithm and human face identity judgments, and that this depends on the idiosyncratic accuracy and response patterns of the particular DCNNs and humans in question. This points to new optimal ways that individual humans and DCNNs can be aggregated to improve the accuracy of face identity decisions in applied settings. Building on my background in computer vision, in Chapters 4 and 5, I then aim to better understand face information sampling by humans using a novel combination of eye-tracking and machine-learning approaches. In Chapter 4, I develop exploratory methods for studying individual differences in face information sampling strategies. This reveals differences in the way that 'super-recognisers' sample face information compared to typical viewers. I then use DCNNs to assess the computational value of the face information sampled by these two groups of human observers, finding that sampling by 'super-recognisers' contains more computationally valuable face identity information. In Chapter 5, I develop a novel approach to measuring fixations to people in unconstrained natural settings by combining wearable eye-tracking technology with face and body detection algorithms. Together, these new approaches provide novel insight into individual differences in face information sampling, both when looking at faces in lab-based tasks performed on computer monitors and when looking at faces 'in the wild'.
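The 'fusion' of human and DCNN judgments described in Chapter 3 is, in its simplest form, a weighted combination of the two judges' same-identity ratings. The thesis's actual fusion procedure is not reproduced here; this is a minimal sketch of a weighted average, assuming both ratings are rescaled to [0, 1] (the function name and fixed weight are illustrative only).

```python
import numpy as np

def fuse_judgments(human_scores, dcnn_scores, human_weight=0.5):
    """Weighted average of human and DCNN same-identity ratings.
    In practice, the weight would be tuned to each judge's measured
    accuracy rather than fixed in advance."""
    h = np.asarray(human_scores, dtype=float)
    d = np.asarray(dcnn_scores, dtype=float)
    return human_weight * h + (1.0 - human_weight) * d
```

The abstract's point that fusion benefits "depend on the idiosyncratic accuracy and response patterns" of each judge corresponds to choosing that weight per human-DCNN pairing rather than globally.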

    Potential downside of high initial visual acuity

    Children who are treated for congenital cataracts later exhibit impairments in configural face analysis. This has been explained in terms of a critical period for the acquisition of normal face processing. Here, we consider a more parsimonious account according to which deficits in configural analysis result from the abnormally high initial retinal acuity that children treated for cataracts experience, relative to typical newborns. According to this proposal, the initial period of low retinal acuity characteristic of normal visual development induces extended spatial processing in the cortex that is important for configural face judgments. As a computational test of this hypothesis, we examined the effects of training with high-resolution or blurred images, and staged combinations, on the receptive fields and performance of a convolutional neural network. The results show that commencing training with blurred images creates receptive fields that integrate information across larger image areas and leads to improved performance and better generalization across a range of resolutions. These findings offer an explanation for the observed face recognition impairments after late treatment of congenital blindness, suggest an adaptive function for the acuity trajectory in normal development, and provide a scheme for improving the performance of computational face recognition systems. ©2018 National Academy of Sciences. All rights reserved.
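The staged blur-to-sharp training described above can be expressed as a blur schedule applied to the training images: heavy Gaussian blur early (low "retinal acuity"), decreasing to full resolution over training. The paper's exact schedule and network are not given here; this is a minimal sketch using a separable Gaussian blur and a linear acuity ramp (function names and the linear decay are assumptions, not the authors' specification).

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """1D Gaussian kernel, truncated at ~3 standard deviations."""
    radius = radius or max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(image, sigma):
    """Separable Gaussian blur: convolve rows, then columns."""
    if sigma <= 0:
        return image.astype(float)
    k = gaussian_kernel(sigma)
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, image)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)

def acuity_schedule(epoch, n_epochs, sigma_start=4.0):
    """Blur sigma decreasing linearly to 0: low acuity early in training,
    full resolution by the final epoch."""
    return sigma_start * max(0.0, 1.0 - epoch / (n_epochs - 1))
```

Training the network on `blur(image, acuity_schedule(epoch, n_epochs))` reproduces the key manipulation: early epochs see only coarse spatial structure, which is what the abstract credits with producing larger, more integrative receptive fields.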