
    Incorporating action information into computational models of the human visual system

    Deep convolutional neural networks (DCNNs) have been used to model the ventral visual stream. However, there have been relatively few computational models of the dorsal visual stream, preventing a holistic understanding of the human visual system. Additionally, current DCNN models of the ventral stream have shortcomings (such as an over-reliance on texture information) that may be ameliorated by incorporating dorsal-stream information. The current study investigates two questions: 1) does incorporating action information improve computational models of the ventral visual system? 2) how do the ventral and dorsal streams influence each other during development? Three models will be created: a two-task neural network trained both to perform object recognition and to generate human grasp points; a single-task neural network trained to perform only object recognition; and a lesioned neural network, identical to the two-task network except that the units contributing most to grasp-point generation will be deactivated. All networks will be evaluated on performance metrics such as accuracy (on ImageNet and Stylized-ImageNet), transfer learning, and robustness to distortions. The networks will also be evaluated on representational metrics such as representation contribution analysis and representational similarity analysis. We expect the two-task network to score higher on the performance measures than either the lesioned or the single-task network. Additionally, for the two-task network we predict that more units will contribute to grasp-point generation than to object recognition. Lastly, we expect representations in the two-task network to reflect human data better than those of the single-task network.
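    The shared-backbone/two-head design and the lesioning procedure described above can be sketched as follows. This is a minimal NumPy illustration with made-up layer sizes and random (untrained) weights, not the study's actual DCNN; the "lesion" simply zeroes the backbone units with the largest grasp-head weights, a stand-in for the representation-contribution criterion:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def backbone(x, W):
        # Shared feature extractor (one dense layer + ReLU as a stand-in
        # for the convolutional trunk of a DCNN).
        return np.maximum(x @ W, 0.0)

    def recognition_head(f, W):
        # Object-recognition head: softmax over classes.
        z = f @ W
        e = np.exp(z - z.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def grasp_head(f, W):
        # Grasp-point head: an (x, y) coordinate in [0, 1] per image.
        return 1.0 / (1.0 + np.exp(-(f @ W)))

    x = rng.normal(size=(4, 128))          # batch of 4 flattened "images"
    W_b = rng.normal(size=(128, 64))       # shared backbone weights
    W_cls = rng.normal(size=(64, 10))      # 10 illustrative object classes
    W_grasp = rng.normal(size=(64, 2))     # one grasp point per image

    f = backbone(x, W_b)
    class_probs = recognition_head(f, W_cls)   # shape (4, 10)
    grasp_points = grasp_head(f, W_grasp)      # shape (4, 2)

    # "Lesioned" variant: deactivate the backbone units that matter most
    # for grasp-point generation (here: largest absolute grasp-head weights).
    top = np.argsort(np.abs(W_grasp).sum(axis=1))[-8:]
    f_lesioned = f.copy()
    f_lesioned[:, top] = 0.0
    ```

    Both heads read the same features, so training them jointly is what lets action (grasp) information shape the representation used for recognition.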

    Interpreting Adversarially Trained Convolutional Neural Networks

    We attempt to interpret how adversarially trained convolutional neural networks (AT-CNNs) recognize objects. We design systematic approaches to interpret AT-CNNs both qualitatively and quantitatively and compare them with normally trained models. Surprisingly, we find that adversarial training alleviates the texture bias of standard CNNs trained on object recognition tasks and helps CNNs learn more shape-biased representations. We validate this hypothesis in two ways. First, we compare the salience maps of AT-CNNs and standard CNNs on clean images and on images under different transformations; the comparison visually shows that the predictions of the two types of CNNs are sensitive to dramatically different types of features. Second, for quantitative verification, we construct additional test datasets that destroy either textures or shapes, such as style-transferred, saturated, and patch-shuffled versions of the clean data, and then evaluate the classification accuracy of AT-CNNs and normal CNNs on these datasets. Our findings shed some light on why AT-CNNs are more robust than normally trained ones and contribute to a better understanding of adversarial training of CNNs from an interpretation perspective.
    Comment: To appear in ICML 2019.
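    The texture- and shape-destroying test sets can be illustrated with two simple transforms. This NumPy sketch is illustrative only: the patch-grid size is arbitrary, and the saturation operator shown here (pushing pixel values toward 0 or 1 while `p = 2` leaves the image unchanged) is one plausible formulation, not necessarily the paper's exact one:

    ```python
    import numpy as np

    def patch_shuffle(img, k, rng):
        """Split an image into a k x k grid of patches and shuffle them.
        Local texture statistics survive; global shape is destroyed."""
        h, w = img.shape[:2]
        ph, pw = h // k, w // k
        patches = [img[i*ph:(i+1)*ph, j*pw:(j+1)*pw]
                   for i in range(k) for j in range(k)]
        order = rng.permutation(len(patches))
        rows = [np.concatenate([patches[order[i*k + j]] for j in range(k)], axis=1)
                for i in range(k)]
        return np.concatenate(rows, axis=0)

    def saturate(img, p):
        """Push pixel values (in [0, 1]) toward 0 or 1 as p grows,
        weakening texture cues while object contours remain visible.
        p = 2 is the identity; larger p saturates more."""
        s = np.sign(2.0 * img - 1.0) * np.abs(2.0 * img - 1.0) ** (2.0 / p)
        return 0.5 + 0.5 * s
    ```

    A shape-biased model should degrade sharply under `patch_shuffle` but tolerate `saturate`; a texture-biased model should show the opposite pattern.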

    End-to-End Boundary Aware Networks for Medical Image Segmentation

    Fully convolutional neural networks (CNNs) have proven effective at representing and classifying textural information, transforming image intensities into output class masks that achieve semantic image segmentation. In medical image analysis, however, expert manual segmentation often relies on the boundaries of the anatomical structures of interest. We propose boundary-aware CNNs for medical image segmentation. Our networks are designed to account for organ boundary information, both through a dedicated network edge branch and through edge-aware loss terms, and they are trainable end-to-end. We validate their effectiveness on the task of brain tumor segmentation using the BraTS 2018 dataset. Our experiments reveal that our approach yields more accurate segmentation results, making it promising for more extensive application to medical image segmentation.
    Comment: Accepted to MICCAI Machine Learning in Medical Imaging (MLMI 2019).
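    The idea of an edge-aware loss term can be sketched as follows. This is an illustrative NumPy version that adds a boundary-weighted penalty to an ordinary region loss; the paper's actual edge branch and loss formulation are more elaborate, and the weighting `lam` is an assumed hyperparameter:

    ```python
    import numpy as np

    def boundary_map(mask):
        # Binary boundary of a segmentation mask: pixels whose
        # 4-neighbourhood contains a different label.
        pad = np.pad(mask, 1, mode="edge")
        diff = ((pad[1:-1, :-2] != mask) | (pad[1:-1, 2:] != mask) |
                (pad[:-2, 1:-1] != mask) | (pad[2:, 1:-1] != mask))
        return diff.astype(float)

    def edge_aware_loss(pred, target, lam=0.5, eps=1e-7):
        # Total loss = mean binary cross-entropy over the whole mask
        # plus lam * BCE averaged over the target's boundary pixels,
        # so boundary errors are penalized twice.
        pred = np.clip(pred, eps, 1.0 - eps)
        bce = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
        region = bce.mean()
        b = boundary_map(target)
        edge = (bce * b).sum() / max(b.sum(), 1.0)
        return region + lam * edge
    ```

    Up-weighting boundary pixels is what pushes the network's masks to align with anatomical contours rather than only with interior texture.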

    Improving the accuracy of person re-identification based on two-stage training of convolutional neural networks and augmentation

    Objectives. The main goal is to improve person re-identification accuracy in distributed video surveillance systems.
    Methods. Machine learning methods are applied.
    Results. A technology for two-stage training of convolutional neural networks (CNNs) is presented, characterized by the use of image augmentation at the preliminary stage and fine-tuning of the weight coefficients on the original image set. At the first stage, training is carried out on augmented data; at the second, the CNN is fine-tuned on the original images, which minimizes the training loss and increases model efficiency. Using different data at the two stages prevents the CNN from memorizing training examples, thereby preventing overfitting. The proposed method for expanding the training set is distinguished by combining a cyclic shift of image pixels, colour exclusion, and replacement of a fragment with a reduced copy of another image from the input batch. This augmentation method produces a wide variety of training data, which increases the CNN's robustness to occlusions, illumination changes, low image resolution, and dependence on the location of distinctive features.
    Conclusion. The two-stage training technology and the proposed data augmentation method increased person re-identification accuracy for different CNNs and datasets: by 4–21% in Rank1, by 10–31% in mAP, and by 39–60% in mINP.
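    The augmentation described above, combining a cyclic pixel shift, colour exclusion, and replacement of a fragment with a reduced copy of another image from the batch, can be sketched in NumPy. The fragment size, downscaling factor, and order of operations here are illustrative assumptions, not the paper's exact parameters:

    ```python
    import numpy as np

    def augment(img, other, rng):
        """One augmented view of `img` (H x W x 3, values in [0, 1]):
        cyclic pixel shift, then colour exclusion (grayscale), then a
        quarter-size fragment replaced by a reduced copy of `other`."""
        h, w, _ = img.shape
        # 1) Cyclic shift of the image pixels along both axes.
        out = np.roll(img, shift=(rng.integers(h), rng.integers(w)), axis=(0, 1))
        # 2) Colour exclusion: average channels, replicate to keep 3 channels.
        gray = out.mean(axis=2, keepdims=True)
        out = np.repeat(gray, 3, axis=2)
        # 3) Replace a random fragment with a reduced copy of another
        #    image from the batch (naive stride-based downscaling).
        fh, fw = h // 4, w // 4
        small = other[::4, ::4][:fh, :fw]
        y, x = rng.integers(h - fh + 1), rng.integers(w - fw + 1)
        out[y:y + fh, x:x + fw] = small
        return out
    ```

    In the two-stage scheme, views like this would feed the first training stage, after which the network is fine-tuned on the unmodified originals.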