11 research outputs found

    Dynamic Face Video Segmentation via Reinforcement Learning

    For real-time semantic video segmentation, most recent works use a dynamic framework in which a key scheduler makes online key/non-key decisions. Some works use a fixed key scheduling policy, while others propose adaptive key scheduling methods based on heuristic strategies; both may lead to suboptimal global performance. To overcome this limitation, we model the online key decision process in dynamic video segmentation as a deep reinforcement learning problem and learn an efficient and effective scheduling policy from expert information about decision history and from the process of maximising global return. Moreover, we study the application of dynamic video segmentation to face videos, a field that has not been investigated before. By evaluating on the 300VW dataset, we show that our reinforcement key scheduler outperforms various baselines in terms of both effective key selections and running speed. Further results on the Cityscapes dataset demonstrate that our proposed method can also generalise to other scenarios. To the best of our knowledge, this is the first work to use reinforcement learning for online key-frame decisions in dynamic video segmentation, and also the first work on its application to face videos.
    Comment: CVPR 2020. 300VW with segmentation labels is available at: https://github.com/mapleandfire/300VW-Mas
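The two baseline families the abstract contrasts, a fixed key scheduling policy and a heuristic adaptive one, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: frames are flat lists of floats, the heuristic is a simple difference threshold, and none of the names come from the paper. The paper's reinforcement scheduler would replace the hand-written `is_key` rule with a learned policy, which this sketch does not attempt.

```python
from typing import Callable, List, Sequence

Frame = Sequence[float]  # hypothetical stand-in for an image
KeyRule = Callable[[int, float], bool]


def frame_difference(a: Frame, b: Frame) -> float:
    """Mean absolute difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def fixed_scheduler(interval: int) -> KeyRule:
    """Fixed policy: pick a key frame every `interval` frames."""
    return lambda steps_since_key, diff: steps_since_key >= interval


def threshold_scheduler(threshold: float) -> KeyRule:
    """Heuristic policy: pick a key frame when the frame has drifted
    far enough from the last key frame (assumed heuristic)."""
    return lambda steps_since_key, diff: diff > threshold


def schedule_keys(frames: List[Frame], is_key: KeyRule) -> List[bool]:
    """Make an online key/non-key decision for each frame in order."""
    decisions: List[bool] = []
    last_key = None
    steps_since_key = 0
    for frame in frames:
        if last_key is None:
            key = True  # the first frame is always a key frame
        else:
            steps_since_key += 1
            key = is_key(steps_since_key, frame_difference(frame, last_key))
        if key:
            # A real system would run the expensive segmentation path here
            # and a cheap propagation path on non-key frames.
            last_key = frame
            steps_since_key = 0
        decisions.append(key)
    return decisions


frames = [[0.0], [0.1], [0.2], [1.0], [1.05]]
print(schedule_keys(frames, fixed_scheduler(2)))
print(schedule_keys(frames, threshold_scheduler(0.5)))
```

On this toy sequence the fixed policy spends a key frame on a nearly unchanged frame and misses the abrupt change at the fourth frame, which is the kind of suboptimal behaviour the abstract attributes to fixed schedules.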

    XploreNAS: Explore Adversarially Robust & Hardware-efficient Neural Architectures for Non-ideal Xbars

    Compute In-Memory platforms such as memristive crossbars are gaining focus as they facilitate acceleration of Deep Neural Networks (DNNs) with high area and compute efficiencies. However, the intrinsic non-idealities associated with the analog nature of computing in crossbars limit the performance of the deployed DNNs. Furthermore, DNNs are shown to be vulnerable to adversarial attacks, leading to severe security threats in their large-scale deployment. Thus, finding adversarially robust DNN architectures for non-ideal crossbars is critical to the safe and secure deployment of DNNs on the edge. This work proposes a two-phase algorithm-hardware co-optimization approach called XploreNAS that searches for hardware-efficient and adversarially robust neural architectures for non-ideal crossbar platforms. We use the one-shot Neural Architecture Search (NAS) approach to train a large Supernet with crossbar-awareness and sample adversarially robust Subnets therefrom, maintaining competitive hardware efficiency. Our experiments on crossbars with benchmark datasets (SVHN, CIFAR10 and CIFAR100) show up to ~8-16% improvement in the adversarial robustness of the searched Subnets against a baseline ResNet-18 model subjected to crossbar-aware adversarial training. We benchmark our robust Subnets for Energy-Delay-Area-Products (EDAPs) using the Neurosim tool and find that, with additional hardware-efficiency driven optimizations, the Subnets attain ~1.5-1.6x lower EDAPs than the ResNet-18 baseline.
    Comment: 16 pages, 8 figures, 2 table
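The Energy-Delay-Area-Product metric mentioned above is the product of the three quantities, so an "N x lower EDAP" claim is the ratio of baseline to candidate products. A minimal sketch of that arithmetic follows; the function names, units, and input numbers are assumptions for illustration and are not results from the paper or outputs of the Neurosim tool.

```python
def edap(energy_j: float, delay_s: float, area_mm2: float) -> float:
    """Energy-Delay-Area product; lower is better. Units are assumed."""
    return energy_j * delay_s * area_mm2


def edap_reduction(baseline: float, candidate: float) -> float:
    """How many times lower the candidate's EDAP is than the baseline's."""
    return baseline / candidate


# Made-up numbers, chosen only to show the calculation.
baseline = edap(energy_j=2.0, delay_s=3.0, area_mm2=4.0)
candidate = edap(energy_j=2.0, delay_s=2.0, area_mm2=4.0)
print(edap_reduction(baseline, candidate))
```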

    Efficient Fully-Convolutional Networks for Image Perception

    Neural architecture search is widely applied to design networks that outperform manually designed architectures. However, applying it directly to challenging perception tasks such as object detection is not trivial, since previous methods often rely on manually designed complex operations such as RoI pooling and RCNN heads. Thus, we look for universal fully-convolutional representations for perception tasks, which are easy to optimise and deploy because of their simple structures. They perform well on dense prediction tasks such as semantic segmentation, where the networks consist of a backbone module for visual feature extraction and a task-specific module for result generation. Designing the task-specific modules helps us understand how these networks tackle perception tasks and is also crucial for performance and efficiency improvements. However, fully-convolutional networks fall behind two-stage approaches on instance-level tasks such as object detection and instance segmentation. To solve this problem, we focus on designing fully-convolutional frameworks for instance detection tasks, study the task-specific structures, and improve their performance by devising efficient neural architecture search algorithms. Our approach starts by designing fully-convolutional models for instance detection tasks. With deformable convolution, we tackle the local-incoherence problem for top-down instance segmentation, resulting in a fully-convolutional model with expressiveness equivalent to that of a typical two-stage model. We also propose BlendMask, a fully-convolutional instance segmentation network that is faster and more accurate than the state-of-the-art two-stage models. Then we demonstrate the benefit of having a uniform representation by designing the first panoptic segmentation network that solves instance and semantic segmentation with a single branch.
    To improve the design of task-specific modules for fully-convolutional perception models, we devised efficient neural architecture search algorithms and applied them to video segmentation and object detection.
    Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 202

    Efficient evolutionary-based neural architecture search in few GPU hours for image classification and medical image segmentation

    Advisor: Lucas Ferrari de Oliveira. Thesis (doctorate) - Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defense: Curitiba, 20/09/2021. Includes references: p. 132-139. Area of concentration: Computer Science.
    Abstract: Deep learning (DL) usage is growing fast since current computational power provides fast optimization and inference. Furthermore, several unique DL methods are evolving, enabling superior results in computer vision, speech recognition, and text analysis. DL methods automatically extract features to better represent a specific problem, removing the hard work of feature engineering required by conventional methods. Even if this process is automated, intelligent network design is necessary for proper representation learning, which requires expertise in DL.
    The neural architecture search (NAS) field focuses on developing intelligent approaches that automatically design robust networks, reducing the expertise required to develop efficient networks. NAS may provide ways to discover different network representations, improving the state of the art in different applications. Although NAS is relatively new, several approaches have been developed for discovering robust models. Efficient evolutionary-based methods are widely popular in NAS, but their high GPU consumption (from a few days to months) discourages practical use. In the present work, we propose two efficient evolutionary-based NAS approaches with low GPU cost, requiring only a few GPU hours (less than twelve on an RTX 2080Ti) to discover competitive models.
    Our approaches extract concepts from gene expression programming to represent and generate robust cell-based networks, combined with fast candidate training, weight sharing, and dynamic combinations. Furthermore, the proposed methods are employed in a broader search space, with more cells representing a unique network. Our central hypothesis is that evolutionary-based NAS can be used in a low-cost GPU search (combined with a robust strategy and efficient search) in diverse computer vision tasks without losing competitiveness. Our methods are evaluated on different problems to validate our hypothesis: image classification and medical image semantic segmentation. For this purpose, the CIFAR datasets are studied for the classification task and the CHAOS challenge for the segmentation task. The lowest error rates found on the CIFAR-10 and CIFAR-100 datasets were 2.17% ± 0.10 and 15.47% ± 0.51, respectively. As for the CHAOS challenge tasks, the Dice scores were between 90% and 96%. The results obtained with our proposals showed the discovery of robust networks for both tasks with little GPU cost in the search phase, being competitive with state-of-the-art approaches in both challenges.
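For readers unfamiliar with the Dice score reported for the CHAOS segmentation tasks, the standard definition for two binary masks is twice the overlap divided by the sum of the mask sizes. The sketch below assumes masks given as flat sequences of 0/1 values and treats two empty masks as a perfect match; both are illustrative conventions rather than details taken from the thesis.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2 * |P and T| / (|P| + |T|) for binary masks of 0/1 values."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention assumed here: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks: 2 overlapping pixels, 3 predicted, 2 in the ground truth.
pred = [1, 1, 1, 0, 0]
target = [1, 1, 0, 0, 0]
print(dice_score(pred, target))  # 2 * 2 / (3 + 2) = 0.8
```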