
    On the Data Efficiency and Model Complexity of Visual Learning

    Computer vision is a research field that aims to automate the process of gaining abstract understanding from digital images or videos. Recent rapid developments in deep neural networks have demonstrated human-level performance or better on many vision tasks that require high-level understanding, such as image recognition and object detection. However, training deep neural networks usually requires large-scale human-annotated datasets, and the models typically have millions of parameters and consume substantial computational resources. These issues of data efficiency and model complexity are common to many frameworks based on deep neural networks and limit their deployment in real-world applications. In this dissertation, I present our research addressing the data efficiency and model complexity of deep neural networks. For data efficiency, (i) we study the problem of few-shot image recognition, where the training datasets contain only a few examples per category, and (ii) we investigate semi-supervised visual learning, which provides unlabeled samples in addition to the annotated dataset and aims to use them to learn better models. For model complexity, (iii) we seek alternatives to cascading layers or blocks for improving the representation capacity of convolutional neural networks without introducing additional computation; (iv) we improve the computational resource utilization of deep neural networks by finding, reallocating, and rejuvenating underutilized neurons; (v) we present two techniques for object detection that reuse computations to reduce architectural complexity and improve detection performance; and (vi) we show our work on reusing visual features for multi-task learning to improve computational efficiency and share training information between tasks.