1,086 research outputs found
Deep Learning Algorithms with Applications to Video Analytics for A Smart City: A Survey
Deep learning has recently achieved very promising results in a wide range of
areas such as computer vision, speech recognition and natural language
processing. It aims to learn hierarchical representations of data by using deep
architecture models. In a smart city, a lot of data (e.g. videos captured from
many distributed sensors) need to be automatically processed and analyzed. In
this paper, we review deep learning algorithms applied to video analytics in a
smart city, organized by research topic: object detection, object tracking,
face recognition, image classification, and scene labeling.
Comment: 8 pages, 18 figures
Deep Learning in Information Security
Machine learning has a long tradition of helping to solve complex information
security problems that are difficult to solve manually. Machine learning
techniques learn models from data representations to solve a task. These data
representations are hand-crafted by domain experts. Deep Learning is a
sub-field of machine learning, which uses models that are composed of multiple
layers. Consequently, representations that are used to solve a task are learned
from the data instead of being manually designed.
In this survey, we study the use of DL techniques within the domain of
information security. We systematically reviewed 77 papers and present them
from a data-centric perspective. This data-centric perspective reflects one of
the most crucial advantages of DL techniques -- domain independence. If DL
methods succeed in solving problems on a data type in one domain, they will
most likely also succeed on similar data from another domain. Other advantages
of DL methods are unrivaled scalability and efficiency, both in the number of
examples that can be analyzed and in the dimensionality of the input data. DL
methods are generally capable of achieving high performance and generalizing
well.
However, information security is a domain with unique requirements and
challenges. Based on an analysis of the reviewed papers, we point out
shortcomings of DL methods with respect to those requirements and discuss
further research opportunities.
Identifying Synapses Using Deep and Wide Multiscale Recursive Networks
In this work, we propose a learning framework for identifying synapses using
a deep and wide multi-scale recursive (DAWMR) network, previously considered in
image segmentation applications. We apply this approach to electron microscopy
data from invertebrate fly brain tissue. By learning features directly from the
data, we are able to achieve considerable improvements over existing techniques
that rely on a small set of hand-designed features. We show that this system
can reduce the amount of manual annotation required, both in the acquisition of
training data and in the verification of inferred detections.
A Taxonomy of Deep Convolutional Neural Nets for Computer Vision
Traditional architectures for solving computer vision problems and the degree
of success they enjoyed have been heavily reliant on hand-crafted features.
However, of late, deep learning techniques have offered a compelling
alternative -- that of automatically learning problem-specific features. With
this new paradigm, every problem in computer vision is now being re-examined
from a deep learning perspective. Therefore, it has become important to
understand what kind of deep networks are suitable for a given problem.
Although general surveys of this fast-moving paradigm (i.e., deep networks)
exist, a survey specific to computer vision is missing. We specifically
consider one form of deep networks widely used in computer vision -
convolutional neural networks (CNNs). We start with "AlexNet" as our base CNN
and then examine the broad variations proposed over time to suit different
applications. We hope that our recipe-style survey will serve as a guide,
particularly for novice practitioners intending to use deep-learning techniques
for computer vision.
Comment: Published in Frontiers in Robotics and AI (http://goo.gl/6691Bm)
Learning Contextual Dependencies with Convolutional Hierarchical Recurrent Neural Networks
Existing deep convolutional neural networks (CNNs) have shown their great
success on image classification. CNNs mainly consist of convolutional and
pooling layers, both of which are performed on local image areas without
considering the dependencies among different image regions. However, such
dependencies are very important for generating explicit image representation.
In contrast, recurrent neural networks (RNNs) are well known for their ability
to encode contextual information in sequential data, and they require only a
limited number of network parameters. However, general RNNs can hardly be
applied directly to non-sequential data. We therefore propose hierarchical
RNNs (HRNNs). In HRNNs, each RNN layer models spatial dependencies among image
regions at the same scale but different locations, while cross-scale RNN
connections model scale dependencies among regions at the same location but
different scales. Specifically, we propose two recurrent neural network
models: 1) the hierarchical simple recurrent network (HSRN), which is fast and
has low computational cost; and 2) the hierarchical long short-term memory
recurrent network (HLSTM), which performs better than HSRN at the price of
higher computational cost.
In this manuscript, we integrate CNNs with HRNNs to develop end-to-end
convolutional hierarchical recurrent neural networks (C-HRNNs). C-HRNNs not
only exploit the representational power of CNNs, but also efficiently encode
spatial and scale dependencies among different image regions. On four of the
most challenging object/scene image classification benchmarks, our C-HRNNs
achieve state-of-the-art results on Places 205, SUN 397, and MIT Indoor, and
competitive results on ILSVRC 2012.
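The core HRNN idea above can be illustrated with a minimal forward-pass sketch: a simple recurrent network scans region features at a coarse scale, and its pooled hidden state conditions the RNN at a finer scale (the cross-scale connection). All shapes, the scan order, and the mean-pooled conditioning are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def srn_scan(regions, Wx, Wh, h0=None):
    # Simple recurrent network over a sequence of region features:
    # h_t = tanh(Wx x_t + Wh h_{t-1}) carries spatial context forward.
    h = np.zeros(Wh.shape[0]) if h0 is None else h0
    states = []
    for x in regions:
        h = np.tanh(Wx @ x + Wh @ h)
        states.append(h)
    return np.stack(states)

def hierarchical_scan(coarse, fine, Wx_c, Wh_c, Wx_f, Wh_f, Ws):
    # Cross-scale connection (assumed form): the mean hidden state from
    # the coarse scale initializes the RNN at the finer scale.
    hc = srn_scan(coarse, Wx_c, Wh_c)
    h0 = np.tanh(Ws @ hc.mean(axis=0))
    return srn_scan(fine, Wx_f, Wh_f, h0=h0)
```

A real HSRN/HLSTM would use learned scan orders over a 2D grid of regions and LSTM cells; this sketch only shows how one scale's context can flow into another.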
Transfer of View-manifold Learning to Similarity Perception of Novel Objects
We develop a model of perceptual similarity judgment based on re-training a
deep convolutional neural network (DCNN) that learns to associate different
views of each 3D object, capturing the notion of object persistence and
continuity in our visual experience. The re-training process effectively
performs distance metric learning under the object persistence constraint,
modifying the view-manifold of object representations. It reduces the effective distance
between the representations of different views of the same object without
compromising the distance between those of the views of different objects,
resulting in the untangling of the view-manifolds between individual objects
within the same category and across categories. This untangling enables the
model to discriminate and recognize objects within the same category,
independent of viewpoints. We found that this ability is not limited to the
trained objects, but transfers to novel objects in both trained and untrained
categories, as well as to a variety of completely novel artificial synthetic
objects. This transfer of learning suggests that the modification of distance
metrics in view-manifolds is more general and abstract, likely operating at
the level of parts, and independent of the specific objects or categories
experienced during training. Interestingly, the resulting transformation of
feature representations in the deep network is found to match human perceptual
similarity judgment significantly better than AlexNet, suggesting that object
persistence could be an important constraint in the development of perceptual
similarity judgment in biological neural networks.
Comment: Accepted to ICLR201
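The distance metric learning under the object persistence constraint can be sketched with a contrastive loss on a linear embedding: different views of the same object are pulled together, views of different objects are pushed apart up to a margin. The linear map `W` stands in for the DCNN feature extractor, and the contrastive form is one standard choice, not necessarily the paper's exact loss:

```python
import numpy as np

def contrastive_loss(W, x1, x2, same, margin=1.0):
    # W is a linear embedding standing in for the re-trained DCNN.
    dist = np.linalg.norm(W @ x1 - W @ x2)
    if same:                               # same object, different views
        return 0.5 * dist ** 2             # pull the views together
    return 0.5 * max(0.0, margin - dist) ** 2  # push different objects apart

def grad_step(W, x1, x2, same, lr=0.1, margin=1.0):
    # One analytic gradient-descent step on the contrastive loss.
    v = x1 - x2
    d = W @ v
    dist = np.linalg.norm(d)
    if same:
        g = np.outer(d, v)
    elif 0.0 < dist < margin:
        g = -(margin - dist) / dist * np.outer(d, v)
    else:
        g = np.zeros_like(W)
    return W - lr * g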
Deep Trans-layer Unsupervised Networks for Representation Learning
Learning features from massive unlabelled data is a prevalent topic for
high-level tasks in many machine learning applications. The recent large
improvements on benchmark data sets, achieved by increasingly complex
unsupervised learning methods and deep learning models with many parameters,
usually require many tedious tricks and much expertise to tune. However, the
filters learned by these complex architectures are visually quite similar to
standard hand-crafted features. In this paper, unsupervised learning methods,
such as PCA or auto-encoder, are employed as the building block to learn filter
banks at each layer. The lower layer responses are transferred to the last
layer (trans-layer) to form a more complete representation retaining more
information. In addition, some beneficial methods such as local contrast
normalization and whitening are added to the proposed deep trans-layer networks
to further boost performance. The trans-layer representations are followed by
block histograms with binary encoder schema to learn translation and rotation
invariant representations, which are utilized to do high-level tasks such as
recognition and classification. Compared to traditional deep learning methods,
the implemented feature learning method has far fewer parameters and is
validated in several typical experiments: digit recognition on MNIST and the
MNIST variations, object recognition on the Caltech 101 dataset, and face
verification on the LFW dataset. The deep trans-layer unsupervised learning
achieves 99.45% accuracy on the MNIST dataset, 67.11% accuracy with 15 samples
per class and 75.98% accuracy with 30 samples per class on the Caltech 101
dataset, and 87.10% accuracy on the LFW dataset.
Comment: 21 pages, 3 figures
Nested Invariance Pooling and RBM Hashing for Image Instance Retrieval
The goal of this work is the computation of very compact binary hashes for
image instance retrieval. Our approach has two novel contributions. The first
one is Nested Invariance Pooling (NIP), a method inspired from i-theory, a
mathematical theory for computing group invariant transformations with
feed-forward neural networks. NIP is able to produce compact and
well-performing descriptors with visual representations extracted from
convolutional neural networks. We specifically incorporate scale, translation
and rotation invariances but the scheme can be extended to any arbitrary sets
of transformations. We also show that using moments of increasing order
throughout nesting is important. The NIP descriptors are then hashed to the
target code size (32-256 bits) with a Restricted Boltzmann Machine with a novel
batch-level regularization scheme specifically designed for the purpose of
hashing (RBMH). A thorough empirical evaluation against the state of the art
shows that the results obtained with both the NIP descriptors and the NIP+RBMH
hashes are consistently outstanding across a wide range of datasets.
Comment: Image Instance Retrieval, CNN, Invariant Representation, Hashing,
Unsupervised Learning, Regularization. arXiv admin note: text overlap with
arXiv:1601.0209
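The nesting of moment pooling over transformation groups can be sketched as follows. Given descriptors extracted from scaled, rotated, and translated copies of an image, each group is pooled away in turn with a moment of increasing order. The particular group ordering and the assignment of orders (mean, second moment, max) are illustrative assumptions, not the paper's exact scheme:

```python
import numpy as np

def moment_pool(descs, order, axis):
    # Pool along one transformation axis with a statistical moment:
    # order 1 = mean of magnitudes, order 2 = root mean square,
    # np.inf = max (the limit of increasing order).
    if order == np.inf:
        return descs.max(axis=axis)
    return np.mean(np.abs(descs) ** order, axis=axis) ** (1.0 / order)

def nip_descriptor(grid):
    # grid: (n_scales, n_rotations, n_translations, dim) array of CNN
    # descriptors from transformed copies of the image.
    pooled = moment_pool(grid, 1, axis=2)        # pool out translations
    pooled = moment_pool(pooled, 2, axis=1)      # then rotations
    pooled = moment_pool(pooled, np.inf, axis=0) # then scales
    return pooled / (np.linalg.norm(pooled) + 1e-12)  # L2-normalize
```

Because each pooling stage is invariant to reordering along its axis, the final descriptor is unchanged when, e.g., the translated copies are presented in a different order.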
PCANet: A Simple Deep Learning Baseline for Image Classification?
In this work, we propose a very simple deep learning network for image
classification which comprises only the very basic data processing components:
cascaded principal component analysis (PCA), binary hashing, and block-wise
histograms. In the proposed architecture, PCA is employed to learn multistage
filter banks, followed by simple binary hashing and block histograms for
indexing and pooling. This architecture is thus named a PCA network (PCANet)
and can be designed and learned extremely easily and efficiently. For
comparison and better understanding, we also introduce and study two simple
variations of the PCANet, namely RandNet and LDANet. They share the same
topology as PCANet, but their cascaded filters are either selected randomly or
learned by LDA. We have tested these basic networks extensively on many
benchmark visual datasets for different tasks: LFW for face verification;
MultiPIE, Extended Yale B, AR, and FERET for face recognition; and MNIST for
handwritten digit recognition. Surprisingly, for all tasks, this seemingly
naive PCANet model is on par with state-of-the-art features, whether prefixed,
highly hand-crafted, or carefully learned (by DNNs). Even more surprisingly, it
sets new records for many classification tasks on the Extended Yale B, AR, and
FERET datasets and on MNIST variations. Additional experiments on other public
datasets also demonstrate the potential of PCANet as a simple but highly
competitive baseline for texture classification and object recognition.
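The three PCANet components named above (PCA filter banks, binary hashing, block histograms) can be sketched for a single stage as follows. The patch size, filter count, and block size are illustrative choices, not the paper's settings, and a full PCANet would cascade two such stages:

```python
import numpy as np

def pca_filters(images, patch=5, n_filters=4):
    # Learn a filter bank: collect all overlapping patches, remove each
    # patch's mean, and take the top principal components as filters.
    patches = []
    for img in images:
        for i in range(img.shape[0] - patch + 1):
            for j in range(img.shape[1] - patch + 1):
                p = img[i:i + patch, j:j + patch].ravel()
                patches.append(p - p.mean())
    X = np.stack(patches)                          # (num_patches, patch*patch)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:n_filters].reshape(n_filters, patch, patch)

def convolve_valid(img, f):
    # Plain 'valid' 2D cross-correlation with one filter.
    k = f.shape[0]
    out = np.zeros((img.shape[0] - k + 1, img.shape[1] - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + k, j:j + k] * f)
    return out

def pcanet_features(img, filters, block=7):
    # Binary hashing: threshold each filter response at zero and pack
    # the bits into one integer code per pixel.
    maps = [convolve_valid(img, f) > 0 for f in filters]
    codes = np.zeros(maps[0].shape, dtype=int)
    for bit, m in enumerate(maps):
        codes += (1 << bit) * m
    # Block-wise histograms of the codes form the final descriptor.
    n_bins = 1 << len(filters)
    hists = []
    for i in range(0, codes.shape[0] - block + 1, block):
        for j in range(0, codes.shape[1] - block + 1, block):
            blk = codes[i:i + block, j:j + block]
            hists.append(np.bincount(blk.ravel(), minlength=n_bins))
    return np.concatenate(hists)
```

The whole pipeline involves no backpropagation: the only "training" is one SVD per stage, which is why PCANet is so cheap to learn.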
Application of Deep Learning on Predicting Prognosis of Acute Myeloid Leukemia with Cytogenetics, Age, and Mutations
We explore how Deep Learning (DL) can be utilized to predict prognosis of
acute myeloid leukemia (AML). Out of TCGA (The Cancer Genome Atlas) database,
94 AML cases are used in this study. Input data include age, 10 common
cytogenetic results, and the 23 most common mutations; the output is the
prognosis (time from diagnosis to death, DTD). In our DL network, autoencoders
are stacked to form a hierarchical DL model in which raw data are compressed
and organized and high-level features are extracted. The network is written in
the R language and is designed to predict the prognosis of AML for a given case
(DTD of more than or less than 730 days). The DL network achieves an excellent accuracy of 83% in
predicting prognosis. As a proof-of-concept study, our preliminary results
demonstrate a practical application of DL in future practice of prognostic
prediction using next-gen sequencing (NGS) data.
Comment: 11 pages, 1 table, 1 figure. arXiv admin note: substantial text
overlap with arXiv:1801.0101
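The building block of such a stacked-autoencoder model, a single autoencoder layer that compresses raw clinical features into a code, can be sketched in a few lines. The paper's network is written in R; this is a minimal single-layer NumPy analogue with tied weights and illustrative hyperparameters, not the authors' implementation:

```python
import numpy as np

def train_autoencoder(X, hidden, lr=0.1, epochs=200, seed=0):
    # One tied-weight autoencoder layer: code H = sigmoid(X W + b),
    # reconstruction R = H W^T + c, trained by gradient descent on
    # the mean squared reconstruction error.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = 0.1 * rng.standard_normal((d, hidden))
    b, c = np.zeros(hidden), np.zeros(d)
    for _ in range(epochs):
        H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # encode
        E = H @ W.T + c - X                      # reconstruction error
        dH = (E @ W) * H * (1.0 - H)             # backprop through sigmoid
        W -= lr / n * (X.T @ dH + E.T @ H)       # tied-weight gradient
        b -= lr / n * dH.sum(axis=0)
        c -= lr / n * E.sum(axis=0)
    return W, b, c

def reconstruct(X, W, b, c):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ W.T + c
```

Stacking means training a second such layer on the codes `H` of the first; the top-level code then feeds a small classifier that predicts DTD above or below 730 days.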