Relay Backpropagation for Effective Learning of Deep Convolutional Neural Networks
Learning deeper convolutional neural networks has become a trend in recent
years. However, much empirical evidence suggests that performance improvement
cannot be gained by simply stacking more layers. In this paper, we consider
the issue from an information-theoretic perspective and propose a novel
method, Relay Backpropagation, which encourages the propagation of effective
information through the network during training. By virtue of this method, we
achieved first place in the ILSVRC 2015 Scene Classification Challenge.
Extensive experiments on two challenging large-scale datasets demonstrate
that the effectiveness of our method is not restricted to a specific dataset
or network architecture. Our models will be made available to the research
community later.

Comment: Technical report for our submissions to the ILSVRC 2015 Scene
Classification Challenge, where we won first place.
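The segment-wise gradient flow behind Relay Backpropagation can be sketched on a toy chain of scalar layers: an auxiliary loss at the end of each segment sends gradient only through that segment's own layers. This is only an illustration of the idea; the layer count, segment boundaries, and squared loss here are hypothetical choices, not the paper's configuration.

```python
# Toy illustration of segment-wise gradient flow (the core idea behind
# Relay Backpropagation) on a chain of scalar "layers" h_{i+1} = h_i * w_i.
# Segment boundaries and the squared loss are illustrative assumptions,
# not the configuration used in the paper.

def forward(ws, x):
    """Return all intermediate activations h_0..h_n."""
    hs = [x]
    for w in ws:
        hs.append(hs[-1] * w)
    return hs

def relay_grads(ws, x, target, segments):
    """Gradient of a squared loss at the end of each segment, propagated
    back only through that segment's own layers (earlier layers receive
    nothing from this segment's loss -- unlike standard backprop)."""
    hs = forward(ws, x)
    grads = [0.0] * len(ws)
    for start, end in segments:       # segment covers layers start..end-1
        out = hs[end]                 # activation at the segment's end
        g = 2.0 * (out - target)      # d(out - target)^2 / d out
        for i in range(end - 1, start - 1, -1):
            grads[i] += g * hs[i]     # dL/dw_i via the chain rule
            g *= ws[i]                # carry gradient to the earlier layer
        # gradient flow stops here: layers before `start` are untouched
    return grads

ws = [0.5, 2.0, 1.5, 0.8]
# Two relay segments: the first loss updates layers 0-1 only, the second
# updates layers 2-3 only; no loss sends gradient outside its segment.
print(relay_grads(ws, x=1.0, target=2.0, segments=[(0, 2), (2, 4)]))
```

Under ordinary backpropagation the final loss would also reach layers 0-1; relaying deliberately truncates that path so each segment receives a strong, local error signal.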
Modeling the Temporal Nature of Human Behavior for Demographics Prediction
Mobile phone metadata is increasingly used for humanitarian purposes in
developing countries as traditional data is scarce. Basic demographic
information is however often absent from mobile phone datasets, limiting the
operational impact of the datasets. For these reasons, there has been a growing
interest in predicting demographic information from mobile phone metadata.
Previous work focused on creating increasingly advanced features to be modeled
with standard machine learning algorithms. We here instead model the raw mobile
phone metadata directly using deep learning, exploiting the temporal nature of
the patterns in the data. From high-level assumptions we design a data
representation and convolutional network architecture for modeling patterns
within a week. We then examine three strategies for aggregating patterns across
weeks and show that our method reaches state-of-the-art accuracy on both age
and gender prediction using only the temporal modality in mobile metadata. We
finally validate our method on low activity users and evaluate the modeling
assumptions.

Comment: Accepted at ECML 2017. A previous version of this paper was titled
'Using Deep Learning to Predict Demographics from Mobile Phone Metadata' and
was accepted at the ICLR 2016 workshop.
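The "patterns within a week" idea can be sketched by bucketing timestamped interaction events into an hour-of-day by day-of-week count grid, an image-like input a convolutional network can consume. The function name and single-channel count representation below are illustrative assumptions; the paper's actual channels and normalisation may differ.

```python
from datetime import datetime

def week_grid(timestamps):
    """Bucket ISO-format timestamps into a 24 x 7 (hour-of-day x weekday)
    count grid. A CNN can then convolve over this grid to pick up weekly
    behavioural patterns. Illustrative sketch only: the paper's exact
    representation (channels per interaction type, normalisation) may differ."""
    grid = [[0] * 7 for _ in range(24)]
    for ts in timestamps:
        t = datetime.fromisoformat(ts)
        grid[t.hour][t.weekday()] += 1   # weekday(): Monday = 0 .. Sunday = 6
    return grid

calls = ["2016-03-07T09:15:00",   # Monday morning
         "2016-03-07T09:45:00",   # Monday morning again
         "2016-03-12T22:05:00"]   # Saturday night
g = week_grid(calls)
print(g[9][0], g[22][5])   # prints: 2 1
```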
Learning to Extract Motion from Videos in Convolutional Neural Networks
This paper shows how to extract dense optical flow from videos with a
convolutional neural network (CNN). The proposed model constitutes a potential
building block for deeper architectures to allow using motion without resorting
to an external algorithm, e.g. for recognition in videos. We derive our network
architecture from signal processing principles to provide desired invariances
to image contrast, phase and texture. We constrain weights within the network
to enforce strict rotation invariance and substantially reduce the number of
parameters to learn. We demonstrate end-to-end training on only 8 sequences of
the Middlebury dataset, orders of magnitude less than competing CNN-based
motion estimation methods, and obtain comparable performance to classical
methods on the Middlebury benchmark. Importantly, our method outputs a
distributed representation of motion that allows representing multiple,
transparent motions, and dynamic textures. Our contributions on network design
and rotation invariance offer insights that are not specific to motion estimation.
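One simple way to constrain weights toward rotation invariance, in the spirit described above, is to tie a bank of filters to 90-degree rotations of a single base kernel, so only the base kernel's parameters are learned. This is a hedged sketch of weight tying under axis-aligned rotations, not the paper's exact signal-processing-derived constraint.

```python
def rot90(k):
    """Rotate a square 2-D kernel (list of rows) 90 degrees clockwise."""
    n = len(k)
    return [[k[n - 1 - c][r] for c in range(n)] for r in range(n)]

def tied_filters(base):
    """A bank of four filters sharing one set of learned parameters: the
    base kernel and its 90/180/270-degree rotations. Only `base` is free,
    cutting this bank's parameter count by 4x -- a simplified version of
    constraining weights to enforce rotation invariance."""
    bank = [base]
    for _ in range(3):
        bank.append(rot90(bank[-1]))
    return bank

base = [[1, 0, -1],
        [2, 0, -2],
        [1, 0, -1]]          # a Sobel-like edge filter, for illustration
bank = tied_filters(base)
```

During training, gradients for all four filters would be accumulated back into the single base kernel, which is how the parameter reduction is realised.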
Learning multiple views with orthogonal denoising autoencoders
Multi-view learning techniques are necessary when data is described by
multiple distinct feature sets, because single-view learning algorithms tend
to overfit on such high-dimensional data. Prior successful approaches
followed either the consensus or the complementary principle. Recent work has
focused on learning both the shared and private latent spaces of views in
order to take advantage of both principles. However, these methods cannot
ensure that the latent spaces are strictly independent merely by encouraging
orthogonality in their objective functions. Also, little work has explored
representation learning techniques for multi-view learning. In this paper, we
use the denoising autoencoder to learn shared and private latent spaces, with
orthogonal constraints that disconnect every private latent space from the
remaining views. Instead of computationally expensive optimization, we adapt
the backpropagation algorithm to train our model.
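The orthogonality that prior methods merely encourage is commonly expressed as a penalty such as the squared Frobenius norm of W_s^T W_p between shared and private weight matrices, which is zero exactly when every column of one is orthogonal to every column of the other. A minimal sketch of that penalty follows (the function name is hypothetical); the paper argues for hard constraints instead of this soft term.

```python
def frobenius_sq_cross(Ws, Wp):
    """Squared Frobenius norm of Ws^T Wp for two weight matrices given as
    lists of rows (d x k and d x m). The value is zero exactly when every
    column of Ws is orthogonal to every column of Wp -- the soft
    orthogonality term that objective-function-based methods add, which
    the abstract argues cannot guarantee strict independence."""
    d, k = len(Ws), len(Ws[0])
    total = 0.0
    for i in range(k):                  # column i of Ws
        for j in range(len(Wp[0])):     # column j of Wp
            dot = sum(Ws[r][i] * Wp[r][j] for r in range(d))
            total += dot * dot
    return total

# Orthogonal columns give zero penalty; overlapping columns give a
# positive penalty that the optimiser can only shrink, never eliminate.
Ws = [[1.0], [0.0]]
Wp = [[0.0], [1.0]]
print(frobenius_sq_cross(Ws, Wp))   # -> 0.0
```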
Deep Learning Application in Security and Privacy - Theory and Practice: A Position Paper
Technology is shaping our lives in a multitude of ways. This is fuelled by a
technology infrastructure, both legacy and state of the art, composed of a
heterogeneous group of hardware, software, services and organisations. Such
infrastructure faces a diverse range of challenges to its operations that
include security, privacy, resilience, and quality of services. Among these,
cybersecurity and privacy are taking the centre-stage, especially since the
General Data Protection Regulation (GDPR) came into effect. Traditional
security and privacy techniques are overstretched and adversarial actors have
evolved to design exploitation techniques that circumvent protection. With the
ever-increasing complexity of technology infrastructure, security and
privacy-preservation specialists have started to look for adaptable and
flexible protection methods that can evolve (potentially autonomously) as the
adversarial actor changes its techniques. For this, Artificial Intelligence
(AI), Machine Learning (ML) and Deep Learning (DL) were put forward as
saviours. In this paper, we look at the promises of AI, ML, and DL stated in
academic and industrial literature and evaluate how realistic they are. We also
put forward potential challenges a DL based security and privacy protection
technique has to overcome. Finally, we conclude the paper with a discussion on
what steps the DL and the security and privacy-preservation community have to
take to ensure that DL is not just hype, but an opportunity to build a
secure, reliable, and trusted technology infrastructure on which we can rely
for so much in our lives.
Deep passenger state monitoring using viewpoint warping
The advent of autonomous and semi-autonomous vehicles has meant that passengers now play a more significant role in the safety and comfort of vehicle journeys. In this paper, we propose a deep learning method to monitor and classify passenger state from camera data. The training of a convolutional neural network is supplemented by data captured from vehicle occupants in different seats and from different viewpoints. Existing driver data, or data from one vehicle, is augmented by viewpoint warping using planar homography, which does not require knowledge of the source camera parameters and overcomes the need to re-train the model with large amounts of additional data. To analyse the performance of our approach, data is collected on occupants in two different vehicles, from different viewpoints inside the vehicle. We show that the inclusion of the additional training data, augmented by homography, increases the average passenger state classification rate by 11.1%. We conclude by proposing how occupant state may be used holistically for activity recognition and intention prediction in intelligent vehicle features.
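The viewpoint-warping step rests on the fact that a planar homography maps pixel coordinates directly, with no source camera parameters involved. A minimal sketch of the warp itself, assuming an illustrative 3x3 matrix H (in practice H would be estimated, e.g. from point correspondences between viewpoints):

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through a 3x3 planar homography H (list of rows)
    using homogeneous coordinates. The mapping operates purely on image
    coordinates -- no source camera parameters are needed, which is the
    property the data augmentation relies on. H here is illustrative."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w  = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w            # divide out the projective scale

# An illustrative homography: uniform scaling plus a small projective term.
H = [[2.0, 0.0,   0.0],
     [0.0, 2.0,   0.0],
     [0.0, 0.001, 1.0]]
print(apply_homography(H, 100.0, 200.0))   # -> (166.66..., 333.33...)
```

Warping existing occupant images through such a transform simulates new camera viewpoints, which is how the training set is expanded without recapturing data.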