424,501 research outputs found

Representation Learning by Learning to Count

Authors: Favaro Paolo; Noroozi Mehdi; Pirsiavash Hamed
Publication date: 01/01/2017

We introduce a novel method for representation learning that uses an artificial supervision signal based on counting visual primitives. This supervision signal is obtained from an equivariance relation, which does not require any manual annotation. We relate transformations of images to transformations of the representations. More specifically, we look for the representation that satisfies such relation rather than the transformations that match a given representation. In this paper, we use two image transformations in the context of counting: scaling and tiling. The first transformation exploits the fact that the number of visual primitives should be invariant to scale. The second transformation allows us to equate the total number of visual primitives in each tile to that in the whole image. These two transformations are combined in one constraint and used to train a neural network with a contrastive loss. The proposed task produces representations that perform on par with or exceed the state of the art in transfer learning benchmarks.

Comment: ICCV 2017 (oral)
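The scaling and tiling constraint described in this abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: `toy_phi` is a hand-written stand-in for the learned network (total pixel intensity), `downsample` uses 2x2 sum pooling so that the toy feature satisfies the constraint exactly, and the margin value is arbitrary. The abstract itself only states that the two transformations are combined in one constraint and trained with a contrastive loss.

```python
import numpy as np

def downsample(x):
    """Halve each side by summing 2x2 blocks (toy choice that preserves total mass)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3))

def tiles(x):
    """Split an image into four non-overlapping quadrants."""
    h, w = x.shape
    return [x[:h // 2, :w // 2], x[:h // 2, w // 2:],
            x[h // 2:, :w // 2], x[h // 2:, w // 2:]]

def toy_phi(x):
    """Hypothetical stand-in for the learned 'counting' feature: total intensity."""
    return np.array([x.sum()], dtype=float)

def counting_loss(phi, x, y, margin=10.0):
    """Contrastive sketch: the count of the downsampled image x should equal
    the summed counts of its tiles, while a different image y should not."""
    tile_sum = sum(phi(t) for t in tiles(x))
    same = np.sum((phi(downsample(x)) - tile_sum) ** 2)
    diff = np.sum((phi(downsample(y)) - tile_sum) ** 2)
    return float(same + max(0.0, margin - diff))

x = np.ones((4, 4))    # toy image with total intensity 16
y = np.zeros((4, 4))   # a different toy image with total intensity 0
print(counting_loss(toy_phi, x, y))
```

In a real setup `toy_phi` would be replaced by a trainable network, and the loss would be minimized over its parameters rather than evaluated on a fixed feature.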

arXiv.org e-Print Archive

Bern Open Repository and Information System (BORIS)

Learning to count with deep object features

Authors: Pujol Oriol; Seguí Santi; Vitrià Jordi
Publication date: 29/05/2015

Learning to count is a learning strategy that has been recently proposed in the literature for dealing with problems where estimating the number of object instances in a scene is the final objective. In this framework, the task of learning to detect and localize individual object instances is seen as a harder task that can be evaded by casting the problem as that of computing a regression value from hand-crafted image features. In this paper we explore the features that are learned when training a counting convolutional neural network in order to understand their underlying representation. To this end we define a counting problem for MNIST data and show that the internal representation of the network is able to classify digits in spite of the fact that no direct supervision was provided for them during training. We also present preliminary results about a deep network that is able to count the number of pedestrians in a scene.

Comment: This paper has been accepted at Deep Vision Workshop at CVPR 201

arXiv.org e-Print Archive

Crossref

FCN-rLSTM: Deep Spatio-Temporal Neural Networks for Vehicle Counting in City Cameras

Authors: Costeira João P.; Moura José M. F.; Wu Guanhang; Zhang Shanghang
Publication date: 31/07/2017

In this paper, we develop deep spatio-temporal neural networks to sequentially count vehicles from low quality videos captured by city cameras (citycams). Citycam videos have low resolution, low frame rate, high occlusion and large perspective, making most existing methods lose their efficacy. To overcome limitations of existing methods and incorporate the temporal information of traffic video, we design a novel FCN-rLSTM network to jointly estimate vehicle density and vehicle count by connecting fully convolutional neural networks (FCN) with long short term memory networks (LSTM) in a residual learning fashion. Such a design leverages the strengths of FCN for pixel-level prediction and the strengths of LSTM for learning complex temporal dynamics. The residual learning connection reformulates the vehicle count regression as learning residual functions with reference to the sum of densities in each frame, which significantly accelerates the training of networks. To preserve feature map resolution, we propose a Hyper-Atrous combination to integrate atrous convolution in FCN and combine feature maps of different convolution layers. FCN-rLSTM enables refined feature representation and a novel end-to-end trainable mapping from pixels to vehicle count. We extensively evaluated the proposed method on different counting tasks with three datasets, with experimental results demonstrating its effectiveness and robustness. In particular, FCN-rLSTM reduces the mean absolute error (MAE) from 5.31 to 4.21 on TRANCOS, and reduces the MAE from 2.74 to 1.53 on WebCamT. The training process is accelerated by 5 times on average.

Comment: Accepted by International Conference on Computer Vision (ICCV), 201
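The residual formulation mentioned in this abstract, where the count is expressed relative to the sum of densities in each frame, can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the density maps and residuals are passed in as plain arrays (in the paper they would come from the FCN and the LSTM respectively), and no network is trained here.

```python
import numpy as np

def residual_counts(density_maps, residuals):
    """Per-frame count = sum of that frame's density map + a residual correction.

    density_maps: array of shape (n_frames, height, width)
    residuals:    array of shape (n_frames,), a placeholder for what a
                  temporal model would predict
    """
    base = density_maps.sum(axis=(1, 2))  # integrate the density over each frame
    return base + residuals

def mean_absolute_error(pred, truth):
    """MAE, the metric quoted for TRANCOS and WebCamT in the abstract."""
    return float(np.mean(np.abs(np.asarray(pred) - np.asarray(truth))))

# Two toy 2x2 frames, each integrating to 2.0 vehicles.
density_maps = np.full((2, 2, 2), 0.5)
residuals = np.array([1.0, -1.0])
counts = residual_counts(density_maps, residuals)
print(counts, mean_absolute_error(counts, [3.0, 2.0]))
```

Because the density sum already gives a rough count, the residual term only has to learn a small correction, which is the intuition the abstract offers for the faster training.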

arXiv.org e-Print Archive

Crossref

Speaker recognition with hybrid features from a deep belief network

Authors: AR Mohamed; Artur S. d’Avila Garcez; C Burges; Emmanouil Benetos; F Richardson; GE Hinton; H Ali; H Lee; Hazrat Ali; L Deng; N Dehak; N Roux Le; Son N. Tran; T Kinnunen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/08/2016

Learning representations from audio data has shown advantages over handcrafted features such as mel-frequency cepstral coefficients (MFCCs) in many audio applications. In most representation learning approaches, connectionist systems have been used to learn and extract latent features from fixed-length data. In this paper, we propose an approach to combine the learned features and the MFCC features for the speaker recognition task, which can be applied to audio scripts of different lengths. In particular, we study the use of features from different levels of a deep belief network (DBN) for quantizing the audio data into vectors of audio word counts. These vectors represent audio scripts of different lengths in a form that makes it easier to train a classifier. We show experimentally that the audio word count vectors generated from a mixture of DBN features at different layers give better performance than the MFCC features. We can also achieve further improvement by combining the audio word count vector and the MFCC features.
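The idea of quantizing variable-length audio into fixed-length vectors of audio word counts can be sketched as a nearest-codeword histogram. The abstract does not say how the codebook is built or which distance is used, so the codebook here is simply given, the distance is Euclidean, and the function name is hypothetical.

```python
import numpy as np

def audio_word_counts(frames, codebook):
    """Turn a (n_frames, dim) feature matrix into a histogram of codeword hits.

    The output length equals the codebook size, regardless of n_frames,
    which is what lets clips of different lengths share one classifier.
    """
    # Squared Euclidean distance from every frame to every codeword.
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)  # index of the nearest codeword per frame
    return np.bincount(words, minlength=len(codebook))

codebook = np.array([[0.0, 0.0], [10.0, 10.0]])           # two toy "audio words"
short_clip = np.array([[0.0, 1.0], [9.0, 9.0], [1.0, 0.0]])
long_clip = np.array([[8.0, 9.0]] * 5)
print(audio_word_counts(short_clip, codebook))
print(audio_word_counts(long_clip, codebook))
```

Both clips map to vectors of length two even though they contain three and five frames.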

City Research Online

Crossref

University of Tasmania Open Access Repository

Queen Mary Research Online

Learning Contact-Rich Manipulation Skills with Guided Policy Search

Authors: Abbeel Pieter; Levine Sergey; Wagener Nolan
Publication date: 26/02/2015

Autonomous learning of object manipulation skills can enable robots to acquire rich behavioral repertoires that scale to the variety of objects found in the real world. However, current motion skill learning methods typically restrict the behavior to a compact, low-dimensional representation, limiting its expressiveness and generality. In this paper, we extend a recently developed policy search method \cite{la-lnnpg-14} and use it to learn a range of dynamic manipulation behaviors with highly general policy representations, without using known models or example demonstrations. Our approach learns a set of trajectories for the desired motion skill by using iteratively refitted time-varying linear models, and then unifies these trajectories into a single control policy that can generalize to new situations. To enable this method to run on a real robot, we introduce several improvements that reduce the sample count and automate parameter selection. We show that our method can acquire fast, fluent behaviors after only minutes of interaction time, and can learn robust controllers for complex tasks, including putting together a toy airplane, stacking tight-fitting Lego blocks, placing wooden rings onto tight-fitting pegs, inserting a shoe tree into a shoe, and screwing bottle caps onto bottles.

arXiv.org e-Print Archive

Crossref

Classification of ductile cast iron specimens: A machine learning approach

Authors: De Santis Alberto; Di Cocco Vittorio; Iacoviello Daniela; Iacoviello Francesco
Publication venue: 'Gruppo Italiano Frattura'
Publication date: 01/01/2017

In this paper, an automatic procedure based on a machine learning approach is proposed to classify ductile cast iron specimens according to the American Society for Testing and Materials guidelines. The mechanical properties of a specimen are strongly influenced by the peculiar morphology of its graphite elements. Useful characteristics (the features) are extracted from the specimens' images; these describe the shape, distribution and size of the graphite particles in the specimen, the nodularity, and the nodule count. Principal component analysis is used to provide a more efficient representation of these data. Support vector machines are trained to classify the data through a sequence of binary classification steps. Numerical analysis is performed on a significant number of images and provides robust results, even in the presence of dust, scratches and measurement noise.

IRIS Unicas (Università degli Studi di Cassino e del Lazio Meridionale)

Italian Group Fracture (IGF): E-Journals / Gruppo Italiano Frattura

Directory of Open Access Journals

Archivio della ricerca- Università di Roma La Sapienza

The impact of polyhedron learning assisted by Edpuzzle in improving students' mathematical representation

Authors: Kusumah Yaya S.; Martadiputra Bambang Avip Priatna; Rosdianwinata Eka; Sartika Nenden Suciyati; Sutihat Sutihat
Publication venue: 'Universitas Hamzanwadi'
Publication date: 02/01/2023

Mathematical representation is an ability that students must have. This research aims to determine whether eighth-grade students at Al-Qona'ah Islamic Junior High School improve their mathematical representation skills after using the Edpuzzle application to learn how to construct polyhedrons. It used a quantitative, quasi-experimental design with control and experimental classes. The population comprised 60 students (23 males and 37 females). The study found that students' mathematical representation abilities did not improve with the aid of the Edpuzzle application on the polyhedron topic, owing to the need for more learning support facilities when using the application. This finding is supported by the t-test, which yielded t-count < t-table (-29.2936 < -1.67155), so H0 is accepted. The n-gain calculation yielded a total n-gain score of 0.03, which is low, indicating that there was no progress in learning using the Edpuzzle application on the polyhedron topic. The intervention therefore had no impact on students' mathematical representation.
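The n-gain score reported in this abstract is commonly computed as a normalized gain between pre-test and post-test scores. The abstract does not give the formula or the category thresholds, so the sketch below assumes the widely used convention (gain divided by the maximum possible gain, with below 0.3 low, 0.3 up to 0.7 medium, and 0.7 or above high); the function names and the default maximum score of 100 are hypothetical.

```python
def normalized_gain(pre, post, max_score=100.0):
    """Normalized gain: achieved improvement divided by the possible improvement."""
    if max_score <= pre:
        raise ValueError("pre-test score must be below the maximum score")
    return (post - pre) / (max_score - pre)

def gain_category(g):
    """Assumed thresholds: below 0.3 low, 0.3 up to 0.7 medium, 0.7 or above high."""
    if g >= 0.7:
        return "high"
    if g >= 0.3:
        return "medium"
    return "low"

# A student moving from 50 to 75 out of 100 closes half of the remaining gap.
print(normalized_gain(50.0, 75.0), gain_category(normalized_gain(50.0, 75.0)))
# The score reported in the abstract falls in the lowest band under this convention.
print(gain_category(0.03))
```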

e-Journal of Hamzanwadi University