Distributed learning of CNNs on heterogeneous CPU/GPU architectures

Authors: Alexandre Luís A.; Falcao Gabriel; Marques Jose
Publication date: 07/12/2017

Convolutional Neural Networks (CNNs) have proven to be powerful classification tools in tasks that range from check reading to medical diagnosis, reaching close to human perception and in some cases surpassing it. However, the problems to solve are becoming larger and more complex, which translates to larger CNNs and longer training times that not even the adoption of Graphics Processing Units (GPUs) could keep up with. This problem is partially solved by using more processing units and the distributed training methods offered by several frameworks dedicated to neural network training. However, these techniques do not take full advantage of the parallelization offered by CNNs or of the cooperative use of heterogeneous devices with different processing capabilities, clock speeds, and memory sizes, among others. This paper presents a new method for the parallel training of CNNs that can be considered a particular instantiation of model parallelism, where only the convolutional layer is distributed. In fact, the convolutions processed during training (forward and backward propagation included) represent from 60% to 90% of global processing time. The paper analyzes the influence of network size, bandwidth, batch size, number of devices (including their processing capabilities), and other parameters. Results show that this technique is capable of diminishing the training time without affecting the classification performance for both CPUs and GPUs. For the CIFAR-10 dataset, using a CNN with two convolutional layers with 500 and 1500 kernels, respectively, the best speedups reach 3.28× using four CPUs and 2.45× with three GPUs. Modern imaging datasets, larger and more complex than CIFAR-10, will certainly require more than 60% to 90% of processing time calculating convolutions, and speedups will tend to increase accordingly.
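The idea of distributing only the convolutional layer can be illustrated with a small sketch. This is not the paper's implementation: it is a minimal, single-process illustration that assumes the layer's kernels are split across devices in proportion to a relative speed figure, that each device computes the feature maps for its own kernels, and that the results are then gathered. The names `conv2d_valid`, `split_kernels`, `distributed_conv_layer`, and the `speeds` parameter are hypothetical and chosen only for this example.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Single-channel 'valid' cross-correlation, the operation CNN layers apply."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def split_kernels(num_kernels: int, speeds: Sequence[float]) -> List[range]:
    """Give each device a contiguous block of kernels proportional to its speed.

    Assumption for this sketch: 'speeds' are relative throughput figures.
    Leftover kernels go to the devices with the largest fractional share.
    """
    total = sum(speeds)
    shares = [num_kernels * s / total for s in speeds]
    counts = [int(share) for share in shares]
    leftover = num_kernels - sum(counts)
    by_remainder = sorted(range(len(speeds)),
                          key=lambda d: shares[d] - counts[d], reverse=True)
    for d in by_remainder[:leftover]:
        counts[d] += 1
    blocks, start = [], 0
    for count in counts:
        blocks.append(range(start, start + count))
        start += count
    return blocks


def distributed_conv_layer(image: Matrix, kernels: List[Matrix],
                           speeds: Sequence[float]) -> List[Matrix]:
    """Compute one feature map per kernel, block by block.

    Each block stands in for the work of one device; a real system would run
    the blocks in parallel and send the feature maps back to a master node.
    """
    feature_maps: List[Matrix] = [[] for _ in kernels]
    for block in split_kernels(len(kernels), speeds):
        for k in block:
            feature_maps[k] = conv2d_valid(image, kernels[k])
    return feature_maps
```

Because each kernel produces its feature map independently, the gathered result is identical to computing the whole layer on one device, which is consistent with the abstract's claim that classification performance is unaffected.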

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

UBibliorum repositorio digital da ubi

Directory of Open Access Journals

Distribution-Based Categorization of Classifier Transfer Learning

Authors: Alexandre Luís A.; de Sá Joaquim Marques; Santos Jorge M.; Silva Luís M.; Sousa Ricardo Gamelas
Publication date: 06/12/2017

Transfer Learning (TL) aims to transfer knowledge acquired in one problem, the source problem, onto another problem, the target problem, dispensing with the bottom-up construction of the target model. Due to its relevance, TL has gained significant interest in the Machine Learning community, since it paves the way to intelligent learning models that can easily be tailored to many different applications. As is natural in a fast-evolving area, a wide variety of TL methods, settings, and nomenclature have been proposed so far, and many works use different names for the same concepts. This mixture of concepts and terminology obscures the TL field and hinders its proper consideration. In this paper we present a review of the literature on the majority of classification TL methods, together with a distribution-based categorization of TL with a common nomenclature suitable for classification problems. Under this perspective, three main TL categories are presented, discussed, and illustrated with examples.

arXiv.org e-Print Archive

UBibliorum repositorio digital da ubi

A logic for n-dimensional hierarchical refinement

Authors: Barbosa Luís S.; Madeira Alexandre; Martins Manuel A.
Publication venue: 'Open Publishing Association'
Publication date: 01/06/2016

Hierarchical transition systems provide a popular mathematical structure to represent state-based software applications in which different layers of abstraction are represented by inter-related state machines. The decomposition of high-level states into inner sub-states, and of their transitions into inner sub-transitions, is a common refinement procedure adopted in a number of specification formalisms. This paper introduces a hybrid modal logic for k-layered transition systems, its first-order standard translation, a notion of bisimulation, and a modal invariance result. Layered and hierarchical notions of refinement are also discussed in this setting. Comment: In Proceedings Refine'15, arXiv:1606.0134

arXiv.org e-Print Archive

Directory of Open Access Journals