7,696 research outputs found

    Deep Learning in the Automotive Industry: Applications and Tools

    Deep learning refers to a set of machine learning techniques that use neural networks with many hidden layers for tasks such as image classification, speech recognition, and language understanding. Deep learning has proven very effective in these domains and is used pervasively by many Internet services. In this paper, we describe different automotive use cases for deep learning, in particular in the domain of computer vision. We survey the current state of the art in libraries, tools, and infrastructures (e.g., GPUs and clouds) for implementing, training, and deploying deep neural networks. We particularly focus on convolutional neural networks and computer vision use cases, such as the visual inspection process in manufacturing plants and the analysis of social media data. To train neural networks, curated and labeled datasets are essential. However, both the availability and the scope of such datasets are typically very limited. A main contribution of this paper is the creation of an automotive dataset that allows us to learn and automatically recognize different vehicle properties. We describe an end-to-end deep learning application that uses a mobile app for data collection and process support, and an Amazon-based cloud backend for storage and training. For training, we evaluate the use of cloud and on-premises infrastructures (including multiple GPUs) in conjunction with different neural network architectures and frameworks. We assess both the training times and the accuracy of the classifier. Finally, we demonstrate the effectiveness of the trained classifier in a real-world setting during the manufacturing process. Comment: 10 pages

    Distributed learning of CNNs on heterogeneous CPU/GPU architectures

    Convolutional Neural Networks (CNNs) have been shown to be powerful classification tools in tasks that range from check reading to medical diagnosis, reaching close to human perception and in some cases surpassing it. However, the problems to solve are becoming larger and more complex, which translates to larger CNNs and to training times so long that not even the adoption of Graphics Processing Units (GPUs) can keep up with them. This problem is partially solved by using more processing units and the distributed training methods offered by several frameworks dedicated to neural network training. However, these techniques do not take full advantage of the parallelization that CNNs allow, nor of the cooperative use of heterogeneous devices that differ in processing capability, clock speed, memory size, and other characteristics. This paper presents a new method for the parallel training of CNNs that can be considered a particular instantiation of model parallelism, where only the convolutional layer is distributed. In fact, the convolutions processed during training (forward and backward propagation included) represent 60-90% of the global processing time. The paper analyzes the influence of network size, bandwidth, batch size, number of devices (including their processing capabilities), and other parameters. Results show that this technique can reduce the training time without affecting the classification performance for both CPUs and GPUs. For the CIFAR-10 dataset, using a CNN with two convolutional layers of 500 and 1500 kernels, respectively, the best speedups reach 3.28× using four CPUs and 2.45× with three GPUs. Modern imaging datasets, larger and more complex than CIFAR-10, will certainly spend more than 60-90% of processing time calculating convolutions, and speedups will tend to increase accordingly.
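
The idea of distributing only the convolutional layer across devices of unequal capability can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the proportional split by a relative `speeds` list, and the naive single-channel convolution are all assumptions made for illustration, and the "devices" here are simulated by a sequential loop.

```python
def split_kernels(num_kernels, speeds):
    """Return how many kernels each device gets, proportional to its speed.

    `speeds` is a hypothetical list of relative device throughputs.
    """
    total = sum(speeds)
    counts = [int(num_kernels * s / total) for s in speeds]
    # Hand any leftover kernels to the fastest devices first.
    leftover = num_kernels - sum(counts)
    fastest_first = sorted(range(len(speeds)), key=lambda i: -speeds[i])
    for i in fastest_first[:leftover]:
        counts[i] += 1
    return counts


def conv2d_valid(image, kernel):
    """Naive 'valid' cross-correlation of a 2-D image with one 2-D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def distributed_conv_layer(image, kernels, speeds):
    """Compute all feature maps, one contiguous shard of kernels per device."""
    counts = split_kernels(len(kernels), speeds)
    feature_maps, start = [], 0
    for count in counts:
        # In a real system this shard would be sent to a separate device;
        # here the devices are simulated one after another.
        shard = kernels[start:start + count]
        feature_maps.extend(conv2d_valid(image, k) for k in shard)
        start += count
    return feature_maps
```

Because each kernel produces its feature map independently, concatenating the shards in order gives the same output as computing the layer on a single device, which is consistent with the abstract's claim that classification performance is unaffected.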

    Adaptive transfer functions: improved multiresolution visualization of medical models

    The final publication is available at Springer via http://dx.doi.org/10.1007/s00371-016-1253-9. Medical datasets are continuously increasing in size. Although larger models may be available for certain research purposes, in common clinical practice the models usually have up to 512×512×2000 voxels. These resolutions exceed the capabilities of conventional GPUs, such as those usually found in medical doctors' desktop PCs. Commercial solutions typically reduce the data by downsampling the dataset iteratively until it fits the available target specifications. The data loss reduces the visualization quality, and this loss is not commonly compensated with other actions that might alleviate its effects. In this paper, we propose adaptive transfer functions, an algorithm that improves the transfer function in downsampled multiresolution models so that the quality of renderings is greatly improved. The technique is simple and lightweight, and it is suitable not only for visualizing huge models that would not fit in a GPU, but also for rendering not-so-large models on mobile GPUs, which are less capable than their desktop counterparts. Moreover, it can also be used to accelerate rendering frame rates by using lower levels of the multiresolution hierarchy while still maintaining high-quality results in a focus-and-context approach. We also show an evaluation of these results based on perceptual metrics. Peer Reviewed. Postprint (author's final draft)
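
The iterative downsampling that the abstract attributes to commercial solutions (the baseline, not the adaptive transfer function algorithm itself) can be sketched as follows. The function name, the halving of every axis per level, the 2 bytes per voxel, and the memory budgets are assumptions chosen for illustration; the abstract gives only the 512×512×2000 resolution.

```python
def downsample_to_fit(dims, bytes_per_voxel, budget_bytes):
    """Halve every axis until the volume fits the budget.

    Returns the final (x, y, z) dimensions and the number of levels applied.
    """
    levels = 0
    while dims[0] * dims[1] * dims[2] * bytes_per_voxel > budget_bytes:
        if max(dims) == 1:
            raise ValueError("volume cannot be reduced to fit the budget")
        dims = tuple(max(1, d // 2) for d in dims)
        levels += 1
    return dims, levels


# Hypothetical setup: 16-bit voxels and a 256 MiB memory budget.
full = (512, 512, 2000)
print(downsample_to_fit(full, 2, 256 * 1024**2))  # ((256, 256, 1000), 1)
```

Under these assumptions the full volume needs about 1 GB, and each level keeps only one eighth of the voxels, which is the data loss the abstract describes.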