Fully automatic extraction of salient objects from videos in near real-time
Automatic video segmentation plays an important role in a wide range of
computer vision and image processing applications. Recently, various methods
have been proposed for this purpose. However, most of them are far from
real-time processing, even for low-resolution videos, because of their complex
procedures. To address this, we propose a new and very fast method for
automatic video segmentation that combines 1) efficient optimization of Markov
random fields, polynomial in the number of pixels, via graph cuts, 2)
automatic, computationally efficient yet stable derivation of segmentation
priors using visual saliency and a sequential update mechanism, and 3) an
implementation strategy based on stream processing with graphics processing
units (GPUs). Test results indicate that our method extracts appropriate
regions from videos as precisely as, and much faster than, previous
semi-automatic methods, even though no supervision is incorporated.
Comment: submitted to the Special Issue on High Performance Computation on Hardware Accelerators, The Computer Journal
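The graph-cut step above maps a binary labeling energy (saliency-derived unary terms plus a smoothness penalty) onto an s-t min-cut problem. Below is a minimal CPU sketch of that construction on a 1-D "image", using Edmonds-Karp max-flow; the saliency values and the smoothness weight `lam` are made up for illustration, and this is not the paper's GPU implementation.

```python
from collections import deque

def min_cut_segment(saliency, lam=0.1):
    """Binary MRF segmentation via s-t min cut (Edmonds-Karp max-flow).
    Unary costs come from a saliency prior in [0, 1]; pairwise terms
    penalise label changes between neighbouring pixels."""
    n = len(saliency)
    S, T = n, n + 1                      # source (foreground) and sink (background)
    cap = {}                             # residual capacities

    def add_edge(u, v, c):
        cap[(u, v)] = cap.get((u, v), 0.0) + c
        cap.setdefault((v, u), 0.0)

    for p, s in enumerate(saliency):
        add_edge(S, p, s)                # cut if p labelled bg -> pay D_p(bg) = s
        add_edge(p, T, 1.0 - s)          # cut if p labelled fg -> pay D_p(fg) = 1 - s
    for p in range(n - 1):               # 1-D neighbourhood smoothness
        add_edge(p, p + 1, lam)
        add_edge(p + 1, p, lam)

    def bfs_path():
        parent = {S: None}
        q = deque([S])
        while q:
            u = q.popleft()
            if u == T:
                break
            for v in range(n + 2):
                if v not in parent and cap.get((u, v), 0.0) > 1e-12:
                    parent[v] = u
                    q.append(v)
        return parent if T in parent else None

    while True:                          # augment until no s-t path remains
        parent = bfs_path()
        if parent is None:
            break
        path, v = [], T
        while v != S:
            path.append((parent[v], v)); v = parent[v]
        f = min(cap[e] for e in path)    # bottleneck capacity
        for (u, v) in path:
            cap[(u, v)] -= f
            cap[(v, u)] += f

    # pixels still reachable from the source lie on the foreground side
    seen, q = {S}, deque([S])
    while q:
        u = q.popleft()
        for v in range(n + 2):
            if v not in seen and cap.get((u, v), 0.0) > 1e-12:
                seen.add(v); q.append(v)
    return [1 if p in seen else 0 for p in range(n)]
```

Because the pairwise terms are submodular, the min cut gives the exact global minimum of the energy, which is what makes graph cuts attractive compared with iterative approximations.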
\texttt{GooStats}: A GPU-based framework for multi-variate analysis in particle physics
\texttt{GooStats} is a software framework that provides a flexible
environment and common tools to implement multi-variate statistical analysis.
The framework is built upon the \texttt{CERN ROOT}, \texttt{MINUIT} and
\texttt{GooFit} packages. Running a multi-variate analysis in parallel on
graphics processing units yields a huge boost in performance and opens new
possibilities. The design and benchmark of \texttt{GooStats} are presented in
this article, along with an illustration of its application to statistical
problems.
Comment: 16 pages, 10 figures
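The core task such a framework parallelises is evaluating a binned likelihood over many bins at once. As a toy stand-in (not the GooStats/MINUIT API; the flat-spectrum model and grid scan are illustrative assumptions), here is a binned Poisson negative log-likelihood fit of a single rate parameter:

```python
import numpy as np

def poisson_nll(counts, expected):
    # binned Poisson negative log-likelihood, constant terms dropped;
    # the sum over bins is the data-parallel part a GPU would accelerate
    expected = np.clip(expected, 1e-12, None)
    return float(np.sum(expected - counts * np.log(expected)))

rng = np.random.default_rng(0)
counts = rng.poisson(4.0, size=50)            # toy "spectrum": flat true rate 4.0
rates = np.linspace(1.0, 8.0, 701)            # brute-force parameter grid scan
nll = [poisson_nll(counts, np.full(50, r)) for r in rates]
best = rates[int(np.argmin(nll))]
```

For a flat rate the maximum-likelihood estimate is simply the sample mean, which makes the scan easy to sanity-check; a real fit would hand `poisson_nll` to a minimiser such as MINUIT instead.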
Kinematic Modelling of Disc Galaxies using Graphics Processing Units
With large-scale Integral Field Spectroscopy (IFS) surveys of thousands of
galaxies currently underway or planned, the astronomical community is in need
of methods, techniques and tools that will allow the analysis of huge amounts
of data. We focus on the kinematic modelling of disc galaxies and investigate
the potential use of massively parallel architectures, such as the Graphics
Processing Unit (GPU), as an accelerator for the computationally expensive
model-fitting procedure. We review the algorithms involved in model-fitting and
evaluate their suitability for GPU implementation. We employ different
optimization techniques, including the Levenberg-Marquardt and Nested Sampling
algorithms, but also a naive brute-force approach based on Nested Grids. We
find that the GPU can accelerate the model-fitting procedure up to a factor of
~100 when compared to a single-threaded CPU, and up to a factor of ~10 when
compared to a multi-threaded dual CPU configuration. Our method's accuracy,
precision and robustness are assessed by successfully recovering the kinematic
properties of simulated data, and also by verifying the kinematic modelling
results of galaxies from the GHASP and DYNAMO surveys as found in the
literature. The resulting GBKFIT code is available for download from:
http://supercomputing.swin.edu.au/gbkfit
Comment: 34 pages, 16 figures, 8 tables. Accepted for publication in MNRAS
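The Levenberg-Marquardt optimiser mentioned above alternates between Gauss-Newton steps and damped gradient steps. The following is a compact sketch of that loop fitting a toy arctan rotation-curve model; the model form, starting point, and damping schedule are illustrative assumptions, not GBKFIT's implementation.

```python
import numpy as np

def model(r, p):
    v0, rt = p
    return v0 * (2 / np.pi) * np.arctan(r / rt)    # toy arctan rotation curve

def levenberg_marquardt(r, v, p, iters=50, lam=1e-3):
    for _ in range(iters):
        res = v - model(r, p)
        # forward-difference Jacobian of the residuals, column per parameter
        J = np.empty((r.size, p.size))
        for j in range(p.size):
            dp = np.zeros_like(p); dp[j] = 1e-6 * max(1.0, abs(p[j]))
            J[:, j] = -(model(r, p + dp) - model(r, p)) / dp[j]
        JtJ = J.T @ J
        step = np.linalg.solve(JtJ + lam * np.diag(np.diag(JtJ)), -J.T @ res)
        if np.sum((v - model(r, p + step)) ** 2) < np.sum(res ** 2):
            p = p + step; lam *= 0.5               # accept: trust the model more
        else:
            lam *= 10.0                            # reject: damp harder
    return p

r = np.linspace(0.1, 10, 80)
p_true = np.array([200.0, 1.5])
v_obs = model(r, p_true)                            # noise-free synthetic data
p_fit = levenberg_marquardt(r, v_obs, np.array([100.0, 3.0]))
```

The residual and Jacobian evaluations over all data points are independent per point, which is exactly the structure a GPU exploits when the model evaluation is expensive.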
Vector operations for accelerating expensive Bayesian computations -- a tutorial guide
Many applications in Bayesian statistics are extremely computationally
intensive. However, they are often inherently parallel, making them prime
targets for modern massively parallel processors. Multi-core and distributed
computing is widely applied in the Bayesian community; however, very little
attention has been given to fine-grained parallelisation using the single
instruction, multiple data (SIMD) operations that are available on most modern
commodity CPUs and are the basis of GPGPU computing. In this work, we
practically demonstrate, using standard programming libraries, the utility of
the SIMD approach for several topical Bayesian applications. We show that SIMD
can substantially improve floating-point arithmetic performance in serial
algorithms. Importantly, these improvements are multiplicative with any gains
achieved through multi-core processing. We illustrate the potential of SIMD
for accelerating Bayesian computations and provide the reader with techniques
for exploiting modern massively parallel processing environments using
standard tools.
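A minimal illustration of the idea, under the assumption that the hot loop is a Gaussian log-likelihood: writing it as whole-array operations lets the library dispatch to vectorised (SIMD) kernels, whereas the element-by-element loop cannot be.

```python
import math
import numpy as np

# scalar loop: one multiply-add at a time, no data parallelism
def loglik_scalar(x, mu, sigma):
    c = -0.5 * math.log(2 * math.pi * sigma ** 2)
    return sum(c - (xi - mu) ** 2 / (2 * sigma ** 2) for xi in x)

# vectorised form: NumPy evaluates the same expression over the whole
# array, lowering to SIMD instructions where the hardware supports them
def loglik_vector(x, mu, sigma):
    x = np.asarray(x, dtype=float)
    return float(np.sum(-0.5 * np.log(2 * np.pi * sigma ** 2)
                        - (x - mu) ** 2 / (2 * sigma ** 2)))
```

Both functions compute the same quantity, so the rewrite is a pure performance transformation, and it composes with multi-core parallelism over chains or datasets.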
On-the-fly Vertex Reuse for Massively-Parallel Software Geometry Processing
Compute-mode rendering is becoming more and more attractive for non-standard
rendering applications, due to the high flexibility of compute-mode execution.
These newly designed pipelines often include streaming vertex and geometry
processing stages. In typical triangle meshes, the same transformed vertex is
on average required six times during rendering. To avoid redundant computation,
a post-transform cache is traditionally suggested to enable reuse of vertex
processing results. However, traditional caching neither scales well as the
hardware becomes more parallel, nor can it be efficiently implemented in a
software design. We investigate alternative strategies for reusing vertex
shading results on-the-fly for massively parallel software geometry processing.
Forming static and dynamic batching on the data input stream, we analyze the
effectiveness of identifying potential local reuse based on sorting, hashing,
and efficient intra-thread-group communication. Altogether, we present four
vertex reuse strategies, tailored to modern parallel architectures. Our
simulations showcase that our batch-based strategies significantly outperform
parallel caches in terms of reuse. On actual GPU hardware, our evaluation shows
that our strategies not only lead to good reuse of processing results, but also
boost performance compared to naively ignoring reuse in a variety of practical
applications.
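The effect of batch-based reuse is easy to quantify offline: within each batch of the index stream, every unique vertex needs shading only once. A small sketch of that accounting (the index buffer and batch sizes are made-up examples, not the paper's benchmarks):

```python
def shading_invocations(index_buffer, batch_size):
    """Count vertex-shader invocations when reuse is exploited only
    inside fixed-size batches of the index stream (static batching)."""
    total = 0
    for i in range(0, len(index_buffer), batch_size):
        batch = index_buffer[i:i + batch_size]
        total += len(set(batch))        # one invocation per unique vertex in batch
    return total

# strip-like mesh: 4 triangles, 12 indices, 6 unique vertices
indices = [0, 1, 2,  1, 2, 3,  2, 3, 4,  3, 4, 5]
```

Larger batches expose more duplicates to the deduplication step, which is why the batch-based strategies approach the ideal of one invocation per unique vertex while per-triangle processing hits the worst case of three per triangle.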
SketchyGAN: Towards Diverse and Realistic Sketch to Image Synthesis
Synthesizing realistic images from human drawn sketches is a challenging
problem in computer graphics and vision. Existing approaches either need exact
edge maps, or rely on retrieval of existing photographs. In this work, we
propose a novel Generative Adversarial Network (GAN) approach that synthesizes
plausible images from 50 categories including motorcycles, horses and couches.
We demonstrate a data augmentation technique for sketches which is fully
automatic, and we show that the augmented data is helpful to our task. We
introduce a new network building block suitable for both the generator and
discriminator which improves the information flow by injecting the input image
at multiple scales. Compared to state-of-the-art image translation methods, our
approach generates more realistic images and achieves significantly higher
Inception Scores.
Comment: Accepted to CVPR 2018
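The "injecting the input image at multiple scales" idea can be sketched independently of any deep-learning framework: resize the conditioning sketch to each feature map's resolution and concatenate it along the channel axis. This toy NumPy version (shapes and the average-pooling resize are illustrative assumptions, not the paper's learned block):

```python
import numpy as np

def downsample(img, factor):
    # naive average pooling; stands in for a learned or bilinear resize
    h, w = img.shape[0] // factor, img.shape[1] // factor
    return img[:h * factor, :w * factor].reshape(h, factor, w, factor).mean(axis=(1, 3))

def inject_multiscale(sketch, feature_maps):
    """Concatenate a resized copy of the input sketch onto each feature map,
    so every scale of the network sees the conditioning image directly."""
    out = []
    for feats in feature_maps:                     # feats: (H, W, C)
        factor = sketch.shape[0] // feats.shape[0]
        s = downsample(sketch, factor)[..., None]  # (H, W, 1) extra channel
        out.append(np.concatenate([feats, s], axis=-1))
    return out

sketch = np.linspace(0, 1, 64 * 64).reshape(64, 64)
feats = [np.zeros((64, 64, 8)), np.zeros((32, 32, 16)), np.zeros((16, 16, 32))]
injected = inject_multiscale(sketch, feats)
```

The point of the construction is information flow: deeper layers do not have to preserve the sketch through many convolutions, because each scale receives it afresh.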
Convex Cauchy Schwarz Independent Component Analysis for Blind Source Separation
We present a new high performance Convex Cauchy Schwarz Divergence (CCS DIV)
measure for Independent Component Analysis (ICA) and Blind Source Separation
(BSS). The CCS DIV measure is developed by integrating convex functions into
the Cauchy Schwarz inequality. By including a convexity quality parameter, the
measure has a broad control range of its convexity curvature. With this
measure, a new CCS ICA algorithm is structured, and a non-parametric form is
developed incorporating a Parzen-window-based distribution estimate.
Furthermore, pairwise iterative schemes are employed to tackle the
high-dimensional problem in BSS. We present two pairwise non-parametric ICA
schemes, one based on gradient descent and the other on the Jacobi iterative
method. Several case studies are carried out on noise-free and noisy mixtures
of speech and music signals. Finally, the superiority of the proposed CCS ICA
algorithm is demonstrated through metric comparisons with FastICA, RobustICA,
convex ICA (C ICA), and other leading algorithms.
Comment: 13 pages
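The pairwise scheme reduces BSS to a sequence of two-channel problems: whiten the pair, then search over plane rotations for the angle that optimises the chosen contrast. The sketch below uses a simple kurtosis contrast in place of the CCS divergence (an assumption for brevity; the sources and mixing matrix are synthetic):

```python
import numpy as np

def whiten(x):
    x = x - x.mean(axis=1, keepdims=True)
    cov = x @ x.T / x.shape[1]
    vals, vecs = np.linalg.eigh(cov)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T @ x

def separate_pair(mix, n_angles=360):
    """Pairwise ICA: whiten, then scan plane rotations for the angle that
    maximises non-Gaussianity (|excess kurtosis|) of both outputs."""
    z = whiten(mix)
    best, best_score = None, -np.inf
    for theta in np.linspace(0, np.pi / 2, n_angles):
        c, s = np.cos(theta), np.sin(theta)
        y = np.array([[c, s], [-s, c]]) @ z
        k = np.mean(y ** 4, axis=1) - 3.0          # excess kurtosis per channel
        if np.sum(np.abs(k)) > best_score:
            best_score, best = np.sum(np.abs(k)), y
    return best

rng = np.random.default_rng(1)
s1 = np.sign(rng.standard_normal(4000))            # sub-Gaussian source
s2 = rng.laplace(size=4000)                        # super-Gaussian source
S = np.vstack([s1, s2])
A = np.array([[1.0, 0.6], [0.4, 1.0]])             # toy mixing matrix
Y = separate_pair(A @ S)
```

A gradient or Jacobi update would replace the brute-force angle scan, but the structure is the same: the contrast function is the only piece that changes when swapping in the CCS DIV measure.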
Massively Parallel Computation Using Graphics Processors with Application to Optimal Experimentation in Dynamic Control
The rapid increase in the performance of graphics hardware, coupled with recent improvements in its programmability, has led to its adoption in many non-graphics applications, including a wide variety of scientific computing fields. At the same time, a number of important dynamic optimal policy problems in economics are starved for the computing power needed to overcome the dual curses of complexity and dimensionality. We investigate whether computational economics may benefit from these new tools through a case study of an imperfect-information dynamic programming problem with a learning-experimentation trade-off, that is, a choice between controlling the policy target and learning the system parameters. Specifically, we use a model of active learning and control of a linear autoregression with unknown slope, which has appeared in a variety of macroeconomic policy and other contexts. The endogeneity of posterior beliefs makes the problem difficult: the value function need not be convex, and the policy function need not be continuous. This complication makes the problem a suitable target for massively parallel computation using graphics processors. Our findings are cautiously optimistic: the new tools let us easily achieve a factor-of-15 performance gain relative to an implementation targeting single-core processors, and thus establish a better reference point on the computational speed versus coding complexity trade-off frontier. While further gains and wider applicability may lie behind a steep learning curve, we argue that the future of many computations belongs to parallel algorithms in any case.
Keywords: Graphics Processing Units, CUDA programming, Dynamic programming, Learning, Experimentation
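Dynamic programming is a natural GPU target because each Bellman update evaluates every state independently. As a hedged illustration (a generic consumption-savings toy problem, not the paper's learning-and-control model; grid, discount factor, and return rate are made up), here the per-state maximisation is expressed as one array operation:

```python
import numpy as np

# toy problem: V(k) = max_{k'} log(1.04*k - k') + beta * V(k')
beta = 0.95
grid = np.linspace(0.1, 10.0, 200)                  # capital grid (states)
# consumption implied by each (state k, choice k') pair; -inf marks infeasible
C = grid[:, None] * 1.04 - grid[None, :]
U = np.where(C > 0, np.log(np.clip(C, 1e-12, None)), -np.inf)

V = np.zeros(len(grid))
for _ in range(1000):
    # one Bellman sweep: the max over choices runs for all states at once,
    # which is exactly the per-thread work in a GPU implementation
    V_new = np.max(U + beta * V[None, :], axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new
```

The contraction property of the Bellman operator guarantees convergence regardless of the starting guess, which is what makes the brute-force parallel sweep safe even when, as in the paper's problem, the value function is not convex.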
Computational Strategies for Scalable Genomics Analysis.
The revolution in next-generation DNA sequencing technologies is leading to explosive data growth in genomics, posing a significant challenge to the computing infrastructure and software algorithms for genomics analysis. Various big data technologies have been explored to scale up or scale out current bioinformatics solutions to mine big genomics data. In this review, we survey some of these exciting developments in the application of parallel and distributed computing and special hardware to genomics. We comment on the pros and cons of each strategy in the context of ease of development, robustness, scalability, and efficiency. Although this review is written for an audience from the genomics and bioinformatics fields, it may also be informative for computer scientists with an interest in genomics applications.
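Much of the scale-out described above follows a map-reduce shape: independent per-read work, then an associative merge. A toy sketch with k-mer counting (the reads, k=4, and the sequential driver are illustrative; the same structure drops into multiprocessing, Spark, or a scatter/gather cluster job):

```python
from collections import Counter
from functools import reduce

def kmer_counts(read, k=4):
    # per-read counting: the embarrassingly parallel "map" step
    return Counter(read[i:i + k] for i in range(len(read) - k + 1))

def merge(a, b):
    # associative, commutative "reduce" step: partial counts combine freely
    a.update(b)
    return a

reads = ["ACGTACGT", "CGTACGTA", "TTTTACGT"]
total = reduce(merge, (kmer_counts(r) for r in reads), Counter())
```

Because `merge` is associative, partial results can be combined in any order, which is what allows the reduce step to be distributed across nodes without coordination.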
Broad Neural Network for Change Detection in Aerial Images
A change detection system takes as input two images of a region captured at
two different times, and predicts which pixels in the region have undergone
change over the time period. Since pixel-based analysis can be erroneous due to
noise, illumination difference and other factors, contextual information is
usually used to determine the class of a pixel (changed or not). This
contextual information is taken into account by considering a pixel of the
difference image along with its neighborhood. With the help of ground truth
information, labeled patterns are generated. Finally, a Broad Learning
classifier is used to predict the class of each pixel. Results show that Broad
Learning classifies the data set with a significantly higher F-score than the
Multilayer Perceptron; performance comparisons have also been made with other
popular classifiers, namely the Multilayer Perceptron and Random Forest.
Comment: IEMGraph (International Conference on Emerging Technology in
Modelling and Graphics) 2018, 6-7 September 2018, Kolkata, India
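The patch-based contextual representation described above is straightforward to sketch: each pixel of the difference image is described by its flattened neighbourhood, and any classifier can then score those feature vectors. In this toy version the difference image, the 3x3 patch size, and the mean-threshold stand-in classifier are all illustrative assumptions, not the paper's Broad Learning setup:

```python
import numpy as np

def patch_features(diff, size=3):
    """Flatten each pixel's size x size neighbourhood of the difference
    image into a feature vector (the contextual information above)."""
    pad = size // 2
    padded = np.pad(diff, pad, mode="edge")
    h, w = diff.shape
    feats = np.empty((h * w, size * size))
    for i in range(h):
        for j in range(w):
            feats[i * w + j] = padded[i:i + size, j:j + size].ravel()
    return feats

def f_score(pred, truth):
    tp = np.sum((pred == 1) & (truth == 1))
    fp = np.sum((pred == 1) & (truth == 0))
    fn = np.sum((pred == 0) & (truth == 1))
    p, r = tp / max(tp + fp, 1), tp / max(tp + fn, 1)
    return 2 * p * r / max(p + r, 1e-12)

diff = np.zeros((8, 8)); diff[2:5, 2:5] = 1.0      # toy 3x3 change blob
truth = (diff > 0).astype(int).ravel()
X = patch_features(diff)
pred = (X.mean(axis=1) > 0.5).astype(int)          # stand-in classifier
```

Replacing the threshold with a trained classifier changes only the last line; the feature extraction, which carries the contextual information, stays the same.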