1,260 research outputs found

Accelerating Deep Learning with Shrinkage and Recall

Author: Ding Chris
Vishnu Abhinav
Zheng Shuai
Publication date: 19/09/2016

Deep Learning is a very powerful machine learning model. It trains a large number of parameters across multiple layers and is very slow when both the data set and the architecture are large. Inspired by the shrinking technique used to accelerate the Support Vector Machine (SVM) algorithm and the screening technique used in LASSO, we propose a shrinking Deep Learning with recall (sDLr) approach to speed up deep learning computation. We evaluate sDLr using a Deep Neural Network (DNN), a Deep Belief Network (DBN) and a Convolutional Neural Network (CNN) on 4 data sets. Results show that the speedup using sDLr can reach more than 2.0 while still giving competitive classification performance. Comment: The 22nd IEEE International Conference on Parallel and Distributed Systems (ICPADS 2016)

arXiv.org e-Print Archive

Acceptance rate and reasons for rejection of manuscripts submitted to Veterinary Radiology & Ultrasound during 2012

Author: Lamb C R
Mai W
Publication date: 01/01/2011

Session: P2P Computing. P2P live media streaming systems have proliferated and become indispensable vehicles for Internet-based entertainment applications. However, it is also well known that the scalability of such systems is limited by the lack of proper incentive mechanisms. Specifically, it is notoriously hard to efficiently allocate upload bandwidth at each peer so as to maximize overall system performance. In this paper, we propose a new auction-based mechanism for optimizing the allocation of upload bandwidth at each peer. One of the distinctive features of our approach is that peers use real 'goods' (i.e., their own bandwidth resources) for payments, instead of relying on some fictitious currency. Essentially, peers use a barter mechanism in the payment step of the auction. Simulation results indicate that our proposed auction approach consistently outperforms existing practical approaches (e.g., tit-for-tat) in terms of average incoming stream rate, average playback delay, and control packets ratio. © 2011 IEEE. The IEEE 17th International Conference on Parallel and Distributed Systems (ICPADS 2011), Tainan, Taiwan, 7-9 December 2011. In Proceedings of the 17th ICPADS, 2011, p. 573-58

HKU Scholars Hub

Design and Implementation of MapReduce using the PGAS Programming Model with UPC

Author: Doallo Ramón
López Taboada Guillermo
Teijeiro Barjas Carlos
Touriño Juan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 03/01/2012

This is a post-peer-review, pre-copyedit version of an article published in International Conference on Parallel and Distributed Systems. Proceedings. The final authenticated version is available online at: http://dx.doi.org/10.1109/ICPADS.2011.162 [Abstract] MapReduce is a powerful tool for processing large data sets used by many applications running in distributed environments. However, despite the increasing number of computationally intensive problems that require low-latency communications, the adoption of MapReduce in High Performance Computing (HPC) is still emerging. Here, languages based on the Partitioned Global Address Space (PGAS) programming model have been shown to be a good choice for implementing parallel applications, taking advantage of the increasing number of cores per node and the programmability benefits of their global memory view, such as transparent access to remote data. This paper presents the first PGAS-based MapReduce implementation that uses the Unified Parallel C (UPC) language, which (1) obtains programmability benefits in parallel programming, (2) offers advanced configuration options to define a customized load distribution for different codes, and (3) overcomes performance penalties and bottlenecks that have traditionally prevented the deployment of MapReduce applications in HPC. The performance evaluation of representative applications on shared and distributed memory environments assesses the scalability of the presented MapReduce framework, confirming its suitability. Ministerio de Ciencia e Innovación; TIN2010-1673

Non-contiguous processor allocation strategy for 2D mesh connected multicomputers based on sub-meshes available for allocation

Author: Abaneh I.
Bani-Mohammad S.
Mackenzie L.M.
Ould-Khaoua M.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

Contiguous allocation of parallel jobs usually suffers from the degrading effects of fragmentation, as it requires that the allocated processors be contiguous and have the same topology as the network connecting them. In non-contiguous allocation, a job can execute on multiple disjoint smaller sub-meshes rather than always waiting until a single sub-mesh of the requested size is available. Lifting the contiguity condition is expected to reduce processor fragmentation and increase processor utilization. However, the communication overhead increases because the distances traversed by messages can be longer. The extra communication overhead depends on how the allocation request is partitioned and allocated to free sub-meshes. In this paper, a new non-contiguous processor allocation strategy, referred to as Greedy-Available-Busy-List, is suggested for the 2D mesh network and is compared using simulation against well-known non-contiguous and contiguous allocation strategies. To show the performance improvement achieved by the proposed strategy, we conducted simulation runs under the assumption of wormhole routing and an all-to-all communication pattern. The results show that the proposed strategy can reduce the communication overhead and substantially improve performance in terms of job turnaround times and finish times.

Enlighten

Platform Dependent Verification: On Engineering Verification Tools for 21st Century

Author: A. Aggarwal
A. B. Kahn
Alfons Laarman
Armin Biere
B. R. Haverkort
Boudewijn R. Haverkort
Brad Bingham
Cornelia P. Inggs
D. Bosnacki
David L. Dill
Doron Peled
E. Allen Emerson
E. M. Clarke
E.M. Clarke
Flavio Lerda
G. Behrmann
G. Ciardo
G. Jayachandran
Gerard J. Holzmann
Gianfranco Ciardo
Giuseppe Della Penna
H. Garavel
I. Černá
J. Barnat
J. R. Burch
Jaco Geldenhuys
Jiří Barnat
K. Verstoep
Keijo Heljanko
L. Brim
Luboš Brim
M.Y. Vardi
Michael Jones
Moritz Hammer
Naga K. Govindaraju
P. Harish
Peter Lamborn
R. Korf
R. Pelánek
Rahul Kumar
Rong Zhou
S. Allmaier
S. Caselli
Sami Evangelista
Shahid Jabbar
Stefan Edelkamp
T. von Eicken
Tonglaga Bao
U. Stern
W. Knottenbelt
Yi-Jen Chiang
Publication venue: 'Open Publishing Association'
Publication date: 01/10/2011

The paper gives an overview of recent developments in platform-dependent explicit-state LTL model checking. Comment: In Proceedings PDMC 2011, arXiv:1111.006

arXiv.org e-Print Archive

Directory of Open Access Journals