Scalable Distributed DNN Training using TensorFlow and CUDA-Aware MPI: Characterization, Designs, and Performance Evaluation
TensorFlow has been the most widely adopted Machine/Deep Learning framework.
However, little exists in the literature that provides a thorough understanding
of the capabilities which TensorFlow offers for the distributed training of
large ML/DL models that need computation and communication at scale. Most
commonly used distributed training approaches for TF can be categorized as
follows: 1) Google Remote Procedure Call (gRPC), 2) gRPC+X: X=(InfiniBand
Verbs, Message Passing Interface, and GPUDirect RDMA), and 3) No-gRPC: Baidu
Allreduce with MPI, Horovod with MPI, and Horovod with NVIDIA NCCL. In this
paper, we provide an in-depth performance characterization and analysis of
these distributed training approaches on various GPU clusters including the Piz
Daint system (No. 6 on the Top500 list). We perform experiments to gain novel insights along
the following vectors: 1) Application-level scalability of DNN training, 2)
Effect of Batch Size on scaling efficiency, 3) Impact of the MPI library used
for no-gRPC approaches, and 4) Type and size of DNN architectures. Based on
these experiments, we present two key insights: 1) Overall, No-gRPC designs
achieve better performance compared to gRPC-based approaches for most
configurations, and 2) The performance of No-gRPC is heavily influenced by the
gradient aggregation using Allreduce. Finally, we propose a truly CUDA-Aware
MPI Allreduce design that exploits CUDA kernels and pointer caching to perform
large reductions efficiently. Our proposed designs offer 5-17X better
performance than NCCL2 for small and medium messages, and reduce latency by
29% for large messages. The proposed optimizations help Horovod-MPI to achieve
approximately 90% scaling efficiency for ResNet-50 training on 64 GPUs.
Further, Horovod-MPI achieves 1.8X and 3.2X higher throughput than the native
gRPC method for ResNet-50 and MobileNet, respectively, on the Piz Daint
cluster. Comment: 10 pages, 9 figures, submitted to IEEE IPDPS 2019 for peer-revie
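The gradient aggregation step named in this abstract can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not the paper's CUDA-Aware MPI design: it simulates ring allreduce, one common Allreduce algorithm, over plain Python lists standing in for per-worker gradient buffers. The function name `ring_allreduce` and the chunking scheme are hypothetical choices made for this example.

```python
def ring_allreduce(buffers):
    """Simulate a ring allreduce (sum) over equal-length per-worker lists.

    Illustrative only: real implementations move chunks over a network or
    GPU interconnect; here each "send" is a list copy between workers.
    """
    n = len(buffers)
    length = len(buffers[0])
    data = [list(b) for b in buffers]
    # Chunk c covers indices bounds[c] .. bounds[c + 1] - 1.
    bounds = [c * length // n for c in range(n + 1)]

    def exchange(step, offset, combine):
        # Snapshot every outgoing chunk first so all sends in one step
        # behave as if they happened simultaneously.
        outgoing = []
        for rank in range(n):
            c = (rank + offset - step) % n
            outgoing.append((c, data[rank][bounds[c]:bounds[c + 1]]))
        for rank, (c, payload) in enumerate(outgoing):
            dst = (rank + 1) % n  # each worker sends to its right neighbour
            for i, value in zip(range(bounds[c], bounds[c + 1]), payload):
                data[dst][i] = combine(data[dst][i], value)

    # Phase 1, reduce-scatter: after n - 1 steps, worker r holds the fully
    # summed chunk (r + 1) % n.
    for step in range(n - 1):
        exchange(step, 0, lambda mine, theirs: mine + theirs)
    # Phase 2, allgather: circulate the finished chunks around the ring.
    for step in range(n - 1):
        exchange(step, 1, lambda mine, theirs: theirs)
    return data


grads = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
print(ring_allreduce(grads))  # every worker ends with [111, 222, 333, 444]
```

In this scheme each worker sends roughly 2(n - 1)/n times the buffer size in total, which is why the per-message cost of the reduction matters for scaling efficiency.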
Intention to purchase counterfeit luxury products: a comparative study between Pakistani and UK consumers
This study aims to provide a comparison between Pakistani and UK consumers' purchase intentions towards counterfeit luxury products by focusing on the relationships between the following factors: perceived quality, status consumption, low price and ethics. A sample of 251 university students from Pakistan (137) and the UK (114) was used. Data were analyzed using AMOS and SPSS. Results show that Pakistani consumers are satisfied with the perceived quality of counterfeit products while UK consumers are not. The status associated with counterfeit products and the prices of these products were found to be important factors for both samples. Pakistani consumers show less ethical behaviour compared to UK consumers. Considering a single product category, i.e. luxury products, is a limitation of the study and selecting a single product category may possibly restrict the potential generalizability
Blood vessel segmentation in retinal images using echo state networks
We propose a novel supervised technique for blood vessel segmentation in retinal images based on echo state networks. Retinal vessel segmentation is widely used for numerous clinical purposes such as the detection of various cardiovascular and ophthalmologic diseases. A large number of retinal vessel segmentation methods have been reported, yet achieving accurate and efficient vessel segmentation remains a challenge. Recently, reservoir computing has drawn much attention as a new computing framework based on recurrent neural networks. The Echo State Network (ESN), which uses neural nodes as the computing elements of the recurrent network, represents one of the efficient learning models of reservoir computing. This paper investigates the viability of echo state networks for blood vessel segmentation in retinal images. Initial image features are projected onto the echo state network reservoir, which maps them, through the activations of its internal nodes, into a new set of features. These features are then classified as vessel or non-vessel by the echo state network readout, which in the proposed approach is a multi-layer perceptron. Experimental results on the publicly available DRIVE dataset, commonly used in retinal vessel segmentation research, demonstrate that the proposed method achieves promising performance in terms of both segmentation accuracy and efficiency
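The reservoir mapping described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature vectors, reservoir size, weight ranges, and the names `make_reservoir` and `reservoir_states` are all hypothetical, and the multi-layer perceptron readout is omitted. The recurrent weights are scaled so that the largest absolute row sum is below 1, which bounds the spectral radius below 1.

```python
import math
import random


def make_reservoir(n_inputs, n_nodes, max_row_sum=0.9, seed=0):
    """Create fixed random input and recurrent weights (never trained)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_nodes)]
    w = [[rng.uniform(-1, 1) for _ in range(n_nodes)] for _ in range(n_nodes)]
    # Rescale so the largest absolute row sum equals max_row_sum; the
    # spectral radius cannot exceed this induced norm.
    largest = max(sum(abs(v) for v in row) for row in w)
    w = [[v * max_row_sum / largest for v in row] for row in w]
    return w_in, w


def reservoir_states(inputs, w_in, w):
    """Map a sequence of feature vectors to a sequence of reservoir states."""
    state = [0.0] * len(w)
    states = []
    for u in inputs:
        # x(t) = tanh(W_in u(t) + W x(t - 1))
        state = [
            math.tanh(
                sum(a * b for a, b in zip(w_in[i], u))
                + sum(a * b for a, b in zip(w[i], state))
            )
            for i in range(len(w))
        ]
        states.append(state)
    return states


# Hypothetical per-pixel feature vectors (3 features each).
w_in, w = make_reservoir(n_inputs=3, n_nodes=5)
features = reservoir_states([[0.1, 0.2, 0.3], [0.0, -0.5, 0.4]], w_in, w)
print(len(features), len(features[0]))  # 2 states, 5 activations each
```

Each returned state would then be passed to a trained readout (a multi-layer perceptron in the abstract) to label the corresponding pixel as vessel or non-vessel.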
Minimally invasive mitral valve surgery: why do you take the risks?
In recent years, minimally invasive mitral valve surgery (MIMVS) has become the preferred method of mitral valve repair and replacement in many institutions worldwide, with excellent results, even though there is no clear definition of minimally invasive surgery and there are not sufficient studies of the risks of MIMVS compared with conventional mitral valve surgery. More studies are needed so that the choice between conventional and minimally invasive mitral valve surgery rests on evidence rather than personal preference. The patients' demographic profiles, intraoperative data and postoperative outcomes for patients undergoing minimally invasive mitral valve surgery were retrospectively collected from our database from May 2011 to April 2014. We present early and mid-term outcomes of patients undergoing minimally invasive mitral valve surgery in our institution. Seventy consecutive patients (45 male and 25 female), age 35±12 years, underwent MIMVS. Mean preoperative New York Heart Association functional class was 2.6±0.7. Mean ejection fraction was 50±8. Cardiopulmonary bypass was instituted through femoral cannulation (28 of 70, 40%) or direct aortic cannulation (42 of 70, 60%). An aortic cross-clamp was used in 66 of 70 patients (94.2%) and not used in 4 of 70 (5.7%). Mitral valve repair was performed in 52 of 70 (74.2%) and mitral valve replacement in 18 of 70 (25.7%). Concomitant procedures included AF ablation (24 of 70, 34.2%) and tricuspid valve repair (33 of 70, 47.1%). No mortality was recorded; residual mitral regurgitation was found in 6 of 70 patients (8.5%) during 1-year follow-up. Cardiopulmonary bypass and "skin to skin" surgery times were 95±35 and 250±74 min, respectively. Four patients (5.7%) underwent reexploration for bleeding, and 57 of 70 (81.4%) did not receive any blood transfusions. Six patients (8.5%) sustained face oedema. Mean length of hospital stay was 7±3.8 days. Eighteen patients (25.7%) expressed no interest in the cosmetic advantage over conventional surgery.
Minimally invasive mitral valve surgery is an excellent alternative to conventional mitral valve surgery in most cases; however, compared with conventional mitral surgery, it involves long bypass time, long cross-clamp time, difficult reexploration for bleeding and multiple body incisions
On the prevalence of hierarchies in social networks
In this paper, we introduce two novel evolutionary processes for hierarchical networks, referred to as dominance- and prestige-based evolution models, i.e., DBEM and PBEM, respectively. Our models are deterministic in nature, which allows for closed-form derivation of equilibrium points for this type of network in the special case of complete networks. After deriving these equilibrium points, we are somewhat surprised to recover the exponential and power-law strength distributions as the shared property of the resulting hierarchical networks. Additionally, we compute the network properties, geodesic distance distribution and closeness centrality, for each model in closed form. Interestingly, these results demonstrate very different roles of hubs for each model, shedding light on the evolutionary advantages of hierarchies in social networks: in short, hierarchies can lead to efficient sharing of resources and robustness to random failures. For the general case of any hierarchical network, we compare the estimations of tie intensities and node strengths using the proposed models to open-source real-world data. The predictions are compared statistically with the original data using the Kolmogorov–Smirnov test
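The statistical comparison mentioned at the end of this abstract can be illustrated with the two-sample Kolmogorov–Smirnov statistic, the largest gap between two empirical cumulative distribution functions. The sketch below is a generic pure-Python version with a hypothetical function name `ks_statistic`; it does not compute a p-value and is not taken from the paper, which does not say which variant of the test was used.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D = max |F_a(x) - F_b(x)|."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    # Walk both sorted samples in order; after each distinct value x,
    # i / len(a) and j / len(b) are the two empirical CDFs at x.
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


# Hypothetical predicted vs. observed node strengths.
predicted = [1, 2, 3, 4]
observed = [3, 4, 5, 6]
print(ks_statistic(predicted, observed))  # 0.5
```

A small D means the predicted and observed distributions are close; turning D into a significance level additionally requires the sample sizes and the Kolmogorov distribution.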