Scalable and Efficient Training of Large Convolutional Neural Networks with Differential Privacy
Large convolutional neural networks (CNN) can be difficult to train in the
differentially private (DP) regime, since the optimization algorithms require a
computationally expensive operation, known as the per-sample gradient clipping.
We propose an efficient and scalable implementation of this clipping on
convolutional layers, termed mixed ghost clipping, that significantly
eases private training in terms of both time and space complexity,
without affecting the accuracy. The improvement in efficiency is rigorously
studied through the first complexity analysis for the mixed ghost clipping and
existing DP training algorithms.
Extensive experiments on vision classification tasks, with large ResNet, VGG,
and Vision Transformer models, demonstrate that DP training with mixed ghost
clipping adds only modest memory overhead and slowdown relative to standard
non-private training. Specifically, when training VGG19 on CIFAR10, mixed
ghost clipping is faster than the state-of-the-art Opacus library and supports a
larger maximum batch size. To emphasize the significance of
efficient DP training on convolutional layers, we achieve 96.7\% accuracy on
CIFAR10 and 83.0\% on CIFAR100 using BEiT, while the previous
best results are 94.8\% and 67.4\%, respectively. We open-source a privacy
engine (\url{https://github.com/JialinMao/private_CNN}) that implements DP
training of CNNs in a few lines of code.
Comment: Accepted to NeurIPS 202
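Per-sample gradient clipping, the expensive operation named above, can be pictured with a minimal sketch of the standard DP-SGD aggregation step (naive clipping over precomputed per-example gradients, not the mixed ghost clipping technique itself; the clipping norm and noise multiplier here are illustrative, not values from the paper):

```python
import math
import random

def dp_aggregate(per_sample_grads, clip_norm=1.0, noise_mult=1.1, rng=random):
    """Clip each example's gradient to L2 norm <= clip_norm, sum the
    clipped gradients, add Gaussian noise, and average over the batch.
    Materializing these per-example gradients is what makes DP training
    costly in time and memory."""
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    summed = [0.0] * dim
    for grad in per_sample_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        scale = min(1.0, clip_norm / max(norm, 1e-12))  # clip, never amplify
        for j in range(dim):
            summed[j] += grad[j] * scale
    # Gaussian noise with sigma = noise_mult * clip_norm, then batch average
    return [(s + rng.gauss(0.0, noise_mult * clip_norm)) / n for s in summed]
```

Ghost-clipping-style methods compute the same clipped sum without ever materializing the full per-example gradients, which is where the time and space savings come from.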
Synthetic Datasets for Autonomous Driving: A Survey
Autonomous driving techniques have been flourishing in recent years while
thirsting for huge amounts of high-quality data. However, it is difficult for
real-world datasets to keep up with the pace of changing requirements due to
their expensive and time-consuming experimental and labeling costs. Therefore,
more and more researchers are turning to synthetic datasets to easily generate
rich and changeable data as an effective complement to the real world and to
improve the performance of algorithms. In this paper, we summarize the
evolution of synthetic dataset generation methods and review the work to date
on synthetic datasets for single- and multi-task categories in autonomous
driving research. We also discuss the roles that synthetic datasets play in the
evaluation and gap testing of autonomous-driving-related algorithms, and their
positive effects on algorithm testing, especially regarding trustworthiness
and safety. Finally,
we discuss general trends and possible development directions. To the best of
our knowledge, this is the first survey focusing on the application of
synthetic datasets in autonomous driving. This survey also raises awareness of
the problems of real-world deployment of autonomous driving technology and
provides researchers with a possible solution.
Comment: 19 pages, 5 figures
Comparison of Efficacy of Deep Brain Stimulation of Different Targets in Parkinson's Disease: A Network Meta-Analysis
Background: Deep brain stimulation (DBS) is considered an effective treatment option for Parkinson's disease (PD). Several studies have demonstrated the efficacy of neurostimulation in patients with advanced PD. The subthalamic nucleus (STN), the internal globus pallidus (GPi), the ventral intermediate nucleus (Vim), and the pedunculopontine nucleus (PPN) are reportedly effective DBS targets for the control of Parkinsonian tremor. However, there is no consensus on the ideal target for DBS in patients with PD, and only a few studies have directly compared the efficacy of DBS of the Vim, STN, and GPi. Therefore, we searched PubMed, Embase, the Cochrane Library, and other databases for observational studies, extracted data on Unified Parkinson's Disease Rating Scale (UPDRS) scores, and performed a comprehensive network meta-analysis comparing the efficacy of DBS at the different targets.
Methods: A forest plot was used to examine the overall efficacy of DBS; cumulative probability values were used to rank the strategies under examination. A node-splitting model was employed to assess the consistency of reported outcomes. A total of 16 studies reporting UPDRS improvement were included in the network meta-analysis.
Results: By comparing the overall efficacy associated with each target, we confirmed the efficacy of DBS therapy in PD. Our findings revealed similar efficacy of DBS targeted at the GPi and STN in the on-medication phase [GPi: −3.9 (95% CI −7.0 to −0.96); STN: −3.1 (−5.9 to −0.38)]; however, in the off-medication phase, Vim-targeted DBS was associated with better improvement in UPDRS scores and could be a choice of DBS target for tremor-dominant Parkinsonism.
Conclusions: Our findings will help improve the clinical application of DBS.
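As a worked illustration of how point estimates with confidence intervals feed a meta-analysis, a fixed-effect inverse-variance pool of two reported estimates can be computed by recovering each standard error from its 95% CI (a toy sketch only; the study's actual model is a network meta-analysis over 16 studies, not a two-estimate pool):

```python
def inverse_variance_pool(estimates):
    """Fixed-effect inverse-variance pooling.
    estimates: list of (point_estimate, ci_lower, ci_upper) with 95% CIs.
    The standard error is recovered as (upper - lower) / (2 * 1.96)."""
    weights, weighted = [], []
    for est, lo, hi in estimates:
        se = (hi - lo) / (2 * 1.96)
        w = 1.0 / (se * se)          # precision = inverse variance
        weights.append(w)
        weighted.append(w * est)
    return sum(weighted) / sum(weights)

# On-medication UPDRS changes reported above: GPi and STN
pooled = inverse_variance_pool([(-3.9, -7.0, -0.96), (-3.1, -5.9, -0.38)])
```

The pooled value necessarily lands between the two inputs, pulled toward the estimate with the narrower (more precise) confidence interval.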
SuperScaler: Supporting Flexible DNN Parallelization via a Unified Abstraction
With the growing model size, deep neural networks (DNNs) are increasingly
trained over massive GPU accelerators, which demands a proper parallelization
plan that transforms a DNN model into fine-grained tasks and then schedules
them to GPUs for execution. Due to the large search space, the contemporary
parallelization plan generators often rely on empirical rules that couple
transformation and scheduling, and fall short in exploring more flexible
schedules that yield better memory usage and compute efficiency. This tension
can be exacerbated by the emerging models with increasing complexity in their
structure and model size. SuperScaler is a system that facilitates the design
and generation of highly flexible parallelization plans. It formulates the plan
design and generation into three sequential phases explicitly: model
transformation, space-time scheduling, and data dependency preserving. Such a
principled approach decouples multiple seemingly intertwined factors and
enables the composition of highly flexible parallelization plans. As a result,
SuperScaler can not only generate empirical parallelization plans, but also
construct new plans that achieve up to 3.5X speedup compared to
state-of-the-art solutions like DeepSpeed, Megatron and Alpa, for emerging DNN
models like Swin-Transformer and AlphaFold2, as well as well-optimized models
like GPT-3.
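The space-time scheduling phase can be pictured with a toy load balancer: given per-task compute costs, assign fine-grained tasks to devices so that no device is overloaded (a longest-processing-time sketch under assumed costs; SuperScaler's actual scheduler explores a far richer space of schedules and also handles data dependencies, which this sketch ignores):

```python
import heapq

def schedule_tasks(task_costs, num_devices):
    """Greedy longest-processing-time scheduling: repeatedly place the
    most expensive remaining task on the currently least-loaded device.
    task_costs: dict mapping task name -> estimated compute cost.
    Returns a dict mapping task name -> device index."""
    heap = [(0.0, dev) for dev in range(num_devices)]  # (load, device)
    heapq.heapify(heap)
    assignment = {}
    for task, cost in sorted(task_costs.items(), key=lambda kv: -kv[1]):
        load, dev = heapq.heappop(heap)     # least-loaded device
        assignment[task] = dev
        heapq.heappush(heap, (load + cost, dev))
    return assignment
```

Decoupling this placement step from model transformation and dependency preservation is the point of the three-phase formulation: each phase can be varied independently to compose new plans.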
Large expert-curated database for benchmarking document similarity detection in biomedical literature search
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
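One of the baselines evaluated, Okapi Best Matching 25, can be sketched as a small in-memory scorer over tokenized documents (a minimal sketch using the standard BM25 formula; the k1 and b values are common defaults, not parameters from the study):

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each document in docs against query_terms with Okapi BM25.
    docs: list of tokenized documents (lists of terms)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    doc_freq = Counter()
    for d in docs:
        doc_freq.update(set(d))          # count docs containing each term
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5)
                           / (doc_freq[term] + 0.5) + 1.0)
            # term-frequency saturation with document-length normalization
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avg_len))
        scores.append(score)
    return scores
```

The finding above, that BM25, TF-IDF and PubMed Related Articles retrieve distinct sets despite similar overall performance, is what motivates combining such scorers in a hybrid ranker.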
PAllidal versus SubThalamic deep brain Stimulation for Cervical Dystonia (PASTS-CD): study protocol for a multicentre randomised controlled trial
Introduction: Deep brain stimulation (DBS) has been validated as a safe and effective treatment for refractory cervical dystonia (CD). The globus pallidus internus (GPi) and the subthalamic nucleus (STN) are the two main stimulation targets. However, there has been no prospective study clarifying which target is the better DBS candidate for CD. The objective of this trial is to directly compare the efficacy and safety of GPi-DBS and STN-DBS, thereby informing the selection of the DBS target in clinical practice.
Methods and analysis: This multicentre, prospective, randomised, controlled study plans to enrol 98 refractory CD patients. Eligible CD patients will be randomly allocated to the GPi-DBS group or the STN-DBS group, with the DBS electrodes implanted into the posteroventral portion of the GPi or the dorsolateral portion of the STN, respectively. The primary outcome will be the improvement in symptom severity, measured by the changes in the Toronto Western Spasmodic Torticollis Rating Scale (TWSTRS) severity subscale and the Tsui scale at 3, 6 and 12 months after surgery. The secondary outcomes include improvement in the TWSTRS disability subscale, the TWSTRS pain subscale, quality of life, and mental and cognitive condition, as well as differences in stimulation parameters and adverse effects. In addition, this study intends to identify predictors of DBS efficacy for CD.
Ethics and dissemination: The trial has been approved by the Medical Ethics Committee of Chinese PLA General Hospital (S2022-613-01). The results of this study will be published in international peer-reviewed journals and shared at professional medical conferences.
Trial registration number: NCT05715138
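The 1:1 randomised allocation step described above can be illustrated with a generic permuted-block scheme (a sketch only, not the trial's actual randomisation procedure; the block size and seed are arbitrary choices for the example):

```python
import random

def permuted_block_allocation(n_patients, arms=("GPi-DBS", "STN-DBS"),
                              block_size=4, seed=2022):
    """Allocate patients 1:1 across arms using shuffled blocks, so the
    group sizes stay nearly balanced throughout enrolment."""
    assert block_size % len(arms) == 0, "block must divide evenly across arms"
    rng = random.Random(seed)
    schedule = []
    while len(schedule) < n_patients:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)               # each block holds equal arm counts
        schedule.extend(block)
    return schedule[:n_patients]
```

Because every complete block contains each arm equally often, the arm imbalance at any point in enrolment is bounded by half a block, which matters for interim analyses in a 98-patient trial.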