The Impact of Terrorism on Foreign Direct Investment: Which Sectors are More Vulnerable?
The impact of conflict and violence on foreign direct investment (FDI) has received little attention in the literature, and the few studies that exist report contradictory results. This paper studies the impact of transnational terrorism on FDI inflows by economic sector in developed countries. Results indicate a statistically significant negative correlation between terrorist events and total FDI inflows. Among a list of 12 broad industrial sectors, FDI inflows for manufacturing, trade and repair, and construction were found to have a statistically significant negative correlation with terrorist events.
Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask
Sparsity has become one of the promising methods to compress and accelerate
Deep Neural Networks (DNNs). Among different categories of sparsity, structured
sparsity has gained more attention due to its efficient execution on modern
accelerators. Particularly, N:M sparsity is attractive because there are
already hardware accelerator architectures that can leverage certain forms of
N:M structured sparsity to yield higher compute-efficiency. In this work, we
focus on N:M sparsity and extensively study and evaluate various training
recipes for N:M sparsity in terms of the trade-off between model accuracy and
compute cost (FLOPs). Building upon this study, we propose two new decay-based
pruning methods, namely "pruning mask decay" and "sparse structure decay". Our
evaluations indicate that these proposed methods consistently deliver
state-of-the-art (SOTA) model accuracy, comparable to unstructured sparsity, on
a Transformer-based model for a translation task. The increase in the accuracy
of the sparse model using the new training recipes comes at the cost of a
marginal increase in the total training compute (FLOPs).
Comment: 11 pages, 2 figures, and 9 tables. Published at the ICML Workshop on Sparsity in Neural Networks: Advancing Understanding and Practice, 2022. First two authors contributed equally.
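To make the N:M pattern concrete, the sketch below builds a 2:4 mask (keep the two largest-magnitude weights in every group of four) and attenuates pruned weights by a decaying factor instead of zeroing them outright, in the spirit of the "pruning mask decay" recipe named above. The helper names and the decay schedule are illustrative assumptions, not the authors' implementation.

```python
import jax.numpy as jnp

def nm_mask(w, n=2, m=4):
    """Keep the n largest-magnitude weights in every group of m (here 2:4)."""
    groups = jnp.abs(w).reshape(-1, m)      # one row per group of m weights
    order = jnp.argsort(groups, axis=1)     # ascending by magnitude
    rank = jnp.argsort(order, axis=1)       # rank of each element in its group
    mask = (rank >= m - n).astype(w.dtype)  # top-n ranks survive
    return mask.reshape(w.shape)

def apply_decayed_mask(w, mask, decay):
    """Attenuate pruned weights by decay in [0, 1] instead of hard-zeroing."""
    return w * mask + w * (1.0 - mask) * decay

w = jnp.array([0.9, -0.1, 0.05, -0.7, 0.2, 0.8, -0.6, 0.01])
mask = nm_mask(w)                                # 2:4 structured pattern
w_mid = apply_decayed_mask(w, mask, decay=0.5)   # mid-schedule: soft pruning
w_end = apply_decayed_mask(w, mask, decay=0.0)   # end of schedule: hard 2:4
```

Annealing the decay from 1 toward 0 over training turns the mask from a soft penalty into the hard 2:4 pattern that sparsity-aware accelerators can exploit.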
Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers
N:M Structured sparsity has garnered significant interest as a result of
relatively modest overhead and improved efficiency. Additionally, this form of
sparsity holds considerable appeal for reducing the memory footprint owing to
its modest representation overhead. There have been efforts to develop
training recipes for N:M structured sparsity; however, they primarily focus on
low-sparsity regions (~50%). Nonetheless, performance of models trained
using these approaches tends to decline when confronted with high-sparsity
regions (>80%). In this work, we study the effectiveness of existing sparse
training recipes at high-sparsity regions and argue that these methods
fail to sustain the model quality on par with low-sparsity regions. We
demonstrate that the significant factor contributing to this disparity is the
presence of elevated levels of induced noise in the gradient magnitudes. To
mitigate this undesirable effect, we employ decay mechanisms to progressively
restrict the flow of gradients towards pruned elements. Our approach improves
the model quality by up to 2% and 5% in vision and language models in the
high-sparsity regime, respectively. We also evaluate the trade-off between
model accuracy and training compute cost in terms of FLOPs. At iso-training
FLOPs, our method yields better performance compared to conventional sparse
training recipes, exhibiting an accuracy improvement of up to 2%. The source
code is available at
https://github.com/abhibambhaniya/progressive_gradient_flow_nm_sparsity.
Comment: 18 pages, 8 figures, 17 tables. Code is available at https://github.com/abhibambhaniya/progressive_gradient_flow_nm_sparsity
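A minimal sketch of the decay mechanism described above: gradients flowing to pruned weights are scaled by a factor that shrinks over training, rather than being cut off abruptly. The linear schedule and function names are assumptions for illustration; the paper's actual recipes may differ (see the linked repository).

```python
import jax.numpy as jnp

def decayed_grad(grad, mask, step, total_steps):
    """Scale gradients of pruned (mask == 0) weights by a factor that decays
    from 1 to 0 over training, progressively restricting gradient flow."""
    beta = jnp.clip(1.0 - step / total_steps, 0.0, 1.0)
    return grad * mask + grad * (1.0 - mask) * beta

def sgd_step(w, grad, mask, step, total_steps, lr=0.1):
    """Plain SGD under the decayed gradient: pruned weights still receive
    (shrinking) updates early on, damping the induced gradient noise the
    paper identifies at high sparsity."""
    return w - lr * decayed_grad(grad, mask, step, total_steps)
```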
Exploring the Potential of Chemical Constituents of Datura metel in Breast Cancer from Molecular Docking Studies
Breast cancer remains a pervasive health challenge worldwide, prompting the exploration of novel therapeutic prospects. Datura metel has long been recognized for its pharmacological properties, particularly for containing various bioactive compounds such as alkaloids, flavonoids, and terpenoids. This review focuses on the potential of chemical constituents sourced from Datura metel, a traditional medicinal plant, in combating breast cancer, primarily through molecular docking studies. The review scrutinizes the chemical composition of Datura metel, emphasizing the identified compounds known for their therapeutic attributes. Through an extensive analysis of molecular docking studies, the interactions between these Datura metel constituents and crucial molecular targets associated with breast cancer are elucidated. The phytoconstituents (compounds 1-13) were found to be more potent than tamoxifen citrate, the standard anticancer drug. These findings call for further exploration, highlighting a promising avenue in the pursuit of effective and targeted treatments for breast cancer. In conclusion, this review emphasizes the synergistic integration of computational approaches with traditional knowledge, accelerating the discovery and development of innovative breast cancer therapies.
USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models
End-to-end automatic speech recognition (ASR) models have seen revolutionary
quality gains with the recent development of large-scale universal speech
models (USM). However, deploying these massive USMs is extremely expensive due
to the enormous memory usage and computational cost. Therefore, model
compression is an important research topic to fit USM-based ASR under budget in
real-world scenarios. In this study, we propose a USM fine-tuning approach for
ASR, with a low-bit quantization and N:M structured sparsity aware paradigm on
the model weights, reducing the model complexity from parameter precision and
matrix topology perspectives. We conducted extensive experiments with a
2-billion parameter USM on a large-scale voice search dataset to evaluate our
proposed method. A series of ablation studies validate the effectiveness of up
to int4 quantization and 2:4 sparsity. However, a single compression technique
fails to recover the performance well under extreme setups including int2
quantization and 1:4 sparsity. By contrast, our proposed method can compress
the model to 9.4% of its original size, at the cost of only a 7.3% relative word
error rate (WER) regression. We also provide in-depth analyses of the results
and discussions of the limitations and potential solutions, which would be
valuable for future studies.
Comment: Accepted by ICASSP 2024. Preprint.
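As a rough illustration of the compression paradigm above, the sketch below simulates symmetric int4 quantization ("fake quant") and composes it with a 2:4 sparsity mask in a forward pass, so fine-tuning sees the compressed weights while the parameters stay in float. The per-tensor scaling and helper names are simplifying assumptions; the paper's actual scheme is more elaborate.

```python
import jax.numpy as jnp

def nm_mask(w, n=2, m=4):
    """2:4 mask: keep the n largest-magnitude weights per group of m."""
    rank = jnp.argsort(jnp.argsort(jnp.abs(w).reshape(-1, m), axis=1), axis=1)
    return (rank >= m - n).astype(w.dtype).reshape(w.shape)

def fake_quant_int4(w):
    """Simulated symmetric per-tensor int4 quantization (levels in [-8, 7])."""
    scale = jnp.maximum(jnp.max(jnp.abs(w)) / 7.0, 1e-8)
    return jnp.clip(jnp.round(w / scale), -8, 7) * scale

def compressed_forward(x, w):
    """Forward pass through a weight matrix viewed as int4-quantized and 2:4-sparse."""
    return x @ (fake_quant_int4(w) * nm_mask(w))
```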
JaxPruner: A concise library for sparsity research
This paper introduces JaxPruner, an open-source JAX-based pruning and sparse
training library for machine learning research. JaxPruner aims to accelerate
research on sparse neural networks by providing concise implementations of
popular pruning and sparse training algorithms with minimal memory and latency
overhead. Algorithms implemented in JaxPruner use a common API and work
seamlessly with the popular optimization library Optax, which, in turn, enables
easy integration with existing JAX based libraries. We demonstrate this ease of
integration by providing examples in four different codebases: Scenic, t5x,
Dopamine, and FedJAX, and provide baseline experiments on popular benchmarks.
Comment: JaxPruner is hosted at http://github.com/google-research/jaxpruner
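The Optax integration mentioned above can be pictured with the generic pattern below: a pruning step expressed as an optax.GradientTransformation chains with any Optax optimizer. This is plain Optax sketched from its public API, not JaxPruner's own interface; consult the linked repository for the library's actual usage.

```python
import jax
import optax

def mask_updates(mask):
    """An Optax transformation that zeroes updates to pruned parameters."""
    def init_fn(params):
        del params
        return optax.EmptyState()
    def update_fn(updates, state, params=None):
        del params
        masked = jax.tree_util.tree_map(lambda u, m: u * m, updates, mask)
        return masked, state
    return optax.GradientTransformation(init_fn, update_fn)

# Chain the masking step with a standard optimizer; a training loop uses the
# combined transformation exactly as it would use optax.adam alone.
def make_optimizer(mask, lr=1e-3):
    return optax.chain(mask_updates(mask), optax.adam(lr))
```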
PaLM: Scaling Language Modeling with Pathways
Large language models have been shown to achieve remarkable performance
across a variety of natural language tasks using few-shot learning, which
drastically reduces the number of task-specific training examples needed to
adapt the model to a particular application. To further our understanding of
the impact of scale on few-shot learning, we trained a 540-billion parameter,
densely activated, Transformer language model, which we call Pathways Language
Model (PaLM). We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML
system which enables highly efficient training across multiple TPU Pods. We
demonstrate continued benefits of scaling by achieving state-of-the-art
few-shot learning results on hundreds of language understanding and generation
benchmarks. On a number of these tasks, PaLM 540B achieves breakthrough
performance, outperforming the finetuned state-of-the-art on a suite of
multi-step reasoning tasks, and outperforming average human performance on the
recently released BIG-bench benchmark. A significant number of BIG-bench tasks
showed discontinuous improvements from model scale, meaning that performance
steeply increased as we scaled to our largest model. PaLM also has strong
capabilities in multilingual tasks and source code generation, which we
demonstrate on a wide array of benchmarks. We additionally provide a
comprehensive analysis on bias and toxicity, and study the extent of training
data memorization with respect to model scale. Finally, we discuss the ethical
considerations related to large language models and discuss potential
mitigation strategies.
Improvement of Voltage output for Distribution System under Transient Condition with Dynamic Voltage Restorer
Voltage sags and swells in the medium- and low-voltage distribution grid are considered the most frequent type of power quality problem according to recent power quality studies, and their impact on sensitive loads is severe. In this paper, the performance of voltage-source converter-based series compensators used for load voltage control in an electrical power distribution network is analyzed and compared when a nonlinear load is connected across the load bus. Possible control schemes and their effects on oscillation attenuation are also studied. The studied control schemes include the commonly used single voltage loop control, voltage feedback plus reference feed-forward control, and double-loop control with an outer voltage loop and an inner current loop. The paper describes DVR principles and voltage restoration methods for balanced and/or unbalanced voltage sags and swells in a distribution system. Simulation results are presented to illustrate the performance of the DVR under voltage sag/swell conditions; the results are verified in MATLAB using a model of the three-phase DVR.
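As a schematic of the double-loop scheme studied above, the sketch below cascades two discrete-time PI controllers: the outer voltage loop turns the load-voltage error into an injected-current reference, and the inner current loop turns the current error into the converter voltage command. The gains, sample time, and signal names are illustrative assumptions, not values from the paper.

```python
def pi_step(error, integ, kp, ki, dt):
    """One discrete-time PI controller step; returns (output, updated integrator)."""
    integ += error * dt
    return kp * error + ki * integ, integ

def dvr_double_loop(v_ref, v_load, i_inj, v_integ, i_integ, dt=1e-4):
    """Outer voltage loop sets the current reference; inner current loop sets
    the converter voltage command injected in series by the DVR."""
    i_ref, v_integ = pi_step(v_ref - v_load, v_integ, kp=2.0, ki=200.0, dt=dt)
    v_cmd, i_integ = pi_step(i_ref - i_inj, i_integ, kp=5.0, ki=500.0, dt=dt)
    return v_cmd, v_integ, i_integ
```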