Delving into Variance Transmission and Normalization: Shift of Average Gradient Makes the Network Collapse
Normalization operations are essential for state-of-the-art neural networks
and enable us to train a network from scratch with a large learning rate (LR).
We attempt to explain the real effect of Batch Normalization (BN) from the
perspective of variance transmission by investigating the relationship between
BN and Weight Normalization (WN). In this work, we demonstrate that the shift
of the average gradient amplifies the variance of every convolutional (conv)
layer. We propose Parametric Weights Standardization (PWS), a fast module for
conv filters that is robust to mini-batch size, to counteract the shift of the
average gradient. PWS provides the training speed-up of BN while requiring
less computation and leaving the output of a conv layer unchanged. PWS enables
the network to converge fast without normalizing the outputs. This result
strengthens the case for the shift of the average gradient and explains why BN
works from the perspective of variance transmission. The code and appendix
will be made available at https://github.com/lyxzzz/PWSConv.
Comment: This paper has been accepted by AAAI2
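The abstract does not spell out the PWS formula, but the weight standardization it builds on, standardizing each conv filter to zero mean and unit variance before rescaling, can be sketched as follows. The `gamma` scale, the tensor layout, and the function name are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def standardize_filters(w, gamma=1.0, eps=1e-5):
    """Standardize each conv filter (out_channels-first layout) to zero
    mean and unit variance, then rescale by a learnable-style gamma."""
    flat = w.reshape(w.shape[0], -1)           # one row per filter
    mean = flat.mean(axis=1, keepdims=True)    # per-filter mean
    std = flat.std(axis=1, keepdims=True)      # per-filter std
    out = gamma * (flat - mean) / (std + eps)
    return out.reshape(w.shape)

rng = np.random.default_rng(0)
w = rng.normal(loc=0.3, scale=2.0, size=(8, 3, 3, 3))
ws = standardize_filters(w)
# Removing each filter's mean is what counters the shift of the
# average gradient described in the abstract.
print(np.allclose(ws.reshape(8, -1).mean(axis=1), 0.0, atol=1e-6))
```

Note that this operates on the weights only, which is consistent with the abstract's claim that PWS does not normalize the conv layer's outputs.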
Happiness Inequality in China
Along with China becoming an upper-middle-income country from a lower-middle-income one after 2009, happiness inequality in China has widened. Based on the Chinese General Social Survey (CGSS) database (2003-2012), this paper investigates the determinants of happiness inequality in China and explores what factors contributed to its enlargement after 2009. We find that a rise in income inequality, as well as in the population share of middle-aged cohorts, widens China’s happiness inequality, while an increase in income or education level reduces it. Owning a house and being employed also reduce happiness inequality. A decomposition analysis shows that the deterioration of China’s happiness inequality is mainly caused by coefficient effects, i.e., the relationships between happiness inequality and its influencing factors have changed, reflecting the dramatic changes in the Chinese economy and society. Among the coefficient effects, regional heterogeneity plays an important role. Policies that enhance economic performance and education, as well as reduce income inequality and regional inequality, can help to reduce happiness inequality and improve social harmony in China.
ScalAna: Automating Scaling Loss Detection with Graph Analysis
Scaling a parallel program to modern supercomputers is challenging due to
inter-process communication, Amdahl's law, and resource contention.
Performance analysis tools for finding such scaling bottlenecks are based on
either profiling or tracing. Profiling incurs low overhead but does not
capture the detailed dependencies needed for root-cause analysis. Tracing
collects all information, at prohibitive overhead. In this work, we design
ScalAna, which uses static analysis techniques to achieve the best of both
worlds: it enables the analyzability of traces at a cost similar to profiling.
ScalAna first leverages static compiler techniques to build a Program
Structure Graph, which records the main computation and communication patterns
as well as the program's control structures. At runtime, we adopt lightweight
techniques to collect performance data according to the graph structure and
generate a Program Performance Graph. With this graph, we propose a novel
approach, called backtracking root cause detection, which can automatically
and efficiently detect the root cause of scaling loss. We evaluate ScalAna
with real applications. Results show that our approach can effectively locate
the root cause of scaling loss for real applications and incurs 1.73% overhead
on average for up to 2,048 processes. We achieve up to 11.11% performance
improvement by fixing the root causes detected by ScalAna on 2,048 processes.
Comment: conferenc
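The abstract leaves the backtracking procedure undefined; a minimal toy sketch of the idea, walking backward along cause edges of a performance graph from the visible symptom toward the vertex with the largest cost, might look like the following. The graph encoding, vertex names, and costs are all hypothetical, not ScalAna's actual data model:

```python
# Toy "program performance graph": each vertex carries a cost (a proxy
# for its contribution to scaling loss), and an edge u -> v means u's
# delay propagates to v.

def find_root_cause(graph, costs, symptom):
    """Backtrack from the symptom vertex along incoming (cause) edges,
    always moving to the most expensive predecessor, until no
    predecessor exceeds the current vertex's cost."""
    current = symptom
    while True:
        preds = [v for v, targets in graph.items() if current in targets]
        if not preds:
            return current
        worst = max(preds, key=lambda v: costs[v])
        if costs[worst] <= costs[current]:
            return current
        current = worst

# An imbalanced compute loop delays a collective, which delays the
# final reduction that shows up as the visible slowdown.
graph = {"compute_loop": {"allreduce"}, "allreduce": {"final_reduce"}}
costs = {"compute_loop": 9.0, "allreduce": 6.0, "final_reduce": 5.0}
print(find_root_cause(graph, costs, "final_reduce"))  # compute_loop
```

The point of the sketch is the traversal direction: diagnosis starts at the symptom and follows dependency edges backward, rather than scanning every vertex.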
Recommended from our members
Protective effect of human serum amyloid P on CCl4-induced acute liver injury in mice.
Human serum amyloid P (hSAP), a member of the pentraxin family, inhibits the activation of fibrocytes in culture and inhibits experimental renal, lung, skin and cardiac fibrosis. As hepatic inflammation is one of the causes of liver fibrosis, in the present study, we investigated the hepatoprotective effects of hSAP against carbon tetrachloride (CCl4)-induced liver injury. Our data indicated that hSAP attenuated hepatic histopathological abnormalities and significantly decreased inflammatory cell infiltration and pro-inflammatory factor expression. Moreover, CCl4-induced apoptosis in the mouse liver was inhibited by hSAP, as measured by terminal deoxynucleotidyl transferase-mediated nick-end labeling (TUNEL) assay and cleaved caspase-3 expression. hSAP significantly restored the expression of B cell lymphoma/leukemia (Bcl)-2 and suppressed the expression of Bcl-2-associated X protein (Bax) in vivo. The number of hepatocytes in early apoptosis stained with Annexin V was significantly reduced, by 28-30%, in the hSAP treatment group compared with the CCl4 group, and the expression of Bcl-2 was increased, whereas the expression of Bax and cleaved caspase-3 was significantly inhibited in the hSAP pre-treatment group compared with the CCl4 group. hSAP administration also inhibited the migration and activation of hepatic stellate cells (HSCs) in the CCl4-injured liver and suppressed the activation of isolated primary HSCs induced by transforming growth factor (TGF)-β1 in vitro. Collectively, these findings suggest that hSAP exerts a protective effect against CCl4-induced hepatic injury by suppressing the inflammatory response and hepatocyte apoptosis, potentially by inhibiting HSC activation.
A Filtering Algorithm for Maneuvering Target Tracking Based on Smoothing Spline Fitting
Maneuvering target tracking is challenging: a target's sudden changes in speed or direction can cause a conventional filtering tracker to diverge. To improve the accuracy of maneuvering target tracking, we propose a tracking algorithm based on smoothing spline fitting. The curve fitted to the historical point trace captures the target's maneuvering behavior. The innovation of this paper is that no dynamic motion model is assumed; prediction is based purely on the curve fitted over the measured data. Monte Carlo simulation results show that, when sea targets maneuver, the proposed algorithm is more accurate than the conventional Kalman filter algorithm and the interacting multiple model (IMM) filtering algorithm, while maintaining a simple structure and a small storage footprint.
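The core idea of predicting purely from a curve fit over past measurements, with no motion model, can be sketched as follows. A low-order polynomial fit stands in for the paper's smoothing spline, and the function name and uniform-sampling assumption are illustrative:

```python
import numpy as np

def predict_next(times, positions, degree=2):
    """Fit a low-order polynomial to the measured track (a stand-in
    for the paper's smoothing spline) and extrapolate one step ahead."""
    coeffs = np.polyfit(times, positions, deg=degree)
    dt = times[-1] - times[-2]              # assume roughly uniform sampling
    return np.polyval(coeffs, times[-1] + dt)

# A target accelerating along one axis: x(t) = 0.5 * t**2.
t = np.arange(6, dtype=float)               # measurements at t = 0..5
x = 0.5 * t**2
pred = predict_next(t, x)                   # extrapolate to t = 6
print(round(float(pred), 3))                # 18.0 (= 0.5 * 36)
```

Because the fit tracks whatever the measurements did, an accelerating target is handled without encoding a constant-velocity or constant-turn model, which is exactly the point the abstract makes.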
Domain Adaptive Code Completion via Language Models and Decoupled Domain Databases
Large Language Models (LLMs) have demonstrated remarkable performance in code
completion. However, due to a lack of domain-specific knowledge, they may be
suboptimal at completing code that requires intensive domain knowledge, for
example, library names. Although several works have confirmed the
effectiveness of fine-tuning to adapt language models for code completion in
specific domains, these approaches are limited by the need to constantly
re-fine-tune the model as the project iterates.
To address this limitation, in this paper, we propose NM-LM, a
retrieval-augmented language model (R-LM) that integrates domain knowledge
into language models without fine-tuning. Unlike previous techniques, our
approach automatically adapts to different language models and domains.
Specifically, it utilizes in-domain code to build a retrieval-based database
decoupled from the LM, and then combines the database with the LM through
Bayesian inference to complete the code. Extensive experiments on
intra-project and intra-scenario completion confirm that NM-LM brings
appreciable enhancements over CodeGPT and UniXcoder. A deep analysis of our
tool, covering response speed, storage usage, specific-type code completion,
and API invocation completion, confirms that NM-LM provides satisfactory
performance, which renders it highly appropriate for domain-adaptive code
completion. Furthermore, our approach operates without requiring direct
access to the language model's parameters. As a result, it can seamlessly
integrate with black-box code completion models, making it easy to use our
approach as a plugin to further enhance the performance of these models.
Comment: Accepted by ASE202
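The abstract does not detail the Bayesian combination; one common way such retrieval-LM combinations are realized (kNN-LM-style interpolation of next-token distributions) can be sketched as follows. The vocabulary, distributions, and mixing weight are all made up for illustration:

```python
import numpy as np

def combine(p_lm, p_retrieval, lam=0.6):
    """Interpolate the LM's next-token distribution with a distribution
    built from retrieved in-domain contexts (kNN-LM-style; the paper's
    'Bayesian inference' combination may differ in detail)."""
    p = lam * p_retrieval + (1.0 - lam) * p_lm
    return p / p.sum()                      # renormalize defensively

vocab = ["requests", "numpy", "mylib"]      # toy import-name vocabulary
p_lm = np.array([0.6, 0.35, 0.05])          # generic LM barely knows 'mylib'
p_ret = np.array([0.05, 0.05, 0.90])        # in-domain database prefers it
p = combine(p_lm, p_ret)
print(vocab[int(np.argmax(p))])             # mylib
```

Because the retrieval database is a separate data structure, it can be rebuilt whenever the project changes without touching the LM's weights, which is the decoupling the abstract emphasizes.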
NEDD8 Modification of CUL1 Dissociates p120CAND1, an Inhibitor of CUL1-SKP1 Binding and SCF Ligases
Cullin proteins assemble a large number of RING E3 ubiquitin ligases and regulate various physiological processes. Covalent modification of cullins by the ubiquitin-like protein NEDD8 activates cullin ligases through an as-yet-undefined mechanism. We show here that p120(CAND1) selectively binds to unneddylated CUL1 and is dissociated by CUL1 neddylation. CAND1 formed a ternary complex with CUL1 and ROC1. CAND1 dissociated SKP1 from CUL1 and inhibited SCF ligase activity in vitro. Suppression of CAND1 in vivo increased the level of the CUL1-SKP1 complex. We suggest that by restricting the SKP1-CUL1 interaction, CAND1 regulates the assembly of productive SCF ubiquitin ligases, allowing a common CUL1-ROC1 core to be utilized by a large number of SKP1-F box-substrate subcomplexes.
Learning the Relation between Similarity Loss and Clustering Loss in Self-Supervised Learning
Self-supervised learning enables networks to learn discriminative features
from massive unlabeled data. Most state-of-the-art methods maximize the
similarity between two augmentations of one image based on contrastive
learning. By exploiting the consistency of the two augmentations, the burden
of manual annotation is removed. Contrastive learning exploits instance-level
information to learn robust features. However, the learned information may be
confined to different views of the same instance. In this paper, we attempt
to leverage the similarity between two distinct images to boost
representations in self-supervised learning. In contrast to instance-level
information, the similarity between two distinct images may provide more
useful information. Besides, we analyze the relation between the similarity
loss and the feature-level cross-entropy loss. These two losses are essential
for most deep learning methods, yet the relation between them is not clear.
The similarity loss helps obtain instance-level representations, while the
feature-level cross-entropy loss helps mine the similarity between two
distinct images. We provide theoretical analyses and experiments to show that
a suitable combination of these two losses can achieve state-of-the-art
results. Code is available at https://github.com/guijiejie/ICCL.
Comment: This paper is accepted by IEEE Transactions on Image Processin
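The abstract does not give the exact form of either loss; one plausible sketch of combining a cosine similarity loss with a feature-level cross-entropy loss follows. The softmax-over-features reading of the cross-entropy term and the weight `alpha` are assumptions for illustration, not the paper's definitions:

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def similarity_loss(z1, z2):
    """Negative mean cosine similarity between two views' embeddings."""
    return -np.mean(np.sum(l2_normalize(z1) * l2_normalize(z2), axis=-1))

def feature_cross_entropy(z1, z2, eps=1e-8):
    """Cross-entropy between softmax feature distributions of two views
    (one plausible reading of a feature-level cross-entropy loss)."""
    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)
    p, q = softmax(z1), softmax(z2)
    return -np.mean(np.sum(p * np.log(q + eps), axis=-1))

def combined_loss(z1, z2, alpha=0.5):
    # alpha is a hypothetical weighting, not a value from the paper
    return alpha * similarity_loss(z1, z2) \
        + (1 - alpha) * feature_cross_entropy(z1, z2)

rng = np.random.default_rng(1)
z = rng.normal(size=(4, 16))
# Identical views have cosine similarity 1, so the similarity loss is -1.
print(round(similarity_loss(z, z), 6))      # -1.0
```

The similarity term pulls two views of the same instance together, while the feature-level term compares whole feature distributions, which is where similarity between distinct images can enter.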