76 research outputs found
Improve Generalization and Robustness of Neural Networks via Weight Scale Shifting Invariant Regularizations
Using weight decay to penalize the L2 norms of weights in neural networks has
been a standard training practice to regularize the complexity of networks. In
this paper, we show that a family of regularizers, including weight decay, is
ineffective at penalizing the intrinsic norms of weights for networks with
positively homogeneous activation functions, such as linear, ReLU and
max-pooling functions. As a result of homogeneity, functions specified by the
networks are invariant to the shifting of weight scales between layers. The
ineffective regularizers are sensitive to such shifting and thus poorly
regularize the model capacity, leading to overfitting. To address this
shortcoming, we propose an improved regularizer that is invariant to weight
scale shifting and thus effectively constrains the intrinsic norm of a neural
network. The derived regularizer is an upper bound for the input gradient of
the network so minimizing the improved regularizer also benefits the
adversarial robustness. Residual connections are also considered and we show
that our regularizer also forms an upper bound to input gradients of such a
residual network. We demonstrate the efficacy of our proposed regularizer on
various datasets and neural network architectures at improving generalization
and adversarial robustness. Comment: 14 pages, 5 figures
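The scale-shifting invariance described in this abstract can be checked numerically. The following is a minimal sketch, assuming a bias-free two-layer ReLU network; the function names, the shapes, and the scale factor c are hypothetical and not taken from the paper. It shows that multiplying one layer by c and dividing the next by c leaves the output unchanged while the weight-decay penalty changes. The product of layer norms is included only as one simple quantity that stays fixed under the shift; it is not claimed to be the regularizer proposed by the authors.

```python
import numpy as np


def relu(x):
    """Positively homogeneous activation: relu(c * x) == c * relu(x) for c > 0."""
    return np.maximum(x, 0.0)


def two_layer_net(x, w1, w2):
    """Bias-free two-layer ReLU network: x -> w2 @ relu(w1 @ x)."""
    return w2 @ relu(w1 @ x)


def l2_penalty(w1, w2):
    """Weight-decay style penalty: sum of squared weights over both layers."""
    return float(np.sum(w1 ** 2) + np.sum(w2 ** 2))


rng = np.random.default_rng(0)
x = rng.normal(size=4)
w1 = rng.normal(size=(8, 4))
w2 = rng.normal(size=(3, 8))

c = 10.0  # hypothetical positive scale factor (assumption for illustration)
w1_shifted, w2_shifted = c * w1, w2 / c  # shift scale from layer 2 into layer 1

out_original = two_layer_net(x, w1, w2)
out_shifted = two_layer_net(x, w1_shifted, w2_shifted)

penalty_original = l2_penalty(w1, w2)
penalty_shifted = l2_penalty(w1_shifted, w2_shifted)

# Product of Frobenius norms: unchanged by the shift (illustrative only).
product_original = np.linalg.norm(w1) * np.linalg.norm(w2)
product_shifted = np.linalg.norm(w1_shifted) * np.linalg.norm(w2_shifted)

print("same function:", np.allclose(out_original, out_shifted))
print("L2 penalty before / after:", penalty_original, penalty_shifted)
print("norm product before / after:", product_original, product_shifted)
```

Because the two weight settings define the same function but receive different L2 penalties, a penalty of this kind can be changed without changing the model, which is the sensitivity the abstract points to.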
An Efficient Cervical Whole Slide Image Analysis Framework Based on Multi-scale Semantic and Spatial Deep Features
Digital gigapixel whole slide image (WSI) is widely used in clinical
diagnosis, and automated WSI analysis is key for computer-aided diagnosis.
Currently, WSI-level prediction mainly relies on analyzing an integrated
descriptor of probabilities or feature maps from a massive number of local
patches encoded by a ResNet classifier. Feature representations of the sparse
and tiny lesion cells in cervical slides, however, remain challenging for the
under-promoted upstream encoders, while the unused spatial representations of
cervical cells are available features that could supplement the semantic
analysis. In addition, sampling patches with overlap and repetitive processing
cause inefficiency and unpredictable side effects. This study designs a novel
inline connection network (InCNet) by enriching the multi-scale connectivity to
build the lightweight model named You Only Look Cytopathology Once (YOLCO) with
the additional supervision of spatial information. The proposed model allows
the input size to be enlarged to megapixel scale, so that the WSI can be stitched without any
overlap by the average repeats decreased from to
for collecting features and predictions at two scales. Based on Transformer for
classifying the integrated multi-scale multi-task features, the experimental
results appear AUC score better and faster than the best
conventional method in WSI classification on multicohort datasets of 2,019
slides from four scanning devices. Comment: 16 pages, 8 figures, already submitted to Medical Image Analysis
Cultural Alignment in Large Language Models: An Explanatory Analysis Based on Hofstede's Cultural Dimensions
The deployment of large language models (LLMs) raises concerns regarding
their cultural misalignment and potential ramifications on individuals from
various cultural norms. Existing work investigated political and social biases
and public opinions rather than their cultural values. To address this
limitation, the proposed Cultural Alignment Test (CAT) quantifies cultural
alignment using Hofstede's cultural dimension framework, which offers an
explanatory cross-cultural comparison through the latent variable analysis. We
apply our approach to assess the cultural values embedded in state-of-the-art
LLMs, such as ChatGPT and Bard, across the diverse cultures of four countries: the United
States (US), Saudi Arabia, China, and Slovakia, using different prompting
styles and hyperparameter settings. Our results not only quantify cultural
alignment of LLMs with certain countries, but also reveal the difference
between LLMs in explanatory cultural dimensions. While none of the LLMs provided
satisfactory results in understanding cultural values, GPT-4 exhibited the
highest CAT score for the cultural values of the US. Comment: 31 pages
Bayesian Nested Neural Networks for Uncertainty Calibration and Adaptive Compression
Nested networks or slimmable networks are neural networks whose architectures
can be adjusted instantly during testing time, e.g., based on computational
constraints. Recent studies have focused on a "nested dropout" layer, which is
able to order the nodes of a layer by importance during training, thus
generating a nested set of sub-networks that are optimal for different
configurations of resources. However, the dropout rate is fixed as a
hyper-parameter over different layers during the whole training process.
Therefore, when nodes are removed, the performance decays in a human-specified
trajectory rather than in a trajectory learned from data. Another drawback is
that the generated sub-networks are deterministic networks without well-calibrated
uncertainty. To address these two problems, we develop a Bayesian approach to
nested neural networks. We propose a variational ordering unit that draws
samples for nested dropout at a low cost, from a proposed Downhill
distribution, which provides useful gradients to the parameters of nested
dropout. Based on this approach, we design a Bayesian nested neural network
that learns the order knowledge of the node distributions. In experiments, we
show that the proposed approach outperforms the nested network in terms of
accuracy, calibration, and out-of-domain detection in classification tasks. It
also outperforms the related approach on uncertainty-critical tasks in computer
vision. Comment: 16 pages, 10 figures
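For context, the fixed-rate "nested dropout" baseline that this abstract contrasts itself with can be sketched as follows. This is a minimal illustration of the commonly described form, in which a cutoff index is drawn from a geometric distribution and every unit after the cutoff is dropped. It is not the variational ordering unit or the Downhill distribution proposed by the authors, and the function name and the keep_prob parameter are hypothetical.

```python
import numpy as np


def nested_dropout_mask(num_units, keep_prob, rng):
    """Sample a prefix mask: the first `cutoff` units are kept, the rest dropped.

    The cutoff is drawn from a geometric distribution with a fixed success
    probability (1 - keep_prob), so unit i (0-indexed) is kept with
    probability keep_prob ** i. The fixed rate is the hand-set hyper-parameter.
    """
    if not 0.0 <= keep_prob < 1.0:
        raise ValueError("keep_prob must be in [0, 1)")
    # rng.geometric returns values >= 1; clip so the cutoff fits the layer.
    cutoff = min(int(rng.geometric(1.0 - keep_prob)), num_units)
    mask = np.zeros(num_units)
    mask[:cutoff] = 1.0
    return mask


rng = np.random.default_rng(0)
masks = np.stack([nested_dropout_mask(8, 0.7, rng) for _ in range(1000)])

# Empirical keep frequency per unit: earlier units are kept at least as often
# as later ones, which is what imposes an importance ordering on the nodes.
keep_frequency = masks.mean(axis=0)
print(keep_frequency)
```

Since keep_prob is chosen by hand, the rate at which later units are dropped follows a preset curve, which corresponds to the "human-specified trajectory" mentioned in the abstract.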
Offloading Decision Algorithm Based on Distance Weighted K-Nearest Neighbor in Power Internet of Things
With the widespread adoption of the power Internet of Things (PIoT), the data collected from smart meters is growing explosively, which makes the computation tasks for power data increasingly complex. To improve computing power and maximize resource utilization, an offloading decision algorithm based on weighted K-nearest neighbor (WKNN) is proposed. It first collects the training set required by the WKNN-based algorithm, including the Received Signal Strength (RSS) required for offloading, the transmission rate, and the load balance of the Access Point (AP); the Euclidean distance between each training point and the sample is then weighted by a Gaussian function. Finally, the offloading result is determined from the K most similar entries in the training set. The simulation results show that the proposed algorithm effectively reduces the offloading delay of the computing tasks and improves the resource utilization rate as the number of meters in the network increases, which ensures that the resources of the mobile edge computing (MEC) servers in the system can be utilized effectively and evenly.
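The decision rule outlined in this abstract can be sketched as a Gaussian-weighted K-nearest-neighbor vote. The following is a minimal sketch under stated assumptions: the features (RSS, transmission rate, AP load) are assumed to be normalized to [0, 1], the labels are assumed to be identifiers of the offloading target, and sigma, k, and all function names are hypothetical. The abstract does not specify these details, so this should not be read as the authors' implementation.

```python
import math
from collections import defaultdict


def gaussian_weight(distance, sigma=1.0):
    """Turn a Euclidean distance into a similarity weight in (0, 1]."""
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


def wknn_decide(train_features, train_labels, sample, k=3, sigma=1.0):
    """Return the label with the largest Gaussian-weighted vote among the
    k training points closest to `sample`."""
    distances = [
        (math.dist(features, sample), label)
        for features, label in zip(train_features, train_labels)
    ]
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]

    votes = defaultdict(float)
    for distance, label in nearest:
        votes[label] += gaussian_weight(distance, sigma)
    return max(votes, key=votes.get)


# Hypothetical training set: (normalized RSS, transmission rate, AP load).
train_features = [
    (0.90, 0.80, 0.20),
    (0.85, 0.75, 0.30),
    (0.20, 0.30, 0.90),
    (0.25, 0.20, 0.80),
    (0.30, 0.35, 0.85),
]
train_labels = ["AP1", "AP1", "AP2", "AP2", "AP2"]  # assumed offloading targets

decision = wknn_decide(train_features, train_labels, (0.80, 0.70, 0.25), k=3)
print(decision)
```

Weighting by distance means that a close neighbor counts for more than a distant one, so a single far-away point among the K neighbors has less influence than it would in an unweighted majority vote.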