One Forward is Enough for Neural Network Training via Likelihood Ratio Method
While backpropagation (BP) is the mainstream approach for gradient
computation in neural network training, its heavy reliance on the chain rule of
differentiation constrains the design flexibility of network architectures
and training pipelines. We avoid the recursive computation in BP and develop a
unified likelihood ratio (ULR) method for gradient estimation with just one
forward propagation. Not only can ULR be extended to train a wide variety of
neural network architectures, but the computation flow in BP can also be
rearranged by ULR for better device adaptation. Moreover, we propose several
variance reduction techniques to further accelerate the training process. Our
experiments offer numerical results across diverse aspects, including various
neural network training scenarios, computation flow rearrangement, and
fine-tuning of pre-trained models. All findings demonstrate that ULR
effectively enhances the flexibility of neural network training by permitting
localized module training without compromising the global objective and
significantly boosts network robustness.
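As a rough illustration of the likelihood ratio idea (not the authors' ULR implementation), the sketch below estimates the gradient of an expected loss with respect to the weights of a single noisy linear layer, using forward evaluations only. The additive Gaussian noise model, the mean-loss baseline used for variance reduction, and all names (`lr_gradient`, `sigma`, `n_samples`) are assumptions made for this example.

```python
import numpy as np

def lr_gradient(W, x, loss_fn, sigma=0.5, n_samples=1000, rng=None):
    """Score-function (likelihood ratio) estimate of d E[loss(z)] / dW
    for z = W @ x + sigma * eps, eps ~ N(0, I). No backpropagation is used."""
    rng = np.random.default_rng() if rng is None else rng
    # Assumption: Gaussian noise is injected on the layer output.
    eps = rng.standard_normal((n_samples, W.shape[0]))
    z = W @ x + sigma * eps                 # shape (n_samples, d_out)
    losses = loss_fn(z)                     # shape (n_samples,)
    # Assumption: a simple mean-loss baseline is used to reduce variance.
    centered = losses - losses.mean()
    # Gradient of log N(z; W x, sigma^2 I) w.r.t. the mean W x is eps / sigma.
    score = eps / sigma
    grad_mean = (centered[:, None] * score).mean(axis=0)   # shape (d_out,)
    # Chain through the mean W x: d(W x)_i / dW_ij = x_j.
    return np.outer(grad_mean, x)

# Usage: quadratic loss, whose exact gradient is (W x - y) x^T.
W = np.array([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]])
x = np.array([1.0, -1.0, 0.5])
y = np.array([0.2, -0.4])
quadratic = lambda z: 0.5 * np.sum((z - y) ** 2, axis=1)
estimate = lr_gradient(W, x, quadratic, n_samples=200_000,
                       rng=np.random.default_rng(0))
```

The estimate is unbiased up to the small effect of the batch-mean baseline, but its variance grows with the loss scale and shrinks only as more samples are drawn, which is why variance reduction matters for methods of this kind.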
Remote Sensing of Environmental Changes in Cold Regions
This Special Issue gathers papers reporting recent advances in the remote sensing of cold regions. It includes contributions presenting improvements in modeling microwave emissions from snow, assessment of satellite-based sea ice concentration products, satellite monitoring of ice jam and glacier lake outburst floods, satellite mapping of snow depth and soil freeze/thaw states, near-nadir interferometric imaging of surface water bodies, and remote sensing-based assessment of High Arctic lake environment and vegetation recovery from wildfire disturbances in Alaska. A comprehensive review is presented to summarize the achievements, challenges, and opportunities of cold land remote sensing.
A Novel Noise Injection-based Training Scheme for Better Model Robustness
Noise injection-based methods have been shown in previous work to improve the
robustness of artificial neural networks. In this work, we
propose a novel noise injection-based training scheme for better model
robustness. Specifically, we first develop a likelihood ratio method to
estimate the gradient with respect to both synaptic weights and noise levels
for stochastic gradient descent training. Then, we design an approximation for
the vanilla noise injection-based training method to reduce memory and improve
computational efficiency. Next, we apply our proposed scheme to spiking neural
networks and evaluate classification accuracy and robustness
on the MNIST and Fashion-MNIST datasets. Experimental results show that our proposed
method achieves much better adversarial robustness and
slightly better original accuracy than the
conventional gradient-based training method.
QuMoS: A Framework for Preserving Security of Quantum Machine Learning Model
Security has always been a critical issue in machine learning (ML)
applications. Due to the high cost of model training -- such as collecting
relevant samples, labeling data, and consuming computing power --
model-stealing attacks are among the most fundamental and important
issues. In quantum computing, quantum machine learning
(QML) model-stealing attacks also exist and are even more severe because
traditional encryption methods, such as homomorphic encryption, can hardly be
directly applied to quantum computation. On the other hand, due to limited
quantum computing resources, the monetary cost of training a QML model can be
even higher than that of a classical one in the near term. Therefore, a well-tuned QML
model developed by a third-party company can be delegated to a quantum cloud
provider as a service to be used by ordinary users. In this case, the QML model
will likely be leaked if the cloud provider is under attack. To address such a
problem, we propose a novel framework, namely QuMoS, to preserve model
security. We propose to divide the complete QML model into multiple parts and
distribute them to multiple physically isolated quantum cloud providers for
execution. As such, even if the adversary in a single provider can obtain a
partial model, it does not have sufficient information to retrieve the complete
model. Although this approach is promising, we observed that an arbitrary model design under
distributed settings cannot provide model security. We further developed a
reinforcement learning-based security engine, which can automatically optimize
the model design under the distributed setting, such that a good trade-off
between model performance and security can be made. Experimental results on
four datasets show that the model design proposed by QuMoS can achieve
competitive performance while providing higher security than the
baselines.
MFES-HB: Efficient Hyperband with Multi-Fidelity Quality Measurements
Hyperparameter optimization (HPO) is a fundamental problem in automatic
machine learning (AutoML). However, due to the expensive evaluation cost of
models (e.g., training deep learning models or training models on large
datasets), vanilla Bayesian optimization (BO) is typically computationally
infeasible. To alleviate this issue, Hyperband (HB) utilizes the early stopping
mechanism to speed up configuration evaluations by terminating
poorly performing configurations early. This leads to two kinds of quality
measurements: (1) many low-fidelity measurements for configurations that get
early-stopped, and (2) few high-fidelity measurements for configurations that
are evaluated without being early stopped. The state-of-the-art HB-style
method, BOHB, aims to combine the benefits of both BO and HB. Instead of
sampling configurations randomly in HB, BOHB samples configurations based on a
BO surrogate model, which is constructed with the high-fidelity measurements
only. However, the scarcity of high-fidelity measurements greatly hampers the
efficiency of BO to guide the configuration search. In this paper, we present
MFES-HB, an efficient Hyperband method that is capable of utilizing both the
high-fidelity and low-fidelity measurements to accelerate the convergence of
HPO tasks. Designing MFES-HB is not trivial as the low-fidelity measurements
can be biased yet informative to guide the configuration search. Thus we
propose to build a Multi-Fidelity Ensemble Surrogate (MFES) based on the
generalized Product of Experts framework, which can integrate useful
information from multi-fidelity measurements effectively. The empirical studies
on the real-world AutoML tasks demonstrate that MFES-HB can achieve 3.3-8.9x
speedups over the state-of-the-art approach, BOHB.
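To make the generalized Product of Experts idea concrete, the sketch below combines Gaussian predictions from several per-fidelity surrogates into one prediction by weighting their precisions. It is a generic gPoE combination, not the MFES-HB code: the function name `gpoe_combine` and the example weights are assumptions, and the abstract does not say how MFES-HB chooses its weights.

```python
import numpy as np

def gpoe_combine(means, variances, weights):
    """Combine Gaussian expert predictions N(mean_i, var_i) raised to the
    power weight_i. The product is Gaussian with precision
    sum_i(weight_i / var_i) and a precision-weighted mean."""
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = np.asarray(weights, dtype=float)
    precisions = weights / variances
    combined_var = 1.0 / precisions.sum()
    combined_mean = combined_var * (precisions * means).sum()
    return combined_mean, combined_var

# Usage: a noisy low-fidelity surrogate and a sharper high-fidelity one.
# Assumption: equal weights, chosen only for illustration.
mean, var = gpoe_combine(means=[0.30, 0.20],
                         variances=[0.04, 0.01],
                         weights=[0.5, 0.5])
```

With equal weights the combined mean is pulled toward the expert with the smaller variance, so a confident high-fidelity surrogate dominates while low-fidelity information still contributes.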
Prism: Revealing Hidden Functional Clusters from Massive Instances in Cloud Systems
Ensuring the reliability of cloud systems is critical for both cloud vendors
and customers. Cloud systems often rely on virtualization techniques to create
instances of hardware resources, such as virtual machines. However,
virtualization hinders the observability of cloud systems, making it
challenging to diagnose platform-level issues. To improve system observability,
we propose to infer functional clusters of instances, i.e., groups of instances
having similar functionalities. We first conduct a pilot study on a large-scale
cloud system, i.e., Huawei Cloud, demonstrating that instances having similar
functionalities share similar communication and resource usage patterns.
Motivated by these findings, we formulate the identification of functional
clusters as a clustering problem and propose a non-intrusive solution called
Prism. Prism adopts a coarse-to-fine clustering strategy. It first partitions
instances into coarse-grained chunks based on communication patterns. Within
each chunk, Prism further groups instances with similar resource usage patterns
to produce fine-grained functional clusters. Such a design reduces noise in
the data and allows Prism to process massive instances efficiently. We evaluate
Prism on two datasets collected from the real-world production environment of
Huawei Cloud. Our experiments show that Prism achieves a v-measure of ~0.95,
surpassing existing state-of-the-art solutions. Additionally, we illustrate the
integration of Prism within monitoring systems for enhanced cloud reliability
through two real-world use cases.
Comment: The paper was accepted by the 38th IEEE/ACM International Conference
on Automated Software Engineering (ASE 2023).
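For readers unfamiliar with the v-measure reported above, the following sketch computes it from scratch as the harmonic mean of homogeneity and completeness. It follows the standard definition of the metric, not Prism's evaluation code. The function names are assumptions made for this example.

```python
from collections import Counter
from math import log

def _entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())

def _conditional_entropy(target, given):
    """Conditional entropy H(target | given) in nats."""
    n = len(target)
    joint = Counter(zip(target, given))
    given_counts = Counter(given)
    return -sum(c / n * log(c / given_counts[g])
                for (_, g), c in joint.items())

def v_measure(true_labels, pred_labels):
    """Harmonic mean of homogeneity and completeness."""
    h_true = _entropy(true_labels)
    h_pred = _entropy(pred_labels)
    # Homogeneity: each predicted cluster contains a single true class.
    homogeneity = (1.0 if h_true == 0 else
                   1.0 - _conditional_entropy(true_labels, pred_labels) / h_true)
    # Completeness: each true class falls into a single predicted cluster.
    completeness = (1.0 if h_pred == 0 else
                    1.0 - _conditional_entropy(pred_labels, true_labels) / h_pred)
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)

# Usage: one true class is split across two predicted clusters.
score = v_measure([0, 0, 1, 1], [0, 0, 1, 2])
```

The score does not depend on how cluster IDs are numbered, so a clustering that matches the ground truth up to relabeling scores 1.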
A Large-scale Benchmark for Log Parsing
Log data is pivotal in activities like anomaly detection and failure
diagnosis in the automated maintenance of software systems. Because logs are
unstructured, log parsing is often required to transform them into a
structured format for automated analysis. A variety of log parsers exist,
making it vital to benchmark these tools to comprehend their features and
performance. However, existing datasets for log parsing are limited in terms of
scale and representativeness, posing challenges for studies that aim to
evaluate or develop log parsers. This problem becomes more pronounced when
these parsers are evaluated for production use. To address these issues, we
introduce a new collection of large-scale annotated log datasets, named LogPub,
which more accurately mirrors log data observed in real-world software systems.
LogPub comprises 14 datasets, each averaging 3.6 million log lines. Utilizing
LogPub, we re-evaluate 15 log parsers in a more rigorous and practical setting.
We also propose a new evaluation metric to lessen the sensitivity of current
metrics to imbalanced data distribution. Furthermore, we are the first to
scrutinize the detailed performance of log parsers on logs that represent rare
system events and offer comprehensive information for system troubleshooting.
Parsing such logs accurately is vital yet challenging. We believe that our work
could shed light on the design and evaluation of log parsers in more realistic
settings, thereby facilitating their implementation in production systems.
STFNets: Learning Sensing Signals from the Time-Frequency Perspective with Short-Time Fourier Neural Networks
Recent advances in deep learning motivate the use of deep neural networks in
Internet-of-Things (IoT) applications. These networks are modelled after signal
processing in the human brain, thereby leading to significant advantages at
perceptual tasks such as vision and speech recognition. IoT applications,
however, often measure physical phenomena, where the underlying physics (such
as inertia, wireless signal propagation, or the natural frequency of
oscillation) are fundamentally a function of signal frequencies, offering
better features in the frequency domain. This observation leads to a
fundamental question: For IoT applications, can one develop a new brand of
neural network structures that synthesize features inspired not only by the
biology of human perception but also by the fundamental nature of physics?
Hence, in this paper, instead of using conventional building blocks (e.g.,
convolutional and recurrent layers), we propose a new foundational neural
network building block, the Short-Time Fourier Neural Network (STFNet). It
integrates a widely-used time-frequency analysis method, the Short-Time Fourier
Transform, into data processing to learn features directly in the frequency
domain, where the physics of underlying phenomena leave better footprints.
STFNets bring additional flexibility to time-frequency analysis by offering
novel nonlinear learnable operations that are spectral-compatible. Moreover,
STFNets show that transforming signals to a domain that is more connected to
the underlying physics greatly simplifies the learning process. We demonstrate
the effectiveness of STFNets with extensive experiments. STFNets significantly
outperform the state-of-the-art deep learning models in all experiments. An
STFNet, therefore, demonstrates superior capability as the fundamental building
block of deep neural networks for IoT applications for various sensor inputs.
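The Short-Time Fourier Transform that STFNet builds on can be sketched in a few lines: the signal is cut into overlapping frames, each frame is windowed, and a Fourier transform is taken per frame. This is a plain STFT for illustration only and includes none of STFNet's learnable operations. The function name `stft`, the Hann window, and the frame and hop sizes are assumptions made for this example.

```python
import numpy as np

def stft(signal, frame_len=64, hop=32):
    """Return a (n_frames, frame_len // 2 + 1) complex spectrogram.
    Assumes len(signal) >= frame_len; trailing samples that do not fill
    a whole frame are dropped."""
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)          # assumption: Hann window
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Real-input FFT per frame: time-frequency representation.
    return np.fft.rfft(frames, axis=1)

# Usage: a 32 Hz sinusoid sampled at 256 Hz.
fs = 256
t = np.arange(512) / fs
spec = stft(np.sin(2 * np.pi * 32 * t))
peak_bins = np.abs(spec).argmax(axis=1)
```

With a 64-sample frame at 256 Hz, the frequency bins are 4 Hz apart, so the energy of the 32 Hz tone concentrates in bin 8 of every frame.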
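IDH1 mutants reduce growth factor deprivation-induced apoptosis by inhibiting JNK activation
Article summary: Resistance to apoptosis and the ability to grow when serum trophic factors are lacking are two hallmark features of tumor cells. JNK activation is required for serum starvation-induced apoptosis. Current studies indicate that the oncometabolite 2-hydroxyglutarate (2-HG) produced by IDH1 mutants is the main cause of mutation-driven tumorigenesis. However, it remains unclear whether 2-HG can inhibit JNK activation and thereby render cells resistant to serum starvation-induced apoptosis. The research group used IDH1 R132Q knock-in MEFs as the study model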