A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Actor-critic (AC) methods have exhibited great empirical success compared
with other reinforcement learning algorithms, where the actor uses the policy
gradient to improve the learning policy and the critic uses temporal difference
learning to estimate the policy gradient. Under the two time-scale learning
rate schedule, the asymptotic convergence of AC has been well studied in the
literature. However, the non-asymptotic convergence and finite sample
complexity of actor-critic methods are largely open. In this work, we provide a
non-asymptotic analysis for two time-scale actor-critic methods under the
non-i.i.d. setting. We prove that the actor-critic method is guaranteed to find
a first-order stationary point (i.e., $\|\nabla J(\boldsymbol{\theta})\|_{2}^{2} \le \epsilon$)
of the non-concave performance function $J(\boldsymbol{\theta})$, with an
$\widetilde{\mathcal{O}}(\epsilon^{-2.5})$ sample complexity. To the best of
our knowledge, this is the first work providing a finite-time analysis and
sample complexity bound for two time-scale actor-critic methods.

Comment: 45 pages
Distinguishing Computer-generated Graphics from Natural Images Based on Sensor Pattern Noise and Deep Learning
Computer-generated graphics (CGs) are images generated by computer software.
The rapid development of computer graphics technologies has made it easier to
generate photorealistic computer graphics, and these graphics are quite
difficult to distinguish from natural images (NIs) with the naked eye. In this
paper, we propose a method based on sensor pattern noise (SPN) and deep
learning to distinguish CGs from NIs. Before being fed into our convolutional
neural network (CNN)-based model, these images (CGs and NIs) are clipped into
image patches. Furthermore, three high-pass filters (HPFs) are used to remove
low-frequency signals, which represent the image content. These filters are
also used to reveal the residual signal as well as the SPN introduced by the
digital camera device. Different from the traditional methods of distinguishing
CGs from NIs, the proposed method utilizes a five-layer CNN to classify the
input image patches. Based on the classification results of the image patches,
we deploy a majority vote scheme to obtain the classification results for the
full-size images. The experiments demonstrate that (1) the proposed method
with three HPFs achieves better results than with only one HPF or none, and
(2) the proposed method with three HPFs achieves 100% accuracy even when the
NIs undergo JPEG compression with a quality factor of 75.

Comment: This paper has been published by Sensors. doi:10.3390/s18041296;
Sensors 2018, 18(4), 1296
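As a concrete illustration of the patch-and-vote pipeline, here is a hedged sketch in which classify_patch is a hypothetical stand-in for the five-layer CNN, and the single 5x5 kernel is a common SRM-style high-pass residual filter used only to show how low-frequency image content is suppressed; the paper's actual three HPFs and network architecture are described in the full text.

import numpy as np

# A common SRM-style 5x5 high-pass residual kernel (illustrative choice).
HPF = np.array([[-1,  2,  -2,  2, -1],
                [ 2, -6,   8, -6,  2],
                [-2,  8, -12,  8, -2],
                [ 2, -6,   8, -6,  2],
                [-1,  2,  -2,  2, -1]], dtype=np.float32) / 12.0

def high_pass(patch):
    # Valid 2-D convolution of a grayscale patch with the HPF kernel.
    h, w = patch.shape
    k = HPF.shape[0]
    out = np.zeros((h - k + 1, w - k + 1), dtype=np.float32)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(patch[i:i + k, j:j + k] * HPF)
    return out

def clip_into_patches(image, size=64):
    # Split a grayscale image into non-overlapping size x size patches.
    h, w = image.shape
    return [image[i:i + size, j:j + size]
            for i in range(0, h - size + 1, size)
            for j in range(0, w - size + 1, size)]

def classify_patch(residual):
    # Hypothetical stand-in for the five-layer CNN: 1 means CG, 0 means NI.
    return int(residual.var() < 1.0)  # placeholder decision rule

def classify_image(image):
    votes = [classify_patch(high_pass(p)) for p in clip_into_patches(image)]
    # Majority vote over patch predictions gives the full-size image label.
    return int(sum(votes) > len(votes) / 2)

image = np.random.default_rng(0).normal(size=(256, 256)).astype(np.float32)
print("CG" if classify_image(image) else "NI")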
Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves
Misunderstandings arise not only in interpersonal communication but also
between humans and Large Language Models (LLMs). Such discrepancies can make
LLMs interpret seemingly unambiguous questions in unexpected ways, yielding
incorrect responses. While it is widely acknowledged that the quality of a
prompt, such as a question, significantly impacts the quality of the response
provided by LLMs, a systematic method for crafting questions that LLMs can
better comprehend is still underdeveloped. In this paper, we present a method
named "Rephrase and Respond" (RaR), which allows LLMs to rephrase and expand
questions posed by humans and provide responses in a single prompt. This
approach serves as a simple yet effective prompting method for improving
performance. We also introduce a two-step variant of RaR, where a rephrasing
LLM first rephrases the question and then passes the original and rephrased
questions together to a different responding LLM. This facilitates the
effective utilization of rephrased questions generated by one LLM with another.
Our experiments demonstrate that our methods significantly improve the
performance of different models across a wide range of tasks. We further
provide a comprehensive comparison between RaR and the popular Chain-of-Thought
(CoT) methods, both theoretically and empirically. We show that RaR is
complementary to CoT and can be combined with CoT to achieve even better
performance. Our work not only contributes to enhancing LLM performance
efficiently and effectively but also sheds light on a fair evaluation of LLM
capabilities. Data and codes are available at
https://github.com/uclaml/Rephrase-and-Respond.

Comment: 25 pages, 7 figures, 22 tables
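The single-prompt method and the two-step variant can be sketched as follows; chat is a hypothetical placeholder for any LLM completion call, and the prompt wording is paraphrased from the method description rather than quoted from the paper.

def chat(model: str, prompt: str) -> str:
    # Placeholder for an LLM API call; replace with a real client.
    raise NotImplementedError

def rar_one_step(question: str, model: str = "responding-llm") -> str:
    # One prompt asks the model to rephrase/expand the question, then answer.
    prompt = (f'"{question}"\n'
              "Rephrase and expand the question, and respond.")
    return chat(model, prompt)

def rar_two_step(question: str,
                 rephrase_model: str = "rephrasing-llm",
                 respond_model: str = "responding-llm") -> str:
    # Step 1: a rephrasing LLM rewrites the question.
    rephrased = chat(
        rephrase_model,
        f'"{question}"\n'
        "Rephrase and expand the question to make it easier to answer. "
        "Keep all information from the original question.")
    # Step 2: a different responding LLM answers the original and rephrased
    # questions together, so one model's rephrasing benefits another.
    return chat(
        respond_model,
        f"(original) {question}\n(rephrased) {rephrased}\n"
        "Answer the rephrased question, then use that answer to answer the "
        "original question.")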
DNAGPT: A Generalized Pre-trained Tool for Versatile DNA Sequence Analysis Tasks
Pre-trained large language models demonstrate potential in extracting
information from DNA sequences, yet adapting to a variety of tasks and data
modalities remains a challenge. To address this, we propose DNAGPT, a
generalized DNA pre-training model trained on over 200 billion base pairs from
all mammals. By enhancing the classic GPT model with a binary classification
task (DNA sequence order), a numerical regression task (guanine-cytosine
content prediction), and a comprehensive token language, DNAGPT can handle
versatile DNA analysis tasks while processing both sequence and numerical data.
Our evaluation of genomic signal and region recognition, mRNA abundance
regression, and artificial genome generation tasks demonstrates DNAGPT's
superior performance compared to existing models designed for specific
downstream tasks, benefiting from pre-training using the newly designed model
structure.
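A hedged sketch of the multi-task idea, in PyTorch, is given below: a GPT-style backbone shared by a next-token head, a binary sequence-order classifier, and a GC-content regression head. All sizes, layer choices, and the mean pooling are illustrative assumptions, not DNAGPT's actual configuration.

import torch
import torch.nn as nn

class MultiTaskDNAModel(nn.Module):
    # Illustrative GPT-style backbone with three task heads, mirroring the
    # pre-training tasks described above (token prediction, sequence order,
    # GC-content regression). Hyperparameters are arbitrary placeholders.
    def __init__(self, vocab_size=512, d_model=256, n_layers=4, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)  # next-token prediction
        self.order_head = nn.Linear(d_model, 2)        # sequence-order task
        self.gc_head = nn.Linear(d_model, 1)           # GC-content regression

    def forward(self, tokens):
        # Causal mask so the encoder stack behaves like a GPT decoder.
        T = tokens.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        h = self.backbone(self.embed(tokens), mask=mask)
        pooled = h.mean(dim=1)  # simple pooling for the auxiliary heads
        return self.lm_head(h), self.order_head(pooled), self.gc_head(pooled)

model = MultiTaskDNAModel()
tokens = torch.randint(0, 512, (2, 32))  # a batch of DNA token ids
lm_logits, order_logits, gc_pred = model(tokens)
print(lm_logits.shape, order_logits.shape, gc_pred.shape)

Sharing one backbone across the three losses is what lets a single pre-trained model handle both sequence and numerical outputs downstream.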
CaMU: Disentangling Causal Effects in Deep Model Unlearning
Machine unlearning requires removing the information of forgetting data while
keeping the necessary information of remaining data. Despite recent
advancements in this area, existing methodologies mainly focus on the effect of
removing forgetting data without considering the negative impact this can have
on the information of the remaining data, resulting in significant performance
degradation after data removal. Although some methods try to restore
performance on the remaining data after removal, the forgotten information can
also resurface after this repair. This issue stems from the intricate
intertwining of the
forgetting and remaining data. Without adequately differentiating the influence
of these two kinds of data on the model, existing algorithms take the risk of
either inadequate removal of the forgetting data or unnecessary loss of
valuable information from the remaining data. To address this shortcoming, the
present study undertakes a causal analysis of unlearning and introduces a
novel framework termed Causal Machine Unlearning (CaMU). This framework adds
an intervention on the information of the remaining data to disentangle the causal
effects between forgetting data and remaining data. Then CaMU eliminates the
causal impact associated with forgetting data while concurrently preserving the
causal relevance of the remaining data. Comprehensive empirical results on
various datasets and models suggest that CaMU enhances performance on the
remaining data and effectively minimizes the influences of forgetting data.
Notably, this work is the first to interpret deep model unlearning tasks from a
new perspective of causality and provide a solution based on causal analysis,
which opens up new possibilities for future research in deep model unlearning.Comment: Full version of the paper accepted for the SDM 24 conferenc