An Empirical Comparison on Imitation Learning and Reinforcement Learning for Paraphrase Generation
Generating paraphrases from given sentences involves decoding words step by
step from a large vocabulary. To learn a decoder, supervised learning, which
maximizes the likelihood of tokens, always suffers from exposure bias.
Although both reinforcement learning (RL) and imitation learning (IL) have been
widely used to alleviate this bias, the lack of a direct comparison leaves only
a partial picture of their benefits. In this work, we present an empirical study
on how RL and IL can help boost the performance of generating paraphrases, with
the pointer-generator as a base model. Experiments on the benchmark datasets
show that (1) imitation learning is consistently better than reinforcement
learning; and (2) the pointer-generator models with imitation learning
outperform the state-of-the-art methods by a large margin.
Comment: 9 pages, 2 figures, EMNLP201
Read, Revise, Repeat: A System Demonstration for Human-in-the-loop Iterative Text Revision
Revision is an essential part of the human writing process. It tends to be
strategic, adaptive, and, more importantly, iterative in nature. Despite the
success of large language models on text revision tasks, they are limited to
non-iterative, one-shot revisions. Examining and evaluating the capability of
large language models for making continuous revisions and collaborating with
human writers is a critical step towards building effective writing assistants.
In this work, we present a human-in-the-loop iterative text revision system,
Read, Revise, Repeat (R3), which aims at achieving high quality text revisions
with minimal human effort by reading model-generated revisions and user
feedback, revising documents, and repeating human-machine interactions. In R3,
a text revision model provides text editing suggestions for human writers, who
can accept or reject the suggested edits. The accepted edits are then
incorporated into the model for the next iteration of document revision.
Writers can therefore revise documents iteratively by interacting with the
system and simply accepting/rejecting its suggested edits until the text
revision model stops making further revisions or reaches a predefined maximum
number of revisions. Empirical experiments show that R3 can generate revisions
with an acceptance rate comparable to that of human writers at early revision depths, and
that human-machine interaction yields higher-quality revisions with fewer
iterations and edits. The collected human-model interaction dataset and system
code are available at https://github.com/vipulraheja/IteraTeR. Our system
demonstration is available at https://youtu.be/lK08tIpEoaE.
Comment: Accepted by The First Workshop on Intelligent and Interactive Writing
Assistants at ACL202
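The accept/reject loop described in the R3 abstract can be sketched roughly as follows. This is a minimal illustration, not the R3 implementation: the function names, the `(old, new)` edit format, and the toy typo-fixing "model" are all assumptions made for the example. The loop stops when the model proposes no further edits or when a predefined maximum depth is reached, as the abstract states.

```python
from typing import Callable, List, Tuple

# Hypothetical edit format for this sketch: (old text, new text).
Edit = Tuple[str, str]


def iterative_revision(
    document: str,
    suggest_edits: Callable[[str], List[Edit]],
    accept: Callable[[Edit], bool],
    apply_edit: Callable[[str, Edit], str],
    max_depth: int = 3,
) -> Tuple[str, int]:
    """Revise until the model proposes nothing or max_depth is reached.

    Returns the final document and the number of rounds in which the
    model proposed at least one edit.
    """
    rounds = 0
    for _ in range(max_depth):
        suggestions = suggest_edits(document)
        if not suggestions:
            break  # the model has stopped making revisions
        rounds += 1
        for edit in suggestions:
            if accept(edit):  # the human writer accepts or rejects
                document = apply_edit(document, edit)
    return document, rounds


# Toy stand-ins (assumptions): a "model" that proposes fixes for known typos.
def toy_suggest(doc: str) -> List[Edit]:
    fixes = {"teh": "the", "recieve": "receive"}
    return [(old, new) for old, new in fixes.items() if old in doc]


def toy_apply(doc: str, edit: Edit) -> str:
    old, new = edit
    return doc.replace(old, new)
```

With a writer who accepts everything, the toy model converges after one round; with a writer who rejects everything, the loop ends only when `max_depth` is reached.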
Dynamic RACH Partition for Massive Access of Differentiated M2M Services
In machine-to-machine (M2M) networks, a key challenge is to overcome the overload problem caused by random access requests from massive machine-type communication (MTC) devices. When differentiated services coexist, such as delay-sensitive and delay-tolerant services, the problem becomes more complicated and challenging. This is because delay-sensitive services often use more aggressive policies, and thus delay-tolerant services get far fewer chances to access the network. To address this problem, we propose an efficient mechanism for massive access control over differentiated M2M services, including delay-sensitive and delay-tolerant services. Specifically, based on the traffic loads of the two types of services, the proposed scheme dynamically partitions the random access channel (RACH) resource and allocates it to each type of service. The RACH partition strategy is thoroughly optimized to increase the access performance of M2M networks. Analyses and simulations demonstrate the effectiveness of our design. The proposed scheme outperforms the baseline access class barring (ACB) scheme, which ignores service types in access control, in terms of access success probability and average access delay.
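As a rough illustration of load-based RACH partitioning, the sketch below splits a pool of preambles between the two service types in proportion to their traffic loads. The abstract does not give the optimized partition strategy, so the proportional rule is a stand-in and not the paper's method; the function names, the default pool size of 54, and the `min_each` floor are assumptions. The second function uses the standard result that, when `n` devices each pick one of `m` preambles uniformly at random, a given device avoids collision with probability (1 - 1/m)^(n - 1).

```python
from typing import Tuple


def partition_preambles(
    load_sensitive: float,
    load_tolerant: float,
    total_preambles: int = 54,  # assumed pool size, not from the abstract
    min_each: int = 1,          # assumed floor so neither class is starved
) -> Tuple[int, int]:
    """Split preambles between delay-sensitive and delay-tolerant services
    in proportion to their traffic loads (a simple stand-in rule)."""
    total_load = load_sensitive + load_tolerant
    if total_load == 0:
        half = total_preambles // 2
        return half, total_preambles - half
    share = round(total_preambles * load_sensitive / total_load)
    share = max(min_each, min(total_preambles - min_each, share))
    return share, total_preambles - share


def success_probability(n_devices: int, n_preambles: int) -> float:
    """Probability that one device's uniformly chosen preamble is not
    chosen by any of the other n_devices - 1 contending devices."""
    if n_devices <= 1:
        return 1.0
    return (1.0 - 1.0 / n_preambles) ** (n_devices - 1)
```

Recomputing the split each access period from fresh load estimates gives the "dynamic" behavior; a real scheme would replace the proportional rule with the optimized strategy the paper derives.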
FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows
Despite recent progress in open-domain dialogue evaluation, how to develop
automatic metrics remains an open problem. We explore the potential of dialogue
evaluation featuring dialog act information, which previous methods rarely
modeled explicitly. However, dialog acts are generally defined at the utterance
level and are therefore coarse-grained, as an utterance can contain
multiple segments possessing different functions. Hence, we propose segment
act, an extension of dialog act from utterance level to segment level, and
crowdsource a large-scale dataset for it. To utilize segment act flows,
sequences of segment acts, for evaluation, we develop the first consensus-based
dialogue evaluation framework, FlowEval. This framework provides a
reference-free approach for dialog evaluation by finding pseudo-references.
Extensive experiments against strong baselines on three benchmark datasets
demonstrate the effectiveness and other desirable characteristics of our
FlowEval, pointing out a potential path for better dialogue evaluation.
Comment: EMNLP 2022 camera-ready version
Research Progress on Essential Oils from Lauraceae Plants
There are various species of plants in the Lauraceae family, and these plants are widely distributed in China. Their essential oils have many biological activities, such as bacteriostatic, antioxidant, and insect-repellent activities, and are ideal natural preservatives, effective antioxidants, and green insect repellents. Therefore, they are of research significance and have broad application prospects. The effects of the growth cycle, plant parts, extraction methods, and geographical environment on the major components of essential oils from Lauraceae plants (LEO) are reviewed, with a focus on the antimicrobial, antioxidant, and insect-repellent activities and mechanisms of LEO. In addition, common edible coatings, microencapsulation, and nano-emulsion technologies based on LEO for food packaging and preservation are summarized. We hope that this review will provide a theoretical basis for the application of LEO in food preservation.
Security Enhancement for Multicast over Internet of Things by Dynamically Constructed Fountain Codes
The Internet of Things (IoT) is expected to accommodate every object that exists in this world or is likely to exist in the near future. The enormous scale of these objects raises major security concerns, especially for common information dissemination via multicast services, where ensuring reliability for multiple multicast users at the cost of increased redundancy and/or retransmissions also helps eavesdroppers successfully decode the overheard signals. The objective of this work is to address this security challenge in IoT multicast applications. Specifically, in the presence of an eavesdropper, an adaptive fountain code design is proposed to enhance the security of multicast in IoT. The proposed scheme has two main novel features: (i) a dynamic encoding scheme that can effectively decrease the intercept probability at the eavesdropper; and (ii) higher transmission efficiency than the conventional non-dynamic design. The analysis and simulation results show that the proposed scheme can effectively enhance information security while achieving higher transmission efficiency with little added complexity, thus facilitating secure wireless multicast transmission over IoT.
Comprehensively benchmarking applications for detecting copy number variation.
MOTIVATION: Recently, copy number variation (CNV) has gained considerable interest as a type of genomic variation that plays an important role in complex phenotypes and disease susceptibility. Since a number of CNV detection methods have recently been developed, it is necessary to help investigators choose suitable methods for CNV detection depending on their objectives. For this reason, this study compared ten commonly used CNV detection applications, including CNVnator, ReadDepth, RDXplorer, LUMPY and Control-FREEC, benchmarking the applications by sensitivity, specificity and computational demands. Taking the DGV gold standard variants as a standard dataset, we evaluated the ten applications with real sequencing data at sequencing depths from 5X to 50X. Among the ten methods benchmarked, LUMPY performs best in both sensitivity and specificity at each sequencing depth. When high specificity is the goal, Canvas is also a good choice. If high sensitivity is preferred, CNVnator and RDXplorer are better choices. Additionally, CNVnator and GROM-RD perform well for low-depth sequencing data. Our results provide a comprehensive performance evaluation of these selected CNV detection methods and facilitate future development and improvement of CNV prediction methods.
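To make the sensitivity measure concrete, the sketch below scores a set of CNV calls against a gold-standard set. The abstract does not state how calls were matched to the standard, so the 50% reciprocal-overlap criterion, the half-open `(start, end)` interval format, and the function names are assumptions for illustration; all intervals are assumed to lie on the same chromosome. Specificity is omitted because it depends on a definition of true negatives that the abstract does not give.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # assumed format: half-open (start, end)


def reciprocal_overlap(a: Interval, b: Interval) -> float:
    """Fraction of overlap relative to the longer-penalized interval:
    the smaller of (overlap / len(a)) and (overlap / len(b))."""
    (s1, e1), (s2, e2) = a, b
    inter = max(0, min(e1, e2) - max(s1, s2))
    if inter == 0:
        return 0.0
    return min(inter / (e1 - s1), inter / (e2 - s2))


def sensitivity(
    calls: List[Interval],
    truth: List[Interval],
    min_overlap: float = 0.5,  # assumed matching threshold
) -> float:
    """Share of gold-standard CNVs matched by at least one call."""
    if not truth:
        return 0.0
    found = sum(
        1
        for t in truth
        if any(reciprocal_overlap(t, c) >= min_overlap for c in calls)
    )
    return found / len(truth)
```

Running the same function on call sets produced at different sequencing depths would give one sensitivity value per tool and depth, which is the kind of comparison the abstract reports.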