REST: Retrieval-Based Speculative Decoding
We introduce Retrieval-Based Speculative Decoding (REST), a novel algorithm
designed to speed up language model generation. The key insight driving the
development of REST is the observation that the process of text generation
often includes certain common phases and patterns. Unlike previous methods that
rely on a draft language model for speculative decoding, REST harnesses the
power of retrieval to generate draft tokens. This method draws from the
reservoir of existing knowledge, retrieving and employing relevant tokens based
on the current context. Its plug-and-play nature allows for seamless
integration with and acceleration of any language model, all without
necessitating additional training. When benchmarked on 7B and 13B language models in a
single-batch setting, REST achieves a significant speedup of 1.62X to 2.36X on
code or text generation. The code of REST is available at
https://github.com/FasterDecoding/REST
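The retrieve-then-verify loop described above can be sketched in a few lines. Everything below is an illustrative simplification, not REST's implementation: the real system uses a suffix array over a large datastore and verifies drafts against the target LM's output distribution, whereas this toy uses an exact n-gram lookup and a deterministic stand-in model.

```python
# Sketch of retrieval-based speculative decoding (REST-style), assuming a
# simple n-gram datastore and a greedy target model. All names are illustrative.
from collections import defaultdict

def build_datastore(corpus_tokens, ngram=3):
    """Map each n-gram context in the corpus to the tokens that followed it."""
    store = defaultdict(list)
    for i in range(len(corpus_tokens) - ngram):
        key = tuple(corpus_tokens[i:i + ngram])
        store[key].append(corpus_tokens[i + ngram:i + ngram + 4])  # draft span
    return store

def retrieve_draft(store, context, ngram=3):
    """Propose draft tokens by matching the current context's suffix."""
    candidates = store.get(tuple(context[-ngram:]))
    return candidates[0] if candidates else []

def speculative_step(store, context, target_model):
    """Accept the longest draft prefix the target model agrees with."""
    accepted = []
    for tok in retrieve_draft(store, context):
        if target_model(context + accepted) == tok:
            accepted.append(tok)
        else:
            break
    if not accepted:  # no usable draft: fall back to one normal decoding step
        accepted = [target_model(context)]
    return context + accepted
```

When the retrieved draft matches the model's own continuation, several tokens are accepted in a single verification pass, which is the source of the speedup.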
LatticeGen: A Cooperative Framework which Hides Generated Text in a Lattice for Privacy-Aware Generation on Cloud
In the current user-server interaction paradigm of prompted generation with
large language models (LLM) on cloud, the server fully controls the generation
process, which leaves zero options for users who want to keep the generated
text to themselves. We propose LatticeGen, a cooperative framework in which the
server still handles most of the computation while the user controls the
sampling operation. The key idea is that the true generated sequence is mixed
with noise tokens by the user and hidden in a noised lattice. Considering
potential attacks from a hypothetically malicious server and how the user can
defend against them, we propose the repeated beam-search attack and the mixing
noise scheme. In our experiments we apply LatticeGen to protect both the prompt
and the generation. It is shown that while the noised lattice degrades
generation quality, LatticeGen successfully protects the true generation to a
remarkable degree under strong attacks (more than 50% of the semantics remains
hidden, as measured by BERTScore).
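The core hiding idea can be illustrated with a toy sketch. This is not the paper's protocol (LatticeGen interleaves noised sampling with the server at every generation step, and its mixing-noise scheme is more involved); the names and the per-slot permutation below are purely illustrative.

```python
# Toy illustration of hiding a true token sequence in a noised lattice:
# the server sees `width` tokens per position, only the user keeps the indices.
import random

def make_lattice(true_tokens, noise_vocab, width=2, seed=0):
    """Each lattice slot holds `width` tokens in a secret random order."""
    rng = random.Random(seed)
    lattice, secret = [], []
    for tok in true_tokens:
        noise = [w for w in noise_vocab if w != tok]
        slot = [tok] + rng.sample(noise, width - 1)
        rng.shuffle(slot)
        lattice.append(slot)            # what the server sees
        secret.append(slot.index(tok))  # what only the user keeps
    return lattice, secret

def recover(lattice, secret):
    """Only the holder of the secret indices can read the true sequence."""
    return [slot[i] for slot, i in zip(lattice, secret)]
```

Without the secret indices, each position is a `width`-way guess for the server, which is what the repeated beam-search attack tries to narrow down.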
Inferring Tabular Analysis Metadata by Infusing Distribution and Knowledge Information
Many data analysis tasks heavily rely on a deep understanding of tables
(multi-dimensional data). Across these tasks, there exist commonly used metadata
attributes of table fields / columns. In this paper, we identify four such
analysis metadata: measure/dimension dichotomy, common field roles, semantic
field type, and default aggregation function. Inferring these metadata faces
the challenges of insufficient supervision signals and of utilizing existing
knowledge while understanding field distributions. To infer these metadata for a raw table, we
propose our multi-tasking Metadata model which fuses field distribution and
knowledge graph information into pre-trained tabular models. For model training
and evaluation, we collect a large corpus (~582k tables from private
spreadsheet and public tabular datasets) of analysis metadata by using diverse
smart supervisions from downstream tasks. Our best model has accuracy = 98%,
hit rate at top-1 > 67%, accuracy > 80%, and accuracy = 88% for the four
analysis metadata inference tasks, respectively. It outperforms a series of
baselines that are based on rules, traditional machine learning methods, and
pre-trained tabular models. Analysis metadata models are deployed in a popular
data analysis product, helping downstream intelligent features such as insights
mining, chart / pivot table recommendation, and natural language QA.
Comment: 13 pages, 7 figures, 9 tables
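The rule-based baselines the abstract compares against might look like the toy heuristic below. It is a hypothetical sketch covering only two of the four tasks (measure/dimension role and default aggregation); the thresholds are invented, and the paper's actual model instead fuses field distributions and knowledge-graph information into pretrained tabular models.

```python
# Illustrative rule-style baseline for two analysis-metadata tasks:
# infer a field's measure/dimension role and a default aggregation function
# from its values alone. Thresholds are arbitrary, for demonstration only.
def infer_field_metadata(values):
    numeric = all(isinstance(v, (int, float)) and not isinstance(v, bool)
                  for v in values)
    distinct_ratio = len(set(values)) / max(len(values), 1)
    if numeric and distinct_ratio > 0.5:
        # many distinct numeric values: likely a measure, summed by default
        return {"role": "measure", "aggregation": "sum"}
    # repetitive or non-numeric values: likely a dimension, counted by default
    return {"role": "dimension", "aggregation": "count"}
```

Such heuristics ignore semantics entirely (a numeric "year" column would be misclassified as a measure), which is the gap the learned model is meant to close.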
Crystal Structure of the Cysteine Desulfurase DndA from Streptomyces lividans Which Is Involved in DNA Phosphorothioation
DNA phosphorothioation is widespread among prokaryotes, and might function to restrict gene transfer among different kinds of bacteria. There has been little investigation into the structural mechanism of the DNA phosphorothioation process. DndA is a cysteine desulfurase which is involved in the first step of DNA phosphorothioation. In this study, we determined the crystal structure of Streptomyces lividans DndA in complex with its covalently bound cofactor PLP, to a resolution of 2.4 Å. Our structure reveals the molecular mechanism that DndA employs to recognize its cofactor PLP, and suggests the potential binding site for the substrate L-cysteine on DndA. In contrast to previously determined structures of cysteine desulfurases, the catalytic cysteine of DndA was found to reside on a β strand. This catalytic cysteine is very far from the presumed location of the substrate, suggesting that a conformational change of DndA is required during catalysis to bring the catalytic cysteine close to the substrate cysteine. Moreover, our in vitro enzymatic assay results suggested that this conformational change is unlikely to be a simple result of random thermal motion, since moving the catalytic cysteine two residues forward or backward in the primary sequence completely abolished the cysteine desulfurase activity of DndA.
3D snapshot: Invertible embedding of 3D neural representations in a single image
3D neural rendering enables photo-realistic reconstruction of a specific scene by encoding discontinuous inputs into a neural representation. Despite the remarkable rendering results, the storage of network parameters is not transmission-friendly and not extendable to metaverse applications. In this paper, we propose an invertible neural rendering approach that enables generating an interactive 3D model from a single image (i.e., 3D Snapshot). Our idea is to distill a pre-trained neural rendering model (e.g., NeRF) into a visualizable image form that can then be easily inverted back to a neural network. To this end, we first present a neural image distillation method to optimize three neural planes for representing the original neural rendering model. However, this representation is noisy and visually meaningless. We thus propose a dynamic invertible neural network to embed this noisy representation into a plausible image representation of the scene. We demonstrate promising reconstruction quality quantitatively and qualitatively, by comparing to the original neural rendering model, as well as to video-based invertible methods. In addition, our method can store dozens of NeRFs with a compact restoration network (5MB), and embedding each 3D scene takes up only 160KB of storage. More importantly, our approach is the first solution that allows embedding a neural rendering model into image representations, which enables applications like creating an interactive 3D model from a printed image in the metaverse.
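The "invertible neural network" ingredient can be illustrated with the standard additive-coupling block, a generic construction (not the paper's specific dynamic architecture): because the update to one half of the signal is conditioned only on the other half, the map inverts exactly no matter how complex the conditioning function is. This is what makes embedding-and-recovery lossless in principle.

```python
# Additive coupling: a minimal invertible transform.
#   forward:  y1 = x1,  y2 = x2 + f(x1)
#   inverse:  x1 = y1,  x2 = y2 - f(y1)
# f can be any function (e.g. a neural network); invertibility never depends on it.
import numpy as np

def coupling_forward(x1, x2, f):
    return x1, x2 + f(x1)

def coupling_inverse(y1, y2, f):
    return y1, y2 - f(y1)
```

Stacking such blocks (alternating which half is updated) yields an exactly invertible network, the kind of mechanism that lets a noisy plane representation be hidden inside a plausible image and recovered bit-for-bit up to numerical precision.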
Summary of ChatGPT/GPT-4 Research and Perspective Towards the Future of Large Language Models
This paper presents a comprehensive survey of ChatGPT and GPT-4,
state-of-the-art large language models (LLM) from the GPT series, and their
prospective applications across diverse domains. Indeed, key innovations such
as large-scale pre-training that captures knowledge across the entire world
wide web, instruction fine-tuning and Reinforcement Learning from Human
Feedback (RLHF) have played significant roles in enhancing LLMs' adaptability
and performance. We performed an in-depth analysis of 194 relevant papers on
arXiv, encompassing trend analysis, word cloud representation, and distribution
analysis across various application domains. The findings reveal a significant
and increasing interest in ChatGPT/GPT-4 research, predominantly centered on
direct natural language processing applications, while also demonstrating
considerable potential in areas ranging from education and history to
mathematics, medicine, and physics. This study endeavors to furnish insights
into ChatGPT's capabilities, potential implications, and ethical concerns, and
to offer direction for future advancements in this field.
Comment: 35 pages, 3 figures