Statistical inference using SGD
We present a novel method for frequentist statistical inference in
M-estimation problems, based on stochastic gradient descent (SGD) with a
fixed step size: we demonstrate that the average of such SGD sequences can be
used for statistical inference, after proper scaling. An intuitive analysis
using the Ornstein-Uhlenbeck process suggests that such averages are
asymptotically normal. From a practical perspective, our SGD-based inference
procedure is a first order method, and is well-suited for large scale problems.
To show its merits, we apply it to both synthetic and real datasets, and
demonstrate that its accuracy is comparable to classical statistical methods,
while requiring potentially far less computation.
Comment: To appear in AAAI 201
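The idea of averaging fixed-step SGD iterates can be illustrated on the simplest M-estimation problem, estimating a mean under squared loss. This is only a minimal sketch and not the authors' procedure: the function name `sgd_segment_averages`, the step size, the burn-in length, and the segment sizes are all assumptions chosen for illustration, and the scaling that the abstract says is needed before the averages can be used for inference is not reproduced here.

```python
import random
import statistics


def sgd_segment_averages(sample, theta0=0.0, step=0.1, burn_in=1000,
                         segment_len=2000, n_segments=10):
    """Run fixed-step SGD on the loss 0.5 * (theta - x)^2 and return the
    average iterate of each consecutive segment after a burn-in period."""
    theta = theta0
    # Burn-in: let the iterates forget the starting point.
    for _ in range(burn_in):
        theta -= step * (theta - sample())
    averages = []
    for _ in range(n_segments):
        total = 0.0
        for _ in range(segment_len):
            # Stochastic gradient of 0.5 * (theta - x)^2 is (theta - x).
            theta -= step * (theta - sample())
            total += theta
        averages.append(total / segment_len)
    return averages


rng = random.Random(0)      # fixed seed so the run is repeatable
true_mean = 2.0             # hypothetical ground truth for the demo
averages = sgd_segment_averages(lambda: rng.gauss(true_mean, 1.0))
estimate = statistics.mean(averages)   # point estimate from the averages
spread = statistics.stdev(averages)    # variability across segment averages
print(f"estimate={estimate:.3f}, spread={spread:.3f}")
```

The spread of the segment averages gives a rough sense of the estimator's variability; turning it into a calibrated confidence interval requires the scaling analysis described in the paper.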
A Hierarchical Framework for Relation Extraction with Reinforcement Learning
Most existing methods determine relation types only after all the entities
have been recognized, thus the interaction between relation types and entity
mentions is not fully modeled. This paper presents a novel paradigm to deal
with relation extraction by regarding the related entities as the arguments of
a relation. We apply a hierarchical reinforcement learning (HRL) framework in
this paradigm to enhance the interaction between entity mentions and relation
types. The whole extraction process is decomposed into a hierarchy of two-level
RL policies for relation detection and entity extraction respectively, so that
it is more feasible and natural to deal with overlapping relations. Our model
was evaluated on public datasets collected via distant supervision, and results
show that it gains better performance than existing methods and is more
powerful for extracting overlapping relations.
Comment: To appear in AAAI 1
Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory
Perception and expression of emotion are key factors to the success of
dialogue systems or conversational agents. However, this problem has not been
studied in large-scale conversation generation so far. In this paper, we
propose Emotional Chatting Machine (ECM) that can generate appropriate
responses not only in content (relevant and grammatical) but also in emotion
(emotionally consistent). To the best of our knowledge, this is the first work
that addresses the emotion factor in large-scale conversation generation. ECM
addresses the factor using three new mechanisms that respectively (1) model
the high-level abstraction of emotion expressions by embedding emotion
categories, (2) capture the change of implicit internal emotion states, and
(3) use explicit emotion expressions with an external emotion vocabulary.
Experiments show that the proposed model can generate responses appropriate not
only in content but also in emotion.
Comment: Accepted in AAAI 201
Spatial-temporal Evolution and Its Influencing Factors of Tourism Eco-efficiency in China
Eco-efficiency is an invaluable indicator for measuring the relationship between production activities and environmental depletion. This study measures the tourism eco-efficiency of 30 provinces in China from 2005 to 2020 using the super-efficiency SBM model, and explores its spatial-temporal evolution using the kernel density function, the standard deviation ellipse, and the center of gravity model. The factors influencing tourism eco-efficiency in China are then analyzed with a Tobit regression model. The results show that China's tourism eco-efficiency has generally fluctuated upwards but has not yet reached the maximum production possibility frontier. The kernel density curve shows a unimodal-bimodal-unimodal pattern, while inter-provincial differences have been decreasing and becoming more balanced. The center of gravity of tourism eco-efficiency is located at the junction of Henan and Hubei provinces and generally moves to the south (slightly to the southwest). The level of economic development has a significant inverted U-shaped relationship with tourism eco-efficiency. Economic openness and traffic conditions are positively correlated with tourism eco-efficiency. Environmental regulations and industrial structure have a negative but limited impact on tourism eco-efficiency. Finally, policy recommendations to promote the quality and sustainable development of the tourism industry are put forward, such as increasing investment in ecological protection and governance in tourism development, improving capacity-building in allocating green and low-carbon technologies and resources, strengthening tourism infrastructure construction, and enhancing environmental governance systems and mechanisms.
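The center of gravity model mentioned in this abstract is commonly computed as a weighted mean of regional coordinates, with the indicator value as the weight. The sketch below assumes that standard form; the function name `center_of_gravity` and the two coordinate/efficiency triples are hypothetical and are not data from the study.

```python
def center_of_gravity(regions):
    """Weighted mean of (longitude, latitude), weighted by each region's
    indicator value. `regions` is a list of (lon, lat, weight) tuples."""
    total_weight = sum(w for _, _, w in regions)
    lon = sum(x * w for x, _, w in regions) / total_weight
    lat = sum(y * w for _, y, w in regions) / total_weight
    return lon, lat


# Hypothetical regions: (longitude, latitude, eco-efficiency score).
regions = [(110.0, 30.0, 1.0), (114.0, 34.0, 3.0)]
center = center_of_gravity(regions)
print(center)
```

Tracking how this point shifts from year to year is what yields a statement such as "moves to the south".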
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings
Augmenting large language models (LLMs) with external tools has emerged as a
promising approach to solving complex problems. However, traditional methods,
which finetune LLMs with tool demonstration data, can be both costly and
restricted to a predefined set of tools. The recent in-context learning paradigm
alleviates these issues, but the limited context length only allows for a few
shots of demonstrations, leading to a suboptimal understanding of the tools.
Moreover, when there are numerous tools to choose from, in-context learning
could completely fail to work. In this paper, we propose an alternative
approach, ToolkenGPT, which combines the benefits of both sides. Our
approach represents each tool as a token ("toolken") and learns an
embedding for it, enabling tool calls in the
same way as generating a regular word token. Once a toolken is triggered, the
LLM is prompted to complete arguments for the tool to execute. ToolkenGPT
offers the flexibility to plug in an arbitrary number of tools by expanding the
set of toolkens on the fly. In addition, it improves tool use by allowing
extensive demonstration data for learning the toolken embeddings. In diverse
domains, including numerical reasoning, knowledge-based question answering, and
embodied plan generation, our approach effectively augments LLMs with tools and
substantially outperforms various recent baselines. ToolkenGPT demonstrates the
promising ability to use relevant tools from a large tool set in complex
scenarios.
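The toolken mechanism described above can be pictured as scoring tool embeddings alongside ordinary word embeddings when picking the next token. The following is a toy sketch under that reading, not the paper's implementation: the three-dimensional vectors, the token names, and the function `next_token` are all hypothetical, and a real system would use the frozen LLM's hidden state and output embedding matrix.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical frozen word embeddings (never updated).
WORD_EMBEDDINGS = {"the": [1.0, 0.0, 0.0], "sum": [0.0, 1.0, 0.0]}
# Hypothetical learned toolken embeddings (the only trainable part).
TOOLKEN_EMBEDDINGS = {"<add>": [0.0, 0.0, 1.0]}


def next_token(hidden, word_emb, toolken_emb):
    """Score words and toolkens together; report whether a toolken won."""
    scores = {tok: dot(hidden, emb) for tok, emb in word_emb.items()}
    scores.update({tok: dot(hidden, emb) for tok, emb in toolken_emb.items()})
    best = max(scores, key=scores.get)
    return best, best in toolken_emb


# A hidden state aligned with the tool direction triggers the toolken,
# at which point the model would be prompted to fill in tool arguments.
tool_choice = next_token([0.1, 0.2, 0.9], WORD_EMBEDDINGS, TOOLKEN_EMBEDDINGS)
word_choice = next_token([0.9, 0.1, 0.0], WORD_EMBEDDINGS, TOOLKEN_EMBEDDINGS)

# Plugging in a new tool only adds one more embedding row.
TOOLKEN_EMBEDDINGS["<multiply>"] = [0.0, 0.7, 0.7]
print(tool_choice, word_choice)
```

Because the word embeddings are untouched, adding or removing a tool leaves the language model's ordinary behavior unchanged in this sketch.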
Your Contrastive Learning Is Secretly Doing Stochastic Neighbor Embedding
Contrastive learning, especially self-supervised contrastive learning (SSCL),
has achieved great success in extracting powerful features from unlabeled data.
In this work, we contribute to the theoretical understanding of SSCL and
uncover its connection to the classic data visualization method, stochastic
neighbor embedding (SNE), whose goal is to preserve pairwise distances. From
the perspective of preserving neighboring information, SSCL can be viewed as a
special case of SNE with the input space pairwise similarities specified by
data augmentation. The established correspondence facilitates deeper
theoretical understanding of learned features of SSCL, as well as
methodological guidelines for practical improvement. Specifically, through the
lens of SNE, we provide novel analysis on domain-agnostic augmentations,
implicit bias and robustness of learned features. To illustrate the practical
advantage, we demonstrate that the modifications from SNE to t-SNE can also
be adopted in the SSCL setting, achieving significant improvement in both
in-distribution and out-of-distribution generalization.
Comment: Accepted by ICLR 202
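One way to see the stated correspondence is to note that an SNE-style objective, the KL divergence between input-space similarities P and embedding-space similarities Q, reduces to a softmax contrastive loss when P puts all its mass on the augmented view. The sketch below checks this numerically. It is an illustration under that assumption rather than the paper's derivation, and the points, the negative squared distance similarity, and the helper names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def neg_sq_dist(u, v):
    """Negative squared Euclidean distance, used as a similarity score."""
    return -sum((a - b) ** 2 for a, b in zip(u, v))


def kl_divergence(p, q):
    """KL(p || q), skipping terms where p is zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


anchor = [0.0, 1.0]
# Index 0 is the augmented view of the anchor; the others are negatives.
candidates = [[0.1, 0.9], [1.0, 0.0], [-1.0, 0.5]]

p = [1.0, 0.0, 0.0]  # input similarities specified by data augmentation
q = softmax([neg_sq_dist(anchor, c) for c in candidates])

sne_loss = kl_divergence(p, q)          # SNE-style objective
contrastive_loss = -math.log(q[0])      # softmax contrastive loss
print(sne_loss, contrastive_loss)
```

The two quantities agree up to floating-point error, which is the sense in which the one-hot choice of P makes the contrastive loss a special case here.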