Statistical inference using SGD

Authors: Caramanis Constantine; Kyrillidis Anastasios; Li Tianyang; Liu Liu
Publication date: 19/11/2017

We present a novel method for frequentist statistical inference in M-estimation problems, based on stochastic gradient descent (SGD) with a fixed step size: we demonstrate that the average of such SGD sequences can be used for statistical inference, after proper scaling. An intuitive analysis using the Ornstein-Uhlenbeck process suggests that such averages are asymptotically normal. From a practical perspective, our SGD-based inference procedure is a first-order method and is well suited to large-scale problems. To show its merits, we apply it to both synthetic and real datasets, and demonstrate that its accuracy is comparable to classical statistical methods, while requiring potentially far less computation. Comment: To appear in AAAI 201
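As a rough illustration of the averaging idea in the abstract above, the following sketch runs fixed-step SGD on a one-dimensional squared loss (whose empirical M-estimator is the sample mean) and collects averages over consecutive segments of iterates. The function name, the step size, the burn-in length, and the segment sizes are all illustrative assumptions, and the scaling the paper applies to these averages before inference is deliberately not reproduced here.

```python
import random

def sgd_segment_averages(data, step=0.1, burn_in=200, seg_len=100,
                         n_segs=20, seed=0):
    """Average fixed-step SGD iterates over consecutive segments.

    Hypothetical helper: the loss is 0.5 * (theta - x)^2, so the
    stochastic gradient at a sampled point x is (theta - x).
    """
    rng = random.Random(seed)
    theta = 0.0
    # Burn-in: let the iterates move away from the starting point.
    for _ in range(burn_in):
        x = rng.choice(data)
        theta -= step * (theta - x)
    averages = []
    for _ in range(n_segs):
        total = 0.0
        for _ in range(seg_len):
            x = rng.choice(data)
            theta -= step * (theta - x)  # fixed step size, never decayed
            total += theta
        averages.append(total / seg_len)
    return averages

# Synthetic data (assumed): Gaussian samples with true mean 3.0.
data_rng = random.Random(1)
data = [data_rng.gauss(3.0, 1.0) for _ in range(1000)]
sample_mean = sum(data) / len(data)

avgs = sgd_segment_averages(data)
print("sample mean:", round(sample_mean, 3))
print("mean of segment averages:", round(sum(avgs) / len(avgs), 3))
```

The segment averages scatter around the sample mean; the abstract's claim is that, after proper scaling, this scatter can be used to quantify uncertainty.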

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

A Hierarchical Framework for Relation Extraction with Reinforcement Learning

Authors: Huang Minlie; Liu Jiexi; Takanobu Ryuichi; Zhang Tianyang
Publication date: 09/11/2018

Most existing methods determine relation types only after all the entities have been recognized, so the interaction between relation types and entity mentions is not fully modeled. This paper presents a novel paradigm for relation extraction that regards the related entities as the arguments of a relation. We apply a hierarchical reinforcement learning (HRL) framework in this paradigm to enhance the interaction between entity mentions and relation types. The whole extraction process is decomposed into a hierarchy of two-level RL policies for relation detection and entity extraction respectively, which makes it more feasible and natural to handle overlapping relations. Our model was evaluated on public datasets collected via distant supervision, and results show that it outperforms existing methods and is more effective at extracting overlapping relations. Comment: To appear in AAAI 1

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory

Authors: Huang Minlie; Liu Bing; Zhang Tianyang; Zhou Hao; Zhu Xiaoyan
Publication date: 25/04/2018

Perception and expression of emotion are key factors in the success of dialogue systems or conversational agents. However, this problem has not been studied in large-scale conversation generation so far. In this paper, we propose Emotional Chatting Machine (ECM), which can generate appropriate responses not only in content (relevant and grammatical) but also in emotion (emotionally consistent). To the best of our knowledge, this is the first work that addresses the emotion factor in large-scale conversation generation. ECM addresses the factor using three new mechanisms that respectively (1) model the high-level abstraction of emotion expressions by embedding emotion categories, (2) capture the change of implicit internal emotion states, and (3) use explicit emotion expressions with an external emotion vocabulary. Experiments show that the proposed model can generate responses appropriate not only in content but also in emotion. Comment: Accepted in AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Spatial-temporal Evolution and Its Influencing Factors of Tourism Eco-efficiency in China

Authors: Liu Zhiliang; Lu Chengpeng; Ma Tianyang
Publication venue: Bilingual Publishing Co.
Publication date: 20/06/2022

Eco-efficiency is an invaluable indicator for measuring the relationship between production activities and environmental depletion. This study measures the tourism eco-efficiency of 30 provinces in China from 2005 to 2020 based on the super-efficiency SBM model, and explores its spatial-temporal evolution characteristics using the kernel density function, standard deviation ellipse, and center of gravity model. Then, the influencing factors of tourism eco-efficiency in China are analyzed using a Tobit regression model. The results show that the tourism eco-efficiency of China is generally fluctuating upwards, but has not yet reached the maximum production possibility frontier. The kernel density curve shows a unimodal-bimodal-unimodal pattern, while the inter-provincial differences have been decreasing and becoming more balanced. The center of gravity of tourism eco-efficiency is located at the junction of Henan and Hubei provinces and generally moves to the south (slightly to the southwest). Meanwhile, the level of economic development and tourism eco-efficiency have a significant inverted U-shaped relationship. The level of economic openness and traffic conditions are positively correlated with tourism eco-efficiency. Environmental regulations and industrial structure have a negative but limited impact on tourism eco-efficiency. Finally, recommendations for policy formulation to promote quality and sustainable development of the tourism industry are put forward, such as increasing investment in ecological protection and governance in tourism development, improving capacity-building in allocating green and low-carbon technologies and resources, strengthening tourism infrastructure construction, and enhancing environmental governance systems and mechanisms.

Bilingual Publishing Co. (BPC): E-Journals

ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings

Authors: Hao Shibo; Hu Zhiting; Liu Tianyang; Wang Zhen
Publication date: 19/05/2023

Augmenting large language models (LLMs) with external tools has emerged as a promising approach to solving complex problems. However, traditional methods, which finetune LLMs with tool demonstration data, can be both costly and restricted to a predefined set of tools. The recent in-context learning paradigm alleviates these issues, but the limited context length only allows for a few shots of demonstrations, leading to a suboptimal understanding of the tools. Moreover, when there are numerous tools to choose from, in-context learning could completely fail to work. In this paper, we propose an alternative approach, ToolkenGPT, which combines the benefits of both sides. Our approach represents each tool as a token (toolken) and learns an embedding for it, enabling tool calls in the same way as generating a regular word token. Once a toolken is triggered, the LLM is prompted to complete arguments for the tool to execute. ToolkenGPT offers the flexibility to plug in an arbitrary number of tools by expanding the set of toolkens on the fly. In addition, it improves tool use by allowing extensive demonstration data for learning the toolken embeddings. In diverse domains, including numerical reasoning, knowledge-based question answering, and embodied plan generation, our approach effectively augments LLMs with tools and substantially outperforms various recent baselines. ToolkenGPT demonstrates the promising ability to use relevant tools from a large tool set in complex scenarios.
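The toolken idea described in the abstract above can be pictured with a small toy sketch: a frozen word-prediction head is extended with extra rows, one per tool, so that a tool call is selected the same way as an ordinary token. Everything in the block is hypothetical (the vocabulary, the two-dimensional weights, the tool names, and the greedy argmax decoding); it is not the authors' implementation.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Frozen language-model head: one weight row per word token (toy values).
word_vocab = ["the", "answer", "is"]
word_head = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# Learned toolken embeddings: one extra row per tool (toy values).
# Adding a tool means appending a name and a row; the frozen rows stay fixed.
toolkens = ["<add>", "<multiply>"]
toolken_head = [[-1.0, 0.0], [0.0, -1.0]]

def next_token(hidden):
    """Pick the highest-scoring entry over words and toolkens together.

    Returns the chosen symbol and a flag saying whether it is a toolken,
    in which case the caller would switch to filling in tool arguments.
    """
    vocab = word_vocab + toolkens
    head = word_head + toolken_head
    logits = [dot(row, hidden) for row in head]
    best = max(range(len(logits)), key=logits.__getitem__)
    return vocab[best], best >= len(word_vocab)

word_choice = next_token([2.0, 0.1])
tool_choice = next_token([-3.0, 0.0])
print(word_choice)
print(tool_choice)
```

When the flag is true, the abstract describes prompting the LLM to complete the tool's arguments; that step is left out of this sketch.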

arXiv.org e-Print Archive