27 research outputs found

    ACT-SQL: In-Context Learning for Text-to-SQL with Automatically-Generated Chain-of-Thought

    Recently, Large Language Models (LLMs) have proven to have strong abilities across various domains and tasks. We study the problem of prompt design in the text-to-SQL task and attempt to improve the LLMs' reasoning ability when generating SQL queries. Beyond the standard few-shot in-context learning setting, we design our chain-of-thought (CoT) prompt with a method similar to schema linking. We provide a method named ACT-SQL to automatically generate auto-CoT exemplars, so the whole process needs no manual labeling. Our approach is cost-saving since we use only one LLM API call when generating each SQL query. Furthermore, we extend our in-context learning method to the multi-turn text-to-SQL task. The experiment results show that the LLMs' performance can benefit from our ACT-SQL approach. Our approach achieves SOTA performance on the Spider dev set among existing in-context learning approaches.
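    The abstract describes auto-generated CoT exemplars built from schema linking. The following is a minimal sketch of that idea, not the paper's actual implementation: the schema, question, and the naive substring-based linking heuristic are all illustrative assumptions.

```python
# Sketch: build a chain-of-thought exemplar for text-to-SQL by naive
# schema linking. Any column whose name appears in the question is
# cited as a reasoning step before the final SQL. The schema and
# linking heuristic are illustrative, not ACT-SQL's real method.

def build_cot_exemplar(question, schema, gold_sql):
    """Return a few-shot exemplar string: question, linking steps, SQL."""
    steps = []
    for table, columns in schema.items():
        for col in columns:
            phrase = col.replace("_", " ")
            if phrase in question.lower():
                steps.append(
                    f'The phrase "{phrase}" refers to column {table}.{col}.'
                )
    reasoning = " ".join(steps) if steps else "No schema items are mentioned literally."
    return f"Question: {question}\nLet's think step by step. {reasoning}\nSQL: {gold_sql}"

schema = {"singer": ["name", "age", "country"]}
exemplar = build_cot_exemplar(
    "What is the average age of singers from France?",
    schema,
    "SELECT AVG(age) FROM singer WHERE country = 'France'",
)
print(exemplar)
```

    Exemplars built this way can be concatenated into the prompt ahead of the target question, which is what makes the approach labeling-free.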

    Collaborative Group Learning

    Collaborative learning has successfully applied knowledge transfer to guide a pool of small student networks towards robust local minima. However, previous approaches typically struggle with drastically aggravated student homogenization as the number of students rises. In this paper, we propose Collaborative Group Learning, an efficient framework that aims to diversify the feature representation and conduct effective regularization. Intuitively, similar to the human group-study mechanism, we induce students to learn and exchange different parts of course knowledge as collaborative groups. First, each student is established by random routing on a modular neural network, which facilitates flexible knowledge communication between students due to random levels of representation sharing and branching. Second, to resist student homogenization, students first compose diverse feature sets by exploiting the inductive bias from subsets of the training data, and then aggregate and distill different complementary knowledge by imitating a random sub-group of students at each time step. Overall, these mechanisms are beneficial for maximizing the student population to further improve model generalization without sacrificing computational efficiency. Empirical evaluations on both image and text tasks indicate that our method significantly outperforms various state-of-the-art collaborative approaches while enhancing computational efficiency. Comment: Accepted by AAAI 2021; camera-ready version.

    Large Language Models Are Semi-Parametric Reinforcement Learning Agents

    Inspired by insights from cognitive science on human memory and reasoning mechanisms, we propose REMEMBERER, a novel evolvable LLM-based (Large Language Model) agent framework. By equipping the LLM with a long-term experience memory, REMEMBERER can exploit experiences from past episodes even for different task goals, giving it an advantage over an LLM-based agent with fixed exemplars or only a transient working memory. We further introduce Reinforcement Learning with Experience Memory (RLEM) to update the memory. Thus, the whole system can learn from both successful and failed experiences, and evolve its capability without fine-tuning the parameters of the LLM. In this way, the proposed REMEMBERER constitutes a semi-parametric RL agent. Extensive experiments are conducted on two RL task sets to evaluate the proposed framework. The average results with different initializations and training sets exceed the prior SOTA by 4% and 2% in success rate on the two task sets, demonstrating the superiority and robustness of REMEMBERER.
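    The abstract's core mechanism is an external experience memory updated by reinforcement learning rather than by gradient updates to the LLM. A minimal sketch of such a memory, assuming a simple Q-learning-style update rule and illustrative state/action names and hyperparameters (none of which are taken from the paper):

```python
# Sketch: an external experience memory keyed by (state, action) pairs
# and updated with a Q-learning-style rule. High-value entries could be
# retrieved later as few-shot exemplars for the LLM prompt. All names
# and hyperparameters here are illustrative assumptions.

class ExperienceMemory:
    def __init__(self, alpha=0.1, gamma=0.9):
        self.q = {}          # (state, action) -> estimated value
        self.alpha = alpha   # learning rate
        self.gamma = gamma   # discount factor

    def update(self, state, action, reward, next_state, next_actions):
        """Update the stored value estimate from one observed transition."""
        best_next = max(
            (self.q.get((next_state, a), 0.0) for a in next_actions),
            default=0.0,
        )
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old
        )

    def top_exemplars(self, k=2):
        """Return the k highest-valued experiences, e.g. for prompting."""
        return sorted(self.q.items(), key=lambda kv: -kv[1])[:k]

mem = ExperienceMemory()
mem.update("task_start", "click_search", reward=1.0,
           next_state="results", next_actions=["open_link"])
mem.update("task_start", "scroll_down", reward=0.0,
           next_state="task_start", next_actions=["click_search"])
print(mem.top_exemplars(1))
```

    The key property matching the abstract: the agent improves by editing this table, while the LLM's own parameters stay frozen.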

    ASTormer: An AST Structure-aware Transformer Decoder for Text-to-SQL

    Text-to-SQL aims to generate an executable SQL program given the user utterance and the corresponding database schema. To ensure the well-formedness of output SQL, one prominent approach adopts a grammar-based recurrent decoder to produce the equivalent SQL abstract syntax tree (AST). However, previous methods mainly utilize an RNN-based decoder, which 1) is time-consuming and inefficient and 2) introduces very few structural priors. In this work, we propose an AST structure-aware Transformer decoder (ASTormer) to replace traditional RNN cells. Structural knowledge, such as node types and positions in the tree, is seamlessly incorporated into the decoder via both absolute and relative position embeddings. Moreover, the proposed framework is compatible with different traversal orders, even considering adaptive node selection. Extensive experiments on five text-to-SQL benchmarks demonstrate the effectiveness and efficiency of our structured decoder compared to competitive baselines.

    Adaptive Vague Preference Policy Learning for Multi-round Conversational Recommendation

    Conversational recommendation systems (CRS) effectively address information asymmetry by dynamically eliciting user preferences through multi-turn interactions. Existing CRS widely assume that users have clear preferences. Under this assumption, the agent completely trusts user feedback and treats accepted or rejected signals as strong indicators to filter items and reduce the candidate space, which may lead to over-filtering. However, in reality, users' preferences are often vague and volatile, with uncertainty about their desires and changing decisions during interactions. To address this issue, we introduce a novel scenario called Vague Preference Multi-round Conversational Recommendation (VPMCR), which considers users' vague and volatile preferences in CRS. VPMCR employs a soft estimation mechanism to assign a non-zero confidence score to all candidate items to be displayed, naturally avoiding the over-filtering problem. In the VPMCR setting, we introduce a solution called Adaptive Vague Preference Policy Learning (AVPPL), which consists of two main components: Uncertainty-aware Soft Estimation (USE) and Uncertainty-aware Policy Learning (UPL). USE estimates the uncertainty of users' vague feedback and captures their dynamic preferences using a choice-based preference extraction module and a time-aware decaying strategy. UPL leverages the preference distribution estimated by USE to guide the conversation and adapt to changes in users' preferences when making recommendations or asking about attributes. Our extensive experiments demonstrate the effectiveness of our method in the VPMCR scenario, highlighting its potential for practical applications and improving the overall performance and applicability of CRS in real-world settings, particularly for users with vague or dynamic preferences.
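    The soft estimation with time-aware decay described above can be illustrated with a minimal sketch. The exponential decay form, the signal encoding, and the attribute names are all illustrative assumptions, not the paper's actual formulation.

```python
# Sketch: a time-aware decaying soft preference estimate. Older
# feedback contributes less to the current score, and rejected
# attributes keep a (negative) soft score instead of being
# hard-filtered out of the candidate space.

def soft_preference(feedback, current_turn, decay=0.8):
    """feedback: list of (turn, attribute, signal) tuples, where
    signal is +1 for accepted and -1 for rejected.
    Returns a soft score per attribute; nothing is filtered to zero."""
    scores = {}
    for turn, attr, signal in feedback:
        weight = decay ** (current_turn - turn)   # older turns fade
        scores[attr] = scores.get(attr, 0.0) + signal * weight
    return scores

history = [(1, "comedy", +1), (2, "horror", -1), (3, "comedy", +1)]
prefs = soft_preference(history, current_turn=3)
print(prefs)
```

    Because the rejection of "horror" decays over later turns, the policy can still recommend such items if the user's preferences drift, which is the behavior a hard filter cannot recover.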

    Reduced expression of tissue factor pathway inhibitor-2 contributes to apoptosis and angiogenesis in cervical cancer

    Background: Tissue factor pathway inhibitor-2 (TFPI-2) is an extracellular-matrix-associated, broad-spectrum Kunitz-type serine proteinase inhibitor. Recently, downregulation of TFPI-2 was suggested to be involved in tumor invasion and metastasis in some cancers.
    Methods: This study involved 12 normal cervical squamous epithelia, 48 cervical intraepithelial neoplasias (CIN), and 68 cervical cancers. The expression of TFPI-2, Ki-67, and vascular endothelial growth factor (VEGF) was investigated by immunohistochemistry staining. The apoptotic index (AI) was determined with an in situ end-labeling assay (TUNEL), and CD34 staining was used as an indicator of microvessel density (MVD).
    Results: TFPI-2 expression showed a decreasing trend with the progression of cervical cancer and was significantly correlated with FIGO stage, lymph node metastasis, and HPV infection. In addition, there was a significant positive correlation between the grading of TFPI-2 expression and AI (P = 0.004). In contrast, the expression of TFPI-2 and VEGF or MVD was negatively correlated (both p < 0.001). However, we did not establish any significant correlation between Ki-67 and TFPI-2 expression in cervical cancer.
    Conclusions: The results suggest that TFPI-2 expression decreases with tumor progression in cervical cancer. There was a close association between TFPI-2 expression and tumor cell apoptosis and angiogenesis in patients with cervical cancer. TFPI-2 may play an inhibitory role during the development of cervical cancer.

    The protective role of DOT1L in UV-induced melanomagenesis

    The DOT1L histone H3 lysine 79 (H3K79) methyltransferase plays an oncogenic role in MLL-rearranged leukemogenesis. Here, we demonstrate that, in contrast to MLL-rearranged leukemia, DOT1L plays a protective role in ultraviolet radiation (UVR)-induced melanoma development. Specifically, the DOT1L gene is located in a frequently deleted region and undergoes somatic mutation in human melanoma. Specific mutations functionally compromise DOT1L methyltransferase enzyme activity, leading to reduced H3K79 methylation. Importantly, in the absence of DOT1L, UVR-induced DNA damage is inefficiently repaired, so DOT1L loss promotes melanoma development in mice after exposure to UVR. Mechanistically, DOT1L facilitates DNA damage repair, with DOT1L-methylated H3K79 involved in binding and recruiting XPC to the DNA damage site for nucleotide excision repair (NER). This study indicates that DOT1L plays a protective role in UVR-induced melanomagenesis.