85 research outputs found

    A Unified Framework for Multi-intent Spoken Language Understanding with prompting

    Multi-intent Spoken Language Understanding has great potential for widespread implementation. Jointly modeling Intent Detection (ID) and Slot Filling (SF) provides a channel to exploit the correlation between intents and slots. However, current approaches tend to formulate these two sub-tasks differently, which leads to two issues: 1) It hinders models from effectively extracting shared features. 2) Rather complicated structures are used to enhance expressive ability, at the cost of the interpretability of the frameworks. In this work, we describe a Prompt-based Spoken Language Understanding (PromptSLU) framework that intuitively unifies the two sub-tasks into the same form by offering a common pre-trained Seq2Seq model. In detail, ID and SF are completed by concisely filling the utterance into task-specific prompt templates as input, and by sharing an output format of key-value pair sequences. Furthermore, variable intents are predicted first, then naturally embedded into prompts to guide slot-value pair inference from a semantic perspective. Finally, inspired by prevalent multi-task learning, we introduce an auxiliary sub-task, which helps to learn relationships among the provided labels. Experimental results show that our framework outperforms several state-of-the-art baselines on two public datasets. Comment: Work in progress

    API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs

    Recent research has demonstrated that Large Language Models (LLMs) can enhance their capabilities by utilizing external tools. However, three pivotal questions remain unanswered: (1) How effective are current LLMs in utilizing tools? (2) How can we enhance LLMs' ability to utilize tools? (3) What obstacles need to be overcome to leverage tools? To address these questions, we introduce API-Bank, a groundbreaking benchmark, specifically designed for tool-augmented LLMs. For the first question, we develop a runnable evaluation system consisting of 73 API tools. We annotate 314 tool-use dialogues with 753 API calls to assess the existing LLMs' capabilities in planning, retrieving, and calling APIs. For the second question, we construct a comprehensive training set containing 1,888 tool-use dialogues from 2,138 APIs spanning 1,000 distinct domains. Using this dataset, we train Lynx, a tool-augmented LLM initialized from Alpaca. Experimental results demonstrate that GPT-3.5 exhibits improved tool utilization compared to GPT-3, while GPT-4 excels in planning. However, there is still significant potential for further improvement. Moreover, Lynx surpasses Alpaca's tool utilization performance by more than 26 pts and approaches the effectiveness of GPT-3.5. Through error analysis, we highlight the key challenges for future research in this field to answer the third question. Comment: EMNLP 202

    Review of Service Restoration Methods in Distribution Networks


    The gene normalization task in BioCreative III

    BACKGROUND: We report the Gene Normalization (GN) challenge in BioCreative III where participating teams were asked to return a ranked list of identifiers of the genes detected in full-text articles. For training, 32 fully and 500 partially annotated articles were prepared. A total of 507 articles were selected as the test set. Due to the high annotation cost, it was not feasible to obtain gold-standard human annotations for all test articles. Instead, we developed an Expectation Maximization (EM) algorithm approach for choosing a small number of test articles for manual annotation that were most capable of differentiating team performance. Moreover, the same algorithm was subsequently used for inferring ground truth based solely on team submissions. We report team performance on both gold standard and inferred ground truth using a newly proposed metric called Threshold Average Precision (TAP-k). RESULTS: We received a total of 37 runs from 14 different teams for the task. When evaluated using the gold-standard annotations of the 50 articles, the highest TAP-k scores were 0.3297 (k=5), 0.3538 (k=10), and 0.3535 (k=20), respectively. Higher TAP-k scores of 0.4916 (k=5, 10, 20) were observed when evaluated using the inferred ground truth over the full test set. When combining team results using machine learning, the best composite system achieved TAP-k scores of 0.3707 (k=5), 0.4311 (k=10), and 0.4477 (k=20) on the gold standard, representing improvements of 12.4%, 21.8%, and 26.6% over the best team results, respectively. CONCLUSIONS: By using full text and being species non-specific, the GN task in BioCreative III has moved closer to a real literature curation task than similar tasks in the past and presents additional challenges for the text mining community, as revealed in the overall team results. 
    By evaluating teams using the gold standard, we show that the EM algorithm allows team submissions to be differentiated while keeping the manual annotation effort feasible. Using the inferred ground truth we show measures of comparative performance between teams. Finally, by comparing team rankings on gold standard vs. inferred ground truth, we further demonstrate that the inferred ground truth is as effective as the gold standard for detecting good team performance.
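
The relative improvements quoted in the abstract above can be checked from the reported TAP-k scores. The following is a minimal sketch, assuming that "improvement" means relative gain over the best single-team score; the helper name `relative_improvement` and the dictionary names are illustrative and do not come from the paper.

```python
# TAP-k scores quoted in the abstract (gold-standard evaluation), keyed by k.
best_team = {5: 0.3297, 10: 0.3538, 20: 0.3535}
composite = {5: 0.3707, 10: 0.4311, 20: 0.4477}


def relative_improvement(baseline: float, improved: float) -> float:
    """Return the relative gain of `improved` over `baseline`, in percent.

    Assumption: the abstract's percentages are relative gains, not
    absolute differences in TAP-k.
    """
    return (improved - baseline) / baseline * 100.0


# Round to one decimal place to match the precision used in the abstract.
improvements = {
    k: round(relative_improvement(best_team[k], composite[k]), 1)
    for k in best_team
}

for k, pct in improvements.items():
    print(f"k={k}: {pct}%")
```

Under that assumption the computed values agree with the 12.4%, 21.8%, and 26.6% stated in the abstract.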

    A survey on puncture models and path planning algorithms of bevel-tipped flexible needles

    Percutaneous needle insertion is a minimally invasive surgery with broad medical application prospects, such as biopsy and brachytherapy. However, the currently adopted rigid needles have limitations, as they cannot bypass obstacles or correct puncture deviations and can only travel along a straight path. Bevel-tip flexible needles are increasingly being adopted to address these issues, owing to their needle body's ease of deformation and bending. Successful puncture of flexible needles relies on accurate models and path planning, ensuring the needle reaches the target while avoiding vital tissues. This review investigates puncture models and path-planning algorithms by reviewing recent literature, focusing on the path-planning part. According to the literature, puncture models can be divided into three types: mechanical, finite element method (FEM), and kinematic models. Path-planning algorithms are categorized and discussed following the division used for mobile robots, which differs from the conventional approach for flexible needles and is an innovation of this review. This review systematically summarizes the following categories: graph theory search, sampling-based, intelligent search, local obstacle avoidance, and other algorithms, including their implementation, advantages, and disadvantages, to further explore the potential to overcome obstacles in path planning for minimally invasive puncture needles. Finally, this study proposes future development trends in path-planning algorithms, providing possible directions for subsequent research on bevel-tipped flexible needles. This research aims to provide a resource for researchers to quickly learn about common path-planning algorithms, their backgrounds, and puncture models.

    A Global-Local Blur Disentangling Network for Dynamic Scene Deblurring

    Images captured in a real scene usually suffer from complex non-uniform degradation, which includes both global and local blurs. It is difficult to handle such complex blur variations with a single unified processing model. We propose a global-local blur disentangling network, which can effectively extract global and local blur features via two branches. A phased training scheme is designed to disentangle the global and local blur features; that is, each branch is trained on its own task-specific dataset. A branch attention mechanism is introduced to dynamically fuse global and local features. Complex blurry images are used to train the attention module and the reconstruction module. The visualized feature maps of the different branches indicate that our dual-branch network can decouple the global and local blur features efficiently. Experimental results show that the proposed dual-branch blur disentangling network can improve both the subjective and objective deblurring effects for real captured images.

    Effect of Ellipsoidal Modulus and Internal Pressure on Bearing Capacity of Thrust-Bearing Aft Dome

    In order to analyse the effect of ellipsoidal modulus and internal pressure on the bearing capacity of a thrust-bearing aft dome, we use the finite element method to obtain the stress and strain distribution and the bearing capacity of ellipsoidal aft domes with moduli of 1.6, 1.4 and 1.0 under internal pressures of 0 MPa to 0.98 MPa and engine thrust. We find that as the modulus decreases, the bearing capacity of the ellipsoidal aft dome increases, and that as the internal pressure decreases within the engineering range (0 MPa to 0.98 MPa), the bearing capacity increases. The conclusion can provide guidance for the design of thrust-bearing aft domes.