On the Robotic Uncertainty of Fully Autonomous Traffic
Recent transportation research suggests that autonomous vehicles (AVs) have
the potential to improve traffic flow efficiency as they are able to maintain
smaller car-following distances. Nevertheless, being a unique class of ground
robots, AVs are susceptible to robotic errors, particularly in their perception
module, leading to uncertainties in their movements and an increased risk of
collisions. Consequently, conservative operational strategies, such as larger
headway and slower speeds, are implemented to prioritize safety over traffic
capacity in real-world operations. To reconcile the inconsistency, this paper
proposes an analytical model framework that delineates the endogenous
reciprocity between traffic safety and efficiency that arises from robotic
uncertainty in AVs. Car-following scenarios are extensively examined, with
uncertain headway as the key parameter for bridging the single-lane capacity
and the collision probability. A Markov chain is then introduced to describe
the dynamics of the lane capacity, and the resulting expected
collision-inclusive capacity is adopted as the ultimate performance measure for
fully autonomous traffic. With the help of this analytical model, it is
possible to support the settings of critical parameters in AV operations and
incorporate optimization techniques to assist traffic management strategies for
autonomous traffic.
Learning to Collaborate by Grouping: a Consensus-oriented Strategy for Multi-agent Reinforcement Learning
Multi-agent systems require effective coordination between groups and
individuals to achieve common goals. However, current multi-agent reinforcement
learning (MARL) methods primarily focus on improving individual policies and do
not adequately address group-level policies, which leads to weak cooperation.
To address this issue, we propose a novel Consensus-oriented Strategy (CoS)
that emphasizes group and individual policies simultaneously. Specifically, CoS
comprises two main components: (a) the vector quantized group consensus module,
which extracts discrete latent embeddings that represent the stable and
discriminative group consensus, and (b) the group consensus-oriented strategy,
which integrates the group policy using a hypernet and the individual policies
using the group consensus, thereby promoting coordination at both the group and
individual levels. Through empirical experiments on cooperative navigation
tasks with both discrete and continuous spaces, as well as Google Research
football, we demonstrate that CoS outperforms state-of-the-art MARL algorithms
and achieves better collaboration, thus providing a promising solution for
achieving effective coordination in multi-agent systems.
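
As a rough illustration of the vector quantization step named in component (a), the sketch below maps a continuous embedding to its nearest entry in a small codebook and returns the discrete index. The function name `quantize`, the codebook values, and the use of squared Euclidean distance are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def quantize(
    embedding: Sequence[float], codebook: List[Sequence[float]]
) -> Tuple[int, Sequence[float]]:
    """Return the index and value of the codebook entry nearest to `embedding`."""

    def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Nearest-neighbour lookup: pick the code with the smallest distance.
    index = min(
        range(len(codebook)),
        key=lambda i: squared_distance(embedding, codebook[i]),
    )
    return index, codebook[index]


# Hypothetical 2-D codebook with three discrete "consensus" codes.
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0)]
index, code = quantize((0.9, 0.8), codebook)
```

In a trained model the codebook would be learned jointly with the encoder; here it is fixed so that the lookup itself is easy to follow.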
Reboost Large Language Model-based Text-to-SQL, Text-to-Python, and Text-to-Function -- with Real Applications in Traffic Domain
The previous state-of-the-art (SOTA) method achieved a remarkable execution
accuracy on the Spider dataset, which is one of the largest and most diverse
datasets in the Text-to-SQL domain. However, when we reproduced the method on a
business dataset, we observed a significant drop in performance. We examined
the differences in dataset complexity, as well as the clarity of questions'
intentions, and assessed how those differences could impact the performance of
prompting methods. Subsequently, we develop a more adaptable and more general
prompting method, involving mainly query rewriting and SQL boosting, which
respectively transform vague information into exact and precise information and
enhance the SQL itself by incorporating execution feedback and the query
results from the database content. In order to prevent information gaps, we
include the comments, value types, and value samples for columns as part of the
database description in the prompt. Our experiments with Large Language Models
(LLMs) illustrate the significant performance improvement on the business
dataset and prove the substantial potential of our method. In terms of
execution accuracy on the business dataset, the SOTA method scored 21.05, while
our approach scored 65.79. As a result, our approach achieved a notable
performance improvement even when using a less capable pre-trained language
model. Last but not least, we also explore the Text-to-Python and
Text-to-Function options and analyze their respective pros and cons in depth,
offering valuable insights to the community.
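
Execution accuracy, the metric quoted above, can be sketched as the percentage of predicted queries whose results match those of the reference query when both are run against the database. The example below is a minimal sketch using Python's built-in `sqlite3` module. The `roads` table, the query pairs, and the order-insensitive comparison are assumptions for illustration, not the paper's evaluation script.

```python
import sqlite3
from typing import List, Tuple


def execution_match(conn: sqlite3.Connection, predicted_sql: str, gold_sql: str) -> bool:
    """True if both queries run and return the same rows, ignoring row order."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A predicted query that fails to execute counts as a miss.
        return False
    gold = conn.execute(gold_sql).fetchall()
    return sorted(predicted, key=repr) == sorted(gold, key=repr)


def execution_accuracy(conn: sqlite3.Connection, pairs: List[Tuple[str, str]]) -> float:
    """Percentage of (predicted, gold) pairs whose results match."""
    hits = sum(execution_match(conn, predicted, gold) for predicted, gold in pairs)
    return 100.0 * hits / len(pairs)


# Hypothetical toy database and query pairs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE roads (name TEXT, speed_limit INTEGER)")
conn.executemany("INSERT INTO roads VALUES (?, ?)", [("A1", 100), ("B2", 60)])

pairs = [
    ("SELECT name FROM roads WHERE speed_limit > 80",
     "SELECT name FROM roads WHERE speed_limit >= 100"),
    ("SELECT name FROM road", "SELECT name FROM roads"),  # fails: no such table
]
accuracy = execution_accuracy(conn, pairs)
```

The first pair matches even though the SQL text differs, which is why execution accuracy is usually preferred over exact string match for Text-to-SQL.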
Polyostotic feet acrometastases from breast carcinoma demonstrated on [18F]FDG PET/CT imaging
Acrometastases are rare. Fewer than 0.01% of patients have metastasis in the foot bone. Polyostotic metastasis in the foot is extremely rare. We report a 50-year-old woman who complained of progressive pain and swelling in the right foot 4 years after radical right mastectomy. [18F]FDG PET/CT demonstrated multiple mixed bone destruction in the right foot with intense [18F]FDG uptake. CT-guided calcaneus biopsy confirmed the diagnosis of metastatic breast carcinoma.
API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs
Recent research has demonstrated that Large Language Models (LLMs) can
enhance their capabilities by utilizing external tools. However, three pivotal
questions remain unanswered: (1) How effective are current LLMs in utilizing
tools? (2) How can we enhance LLMs' ability to utilize tools? (3) What
obstacles need to be overcome to leverage tools? To address these questions, we
introduce API-Bank, a groundbreaking benchmark, specifically designed for
tool-augmented LLMs. For the first question, we develop a runnable evaluation
system consisting of 73 API tools. We annotate 314 tool-use dialogues with 753
API calls to assess the existing LLMs' capabilities in planning, retrieving,
and calling APIs. For the second question, we construct a comprehensive
training set containing 1,888 tool-use dialogues from 2,138 APIs spanning 1,000
distinct domains. Using this dataset, we train Lynx, a tool-augmented LLM
initialized from Alpaca. Experimental results demonstrate that GPT-3.5 exhibits
improved tool utilization compared to GPT-3, while GPT-4 excels in planning.
However, there is still significant potential for further improvement.
Moreover, Lynx surpasses Alpaca's tool utilization performance by more than 26
pts and approaches the effectiveness of GPT-3.5. Through error analysis, we
highlight the key challenges for future research in this field to answer the
third question.
Comment: EMNLP 202
Self-Explanation Prompting Improves Dialogue Understanding in Large Language Models
Task-oriented dialogue (TOD) systems facilitate users in executing various
activities via multi-turn dialogues, but Large Language Models (LLMs) often
struggle to comprehend these intricate contexts. In this study, we propose a
novel "Self-Explanation" prompting strategy to enhance the comprehension
abilities of LLMs in multi-turn dialogues. This task-agnostic approach requires
the model to analyze each dialogue utterance before task execution, thereby
improving performance across various dialogue-centric tasks. Experimental
results from six benchmark datasets confirm that our method consistently
outperforms other zero-shot prompts and matches or exceeds the efficacy of
few-shot prompts, demonstrating its potential as a powerful tool in enhancing
LLMs' comprehension in complex dialogue tasks.
Evolution of Microstructural Characteristics of Carbonated Cement Pastes Subjected to High Temperatures Evaluated by MIP and SEM
The microstructural evolution of both uncarbonated and carbonated cement pastes subjected to various high temperatures (30 °C, 200 °C, 400 °C, 500 °C, 600 °C, 720 °C, and 950 °C) is presented in this study by means of mercury intrusion porosimetry (MIP) and scanning electron microscopy (SEM). It was found that the thermal stability of uncarbonated cement pastes changed significantly from 400 to 500 °C due to the decomposition of portlandite in this temperature range. More large pores and microcracks were generated from 600 to 720 °C, with the depolymerization of C-S-H. After carbonation, the microstructures of carbonated cement pastes remained unchanged below 500 °C and started to degrade at 600 °C, due to the decomposition of calcium carbonates and calcium-modified silica gel. At 950 °C, both uncarbonated and carbonated cement pastes showed a loosely honeycombed microstructure, composed mainly of beta-C2S and lime. It can be concluded that carbonation improves the high-temperature resistance of cement pastes up to 500 °C, but this advantage is lost at temperatures over 600 °C.
Neighborhood Cognition Consistent Multi-Agent Reinforcement Learning
Social psychology and real experiences show that cognitive consistency plays
an important role in keeping human society in order: if people have a more
consistent cognition about their environments, they are more likely to achieve
better cooperation. Meanwhile, only cognitive consistency within a neighborhood
matters because humans only interact directly with their neighbors. Inspired by
these observations, we take the first step to introduce \emph{neighborhood
cognitive consistency} (NCC) into multi-agent reinforcement learning (MARL).
Our NCC design is quite general and can be easily combined with existing MARL
methods. As examples, we propose neighborhood cognition consistent deep
Q-learning and Actor-Critic to facilitate large-scale multi-agent cooperation.
Extensive experiments on several challenging tasks (i.e., packet routing, wifi
configuration, and Google football player control) justify the superior
performance of our methods compared with state-of-the-art MARL approaches.
Comment: Accepted by AAAI2020 with oral presentation
(https://aaai.org/Conferences/AAAI-20/wp-content/uploads/2020/01/AAAI-20-Accepted-Paper-List.pdf).
Since AAAI2020 has started, I have the right to distribute this paper on
arXi
SpokenWOZ: A Large-Scale Speech-Text Benchmark for Spoken Task-Oriented Dialogue Agents
Task-oriented dialogue (TOD) models have made significant progress in recent
years. However, previous studies primarily focus on datasets written by
annotators, which has resulted in a gap between academic research and
real-world spoken conversation scenarios. While several small-scale spoken TOD
datasets have been proposed to address robustness issues such as ASR errors, they
ignore the unique challenges in spoken conversation. To tackle these limitations,
we introduce SpokenWOZ, a large-scale speech-text dataset for spoken TOD,
containing 8 domains, 203k turns, 5.7k dialogues and 249 hours of audio from
human-to-human spoken conversations. SpokenWOZ further incorporates common
spoken characteristics such as word-by-word processing and reasoning in spoken
language. Based on these characteristics, we present cross-turn slot and
reasoning slot detection as new challenges. We conduct experiments on various
baselines, including text-modal models, newly proposed dual-modal models, and
LLMs, e.g., ChatGPT. The results show that the current models still have
substantial room for improvement in spoken conversation, where the most
advanced dialogue state tracker only achieves 25.65% in joint goal accuracy and
the SOTA end-to-end model only correctly completes the user request in 52.1% of
dialogues. The dataset, code, and leaderboard are available at:
https://spokenwoz.github.io/SpokenWOZ-github.io/
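
Joint goal accuracy, the dialogue state tracking metric quoted above, is commonly computed as the share of turns in which the predicted dialogue state matches the reference state exactly, so a single wrong slot value makes the whole turn count as incorrect. The sketch below assumes states are represented as slot-to-value dictionaries. The slot names and values are hypothetical, and this is not the SpokenWOZ evaluation code.

```python
from typing import Dict, List

DialogueState = Dict[str, str]


def joint_goal_accuracy(predicted: List[DialogueState], gold: List[DialogueState]) -> float:
    """Percentage of turns whose predicted state equals the gold state exactly."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same turns")
    # Dictionary equality requires every slot and value to agree.
    correct = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * correct / len(gold)


# Hypothetical two-turn dialogue: the second turn has one wrong slot value.
gold = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "4"},
]
predicted = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "3"},
]
jga = joint_goal_accuracy(predicted, gold)
```

Because states accumulate over a dialogue, an early error tends to persist into later turns, which helps explain why joint goal accuracy is a strict metric.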