
    TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation

    The emergence of Large Language Models (LLMs) like ChatGPT has inspired the development of LLM-based agents capable of addressing complex, real-world tasks. However, these agents often struggle during task execution due to methodological constraints, such as error propagation and limited adaptability. To address this issue, we propose a multi-agent framework based on dynamic Task Decomposition and Agent Generation (TDAG). This framework dynamically decomposes complex tasks into smaller subtasks and assigns each to a specifically generated subagent, thereby enhancing adaptability in diverse and unpredictable real-world tasks. Simultaneously, existing benchmarks often lack the granularity needed to evaluate incremental progress in complex, multi-step tasks. In response, we introduce ItineraryBench in the context of travel planning, featuring interconnected, progressively complex tasks with a fine-grained evaluation system. ItineraryBench is designed to assess agents' abilities in memory, planning, and tool usage across tasks of varying complexity. Our experimental results reveal that TDAG significantly outperforms established baselines, showcasing its superior adaptability and context awareness in complex task scenarios.
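The decompose-then-assign loop described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `decompose`, `generate_subagent`, and `solve` are hypothetical, the decomposer is a plain string split standing in for an LLM call, and the "subagent" is a stub that only reports what it was given. It is not the TDAG implementation.

```python
from typing import Callable, List

# Hypothetical type: a subagent takes a subtask description and the
# results produced so far, and returns its own result as text.
Subagent = Callable[[str, List[str]], str]


def decompose(task: str) -> List[str]:
    """Stand-in decomposer (assumption): split the task on ';'.

    A real system would presumably ask an LLM to propose subtasks.
    """
    return [part.strip() for part in task.split(";") if part.strip()]


def generate_subagent(subtask: str) -> Subagent:
    """Stand-in agent generator (assumption): build one agent per subtask."""

    def agent(description: str, context: List[str]) -> str:
        # A real subagent would call tools or an LLM; this stub only
        # records the subtask and how much earlier context it received.
        return f"done: {description} (saw {len(context)} earlier results)"

    return agent


def solve(task: str) -> List[str]:
    """Run each subtask with its own freshly generated subagent."""
    pending = decompose(task)
    results: List[str] = []
    while pending:
        subtask = pending.pop(0)
        agent = generate_subagent(subtask)
        # Earlier results are passed along so later subtasks can use them.
        results.append(agent(subtask, results))
    return results


if __name__ == "__main__":
    for line in solve("book flight; reserve hotel; plan day trips"):
        print(line)
```

Keeping the pending subtasks in a mutable queue is a design choice of this sketch: it leaves room to revise the remaining subtasks between steps, which is one plausible reading of "dynamic" decomposition.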

    Perceive, Excavate and Purify: A Novel Object Mining Framework for Instance Segmentation

    Recently, instance segmentation has made great progress with the rapid development of deep neural networks. However, two main challenges remain: discovering indistinguishable objects and modeling the relationship between instances. To deal with these difficulties, we propose a novel object mining framework for instance segmentation. In this framework, we first introduce the semantics perceiving subnetwork to capture pixels that may belong to an obvious instance from the bottom up. Then, we propose an object excavating mechanism to discover indistinguishable objects. In this mechanism, preliminary perceived semantics are regarded as original instances with classifications and locations, and then indistinguishable objects around these original instances are mined, which ensures that hard objects are fully excavated. Next, an instance purifying strategy is put forward to model the relationship between instances, which pulls similar instances close and pushes different instances apart to keep intra-instance similarity and inter-instance discrimination. In this manner, the same objects are combined into one instance and different objects are distinguished as independent instances. Extensive experiments on the COCO dataset show that the proposed approach outperforms state-of-the-art methods, which validates the effectiveness of the proposed object mining framework. Comment: Accepted by CVPR Workshops 202

    Motion-state Alignment for Video Semantic Segmentation

    In recent years, video semantic segmentation has made great progress with advanced deep neural networks. However, two main challenges remain, i.e., information inconsistency and computation cost. To deal with these two difficulties, we propose a novel motion-state alignment framework for video semantic segmentation to keep both motion and state consistency. In the framework, we first construct a motion alignment branch armed with an efficient decoupled transformer to capture dynamic semantics, guaranteeing region-level temporal consistency. Then, a state alignment branch composed of a stage transformer is designed to enrich feature spaces for the current frame to extract static semantics and achieve pixel-level state consistency. Next, by a semantic assignment mechanism, the region descriptor of each semantic category is gained from dynamic semantics and linked with pixel descriptors from static semantics. Benefiting from the alignment of these two kinds of effective information, the proposed method picks up dynamic and static semantics in a targeted way, so that video semantic regions are consistently segmented to obtain precise locations with low computational complexity. Extensive experiments on Cityscapes and CamVid datasets show that the proposed approach outperforms state-of-the-art methods and validates the effectiveness of the motion-state alignment framework. Comment: Accepted by CVPR Workshops 202

    Text2Street: Controllable Text-to-image Generation for Street Views

    Text-to-image generation has made remarkable progress with the emergence of diffusion models. However, it is still difficult to generate street-view images from text, mainly because the road topology of street scenes is complex, traffic status is diverse, and weather conditions vary, all of which conventional text-to-image models struggle to handle. To address these challenges, we propose a novel controllable text-to-image framework, named Text2Street. In the framework, we first introduce the lane-aware road topology generator, which achieves text-to-map generation with the accurate road structure and lane lines armed with the counting adapter, realizing the controllable road topology generation. Then, the position-based object layout generator is proposed to obtain text-to-layout generation through an object-level bounding box diffusion strategy, realizing the controllable traffic object layout generation. Finally, the multiple control image generator is designed to integrate the road topology, object layout and weather description to realize controllable street-view image generation. Extensive experiments show that the proposed approach achieves controllable street-view text-to-image generation and validates the effectiveness of the Text2Street framework for street views.

    Revisiting Non-Autoregressive Translation at Scale

    In real-world systems, scaling has been critical for improving the translation quality in autoregressive translation (AT), but it has not been well studied for non-autoregressive translation (NAT). In this work, we bridge the gap by systematically studying the impact of scaling on NAT behaviors. Extensive experiments on six WMT benchmarks over two advanced NAT models show that scaling can alleviate the commonly-cited weaknesses of NAT models, resulting in better translation performance. To reduce the side-effect of scaling on decoding speed, we empirically investigate the impact of NAT encoder and decoder on the translation performance. Experimental results on the large-scale WMT20 En-De show that the asymmetric architecture (e.g. bigger encoder and smaller decoder) can achieve comparable performance with the scaling model, while maintaining the superiority of decoding speed with standard NAT models. To this end, we establish a new benchmark by validating scaled NAT models on the scaled dataset, which can be regarded as a strong baseline for future work. We release code and system outputs at https://github.com/DeepLearnXMU/Scaling4NAT. Comment: 13 pages, Findings of ACL 202

    Changes in interleukin-27 levels in patients with acute coronary syndrome and their clinical significance

    Background: This study evaluated changes in interleukin (IL)-27 levels in patients with acute coronary syndrome (ACS) and their influence on Th1, Th2, and Th17 cells. Methods: Serum levels of IL-27, IL-4, IL-17, and interferon (IFN)-γ in healthy subjects as well as patients with ACS, including stable angina pectoris (SA), unstable angina pectoris (UA), and acute myocardial infarction (AMI), were determined using an enzyme-linked immunosorbent assay. The proportions of Th1, Th2, and Th17 cells among peripheral blood mononuclear cells (PBMCs) were measured using flow cytometry, after incubation with phorbol myristate acetate (PMA) for 4 h. The proportions of Th1 and Th17 cells among PBMCs in AMI and UA were detected after stimulation with IL-27 or PMA + IL-27 for 4, 8, and 12 h. Results: Serum levels of IL-27 in patients with AMI and UA were significantly lower than those in SA and control groups, while serum levels of IL-17 and IFN-γ in AMI and UA groups were dramatically increased compared to those in SA and healthy control groups. However, there were no statistically significant differences in serum IL-4. The proportions of Th1 and Th17 cells among PBMCs were statistically significantly higher in the AMI and UA groups than those in the SA and control groups, while there was no statistically significant difference in the proportion of Th2 cells among different groups. For patients with AMI and UA, the effect of co-stimulation of PBMCs with PMA and IL-27 was not significantly different from that of PMA single stimulation, while PMA + IL-27 co-stimulation lowered the Th17 cell proportion significantly compared to PMA single stimulation. Discussion: Compared to SA patients and healthy controls, patients with ACS (AMI + UA) had lower serum levels of IL-27 and higher proportions of PBMC Th1 and Th17 cells, which could be attributed to the inhibitory effects of IL-27 on the proliferation of Th17 cells. These results indicated that IL-27 could be a novel therapeutic target in ACS patients.

    Representation Learning with Large Language Models for Recommendation

    Recommender systems have seen significant advancements with the influence of deep learning and graph neural networks, particularly in capturing complex user-item relationships. However, these graph-based recommenders heavily depend on ID-based data, potentially disregarding valuable textual information associated with users and items, resulting in less informative learned representations. Moreover, the utilization of implicit feedback data introduces potential noise and bias, posing challenges for the effectiveness of user preference learning. While the integration of large language models (LLMs) into traditional ID-based recommenders has gained attention, challenges such as scalability issues, limitations in text-only reliance, and prompt input constraints need to be addressed for effective implementation in practical recommender systems. To address these challenges, we propose a model-agnostic framework RLMRec that aims to enhance existing recommenders with LLM-empowered representation learning. It proposes a recommendation paradigm that integrates representation learning with LLMs to capture intricate semantic aspects of user behaviors and preferences. RLMRec incorporates auxiliary textual signals, develops a user/item profiling paradigm empowered by LLMs, and aligns the semantic space of LLMs with the representation space of collaborative relational signals through a cross-view alignment framework. This work further establishes a theoretical foundation demonstrating that incorporating textual signals through mutual information maximization enhances the quality of representations. In our evaluation, we integrate RLMRec with state-of-the-art recommender models, while also analyzing its efficiency and robustness to noisy data. Our implementation code is available at https://github.com/HKUDS/RLMRec. Comment: Published as a WWW'24 full paper
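To make the idea of cross-view alignment through mutual information maximization more concrete, here is a minimal sketch of an InfoNCE-style contrastive loss, a commonly used objective whose minimization raises a lower bound on mutual information between two views. Whether RLMRec uses exactly this objective is an assumption; the function names, the cosine similarity, and the temperature value are illustrative choices, not taken from the paper or its repository.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(
    collab_embs: List[Sequence[float]],
    text_embs: List[Sequence[float]],
    temperature: float = 0.2,  # illustrative value (assumption)
) -> float:
    """Average InfoNCE loss over paired embeddings.

    Row i of collab_embs and row i of text_embs describe the same
    user or item (the positive pair); all other rows act as negatives.
    """
    n = len(collab_embs)
    total = 0.0
    for i in range(n):
        logits = [
            cosine(collab_embs[i], text_embs[j]) / temperature
            for j in range(n)
        ]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / n


if __name__ == "__main__":
    collab = [[1.0, 0.0], [0.0, 1.0]]
    aligned_text = [[1.0, 0.0], [0.0, 1.0]]
    shuffled_text = [[0.0, 1.0], [1.0, 0.0]]
    # The loss should be smaller when the two views agree.
    print(info_nce(collab, aligned_text), info_nce(collab, shuffled_text))
```

In this toy setup the loss is small when each collaborative embedding is closest to its own text embedding and large when the pairs are mismatched, which is the behavior an alignment objective of this kind is meant to reward.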

    Metallic surface states in a correlated d-electron topological Kondo insulator candidate FeSb2

    The resistance of a conventional insulator diverges as temperature approaches zero. The peculiar low temperature resistivity saturation in the 4f Kondo insulator (KI) SmB6 has spurred proposals of a correlation-driven topological Kondo insulator (TKI) with exotic ground states. However, the scarcity of model TKI material families makes it difficult to disentangle key ingredients from irrelevant details. Here we use angle-resolved photoemission spectroscopy (ARPES) to study FeSb2, a correlated d-electron KI candidate that also exhibits a low temperature resistivity saturation. On the (010) surface, we find a rich assemblage of metallic states with two-dimensional dispersion. Measurements of the bulk band structure reveal band renormalization, a large temperature-dependent band shift, and flat spectral features along certain high symmetry directions, providing spectroscopic evidence for strong correlations. Our observations suggest that exotic insulating states resembling those in SmB6 and YbB12 may also exist in systems with d instead of f electrons.