285 research outputs found

Multi-Agent Combinatorial Path Finding with Heterogeneous Task Duration

Author: Ren Zhongqiang
Wang Hesheng
Zhang Yuanhang
Publication date: 26/11/2023

Multi-Agent Combinatorial Path Finding (MCPF) seeks collision-free paths for multiple agents from their initial locations to destinations, visiting a set of intermediate target locations in the middle of the paths, while minimizing the sum of arrival times. While a few approaches have been developed to handle MCPF, most of them simply direct the agent to visit the targets without considering the task duration, i.e., the amount of time needed for an agent to execute the task (such as picking an item) at a target location. MCPF is NP-hard to solve to optimality, and the inclusion of task duration further complicates the problem. This paper investigates heterogeneous task duration, where the duration can be different with respect to both the agents and targets. We develop two methods, where the first method post-processes the paths planned by any MCPF planner to include the task duration and has no solution optimality guarantee; and the second method considers task duration during planning and is able to ensure solution optimality. The numerical and simulation results show that our methods can handle up to 20 agents and 50 targets in the presence of task duration, and can execute the paths subject to robot motion disturbance

arXiv.org e-Print Archive

BEV-IO: Enhancing Bird's-Eye-View 3D Detection with Instance Occupancy

Author: Lu Huchuan
Wang Lijun
Wang Yifan
Zhang Yuanhang
Zhang Zaibin
Publication date: 10/01/2024

A popular approach for constructing bird's-eye-view (BEV) representation in 3D detection is to lift 2D image features onto the viewing frustum space based on explicitly predicted depth distribution. However, depth distribution can only characterize the 3D geometry of visible object surfaces but fails to capture their internal space and overall geometric structure, leading to sparse and unsatisfactory 3D representations. To mitigate this issue, we present BEV-IO, a new 3D detection paradigm to enhance BEV representation with instance occupancy information. At the core of our method is the newly-designed instance occupancy prediction (IOP) module, which aims to infer point-level occupancy status for each instance in the frustum space. To ensure training efficiency while maintaining representational flexibility, it is trained using the combination of both explicit and implicit supervision. With the predicted occupancy, we further design a geometry-aware feature propagation mechanism (GFP), which performs self-attention based on occupancy distribution along each ray in frustum and is able to enforce instance-level feature consistency. By integrating the IOP module with GFP mechanism, our BEV-IO detector is able to render highly informative 3D scene structures with more comprehensive BEV representations. Experimental results demonstrate that BEV-IO can outperform state-of-the-art methods while only adding a negligible increase in parameters (0.2%) and computational overhead (0.24% in GFLOPs). Comment: v

arXiv.org e-Print Archive

Once is Enough: A Light-Weight Cross-Attention for Fast Sentence Pair Modeling

Author: Gao Cuiyun
Liu Chuanyi
Qi Shiyi
Wang Qifan
Xu Zenglin
Yang Yuanhang
Publication date: 22/10/2023

Transformer-based models have achieved great success on sentence pair modeling tasks, such as answer selection and natural language inference (NLI). These models generally perform cross-attention over input pairs, leading to prohibitive computational costs. Recent studies propose dual-encoder and late interaction architectures for faster computation. However, the balance between the expressiveness of cross-attention and the computational speedup still needs to be better coordinated. To this end, this paper introduces a novel paradigm, MixEncoder, for efficient sentence pair modeling. MixEncoder involves a light-weight cross-attention mechanism. It conducts query encoding only once while modeling the query-candidate interaction in parallel. Extensive experiments conducted on four tasks demonstrate that MixEncoder can speed up sentence pairing by over 113x while achieving performance comparable to the more expensive cross-attention models. Comment: Accepted to EMNLP 202

arXiv.org e-Print Archive

Average Polarization of Electromagnetic Gaussian Schell-Model Beams through Anisotropic Non-Kolmogorov Turbulence

Author: Qiu Wang
Yixin Zhang
Yuanhang Zhao
Publication venue: 'Brno University of Technology'
Publication date: 01/12/2016

Polarization properties of electromagnetic Gaussian Schell-model beams propagating through the anisotropic non-Kolmogorov turbulence of a marine-atmosphere channel are studied based on the cross-spectral density matrix. Detailed analysis shows that the average polarization decreases as the spectral index, the inner scale of turbulence, and the generalized refractive-index structure parameter increase. We find that the effect of anisotropic turbulence on the average polarization is smaller than that of isotropic turbulence, and that the depolarization effect of turbulence in the marine atmosphere is larger than in the terrestrial atmosphere. An electromagnetic Gaussian Schell-model beam with smaller σxx, σyy and Ax, but larger Ay, reduces the interference from turbulence.

Directory of Open Access Journals

Digital library of Brno University of Technology

Dilated FCN: Listening Longer to Hear Better

Author: Gong Shuyu
Liu Jundong
Smith Charles D.
Sun Tao
Wang Zhewei
Xu Li
Zhang Yuanhang
Publication date: 27/07/2019

Deep neural network solutions have emerged as a new and powerful paradigm for speech enhancement (SE). The capabilities to capture long context and extract multi-scale patterns are crucial for designing effective SE networks. Such capabilities, however, are often in conflict with the goal of maintaining compact networks to ensure good system generalization. In this paper, we explore dilation operations and apply them to fully convolutional networks (FCNs) to address this issue. Dilations equip the networks with greatly expanded receptive fields, without increasing the number of parameters. Different strategies to fuse multi-scale dilations, as well as to install the dilation modules, are explored in this work. Using the Noisy VCTK and AzBio sentences datasets, we demonstrate that the proposed dilation models significantly improve over the baseline FCN and outperform the state-of-the-art SE solutions. Comment: 5 pages; will appear in WASPAA conferenc
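The claim that dilation widens the receptive field without adding parameters can be checked with simple arithmetic. The sketch below is illustrative only and is not the architecture from the paper: it assumes a stack of stride-1 1-D convolutions with a shared kernel size, and the function names, layer counts, and channel width are hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in input samples, of stacked stride-1 1-D convolutions.

    Each layer with dilation d extends the field by (kernel_size - 1) * d.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def weight_count(kernel_size, channels, num_layers):
    """Number of weights (biases ignored) in the stack.

    Assumes every layer maps `channels` inputs to `channels` outputs.
    The dilation rate does not appear here: it changes where the kernel
    taps are placed, not how many taps there are.
    """
    return num_layers * kernel_size * channels * channels


# Hypothetical 4-layer stacks with kernel size 3 and 16 channels.
plain_rf = receptive_field(3, [1, 1, 1, 1])    # no dilation
dilated_rf = receptive_field(3, [1, 2, 4, 8])  # exponentially growing dilation
weights = weight_count(3, 16, 4)               # identical for both stacks

print(plain_rf, dilated_rf, weights)
```

Under these assumptions the dilated stack covers 31 input samples against 9 for the plain stack, while both use the same number of weights.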

arXiv.org e-Print Archive

College Towns: Handle Data With Care

Author: Jansen Dennis W.
Navarro Carlos I.
Wang Yuanhang
Publication venue: Texas A&M University. Library
Publication date: 18/11/2019

EconomicGrowth_Development_TechnicalChange
Although the term ‘college town’ may invoke idyllic images from our past, government statistics paint a different picture. College towns often appear as poverty-ridden, with unaffordable housing and low incomes. However, by their very nature college students are young, often have very little income, and usually have an ability to spend far more than their official income. In this paper, authors Dennis W. Jansen, Carlos I. Navarro and Yuanhang Wang show how government statistics for college towns can be misleading with respect to income, poverty, and housing affordability, as well as ways in which college towns are not so different from other areas with respect to statistics on crime or unemployment rates.

OAKTrust Digital Repository (Texas A&M Univ)