754 research outputs found

    A Novel Improved Bat Algorithm in UAV Path Planning

    The path planning algorithm is central to UAV path planning. Many traditional path planning methods still suffer from a low convergence rate and insufficient robustness. This paper makes three contributions to address these problems. First, an improved artificial potential field (APF) method is adopted to accelerate the convergence of the bat’s position update. Second, an optimal success rate strategy is proposed to improve the adaptive inertia weight of the bat algorithm. Third, a chaos strategy is proposed to avoid falling into a local optimum. Compared with standard APF and the chaos strategy in UAV path planning scenarios, the improved algorithm, CPFIBA (the improved artificial potential field method combined with the chaotic bat algorithm), significantly increases the success rate of finding a suitable path and decreases the convergence time. Simulation results show that the proposed algorithm is also robust on path planning problems. It also overcomes a shortcoming of traditional meta-heuristic algorithms, whose convergence process is prone to falling into a local optimum. The simulations further show that the proposed CPFIBA performs better than BA and DEBA on UAV path planning problems.
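The chaos strategy named in this abstract is commonly realized with a chaotic map. The following is a minimal sketch, assuming the logistic map (the abstract does not say which map CPFIBA uses). The names `logistic_map` and `chaotic_candidates` are hypothetical. It shows how a chaotic sequence can scatter candidate positions across a bounded search range, which is one way to help a population leave a local optimum.

```python
def logistic_map(x0, n, r=4.0):
    """Return n iterates of x <- r * x * (1 - x), starting from x0 in (0, 1)."""
    xs = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs


def chaotic_candidates(lower, upper, n, x0=0.3):
    """Map chaotic values in [0, 1] onto the search interval [lower, upper]."""
    return [lower + c * (upper - lower) for c in logistic_map(x0, n)]


# Assumed example: five candidate coordinates on a hypothetical axis spanning -10..10.
candidates = chaotic_candidates(-10.0, 10.0, 5)
print(candidates)
```

In a full optimizer these candidates would replace or perturb stagnating bats; that step is left out because the abstract does not describe it.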

    PixMIM: Rethinking Pixel Reconstruction in Masked Image Modeling

    Masked Image Modeling (MIM) has achieved promising progress with the advent of Masked Autoencoders (MAE) and BEiT. However, subsequent works have complicated the framework with new auxiliary tasks or extra pre-trained models, inevitably increasing computational overhead. This paper undertakes a fundamental analysis of MIM from the perspective of pixel reconstruction, which examines the input image patches and the reconstruction target, and highlights two critical but previously overlooked bottlenecks. Based on this analysis, we propose a remarkably simple and effective method, PixMIM, that entails two strategies: 1) filtering the high-frequency components from the reconstruction target to de-emphasize the network's focus on texture-rich details and 2) adopting a conservative data transform strategy to alleviate the problem of missing foreground in MIM training. PixMIM can be easily integrated into most existing pixel-based MIM approaches (i.e., those using raw images as the reconstruction target) with negligible additional computation. Without bells and whistles, our method consistently improves three MIM approaches, MAE, ConvMAE, and LSMAE, across various downstream tasks. We believe this effective plug-and-play method will serve as a strong baseline for self-supervised learning and provide insights for future improvements of the MIM framework. Code and models are available at https://github.com/open-mmlab/mmselfsup/tree/dev-1.x/configs/selfsup/pixmim.
    Comment: Update code link and add additional results
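The first strategy, removing high-frequency components from the reconstruction target, can be illustrated in one dimension. The following is a minimal sketch, assuming a plain discrete Fourier transform over a single row of pixel values and a hypothetical `cutoff` parameter. The actual method works on 2D images, and the abstract does not describe its filter design, so this shows only the general idea of a low-pass target.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse transform; returns the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def low_pass(x, cutoff):
    """Zero every frequency bin whose index magnitude exceeds cutoff."""
    n = len(x)
    spectrum = dft(x)
    # Bins k and n - k carry the same absolute frequency.
    kept = [spectrum[k] if min(k, n - k) <= cutoff else 0 for k in range(n)]
    return idft(kept)


# Assumed toy row: a flat intensity of 1.0 plus a fine alternating texture.
row = [1.5, 0.5] * 4
smooth_target = low_pass(row, cutoff=1)
print([round(v, 6) for v in smooth_target])
```

The alternating texture sits in the highest frequency bin, so the filtered target keeps only the flat intensity.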

    GenRES: Rethinking Evaluation for Generative Relation Extraction in the Era of Large Language Models

    The field of relation extraction (RE) is experiencing a notable shift towards generative relation extraction (GRE), leveraging the capabilities of large language models (LLMs). However, we discovered that traditional RE metrics like precision and recall fall short in evaluating GRE methods. This shortfall arises because these metrics rely on exact matching with human-annotated reference relations, while GRE methods often produce diverse and semantically accurate relations that differ from the references. To fill this gap, we introduce GenRES for a multi-dimensional assessment in terms of the topic similarity, uniqueness, granularity, factualness, and completeness of the GRE results. With GenRES, we empirically identified that (1) precision/recall fail to justify the performance of GRE methods; (2) human-annotated referential relations can be incomplete; (3) prompting LLMs with a fixed set of relations or entities can cause hallucinations. Next, we conducted a human evaluation of GRE methods, which shows that GenRES is consistent with human preferences for RE quality. Finally, we conducted a comprehensive evaluation of fourteen leading LLMs using GenRES across document-, bag-, and sentence-level RE datasets to set the benchmark for future research in GRE.
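The exact-matching problem described in this abstract is easy to reproduce. The following is a minimal sketch, assuming relations are stored as (head, relation, tail) tuples and using made-up triples. The function name `exact_match_pr` is hypothetical and is not part of GenRES.

```python
def exact_match_pr(predicted, reference):
    """Precision and recall where a triple counts only if it matches exactly."""
    pred, ref = set(predicted), set(reference)
    hits = len(pred & ref)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(ref) if ref else 0.0
    return precision, recall


# Illustrative triples: same fact, different relation label.
reference = [("Marie Curie", "born_in", "Warsaw")]
predicted = [("Marie Curie", "place_of_birth", "Warsaw")]

precision, recall = exact_match_pr(predicted, reference)
print(precision, recall)
```

Both scores are zero even though the predicted triple states the same fact, which is the kind of mismatch that motivates evaluation beyond exact matching.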

    Collaborative Propagation on Multiple Instance Graphs for 3D Instance Segmentation with Single-point Supervision

    Instance segmentation on 3D point clouds has been attracting increasing attention due to its wide applications, especially in scene understanding. However, most existing methods operate on fully annotated data, while manually preparing ground-truth labels at the point level is cumbersome and labor-intensive. To address this issue, we propose a novel weakly supervised method, RWSeg, that only requires labeling each object with one point. With these sparse weak labels, we introduce a unified framework with two branches that propagate semantic and instance information, respectively, to unknown regions using self-attention and a cross-graph random walk method. Specifically, we propose a Cross-graph Competing Random Walks (CRW) algorithm that encourages competition among different instance graphs to resolve ambiguities in closely placed objects, improving instance assignment accuracy. RWSeg generates high-quality instance-level pseudo labels. Experimental results on the ScanNet-v2 and S3DIS datasets show that our approach achieves performance comparable to fully supervised methods and outperforms previous weakly supervised methods by a substantial margin.
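The idea of spreading one labeled point per object through a graph can be sketched with an ordinary random walk with restart. This is a simplified stand-in under stated assumptions, not the CRW algorithm from the abstract: each seed is propagated independently, and "competition" is reduced to picking the seed with the highest score at each node. The function names, the `alpha` restart weight, and the toy chain graph are all hypothetical.

```python
def random_walk_scores(adjacency, seed, alpha=0.8, iterations=50):
    """Iterate s <- alpha * (mean of neighbor scores) + (1 - alpha) * restart."""
    n = len(adjacency)
    restart = [1.0 if i == seed else 0.0 for i in range(n)]
    scores = restart[:]
    for _ in range(iterations):
        scores = [
            alpha * sum(scores[j] for j in adjacency[i]) / len(adjacency[i])
            + (1.0 - alpha) * restart[i]
            for i in range(n)
        ]
    return scores


def assign_instances(adjacency, seeds):
    """Give each node the index of the seed whose walk scores it highest."""
    per_seed = [random_walk_scores(adjacency, s) for s in seeds]
    return [max(range(len(seeds)), key=lambda k: per_seed[k][i])
            for i in range(len(adjacency))]


# Toy chain of five points; points 0 and 4 each carry a one-point label.
chain = [[1], [0, 2], [1, 3], [2, 4], [3]]
labels = assign_instances(chain, seeds=[0, 4])
print(labels)
```

Nodes next to a seed are assigned to it. The middle node is equidistant from both seeds, which is exactly the kind of ambiguity a competing formulation is meant to resolve.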

    Learning-Based Biharmonic Augmentation for Point Cloud Classification

    Point cloud datasets often suffer from inadequate sample sizes in comparison to image datasets, making data augmentation challenging. While traditional methods, like rigid transformations and scaling, have limited potential for increasing dataset diversity because they cannot alter individual sample shapes, we introduce the Biharmonic Augmentation (BA) method. BA is a novel and efficient data augmentation technique that diversifies point cloud data by imposing smooth non-rigid deformations on existing 3D structures. This approach calculates biharmonic coordinates for the deformation function and learns diverse deformation prototypes. Utilizing a CoefNet, our method predicts coefficients to combine these prototypes, ensuring comprehensive deformation. Moreover, we present AdvTune, an advanced online augmentation system that integrates adversarial training. This system jointly refines the CoefNet and the classification network, facilitating the automated creation of adaptive shape deformations contingent on the learner's status. Comprehensive experimental analysis validates the superiority of Biharmonic Augmentation, showing notable performance improvements over prevailing point cloud augmentation techniques across varied network designs.
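Combining deformation prototypes with predicted coefficients can be sketched as a weighted sum of per-point displacement fields. This is a minimal sketch under stated assumptions: the linear blend, the function name `blend_deformation`, and the hand-picked coefficients are illustrative only. The biharmonic coordinates and the CoefNet named in the abstract are not reproduced here.

```python
def blend_deformation(points, prototypes, coefficients):
    """Add a coefficient-weighted sum of per-point displacements to each point."""
    deformed = []
    for i, point in enumerate(points):
        offset = [
            sum(c * proto[i][axis] for c, proto in zip(coefficients, prototypes))
            for axis in range(3)
        ]
        deformed.append(tuple(p + o for p, o in zip(point, offset)))
    return deformed


# Toy cloud of two points and two assumed prototypes (per-point displacements).
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
stretch_x = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
lift_z = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]

# Stand-in for network-predicted coefficients.
augmented = blend_deformation(points, [stretch_x, lift_z], [0.5, 0.25])
print(augmented)
```

Drawing different coefficient vectors gives different shapes from the same sample, which is the diversity that rigid transformations alone cannot provide.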

    REACTO: Reconstructing Articulated Objects from a Single Video

    In this paper, we address the challenge of reconstructing general articulated 3D objects from a single video. Existing works employing dynamic neural radiance fields have advanced the modeling of articulated objects like humans and animals from videos, but face challenges with piece-wise rigid general articulated objects due to limitations in their deformation models. To tackle this, we propose Quasi-Rigid Blend Skinning, a novel deformation model that enhances the rigidity of each part while maintaining flexible deformation of the joints. Our primary insight combines three distinct approaches: 1) an enhanced bone rigging system for improved component modeling, 2) the use of quasi-sparse skinning weights to boost part rigidity and reconstruction fidelity, and 3) the application of geodesic point assignment for precise motion and seamless deformation. Our method outperforms previous works in producing higher-fidelity 3D reconstructions of general articulated objects, as demonstrated on both real and synthetic datasets. Project page: https://chaoyuesong.github.io/REACTO

    HiCRISP: An LLM-based Hierarchical Closed-Loop Robotic Intelligent Self-Correction Planner

    The integration of Large Language Models (LLMs) into robotics has revolutionized human-robot interactions and autonomous task planning. However, these systems are often unable to self-correct during task execution, which hinders their adaptability in dynamic real-world environments. To address this issue, we present a Hierarchical Closed-loop Robotic Intelligent Self-correction Planner (HiCRISP), an innovative framework that enables robots to correct errors within individual steps during task execution. HiCRISP actively monitors and adapts the task execution process, addressing both high-level planning and low-level action errors. Extensive benchmark experiments, encompassing virtual and real-world scenarios, showcase HiCRISP's exceptional performance, positioning it as a promising solution for robotic task planning with LLMs.

    Performance Analysis of Uplink/Downlink Decoupled Access in Cellular-V2X Networks

    This paper develops an analytical framework to investigate the performance of uplink (UL) / downlink (DL) decoupled access in cellular vehicle-to-everything (C-V2X) networks, in which a vehicle's UL and DL can be connected to different macro/small base stations (MBSs/SBSs) separately. Using stochastic geometry, the UL/DL decoupled access C-V2X network is modeled as a Cox process, and we obtain the following theoretical results: 1) the probability of the different UL/DL joint association cases, i.e., both the UL and DL are associated with different MBSs or SBSs, or they are associated with different types of BSs; 2) the distance distribution of a vehicle to its serving BSs in each case; 3) the spectral efficiency of the UL/DL in each case; and 4) the UL/DL coverage probability of the MBS/SBS. The analyses reveal the insights and performance gain of UL/DL decoupled access. Through extensive simulations, the accuracy of the proposed analytical framework is validated. Both the analytical and simulation results show that UL/DL decoupled access can improve spectral efficiency. The theoretical results can be directly used for estimating the statistical performance of a UL/DL decoupled access C-V2X network.
    Comment: 15 pages, 10 figures