251 research outputs found

    Beyond pairwise clustering

    Get PDF
    We consider the problem of clustering in domains where the affinity relations are not dyadic (pairwise), but rather triadic, tetradic or higher. The problem is an instance of the hypergraph partitioning problem. We propose a two-step algorithm for solving this problem. In the first step we use a novel scheme to approximate the hypergraph using a weighted graph. In the second step a spectral partitioning algorithm is used to partition the vertices of this graph. The algorithm is capable of handling hyperedges of all orders including order two, thus incorporating information of all orders simultaneously. We present a theoretical analysis that relates our algorithm to an existing hypergraph partitioning algorithm and explain the reasons for its superior performance. We report the performance of our algorithm on a variety of computer vision problems and compare it to several existing hypergraph partitioning algorithms

    Physical Biology of the Materials-Microorganism Interface.

    Get PDF
    Future solar-to-chemical production will rely upon a deep understanding of the material-microorganism interface. Hybrid technologies, which combine inorganic semiconductor light harvesters with biological catalysis to transform light, air, and water into chemicals, already demonstrate a wide product scope and energy efficiencies surpassing that of natural photosynthesis. But optimization to economic competitiveness and fundamental curiosity beg for answers to two basic questions: (1) how do materials transfer energy and charge to microorganisms, and (2) how do we design for bio- and chemocompatibility between these seemingly unnatural partners? This Perspective highlights the state-of-the-art and outlines future research paths to inform the cadre of spectroscopists, electrochemists, bioinorganic chemists, material scientists, and biologists who will ultimately solve these mysteries

    UA-DETRAC: A New Benchmark and Protocol for Multi-Object Detection and Tracking

    Full text link
    In recent years, numerous effective multi-object tracking (MOT) methods are developed because of the wide range of applications. Existing performance evaluations of MOT methods usually separate the object tracking step from the object detection step by using the same fixed object detection results for comparisons. In this work, we perform a comprehensive quantitative study on the effects of object detection accuracy to the overall MOT performance, using the new large-scale University at Albany DETection and tRACking (UA-DETRAC) benchmark dataset. The UA-DETRAC benchmark dataset consists of 100 challenging video sequences captured from real-world traffic scenes (over 140,000 frames with rich annotations, including occlusion, weather, vehicle category, truncation, and vehicle bounding boxes) for object detection, object tracking and MOT system. We evaluate complete MOT systems constructed from combinations of state-of-the-art object detection and object tracking methods. Our analysis shows the complex effects of object detection accuracy on MOT system performance. Based on these observations, we propose new evaluation tools and metrics for MOT systems that consider both object detection and object tracking for comprehensive analysis.Comment: 18 pages, 11 figures, accepted by CVI

    X-PDNet: Accurate Joint Plane Instance Segmentation and Monocular Depth Estimation with Cross-Task Distillation and Boundary Correction

    Full text link
    Segmentation of planar regions from a single RGB image is a particularly important task in the perception of complex scenes. To utilize both visual and geometric properties in images, recent approaches often formulate the problem as a joint estimation of planar instances and dense depth through feature fusion mechanisms and geometric constraint losses. Despite promising results, these methods do not consider cross-task feature distillation and perform poorly in boundary regions. To overcome these limitations, we propose X-PDNet, a framework for the multitask learning of plane instance segmentation and depth estimation with improvements in the following two aspects. Firstly, we construct the cross-task distillation design which promotes early information sharing between dual-tasks for specific task improvements. Secondly, we highlight the current limitations of using the ground truth boundary to develop boundary regression loss, and propose a novel method that exploits depth information to support precise boundary region segmentation. Finally, we manually annotate more than 3000 images from Stanford 2D-3D-Semantics dataset and make available for evaluation of plane instance segmentation. Through the experiments, our proposed methods prove the advantages, outperforming the baseline with large improvement margins in the quantitative results on the ScanNet and the Stanford 2D-3D-S dataset, demonstrating the effectiveness of our proposals.Comment: Accepted to BMVC 202

    Gramian Attention Heads are Strong yet Efficient Vision Learners

    Full text link
    We introduce a novel architecture design that enhances expressiveness by incorporating multiple head classifiers (\ie, classification heads) instead of relying on channel expansion or additional building blocks. Our approach employs attention-based aggregation, utilizing pairwise feature similarity to enhance multiple lightweight heads with minimal resource overhead. We compute the Gramian matrices to reinforce class tokens in an attention layer for each head. This enables the heads to learn more discriminative representations, enhancing their aggregation capabilities. Furthermore, we propose a learning algorithm that encourages heads to complement each other by reducing correlation for aggregation. Our models eventually surpass state-of-the-art CNNs and ViTs regarding the accuracy-throughput trade-off on ImageNet-1K and deliver remarkable performance across various downstream tasks, such as COCO object instance segmentation, ADE20k semantic segmentation, and fine-grained visual classification datasets. The effectiveness of our framework is substantiated by practical experimental results and further underpinned by generalization error bound. We release the code publicly at: https://github.com/Lab-LVM/imagenet-models
    corecore