1,322 research outputs found

    Scalable Approach to Uncertainty Quantification and Robust Design of Interconnected Dynamical Systems

    Development of robust dynamical systems and networks, such as autonomous aircraft systems capable of accomplishing complex missions, faces challenges due to dynamically evolving uncertainties arising from model uncertainties, the necessity to operate in a hostile, cluttered urban environment, and the distributed and dynamic nature of the communication and computation resources. Model-based robust design is difficult because of the complexity of the hybrid dynamic models, which include continuous vehicle dynamics and discrete models of computation and communication, and because of the size of the problem. We will overview recent advances in methodology and tools to model, analyze, and design robust autonomous aerospace systems operating in uncertain environments, with emphasis on efficient uncertainty quantification and robust design, using mission case studies that include model-based target tracking and search and trajectory planning in uncertain urban environments. To show that the methodology is generally applicable to uncertain dynamical systems, we will also show examples of applying the new methods to efficient uncertainty quantification of energy usage in buildings and stability assessment of interconnected power networks.

    Vision-Based Localization Algorithm Based on Landmark Matching, Triangulation, Reconstruction, and Comparison

    Many generic position-estimation algorithms are vulnerable to ambiguity introduced by nonunique landmarks. Also, the available high-dimensional image data is not fully used when these techniques are extended to vision-based localization. This paper presents the landmark matching, triangulation, reconstruction, and comparison (LTRC) global localization algorithm, which is reasonably immune to ambiguous landmark matches. It extracts natural landmarks for the (rough) matching stage before generating the list of possible position estimates through triangulation. Reconstruction and comparison then rank the possible estimates. The LTRC algorithm has been implemented in an interpreted language on a robot equipped with a panoramic vision system. Empirical data show a remarkable improvement in accuracy when compared with the established random sample consensus method. LTRC is also robust against inaccurate map data.
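
    The triangulation stage described in this abstract can be illustrated with a minimal sketch. It is not the LTRC implementation: it assumes a 2D map, bearings already expressed in the map frame (that is, a known heading), and hypothetical names (`triangulate`, `candidate_positions`, `landmark_map`). Each observed bearing may match several map landmarks, so every pairing yields its own candidate position, which a later stage would have to rank.

```python
import math
from itertools import product


def triangulate(lm_a, bearing_a, lm_b, bearing_b):
    """Intersect two bearing rays; return (x, y), or None if no valid fix."""
    dax, day = math.cos(bearing_a), math.sin(bearing_a)
    dbx, dby = math.cos(bearing_b), math.sin(bearing_b)
    # Each landmark equals position + range * direction, so
    # range_a * d_a - range_b * d_b = lm_a - lm_b (a 2x2 linear system).
    det = dbx * day - dax * dby
    if abs(det) < 1e-9:
        return None  # bearings are (nearly) parallel
    rx, ry = lm_a[0] - lm_b[0], lm_a[1] - lm_b[1]
    range_a = (dbx * ry - dby * rx) / det
    range_b = (dax * ry - day * rx) / det
    if range_a <= 0 or range_b <= 0:
        return None  # a landmark would lie behind the observer
    return (lm_a[0] - range_a * dax, lm_a[1] - range_a * day)


def candidate_positions(obs_a, obs_b, landmark_map):
    """List position candidates for every pairing of ambiguous matches.

    Each observation is (bearing_in_radians, [possible landmark ids]).
    """
    bearing_a, ids_a = obs_a
    bearing_b, ids_b = obs_b
    candidates = []
    for id_a, id_b in product(ids_a, ids_b):
        if id_a == id_b:
            continue
        pos = triangulate(landmark_map[id_a], bearing_a,
                          landmark_map[id_b], bearing_b)
        if pos is not None:
            candidates.append(((id_a, id_b), pos))
    return candidates


# Hypothetical map: two doors look alike, so the first bearing is ambiguous.
landmark_map = {"door1": (1.0, 0.0), "door2": (4.0, 1.0), "window": (0.0, 2.0)}
candidates = candidate_positions((0.0, ["door1", "door2"]),
                                 (math.pi / 2, ["window"]),
                                 landmark_map)
```

    In this toy map the ambiguous door match produces two geometrically valid candidates, which is the situation the reconstruction and comparison stages are said to resolve.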

    The very same thing: Extending the object token concept to incorporate causal constraints on individual identity

    The contributions of feature recognition, object categorization, and recollection of episodic memories to the re-identification of a perceived object as the very same thing encountered in a previous perceptual episode are well understood in terms of both cognitive-behavioral phenomenology and neurofunctional implementation. Human beings do not, however, rely solely on features and context to re-identify individuals; in the presence of featural change and similarly-featured distractors, people routinely employ causal constraints to establish object identities. Based on available cognitive and neurofunctional data, the standard object-token based model of individual re-identification is extended to incorporate the construction of unobserved and hence fictive causal histories (FCHs) of observed objects by the pre-motor action planning system. Cognitive-behavioral and implementation-level predictions of this extended model and methods for testing them are outlined. It is suggested that functional deficits in the construction of FCHs are associated with clinical outcomes in both Autism Spectrum Disorders and later-stage Alzheimer's disease.

    Robust Digital-Twin Localization via An RGBD-based Transformer Network and A Comprehensive Evaluation on a Mobile Dataset

    The potential of digital-twin technology, involving the creation of precise digital replicas of physical objects, to reshape AR experiences in 3D object tracking and localization scenarios is significant. However, enabling robust 3D object tracking in dynamic mobile AR environments remains a formidable challenge. These scenarios often require a more robust pose estimator capable of handling the inherent sensor-level measurement noise. In this paper, recognizing the challenges of comprehensive solutions in existing literature, we propose a transformer-based 6DoF pose estimator designed to achieve state-of-the-art accuracy under real-world noisy data. To systematically validate the new solution's performance against the prior art, we also introduce a novel RGBD dataset called Digital Twin Tracking Dataset v2 (DTTD2), which is focused on digital-twin object tracking scenarios. Expanded from an existing DTTD v1 (DTTD1), the new dataset adds digital-twin data captured using a cutting-edge mobile RGBD sensor suite on Apple iPhone 14 Pro, expanding the applicability of our approach to iPhone sensor data. Through extensive experimentation and in-depth analysis, we illustrate the effectiveness of our methods under significant depth data errors, surpassing the performance of existing baselines. Code and dataset are made publicly available at: https://github.com/augcog/DTTD

    Automatic annotation of tennis games: An integration of audio, vision, and learning

    Fully automatic annotation of tennis games using broadcast video is a task with great potential but enormous challenges. In this paper we describe our approach to this task, which integrates computer vision, machine listening, and machine learning. At the low-level processing stage, we improve upon our previously proposed state-of-the-art tennis ball tracking algorithm and employ audio signal processing techniques to detect key events and construct features for classifying the events. At the high-level analysis stage, we model event classification as a sequence labelling problem, and investigate four machine learning techniques using simulated event sequences. Finally, we evaluate our proposed approach on three real-world tennis games, and discuss the interplay between audio, vision and learning. To the best of our knowledge, our system is the only one that can annotate tennis games at such a detailed level.

    Perceptual grouping based on iterative multi-scale tensor voting

    We propose a new approach for perceptual grouping of oriented segments in highly cluttered images based on tensor voting. Segments are represented as second-order tensors and communicate with each other through a voting scheme that incorporates the Gestalt principles of visual perception. An iterative scheme has been devised which removes noise segments in a conservative way using multi-scale analysis and re-voting. We have tested our approach on data sets composed of real objects in real backgrounds. Our experimental results indicate that our method can successfully segment objects in images with up to twenty times more noise segments than object ones.
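
    The second-order tensor representation mentioned in this abstract can be illustrated with a small sketch. This is not the paper's voting scheme: it only shows the standard encoding of an orientation as a 2x2 stick tensor, and how summing weighted tensors yields a saliency (the eigenvalue gap) and a dominant orientation. The function names are hypothetical, and the weights stand in for whatever distance- and curvature-dependent decay a full voting field would apply.

```python
import math


def stick_tensor(theta):
    """Encode orientation theta as v v^T with v = (cos theta, sin theta).

    Returns (a, b, c), the entries of the symmetric matrix [[a, b], [b, c]].
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (cos_t * cos_t, cos_t * sin_t, sin_t * sin_t)


def accumulate(votes):
    """Sum weighted stick tensors; votes is an iterable of (theta, weight)."""
    a = b = c = 0.0
    for theta, weight in votes:
        ta, tb, tc = stick_tensor(theta)
        a += weight * ta
        b += weight * tb
        c += weight * tc
    return (a, b, c)


def stick_saliency(tensor):
    """Eigenvalue gap (lambda1 - lambda2) of the accumulated tensor."""
    a, b, c = tensor
    return 2.0 * math.hypot((a - c) / 2.0, b)


def dominant_orientation(tensor):
    """Angle of the principal eigenvector, in radians."""
    a, b, c = tensor
    return 0.5 * math.atan2(2.0 * b, a - c)


# Agreeing orientations reinforce each other; perpendicular ones cancel.
aligned = accumulate([(0.3, 1.0), (0.3, 1.0)])
conflicting = accumulate([(0.0, 1.0), (math.pi / 2, 1.0)])
```

    A segment whose accumulated tensor has a small eigenvalue gap receives little consistent support from its neighbors, which is the kind of evidence a noise-removal step could act on.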

    Efficient Zero-shot Visual Search via Target and Context-aware Transformer

    Visual search is a ubiquitous challenge in natural vision, including daily tasks such as finding a friend in a crowd or searching for a car in a parking lot. Humans rely heavily on relevant target features to perform goal-directed visual search. Meanwhile, context is of critical importance for locating a target object in complex scenes, as it helps narrow down the search area and makes the search process more efficient. However, few works have combined both target and context information in computational models of visual search. Here we propose a zero-shot deep learning architecture, TCT (Target and Context-aware Transformer), that modulates self-attention in the Vision Transformer with target- and context-relevant information to enable human-like zero-shot visual search performance. Target modulation is computed as patch-wise local relevance between the target and search images, whereas contextual modulation is applied in a global fashion. We conduct visual search experiments on TCT and other competitive visual search models on three natural scene datasets with varying levels of difficulty. TCT demonstrates human-like performance in terms of search efficiency and beats the SOTA models in challenging visual search tasks. Importantly, TCT generalizes well across datasets with novel objects without retraining or fine-tuning. Furthermore, we also introduce a new dataset to benchmark models for invariant visual search under incongruent contexts. TCT manages to search flexibly via target and context modulation, even under incongruent contexts.