On learning spatial sequences with the movement of attention
In this paper we start with a simple question: how is it possible that humans can recognize different movements over the skin given only a prior visual experience of them? More generally, what representation of spatial sequences is invariant to scale, rotation, and translation across different modalities? To answer, we rethink the mathematical representation of spatial sequences, argue against the minimum description length principle, and focus on the movements of attention. We advance the idea that spatial sequences must be represented at different levels of abstraction; this adds redundancy but is necessary for recognition and generalization. To address the open question of how these abstractions are formed, we propose two hypotheses: the first invites exploring selectionist learning, instead of fitting parameters in some model; the second proposes finding new data structures, rather than neural network architectures, to efficiently store and operate over redundant features to be further selected. Movements of attention are central to human cognition, and the lessons learned should be applied to new and better learning algorithms.
Comment: 10 pages, 3 figures
Automatic Robot Hand-Eye Calibration Enabled by Learning-Based 3D Vision
Hand-eye calibration, as a fundamental task in vision-based robotic systems,
aims to estimate the transformation matrix between the coordinate frame of the
camera and the robot flange. Most approaches to hand-eye calibration rely on
external markers or human assistance. We propose Look at Robot Base Once (LRBO), a novel methodology that addresses the hand-eye calibration problem without external calibration objects or human support, relying instead on the robot base itself. Using point clouds of the robot base, a transformation matrix from the coordinate frame of the camera to the robot base is established as I = AXB. To
this end, we exploit learning-based 3D detection and registration algorithms to
estimate the location and orientation of the robot base. The robustness and
accuracy of the method are quantified by ground-truth-based evaluation, and the
accuracy result is compared with other 3D vision-based calibration methods. To
assess the feasibility of our methodology, we carried out experiments utilizing
a low-cost structured light scanner across varying joint configurations and
groups of experiments. The proposed hand-eye calibration method achieved a
translation deviation of 0.930 mm and a rotation deviation of 0.265 degrees
according to the experimental results. Additionally, the 3D reconstruction
experiments demonstrated a rotation error of 0.994 degrees and a position error
of 1.697 mm. Moreover, our method can be completed in about 1 second, making it the fastest of the 3D hand-eye calibration methods compared. Code is released at github.com/leihui6/LRBO.
Comment: 17 pages, 19 figures, 6 tables, submitted to MSS
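As a rough illustration of the I = AXB formulation above (this is our sketch, not the authors' released code, and the frame conventions are our assumption): if A is the flange pose in the robot-base frame from forward kinematics and B is the robot-base pose in the camera frame from 3D registration, the unknown camera-to-flange transform X follows in closed form.

```python
import numpy as np

def solve_hand_eye(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Solve I = A @ X @ B for X, given 4x4 homogeneous transforms.

    Assumed conventions (not taken from the paper):
      A: flange pose in the robot-base frame (forward kinematics)
      B: robot-base pose in the camera frame (from 3D registration)
    Then X = inv(A) @ inv(B) maps the camera frame to the flange frame.
    """
    return np.linalg.inv(A) @ np.linalg.inv(B)

# Hypothetical example: identity kinematics and a pure-translation base pose.
A = np.eye(4)
B = np.eye(4)
B[:3, 3] = [0.5, 0.0, 0.3]  # base offset from the camera, in meters
X = solve_hand_eye(A, B)
assert np.allclose(A @ X @ B, np.eye(4))  # consistent with I = A X B
```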
Point cloud registration: a mini-review of current state, challenging issues and future directions
A point cloud is a set of data points in space. Point cloud registration is the process of aligning two or more 3D point clouds collected from different locations of the same scene. Registration enables point cloud data to be transformed into a common coordinate system, forming an integrated dataset that represents the surveyed scene. In addition to methods that rely on targets placed in the scene before data capture, various registration methods are available that use only the captured point cloud data. Until recently, cloud-to-cloud registration methods have generally centered on a coarse-to-fine optimization strategy. The challenges and limitations inherent in this process have shaped the development of point cloud registration and the associated software tools over the past three decades. Following the success of deep learning methods applied to imagery data, attempts at applying these approaches to point cloud datasets have received much attention. This study reviews and comments on recent developments in targetless point cloud registration, explores remaining issues, and on that basis makes recommendations for potential future studies on this topic.
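To make the coarse-to-fine idea concrete, here is a minimal sketch of the fine-alignment stage: a basic point-to-point ICP loop (our illustration, not code from any method the review covers). It assumes a reasonable coarse initialization has already been applied to the source cloud.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(source: np.ndarray, target: np.ndarray, iters: int = 50) -> np.ndarray:
    """Point-to-point ICP. source: (N,3), target: (M,3).
    Returns a 4x4 homogeneous transform aligning source to target."""
    tree = cKDTree(target)
    T = np.eye(4)
    src = source.copy()
    for _ in range(iters):
        # 1. Correspondences: nearest target point for each source point.
        _, idx = tree.query(src)
        dst = target[idx]
        # 2. Best rigid transform for these pairs (Kabsch / SVD solution).
        mu_s, mu_d = src.mean(0), dst.mean(0)
        H = (src - mu_s).T @ (dst - mu_d)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:  # guard against reflections
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_d - R @ mu_s
        # 3. Apply the increment and accumulate the total transform.
        src = src @ R.T + t
        step = np.eye(4)
        step[:3, :3], step[:3, 3] = R, t
        T = step @ T
    return T
```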
Ambient Intelligence for Next-Generation AR
Next-generation augmented reality (AR) promises a high degree of
context-awareness - a detailed knowledge of the environmental, user, social and
system conditions in which an AR experience takes place. This will facilitate
both the closer integration of the real and virtual worlds, and the provision
of context-specific content or adaptations. However, environmental awareness in
particular is challenging to achieve using AR devices alone; not only is these mobile devices' view of an environment spatially and temporally limited, but the data obtained by onboard sensors are frequently inaccurate and incomplete.
This, combined with the fact that many aspects of core AR functionality and
user experiences are impacted by properties of the real environment, motivates
the use of ambient IoT devices, wireless sensors and actuators placed in the
surrounding environment, for the measurement and optimization of environment
properties. In this book chapter we categorize and examine the wide variety of
ways in which these IoT sensors and actuators can support or enhance AR
experiences, including quantitative insights and proof-of-concept systems that
will inform the development of future solutions. We outline the challenges and
opportunities associated with several important research directions which must
be addressed to realize the full potential of next-generation AR.
Comment: This is a preprint of a book chapter which will appear in the Springer Handbook of the Metaverse
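As a toy illustration of why ambient sensing helps (our sketch, not drawn from the chapter): an AR device's noisy onboard estimate of an environment property, say ambient illuminance, can be fused with a fixed IoT sensor's reading by inverse-variance weighting, which down-weights whichever source is less reliable. All values and variances below are placeholder assumptions.

```python
def fuse(onboard: float, var_onboard: float,
         ambient: float, var_ambient: float) -> tuple[float, float]:
    """Inverse-variance weighted fusion of two independent estimates
    of the same quantity (e.g., illuminance in lux). Hypothetical values."""
    w_on = 1.0 / var_onboard
    w_am = 1.0 / var_ambient
    fused = (w_on * onboard + w_am * ambient) / (w_on + w_am)
    return fused, 1.0 / (w_on + w_am)  # fused estimate and its variance

# Onboard camera-based estimate is noisy; the ceiling-mounted lux sensor is not.
est, var = fuse(onboard=420.0, var_onboard=80.0, ambient=510.0, var_ambient=5.0)
print(f"fused illuminance: {est:.1f} lux (variance {var:.1f})")
```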
Localization and Mapping for Self-Driving Vehicles: A Survey
The upsurge of autonomous vehicles in the automobile industry will lead to better driving experiences while also enabling users to solve challenging navigation problems. Reaching such capabilities will require significant technological attention and the flawless execution of various complex tasks, one of which is ensuring robust localization and mapping. Recent surveys have not provided a meaningful and comprehensive description of the current approaches in this field. Accordingly, this review is intended to provide adequate coverage of the problems affecting autonomous vehicles in this area, by examining the most recent methods for mapping and localization as well as related feature extraction and data security problems. First, we discuss contemporary methods of extracting relevant features from equipped sensors and their categorization as semantic, non-semantic, and deep learning methods. We conclude that representativeness, low cost, and accessibility are crucial constraints in the choice of methods to be adopted for localization and mapping tasks. Second, the survey focuses on methods to build a vehicle's environment map, considering both the commercial and the academic solutions available. The analysis distinguishes between two types of environment, known and unknown, and presents solutions for each case. Third, the survey explores different approaches to vehicle localization and classifies them according to their mathematical characteristics and priorities. Each section concludes by presenting the related challenges and some future directions. The article also highlights the security problems likely to be encountered in self-driving vehicles, with an assessment of possible defense mechanisms that could prevent security attacks on vehicles. Finally, the article ends with a discussion of the potential impacts of autonomous driving, spanning energy consumption and emission reduction, sound and light pollution, integration into smart cities, infrastructure optimization, and software refinement. This thorough investigation aims to foster a comprehensive understanding of the diverse implications of autonomous driving across various domains.
Rapid Localization and Mapping Method Based on Adaptive Particle Filters
With the development of autonomous vehicles, localization and mapping technologies have become crucial for equipping a vehicle with the knowledge needed for its operation. In this paper, we extend our previous work by proposing a localization and mapping architecture for autonomous vehicles that does not rely on GPS, particularly in environments such as tunnels, under bridges, urban canyons, and dense tree canopies. The proposed approach consists of two parts. First, a K-means algorithm is employed to extract features from LiDAR scenes to create a local map of each scan. We then concatenate the local maps to create a global map of the environment and facilitate data association between frames. Second, the main localization task is performed by an adaptive particle filter that works in four steps: (a) generation of particles around an initial state (provided by the GPS); (b) updating the particle positions with the motion (translation and rotation) of the vehicle measured by an inertial measurement unit; (c) selection of the best candidate particles by observing, at each timestamp, the match rate (also called the particle weight) between the local map (with the real-time distances to objects) and the corresponding chunks of the global map; and (d) averaging the selected particles to derive the estimated position and, finally, applying a resampling method to the particles to ensure the reliability of the position estimate. The performance of the proposed technique is investigated on different sequences of the KITTI and PandaSet raw data with different environmental setups, weather conditions, and seasonal changes. The obtained results validate the performance of the proposed approach in terms of speed and representativeness of the extracted features for real-time localization, in comparison with other state-of-the-art methods.
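A minimal sketch of the four-step loop described above (our simplification, not the paper's implementation; the motion model, the map-matching score, and all numeric values are placeholder assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500  # number of particles

def match_rate(state: np.ndarray) -> float:
    """Placeholder for the paper's map-matching score: how well the
    local LiDAR map at `state` (x, y, heading) fits the global map."""
    target = np.array([1.0, 2.0, 0.1])  # hypothetical true pose
    return np.exp(-np.linalg.norm(state - target))

# (a) Generate particles around an initial (e.g., GPS-provided) state.
particles = rng.normal(loc=[0.0, 0.0, 0.0], scale=[2.0, 2.0, 0.2], size=(N, 3))

for _ in range(20):  # filter iterations
    # (b) Motion update from the IMU: translation + rotation, with noise.
    motion = np.array([0.05, 0.1, 0.005])
    particles += motion + rng.normal(scale=[0.02, 0.02, 0.002], size=(N, 3))

    # (c) Weight each particle by its local-to-global map match rate.
    weights = np.array([match_rate(p) for p in particles])
    weights /= weights.sum()

    # (d) Estimate the pose as the weighted mean, then resample.
    estimate = weights @ particles
    idx = rng.choice(N, size=N, p=weights)  # multinomial resampling
    particles = particles[idx]

print("estimated pose (x, y, heading):", estimate)
```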