
    Omnidirectional video stabilisation on a virtual camera using sensor fusion

    This paper presents a method for robustly stabilising omnidirectional video in the presence of significant rotations and translations by creating a virtual camera and using a combination of sensor fusion and scene tracking. Real-time rotational movements of the camera are measured by an Inertial Measurement Unit (IMU), which provides an initial estimate of the ego-motion of the camera platform. Image registration is then used to refine these estimates. The calculated ego-motion is then used to adjust an extract of the omnidirectional video, forming a virtual camera that stays focused on the scene. Experiments show the technique is effective under challenging ego-motions and overcomes deficiencies associated with unimodal approaches, making it robust and suitable for many surveillance applications.
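    The abstract describes fusing an IMU rotation estimate with an image-registration refinement and then counter-rotating a virtual camera so it stays locked on the scene. The sketch below illustrates only that fusion and counter-rotation step; the function names, quaternion convention, and use of SciPy rotations are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of IMU + image-registration fusion for a stabilised virtual
# camera; assumes the IMU supplies a quaternion and registration supplies a
# small corrective rotation vector (both interfaces are hypothetical).
import numpy as np
from scipy.spatial.transform import Rotation as R

def fuse_ego_motion(imu_quat, correction_rotvec):
    """Refine the IMU rotation estimate with an image-registration correction.

    imu_quat: (x, y, z, w) quaternion from the IMU (initial ego-motion estimate).
    correction_rotvec: small rotation vector (radians) obtained by registering
        the current omnidirectional frame against the previous one.
    """
    return R.from_rotvec(correction_rotvec) * R.from_quat(imu_quat)

def stabilise_direction(view_dir, ego_rotation):
    """Counter-rotate the virtual camera's viewing direction so it remains
    fixed on the scene despite platform rotation."""
    return ego_rotation.inv().apply(view_dir)

# Usage: keep the virtual camera pointed along the original viewing direction.
ego = fuse_ego_motion(imu_quat=[0.0, 0.0, 0.259, 0.966],    # ~30 deg yaw
                      correction_rotvec=[0.0, 0.0, 0.02])   # small refinement
print(stabilise_direction(np.array([1.0, 0.0, 0.0]), ego))
```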

    Structure-from-motion in Spherical Video using the von Mises-Fisher Distribution

    In this paper, we present a complete pipeline for computing structure-from-motion from sequences of spherical images. We revisit problems from multiview geometry in the context of spherical images. In particular, we propose methods suited to spherical camera geometry for the spherical-n-point problem (estimating camera pose for a spherical image) and calibrated spherical reconstruction (estimating the position of a 3-D point from multiple spherical images). We introduce a new probabilistic interpretation of spherical structure-from-motion which uses the von Mises-Fisher distribution to model noise in spherical feature point positions. This model provides an alternative objective function that we use in bundle adjustment. We evaluate our methods quantitatively and qualitatively on both synthetic and real-world data and show that our methods developed for spherical images outperform straightforward adaptations of methods developed for perspective images. As an application of our method, we use the structure-from-motion output to stabilise the viewing direction in fully spherical video.
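    The von Mises-Fisher model replaces pixel reprojection error with an objective over unit vectors on the sphere. The snippet below is a minimal sketch of that idea, assuming features are stored as unit direction vectors; the helper names and the fixed concentration parameter kappa are illustrative and do not reproduce the paper's code.

```python
# Sketch of a von Mises-Fisher style cost for spherical bundle adjustment.
# Each observation mu and prediction x is a unit vector on the sphere;
# minimising -kappa * <mu, x> pulls predicted directions towards observed
# ones, the spherical analogue of minimising reprojection error.
import numpy as np

def predict_direction(point_3d, cam_rotation, cam_centre):
    """Unit direction from the spherical camera centre to a 3-D point,
    expressed in the camera frame."""
    v = cam_rotation @ (point_3d - cam_centre)
    return v / np.linalg.norm(v)

def vmf_cost(observed_dirs, predicted_dirs, kappa=100.0):
    """Negative log-likelihood up to constants: -kappa * sum of dot products
    over all (observed, predicted) direction pairs (rows of the two arrays)."""
    dots = np.einsum('ij,ij->i', observed_dirs, predicted_dirs)
    return float(-kappa * dots.sum())
```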

    Teleoperated visual inspection and surveillance with unmanned ground and aerial vehicles

    This paper introduces our robotic system named UGAV (Unmanned Ground-Air Vehicle), consisting of two semi-autonomous robot platforms, an Unmanned Ground Vehicle (UGV) and an Unmanned Aerial Vehicle (UAV). The paper focuses on three topics of inspection with the combined UGV and UAV: (A) teleoperated control by means of cell phones or smartphones, with a new concept of automatic configuration of the smartphone based on an RKI-XML description of the vehicles' control capabilities, (B) the camera and vision system, with a focus on real-time feature extraction, e.g. for tracking of the UAV, and (C) the architecture and hardware of the UAV.
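    Topic (B) hinges on real-time feature extraction for tracking the UAV in the camera feed. The sketch below shows one generic way to do that with OpenCV corner detection and pyramidal Lucas-Kanade optical flow; it is an assumed off-the-shelf approach for illustration, not the system described in the paper.

```python
# Generic KLT-style feature tracking between two greyscale frames; assumes
# corners can be found in the previous frame and frames are numpy uint8 arrays.
import cv2
import numpy as np

def track_features(prev_gray, curr_gray, prev_pts=None):
    """Detect corners in the previous frame (if none are given) and track them
    into the current frame with pyramidal Lucas-Kanade optical flow."""
    if prev_pts is None or len(prev_pts) == 0:
        prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                           qualityLevel=0.01, minDistance=7)
    curr_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                   prev_pts, None)
    good = status.ravel() == 1          # keep only successfully tracked points
    return prev_pts[good], curr_pts[good]
```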

    Local Features, Structure-from-motion and View Synthesis in Spherical Video

    This thesis addresses the problem of synthesising new views from spherical video or image sequences. We propose an interest point detector and feature descriptor that allow us to robustly match local features between pairs of spherical images, and use this as part of a structure-from-motion pipeline that allows us to estimate camera pose from a spherical video sequence. With pose estimates to hand, we propose methods for view stabilisation and novel viewpoint synthesis. In Chapter 3 we describe our contribution in the area of feature detection and description in spherical images. First, we present a novel representation for spherical images which uses a discrete geodesic grid composed of hexagonal pixels. Second, we extend the BRISK binary descriptor to the sphere, proposing methods for multiscale corner detection, sub-pixel position and sub-octave scale refinement, and descriptor construction in the tangent space to the sphere. In Chapter 4 we describe our contributions in the area of spherical structure-from-motion. We revisit problems from multiview geometry in the context of spherical images. We propose methods suited to spherical camera geometry for the spherical-n-point problem and calibrated spherical reconstruction. We introduce a new probabilistic interpretation of spherical structure-from-motion which uses the von Mises-Fisher distribution to model noise in spherical feature point positions. This model provides an alternative objective function that we use in bundle adjustment. In Chapter 5 we describe our contributions in the area of view synthesis from spherical images. We exploit the camera pose estimates made by our pipeline and use these in two view synthesis applications. The first is view stabilisation, where we remove the effect of viewing direction changes often present in first-person video. Second, we propose a method for synthesising novel viewpoints.
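    Two geometric building blocks recur throughout this pipeline: lifting image samples onto the unit sphere and working in the tangent plane at a keypoint, where a BRISK-style sampling pattern can be laid out. The sketch below illustrates both for a plain equirectangular parameterisation; the thesis itself uses a hexagonal geodesic grid, so this mapping and the gnomonic-style projection are stand-ins, not the thesis's representation.

```python
# Sketch of two spherical-image geometry helpers: equirectangular pixel to
# unit-sphere direction, and projection of a neighbouring direction into the
# tangent plane at a keypoint. Conventions (longitude/latitude ranges, axis
# order) are illustrative assumptions.
import numpy as np

def equirect_to_sphere(u, v, width, height):
    """Pixel (u, v) -> unit vector, with longitude in [-pi, pi] and
    latitude in [-pi/2, pi/2]."""
    lon = (u / width) * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (v / height) * np.pi
    return np.array([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)])

def project_to_tangent(keypoint_dir, neighbour_dir):
    """Gnomonic-style projection of a neighbouring unit direction into the
    plane tangent to the sphere at the keypoint direction; the returned
    offset is orthogonal to keypoint_dir, i.e. it lies in the tangent plane."""
    scaled = neighbour_dir / np.dot(keypoint_dir, neighbour_dir)
    return scaled - keypoint_dir
```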

    Metric and appearance based visual SLAM for mobile robots

    Simultaneous Localization and Mapping (SLAM) is essential to the autonomy of mobile robots and has been studied extensively during the last two decades. It is the process of building the map of an unknown environment while concurrently determining the location of the robot within that map. Different kinds of sensors, such as the Global Positioning System (GPS), Inertial Measurement Units (IMU), laser range finders and sonar, are used for data acquisition in SLAM. In recent years, passive visual sensors have been utilized in the visual SLAM (vSLAM) problem because of their increasing ubiquity. This thesis is concerned with the metric and appearance-based vSLAM problems for mobile robots. From the point of view of metric-based vSLAM, a performance improvement technique is developed: template-matching-based video stabilization is integrated with the Harris corner detector, and extracting Harris corner features from the stabilized video consistently increases the accuracy of localization. Data coming from a video camera and odometry are fused in an Extended Kalman Filter (EKF) to determine the pose of the robot and build the map of the environment. Simulation results validate the performance improvement obtained by the proposed technique. Moreover, a visual perception system is proposed for appearance-based vSLAM and used for under-vehicle classification. The proposed system consists of three main parts: monitoring, detection and classification. In the first part, a new catadioptric camera system is designed in which a perspective camera points downwards at a convex mirror mounted on the body of a mobile robot. Thanks to the catadioptric mirror, scenes opposite the direction of the camera's optical axis can be viewed. In the second part, speeded-up robust features (SURF) are used to detect hidden objects underneath vehicles. The fast appearance-based mapping algorithm (FAB-MAP) is then exploited in the third part to classify the means of transportation. Experimental results show the feasibility of the proposed system. The proposed solution is implemented using a non-holonomic mobile robot. In the implementation, the undersides of tables in the laboratory stand in for vehicle undersides, and a database of different under-vehicle images is used. All the algorithms are implemented in Microsoft Visual C++ and OpenCV 2.4.4.
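    The metric vSLAM part fuses odometry and camera data in an Extended Kalman Filter. The sketch below shows the generic EKF predict/update cycle for a planar pose (x, y, theta) with a unicycle odometry model and a direct pose measurement; both models are deliberately simplified assumptions and not the thesis's state parameterisation.

```python
# Minimal EKF predict/update for a planar robot pose, fusing odometry
# (prediction) with a pose-like visual measurement (update).
import numpy as np

def ekf_predict(x, P, odom, Q):
    """Propagate pose with a unicycle odometry model.
    x = (x, y, theta), odom = (delta_distance, delta_theta)."""
    d, dth = odom
    theta = x[2]
    x_pred = x + np.array([d * np.cos(theta), d * np.sin(theta), dth])
    F = np.array([[1.0, 0.0, -d * np.sin(theta)],   # Jacobian of the motion model
                  [0.0, 1.0,  d * np.cos(theta)],
                  [0.0, 0.0,  1.0]])
    return x_pred, F @ P @ F.T + Q

def ekf_update(x, P, z, R):
    """Correct the pose with a direct (identity) visual pose measurement z."""
    H = np.eye(3)
    y = z - H @ x                       # innovation
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    return x + K @ y, (np.eye(3) - K @ H) @ P
```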

    Accuracy vs. Energy: An Assessment of Bee Object Inference in Videos From On-Hive Video Loggers With YOLOv3, YOLOv4-Tiny, and YOLOv7-Tiny

    A continuing trend in precision apiculture is to use computer vision methods to quantify characteristics of bee traffic in managed colonies at the hive's entrance. Since traffic at the hive's entrance is a contributing factor to the hive's productivity and health, we assessed the potential of three open-source convolutional network models, YOLOv3, YOLOv4-tiny, and YOLOv7-tiny, to quantify omnidirectional traffic in videos from on-hive video loggers on regular, unmodified one- and two-super Langstroth hives and compared their accuracies, energy efficacies, and operational energy footprints. We trained and tested the models with a 70/30 split on a dataset of 23,173 flying bees manually labeled in 5819 images from 10 randomly selected videos and manually evaluated the trained models on 3600 images from 120 randomly selected videos from different apiaries, years, and queen races. We designed a new energy efficacy metric as a ratio of performance units per energy unit required to make a model operational in a continuous hive monitoring data pipeline. In terms of accuracy, YOLOv3 ranked first, YOLOv7-tiny second, and YOLOv4-tiny third. All models underestimated the true amount of traffic due to false negatives. YOLOv3 was the only model with no false positives, but it had the lowest energy efficacy and the highest operational energy footprint in a deployed hive monitoring data pipeline. YOLOv7-tiny had the highest energy efficacy and the lowest operational energy footprint in the same pipeline. Consequently, YOLOv7-tiny is a model worth considering for training on larger bee datasets if a primary objective is the discovery of non-invasive computer vision models of traffic quantification with higher energy efficacies and lower operational energy footprints.
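    The comparison rests on an energy efficacy metric defined as performance units per unit of energy needed to make a model operational. The sketch below illustrates that kind of ratio with made-up numbers; the paper's actual performance measure, energy accounting, and results are not reproduced here.

```python
# Illustrative performance-per-energy ratio; the performance measure, the
# energy figures, and the model names below are hypothetical placeholders.
def energy_efficacy(performance, energy_kwh):
    """Return performance units per kWh (higher is better)."""
    if energy_kwh <= 0:
        raise ValueError("energy must be positive")
    return performance / energy_kwh

# Hypothetical comparison: a large, accurate model vs. a small, frugal one.
models = {"large_model": (0.91, 12.5), "tiny_model": (0.84, 3.2)}  # (accuracy, kWh)
for name, (acc, kwh) in models.items():
    print(name, round(energy_efficacy(acc, kwh), 3))
```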