282 research outputs found

Efficient Image Processing Framework on Embedded Systems Using Neural Networks

    Department of Mechanical Engineering
    Unmanned aerial vehicles (UAVs) are widely used in areas such as exploration, transportation and rescue owing to their light weight, low cost, high mobility and intelligence. These intelligent systems consist of highly integrated embedded hardware built around a microprocessor that performs specific tasks by computing algorithms or processing data. Image processing, in particular, is one of the core technologies for tasks such as target tracking, positioning and visual servoing. However, it often imposes a heavy computational burden, and an additional micro PC controller must be used alongside the flight computer to process image data; the performance of such a controller is limited by power, size and weight constraints. Efficient image processing techniques are therefore needed that balance computing load against hardware resources for real-time operation on embedded systems. The objective of this thesis research is to develop an efficient image processing framework for embedded systems that exploits neural networks and various optimized computation techniques to satisfy both computing speed versus resource usage and accuracy. Image processing techniques are proposed and tested for managing computing resources and running high-performance missions on embedded systems. Commodity graphics processing units (GPUs) are used for parallel computing to accelerate processing, while multiple central processing unit (CPU) cores are used for multi-threaded data transfer between the CPU and the GPU. To minimize computing load, several methods are proposed. The first is visualization of a convolutional neural network (CNN) that performs localization and detection simultaneously. The second is region proposal for the CNN input area through simple image processing, which lets the algorithm avoid full-frame processing. Finally, surplus computing resources are saved by controlling transient performance, for example by limiting the frame rate (FPS). These optimization methods were applied experimentally to a ground vehicle and quadrotor UAVs, verifying that they save CPU and memory resources in an embedded environment. In addition, they support tasks such as object detection, path planning and obstacle avoidance. Through these optimizations and algorithms, the embedded system shows a number of improvements over the existing one. Because the method developed in this research takes the characteristics of the target system into account when porting useful algorithms to embedded hardware, it can be further applied to various practical applications.
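    To make the last two optimizations concrete, here is a minimal sketch, not the thesis code: a simple-image-processing region proposal that crops the CNN input to areas of frame-to-frame motion, plus an FPS cap that sleeps away surplus time. The `run_cnn` call and all parameter values are hypothetical placeholders.

```python
# Minimal sketch of region proposal + FPS limitation (assumed details).
import time
import cv2

def propose_region(prev_gray, gray, pad=16):
    """Propose a bounding box around frame-to-frame motion, or None."""
    diff = cv2.absdiff(prev_gray, gray)
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    mask = cv2.dilate(mask, None, iterations=2)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
    return (max(x - pad, 0), max(y - pad, 0), w + 2 * pad, h + 2 * pad)

def run(camera_index=0, max_fps=15):
    cap = cv2.VideoCapture(camera_index)
    prev_gray = None
    frame_period = 1.0 / max_fps
    while True:
        start = time.monotonic()
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev_gray is not None:
            roi = propose_region(prev_gray, gray)
            if roi is not None:
                x, y, w, h = roi
                crop = frame[y:y + h, x:x + w]
                # run_cnn(crop) would go here; only the ROI is processed,
                # avoiding a full-frame CNN pass.
        prev_gray = gray
        # FPS limitation: sleep away surplus time instead of busy-looping.
        time.sleep(max(0.0, frame_period - (time.monotonic() - start)))
```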

    Development of a Visual Odometry System as a Location Aid for Self-Driving Cars

    Knowing the exact position a robot occupies and the trajectory it describes is essential in the automotive field. Various sensors and techniques developed over the years for this purpose are reviewed in this work. In this project, two cameras on board the vehicle are used as sensors for perceiving the environment. The proposed algorithm is based solely on visual odometry; that is, using only the sequence of images captured by the cameras, without prior knowledge of the environment and without other sensors, it aims to obtain an accurate estimate of the position and orientation of the vehicle. The proposal has been validated on the KITTI benchmark and compared with other state-of-the-art visual odometry techniques.
    Degree in Industrial Electronics and Automation Engineering (Grado en Ingeniería en Electrónica y Automática Industrial)
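    As an illustration of the frame-to-frame step of such a pipeline, here is a minimal sketch, not the project's code: ORB features matched between consecutive images, with relative motion recovered from the essential matrix. `K` is the camera intrinsic matrix; with a stereo rig the translation scale would come from the baseline, which this sketch omits.

```python
# Minimal feature-based visual odometry step (assumed details).
import cv2
import numpy as np

orb = cv2.ORB_create(2000)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def relative_pose(img_prev, img_curr, K):
    kp1, des1 = orb.detectAndCompute(img_prev, None)
    kp2, des2 = orb.detectAndCompute(img_curr, None)
    matches = matcher.match(des1, des2)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    # RANSAC rejects outlier matches while estimating the essential matrix.
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, cv2.RANSAC, 0.999, 1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    return R, t  # rotation and unit-scale translation of the camera
```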

    Techniques for Automatic Marker-less Detection of People and Marker-based Detection of Robots Within RGB-Depth Camera Networks

    OpenPTrack is a state-of-the-art solution for people detection and tracking. In this work we extended some of its functionalities (detection from highly tilted cameras) and introduced new ones (an automatic ground-plane equation calculator). We also tested the feasibility and behaviour of a mobile camera mounted on a people-following robot and dynamically registered in the OPT network through a fiducial cubic marker.
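    The core of an automatic ground-plane equation calculator can be illustrated with a minimal sketch; this is an assumption-based simplification, not OpenPTrack code. A RANSAC plane fit over 3D points from the depth camera returns (a, b, c, d) with ax + by + cz + d = 0.

```python
# RANSAC ground-plane fit over an Nx3 array of depth points (sketch).
import numpy as np

def fit_ground_plane(points, iters=500, dist_thresh=0.02, rng=None):
    rng = rng or np.random.default_rng()
    best_plane, best_inliers = None, 0
    for _ in range(iters):
        p1, p2, p3 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p2 - p1, p3 - p1)
        norm = np.linalg.norm(normal)
        if norm < 1e-9:          # degenerate (collinear) sample
            continue
        normal /= norm
        d = -normal.dot(p1)
        # Count points within dist_thresh of the candidate plane.
        inliers = np.count_nonzero(np.abs(points @ normal + d) < dist_thresh)
        if inliers > best_inliers:
            best_inliers = inliers
            best_plane = (*normal, d)
    return best_plane  # (a, b, c, d) of the dominant (ground) plane
```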

    Visual Odometry Based on Structural Matching of Local Invariant Features Using Stereo Camera Sensor

    This paper describes a novel sensor system to estimate the motion of a stereo camera. Local invariant image features are matched between pairs of frames and linked into image trajectories at video rate, providing so-called visual odometry, i.e., motion estimates from visual input alone. Our proposal conducts two matching sessions: the first between the sets of features associated with the images of a stereo pair, and the second between the sets of features associated with consecutive frames. With respect to previously proposed approaches, the main novelty of this proposal is that both matching stages use a fast matching algorithm which combines absolute and relative feature constraints. Finding the largest set of mutually consistent matches is equivalent to finding the maximum-weighted clique in a graph. The stereo matching allows the scene view to be represented as a graph which emerges from the features of the accepted clique, while the frame-to-frame matching defines a graph whose vertices are features in 3D space. The efficiency of the approach is increased by minimizing the geometric and algebraic errors when estimating the final displacement of the stereo camera between consecutive frames. The proposed approach has been tested for mobile robot navigation in real environments and with different features. Experimental results demonstrate the performance of the proposal, which could be applied in both industrial and service robot fields.
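    The clique formulation can be illustrated with a minimal sketch; this is a heavily simplified, assumption-laden version rather than the paper's algorithm (it uses a greedy heuristic, not an exact maximum-weighted clique search). Candidate matches are vertices, and edges join pairs of matches whose 3D inter-feature distances agree, i.e., a relative constraint preserved under rigid motion.

```python
# Greedy approximation of consistency-clique matching (sketch).
import numpy as np

def consistent_matches(pts_a, pts_b, candidates, tol=0.05):
    """pts_a/pts_b: Nx3 arrays; candidates: list of (i, j) index pairs."""
    n = len(candidates)
    adj = np.zeros((n, n), dtype=bool)
    for u in range(n):
        ia, ib = candidates[u]
        for v in range(u + 1, n):
            ja, jb = candidates[v]
            # Rigid motion preserves distances: compare |ai-aj| with |bi-bj|.
            da = np.linalg.norm(pts_a[ia] - pts_a[ja])
            db = np.linalg.norm(pts_b[ib] - pts_b[jb])
            adj[u, v] = adj[v, u] = abs(da - db) < tol
    # Greedy clique growth, seeded from the most-consistent vertex.
    clique = [int(adj.sum(axis=1).argmax())]
    for v in np.argsort(-adj.sum(axis=1)):
        if v not in clique and all(adj[v, c] for c in clique):
            clique.append(int(v))
    return [candidates[v] for v in clique]
```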

    Convolutional Neural Network Architecture Study for Aerial Visual Localization

    In unmanned aerial navigation the ability to determine the aircraft's location is essential for safe flight. The Global Positioning System (GPS) is the default modern application for geospatial location determination. GPS is extremely robust, very accurate, and has essentially solved aerial localization. Unfortunately, the signals from all Global Navigation Satellite Systems (GNSS), including GPS, can be jammed or spoofed. In response, it is essential to develop alternative systems that could supplement navigation in the event of a lost GNSS signal. Public and governmental satellites have provided large amounts of high-resolution satellite imagery, which could be exploited through machine learning to give onboard navigation equipment a geospatial location solution. Deep learning and convolutional neural networks (CNNs) have provided significant advances in image processing. This thesis discusses the performance of CNN architectures with various hyperparameters and industry-leading model designs for visual aerial localization. The localization algorithm is trained and tested on satellite imagery of a 150 square kilometer area. Three hyperparameters are examined: initializations, optimizers, and finishing layers. The five model architectures are MobileNet V2, Inception V3, ResNet 50, Xception, and DenseNet 201. The hyperparameter analysis demonstrates that specific initializations, optimizers and finishing layers can have significant effects on training a CNN for this task. The lessons learned from the hyperparameter analysis were carried into the CNN comparison study. After all models were trained for 150 epochs, they were evaluated on the test set. The Xception model with pretrained initialization outperformed all others, with a root mean squared (RMS) error of only 85 meters.
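    A minimal sketch of the kind of model the study describes, with assumed details that are not from the thesis (the optimizer choice, learning rate, input size and output naming are all placeholders): a pretrained Xception backbone with a small regression finishing layer that outputs 2D map coordinates, trained with a mean-squared-error objective so RMS error can be reported.

```python
# Pretrained-backbone regression localizer (sketch, assumed details).
import tensorflow as tf

def build_localizer(input_shape=(299, 299, 3)):
    base = tf.keras.applications.Xception(
        weights="imagenet",        # pretrained initialization
        include_top=False,
        pooling="avg",
        input_shape=input_shape,
    )
    # Regression "finishing layer": two outputs for map coordinates.
    coords = tf.keras.layers.Dense(2, name="easting_northing")(base.output)
    model = tf.keras.Model(base.input, coords)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-4),  # placeholder optimizer
        loss="mse",
        metrics=[tf.keras.metrics.RootMeanSquaredError()],
    )
    return model
```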

    Application of augmented reality and robotic technology in broadcasting: A survey

    As an innovative technique, augmented reality (AR) has gradually been deployed in the broadcast, videography and cinematography industries. Virtual graphics generated by AR are dynamic and are overlaid on the surface of the environment, so that the original appearance can be greatly enhanced in comparison with traditional broadcasting. In addition, AR enables broadcasters to interact with augmented virtual 3D models on a broadcasting scene to enhance the performance of broadcasting. Recently, advanced robotic technologies have been deployed in camera shooting systems to create robotic cameramen, so that the performance of AR broadcasting can be further improved; this development is highlighted in the paper.

    Cartographer SLAM Method for Optimization with an Adaptive Multi-Distance Scan Scheduler

    This paper presents the use of Google's simultaneous localization and mapping (SLAM) technique, Cartographer, with an adaptive multistage distance scheduler (AMDS) to improve processing speed. The approach optimizes the processing speed of SLAM, which is known to degrade as the map grows and the scan matcher becomes larger. The adaptive method was successfully tested in an actual vehicle to map roads in real time. The AMDS performs local pose correction by controlling the LiDAR scan range and the scan matcher search window through scheduling algorithms, which swap the SLAM between short and long scan distances during map data collection. As a result, the algorithms achieve a processing speed close to that of short-distance LiDAR scans while maintaining the accuracy of full-distance LiDAR. By swapping the sensor's scan distance, and adaptively limiting the scan matcher's search size to handle the differing scan sizes, pose generation time is improved by approximately 16% compared with a fixed scan distance, at similar accuracy.
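    The scheduling idea can be sketched as follows; all names, ranges and the short/long ratio are hypothetical, not the paper's implementation. Most scans are trimmed to a short range and matched inside a small search window, while a periodic long scan keeps the map anchored to full-distance accuracy.

```python
# Alternating short/long scan schedule with matched search windows (sketch).
from dataclasses import dataclass

@dataclass
class ScanConfig:
    max_range_m: float      # LiDAR scan range fed to the scan matcher
    search_window_m: float  # scan matcher search window size

SHORT = ScanConfig(max_range_m=15.0, search_window_m=0.3)
LONG = ScanConfig(max_range_m=60.0, search_window_m=1.0)

def scan_schedule(long_every=5):
    """Yield a ScanConfig per incoming scan: mostly short, periodically long."""
    i = 0
    while True:
        yield LONG if i % long_every == 0 else SHORT
        i += 1

# Usage: pair each incoming scan with its scheduled configuration.
# for scan, cfg in zip(lidar_scans, scan_schedule()):
#     trimmed = [r for r in scan if r <= cfg.max_range_m]
#     pose = scan_matcher.match(trimmed, search_window=cfg.search_window_m)
```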

    Real-time High Resolution Fusion of Depth Maps on GPU

    A system for live, high-quality surface reconstruction using a single moving depth camera on commodity hardware is presented. High accuracy and a real-time frame rate are achieved by utilizing graphics hardware via OpenCL and by using a sparse data structure for the volumetric surface representation. The depth sensor pose is estimated by combining a serial texture registration algorithm with an iterative closest point (ICP) algorithm that aligns the obtained depth map to the estimated scene model. The aligned surface is then fused into the scene, with a Kalman filter used to improve fusion quality. The surface is represented by a truncated signed distance function (TSDF) stored as a block-based sparse buffer; the sparse data structure greatly increases the accuracy of scanned surfaces and the maximum scanning area. Traditional GPU implementations of the volumetric rendering and fusion algorithms were modified to exploit sparsity and achieve the desired performance. Incorporating texture registration for sensor pose estimation and a Kalman filter for measurement integration improved the accuracy and robustness of the scanning process.
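    The block-based sparse TSDF can be illustrated with a minimal CPU sketch; this is a simplification with assumed constants, not the paper's OpenCL code. Voxel blocks are allocated on demand in a hash map, and each depth measurement is folded into a voxel by a weighted running average, which is the constant-noise special case of a Kalman update.

```python
# On-demand block allocation and per-voxel TSDF fusion (sketch).
import numpy as np

BLOCK = 8          # voxels per block edge (assumed)
TRUNC = 0.04       # TSDF truncation distance in meters (assumed)

blocks = {}  # (bx, by, bz) -> (tsdf values, weights), allocated sparsely

def fuse_point(voxel_index, sdf, meas_weight=1.0):
    """Fuse one truncated SDF measurement into the sparse volume."""
    sdf = max(-TRUNC, min(TRUNC, sdf))
    key = tuple(i // BLOCK for i in voxel_index)
    if key not in blocks:  # allocate a block only when it is first touched
        blocks[key] = (np.zeros((BLOCK,) * 3, dtype=np.float32),
                       np.zeros((BLOCK,) * 3, dtype=np.float32))
    tsdf, weight = blocks[key]
    local = tuple(i % BLOCK for i in voxel_index)
    w = weight[local]
    # Weighted running average == Kalman update with constant noise.
    tsdf[local] = (tsdf[local] * w + sdf * meas_weight) / (w + meas_weight)
    weight[local] = w + meas_weight
```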