13 research outputs found

    Real-time video stabilization without phantom movements for micro aerial vehicles

    In recent times, micro aerial vehicles (MAVs) have become popular for several applications such as rescue, surveillance, and mapping. Undesired motion between consecutive frames is a problem in videos recorded by MAVs. There are different approaches, applied in video post-processing, to solve this issue; however, only a few algorithms can be applied in real time. An additional and critical problem is the presence of false movements in the stabilized video. In this paper, we present a new video stabilization approach that can be used in real time without generating false movements. Our proposal uses a combination of a low-pass filter and control action information to estimate the motion intention. Peer Reviewed. Postprint (published version).
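As a rough illustration of the motion-intention idea in the abstract above, inter-frame motion can be split into intended motion (low-frequency) and jitter by an exponential low-pass filter. The function names, the first-order filter form, and the value of `alpha` are assumptions made for this sketch, not the authors' exact formulation (which also folds in control-action information):

```python
# Sketch: motion-intention estimation via an exponential low-pass filter.
# The filter output tracks deliberate (commanded) motion, while the
# high-frequency residual is treated as jitter to be compensated.

def estimate_intention(motions, alpha=0.9):
    """Low-pass filter a sequence of inter-frame displacements.

    motions : list of (dx, dy) displacements between consecutive frames
    alpha   : smoothing factor in (0, 1); higher means smoother intention
    """
    intention = []
    ix, iy = 0.0, 0.0
    for dx, dy in motions:
        ix = alpha * ix + (1.0 - alpha) * dx
        iy = alpha * iy + (1.0 - alpha) * dy
        intention.append((ix, iy))
    return intention

def compensation(motions, intention):
    """Jitter to remove: measured motion minus estimated intended motion."""
    return [(dx - ix, dy - iy)
            for (dx, dy), (ix, iy) in zip(motions, intention)]
```

Warping each frame by the negative of the compensation term would then cancel the jitter while preserving the intended camera motion.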

    Motion intention optimization for multirotor robust video stabilization

    In this paper we present an optimization algorithm for simultaneously detecting video freezes and obtaining the minimum number of frames required for motion intention estimation in real-time robust video stabilization on multirotor unmanned aerial vehicles. A combination of a filter and a threshold is used for video freeze detection, and to optimize the algorithm we find the minimum number of frames for motion intention estimation without decreasing performance. Peer Reviewed. Postprint (author's final draft).
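The filter-plus-threshold freeze detector described above might be sketched as follows. The mean-absolute-difference statistic, the first-order filter, and the parameter values are illustrative assumptions, not the paper's exact design:

```python
def mean_abs_diff(prev_frame, frame):
    """Mean absolute gray-level difference between two frames,
    given as flat sequences of pixel intensities."""
    return sum(abs(a - b) for a, b in zip(prev_frame, frame)) / len(frame)

def detect_freezes(frames, threshold=1.0, alpha=0.5):
    """Flag frozen frames: low-pass filter the inter-frame difference
    signal, then threshold it. A repeated (frozen) frame drives the
    filtered difference toward zero. Returns one flag per frame pair."""
    flags, filtered = [], None
    for prev, cur in zip(frames, frames[1:]):
        d = mean_abs_diff(prev, cur)
        filtered = d if filtered is None else alpha * filtered + (1 - alpha) * d
        flags.append(filtered < threshold)
    return flags
```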

    Estabilización de vídeo en micro vehículos aéreos y su aplicación en la detección de caras

    Currently, micro-scale aerial vehicles (MAVs) have become popular for multiple applications such as rescue, surveillance, and mapping, among others. In all cases, optimal quality of the video captured on board is required, and one of the main problems is the undesired motion between consecutive frames. To solve this problem there are different approaches that, applied in post-processing, achieve robust image stabilization. However, very few algorithms can be applied in real time. This article presents a new approach that can be implemented in real time without generating false movements. Our proposal uses a combination of a low-pass filter and control action information to estimate the motion intention. Additionally, we present the application of our proposal to a face detection algorithm, whose robustness increases when it is run on the stabilized video sequence. Peer Reviewed. Postprint (published version).

    Estabilización robusta de vídeo basada en diferencia de nivel de gris

    Video stabilization is becoming an important post-processing technique for frame sequences acquired with digital cameras, especially due to the widespread use of hand-held cameras and the use of these devices as input elements in complex robotic systems, humanoid robots, and unmanned aerial vehicles. This article proposes combining the iterative RANSAC (RANdom SAmple Consensus) method, which adds robustness to the motion estimation step of the video stabilization process, with a cost function based on the gray-level difference between images. Peer Reviewed. Postprint (published version).
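A minimal sketch of the combination described above: translation hypotheses are drawn RANSAC-style from randomly sampled point correspondences, and each hypothesis is scored by a gray-level difference cost over the frame overlap. The pure-translation motion model and the function names are assumptions for illustration; the paper's actual motion model and cost function may differ:

```python
import random

def gray_diff_cost(img_a, img_b, dx, dy):
    """Mean absolute gray-level difference over the overlap of img_a
    and img_b shifted by (dx, dy). Images are 2-D lists of intensities;
    a pixel (x, y) in img_a corresponds to (x + dx, y + dy) in img_b."""
    h, w = len(img_a), len(img_a[0])
    total = count = 0
    for y in range(max(0, -dy), min(h, h - dy)):
        for x in range(max(0, -dx), min(w, w - dx)):
            total += abs(img_a[y][x] - img_b[y + dy][x + dx])
            count += 1
    return total / count if count else float("inf")

def ransac_translation(pts_a, pts_b, img_a, img_b, iters=50, seed=0):
    """RANSAC-style search: each sampled correspondence proposes a
    translation, and the hypothesis with the lowest gray-level cost
    wins, so outlier matches are naturally rejected."""
    rng = random.Random(seed)
    best, best_cost = (0, 0), float("inf")
    for _ in range(iters):
        i = rng.randrange(len(pts_a))
        dx = pts_b[i][0] - pts_a[i][0]
        dy = pts_b[i][1] - pts_a[i][1]
        cost = gray_diff_cost(img_a, img_b, dx, dy)
        if cost < best_cost:
            best, best_cost = (dx, dy), cost
    return best, best_cost
```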

    Advances and Applications of Computer Vision Techniques in Vehicle Trajectory Generation and Surrogate Traffic Safety Indicators

    The application of Computer Vision (CV) techniques massively stimulates microscopic traffic safety analysis from the perspective of traffic conflicts and near misses, which are usually measured using Surrogate Safety Measures (SSM). However, as video processing and traffic safety modeling are two separate research domains and little research has focused on systematically bridging the gap between them, it is necessary to provide transportation researchers and practitioners with corresponding guidance. With this aim in mind, this paper reviews the applications of CV techniques in traffic safety modeling using SSM and suggests the best way forward. The CV algorithms used for vehicle detection and tracking, from early approaches to state-of-the-art models, are summarized at a high level. Then, the video pre-processing and post-processing techniques for vehicle trajectory extraction are introduced. A detailed review of SSMs for vehicle trajectory data, along with their application to traffic safety analysis, is presented. Finally, practical issues in traffic video processing and SSM-based safety analysis are discussed, and available or potential solutions are provided. This review is expected to assist transportation researchers and engineers with the selection of suitable CV techniques for video processing and the use of SSMs for various traffic safety research objectives.

    Estabilización de vídeo en tiempo real : aplicaciones en teleoperación de micro vehículos aéreos de ala rotativa

    Micro Aerial Vehicles (MAVs), a subset of Unmanned Aerial Vehicles (UAVs), also known as drones, are becoming popular for several applications and gaining interest due to advantages such as manufacturing and maintenance cost, size and weight, energy consumption, and flight maneuverability. The skills required of drone teleoperators are lower than those of aircraft pilots; however, their training process can last several weeks or months depending on the target at hand. In particular, this process is harder when teleoperators cannot observe the vehicle directly and depend only on onboard sensors and cameras. The presence of oscillations in the captured video is a major problem with cameras on UAVs. It is even more complex for MAVs because external disturbances increase the instability. Mechanical video stabilizers exist that reduce camera oscillations; however, such a device adds weight and increases manufacturing cost, energy consumption, and size, and the system becomes less safe for people. In this thesis, we propose to develop video stabilization algorithms in software, without additional mechanical elements in the system, to be applied in real time during UAV navigation. In the literature, there are few video stabilization algorithms that can be applied in real time, and most of them generate false motion (phantom movements) in the stabilized image. Our algorithm represents a good trade-off between recording stable video and simultaneously keeping the UAV's real motion. Several experiments with MAVs have been performed, and the employed measurements demonstrate the good performance of the introduced algorithm.

    Contributions to improve the technologies supporting unmanned aircraft operations

    International Mention in the doctoral degree. Unmanned Aerial Vehicles (UAVs), in their smaller versions known as drones, are becoming increasingly important in today's societies. The systems that make them up present a multitude of challenges, of which error can be considered the common denominator. The perception of the environment is measured by sensors that have errors; the models that interpret the information and/or define behaviors are approximations of the world and therefore also have errors. Explaining error allows extending the limits of deterministic models to address real-world problems. The performance of the technologies embedded in drones depends on our ability to understand, model, and control the error of the systems that integrate them, as well as of new technologies that may emerge. Flight controllers integrate various subsystems that are generally dependent on other systems. One example is the guidance system. These systems provide the engine's propulsion controller with the information necessary to accomplish a desired mission. For this purpose, the flight controller includes a guidance control law that reacts to the information perceived by the perception and navigation systems. The error of any of the subsystems propagates through the ecosystem of the controller, so the study of each of them is essential. Among the strategies for error control are state-space estimators, where the Kalman filter has been a great ally of engineers since its appearance in the 1960s. Kalman filters are at the heart of information fusion systems, minimizing the error covariance of the system and allowing the measured states to be filtered and estimated in the absence of observations. State Space Models (SSM) are developed based on a set of hypotheses for modeling the world: the models must be linear and Markovian, and their error must be Gaussian. In general, systems are not linear, so linearizations are performed on models that are already approximations of the world. In other cases, the noise to be controlled is not Gaussian, but it is approximated by that distribution in order to deal with it. Moreover, many systems are not Markovian, i.e., their states do not depend only on the previous state; there are other dependencies that state-space models cannot handle. This thesis presents a collection of studies in which error is formulated and reduced. First, the error in a computer-vision-based precision landing system is studied; then, estimation and filtering problems are addressed from the deep learning approach; finally, classification with deep learning over trajectories is studied. The first case of the collection studies the consequences of error propagation in a machine-vision-based precision landing system and proposes a set of strategies to reduce the impact on the guidance system, and ultimately reduce the error. The next two studies approach the estimation and filtering problem from the deep learning perspective, where error is a function to be minimized by learning. The last case of the collection deals with a trajectory classification problem with real data. This work covers the two main fields in deep learning, regression and classification, where the error is considered as a probability function of class membership. I would like to thank the Ministry of Science and Innovation for granting me the funding with reference PRE2018-086793, associated with the project TEC2017-88048-C2-2-R, which provided me the opportunity to carry out all my PhD activities, including an international research internship. Programa de Doctorado en Ciencia y Tecnología Informática por la Universidad Carlos III de Madrid. Presidente: Antonio Berlanga de Jesús. Secretario: Daniel Arias Medina. Vocal: Alejandro Martínez Cav
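The Kalman-filter predict/update cycle mentioned in the abstract above can be illustrated with a minimal one-dimensional, random-walk version. The parameter values are arbitrary; real guidance systems use multivariate state-space models with full covariance matrices:

```python
def kalman_1d(zs, q=1e-3, r=0.25, x0=0.0, p0=1.0):
    """Minimal 1-D Kalman filter with a random-walk state model.

    zs : noisy scalar measurements
    q  : process noise variance
    r  : measurement noise variance
    Returns the sequence of filtered state estimates.
    """
    x, p = x0, p0
    out = []
    for z in zs:
        # Predict: the state is unchanged, uncertainty grows by q.
        p = p + q
        # Update: blend prediction and measurement via the Kalman gain,
        # which minimizes the posterior error variance.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        out.append(x)
    return out
```

With no measurement available at a step, only the predict stage would run, which is how such filters estimate states in the absence of observations.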

    Estabilização digital de vídeos : algoritmos e avaliação

    Advisor: Hélio Pedrini. Master's dissertation, Universidade Estadual de Campinas, Instituto de Computação. Abstract: The development of multimedia equipment has allowed a significant growth in the production of videos through professional and amateur cameras, smartphones, and other mobile devices. However, videos captured by these devices are subject to unwanted vibrations due to camera shaking. To overcome this problem, digital stabilization aims to remove undesired motion from videos through software techniques, without the use of specific hardware, to enhance visual quality, either to improve human perception or to aid final applications such as object detection and tracking. The two-dimensional digital video stabilization process is usually divided into three main steps: camera motion estimation, removal of unwanted motion, and generation of the corrected video. In this work, we investigate and evaluate digital video stabilization methods for correcting disturbances and instabilities that occur during the process of video acquisition. In the motion estimation step, we develop and analyze a consensual method for combining a set of local feature techniques for global motion estimation. We also introduce and test a novel approach that identifies failures in the global motion estimation of the camera through optimization and computes a corrected estimate. In the removal of unwanted motion step, we propose and evaluate a novel approach to video stabilization based on an adaptive Gaussian filter to smooth the camera path. Due to the incoherence, with respect to human perception, of the assessment measures available in the literature, two novel representations are proposed for the qualitative evaluation of video stabilization methods: the first is based on visual rhythms and represents the behavior of the video motion, whereas the second is based on the motion energy image and represents the amount of motion present in the video. Experiments were conducted on three video databases. The first consists of eleven videos available from the GaTech VideoStab database and three other videos collected separately. The second, proposed by Liu et al., consists of 139 videos divided into different categories. Finally, we propose a database complementary to the others, composed of excerpts from four separately collected videos containing moving objects and little representative background, resulting in eight final videos. Experimental results demonstrated the effectiveness of the visual representations as a qualitative measure for evaluating video stability, as well as of the combination method over individual local feature approaches. The proposed optimization-based method was able to detect and correct motion estimation failures, achieving considerably superior results compared to when this correction is not applied. The adaptive Gaussian filter generated videos with an adequate trade-off between the stabilization rate and the amount of frame pixels preserved. The results achieved with our optimization method on the videos of the proposed database were superior to those obtained with YouTube's state-of-the-art method.
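The camera-path smoothing step described above can be sketched with a truncated Gaussian kernel. The dissertation's filter adapts its width per frame; this simplified version assumes a fixed `sigma` and illustrative function names:

```python
import math

def gaussian_smooth_path(path, sigma=2.0, radius=5):
    """Smooth a 1-D camera trajectory with a truncated Gaussian kernel.

    path   : per-frame camera position along one axis
    sigma  : kernel width (fixed here; adaptive in the dissertation)
    radius : half-width of the kernel support, in frames

    The stabilizing correction for each frame is then
    smoothed[t] - path[t]. Near the boundaries the kernel is
    renormalized over the samples that exist.
    """
    kernel = [math.exp(-0.5 * (i / sigma) ** 2)
              for i in range(-radius, radius + 1)]
    smoothed = []
    n = len(path)
    for t in range(n):
        num = den = 0.0
        for i, w in zip(range(-radius, radius + 1), kernel):
            j = t + i
            if 0 <= j < n:
                num += w * path[j]
                den += w
        smoothed.append(num / den)
    return smoothed
```

A wider kernel yields a more stable path but requires cropping more pixels at the frame borders, which is exactly the trade-off the abstract refers to.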

    Synthetic Data for Machine Learning

    Supervised machine learning methods require large-scale training datasets to converge. Collecting and annotating training data is expensive, time-consuming, error-prone, and not always practical. Usually, synthetic data is used as a feasible data source to increase the amount of training data. However, just directly using synthetic data may actually harm the model's performance or may not be as effective as it could be. This thesis addresses the challenges of generating large-scale synthetic data, improving domain adaptation in semantic segmentation, advancing video stabilization in adverse conditions, and conducting a rigorous assessment of synthetic data usability in classification tasks. By contributing novel solutions to these multifaceted problems, this work bolsters the field of computer vision, offering strong foundations for a broad range of applications for utilizing synthetic data for computer vision tasks. In this thesis, we divide the study into three main problems: (i) tackle the problem of generating diverse and photorealistic synthetic data; (ii) explore synthetic-aware computer vision solutions for semantic segmentation and video stabilization; (iii) assess the usability of synthetically generated data for different computer vision tasks. We developed a new synthetic data generator called Silver. Photo-realism, diversity, scalability, and full 3D virtual world generation at run-time are the key aspects of this generator. The photo-realism was approached by utilizing the state-of-the-art High Definition Render Pipeline (HDRP) of the Unity game engine. In parallel, the Procedural Content Generation (PCG) concept was employed to create a full 3D virtual world at run-time, while the scalability (expansion and adaptability) of the system was attained by taking advantage of the modular approach followed as we built the system from scratch.
    Silver can be used to provide clean, unbiased, and large-scale training and testing data for various computer vision tasks. Regarding synthetic-aware computer vision models, we developed a novel architecture specifically designed to use synthetic training data for semantic segmentation domain adaptation. We propose a simple yet powerful addition to DeepLabV3+ by using weather and time-of-the-day supervisors trained with multitask learning, making it both weather- and nighttime-aware, which improves its mIoU accuracy under adverse conditions while maintaining adequate performance under standard conditions. Similarly, we also propose a synthetic-aware adverse-weather video stabilization algorithm that dispenses with real data for training, relying solely on synthetic data. Our approach leverages specially generated synthetic data to avoid the feature extraction issues faced by current methods. To achieve this, we leveraged our novel data generator to produce the required training data with an automatic ground-truth extraction procedure. We also propose a new dataset called VSAC105Real and compare our method to five recent video stabilization algorithms using two benchmarks. Our method generalizes well on real-world videos across all weather conditions and does not require large-scale synthetic training data. Finally, we assess the usability of the generated synthetic data. We propose a novel usability metric that disentangles photorealism from diversity. This new metric is a simple yet effective way to rank synthetic images. The quantitative results show that we can achieve similar or better results by training on 50% less synthetic data. Additionally, we qualitatively assess the impact of photorealism and evaluate many architectures on different datasets to that end.