108 research outputs found

    A particle filter to reconstruct a free-surface flow from a depth camera

    We investigate the combined use of a Kinect depth sensor and a stochastic data assimilation method to recover free-surface flows. More specifically, we use a weighted ensemble Kalman filter to reconstruct the complete state of free-surface flows from a sequence of depth images alone. This particle filter accounts for both model and observation errors. The data assimilation scheme is further enhanced by using two observations in the correction step instead of the classical single observation. We evaluate the approach on two numerical test cases: the collapse of a water column as a toy example and a flow in a suddenly expanding flume as a more realistic configuration. The robustness of the method to depth data errors, as well as to initial and inflow conditions, is assessed. We illustrate the benefit of using two observations in the correction step, especially for unknown inflow boundary conditions. The performance of the Kinect sensor in capturing temporal sequences of depth observations is then investigated. Finally, the efficiency of the algorithm is qualified for a wave in a real rectangular flat-bottom tank. It is shown that, for basic initial conditions, the particle filter rapidly and remarkably reconstructs the velocity and height of the free-surface flow from noisy measurements of the elevation alone.
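    As a rough illustration of the weighted correction idea described above, the sketch below implements a generic weighted-ensemble update for a shallow-water-like state (surface height and velocity) given a single depth observation. The function name, array shapes and noise level are assumptions for illustration only, not the authors' actual Weighted Ensemble Kalman Filter implementation.

```python
import numpy as np

def weighted_update(ensemble_heights, ensemble_velocities, observed_depth,
                    obs_std=0.01, rng=None):
    """Correction step of a weighted-ensemble scheme (illustrative sketch).

    ensemble_heights   : (N, M) free-surface height of N members at M grid points
    ensemble_velocities: (N, M) corresponding velocity fields
    observed_depth     : (M,) depth-camera observation converted to surface height
    """
    rng = np.random.default_rng() if rng is None else rng
    # Gaussian likelihood of the observation under each ensemble member
    residuals = ensemble_heights - observed_depth[None, :]
    log_w = -0.5 * np.sum((residuals / obs_std) ** 2, axis=1)
    log_w -= log_w.max()                  # numerical stabilisation
    weights = np.exp(log_w)
    weights /= weights.sum()
    # Multinomial resampling: duplicate members proportionally to their weights
    idx = rng.choice(len(weights), size=len(weights), p=weights)
    return ensemble_heights[idx], ensemble_velocities[idx]
```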

    3D Reconstruction of Small Solar System Bodies using Rendered and Compressed Images

    Synthetic image generation for Small Solar System Bodies, their three-dimensional reconstruction, and the influence of compression are becoming important study topics with the advent of small spacecraft in deep-space missions. Most of these missions are fly-by scenarios, as in the Comet Interceptor mission. Because of the limited data budgets of small satellite missions, maximising scientific return requires investigating the effects of lossy compression. A preliminary simulation pipeline had been developed that uses physics-based rendering in combination with procedural terrain generation to overcome the limitations of currently used image-rendering methods such as the Hapke model. The rendered Small Solar System Body images are combined with a star background and photometrically calibrated to represent realistic imagery. Subsequently, a Structure-from-Motion pipeline reconstructs three-dimensional models from the rendered images. In this work, the preliminary simulation pipeline was developed further into the Space Imaging Simulator for Proximity Operations software package, and a compression package was added. The compression package was used to investigate the effects of lossy compression on the reconstructed models and the data reduction achievable with lossy compression relative to lossless compression. Several scenarios with fly-by distances ranging from 50 km to 400 km and body sizes of 1 km and 10 km were simulated and compressed losslessly and at several quality levels of lossy compression, using PNG and JPEG 2000 respectively. It was found that low compression ratios introduce artefacts resembling random noise, while high compression ratios remove surface features. The random-noise artefacts introduced by low compression ratios frequently increased the number of vertices and faces of the reconstructed three-dimensional model.
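    To make the PNG versus JPEG 2000 comparison concrete, here is a minimal sketch of compressing a single rendered image losslessly and lossily and computing the resulting compression ratios. It assumes Pillow built with OpenJPEG support and an 8-bit grey-scale input; it is not the paper's actual compression package, and the rate value is only an example.

```python
import io
from PIL import Image  # Pillow; JPEG 2000 support requires the OpenJPEG backend

def compression_ratios(image_path, compression_rate=20):
    """Compare lossless PNG with lossy JPEG 2000 at a target rate (sketch)."""
    img = Image.open(image_path).convert("L")   # assume an 8-bit grey-scale render
    raw_bytes = img.width * img.height          # uncompressed 8-bit size

    png_buf = io.BytesIO()
    img.save(png_buf, format="PNG")             # lossless reference

    j2k_buf = io.BytesIO()
    img.save(j2k_buf, format="JPEG2000",
             quality_mode="rates", quality_layers=[compression_rate])

    return {
        "png_ratio": raw_bytes / png_buf.tell(),
        "jpeg2000_ratio": raw_bytes / j2k_buf.tell(),
    }
```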

    Development of an image based 3D technique to determine spread patterns of centrifugal fertilizer spreaders


    Design and Implementation of the Kinect Controlled Electro-Mechanical Skeleton (K.C.E.M.S)

    Mimicking real-time human motion with a low-cost solution has been an extremely difficult task in the past, but the release of the Microsoft Kinect motion capture system has simplified this problem. This thesis discusses the feasibility and design of a simple robotic skeleton that uses the Kinect to mimic human movements in near real time. The goal of this project is to construct a 1/3-scale model of a robotically enhanced skeleton and demonstrate the abilities of the Kinect as a tool for mimicking human movement. The resulting robot was able to mimic many human movements but was mechanically limited in the shoulders. Its movements were slower than real time because the controller could not handle real-time motion. This research was presented and published at the 2012 SouthEastCon. In addition, research papers on the formula hybrid accumulator design and the 2010 autonomous surface vehicle were presented and published.
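    As a hedged sketch of how Kinect skeleton data can drive such a robot, the snippet below computes a joint angle from three tracked 3-D joint positions and maps it to a hobby-servo pulse width. The coordinates, servo range and function names are illustrative assumptions, not the controller described in the thesis.

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle at joint b (degrees) formed by 3-D points a-b-c, e.g. shoulder-elbow-wrist."""
    v1, v2 = np.asarray(a) - np.asarray(b), np.asarray(c) - np.asarray(b)
    cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

def angle_to_servo(angle_deg, min_us=1000, max_us=2000):
    """Map a 0-180 degree joint angle to a hobby-servo pulse width in microseconds."""
    return int(min_us + (max_us - min_us) * np.clip(angle_deg, 0, 180) / 180.0)

# Example with made-up skeleton coordinates (metres, Kinect camera space)
elbow = joint_angle([0.2, 0.4, 2.0], [0.3, 0.1, 2.0], [0.5, 0.1, 1.9])
print(elbow, angle_to_servo(elbow))
```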

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz), resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available, and the tasks they have been used for, from low-level vision (feature detection and tracking, optical flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
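    The per-pixel event stream mentioned above can be illustrated with a minimal sketch that represents each event as a (timestamp, x, y, polarity) record and accumulates events over a time window into a simple event frame, one of the elementary representations discussed in this literature. Names and types are illustrative assumptions.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Event:
    t: float        # timestamp in seconds (microsecond resolution in practice)
    x: int          # pixel column
    y: int          # pixel row
    polarity: int   # +1 brightness increase, -1 brightness decrease

def accumulate(events, height, width, t_start, t_end):
    """Sum event polarities per pixel over [t_start, t_end) to form an event frame."""
    frame = np.zeros((height, width), dtype=np.int32)
    for e in events:
        if t_start <= e.t < t_end:
            frame[e.y, e.x] += e.polarity
    return frame
```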

    Event-based neuromorphic stereo vision


    3D data fusion by depth refinement and pose recovery

    Refining depth maps from different sources into a single, more accurate depth map, and rigidly aligning point clouds captured from different views, are two core techniques in 3D data fusion. Existing depth fusion algorithms do not provide a general framework for obtaining a highly accurate depth map. Furthermore, existing rigid point cloud registration algorithms do not always align noisy point clouds robustly and accurately, especially when there are many outliers and large occlusions. In this thesis, we present a general depth fusion framework based on supervised, semi-supervised, and unsupervised adversarial network approaches. We show that the depth maps refined by fusion are more accurate than the source depth maps. We also develop a new rigid point cloud registration algorithm that aligns two uncertainty-based Gaussian mixture models representing the structures of the two point clouds, and we show that it registers rigid point clouds more accurately over a larger range of perturbations. The new supervised depth fusion algorithm and the new rigid point cloud registration algorithm were subsequently integrated into the ROS system of a real gardening robot (TrimBot) for practical use in real environments. All the proposed algorithms have been evaluated on multiple existing datasets, showing their superiority over prior work in the field.
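    For intuition, the sketch below shows one EM iteration of a simplified GMM-based rigid registration (Coherent-Point-Drift-style): soft correspondences followed by a weighted Kabsch solution. It is only a minimal illustration of the general idea, not the thesis's uncertainty-based Gaussian mixture algorithm, and all names are assumptions.

```python
import numpy as np

def rigid_gmm_step(source, target, sigma2):
    """One EM iteration of a simplified GMM-based rigid registration (sketch).

    source : (N, 3) points to be aligned
    target : (M, 3) points, treated as centres of an isotropic Gaussian mixture
    sigma2 : current mixture variance
    """
    # E-step: soft correspondences (responsibilities) of each source point
    d2 = ((source[:, None, :] - target[None, :, :]) ** 2).sum(-1)        # (N, M)
    P = np.exp(-d2 / (2.0 * sigma2))
    P /= P.sum(axis=1, keepdims=True) + 1e-12

    # M-step: weighted Kabsch solution for the rigid transform
    w = P.sum()
    mu_s = (P.sum(axis=1)[:, None] * source).sum(axis=0) / w
    mu_t = (P.sum(axis=0)[:, None] * target).sum(axis=0) / w
    H = (source - mu_s).T @ P @ (target - mu_t)                          # (3, 3)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = mu_t - R @ mu_s

    diff2 = ((source @ R.T + t)[:, None, :] - target[None, :, :]) ** 2   # (N, M, 3)
    new_sigma2 = (P[:, :, None] * diff2).sum() / (3.0 * w)
    return R, t, new_sigma2
```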

    Data fusion architecture for intelligent vehicles

    Traffic accidents are an important socio-economic problem. Every year, the cost in human lives and the economic consequences are enormous. In recent years, efforts to reduce or mitigate this problem have led to a reduction in casualties, but the death toll in road accidents remains high, which means there is still much work to be done. Recent advances in information technology have led to more complex applications that can assist or even substitute for the driver in hazardous situations, allowing safer and more efficient driving. These complex systems, however, require more reliable and accurate sensing technology capable of perceiving the surrounding environment and identifying the different objects and road users within it. The sensing technology available today is insufficient on its own, so combining the different available technologies is mandatory in order to fulfil the demanding requirements of road safety applications. In this way, the limitations of each individual sensor are overcome and more dependable and reliable information is obtained. Applications of this kind are called Data Fusion (DF) applications. This thesis provides a solution to the Data Fusion problem in the Intelligent Transport Systems (ITS) field by proposing a set of techniques and algorithms that combine information from different sensors. By combining these sensors, the performance of classical ITS approaches can be enhanced to satisfy the demands of safety applications. The work presented relates to two research fields. Intelligent Transport Systems, the field in which this thesis is framed, uses recent advances in information technology to increase the security and efficiency of transport systems. Data Fusion techniques, on the other hand, address the combination of information from different sources, enhancing the basic capabilities of the systems and adding trustworthiness to their inferences. This work applies Data Fusion algorithms and techniques to classic ITS applications. The sensors used here are a laser scanner and computer vision. The former is a well-known, widely used sensor that in recent years has started to be applied to different ITS applications, showing good performance mainly thanks to its reliability. The latter is a more recent sensor in automotive applications that has been widely used in ITS advances during the last decade; thanks to computer vision, road safety applications such as traffic sign detection, driver monitoring, lane detection and pedestrian detection are becoming possible. This thesis addresses the environment reconstruction problem, identifying road users (i.e. pedestrians and vehicles) by means of Data Fusion techniques. The solution delivers a complete, level-based answer to the Data Fusion problem, providing tools both for detecting other road users and for estimating the degree of danger associated with each detection. The presented algorithms represent a step forward in the ITS field, providing novel Data Fusion based algorithms that detect and estimate the movement of pedestrians and vehicles in a robust and trustworthy way. To perform such a demanding task, additional information sources were needed: GPS, inertial systems and context information.
Finally, it is important to remark that, within the framework of this thesis, the lack of detection and identification techniques based on laser radar made it necessary to research and provide more innovative approaches, based on the use of a laser scanner, capable of detecting and identifying the different actors involved in the road environment.
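    As a minimal illustration of sensor-level data fusion of this kind, the sketch below associates 2-D object detections from a laser scanner and a camera by nearest-neighbour gating and fuses matched pairs with inverse-variance weighting. The sensor variances, the gate and all names are assumptions; the thesis's multi-level architecture (tracking, GPS/INS, context information) is considerably richer.

```python
import numpy as np

def fuse_detections(laser_xy, camera_xy, laser_var=0.05, camera_var=0.5, gate=1.5):
    """Associate and fuse 2-D object positions from a laser scanner and a camera (sketch).

    laser_xy, camera_xy : (N, 2) and (M, 2) positions in a common vehicle frame (metres)
    Returns fused positions using inverse-variance weighting for associated pairs.
    """
    fused, used = [], set()
    for l in laser_xy:
        d = np.linalg.norm(camera_xy - l, axis=1) if len(camera_xy) else np.array([])
        j = int(d.argmin()) if d.size else -1
        if j >= 0 and d[j] < gate and j not in used:
            used.add(j)
            w_l, w_c = 1.0 / laser_var, 1.0 / camera_var
            fused.append((w_l * l + w_c * camera_xy[j]) / (w_l + w_c))
        else:
            fused.append(l)          # unmatched laser detection kept as-is
    return np.array(fused)
```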

    Use of Synthetic Data for 3D Hand Pose Recognition

    Thesis (Ph.D.) -- Seoul National University, Graduate School of Convergence Science and Technology, Department of Transdisciplinary Studies (Intelligent Convergence Systems major), August 2021. Yang Han-yeol. 3D hand pose estimation (HPE) based on RGB images has been studied for a long time. Relevant methods have focused mainly on optimizing neural frameworks for the graphically connected finger joints. Training RGB-based HPE models has not been easy because of the scarcity of RGB hand pose datasets; unlike human body pose datasets, the finger joints that span hand postures are delicately and intricately structured. Such structure makes accurately annotating each joint with unique 3D world coordinates difficult, which is why many conventional methods rely on synthetic data samples to cover large variations of hand postures. A synthetic dataset provides very precise ground-truth annotations and further allows control over the variety of data samples, so that a learning model can be trained over a large pose space. Most studies, however, have performed frame-by-frame estimation based on independent static images. Synthetic visual data can provide practically infinite diversity and rich labels, while avoiding ethical issues with privacy and bias. However, for many tasks, current models trained on synthetic data generalize poorly to real data. The task of 3D human hand pose estimation is a particularly interesting example of this synthetic-to-real problem, because learning-based approaches perform reasonably well given real training data, yet labeled 3D poses are extremely difficult to obtain in the wild, limiting scalability. In this dissertation, we not only consider the appearance of a hand but also incorporate the temporal movement information of a hand in motion into the learning framework for better 3D hand pose estimation, which requires a large-scale dataset of sequential RGB hand images. We propose a novel method that generates a synthetic dataset mimicking natural human hand movements by re-engineering the annotations of an existing static hand pose dataset into pose-flows. With the generated dataset, we train a newly proposed recurrent framework, exploiting visuo-temporal features from sequential images of synthetic hands in motion and emphasizing temporal smoothness of the estimates with a temporal consistency constraint. Our training strategy of detaching the recurrent layer of the framework during domain fine-tuning from synthetic to real data preserves the visuo-temporal features learned from sequential synthetic hand images. The sequentially estimated hand poses consequently produce natural and smooth hand movements, which lead to more robust estimates. We show that utilizing temporal information significantly improves 3D hand pose estimation, outperforming state-of-the-art methods on hand pose estimation benchmarks. Since a fixed dataset provides only a finite distribution of data samples, the generalization of a learned pose estimation network is limited in terms of pose, RGB and viewpoint spaces. We further propose to augment the data automatically so that poses are sampled in favor of the pose estimator's generalization performance. Such auto-augmentation of poses is performed within a learned feature space in order to avoid the computational burden of generating a synthetic sample at every update iteration. The proposed effort can be considered as generating and utilizing synthetic samples for network training in the feature space.
This improves training efficiency by requiring fewer real data samples, and yields better generalization across multiple dataset domains and better estimation performance thanks to the efficient augmentation.
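    The two training ideas named in the abstract, a temporal-consistency penalty on sequential pose estimates and freezing ("detaching") the recurrent layer during synthetic-to-real fine-tuning, can be sketched in PyTorch as below. The architecture, dimensions and names are illustrative assumptions rather than the dissertation's actual model.

```python
import torch
import torch.nn as nn

class SeqPoseNet(nn.Module):
    """Sketch of a recurrent hand-pose estimator: per-frame encoder + LSTM + pose head."""
    def __init__(self, feat_dim=256, num_joints=21):
        super().__init__()
        self.backbone = nn.Sequential(          # stand-in for the image encoder
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat_dim))
        self.rnn = nn.LSTM(feat_dim, feat_dim, batch_first=True)
        self.head = nn.Linear(feat_dim, num_joints * 3)

    def forward(self, frames):                  # frames: (B, T, 3, H, W)
        B, T = frames.shape[:2]
        f = self.backbone(frames.flatten(0, 1)).view(B, T, -1)
        h, _ = self.rnn(f)
        return self.head(h).view(B, T, -1, 3)   # (B, T, J, 3) joint coordinates

def temporal_consistency_loss(poses):
    """Penalise frame-to-frame jitter of the estimated joint trajectories."""
    return (poses[:, 1:] - poses[:, :-1]).pow(2).mean()

# During synthetic-to-real fine-tuning, the recurrent layer can be frozen ("detached")
# so that visuo-temporal features learned on synthetic sequences are preserved:
model = SeqPoseNet()
for p in model.rnn.parameters():
    p.requires_grad = False
```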