42 research outputs found
Using Linear Features for Aerial Image Sequence Mosaicking
With recent advances in sensor technology and digital image processing techniques, automatic image mosaicking has received increased attention in a variety of geospatial applications, ranging from panorama generation and video surveillance to image-based rendering. The geometric transformation used to link images in a mosaic is the subject of image orientation, a fundamental photogrammetric task that represents a major research area in digital image analysis. It involves determining the parameters that express the location and pose of a camera at the time it captured an image. In aerial applications the typical parameters comprise two translations (along the x and y coordinates) and one rotation (about the z axis). Orientation typically proceeds by extracting control points, i.e. points with known coordinates, from an image. Salient points such as road intersections and building corners are commonly used for this task. However, such points may contain little information beyond their radiometric uniqueness and, more importantly, in some areas (e.g. rural and arid regions) they may be impossible to obtain. To overcome this problem we introduce an alternative approach that uses linear features such as roads and rivers for image mosaicking. Such features are identified and matched to their counterparts in overlapping imagery. Our matching approach uses critical points (e.g. breakpoints) of linear features and the information they convey (e.g. local curvature values and distance metrics) to match two such features and orient the images in which they are depicted. In this manner we orient overlapping images by comparing breakpoint representations of complete or partial linear features depicted in them. By considering broader feature metrics (instead of single points) in our matching scheme, we aim to eliminate the effect of erroneous point matches in image mosaicking.
Our approach does not require prior approximate parameters, which are typically an essential requirement for the successful convergence of point matching schemes. Furthermore, we show that large rotation variations about the z-axis may be recovered. With the acquired orientation parameters, image sequences are mosaicked. Experiments with synthetic aerial image sequences are included in this thesis to demonstrate the performance of our approach.
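Once breakpoint correspondences between two overlapping images have been established, the three orientation parameters described above (translations along x and y, plus one rotation about the z axis) admit a closed-form least-squares solution. The sketch below is a generic 2-D rigid-alignment routine with illustrative names, not the thesis's own code; it assumes the breakpoint matching has already supplied the point pairs.

```python
import numpy as np

def estimate_rigid_2d(src, dst):
    """Least-squares estimate of the rotation about the z-axis and the
    x/y translation mapping src points onto dst points (2-D Kabsch)."""
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    sc, dc = src.mean(axis=0), dst.mean(axis=0)
    s, d = src - sc, dst - dc
    # Closed-form angle from the sums of cross and dot products.
    theta = np.arctan2((s[:, 0] * d[:, 1] - s[:, 1] * d[:, 0]).sum(),
                       (s * d).sum())
    c, si = np.cos(theta), np.sin(theta)
    R = np.array([[c, -si], [si, c]])
    t = dc - R @ sc
    return theta, t

# Toy check: points rotated by 30 degrees and shifted recover the motion.
rng = np.random.default_rng(0)
pts = rng.random((6, 2)) * 100
true_theta = np.deg2rad(30.0)
Rt = np.array([[np.cos(true_theta), -np.sin(true_theta)],
               [np.sin(true_theta),  np.cos(true_theta)]])
moved = pts @ Rt.T + np.array([12.0, -7.0])
theta, t = estimate_rigid_2d(pts, moved)
```

In the mosaicking setting, `src` and `dst` would hold matched breakpoints of the same linear feature seen in two overlapping images.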
VISUAL MOVEMENT IDENTIFICATION PROCESS WITH REAL TIME PARAMETERS
Motion recognition can be used to determine the trajectory of a projectile, its orientation with respect to a plane, its velocity and its spin, and is therefore very helpful for detecting the presence of a projectile in high-speed video. Motion detection is generally a software-based monitoring algorithm that, when it detects motion, signals the surveillance camera to start recording the event, or simply displays the detected motion graphically. An Android application can be used to listen to the audio, view the video and control the gate from a remote location; the recordings are stored in the principal room of the institution. MATLAB is well suited to this type of operation because of its highly accurate and efficient nature, effectively transforming a computer into a motion recognition system that handles real-time motion tracking with cameras. This paper presents a software-based motion recognition system, designed to produce a customer identification system in which, when motion is detected, the MATLAB system reads a predefined message.
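The core of such a detector is thresholded frame differencing: compare consecutive frames and flag motion when enough pixels change. A minimal sketch follows; the paper's own implementation is in MATLAB, and the parameter names and values here are purely illustrative.

```python
import numpy as np

def detect_motion(prev_frame, curr_frame, threshold=25, min_changed=50):
    """Flag motion when enough pixels differ between consecutive frames.
    A generic stand-in for thresholded frame differencing; the threshold
    and minimum-pixel-count parameters are illustrative assumptions."""
    diff = np.abs(curr_frame.astype(int) - prev_frame.astype(int))
    changed = int((diff > threshold).sum())
    return changed >= min_changed, changed

# Two synthetic 64x64 grayscale frames: a bright 10x10 block moves by (2, 2).
prev = np.zeros((64, 64), dtype=np.uint8)
curr = np.zeros((64, 64), dtype=np.uint8)
prev[10:20, 10:20] = 200
curr[12:22, 12:22] = 200
moving, n_changed = detect_motion(prev, curr)
```

A real system would add noise filtering (e.g. blurring or morphological cleanup) before thresholding; this sketch shows only the differencing step.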
Image-based food classification and volume estimation for dietary assessment: a review.
A daily dietary assessment method named 24-hour dietary recall has commonly been used in nutritional epidemiology studies to capture detailed information about the food eaten by participants and to help understand their dietary behaviour. However, in this self-reporting technique, the food types and portion sizes reported depend heavily on the users' subjective judgement, which may lead to biased and inaccurate dietary analysis results. As a result, a variety of vision-based dietary assessment approaches have been proposed recently. While these methods show promise in tackling issues in nutritional epidemiology studies, several challenges and forthcoming opportunities, as detailed in this study, still exist. This study provides an overview of the computing algorithms, mathematical models and methodologies used in the field of image-based dietary assessment. It also provides a comprehensive comparison of state-of-the-art approaches in food recognition and volume/weight estimation in terms of their processing speed, model accuracy, efficiency and constraints. This is followed by a discussion of deep learning methods and their efficacy in dietary assessment. After a comprehensive exploration, we found that integrated dietary assessment systems combining different approaches could be the potential solution to tackling the challenges of accurate dietary intake assessment.
Uncalibrated stereo vision applied to breast cancer treatment aesthetic assessment
Integrated Master's. Informatics and Computing Engineering. Universidade do Porto, Faculdade de Engenharia. 201
Key characteristics of specular stereo.
Because specular reflection is view-dependent, shiny surfaces behave radically differently from matte, textured surfaces when viewed with two eyes. As a result, specular reflections pose substantial problems for binocular stereopsis. Here we use a combination of computer graphics and geometrical analysis to characterize the key respects in which specular stereo differs from standard stereo, to identify how and why the human visual system fails to reconstruct depths correctly from specular reflections. We describe rendering of stereoscopic images of specular surfaces in which the disparity information can be varied parametrically and independently of monocular appearance. Using the generated surfaces and images, we explain how stereo correspondence can be established with known and unknown surface geometry. We show that even with known geometry, stereo matching for specular surfaces is nontrivial because points in one eye may have zero, one, or multiple matches in the other eye. Matching features typically yield skew (nonintersecting) rays, leading to substantial ortho-epipolar components to the disparities, which makes deriving depth values from matches nontrivial. We suggest that the human visual system may base its depth estimates solely on the epipolar components of disparities while treating the ortho-epipolar components as a measure of the underlying reliability of the disparity signals. Reconstructing virtual surfaces according to these principles reveals that they are piecewise smooth with very large discontinuities close to inflection points on the physical surface. Together, these distinctive characteristics lead to cues that the visual system could use to diagnose specular reflections from binocular information. The work was funded by the Wellcome Trust (grants 08459/Z/07/Z & 095183/Z/10/Z) and the EU Marie Curie Initial Training Network “PRISM” (FP7-PEOPLE-2012-ITN, Agreement: 316746). This is the author accepted manuscript.
The final version is available from ARVO via http://dx.doi.org/10.1167/14.14.1
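The disparity treatment the paper suggests, taking depth from the epipolar component while using the ortho-epipolar component as a reliability signal, reduces to a simple vector projection. A minimal sketch with illustrative names:

```python
import numpy as np

def split_disparity(disparity, epipolar_dir):
    """Project a 2-D disparity vector onto the epipolar direction and its
    orthogonal complement. Per the paper's proposal, depth would be derived
    from the epipolar part, while the ortho-epipolar magnitude serves as a
    reliability measure. Variable names here are illustrative."""
    e = np.asarray(epipolar_dir, float)
    e = e / np.linalg.norm(e)
    d = np.asarray(disparity, float)
    epi = float(d @ e)         # signed component along the epipolar line
    ortho = d - epi * e        # residual off the epipolar line
    return epi, float(np.linalg.norm(ortho))

# Matched specular features often yield skew rays, i.e. disparities with a
# nonzero ortho-epipolar part:
epi, ortho_mag = split_disparity([3.0, 4.0], [1.0, 0.0])
```

Here a large `ortho_mag` relative to `epi` would mark the match as unreliable rather than feeding it into depth reconstruction.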
A non-contact geomatics technique for monitoring membrane roof structures
This thesis presents research carried out to monitor the behaviour of membrane structures, using the non-contact geomatics techniques of terrestrial laser scanning and videogrammetry. Membrane structures are covers or enclosures in which a fabric surface is pre-tensioned to provide a stable shape under environmental loads. They are most often adopted by structural engineers as the solution for the roof of a building. Membrane structures resist externally-imposed loads by a combination of curvature and tension of the highly flexible fabric membrane. However, collapse may occur if the real deflections exceed the designed tolerances. In order to avoid such failures in the future, a generic monitoring system, incorporating in-house software for observing and analysing the behaviour of existing membrane structures, was developed. This system has been applied to observe three different types of as-built membrane structures, with two primary issues investigated and resolved. The first aspect of the research was devoted to determining differences that exist between the designed model and the finished structure. To address this issue, terrestrial laser scanning was applied to generate the as-built model of the membrane structure. Statistical comparisons were then performed between the resultant scanned model and the designed mathematical model. The disparities were determined, allowing the factors causing these differences to be explored further. The second research issue investigated the effects of loading on the displacement of the membrane roof. A videogrammetric monitoring system employing stereo CCD video cameras was used to observe the movements of the membrane roofs. In order to accommodate constraints at the test site, a non-contact control method and structured-light targeting were adopted in the monitoring scheme. Once the processing was completed, displacements occurring over time were determined.
Investigations of the three types of finished membrane structures were successfully completed, proving the system to be a viable metrology tool for structural engineers involved in monitoring real-world membrane structures. The system effectively fulfilled the requirements for understanding the interaction of membrane surface geometry, applied loads and structural response. The information acquired by the system offers great potential to collaborating engineers who are involved in the design and refinement of such structures.
EThOS - Electronic Theses Online Service, United Kingdom
Perceptual monocular depth estimation
Monocular depth estimation (MDE), which is the task of using a single image to predict scene depths, has gained considerable interest, in large part owing to the popularity of applying deep learning methods to solve “computer vision problems”. Monocular cues provide sufficient data for humans to instantaneously extract an understanding of scene geometries and relative depths, which is evidence of both the processing power of the human visual system and the predictive power of the monocular data. However, developing computational models to predict depth from monocular images remains challenging. Hand-designed MDE features do not perform particularly well, and even current “deep” models are still evolving. Here we propose a novel approach that uses perceptually-relevant natural scene statistics (NSS) features to predict depths from monocular images in a simple, scale-agnostic way that is competitive with state-of-the-art systems. While the statistics of natural photographic images have been successfully used in a variety of image and video processing, analysis, and quality assessment tasks, they have never been applied in a predictive end-to-end deep-learning model for monocular depth. Here we accomplish this by developing a new closed-form bivariate model of image luminances and use features extracted from this model and from other NSS models to drive a novel deep learning framework for predicting depth given a single image. We then extend our perceptually-based MDE model to fisheye images, which suffer from severe spatial distortions, and we show that our method that uses monocular cues performs comparably to our best fisheye stereo matching approach. Fisheye cameras have become increasingly popular in automotive applications, because they provide a wider (approximately 180 degrees) field-of-view (FoV), thereby giving drivers and driver assistance systems more visibility with minimal hardware. 
We explore fisheye stereo as it pertains to the problem of automotive surround-view (SV), specifically, which is a system comprising four fisheye cameras positioned on the front, right, rear, and left sides of a vehicle. The SV system perspectively transforms the images captured by these four cameras and stitches them together in a bird's-eye-view representation of the scene centered around the ego vehicle to display to the driver. With the camera axes oriented orthogonally away from each other and with each camera capturing approximately 180 degrees laterally, there exists an overlap in FoVs between adjacent cameras. It is within these regions that we have stereo vision, and can thus triangulate depths with an appropriate correspondence matching method. Each stereo system within the SV configuration has a wide baseline and two orthogonally-divergent camera axes, both of which make traditional methods for estimating stereo correspondences perform poorly. Our stereo pipeline, which relies on a neural network trained for predicting stereo correspondences, performs well even when the stereo system has limited overlap in FoVs and two dissimilar views. Our monocular approach, however, can be applied to entire fisheye images and does not rely on the underlying geometry of the stereo configuration. We compare these two depth-prediction methods in both performance and application. To explore stereo correspondence matching using fisheye images and MDE in non-fisheye images, we also generated a large-scale photorealistic synthetic database containing co-registered RGB images and depth maps using a simulated SV camera configuration. The database was first captured using fisheye cameras with known intrinsic parameters, and the fisheye distortions were then removed to create the non-fisheye portion of the database.
We detail the process of creating the synthetic-but-realistic city scene in which we captured the images and depth maps, along with the methodology for generating such a large, varied, and generalizable dataset.
Electrical and Computer Engineering
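A common front end for the perceptually-relevant NSS features this dissertation builds on is the mean-subtracted, contrast-normalized (MSCN) transform of image luminances. The sketch below is a generic illustration of that transform, not necessarily the exact feature computation used in this work; the window size and stabilizing constant are illustrative.

```python
import numpy as np

def mscn(image, window=7, C=1.0):
    """Mean-subtracted, contrast-normalized (MSCN) coefficients: each pixel
    is normalized by a local mean and local standard deviation. A brute-force
    sketch of a standard NSS front end; parameters are illustrative."""
    img = np.asarray(image, float)
    pad = window // 2
    padded = np.pad(img, pad, mode='reflect')
    mu = np.zeros_like(img)
    mu2 = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            patch = padded[i:i + window, j:j + window]
            mu[i, j] = patch.mean()
            mu2[i, j] = (patch ** 2).mean()
    sigma = np.sqrt(np.maximum(mu2 - mu ** 2, 0.0))
    return (img - mu) / (sigma + C)

rng = np.random.default_rng(1)
coeffs = mscn(rng.random((32, 32)) * 255)
```

For natural images the resulting coefficients are approximately zero-mean and heavy-tailed; statistics fit to their distribution are the kind of features an NSS-driven depth predictor could consume.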
Real-time 3D person tracking and dense stereo maps using GPU acceleration
Interfacing with a computer, especially when interacting with a virtual three-dimensional (3D) scene, found in video games for example, can be frustrating when using only a mouse and keyboard. Recent work has focused on alternative modes of interaction, including 3D tracking of the human body. One of the essential steps in this process is acquiring depth information about the scene. Stereo vision is the process of using two separate images of the same scene, taken from slightly different positions, to obtain a three-dimensional view of the scene. One of the largest issues with dense stereo map generation is the high processor usage, which usually prevents this process from being done in real time. To solve this problem, this project moves the bulk of the processing to the GPU. The depth map extraction is done by matching points between the images and using the difference in their positions to determine depth, using multiple passes in a series of OpenGL vertex and fragment shaders. Once a depth map has been created, the software uses it to track a person's movement and pose in three dimensions, by tracking key points on the person across frames and using the depth map to find the third dimension.
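The per-pixel matching that such shader passes implement can be sketched on the CPU as classic sum-of-absolute-differences (SAD) block matching along epipolar rows. This is a generic reference implementation, not the project's shader code; the window and search-range parameters are illustrative.

```python
import numpy as np

def disparity_sad(left, right, max_disp=8, block=5):
    """Dense disparity by SAD block matching: for each left-image pixel,
    slide a window along the same row of the right image and keep the
    shift with the lowest cost. A plain-Python reference for the matching
    a GPU pipeline would run per-pixel in parallel."""
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=int)
    for y in range(half, h - half):
        for x in range(half, w - half):
            tpl = left[y - half:y + half + 1, x - half:x + half + 1].astype(int)
            best, best_d = None, 0
            for d in range(min(max_disp, x - half) + 1):
                cand = right[y - half:y + half + 1,
                             x - d - half:x - d + half + 1].astype(int)
                cost = np.abs(tpl - cand).sum()
                if best is None or cost < best:
                    best, best_d = cost, d
            disp[y, x] = best_d
    return disp

# Synthetic pair: the right image is the left shifted 3 px to the left,
# so matching pixels sit 3 columns apart.
left = np.tile(np.arange(32, dtype=np.uint8) * 8, (16, 1))
right = np.roll(left, -3, axis=1)
disp = disparity_sad(left, right)
```

On the GPU version, each (y, x) iteration becomes one fragment-shader invocation, which is what removes the real-time bottleneck this abstract describes.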
Least squares optimization: From theory to practice
Nowadays, nonlinear least squares forms the foundation of many robotics and computer vision systems. The research community has investigated this topic deeply in recent years, resulting in the development of several open-source solvers that address constantly growing classes of problems. In this work, we propose a unified methodology for designing and developing efficient least-squares optimization algorithms, focusing on the structures and patterns of each specific domain. Furthermore, we present a novel open-source optimization system that transparently addresses problems with different structures and is designed to be easy to extend. The system is written in modern C++ and runs efficiently on embedded systems. We validated our approach by conducting comparative experiments on several problems using standard datasets. The results show that our system achieves state-of-the-art performance in all tested scenarios.
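The core of the solvers this paper unifies is the Gauss-Newton iteration: linearize the residual, solve the normal equations, and update. A textbook sketch on a toy trilateration problem follows; it is a generic illustration, not the paper's own system, and all names are illustrative.

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, iters=25):
    """Undamped Gauss-Newton loop: at each step, solve the normal
    equations J^T J dx = -J^T r for the update dx. A minimal textbook
    sketch (production solvers add damping, e.g. Levenberg-Marquardt)."""
    x = np.asarray(x0, float)
    for _ in range(iters):
        r = residual(x)
        J = jacobian(x)
        x = x + np.linalg.solve(J.T @ J, -J.T @ r)
    return x

# Toy problem: locate a 2-D point from its distances to three beacons
# (a zero-residual problem, so plain Gauss-Newton converges quickly).
beacons = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
target = np.array([3.0, 4.0])
dists = np.linalg.norm(beacons - target, axis=1)

res = lambda x: np.linalg.norm(beacons - x, axis=1) - dists
jac = lambda x: (x - beacons) / np.linalg.norm(beacons - x, axis=1)[:, None]
est = gauss_newton(res, jac, [5.0, 5.0])
```

Exploiting problem-specific structure, e.g. the sparsity of J^T J in bundle adjustment or SLAM, is precisely the kind of domain pattern the paper's methodology focuses on.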