
    Use of Coherent Point Drift in computer vision applications

    This thesis presents the novel use of Coherent Point Drift (CPD) in improving the robustness of a number of computer vision applications. The CPD approach includes two methods for registering two point sets, rigid and non-rigid, distinguished by the transformation model used. The key characteristic of a rigid transformation is that the distance between points is preserved; in CPD this registration mode handles translation, rotation, and uniform scaling. Non-rigid transformations, which include affine transforms, additionally allow registration under non-uniform scaling and skew. The idea is to move one point set coherently to align with the second point set. The CPD method finds both the transformation and the correspondence between the two point sets at the same time, without requiring an a-priori declaration of the transformation model. The first part of this thesis focuses on speaker identification in video conferencing. A real-time, audio-coupled, video-based approach is presented that relies primarily on video analysis rather than audio analysis, which is known to be prone to errors. CPD is used for lip movement detection, and a temporal face detection approach is used to minimise false positives when the face detection algorithm fails to perform. The second part of the thesis focuses on multi-exposure and multi-focus image fusion with compensation for camera shake. The Scale Invariant Feature Transform (SIFT) is first used to detect keypoints in the images being fused. This point set is then reduced using RANSAC (RANdom SAmple Consensus) to remove outliers, and finally the point sets are registered using CPD with a non-rigid transformation. The registered images are then fused with a Contourlet-based image fusion algorithm that uses a novel alpha blending and filtering technique to minimise artefacts. The thesis evaluates the performance of the algorithm against a number of state-of-the-art approaches, including the key commercial products currently on the market, showing significantly improved subjective quality in the fused images. The final part of the thesis presents a novel approach to Vehicle Make and Model Recognition (VMMR) in CCTV video footage. CPD is used to remove the skew of detected vehicles, since CCTV cameras are not specifically configured for the VMMR task and may capture vehicles at varying approach angles. A LESH (Local Energy Shape Histogram) feature-based approach is used for vehicle make and model recognition, with the novelty that temporal processing is used to improve reliability. A number of further algorithms are used to maximise the reliability of the final outcome. Experimental results show that the proposed system achieves an accuracy in excess of 95% when tested on real CCTV footage with no prior camera calibration.
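    The registration pipeline in the fusion part of this abstract (SIFT keypoints, RANSAC outlier removal, then non-rigid CPD alignment) can be sketched roughly as follows. This is a minimal sketch assuming OpenCV for SIFT and RANSAC and the third-party pycpd package for CPD; it is not the thesis's own implementation, and the Contourlet-based fusion step is omitted.

```python
# Hedged sketch: SIFT keypoints -> RANSAC outlier rejection -> non-rigid CPD.
# OpenCV and pycpd are stand-ins, not the thesis's implementation.
import cv2
import numpy as np
from pycpd import DeformableRegistration  # assumed third-party CPD implementation

def register_keypoints(img_ref, img_mov):
    """Expects single-channel uint8 images; returns matched reference points
    and the moving points after non-rigid CPD alignment."""
    # 1) SIFT keypoints and descriptors in both images.
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img_ref, None)
    k2, d2 = sift.detectAndCompute(img_mov, None)

    # 2) Descriptor matching with Lowe's ratio test (an assumption, not from the thesis).
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(d1, d2, k=2)
    good = [p[0] for p in matches if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
    pts_ref = np.float32([k1[m.queryIdx].pt for m in good])
    pts_mov = np.float32([k2[m.trainIdx].pt for m in good])

    # 3) RANSAC prunes outlier correspondences (a homography model is used here
    #    only as a convenient stand-in for the thesis's outlier-removal step).
    _, mask = cv2.findHomography(pts_mov, pts_ref, cv2.RANSAC, 3.0)
    inliers = mask.ravel().astype(bool)

    # 4) Non-rigid CPD moves the surviving points coherently onto the reference set.
    reg = DeformableRegistration(X=pts_ref[inliers], Y=pts_mov[inliers])
    aligned_mov, _ = reg.register()
    return pts_ref[inliers], aligned_mov
```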

    Spectrum radial velocity analyser (SERVAL). High-precision radial velocities and two alternative spectral indicators

    Context: The CARMENES survey is a high-precision radial velocity (RV) programme that aims to detect Earth-like planets orbiting low-mass stars. Aims: We develop least-squares fitting algorithms to derive the RVs and additional spectral diagnostics implemented in the SpEctrum Radial Velocity Analyser (SERVAL), a publicly available python code. Methods: We measure the RVs using high signal-to-noise templates created by coadding all available spectra of each star. We define the chromatic index as the gradient of the RV as a function of wavelength, derived from the RVs measured in the individual echelle orders. Additionally, we compute the differential line width by correlating the fit residuals with the second derivative of the template to track variations in the stellar line width. Results: Using HARPS data, our SERVAL code achieves an RV precision at the level of 1 m/s. Applying the chromatic index to CARMENES data of the active star YZ CMi, we identify apparent RV variations induced by stellar activity. The differential line width is found to be an alternative indicator to the commonly used full width at half maximum. Conclusions: We find that at the red optical wavelengths (700-900 nm) covered by the visual channel of CARMENES, the chromatic index is an excellent tool to investigate stellar active regions and to identify, and perhaps even correct for, activity-induced RV variations. Comment: 13 pages, 13 figures. A&A in press. Code is available at https://github.com/mzechmeister/serval
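    As a rough illustration of the chromatic index described here (an RV gradient across the echelle orders), the sketch below fits a weighted slope of per-order RVs against the logarithm of wavelength. The exact definition, weighting, and units used by SERVAL should be taken from the paper and the linked code; the inputs below are hypothetical.

```python
# Hedged sketch of a chromatic index as the weighted slope of per-order RVs
# versus log(wavelength). The precise SERVAL definition may differ.
import numpy as np

def chromatic_index(order_wavelengths_nm, order_rvs_mps, order_rv_errors_mps):
    """Weighted least-squares slope of RV [m/s] versus ln(wavelength).

    order_wavelengths_nm : central wavelength of each echelle order [nm]
    order_rvs_mps        : RV measured in each order [m/s]
    order_rv_errors_mps  : 1-sigma RV uncertainty per order [m/s]
    """
    x = np.log(np.asarray(order_wavelengths_nm, dtype=float))
    y = np.asarray(order_rvs_mps, dtype=float)
    w = 1.0 / np.asarray(order_rv_errors_mps, dtype=float) ** 2

    # Weighted linear fit y = a + b*x; the slope b is the chromatic index.
    A = np.vstack([np.ones_like(x), x]).T
    sw = np.sqrt(w)
    coeffs, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coeffs[1]  # RV change per unit ln(lambda)
```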

    Single View Reconstruction for Human Face and Motion with Priors

    Single view reconstruction is fundamentally an under-constrained problem. We aim to develop new approaches to model human face and motion with model priors that restrict the space of possible solutions. First, we develop a novel approach to recover the 3D shape from a single view image under challenging conditions, such as large variations in illumination and pose. The problem is addressed by employing the techniques of non-linear manifold embedding and alignment. Specifically, the local image models for each patch of facial images and the local surface models for each patch of 3D shape are learned using a non-linear dimensionality reduction technique, and the correspondences between these local models are then learned by a manifold alignment method. Local models remove the dependency on large training databases for human face modeling. By combining the local shapes, the global shape of a face can be reconstructed directly from a single linear system of equations via least squares. Unfortunately, this learning-based approach cannot be successfully applied to the problem of human motion modeling, due to the internal and external variations present in single-view, video-based markerless motion capture. Therefore, we introduce a new model-based approach for capturing human motion using a stream of depth images from a single depth sensor. While a depth sensor provides metric 3D information, using a single sensor, instead of a camera array, results in a view-dependent and incomplete measurement of object motion. We develop a novel two-stage template fitting algorithm that is invariant to subject size and viewpoint variations, and robust to occlusions. Starting from a known pose, our algorithm first estimates a body configuration through temporal registration, which is used to search the template motion database for a best match. The best-matching body configuration, as well as its corresponding surface mesh model, are deformed to fit the input depth map, filling in the part that is occluded from the input and compensating for differences in pose and body size between the input image and the template. Our approach does not require any markers, user interaction, or appearance-based tracking. Experiments show that our approaches can achieve good modeling results for human face and motion, and are capable of dealing with a variety of challenges in single view reconstruction, e.g., occlusion.
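    A minimal sketch of the "combine the local shapes via a single linear system solved by least squares" step described above, assuming each patch supplies estimates for a known subset of global vertices. The patch layout, data structures, and absence of any regularisation are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch: stack per-patch shape estimates into one sparse linear system
# and solve it in the least-squares sense to recover the global face shape.
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def combine_local_shapes(num_vertices, patch_indices, patch_shapes):
    """Solve min ||A z - b||^2, where each patch contributes one equation per
    vertex it covers; overlapping patches are reconciled by the solve.

    patch_indices : list of index arrays (global vertex ids per patch), hypothetical
    patch_shapes  : list of per-patch value arrays (e.g. depths), hypothetical
    """
    rows = sum(len(idx) for idx in patch_indices)
    A = lil_matrix((rows, num_vertices))
    b = np.empty(rows)
    r = 0
    for idx, shape in zip(patch_indices, patch_shapes):
        for j, value in zip(idx, shape):
            A[r, j] = 1.0  # this equation ties global vertex j to the patch estimate
            b[r] = value
            r += 1
    return lsqr(A.tocsr(), b)[0]  # global per-vertex shape values
```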

    Pattern Recognition and Event Reconstruction in Particle Physics Experiments

    This report reviews methods of pattern recognition and event reconstruction used in modern high energy physics experiments. After a brief introduction to the general concepts of particle detectors and statistical evaluation, different global and local methods of track pattern recognition are reviewed, with their typical strengths and shortcomings. The emphasis then moves to methods that estimate particle properties from the signals that pattern recognition has associated with a track candidate. Finally, the global reconstruction of the event is briefly addressed. Comment: 101 pages, 58 figures
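    As a toy illustration of the estimation step mentioned above (fitting particle parameters to the hits that pattern recognition has associated with a track), here is a minimal weighted straight-line fit in one projection. The hit model, units, and the absence of a magnetic field or multiple scattering are simplifying assumptions and are not taken from the report.

```python
# Toy chi-square fit of a straight track y = intercept + slope * z to the hits
# of one track candidate; purely illustrative of least-squares track fitting.
import numpy as np

def fit_straight_track(z, y, sigma_y):
    """Weighted least-squares estimate of (intercept, slope), covariance, chi2."""
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(sigma_y, dtype=float) ** 2

    A = np.vstack([np.ones_like(z), z]).T        # design matrix
    cov = np.linalg.inv(A.T @ (w[:, None] * A))  # parameter covariance
    params = cov @ (A.T @ (w * y))               # (intercept, slope)
    chi2 = np.sum(w * (y - A @ params) ** 2)     # goodness of fit
    return params, cov, chi2
```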