25 research outputs found

    Image usefulness of compressed surveillance footage with different scene contents

    The police use both subjective (i.e. police staff) and automated (e.g. face recognition systems) methods for the completion of visual tasks (e.g. person identification). Image quality for police tasks has been defined as image usefulness, that is, the suitability of the visual material to satisfy a visual task. Usefulness is not necessarily reduced by artefacts that lower visual image quality (i.e. decrease fidelity), as long as those artefacts do not affect the information relevant to the task. The capture of useful information is affected by the unconstrained conditions commonly encountered by CCTV systems, such as variations in illumination and high compression levels. The main aim of this thesis is to investigate aspects of image quality and video compression that may affect the completion of police visual tasks/applications with respect to CCTV imagery. This is accomplished by investigating three specific police areas/tasks utilising: 1) the human visual system (HVS) for a face recognition task, 2) automated face recognition systems, and 3) automated human detection systems. These systems (HVS and automated) were assessed with defined scene content properties and video compression, i.e. H.264/MPEG-4 AVC. The performance of imaging systems/processes (e.g. subjective investigations, performance of compression algorithms) is affected by scene content properties, and no other investigation has been identified that takes scene content properties into consideration to the same extent. Results have shown that the HVS is more sensitive to compression effects than the automated systems. For automated face recognition systems, 'mixed lightness' scenes were the most affected and 'low lightness' scenes the least affected by compression. In contrast, for the HVS face recognition task, 'low lightness' scenes were the most affected and 'medium lightness' scenes the least affected. For the automated human detection systems, 'close distance' and 'run approach' scenes were among the most affected. The findings have the potential to broaden the methods used for testing imaging systems for security applications.
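
    As a loose illustration of the kind of assessment described above (not the thesis's actual pipeline), the sketch below scores a face-recognition matcher on the same scenes at several H.264 compression levels and aggregates the scores by scene lightness category; encode_h264 and match_score are hypothetical stand-ins for a codec wrapper and a recognition system.

```python
# Illustrative harness only: compares an automated matcher's scores on the
# same scenes across H.264 compression levels, grouped by scene content.
# encode_h264 and match_score are hypothetical stand-ins, not the thesis's tools.
from collections import defaultdict
from statistics import mean

def compression_sensitivity(scenes, crf_levels, encode_h264, match_score):
    """scenes: list of (scene_id, lightness_category, reference_image, probe_video)."""
    results = defaultdict(dict)  # lightness category -> {compression level: mean score}
    categories = {category for _, category, _, _ in scenes}
    for category in categories:
        subset = [s for s in scenes if s[1] == category]
        for crf in crf_levels:
            scores = [match_score(reference, encode_h264(video, crf))
                      for _, _, reference, video in subset]
            results[category][crf] = mean(scores)
    return dict(results)
```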

    Linking Spatial Video and GIS

    Spatial Video is any form of geographically referenced videographic data. The forms in which it is acquired, stored and used vary enormously, as does the standard of accuracy in the spatial data and the quality of the video footage. This research deals with a specific form of Spatial Video where these data have been captured from a moving road-network survey vehicle. The spatial data are GPS sentences, while the video orientation is approximately orthogonal and coincident with the direction of travel. GIS that use these data are usually bespoke standalone systems or third-party extensions to existing platforms. They specialise in using the video as a visual enhancement, with limited spatial functionality and interoperability. While enormous amounts of these data exist, they do not have a generalised, cross-platform spatial data structure that is suitable for use within a GIS. The objectives of this research have been to define, develop and implement a novel Spatial Video data structure and demonstrate how this can achieve a spatial approach to the study of video. This data structure is called a Viewpoint and represents the capture location and geographical extent of each video frame. It is generalised to represent any form or format of Spatial Video. It is shown how a Viewpoint improves on existing data structure methodologies and how it can be theoretically defined in 3D space. A 2D implementation is then developed where Viewpoints are constructed from the spatial and camera parameters of each survey in the study area. A number of problems are defined and solutions provided towards the implementation of a post-processing system to calculate, index and store each video frame Viewpoint in a centralised spatial database. From this spatial database a number of geospatial analysis approaches are demonstrated that represent novel ways of using and studying Spatial Video based on the Viewpoint data structure. Also, a unique application is developed where the Viewpoints are used as a spatial control to dynamically access and play video in a location-aware system. While video has to date been largely ignored as a GIS spatial data source, it is shown through this novel Viewpoint implementation and the geospatial analysis demonstrations that this need not be the case any more.
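
    A Viewpoint pairs each frame's capture location with the geographical extent it views. The sketch below is a minimal, assumed 2D form of such a structure (GPS capture point, heading, and a simple triangular view footprint); the class and field names are illustrative rather than the thesis's schema.

```python
from dataclasses import dataclass
from math import cos, radians, sin

@dataclass
class Viewpoint:
    """Illustrative 2D Viewpoint: one record per video frame."""
    frame_index: int
    timestamp: float
    lon: float             # capture location from the GPS sentence
    lat: float
    heading_deg: float     # direction of travel / camera orientation
    fov_deg: float = 60.0  # assumed horizontal field of view
    range_m: float = 50.0  # assumed usable viewing distance

    def footprint(self):
        """Approximate the viewed extent as a triangle: the capture point plus
        the two far corners of the view cone, in local metres (east, north)."""
        half = radians(self.fov_deg) / 2.0
        heading = radians(self.heading_deg)
        corners = [(self.range_m * sin(a), self.range_m * cos(a))
                   for a in (heading - half, heading + half)]
        return [(0.0, 0.0)] + corners
```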

    Image-based 3-D reconstruction of constrained environments

    Nuclear power plays an important role in the United Kingdom's electricity generation infrastructure, providing a reliable baseload of low-carbon electricity. The Advanced Gas-cooled Reactor (AGR) design makes up approximately 50% of the existing fleet; however, many of the operating reactors have exceeded their original design lifetimes. To ensure safe reactor operation, engineers perform periodic in-core visual inspections of reactor components to monitor the structural health of the core as it ages. However, the inspection mechanisms currently deployed provide limited structural information about the fuel channel or defects. This thesis investigates the suitability of image-based 3-D reconstruction techniques to acquire 3-D structural geometry and so enable improved diagnostic and prognostic abilities for inspection engineers. Applying image-based 3-D reconstruction to in-core inspection footage highlights significant challenges, most predominantly that the image saliency proves insufficient for general reconstruction frameworks. The contribution of the thesis is threefold. Firstly, a novel semi-dense matching scheme which exploits sparse and dense image correspondence in combination with a novel intra-image region strength approach to improve the stability of the correspondence between images; this yields a 138.53% increase in correct feature matches over similar state-of-the-art image matching paradigms. Secondly, a bespoke incremental Structure-from-Motion (SfM) framework, the Constrained Homogeneous SfM (CH-SfM), which is able to derive structure from deficient feature spaces and constrained environments. Thirdly, the application of the CH-SfM framework to remote visual inspection footage gathered within AGR fuel channels, outperforming other state-of-the-art reconstruction approaches and extracting representative 3-D structural geometry from orientational scans and fully circumferential reconstructions. This is demonstrated on in-core and laboratory footage, achieving an approximate 3-D point density of 2.785 - 23.8025NX/cm² for real in-core inspection footage and high-quality laboratory footage respectively. The demonstrated novelties have applicability to other constrained or feature-poor environments, with future work looking to produce fully dense, photo-realistic 3-D reconstructions.
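
    As a rough illustration of the general semi-dense idea of combining sparse and dense correspondence (not the CH-SfM matching scheme itself), the sketch below seeds sparse ORB matches and densifies each seed's neighbourhood with pyramidal Lucas-Kanade optical flow; the feature counts and patch size are assumptions.

```python
import cv2
import numpy as np

def semi_dense_matches(img_a, img_b, patch=15):
    """Illustrative only: seed sparse ORB matches between two greyscale frames,
    then densify each seed's neighbourhood with pyramidal Lucas-Kanade flow."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    sparse = matcher.match(des_a, des_b)

    # Densify: sample extra points around each sparse seed in image A and
    # track them into image B with optical flow.
    seeds = np.float32([kp_a[m.queryIdx].pt for m in sparse])
    offsets = np.float32([(dx, dy) for dx in (-patch, 0, patch)
                          for dy in (-patch, 0, patch)])
    pts_a = (seeds[:, None, :] + offsets[None, :, :]).reshape(-1, 1, 2)
    pts_b, status, _ = cv2.calcOpticalFlowPyrLK(img_a, img_b, pts_a, None)

    ok = status.ravel() == 1
    return pts_a.reshape(-1, 2)[ok], pts_b.reshape(-1, 2)[ok]
```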

    A video summarisation system for post-production

    Post-production facilities deal with large amounts of digital video, which presents difficulties when tracking, managing and searching this material. Recent research work in image and video analysis promises to offer help in these tasks, but there is a gap between what these systems can provide and what users actually need. In particular the popular research models for indexing and retrieving visual data do not fit well with how users actually work. In this thesis we explore how image and video analysis can be applied to an online video collection to assist users in reviewing and searching for material faster, rather than purporting to do it for them. We introduce a framework for automatically generating static 2-dimensional storyboards from video sequences. The storyboard consists of a series of frames, one for each shot in the sequence, showing the principal objects and motions of the shot. The storyboards are rendered as vector images in a familiar comic book style, allowing them to be quickly viewed and understood. The process consists of three distinct steps: shot-change detection, object segmentation, and presentation. The nature of the video material encountered in a post-production facility is quite different from other material such as television programmes. Video sequences such as commercials and music videos are highly dynamic with very short shots, rapid transitions and ambiguous edits. Video is often heavily manipulated, causing difficulties for many video processing techniques. We study the performance of a variety of published shot-change detection algorithms on the type of highly dynamic video typically encountered in post-production work. Finding their performance disappointing, we develop a novel algorithm for detecting cuts and fades that operates directly on Motion-JPEG compressed video, exploiting the DCT coefficients to save computation. The algorithm shows superior performance on highly dynamic material while performing comparably to previous algorithms on other material.
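
    As an illustration of the general idea of cut detection on compressed-domain data (not the algorithm developed in this thesis), the DC coefficient of each 8x8 DCT block forms a coarse thumbnail of the frame that can be compared between consecutive frames without full decompression; the difference measure and threshold below are assumptions.

```python
import numpy as np

def detect_cuts(dc_frames, threshold=30.0):
    """Illustrative cut detector. dc_frames is a sequence of 2D arrays holding
    the DC (zero-frequency) DCT coefficient of every 8x8 block of each frame,
    i.e. a coarse thumbnail recovered without full decompression. Returns the
    frame indices where a cut is suspected; the threshold is an assumption."""
    cuts = []
    for i in range(1, len(dc_frames)):
        prev = dc_frames[i - 1].astype(np.float64)
        curr = dc_frames[i].astype(np.float64)
        if np.mean(np.abs(curr - prev)) > threshold:  # mean absolute DC change
            cuts.append(i)
    return cuts
```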

    Study on quality in 3D digitisation of tangible cultural heritage: mapping parameters, formats, standards, benchmarks, methodologies and guidelines: final study report.

    This study was commissioned by the European Commission to help advance 3D digitisation across Europe and thereby to support the objectives of the Recommendation on a common European data space for cultural heritage (C(2021) 7953 final), adopted on 10 November 2021. The Recommendation encourages Member States to set up digital strategies for cultural heritage that set clear digitisation and digital preservation goals, aiming at higher quality through the use of advanced technologies, notably 3D. The aim of the study is to map the parameters, formats, standards, benchmarks, methodologies and guidelines relating to 3D digitisation of tangible cultural heritage. The overall objective is to further the quality of 3D digitisation projects by enabling cultural heritage professionals, institutions, content developers, stakeholders and academics to define and produce high-quality digitisation standards for tangible cultural heritage. This unique study identifies key parameters of the digitisation process, estimates their relative complexity, how complexity is linked to technology, and its impact on quality and its various factors. It also identifies standards and formats used for 3D digitisation, including data types, data formats and metadata schemas for 3D structures. Finally, the study forecasts the potential impacts of future technological advances on 3D digitisation.

    Biometric fusion methods for adaptive face recognition in computer vision

    Face recognition is a biometric method that uses different techniques to identify individuals based on facial information obtained from digital image data. Face recognition systems are widely used for security purposes but present challenging problems; solutions to some of the most important challenges are proposed in this study. The aim of this thesis is to investigate the problem of face recognition across pose based on the image parameters of camera calibration. Three novel methods have been derived to address the challenges of face recognition and to infer the camera parameters from images using a geometric approach based on perspective projection. The following techniques were used: camera calibration (CMT) and Face Quadtree Decomposition (FQD), in order to develop the Face Camera Measurement Technique (FCMT) for human facial recognition. An algorithm for facial feature extraction and identity matching has been created. The success and efficacy of the proposed algorithm are analysed in terms of robustness to noise, the accuracy of distance measurement, and face recognition. To recover the intrinsic and extrinsic camera calibration parameters, a novel technique has been developed based on perspective projection, which uses different geometrical shapes to calibrate the camera. The parameters obtained by the novel measurement technique CMT enable the system to infer real distances for regular and irregular objects from 2-D images. The proposed CMT system feeds into FQD to measure the distances between facial points. Quadtree decomposition enhances the representation of edges and other singularities along curves of the face, and thus improves directional features for face detection across pose. The proposed FCMT system is a new combination of CMT and FQD for recognising faces in various poses. The theoretical foundation of the proposed solutions has been thoroughly developed and discussed in detail. The results show that the proposed algorithms outperform existing algorithms in face recognition, with a 2.5% improvement in the error recognition rate compared with recent studies.
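
    As a simple illustration of quadtree decomposition in general (not the thesis's FQD implementation), the sketch below recursively splits an image block into four quadrants while its pixel variance stays high, so subdivisions concentrate along edges and facial detail; the variance criterion and thresholds are assumptions.

```python
import numpy as np

def quadtree(img, x=0, y=0, w=None, h=None, var_thresh=200.0, min_size=8):
    """Illustrative quadtree decomposition of a greyscale image: returns leaf
    blocks as (x, y, w, h), splitting while a block's pixel variance is high."""
    if w is None or h is None:
        h, w = img.shape[:2]
    block = img[y:y + h, x:x + w]
    if w <= min_size or h <= min_size or np.var(block) <= var_thresh:
        return [(x, y, w, h)]
    hw, hh = w // 2, h // 2
    children = [(x, y, hw, hh), (x + hw, y, w - hw, hh),
                (x, y + hh, hw, h - hh), (x + hw, y + hh, w - hw, h - hh)]
    leaves = []
    for cx, cy, cw, ch in children:
        leaves += quadtree(img, cx, cy, cw, ch, var_thresh, min_size)
    return leaves
```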

    Patterns and Pattern Languages for Mobile Augmented Reality

    Mixed Reality is a relatively new field in computer science which uses technology as a medium to provide modified or enhanced views of reality or to virtually generate a new reality. Augmented Reality is a branch of Mixed Reality which blends the real world as viewed through a computer interface with virtual objects generated by a computer. The 21st-century commodification of mobile devices with multi-core Central Processing Units, Graphics Processing Units, high-definition displays and multiple sensors, controlled by capable Operating Systems such as Android and iOS, means that Mobile Augmented Reality applications have become increasingly feasible. Mobile Augmented Reality is a multi-disciplinary field requiring a synthesis of many technologies such as computer graphics, computer vision, machine learning and mobile device programming, while also requiring theoretical knowledge of diverse fields such as Linear Algebra, Projective and Differential Geometry, Probability and Optimisation. This multi-disciplinary nature has led to a fragmentation of knowledge into various specialisations, making it difficult to integrate different solution components into a coherent architecture. Software design patterns provide a solution space of tried and tested best practices for a specified problem within a given context. The solution space is non-prescriptive and is described in terms of relationships between roles that can be assigned to software components. Architectural patterns are used to specify high-level designs of complete systems, as opposed to domain or tactical level patterns that address specific lower-level problem areas. Pattern Languages comprise multiple software patterns combining in multiple possible sequences to form a language, with the individual patterns forming the language vocabulary while the valid sequences through the patterns define the grammar. Pattern Languages provide flexible generalised solutions within a particular domain that can be customised to solve problems of differing characteristics and levels of complexity within the domain. The specification of one or more Pattern Languages tailored to the Mobile Augmented Reality domain can therefore provide a generalised guide for the design and architecture of Mobile Augmented Reality applications, from an architectural level down to the "nuts-and-bolts" implementation level. While there is a large body of research into the technical specialisations pertaining to Mobile Augmented Reality, there is a dearth of up-to-date literature covering Mobile Augmented Reality design. This thesis fills this vacuum by: 1. Providing architectural patterns that provide the spine on which the design of Mobile Augmented Reality artefacts can be based; 2. Documenting existing patterns within the context of Mobile Augmented Reality; 3. Identifying new patterns specific to Mobile Augmented Reality; and 4. Combining the patterns into Pattern Languages for Detection & Tracking, Rendering & Interaction and Data Access for Mobile Augmented Reality. The resulting Pattern Languages support design at multiple levels of complexity, from an object-oriented framework down to specific one-off Augmented Reality applications. The practical contribution of this thesis is the specification of architectural patterns and Pattern Languages that provide a unified design approach for both the overall architecture and the detailed design of Mobile Augmented Reality artefacts. The theoretical contribution is a design theory for Mobile Augmented Reality gleaned from the extraction of patterns and the creation of a pattern language or languages.
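
    As a purely hypothetical illustration of how such pattern roles might surface in code (these interfaces are not patterns defined in the thesis), the Detection & Tracking, Rendering & Interaction and Data Access concerns can be expressed as roles that concrete components fill, with an architectural spine wiring them together.

```python
from abc import ABC, abstractmethod

class Tracker(ABC):
    """Hypothetical Detection & Tracking role: estimate camera pose from a frame."""
    @abstractmethod
    def track(self, frame): ...

class ContentRepository(ABC):
    """Hypothetical Data Access role: supply virtual content for a pose/location."""
    @abstractmethod
    def fetch(self, pose): ...

class Renderer(ABC):
    """Hypothetical Rendering & Interaction role: draw content registered to a pose."""
    @abstractmethod
    def render(self, pose, content): ...

class ARPipeline:
    """Architectural spine wiring the roles: sense -> track -> fetch -> render."""
    def __init__(self, tracker: Tracker, repo: ContentRepository, renderer: Renderer):
        self.tracker, self.repo, self.renderer = tracker, repo, renderer

    def on_frame(self, frame):
        pose = self.tracker.track(frame)  # may return None if tracking is lost
        if pose is not None:
            self.renderer.render(pose, self.repo.fetch(pose))
```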

    Virtual Reality Games for Motor Rehabilitation

    This paper presents a fuzzy logic based method to track user satisfaction without the need for devices to monitor users' physiological conditions. User satisfaction is the key to any product's acceptance; computer applications and video games offer a unique opportunity to provide a tailored environment for each user to better suit their needs. We have implemented a non-adaptive fuzzy logic model of emotion, based on the emotional component of the Fuzzy Logic Adaptive Model of Emotion (FLAME) proposed by El-Nasr, to estimate player emotion in Unreal Tournament 2004. In this paper we describe the implementation of this system and present the results of one of several play tests. Our research contradicts the current literature, which suggests that physiological measurements are needed. We show that it is possible to use a software-only method to estimate user emotion.
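
    As a rough, software-only illustration of the approach (not the FLAME-based model implemented in the paper), the sketch below fuzzifies two pieces of game telemetry, combines a handful of rules, and defuzzifies to a single satisfaction estimate; the membership functions, rules and inputs are assumptions.

```python
def tri(x, a, b, c):
    """Triangular membership function on [a, c] peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def estimate_satisfaction(deaths_per_min, progress_rate):
    """Illustrative fuzzy estimate of satisfaction in [0, 1] from game telemetry.
    Not the FLAME implementation; rules and membership shapes are assumptions."""
    dying_often  = tri(deaths_per_min, 0.5, 2.0, 4.0)
    dying_rarely = tri(deaths_per_min, -1.0, 0.0, 1.0)
    progressing  = tri(progress_rate, 0.2, 0.6, 1.01)
    stuck        = tri(progress_rate, -0.01, 0.0, 0.3)

    # Rule strength (min as AND) paired with an output satisfaction level.
    rules = [
        (min(dying_rarely, progressing), 0.9),  # comfortable and advancing
        (min(dying_often, progressing),  0.6),  # challenged but engaged
        (min(dying_rarely, stuck),       0.4),  # bored
        (min(dying_often, stuck),        0.1),  # frustrated
    ]
    total = sum(weight for weight, _ in rules)
    return sum(weight * level for weight, level in rules) / total if total else 0.5

print(estimate_satisfaction(deaths_per_min=1.0, progress_rate=0.7))
```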

    Depth Image-Based Rendering for Full Parallax Displays: Rendering, Compression, and Interpolation of Content for Autostereoscopic Poster and Video Displays

    Advancements in production and display techniques have allowed novel displays to emerge that project a high-resolution light field for both static poster content and video content. These displays provide full parallax, so an audience can perceive a stereoscopic view of a scene without special glasses, and the view adjusts to the observer's position. Such displays are intended for public places, where the audience does not wear special glasses and is not restricted in movement. The rendering, storage and transfer of the large amount of data required by these displays is a challenge: the image data for a static poster display is about 200 GB, and the data rate for video displays is expected to be two to four orders of magnitude higher than HDTV. In this work these challenges are met by utilising DIBR to reduce the amount of data at the very beginning, during rendering. A fraction of the full set of colour and depth images is rendered and used to interpolate the full data set. Rendering with state-of-the-art ray tracers is described, and a novel method to render image data for full parallax displays using OpenGL is contributed that addresses some shortcomings of previous approaches. For static poster displays, a scene-based representation for image interpolation is introduced, which efficiently exploits the multi-core processors and graphics hardware found on modern workstations for parallelisation. The introduced approach implements lossy compression of the input data and handles arbitrary scenes using a novel BNV selection algorithm. For video displays the real-time constraint does not allow for costly interpolation or scene analysis. Hence, a novel approach is presented that uses a basic and computationally inexpensive interpolation and combines the interpolation results of different image representations without introducing prominent artefacts.
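
    The core DIBR step is warping a rendered colour-plus-depth image into a neighbouring viewpoint instead of rendering every view. The sketch below shows a simplified forward warp along a horizontal baseline with a z-buffer to resolve occlusions; the disparity model and the absence of hole filling are simplifications and assumptions, not the interpolation developed in this work.

```python
import numpy as np

def warp_to_neighbour_view(colour, depth, baseline, focal):
    """Illustrative DIBR forward warp along a horizontal baseline.
    colour: (H, W, 3) array; depth: (H, W) array in metres. Pixels shift by
    disparity = focal * baseline / depth; unfilled target pixels stay zero
    (holes) and would need inpainting or a second reference view."""
    h, w = depth.shape
    out = np.zeros_like(colour)
    out_depth = np.full((h, w), np.inf)
    disparity = np.round(focal * baseline / np.maximum(depth, 1e-6)).astype(int)
    ys, xs = np.mgrid[0:h, 0:w]
    xt = xs + disparity
    valid = (xt >= 0) & (xt < w)
    # Z-buffer: when several source pixels land on one target, keep the nearest.
    for y, x, x_new in zip(ys[valid], xs[valid], xt[valid]):
        if depth[y, x] < out_depth[y, x_new]:
            out_depth[y, x_new] = depth[y, x]
            out[y, x_new] = colour[y, x]
    return out
```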

    Groupwise non-rigid registration for automatic construction of appearance models of the human craniofacial complex for analysis, synthesis and simulation

    Finally, a novel application of 3D appearance modelling is proposed: a faster-than-real-time algorithm for statistically constrained quasi-mechanical simulation. Experiments demonstrate superior realism, achieved in the proposed method by employing statistical appearance models to drive the simulation, in comparison with comparable state-of-the-art quasi-mechanical approaches.