563 research outputs found

    Focus of attention and region segregation by low-level geometry

    Research has shown that regions with conspicuous colours are very effective in attracting attention, and that regions with different textures also play an important role. We present a biologically plausible model to obtain a saliency map for Focus-of-Attention (FoA), based on colour and texture boundaries. By applying grouping cells which are devoted to low-level geometry, boundary information can be completed such that segregated regions are obtained. Furthermore, we show that low-level geometry, in addition to rendering filled regions, provides important local cues like corners, bars and blobs for region categorisation. The integration of FoA, region segregation and categorisation is important for developing fast gist vision, i.e., determining which types of objects are roughly where in a scene.

    Real-time object detection using monocular vision for low-cost automotive sensing systems

    This work addresses the problem of real-time object detection in automotive environments using monocular vision. The focus is on real-time feature detection, tracking, depth estimation using monocular vision and, finally, object detection by fusing visual saliency and depth information. Firstly, a novel feature detection approach is proposed for extracting stable and dense features even in images with very low signal-to-noise ratio. This methodology is based on image gradients, which are redefined to take account of noise as part of their mathematical model. Each gradient is based on a vector connecting a negative to a positive intensity centroid, where both centroids are symmetric about the centre of the area for which the gradient is calculated. Multiple gradient vectors define a feature, with its strength being proportional to the underlying gradient vector magnitude. The evaluation of the Dense Gradient Features (DeGraF) shows superior performance over other contemporary detectors in terms of keypoint density, tracking accuracy, illumination invariance, rotation invariance, noise resistance and detection time. The DeGraF features form the basis for two new approaches that perform dense 3D reconstruction from a single vehicle-mounted camera. The first approach tracks DeGraF features in real-time while performing image stabilisation with minimal computational cost. This means that, despite camera vibration, the algorithm can accurately predict the real-world coordinates of each image pixel in real-time by comparing each motion vector to the ego-motion vector of the vehicle. The performance of this approach has been compared to different 3D reconstruction methods in order to determine their accuracy, depth-map density, noise resistance and computational complexity. The second approach proposes the use of local frequency analysis of gradient features for estimating relative depth. This novel method is based on the fact that DeGraF gradients can accurately measure local image variance with subpixel accuracy. It is shown that the local frequency by which the centroid oscillates around the gradient window centre is proportional to the depth of each gradient centroid in the real world. The lower computational complexity of this methodology comes at the expense of depth map accuracy as the camera velocity increases, but it is at least five times faster than the other evaluated approaches. This work also proposes a novel technique for deriving visual saliency maps by using Division of Gaussians (DIVoG). In this context, saliency maps express how different each image pixel is from its surrounding pixels across multiple pyramid levels. This approach is shown to be both fast and accurate when evaluated against other state-of-the-art approaches. Subsequently, the saliency information is combined with depth information to identify salient regions close to the host vehicle. The fused map allows faster detection of high-risk areas where obstacles are likely to exist. As a result, existing object detection algorithms, such as the Histogram of Oriented Gradients (HOG), can execute at least five times faster. In conclusion, through a step-wise approach, computationally expensive algorithms have been optimised or replaced by novel methodologies to produce a fast object detection system that is aligned to the requirements of the automotive domain.
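
    The centroid-based gradient described in this abstract can be illustrated with a minimal sketch. It is only one plausible reading of the abstract, not the thesis's actual formulation: the function name `centroid_gradient` is hypothetical, the positive centroid is assumed to be the intensity-weighted centroid of the window, and the negative centroid is assumed to be its mirror image about the window centre. The noise modelling mentioned in the abstract is not reproduced here.

```python
def centroid_gradient(window):
    """Return (gx, gy) for a 2-D list of non-negative pixel intensities.

    Assumption: the positive centroid is the intensity-weighted centroid,
    and the negative centroid is its reflection about the window centre.
    """
    height = len(window)
    width = len(window[0])
    total = sum(sum(row) for row in window)
    if total == 0:
        return (0.0, 0.0)  # empty window: no gradient defined, return zero

    # Geometric centre of the window in pixel coordinates.
    cx = (width - 1) / 2.0
    cy = (height - 1) / 2.0

    # Positive centroid: pixel coordinates weighted by intensity.
    px = sum(x * v for row in window for x, v in enumerate(row)) / total
    py = sum(y * sum(row) for y, row in enumerate(window)) / total

    # Negative centroid: mirrored about the centre (assumed symmetry).
    nx = 2.0 * cx - px
    ny = 2.0 * cy - py

    # Gradient vector points from the negative to the positive centroid.
    return (px - nx, py - ny)


# A window that is bright only in its right-hand column.
print(centroid_gradient([[0, 0, 10], [0, 0, 10], [0, 0, 10]]))
```

    Under these assumptions a uniform window yields a zero vector, and the vector's magnitude could serve as the feature strength that the abstract mentions.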

    Developing serious games for cultural heritage: a state-of-the-art review

    Although the widespread use of gaming for leisure purposes has been well documented, the use of games to support cultural heritage purposes, such as historical teaching and learning, or for enhancing museum visits, has been less well considered. The state of the art in serious game technology is identical to that in entertainment games technology. As a result, the field of serious heritage games concerns itself with recent advances in computer games, real-time computer graphics, virtual and augmented reality and artificial intelligence. On the other hand, the main strengths of serious gaming applications may be generalised as being in the areas of communication, visual expression of information, collaboration mechanisms, interactivity and entertainment. In this report, we focus on the state of the art with respect to the theories, methods and technologies used in serious heritage games. We provide an overview of existing literature of relevance to the domain, discuss the strengths and weaknesses of the described methods and point out unsolved problems and challenges. In addition, several case studies illustrating the application of methods and technologies used in cultural heritage are presented.

    3D visualization of cadastre : assessing the suitability of visual variables and enhancement techniques in the 3D model of condominium property units

    3D visualization of cadastral data has been used in many studies because it offers new ways of examining situations in which properties are stacked vertically. Researchers active in this field believe that 3D visualization could give users a more intuitive understanding of situations in which properties overlap, as well as a greater and less ambiguous capacity to show potential problems of overlapping property units. However, 3D visualization brings many challenges compared with 2D visualization. Previous research in 3D cadastre that uses 3D visualization has paid very little attention to the impact of the choice of visual variables (e.g. colour, style) on decision-making. With a view to improving the 3D visualization of cadastral data, this doctoral thesis examines the suitability of the choice of visual variables and associated enhancement techniques for producing an optimal 3D condominium model, according to certain specific visualization tasks. The targeted tasks are those dedicated to understanding condominium property boundaries in 3D space; accordingly, mainly notarial tasks were targeted. In addition, this thesis highlights the differences in the impact of visual variables between 2D and 3D visualization. The thesis first identifies a theoretical framework for interpreting visual variables in the context of 3D visualization and cadastral data, based on a literature review. Second, experiments were carried out to test the performance of the visual variables (e.g. colour, value, texture) and enhancement techniques (transparency, annotation, displacement). Three distinct approaches were used: 1) direct discussion with people working in geomatics, 2) face-to-face interviews with notaries, and 3) an online questionnaire with targeted groups. Usability, measured in terms of effectiveness, efficiency and degree of satisfaction, was used to compare the experiments. The main results of this research are: 1) a list of notarial visual tasks useful for delimiting property units in the context of 3D visualization of condominiums; 2) recommendations on the suitability of eight visual variables and three enhancement techniques for optimising the performance of a number of notarial tasks; 3) a comparative analysis of the performance of these variables between 2D and 3D visualization.
3D visualization is widely used in GIS (geographic information system) and CAD (computer-aided design) applications. It has also been introduced in cadastre studies to better communicate overlaps to the viewer, where the property units vertically stretch over or cover one part of the land parcel. Researchers believe that 3D visualization could provide viewers with a more intuitive perception, and that it has the capability to demonstrate overlapping property units in condominiums unambiguously. However, 3D visualization has many challenges compared with 2D visualization. Many cadastre researchers adopted 3D visualization without thoroughly investigating the potential users, the visual tasks for decision-making, and the appropriateness of their representation design. Neither designers nor users may be aware of the risk of producing an inadequate 3D visualization, especially in an era when 3D visualization is relatively novel in the cadastre domain. 
With the general aim of improving the 3D visualization of cadastre data, this dissertation addresses the design of the 3D cadastre model from a graphic semiotics viewpoint, including visual variables and enhancement techniques. The research questions are, firstly, how suitable the visual variables and enhancement techniques in the 3D cadastre model are for supporting the intended users' decision-making goal of delimiting condominium property units, and secondly, what the perceptual properties of visual variables are in 3D visualization compared with 2D visualization. This dissertation first identifies the theoretical framework for the interpretation of visual variables in 3D visualization, as well as cadastre-related knowledge, through a literature review. Then, we carry out a preliminary evaluation of the feasibility of visual variables and enhancement techniques in the form of an expert-group review. Using the result of the preliminary evaluation, this research then applies the hypothetico-deductive scientific approach to establish a list of hypotheses to be validated by empirical tests regarding the suitability of visual variables and enhancement techniques in a cartographic representation of property units in condominiums for 3D visualization. The evaluation is based on the usability specification, which contains three measurements: effectiveness, efficiency, and preference. Several empirical tests are conducted with cadastral users in the form of face-to-face interviews and online questionnaires, followed by statistical analysis. Size, shape, brightness, saturation, hue, orientation, texture, and transparency are the most discussed and used visual variables in existing cartographic research and implementations; thus, these eight visual variables have been included in the tests. Their perceptual properties, as exhibited in the empirical tests with concrete 3D models in this work, are compared with those in 2D visualization, which are derived from a literature-based synthesis. Three enhancement techniques (labeling, 3D explosion, and highlighting) are tested as well. There are three main outcomes of this work. First, we established a list of visual tasks adapted to notaries for delimiting property units in the context of 3D visualization of condominium cadastres. Second, we describe the suitability of the eight visual variables of the property units and the three enhancement techniques in the context of 3D visualization of condominium property units, based on the usability specification for delimiting visual tasks. For example, brightness only shows good performance in helping users distinguish private and common parts in the context of 3D visualization of property units in condominiums. In addition, color hue and saturation are effective and preferred. The third outcome is a statement of the differences in the perceptual properties of visual variables between 3D visualization and 2D visualization. For example, according to Bertin's (1983) definition, orientation is associative and selective in 2D, yet it does not perform in a 3D visualization. In addition, 3D visualization affects the performance of brightness, making it marginally dissociative and selective.

    Image-based 3-D reconstruction of constrained environments

    Nuclear power plays an important role in the United Kingdom electricity generation infrastructure, providing a reliable baseload of low carbon electricity. The Advanced Gas-cooled Reactor (AGR) design makes up approximately 50% of the existing fleet; however, many of the operating reactors have exceeded their original design lifetimes. To ensure safe reactor operation, engineers perform periodic in-core visual inspections of reactor components to monitor the structural health of the core as it ages. However, the inspection mechanisms currently deployed provide limited structural information about the fuel channel or defects. This thesis investigates the suitability of image-based 3-D reconstruction techniques to acquire 3-D structural geometry to enable improved diagnostic and prognostic abilities for inspection engineers. The application of image-based 3-D reconstruction to in-core inspection footage highlights significant challenges, most predominantly that the image saliency proves insufficient for general reconstruction frameworks. The contribution of the thesis is threefold. Firstly, a novel semi-dense matching scheme is presented which exploits sparse and dense image correspondence in combination with a novel intra-image region strength approach to improve the stability of the correspondence between images. This results in a 138.53% increase in correct feature matches over similar state-of-the-art image matching paradigms. Secondly, a bespoke incremental Structure-from-Motion (SfM) framework called the Constrained Homogeneous SfM (CH-SfM) is presented, which is able to derive structure from deficient feature spaces and constrained environments. Thirdly, the CH-SfM framework is applied to remote visual inspection footage gathered within AGR fuel channels, outperforming other state-of-the-art reconstruction approaches and extracting representative 3-D structural geometry of orientational scans and fully circumferential reconstructions. This is demonstrated on in-core and laboratory footage, achieving an approximate 3-D point density of 2.785 - 23.8025NX/cm² for real in-core inspection footage and high quality laboratory footage respectively. The demonstrated novelties have applicability to other constrained or feature-poor environments, with future work looking to produce fully dense, photo-realistic 3-D reconstructions.

    Correcting inter-sectional accuracy differences in drowsiness detection systems using generative adversarial networks (GANs)

    Doctoral Degrees. University of KwaZulu-Natal, Durban. Road accidents contribute to many injuries and deaths among the human population. There is substantial evidence that drowsiness is one of the most prominent causes of road accidents all over the world. This results in fatalities and severe injuries for drivers, passengers, and pedestrians. These alarming facts are raising interest in equipping vehicles with robust driver drowsiness detection systems to minimise accident rates. One of the primary concerns of motor industries is the safety of passengers, and as a consequence they have invested significantly in research and development to equip vehicles with systems that can help minimise road accidents. A number of research endeavours have attempted to use artificial intelligence, and particularly Deep Neural Networks (DNNs), to build intelligent systems that can detect drowsiness automatically. However, datasets are crucial when training a DNN. When datasets are unrepresentative, trained models are prone to bias because they are unable to generalise. This is particularly problematic for models trained in specific cultural contexts, which may not represent a wide range of races, and thus fail to generalise. This is a specific challenge for the driver drowsiness detection task, where most publicly available datasets are unrepresentative as they cover only certain ethnicity groups. This thesis investigates the problem of an unrepresentative dataset in the training phase of Convolutional Neural Network (CNN) models. Firstly, CNNs are compared with several machine learning techniques to establish their superior suitability for the driver drowsiness detection task. An investigation into the implementation of CNNs was performed and highlighted that publicly available datasets such as NTHU, DROZY and CEW do not represent a wide spectrum of ethnicity groups and lead to biased systems. A population bias visualisation technique was proposed to help identify, on a picture grid, the regions or individuals where a model is failing to generalise. Furthermore, the use of Generative Adversarial Networks (GANs) with lightweight convolutions called Depthwise Separable Convolutions (DSC) for image translation to multi-domain outputs was investigated in an attempt to generate synthetic datasets. This thesis further showed that GANs can be used to generate more realistic images with varied facial attributes for predicting drowsiness across multiple ethnicity groups. Lastly, a novel framework was developed to detect bias and correct it using synthetically generated images produced by GANs. Training models using this framework results in a substantial performance boost.
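
    To illustrate why depthwise separable convolutions count as lightweight, the following minimal sketch compares weight counts for a standard convolution and a depthwise separable one. The function names and the layer sizes are illustrative assumptions, not values from the thesis, and bias terms are ignored.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a standard k x k convolution (biases ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise k x k convolution followed by a 1 x 1
    pointwise convolution (biases ignored)."""
    depthwise = kernel_size * kernel_size * in_channels  # one filter per input channel
    pointwise = in_channels * out_channels               # 1 x 1 channel mixing
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
print(standard, separable)  # 73728 8768
```

    For this hypothetical layer the separable form needs roughly an eighth of the weights, which is the kind of saving that makes it attractive inside a GAN generator.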