
    Gaussian mixture model classifiers for detection and tracking in UAV video streams.

    Masters Degree. University of KwaZulu-Natal, Durban. Manual visual surveillance systems are subject to a high degree of human error and operator fatigue. The automation of such systems often employs detectors, trackers and classifiers as fundamental building blocks. Detection, tracking and classification are especially useful and challenging in Unmanned Aerial Vehicle (UAV) based surveillance systems. Previous solutions have addressed these challenges via complex classification methods. This dissertation proposes less complex Gaussian Mixture Model (GMM) based classifiers that simplify the process: data is represented as a reduced set of model parameters, and classification is performed in the low-dimensional parameter space. The specification and adoption of GMM based classifiers on the UAV visual tracking feature space forms the principal contribution of the work; the methodology can be generalised to other feature spaces. This dissertation presents two main contributions in the form of submissions to ISI accredited journals. The first paper demonstrates the objectives with a vehicle detector incorporating a two-stage GMM classifier applied to a single feature space, namely Histogram of Oriented Gradients (HOG). The second paper demonstrates the objectives with a vehicle tracker using colour histograms (in RGB and HSV), GMM classifiers and a Kalman filter. The proposed works are comparable to related works, with testing performed on benchmark datasets. In the tracking domain for such platforms, tracking alone is insufficient: adaptive detection and classification can assist in search-space reduction, the building of knowledge priors, and improved target representations. Results show that the proposed approach improves performance and robustness. Findings also indicate potential further enhancements, such as a multi-mode tracker with global and local tracking based on a combination of both papers.
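
    As a rough illustration of the classification idea (a sketch only, not the dissertation's implementation; window size, class names and mixture settings are assumptions), one GMM can be fitted per class on HOG descriptors, with candidate windows assigned to the class whose mixture yields the highest log-likelihood:

        # Sketch: per-class GMM likelihood classification of HOG features.
        import numpy as np
        from skimage.feature import hog
        from sklearn.mixture import GaussianMixture

        def hog_descriptor(patch):
            # patch: 2-D grayscale array, e.g. a 64x64 candidate window
            return hog(patch, orientations=9, pixels_per_cell=(8, 8),
                       cells_per_block=(2, 2))

        def train_gmms(patches_by_class, n_components=4):
            # One mixture per class, fitted in the HOG feature space.
            gmms = {}
            for label, patches in patches_by_class.items():
                X = np.stack([hog_descriptor(p) for p in patches])
                gmms[label] = GaussianMixture(n_components=n_components,
                                              covariance_type='diag').fit(X)
            return gmms

        def classify(patch, gmms):
            # Assign the class whose mixture gives the highest log-likelihood.
            x = hog_descriptor(patch).reshape(1, -1)
            return max(gmms, key=lambda label: gmms[label].score(x))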

    Meta-KANSEI modeling with Valence-Arousal fMRI dataset of brain

    Background: Traditional KANSEI methodology is an important tool in the field of psychology for comprehending concepts and meanings; it mainly focuses on semantic differential methods. Valence-Arousal is regarded as a reflection of the KANSEI adjectives, which is the core concept in the theory of affective dimensions for brain recognition. Previous studies have found that brain fMRI datasets can contain significant information related to Valence and Arousal. Methods: In the current work, a Valence-Arousal based meta-KANSEI modelling method is proposed to improve the traditional KANSEI representation. Functional Magnetic Resonance Imaging (fMRI) was used to acquire the Valence-Arousal response datasets of the brain in the amygdala and orbital frontal cortex respectively. In order to validate the feasibility of the proposed modelling method, the dataset was reduced in dimension using Kernel Density Estimation (KDE) based segmentation and Mean Shift (MS) clustering. Furthermore, Affective Norms for English Words (ANEW) from the International Affective Picture System (IAPS) were used for comparison and analysis. The datasets from fMRI and ANEW under four KANSEI adjectives (angry, happy, sad and pleasant) were processed by the Fuzzy C-Means (FCM) algorithm. Finally, a defined distance based on similarity computing was adopted for these two datasets. Results: The results illustrate that the proposed model is feasible and has better stability, per the normal distribution plotting of the distance. The effectiveness of the experimental methods proposed in the current work was higher than in the literature. Conclusions: Mean Shift can be used for clustering, and a central-points based meta-KANSEI model, combined with the advantages of a variety of existing intelligent processing methods, is expected to extend KANSEI Engineering (KE) research into the medical imaging field.
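
    A minimal sketch of the clustering and comparison steps, assuming 2-D Valence-Arousal samples (the nearest-centre distance below is an illustrative stand-in for the paper's similarity measure):

        # Mean Shift clustering of (valence, arousal) points and a simple
        # symmetric distance between the cluster centres of two datasets.
        import numpy as np
        from sklearn.cluster import MeanShift, estimate_bandwidth

        def cluster_centres(va_points):
            # va_points: (n, 2) array of (valence, arousal) samples
            bw = estimate_bandwidth(va_points, quantile=0.2)
            return MeanShift(bandwidth=bw).fit(va_points).cluster_centers_

        def centre_distance(va_a, va_b):
            # Mean nearest-centre Euclidean distance, averaged both ways.
            ca, cb = cluster_centres(va_a), cluster_centres(va_b)
            d = np.linalg.norm(ca[:, None, :] - cb[None, :, :], axis=-1)
            return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())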

    4-D Tomographic Inference: Application to SPECT and MR-driven PET

    Emission tomographic imaging is framed in a Bayesian and information theoretic framework. The first part of the thesis is inspired by the new possibilities offered by PET-MR systems, formulating models and algorithms for 4-D tomography and for the integration of information from multiple imaging modalities. The second part of the thesis extends the models described in the first part, focusing on the imaging hardware. Three key aspects for the design of new imaging systems are investigated: criteria and efficient algorithms for the optimisation and real-time adaptation of the parameters of the imaging hardware; learning the characteristics of the imaging hardware; and exploiting the rich information provided by depth-of-interaction (DOI) and energy resolving devices. The document concludes with a description of the NiftyRec software toolkit, developed to enable 4-D multi-modal tomographic inference.
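
    For orientation, the textbook MLEM algorithm illustrates the kind of statistical reconstruction such a framework builds on (a generic sketch, not NiftyRec's implementation):

        # MLEM for emission tomography: x <- x * A^T(y / Ax) / A^T(1).
        import numpy as np

        def mlem(A, counts, n_iters=50, eps=1e-12):
            # A: (n_bins, n_voxels) system matrix; counts: Poisson data y.
            sensitivity = A.sum(axis=0)        # per-voxel sensitivity A^T 1
            x = np.ones(A.shape[1])            # uniform initial estimate
            for _ in range(n_iters):
                projection = A @ x             # forward projection
                ratio = counts / np.maximum(projection, eps)
                x *= (A.T @ ratio) / np.maximum(sensitivity, eps)
            return x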

    Estimation of the Image Quality in Emission Tomography: Application to Optimization of SPECT System Design

    In emission tomography, the design of the imaging system has a great influence on the quality of the output image. Optimisation of the system design is a difficult problem due to the computational complexity and the challenges in its mathematical formulation. In order to compare different system designs, an efficient and effective method to calculate the image quality is needed. This thesis presents statistical and deterministic methods for calculating the uncertainty in the reconstruction. In the deterministic case, the Fisher Information Matrix (FIM) formalism can be employed to characterise such uncertainty. Unfortunately, computing, storing and inverting the FIM is not feasible with 3D imaging systems. In order to tackle the computational load of calculating the inverse of the FIM, a novel approximation that relies on sub-sampling of the FIM is proposed. The FIM is calculated over a subset of voxels arranged in a grid that covers the whole volume. This formulation reduces the computational complexity of inverting the FIM but nevertheless accounts for the global interdependence between the variables, for the acquisition geometry and for the object dependency. Using this approach, the noise properties as a function of the system geometry parameterisation were investigated for three different cases. In the first study, the design of a parallel-hole collimator for SPECT is optimised; the new method can be applied to problems such as trading off collimator resolution and sensitivity. In the second study, the reconstructed image quality was evaluated in the case of truncated projection data, showing that the sub-sampling approach is very accurate for evaluating the effects of missing data. Finally, the noise properties of a D-SPECT system were studied for varying acquisition protocols, showing that the new method is well suited to problems such as optimising adaptive data sampling schemes.
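
    As a sketch of the sub-sampling idea (illustrative only, not the thesis code): for a Poisson emission model with expected data ybar = Ax, the FIM is F = A^T diag(1/ybar) A, and restricting it to a coarse voxel grid keeps it small enough to invert:

        # Sub-sampled Fisher Information Matrix for a Poisson emission model.
        import numpy as np

        def subsampled_fim(A, x, stride=4, eps=1e-12):
            # A: (n_bins, n_voxels) system matrix; x: activity estimate.
            ybar = np.maximum(A @ x, eps)          # expected counts per bin
            idx = np.arange(A.shape[1])[::stride]  # coarse voxel grid
            A_sub = A[:, idx]                      # grid-voxel columns only
            F = A_sub.T @ (A_sub / ybar[:, None])  # F_jk = sum_i A_ij A_ik / ybar_i
            return F, idx

        # The much smaller F can then be inverted, e.g. with np.linalg.inv(F),
        # to approximate reconstruction variance at the grid voxels.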

    Development of a Tomographic Atmospheric Monitoring System based on Differential Optical Absorption Spectroscopy

    The aim of this thesis is to describe the design and development of a proof of concept for a commercially viable large-area atmospheric analysis tool, for use in trace gas concentration mapping and quantification. Atmospheric monitoring is a very well researched field, with dozens of available analytical systems and subsystems. However, current systems require a significant compromise between spatial resolution and operational complexity. We address this issue by asking how the Differential Optical Absorption Spectroscopy (DOAS) atmospheric analysis technique could be integrated in an Unmanned Aerial Vehicle (UAV) with tomographic capabilities. Using a two-part methodology, I proposed two hypotheses for proving the possibility of a miniaturised tomographic system, both related to how the spectroscopic data is acquired. The first hypothesis addresses the projection-forming aspect of the acquisition, its matrix assembly and the resolution of the resulting equations. This hypothesis was confirmed theoretically through the development of a simulation platform for the reconstruction of a trace gas concentration map. The second hypothesis deals with the way in which data is collected in spectroscopic terms. I proposed that, with currently available equipment, it should be possible to leverage a consequence of the Beer-Lambert law to produce molecular density fields for trace gases using passive DOAS. This hypothesis was partially confirmed, with definite conclusions being possible only through the use of complex autonomous systems for improved accuracy. This work is an important first step in the establishment of DOAS tomography as a commercially viable solution for atmospheric monitoring, although further studies are required for definite results. Moreover, this thesis led to the development of a DOAS software library for Python that is currently being used in a production environment. Finally, it is important to mention that two journal articles were published from this work, both in journals with Impact Factors over 3.0.
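
    A minimal sketch of the two ingredients (geometry, cross-section and regularisation weight are illustrative assumptions): column densities follow from the Beer-Lambert law, I = I0 * exp(-sigma * N), and a concentration grid can be recovered from many crossing paths by regularised least squares:

        # Beer-Lambert column densities plus tomographic least squares.
        import numpy as np

        def slant_column_density(I, I0, sigma):
            # N = ln(I0 / I) / sigma, the density integrated along the path
            return np.log(I0 / I) / sigma

        def reconstruct(L, N, reg=1e-3):
            # L: (n_paths, n_cells) path lengths through grid cells;
            # N: measured column densities; Tikhonov-regularised solution.
            n = L.shape[1]
            return np.linalg.solve(L.T @ L + reg * np.eye(n), L.T @ N)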

    Object detection, recognition and re-identification in video footage

    There have been significant security concerns in recent times; as a result, security cameras have been installed to monitor activities and to prevent crime in most public places. The resulting footage is analysed either through automated video analytics or through forensic analysis based on human observation. To this end, within the research context of this thesis, a proactive machine-vision based military recognition system has been developed to help monitor activities in a military environment. The proposed object detection, recognition and re-identification systems are presented in this thesis. A novel technique for military personnel recognition is presented first. Detected camouflaged personnel are initially segmented using a GrabCut segmentation algorithm. Since a camouflaged person's uniform generally appears similar at both the top and the bottom of the body, an image patch is extracted from the segmented foreground image and used as the region of interest. Subsequently, colour and texture features are extracted from each patch and used for classification. A second approach to personnel recognition is proposed through recognition of the badge on the cap of a military person. A feature matching metric based on Speeded-Up Robust Features (SURF) extracted from the badge on a person's cap enables recognition of that person's arm of service. A state-of-the-art technique for recognising vehicle types irrespective of their view angle is also presented. Vehicles are initially detected and segmented using a Gaussian Mixture Model (GMM) based foreground/background segmentation algorithm. A Canny Edge Detection (CED) stage, followed by morphological operations, is used as a pre-processing stage to enhance foreground vehicle detection and segmentation. Subsequently, Region, Histogram of Oriented Gradients (HOG) and Local Binary Pattern (LBP) features are extracted from the refined foreground vehicle object and used for vehicle type recognition. Two different datasets with front/rear and angled views are used and combined for testing the proposed technique. For night-time video analytics and forensics, the thesis presents a novel approach to pedestrian detection and vehicle type recognition. A novel feature acquisition technique, named CENTROG, is proposed for these tasks. Thermal images containing pedestrians and vehicles are used to analyse the performance of the proposed algorithms. The video is initially segmented using a GMM based foreground object segmentation algorithm, and a CED based pre-processing step is used to enhance segmentation accuracy prior to applying Census Transforms for initial feature extraction. HOG features are then extracted from the Census-transformed images and used for the detection and recognition, respectively, of human and vehicular objects in thermal images. Finally, a novel technique for people re-identification is proposed, based on low-level colour features and mid-level attributes. The low-level colour histogram bin values were normalised to the range [0, 1]. A publicly available dataset (VIPeR) and a self-constructed dataset were used in experiments conducted with 7 clothing attributes and low-level colour histogram features. These 7 attributes are detected by an SVM classifier using features extracted from 5 different regions of a detected human object. The low-level colour features were extracted from the same regions, which are obtained by human object segmentation and subsequent body-part sub-division. People are re-identified by computing the Euclidean distance between a probe and the gallery image sets. The experiments conducted using the SVM classifier and Euclidean distance show that the proposed techniques attained all of the aforementioned goals. The colour and texture features proposed for camouflaged military personnel recognition surpass state-of-the-art methods. Similarly, experiments show that combined features performed best when recognising vehicles in different views, subsequent to initial training based on multiple views. In the same vein, the proposed CENTROG technique performed better than the state-of-the-art CENTRIST technique for both pedestrian detection and vehicle type recognition at night-time using thermal images. Finally, we show that the proposed 7 mid-level attributes and the low-level features result in improved accuracy for people re-identification.
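
    A hedged sketch of the vehicle pre-processing chain described above, using OpenCV's MOG2 background subtractor as a stand-in for the thesis's GMM segmentation (thresholds, blob sizes and the HOG window are assumptions, not the thesis's settings):

        # GMM foreground segmentation, Canny + morphology clean-up, then
        # HOG features from each sufficiently large foreground blob.
        import cv2
        import numpy as np

        bg = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)
        hog = cv2.HOGDescriptor()          # default 64x128 detection window
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

        def vehicle_features(frame):
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            mask = cv2.morphologyEx(bg.apply(frame), cv2.MORPH_OPEN, kernel)
            edges = cv2.Canny(gray, 100, 200)      # edge cue for refinement
            mask = cv2.morphologyEx(cv2.bitwise_or(mask, edges),
                                    cv2.MORPH_CLOSE, kernel)
            contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                           cv2.CHAIN_APPROX_SIMPLE)
            feats = []
            for c in contours:
                x, y, w, h = cv2.boundingRect(c)
                if w * h < 400:                    # discard tiny blobs
                    continue
                roi = cv2.resize(gray[y:y + h, x:x + w], (64, 128))
                feats.append(hog.compute(roi).ravel())
            return feats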

    Audio-visual football video analysis, from structure detection to attention analysis

    Sport video is an important video genre. Content-based sports video analysis attracts great interest from both industry and academia. A sports video is characterised by repetitive temporal structures, relatively plain content, and strong spatio-temporal variations, such as quick camera switches and swift local motions. Specific techniques are needed for content-based sports video analysis to exploit these characteristics. For an efficient and effective sports video analysis system, there are three fundamental questions: (1) what are the key stories in sports videos; (2) what incurs a viewer's interest; and (3) how can game highlights be identified. This thesis is developed around these questions. We approach them from two different perspectives, and in turn three research contributions are presented: replay detection, attack temporal structure decomposition, and attention-based highlight identification. Replay segments convey the most important content in sports videos, so detecting them is an efficient way to collect game highlights. However, replay is an artefact of editing, which evolves with advances in video editing tools. The composition of a replay is complex, including logo transitions, slow motions, viewpoint switches and normal-speed video clips. Since logo transition clips are pervasive in game collections of FIFA World Cup 2002, FIFA World Cup 2006 and UEFA Championship 2006, we take logo transition detection as an effective substitute for replay detection. A two-pass system was developed, comprising a five-layer AdaBoost classifier and logo template matching throughout an entire video. The five-layer AdaBoost classifier uses shot duration, average game pitch ratio, average motion, sequential colour histogram and shot frequency between two neighbouring logo transitions to filter logo transition candidates. Subsequently, a logo template is constructed and employed to find all logo transition sequences. The precision and recall of this system in replay detection are 100% on a five-game evaluation collection. An attack structure is a team's competition for a score; it is thus a conceptually fundamental unit of a football video, as of other sports videos. We review the literature on content-based temporal structures, such as the play-break structure, and develop a three-step system for automatic attack structure decomposition. Four content-based shot classes, namely play, focus, replay and break, are identified from low-level visual features. A four-state hidden Markov model is trained to simulate the transition process among these shot classes. Since attack structures are the longest repetitive temporal units in a sports video, a suffix tree is proposed to find the longest repetitive substring in the label sequence of shot class transitions. The occurrences of this substring are regarded as kernels of an attack hidden Markov process, so the decomposition of attack structures becomes a boundary likelihood comparison between two Markov chains. Highlights are what attract notice, and attention is a psychological measure of "notice". A brief survey is presented of the psychological background of attention, attention estimation from visual and auditory cues, and multi-modality attention fusion. We propose two attention models for sports video analysis: the role-based attention model and the multiresolution autoregressive framework. The role-based attention model is based on the structure of perception while watching video; it removes reflection bias among modality saliency signals and combines these signals through reflectors. The multiresolution autoregressive (MAR) framework treats saliency signals as a group of smooth random processes that follow a similar trend but are corrupted by noise, and estimates a noise-free signal from these coarse, noisy observations by multiple-resolution analysis. Related algorithms are developed, such as event segmentation on a MAR tree and real-time event detection. Experiments show that these attention-based approaches can find goal events with high precision. Moreover, the results of MAR-based highlight detection on the final games of the 2002 and 2006 FIFA World Cups are highly similar to the highlights professionally labelled by the BBC and FIFA.
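
    The attack-structure step hinges on finding the longest repeated substring in the shot-class label sequence. The thesis uses a suffix tree; the short suffix-sorting sketch below (quadratic, but adequate for illustration) returns the same answer:

        # Longest repeated substring of a shot-class label string: sort all
        # suffixes and take the longest common prefix of adjacent pairs.
        def longest_repeated_substring(labels):
            suffixes = sorted(labels[i:] for i in range(len(labels)))
            best = ''
            for a, b in zip(suffixes, suffixes[1:]):
                k = 0
                while k < min(len(a), len(b)) and a[k] == b[k]:
                    k += 1
                if k > len(best):
                    best = a[:k]
            return best

        # e.g. with P=play, F=focus, R=replay, B=break shot labels:
        print(longest_repeated_substring('PFRBPFRBPPB'))  # -> 'PFRBP'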

    Semi-automated learning strategies for large-scale segmentation of histology and other big bioimaging stacks and volumes

    Labelled high-resolution datasets are becoming increasingly common and necessary in different areas of biomedical imaging. Examples include serial histology and ex-vivo MRI for atlas building, OCT for studying the human brain, and micro X-ray for tissue engineering. Labelling such datasets typically requires manual delineation of a very detailed set of regions of interest on a large number of sections or slices. This process is tedious, time-consuming, not reproducible and rather inefficient, given the high similarity of adjacent sections. In this thesis, I explore the potential of a semi-automated slice-level segmentation framework and a suggestive region-level framework, which aim to speed up the segmentation of big bioimaging datasets. The thesis includes two well-validated, published and widely used novel methods, and one algorithm that did not yield an improvement over the current state-of-the-art. The slice-wise method, SmartInterpol, consists of a probabilistic model for semi-automated segmentation of stacks of 2D images, in which the user manually labels a sparse set of sections (e.g., one every n sections) and lets the algorithm complete the segmentation of the remaining sections automatically. The proposed model integrates, in a principled manner, two families of segmentation techniques that have been very successful in brain imaging: multi-atlas segmentation and convolutional neural networks. Labelling every structure on a sparse set of slices is not necessarily optimal, so I also introduce a region-level active learning framework that asks the labeller to annotate one region of interest on one slice at a time. The framework exploits partial annotations, weak supervision, and realistic estimates of class- and section-specific annotation effort to greatly reduce the time it takes to produce accurate segmentations for large histological datasets. Although both frameworks were created for histological datasets, they have been successfully applied to other big bioimaging datasets, reducing labelling effort by up to 60-70% without compromising accuracy.
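
    One ingredient of the slice-wise idea can be sketched with off-the-shelf tools (this is not SmartInterpol, which couples multi-atlas registration with CNNs): register a manually labelled slice to an unlabelled neighbour with dense optical flow, then warp the labels with nearest-neighbour interpolation:

        # Propagate labels from an annotated slice to an adjacent one.
        import numpy as np
        from scipy import ndimage as ndi
        from skimage.registration import optical_flow_tvl1

        def propagate_labels(labelled_img, labels, target_img):
            # Grayscale float slices; the flow maps target coordinates
            # into the labelled slice's frame.
            v, u = optical_flow_tvl1(target_img, labelled_img)
            rows, cols = np.meshgrid(np.arange(labels.shape[0]),
                                     np.arange(labels.shape[1]),
                                     indexing='ij')
            # order=0 keeps warped values as valid integer labels.
            return ndi.map_coordinates(labels, [rows + v, cols + u], order=0)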