19,025 research outputs found

    Automatic detection and extraction of artificial text in video

    A significant challenge in large multimedia databases is the provision of efficient means for semantic indexing and retrieval of visual information. Artificial text in video is normally generated in order to supplement or summarise the visual content and thus is an important carrier of information that is highly relevant to the content of the video. As such, it is a potential ready-to-use source of semantic information. In this paper we present an algorithm for detection and localisation of artificial text in video using a horizontal difference magnitude measure and morphological processing. The result of character segmentation, based on a modified version of the Wolf-Jolion algorithm [1][2], is enhanced using smoothing and multiple binarisation. The output text is input to an “off-the-shelf” noncommercial OCR. Detection, localisation and recognition results for a 20-minute MPEG-1 encoded television programme are presented.
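As a rough illustration of the horizontal difference magnitude measure mentioned in this abstract, the sketch below sums absolute differences between horizontally adjacent pixels over a sliding window and thresholds the result. The function names, the window size and the threshold are assumptions chosen for illustration, not the paper's parameters, and the morphological processing step is omitted.

```python
def horizontal_difference_magnitude(image, window=3):
    """Sum |I(x+1, y) - I(x, y)| over a horizontal window around each pixel.

    `image` is a list of rows of grey levels (hypothetical input format).
    """
    half = window // 2
    result = []
    for row in image:
        # Absolute difference between each pixel and its right neighbour.
        diffs = [abs(row[x + 1] - row[x]) for x in range(len(row) - 1)]
        out = []
        for x in range(len(row)):
            lo = max(0, x - half)
            hi = min(len(diffs), x + half + 1)
            out.append(sum(diffs[lo:hi]))
        result.append(out)
    return result


def text_candidate_mask(image, window=3, threshold=100):
    """Mark pixels whose accumulated horizontal difference reaches the threshold."""
    response = horizontal_difference_magnitude(image, window)
    return [[value >= threshold for value in row] for row in response]
```

Rows with many sharp horizontal transitions, as character strokes tend to produce, accumulate a high response, while flat regions stay near zero.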

    Colour consistency in computer vision : a multiple image dynamic exposure colour classification system : a thesis presented to the Institute of Natural and Mathematical Sciences in fulfilment of the requirements for the degree of Master of Science in Computer Science at Massey University, Albany, Auckland, New Zealand

    Colour classification vision systems face difficulty when a scene contains both very bright and dark regions. An indistinguishable colour at one exposure may be distinguishable at another. The use of multiple cameras with varying levels of sensitivity is explored in this thesis, aiding the classification of colours in scenes with high illumination ranges. Titled the Multiple Image Dynamic Exposure Colour Classification (MIDECC) System, pie-slice classifiers are optimised for normalised red/green and cyan/magenta colour spaces. The MIDECC system finds a limited section of hyperspace for each classifier, resulting in a process which requires minimal manual input with the ability to filter background samples without specialised training. In experimental implementation, automatic multiple-camera exposure, data sampling, training and colour space evaluation to recognise 8 target colours across 14 different lighting scenarios are processed in approximately 30 seconds. The system provides computationally effective training and classification, outputting an overall true positive score of 92.4% with an illumination range between bright and dim regions of 880 lux. False positive classifications are minimised to 4.24%, assisted by heuristic background filtering. The limited search space classifiers and layout of the colour spaces ensure the MIDECC system is less likely to classify dissimilar colours, requiring a certain ‘confidence’ level before a match is outputted. Unfortunately, the system struggles to classify colours under extremely bright illumination due to the simplistic classification building technique. Results are compared to the common machine learning algorithms Naïve Bayes, Neural Networks, Random Tree and C4.5 Tree Classifiers. These algorithms return greater than 98.5% true positives and less than 1.53% false positives, with Random Tree and Naïve Bayes providing the best and worst comparable algorithms, respectively. Although resulting in a lower classification rate, the MIDECC system trains with minimal user input, ignores background and untrained samples when classifying and trains faster than most of the studied machine learning algorithms.
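The abstract does not define its pie-slice classifiers, so the following is only a minimal sketch under an assumption: that a pie-slice is an angular and radial range around a chosen origin in a two-dimensional chromaticity plane, here normalised red/green. All names, the origin at the achromatic point (1/3, 1/3) and the example ranges are hypothetical and are not taken from the thesis.

```python
import math


def normalised_rg(r, g, b):
    """Map an RGB triple to normalised red/green chromaticity."""
    total = r + g + b
    if total == 0:
        return (1 / 3, 1 / 3)  # treat black as achromatic (assumption)
    return (r / total, g / total)


def in_pie_slice(pixel, angle_min, angle_max, radius_min, radius_max,
                 origin=(1 / 3, 1 / 3)):
    """Return True if the pixel falls inside the angular and radial range.

    Angles are in degrees in [0, 360); slices that wrap past 360 are not handled.
    """
    nr, ng = normalised_rg(*pixel)
    dx, dy = nr - origin[0], ng - origin[1]
    radius = math.hypot(dx, dy)
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    return angle_min <= angle <= angle_max and radius_min <= radius <= radius_max
```

In this sketch a minimum radius rejects near-grey pixels, which is one simple way a classifier could decline to match background or untrained samples.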

    A preliminary approach to intelligent x-ray imaging for baggage inspection at airports

    Identifying explosives in baggage at airports relies on being able to characterize the materials that make up an X-ray image. If a suspicion is generated during the imaging process (step 1), the image data could be enhanced by adapting the scanning parameters (step 2). This paper addresses the first part of this problem and uses textural signatures to recognize and characterize materials, hence enabling system control. Directional Gabor-type filtering was applied to a series of different X-ray images. Images were processed in such a way as to simulate a line scanning geometry. Based on our experiments with images of industrial standards and our own samples, it was found that different materials could be characterized in terms of the frequency range and orientation of the filters. It was also found that the signal strength generated by the filters could be used as an indicator of visibility, and that optimum imaging conditions could be predicted.
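To make the idea of directional Gabor-type filtering concrete, here is a small sketch of the standard real-valued Gabor kernel (a Gaussian envelope multiplied by an oriented cosine carrier) and of a single filter response on an image patch. The function names and parameter values are assumptions for illustration and do not reflect the filters actually used in the paper.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Build a real Gabor kernel as a list of rows; `size` is assumed odd."""
    half = size // 2
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier varies along direction theta.
            xr = x * cos_t + y * sin_t
            yr = -x * sin_t + y * cos_t
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2 * sigma * sigma))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_response(patch, kernel):
    """Inner product of a patch with a kernel of the same size."""
    return sum(p * k
               for patch_row, kernel_row in zip(patch, kernel)
               for p, k in zip(patch_row, kernel_row))
```

Comparing the magnitude of `filter_response` across several values of `theta` and `wavelength` gives a simple textural signature: a texture responds most strongly to the kernel whose orientation and frequency match it.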

    Enhanced target detection in CCTV network system using colour constancy

    The focus of this research is to study how targets can be more faithfully detected in a multi-camera CCTV network system using spectral features for detection. The objective of the work is to develop a colour constancy (CC) methodology that helps keep the spectral features of the scene in a constant, stable state irrespective of variable illumination and camera calibration issues. Unlike previous work in the field of target detection, two versions of CC algorithms have been developed during the course of this work that are capable of maintaining colour constancy for every image pixel in the scene: 1) a method termed Enhanced Luminance Reflectance CC (ELRCC), which consists of a pixel-wise sigmoid function for adaptive dynamic range compression, and 2) the Enhanced Target Detection and Recognition Colour Constancy (ETDCC) algorithm, which employs a bidirectional pixel-wise non-linear transfer function (PWNLTF), a centre-surround luminance enhancement and a Grey Edge white balancing routine. The effectiveness of target detection for all developed CC algorithms has been validated using the multi-camera ‘Imagery Library for Intelligent Detection Systems’ (iLIDS), ‘Performance Evaluation of Tracking and Surveillance’ (PETS) and ‘Ground Truth Colour Chart’ (GTCC) datasets. It is shown that the developed CC algorithms have enhanced target detection efficiency by over 175% compared with that without CC enhancement. This research has contributed one journal paper published in Optical Engineering, together with 3 conference papers on the subject.
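The abstract gives no formula for the pixel-wise sigmoid used in ELRCC, so the sketch below shows only the general idea of sigmoid dynamic range compression: each luminance value is passed through a logistic curve centred on the image's mean luminance. The function name, the `gain` parameter and the choice of the mean as midpoint are assumptions, not the published method.

```python
import math


def sigmoid_compress(luminance, gain=8.0):
    """Compress luminance values (expected in [0, 1]) with a logistic curve.

    The curve is centred on the mean luminance of the image, so the mapping
    adapts to overall scene brightness (an illustrative choice only).
    """
    flat = [value for row in luminance for value in row]
    if not flat:
        return []
    midpoint = sum(flat) / len(flat)
    return [[1.0 / (1.0 + math.exp(-gain * (value - midpoint))) for value in row]
            for row in luminance]
```

Values near the midpoint are spread apart while very dark and very bright values are squeezed towards 0 and 1, which is the basic effect a compressive transfer function aims for.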

    Enhancement of dronogram aid to visual interpretation of target objects via intuitionistic fuzzy hesitant sets

    In this paper, we address the hesitant information in the enhancement task, often caused by differences in image contrast. Enhancement approaches generally use certain filters which generate artifacts or are unable to recover all the object details in images. Typically, the contrast of an image quantifies a unique ratio between the amounts of black and white through a single pixel. However, contrast is better represented by a group of pixels. We have proposed a novel image enhancement scheme based on intuitionistic hesitant fuzzy sets (IHFSs) for drone images (dronogram) to facilitate better interpretations of target objects. First, a given dronogram is divided into foreground and background areas based on an estimated threshold from which the proposed model measures the amount of black/white intensity levels. Next, we fuzzify both of them and determine the hesitant score indicated by the distance between the two areas for each point in the fuzzy plane. Finally, a hyperbolic operator is adopted for each membership grade to improve the photographic quality, leading to enhanced results via defuzzification. The proposed method is tested on a large drone image database. Results demonstrate better contrast enhancement, improved visual quality, and better recognition compared to the state-of-the-art methods.

    Reconstructing vectorised photographic images

    We address the problem of representing captured images in the continuous mathematical space more usually associated with certain forms of drawn ('vector') images. Such an image is resolution-independent so can be used as a master for varying resolution-specific formats. We briefly describe the main features of a vectorising codec for photographic images, whose significance is that drawing programs can access images and image components as first-class vector objects. This paper focuses on the problem of rendering from the isochromic contour form of a vectorised image and demonstrates a new fill algorithm which could also be used in drawing generally. The fill method is described in terms of level set diffusion equations for clarity. Finally we show that image warping is both simplified and enhanced in this form and that we can demonstrate real histogram equalisation with genuinely rectangular histograms

    Development of retinal blood vessel segmentation methodology using wavelet transforms for assessment of diabetic retinopathy

    Automated image processing has the potential to assist in the early detection of diabetes, by detecting changes in blood vessel diameter and patterns in the retina. This paper describes the development of segmentation methodology in the processing of retinal blood vessel images obtained using non-mydriatic colour photography. The methods used include wavelet analysis, supervised classifier probabilities and adaptive threshold procedures, as well as morphology-based techniques. We show highly accurate identification of blood vessels for the purpose of studying changes in the vessel network that can be utilized for detecting blood vessel diameter changes associated with the pathophysiology of diabetes. In conjunction with suitable feature extraction and automated classification methods, our segmentation method could form the basis of a quick and accurate test for diabetic retinopathy, which would have huge benefits in terms of improved access to screening people for risk or presence of diabetes

    Assessment of a photogrammetric approach for urban DSM extraction from tri-stereoscopic satellite imagery

    Built-up environments are extremely complex for 3D surface modelling purposes. The main distortions that hamper 3D reconstruction from 2D imagery are image dissimilarities, concealed areas, shadows, height discontinuities and discrepancies between smooth terrain and man-made features. A methodology is proposed to improve automatic photogrammetric extraction of an urban surface model from high resolution satellite imagery with the emphasis on strategies to reduce the effects of the cited distortions and to make image matching more robust. Instead of a standard stereoscopic approach, a digital surface model is derived from tri-stereoscopic satellite imagery. This is based on an extensive multi-image matching strategy that fully benefits from the geometric and radiometric information contained in the three images. The bundled triplet consists of an IKONOS along-track pair and an additional near-nadir IKONOS image. For the tri-stereoscopic study a densely built-up area, extending from the centre of Istanbul to the urban fringe, is selected. The accuracy of the model extracted from the IKONOS triplet, as well as the model extracted from only the along-track stereopair, are assessed by comparison with 3D check points and 3D building vector data