3,517 research outputs found

    Digital Image Access & Retrieval

    The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections, in three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval. Published or submitted for publication.

    Co-operative surveillance cameras for high quality face acquisition in a real-time door monitoring system

    A poster session on co-operative surveillance cameras for high-quality face acquisition in a real-time door monitoring system.

    Image enhancement from a stabilised video sequence

    The aim of video stabilisation is to create a new video sequence in which the motions (i.e. rotations, translations) and scale differences between frames (or parts of a frame) have effectively been removed. These stabilisation effects can be obtained via digital video processing techniques which use only the information extracted from the video sequence itself, with no need for additional hardware or knowledge about the camera's physical motion. A video sequence usually contains a large overlap between successive frames, and regions of the same scene are sampled at different positions. In this paper, this multiple sampling is combined to achieve images with a higher spatial resolution. Higher-resolution imagery plays an important role in assisting the identification of people, vehicles, structures or objects of interest captured by surveillance cameras or by video cameras used in face recognition, traffic monitoring, traffic law enforcement, driver assistance and automatic vehicle guidance systems.
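    The multi-frame principle described above — the same scene regions sampled at slightly different positions across overlapping frames — can be sketched as a naive shift-and-add reconstruction. This is an illustrative toy, not the paper's algorithm: it assumes the per-frame sub-pixel shifts are already known and happen to fall on integer positions of the upscaled grid.

```python
import numpy as np

def shift_and_add_sr(frames, shifts, scale=2):
    """Naive multi-frame super-resolution: place each low-resolution
    frame onto an upscaled grid at its estimated sub-pixel offset,
    then average the accumulated samples per high-resolution pixel.

    frames: list of 2-D grayscale arrays (all the same shape)
    shifts: list of (dy, dx) offsets measured in high-resolution pixels
    """
    h, w = frames[0].shape
    acc = np.zeros((h * scale, w * scale))
    cnt = np.zeros_like(acc)
    for frame, (dy, dx) in zip(frames, shifts):
        ys = np.arange(h) * scale + dy
        xs = np.arange(w) * scale + dx
        acc[np.ix_(ys, xs)] += frame
        cnt[np.ix_(ys, xs)] += 1
    cnt[cnt == 0] = 1  # leave never-sampled pixels at zero, avoid div-by-zero
    return acc / cnt
```

    In practice the shifts come from motion estimation against the stabilised reference and rarely align to the integer grid, so real methods add interpolation and deblurring on top of this accumulation step.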

    Recent Developments in Video Surveillance

    With surveillance cameras installed everywhere and continuously streaming thousands of hours of video, how can that huge amount of data be analyzed or even be useful? Is it possible to search those countless hours of video for subjects or events of interest? Shouldn’t the presence of a car stopped at a railroad crossing trigger an alarm system to prevent a potential accident? In the chapters selected for this book, experts in video surveillance provide answers to these questions and other interesting problems, skillfully blending research experience with practical real-life applications. Academic researchers will find a reliable compilation of relevant literature in addition to pointers to current advances in the field. Industry practitioners will find useful hints about state-of-the-art applications. The book also provides directions for open problems where further advances can be pursued.

    Privacy-Friendly Photo Sharing and Relevant Applications Beyond

    The popularization of online photo sharing brings people great convenience, but it has also raised privacy concerns. Researchers have proposed various approaches to protect image privacy, most of which focus on encrypting or distorting image visual content. In this thesis, we investigate novel solutions to protect image privacy, with a particular emphasis on online photo sharing. To this end, we propose not only algorithms to protect visual privacy in image content but also architectures for privacy-preserving photo sharing. Beyond privacy, we also explore the impact and potential of daily images in three other relevant applications.
    First, we propose and study two image encoding algorithms that protect visual content within a Secure JPEG framework. The first method scrambles a JPEG image by randomly changing the signs of its DCT coefficients based on a secret key. The second method, named JPEG Transmorphing, allows one to protect arbitrary image regions with any obfuscation, while secretly preserving the original image regions in application segments of the obfuscated JPEG image. Performance evaluations reveal an acceptable storage overhead and a good level of privacy protection for both methods, and, for JPEG Transmorphing in particular, a good level of pleasantness when proper manipulations are applied.
    Second, we investigate two architectures for privacy-preserving photo sharing. The first, named ProShare, is built on a public key infrastructure (PKI) integrated with ciphertext-policy attribute-based encryption (CP-ABE) to enable secure and efficient access to user-posted photos protected by Secure JPEG. In the second, named ProShare S, a photo sharing service provider helps users make sharing decisions automatically, based on their past decisions, using machine learning. The service analyzes not only the content of a user's photo but also context information about the image capture and the prospective requester, and then decides whether or not to share a particular photo with the requester and, if so, at which granularity. A user study along with extensive evaluations was performed to validate the proposed architecture.
    Finally, we investigate three topics concerning daily photos captured or shared by people, beyond their privacy implications. In the first study, inspired by JPEG Transmorphing, we propose an animated JPEG file format, named aJPEG, which preserves its animation frames as application markers in a JPEG image and offers smaller file size and better image quality than conventional GIF. In the second study, we attempt to understand the impact of popular image manipulations applied in online photo sharing on the emotions evoked in observers; the study reveals that image manipulations indeed influence people's emotions, but that this impact also depends on the image content. In the last study, we employ a deep convolutional neural network (CNN), the GoogLeNet model, to perform automatic food image detection and categorization. The promising results obtained provide meaningful insights for the design of automatic dietary assessment systems based on multimedia techniques, e.g. image analysis.
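    The first encoding method — flipping the signs of DCT coefficients under a secret key — can be illustrated with a toy sketch over an array of coefficients. This is not the thesis implementation (which operates inside the JPEG codec on quantized coefficient blocks); it only shows why key-driven sign flipping is reversible by the key holder.

```python
import numpy as np

def scramble_signs(coeffs, key):
    """Pseudo-randomly flip the signs of DCT coefficients, driven by a
    secret key. Because the same key reproduces the same flip pattern,
    applying the function twice restores the original coefficients
    (sign flipping is an involution)."""
    rng = np.random.default_rng(key)
    flips = rng.integers(0, 2, size=coeffs.shape)  # one flip bit per coefficient
    return np.where(flips == 1, -coeffs, coeffs)
```

    Without the key an attacker sees coefficients with correct magnitudes but randomized signs, which destroys the visual content while leaving the file size essentially unchanged.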

    Do We Train on Test Data? The Impact of Near-Duplicates on License Plate Recognition

    This work draws attention to the large fraction of near-duplicates in the training and test sets of datasets widely adopted in License Plate Recognition (LPR) research. These duplicates refer to images that, although different, show the same license plate. Our experiments, conducted on the two most popular datasets in the field, show a substantial decrease in recognition rate when six well-known models are trained and tested under fair splits, that is, in the absence of duplicates across the training and test sets. Moreover, in one of the datasets, the ranking of the models changed considerably when they were trained and tested under duplicate-free splits. These findings suggest that such duplicates have significantly biased the evaluation and development of deep learning-based models for LPR. The list of near-duplicates we have found and proposals for fair splits are publicly available for further research at https://raysonlaroca.github.io/supp/lpr-train-on-test/
    Comment: Accepted for presentation at the International Joint Conference on Neural Networks (IJCNN) 2023
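    A duplicate-free split of the kind advocated here can be sketched by grouping images by their plate text and splitting at the group level, so that no plate ever appears in both partitions. The function below is a hypothetical illustration of that idea, not the authors' released splits.

```python
import random

def fair_split(samples, test_ratio=0.3, seed=0):
    """Split image samples so that all images of the same license plate
    land in the same partition, preventing near-duplicate leakage
    between the training and test sets.

    samples: list of (image_id, plate_text) tuples
    Returns (train, test) lists of tuples.
    """
    plates = sorted({plate for _, plate in samples})
    rng = random.Random(seed)
    rng.shuffle(plates)
    n_test = max(1, int(len(plates) * test_ratio))
    test_plates = set(plates[:n_test])
    train = [s for s in samples if s[1] not in test_plates]
    test = [s for s in samples if s[1] in test_plates]
    return train, test
```

    Splitting by plate rather than by image is the same group-wise discipline used elsewhere to avoid train/test leakage (e.g. grouping by patient or by scene).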

    Video content analysis for intelligent forensics

    The networks of surveillance cameras installed in public places and private territories continuously record video data with the aim of detecting and preventing unlawful activities. This enhances the importance of video content analysis applications, whether for real-time (i.e. analytic) or post-event (i.e. forensic) analysis. This thesis focuses on four key aspects of video content analysis: (1) moving object detection and recognition; (2) correction of colours in video frames and recognition of the colours of moving objects; (3) make and model recognition of vehicles and identification of their type; and (4) detection and recognition of text information in outdoor scenes.
    To address the first issue, the first part of the thesis presents a framework that efficiently detects and recognizes moving objects in videos, targeting the problem of object detection in the presence of complex backgrounds. The object detection part of the framework relies on a background modelling technique and a novel post-processing step in which the contours of the foreground regions (i.e. moving objects) are refined by classifying edge segments as belonging either to the background or to the foreground. Further, a novel feature descriptor is devised for classifying moving objects into humans, vehicles and background; it captures the texture information present in the silhouettes of foreground objects.
    To address the second issue, a framework for the correction and recognition of the true colours of objects in videos is presented, with novel noise reduction, colour enhancement and colour recognition stages. The colour recognition stage uses temporal information to reliably recognize the true colours of moving objects across multiple frames. The framework is specifically designed to perform robustly on videos of poor quality caused by surrounding illumination, camera sensor imperfections and artefacts due to high compression.
    In the third part of the thesis, a framework for vehicle make and model recognition and type identification is presented. As part of this work, a novel feature representation technique for the distinctive representation of vehicle images has emerged; it uses dense feature description and a mid-level feature encoding scheme to capture the texture in the frontal view of vehicles. The proposed method is insensitive to minor in-plane rotation and skew within the image, and the framework can be extended to any number of vehicle classes without re-training. Another important contribution of this work is the publication of a comprehensive, up-to-date dataset of vehicle images to support future research in this domain.
    The problem of text detection and recognition in images is addressed in the last part of the thesis. A novel technique is proposed that exploits the colour information in the image to identify text regions; the colour information is also used to segment characters from words. The identified characters are recognized using shape features and supervised learning, and a lexicon-based alignment procedure finalizes the recognition of the strings present in word images.
    Extensive experiments have been conducted on benchmark datasets to analyse the performance of the proposed algorithms. The results show that the proposed moving object detection and recognition technique outperformed well-known baseline techniques, and that the framework for the correction and recognition of object colours in video frames achieved all the aforementioned goals. The performance analysis of the vehicle make and model recognition framework on multiple datasets has shown the strength and reliability of the technique in various scenarios. Finally, the experimental results for the text detection and recognition framework on benchmark datasets reveal the potential of the proposed scheme for accurate detection and recognition of text in the wild.
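    The background-modelling stage underlying the first framework can be illustrated with a minimal running-average sketch: maintain a slowly updated background estimate and flag pixels that deviate from it as foreground. The thesis's actual model and its edge-segment contour refinement are more elaborate, so treat this only as a baseline illustration.

```python
import numpy as np

def detect_foreground(frames, alpha=0.05, threshold=25):
    """Running-average background subtraction.

    frames: iterable of 2-D grayscale arrays; the first frame seeds the
    background model. Returns the foreground mask (boolean array) for
    the final frame.
    """
    frames = iter(frames)
    background = next(frames).astype(float)
    mask = np.zeros(background.shape, dtype=bool)
    for frame in frames:
        frame = frame.astype(float)
        mask = np.abs(frame - background) > threshold
        # update the background only where the scene looks static,
        # so moving objects do not get absorbed into the model
        background = np.where(mask, background,
                              (1 - alpha) * background + alpha * frame)
    return mask
```

    The learning rate `alpha` trades adaptation speed against the risk of absorbing slow-moving objects into the background; both names here are illustrative, not the thesis's notation.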

    Super-resolution towards license plate recognition

    Advisor: David Menotti. Master's dissertation, Universidade Federal do Paraná, Setor de Ciências Exatas, Graduate Program in Informatics. Defended: Curitiba, 24/04/2023. Includes references: p. 51-59. Area of concentration: Computer Science.
    Abstract: Recent years have seen significant developments in the field of License Plate Recognition (LPR) through the integration of deep learning techniques and the increasing availability of training data. Nevertheless, reconstructing license plates (LPs) from low-resolution (LR) surveillance footage remains challenging. To address this issue, we introduce a Single-Image Super-Resolution (SISR) approach that integrates attention and transformer modules to enhance the detection of structural and textural features in LR images. Our approach incorporates sub-pixel convolution layers (also known as PixelShuffle) and a loss function that uses an Optical Character Recognition (OCR) model for feature extraction. We trained the proposed architecture on synthetic images created by applying heavy Gaussian noise to high-resolution LP images from two public datasets, followed by bicubic downsampling; as a result, the generated images have a Structural Similarity Index Measure (SSIM) of less than 0.10. Our experimental results show that the proposed approach for reconstructing these low-resolution synthesized images outperforms existing ones in both quantitative and qualitative measures.
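    The sub-pixel convolution (PixelShuffle) layer mentioned in the abstract upscales by rearranging channel depth into spatial resolution. A minimal NumPy sketch of that depth-to-space step, following the standard PixelShuffle layout rather than code from the dissertation:

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) array into (C, H*r, W*r): each block of
    r*r input channels supplies the r-by-r sub-pixel grid of one output
    channel. This is the depth-to-space step of sub-pixel convolution."""
    c_r2, h, w = x.shape
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)      # split channels into the r-by-r grid
    x = x.transpose(0, 3, 1, 4, 2)    # interleave: (c, h, r, w, r)
    return x.reshape(c, h * r, w * r)
```

    In a super-resolution network a convolution first expands the feature map to C*r*r channels, and this rearrangement then produces the upscaled output, which is cheaper than convolving at high resolution.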