357 research outputs found

    Automatic Test Methods for Image and Video Verification

    In this thesis, four methods for automatic verification of images and video on mobile platforms are developed. Both the case of recording images and video and the case of viewing images and video on the mobile LCD screen are considered. The first method is used to test the zoom function of the camera. It uses SURF descriptors along with clustering and histograms to determine which of six discrete zoom levels the current frame belongs to. The second method identifies color effects and color anomalies using histograms. The third method determines whether the autofocus works correctly by measuring the average length of edges in the image. The fourth method is an artifact detection scheme using a non-reference implementation of the SSIM metric, used in conjunction with a test setup designed specifically for this purpose. Together these methods form a toolkit for detecting the most common errors that occur in images and video during the development stage of mobile platforms.
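
As a rough illustration of the histogram-based color check described in this abstract, the sketch below compares per-channel histograms of a frame against a reference. The abstract does not say how the histograms are compared, so the use of histogram intersection, the function names, the bin count and the threshold are all assumptions made for this example, not the thesis's method.

```python
def channel_histogram(values, bins=8, max_value=255):
    """Normalised histogram of one colour channel (integer values in 0..max_value)."""
    if not values:
        raise ValueError("channel is empty")
    counts = [0] * bins
    for v in values:
        idx = min(v * bins // (max_value + 1), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def histogram_intersection(h1, h2):
    """Overlap of two normalised histograms: 1.0 means identical, 0.0 means disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def has_color_anomaly(frame, reference, threshold=0.8, bins=8):
    """Flag a frame if any RGB channel overlaps too little with the reference.

    frame and reference are lists of (r, g, b) tuples; threshold is a
    hypothetical tuning parameter, not a value from the thesis.
    """
    for ch in range(3):
        h_frame = channel_histogram([p[ch] for p in frame], bins)
        h_ref = channel_histogram([p[ch] for p in reference], bins)
        if histogram_intersection(h_frame, h_ref) < threshold:
            return True
    return False


reference = [(200, 100, 50)] * 100
no_red = [(0, 100, 50)] * 100  # simulated loss of the red channel
print(has_color_anomaly(reference, reference))
print(has_color_anomaly(no_red, reference))
```

A frame identical to the reference is not flagged, while a frame whose red channel has collapsed is flagged because its red histogram no longer overlaps the reference.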

    Underwater image restoration: super-resolution and deblurring via sparse representation and denoising by means of marine snow removal

    Underwater imaging has been widely used as a tool in many fields; however, a major issue is the quality of the resulting images and videos. Due to the light's interaction with water and its constituents, the acquired underwater images and videos often suffer from a significant amount of scatter (blur, haze) and noise. In light of these issues, this thesis considers the problems of low-resolution, blurred and noisy underwater images and proposes several approaches to improve the quality of such images and video frames. Quantitative and qualitative experiments validate the success of the proposed algorithms.

    Content-based Image Understanding with Applications to Affective Computing and Person Recognition in Natural Settings

    Understanding the visual content of images is one of the most important topics in computer vision. Many researchers have tried to teach the machine to see and perceive like a human. In this dissertation, we develop several new approaches for image understanding with applications to affective computing, and person detection and recognition. Our proposed method, applied to fashion photo analysis, can understand the aesthetic quality of photos. Further, a bilinear model that takes into account the relative confidence of region proposals and the mutual relationship between multiple labels is developed to boost multi-label classification. It is evaluated on both object recognition and aesthetic attribute learning. We also develop a person detection and recognition system for natural settings that can robustly handle various poses, viewpoints, and lighting conditions. The system is then applied to several real scenarios that have different amounts of labelled data. Our algorithm, which utilizes unlabelled data, reduces the effort needed for data annotation while achieving results similar to those obtained with labelled data.

    Representations and representation learning for image aesthetics prediction and image enhancement

    With the continual improvement in cell phone cameras and improvements in the connectivity of mobile devices, we have seen an exponential increase in the images that are captured, stored and shared on social media. For example, as of July 1st 2017 Instagram had over 715 million registered users who had posted just shy of 35 billion images. This represented approximately seven- and nine-fold increases in the number of users and photos, respectively, present on Instagram since 2012. Whether the images are stored on personal computers or reside on social networks (e.g. Instagram, Flickr), the sheer number of images calls for methods to determine various image properties, such as object presence or appeal, for the purpose of automatic image management and curation. One of the central problems in consumer photography is determining the aesthetic appeal of an image, which motivates us to explore questions related to understanding aesthetic preferences, image enhancement and the possibility of using such models on devices with constrained resources. In this dissertation, we present our work on exploring representations and representation learning approaches for aesthetic inference, composition ranking and its application to image enhancement. Firstly, we discuss early representations that mainly consisted of expert features, and their potential to enhance Convolutional Neural Networks (CNN). Secondly, we discuss the ability of resource-constrained CNNs, and the different architecture choices (input size and layer depth), in solving various aesthetic inference tasks: binary classification, regression, and image cropping. We show that if trained for solving fine-grained aesthetics inference, such models can rival the cropping performance of other aesthetics-based croppers; however, they fall short in comparison to models trained for composition ranking. Lastly, we discuss our work on exploring and identifying the design choices in training composition ranking functions, with the goal of using them for image composition enhancement.

    Data-Driven Image Restoration

    Every day many images are taken by digital cameras, and people are demanding visually accurate and pleasing results. Noise and blur degrade images captured by modern cameras, and high-level vision tasks (such as segmentation, recognition, and tracking) require high-quality images. Therefore, image restoration, specifically image deblurring and image denoising, is a critical preprocessing step. A fundamental problem in image deblurring is to reliably recover distinct spatial frequencies that have been suppressed by the blur kernel. Existing image deblurring techniques often rely on generic image priors that only help recover part of the frequency spectrum, such as the frequencies near the high end. To this end, we pose the following specific questions: (i) Does class-specific information offer an advantage over existing generic priors for image quality restoration? (ii) If a class-specific prior exists, how should it be encoded into a deblurring framework to recover attenuated image frequencies? Throughout this work, we devise a class-specific prior based on the band-pass filter responses and incorporate it into a deblurring strategy. Specifically, we show that the subspace of band-pass filtered images and their intensity distributions serve as useful priors for recovering image frequencies. Next, we present a novel image denoising algorithm that uses an external, category-specific image database. In contrast to existing noisy image restoration algorithms, our method selects clean image “support patches” similar to the noisy patch from an external database. We employ a content-adaptive distribution model for each patch, where we derive the parameters of the distribution from the support patches. Our objective function is composed of a Gaussian fidelity term that imposes category-specific information, and a low-rank term that encourages the similarity between the noisy and the support patches in a robust manner. Finally, we propose to learn a fully-convolutional network model that consists of a Chain of Identity Mapping Modules (CIMM) for image denoising. The CIMM structure possesses two distinctive features that are important for the noise removal task. Firstly, each residual unit employs identity mappings as the skip connections and receives pre-activated input to preserve the gradient magnitude propagated in both the forward and backward directions. Secondly, by utilizing dilated kernels for the convolution layers in the residual branch, each neuron in the last convolution layer of each module can observe the full receptive field of the first layer.
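
To make the role of dilated kernels concrete, the sketch below computes the receptive field of a stack of stride-1 convolutions using the standard relation RF = 1 + sum((k_i - 1) * d_i). The layer configurations in the example are hypothetical and are not taken from the thesis; they only show how dilation enlarges the receptive field without adding layers.

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field (in pixels, along one axis) of stacked stride-1 convolutions.

    Each layer with kernel size k and dilation d widens the field by (k - 1) * d.
    """
    if len(kernel_sizes) != len(dilations):
        raise ValueError("kernel_sizes and dilations must have the same length")
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf


# Hypothetical three-layer residual branch with 3x3 kernels.
plain = receptive_field([3, 3, 3], [1, 1, 1])    # no dilation
dilated = receptive_field([3, 3, 3], [1, 2, 4])  # increasing dilation
print(plain, dilated)  # prints "7 15"
```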

    On the Design of a Photo Beauty Measurement Mechanism Based on Image Composition and Machine Learning

    In this chapter, we propose a machine learning scheme for measuring the beauty of a photo. Unlike traditional measurements that focus on the quality of captured signals, the beauty of photos is based on high-level concepts from the knowledge of photo aesthetics. Because the concept of beauty is mostly defined by human beings, the measurement must contain some knowledge obtained from them. Therefore, our measurement is realized by a machine learning mechanism, which is trained on data collected from humans. Several computational aesthetic methods are used for building a photo beauty measurement system, including low-level feature extraction, image composition analysis, photo semantics parsing, and classification rule generation. Because the meaning of beauty may vary among people, personal preference is also taken into consideration. In this chapter, the performance of two computational aesthetic methods for the perception of beauty is evaluated. They are based on image composition analysis and low-level features, and use different classifiers to determine whether a photo meets the criteria of professional photography. The experimental results show that both decision tree and multilayer perceptron-based classifiers attain an accuracy of more than 90% in the evaluation.

    Media aesthetics based multimedia storytelling.

    Since the earliest of times, humans have been interested in recording their life experiences, for future reference and for storytelling purposes. This task of recording experiences --i.e., both image and video capture-- has never before in history been as easy as it is today. This is creating a digital information overload that is becoming a great concern for the people who are trying to preserve their life experiences. As high-resolution digital still and video cameras become increasingly pervasive, unprecedented amounts of multimedia are being downloaded to personal hard drives, and also uploaded to online social networks on a daily basis. The work presented in this dissertation is a contribution in the area of multimedia organization, as well as automatic selection of media for storytelling purposes, which eases the human task of summarizing a collection of images or videos in order to be shared with other people. As opposed to some prior art in this area, we have taken an approach in which neither user-generated tags nor comments --that describe the photographs, either in their local or on-line repositories-- are taken into account, and no user interaction with the algorithms is expected. We take an image analysis approach where both the context images --e.g. images from online social networks to which the image stories are going to be uploaded--, and the collection images --i.e., the collection of images or videos that needs to be summarized into a story--, are analyzed using image processing algorithms. This allows us to extract relevant metadata that can be used in the summarization process. Multimedia storytellers usually follow three main steps when preparing their stories: first they choose the main story characters, then the main events to describe, and finally, from these media sub-groups, they choose the media based on their relevance to the story as well as on their aesthetic value. Therefore, one of the main contributions of our work has been the design of computational models --both regression based and classification based-- that correlate well with human perception of the aesthetic value of images and videos. These computational aesthetics models have been integrated into automatic selection algorithms for multimedia storytelling, which are another important contribution of our work. A human-centric approach has been used in all experiments where it was feasible, and also in order to assess the final summarization results, i.e., humans are always the final judges of our algorithms, either by inspecting the aesthetic quality of the media, or by inspecting the final story generated by our algorithms. We are aware that a perfect automatically generated story summary is very hard to obtain, given the many subjective factors that play a role in such a creative process; rather, the presented approach should be seen as a first step in the storytelling creative process which removes some of the groundwork that would be tedious and time consuming for the user. Overall, the main contributions of this work can be summarized in three points: (1) new media aesthetics models for both images and videos that correlate with human perception, (2) new scalable multimedia collection structures that ease the process of media summarization, and finally, (3) new media selection algorithms that are optimized for multimedia storytelling purposes.

    Automated ventricular systems segmentation in brain CT images by combining low-level segmentation and high-level template matching

    Background: Accurate analysis of CT brain scans is vital for diagnosis and treatment of Traumatic Brain Injuries (TBI). Automatic processing of these CT brain scans could speed up the decision making process, lower the cost of healthcare, and reduce the chance of human error. In this paper, we focus on automatic processing of CT brain images to segment and identify the ventricular systems. The segmentation of ventricles provides quantitative measures of the changes of ventricles in the brain, which form vital diagnostic information. Methods: First, all CT slices are aligned by detecting the ideal midlines in all images. The initial estimate of the ideal midline of the brain is found based on skull symmetry and is then further refined using detected anatomical features. Then a two-step method is used for ventricle segmentation. First, a low-level segmentation on each pixel is applied to the CT images. For this step, both Iterated Conditional Mode (ICM) and Maximum A Posteriori Spatial Probability (MASP) are evaluated and compared. The second step applies a template matching algorithm to identify objects in the initial low-level segmentation as ventricles. Experiments for ventricle segmentation are conducted using a relatively large CT dataset containing mild and severe TBI cases. Results: Experiments show that the acceptable rate of the ideal midline detection is over 95%. Two measurements are defined to evaluate ventricle recognition results. The first is a sensitivity-like measure and the second is a false positive-like measure. For the first measurement, the rate is 100%, indicating that all ventricles are identified in all slices. The false positive-like measurement is 8.59%. We also point out the similarities and differences between the ICM and MASP algorithms through both mathematical relationships and segmentation results on CT images. Conclusion: The experiments show the reliability of the proposed algorithms. The novelty of the proposed method lies in its incorporation of anatomical features for ideal midline detection and the two-step ventricle segmentation method. Our method offers the following improvements over existing approaches: accurate detection of the ideal midline and accurate recognition of ventricles using both anatomical features and spatial templates derived from Magnetic Resonance Images.
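
For readers unfamiliar with the low-level step, the following is a minimal sketch of Iterated Conditional Modes on a small grayscale grid. The abstract does not give the energy function used in the paper, so the Gaussian data term, the Potts-style smoothness penalty on the four nearest neighbours, and the parameters `means`, `beta` and `sigma` are all assumptions chosen for illustration.

```python
def icm_segment(image, means, beta=1.0, sigma=1.0, max_iters=10):
    """Label each pixel with a class index by greedy local energy minimisation.

    image: list of rows of floats; means: assumed mean intensity per class.
    """
    height, width = len(image), len(image[0])
    classes = range(len(means))

    # Initial labelling: nearest class mean, ignoring neighbours.
    labels = [[min(classes, key=lambda k: (image[y][x] - means[k]) ** 2)
               for x in range(width)] for y in range(height)]

    def local_energy(y, x, k):
        # Data term: squared distance to the class mean (Gaussian likelihood).
        data = (image[y][x] - means[k]) ** 2 / (2 * sigma ** 2)
        # Smoothness term: pay beta for every 4-neighbour with a different label.
        smooth = 0.0
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width and labels[ny][nx] != k:
                smooth += beta
        return data + smooth

    for _ in range(max_iters):
        changed = False
        for y in range(height):
            for x in range(width):
                current = labels[y][x]
                # Pick the lowest-energy label; keep the current one on ties.
                best = min(classes,
                           key=lambda k: (local_energy(y, x, k), k != current))
                if best != current:
                    labels[y][x] = best
                    changed = True
        if not changed:
            break
    return labels


# A dark 3x3 patch with one moderately bright, isolated centre pixel.
patch = [[0.0, 0.0, 0.0],
         [0.0, 0.6, 0.0],
         [0.0, 0.0, 0.0]]
smoothed = icm_segment(patch, means=[0.0, 1.0], beta=1.0)
unsmoothed = icm_segment(patch, means=[0.0, 1.0], beta=0.0)
print(smoothed[1][1], unsmoothed[1][1])
```

With the smoothness term switched off (`beta=0.0`) the centre pixel keeps the bright label it gets from its intensity alone; with `beta=1.0` the disagreement with its four neighbours outweighs the data term and the pixel is relabelled as background.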