    Color Image Edge Detection and Segmentation: A Comparison of the Vector Angle and the Euclidean Distance Color Similarity Measures

    This work is based on Shafer's Dichromatic Reflection Model as applied to color image formation. The color spaces RGB, XYZ, CIELAB, CIELUV, rgb, l1l2l3, and the new h1h2h3 color space are discussed from this perspective. Two color similarity measures are studied: the Euclidean distance and the vector angle. The work in this thesis is motivated from a practical point of view by several shortcomings of current methods. The first problem is the inability of all known methods to properly segment objects from the background without interference from object shadows and highlights. The second shortcoming is the non-examination of the vector angle as a distance measure that is capable of directly evaluating hue similarity without considering intensity especially in RGB. Finally, there is inadequate research on the combination of hue- and intensity-based similarity measures to improve color similarity calculations given the advantages of each color distance measure. These distance measures were used for two image understanding tasks: edge detection, and one strategy for color image segmentation, namely color clustering. Edge detection algorithms using Euclidean distance and vector angle similarity measures as well as their combinations were examined. The list of algorithms is comprised of the modified Roberts operator, the Sobel operator, the Canny operator, the vector gradient operator, and the 3x3 difference vector operator. Pratt's Figure of Merit is used for a quantitative comparison of edge detection results. Color clustering was examined using the k-means (based on the Euclidean distance) and Mixture of Principal Components (based on the vector angle) algorithms. A new quantitative image segmentation evaluation procedure is introduced to assess the performance of both algorithms. 
Quantitative and qualitative results on many color images (artificial images, staged scenes, and natural scenes) indicate good edge detection performance using a vector version of the Sobel operator on the h1h2h3 color space. The results using combined hue- and intensity-based difference measures show a slight qualitative improvement over using each measure independently in RGB. Quantitative and qualitative results for image segmentation on the same set of images suggest that the best image segmentation results are obtained using the Mixture of Principal Components algorithm on the RGB, XYZ and rgb color spaces. Finally, poor color clustering results in the h1h2h3 color space suggest that some assumptions in deriving a simplified version of the Dichromatic Reflection Model might have been violated.
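
The difference between the two similarity measures can be illustrated with a minimal sketch. It assumes colors are plain RGB triples and that the angle between a color and pure black is treated as zero; the function names are illustrative and do not come from the thesis.

```python
import math

def euclidean_distance(c1, c2):
    """Straight-line distance between two color vectors (sensitive to intensity)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

def vector_angle(c1, c2):
    """Angle in radians between two color vectors (insensitive to uniform scaling)."""
    dot = sum(a * b for a, b in zip(c1, c2))
    n1 = math.sqrt(sum(a * a for a in c1))
    n2 = math.sqrt(sum(b * b for b in c2))
    if n1 == 0 or n2 == 0:
        # Assumption: the angle is undefined for black, so report zero.
        return 0.0
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.acos(cosine)

bright_red = (200, 40, 40)
dark_red = (100, 20, 20)   # same hue at half the intensity

print(euclidean_distance(bright_red, dark_red))  # large: intensity differs
print(vector_angle(bright_red, dark_red))        # near zero: same direction
```

Here the two reds differ only by a scale factor, so the vector angle is close to zero while the Euclidean distance is large, which is the intensity-independence property the abstract refers to.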

    Sketch Plus Colorization Deep Convolutional Neural Networks for Photos Generation from Sketches

    In this paper, we introduce a method to generate photos from sketches using Deep Convolutional Neural Networks (DCNN). The proposed method combines a network that inverts sketches into photos (sketch inversion net) with a network that predicts color from grayscale images (colorization net). With this combination, the generated photos are expected to be closer to the actual photos. We first artificially constructed uncontrolled conditions for the dataset. The dataset, which consists of hand-drawn sketches and their corresponding photos, was pre-processed using several data augmentation techniques so that the trained models could address the issues of rotation, scaling, shape, noise, and positioning. Validation was measured using two types of similarity measurements: pixel-difference-based metrics and human visual system (HVS) metrics, which mimic human perception in evaluating the quality of an image. The pixel-difference-based metrics consist of Mean Squared Error (MSE) and Peak Signal-to-Noise Ratio (PSNR), while the HVS metrics consist of Universal Image Quality Index (UIQI) and Structural Similarity (SSIM). Our method gives the best quality of generated photos for all measures (844.04 for MSE, 19.06 for PSNR, 0.47 for UIQI, and 0.66 for SSIM).
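
As a rough illustration of the two pixel-difference-based metrics, the sketch below computes MSE and PSNR for images given as flat lists of pixel values. It assumes 8-bit images (peak value 255) and is not the evaluation code used in the paper.

```python
import math

def mse(reference, generated):
    """Mean squared error between two equally sized flat pixel lists."""
    if len(reference) != len(generated):
        raise ValueError("images must have the same number of pixels")
    return sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)

def psnr(reference, generated, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit pixels."""
    error = mse(reference, generated)
    if error == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / error)

reference = [0, 0, 0, 0]
generated = [10, 10, 10, 10]
print(mse(reference, generated))
print(psnr(reference, generated))
```

Lower MSE and higher PSNR both indicate a closer match. Averaging PSNR over a test set is generally not the same as computing PSNR from the averaged MSE, so per-dataset figures need not satisfy the formula exactly.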

    Virtual Rephotography: Novel View Prediction Error for 3D Reconstruction

    The ultimate goal of many image-based modeling systems is to render photo-realistic novel views of a scene without visible artifacts. Existing evaluation metrics and benchmarks focus mainly on the geometric accuracy of the reconstructed model, which is, however, a poor predictor of visual accuracy. Furthermore, using only geometric accuracy by itself does not allow evaluating systems that either lack a geometric scene representation or utilize coarse proxy geometry. Examples include light field or image-based rendering systems. We propose a unified evaluation approach based on novel view prediction error that is able to analyze the visual quality of any method that can render novel views from input images. One of the key advantages of this approach is that it does not require ground truth geometry. This dramatically simplifies the creation of test datasets and benchmarks. It also allows us to evaluate the quality of an unknown scene during the acquisition and reconstruction process, which is useful for acquisition planning. We evaluate our approach on a range of methods including standard geometry-plus-texture pipelines as well as image-based rendering techniques, compare it to existing geometry-based benchmarks, and demonstrate its utility for a range of use cases. Comment: 10 pages, 12 figures; paper was submitted to ACM Transactions on Graphics for review.

    Accessibility-based reranking in multimedia search engines

    Traditional multimedia search engines retrieve results based mostly on the query submitted by the user, or using a log of previous searches to provide personalized results, while not considering the accessibility of the results for users with vision or other types of impairments. In this paper, a novel approach is presented which incorporates the accessibility of images for users with various vision impairments, such as color blindness, cataract and glaucoma, in order to rerank the results of an image search engine. The accessibility of individual images is measured through the use of vision simulation filters. Multi-objective optimization techniques utilizing the image accessibility scores are used to handle users with multiple vision impairments, while the impairment profile of a specific user is used to select one of the Pareto-optimal solutions. The proposed approach has been tested with two image datasets, using both simulated and real impaired users, and the results verify its applicability. Although the proposed method has been used for vision accessibility-based reranking, it can also be extended to other types of personalization context.
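
The idea of keeping Pareto-optimal candidates and then choosing one with a user profile can be sketched on a toy scale. The example assumes each candidate has one accessibility score per impairment (higher is better) and that the profile is applied as a weighted sum; both the data and the weighted-sum selection rule are hypothetical and are not taken from the paper.

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(scores):
    """Keep only the candidates that no other candidate dominates."""
    return {
        name: s
        for name, s in scores.items()
        if not any(dominates(other, s) for other in scores.values())
    }

def pick_for_user(front, profile):
    """Assumed selection rule: highest profile-weighted sum of scores."""
    return max(front, key=lambda name: sum(w * v for w, v in zip(profile, front[name])))

# Hypothetical scores: (color-blindness accessibility, cataract accessibility)
scores = {"a": (0.9, 0.2), "b": (0.5, 0.6), "c": (0.4, 0.5), "d": (0.2, 0.9)}

front = pareto_front(scores)            # "c" is dominated by "b" and drops out
choice = pick_for_user(front, (0.2, 0.8))
print(sorted(front), choice)
```

A user whose profile weights the second impairment more heavily ends up with a different candidate than a user who weights the first, while dominated candidates are never selected for anyone.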

    Full Reference Objective Quality Assessment for Reconstructed Background Images

    With an increased interest in applications that require a clean background image, such as video surveillance, object tracking, street view imaging and location-based services on web-based maps, multiple algorithms have been developed to reconstruct a background image from cluttered scenes. Traditionally, statistical measures and existing image quality techniques have been applied for evaluating the quality of the reconstructed background images. Though these quality assessment methods have been widely used in the past, their performance in evaluating the perceived quality of the reconstructed background image has not been verified. In this work, we discuss the shortcomings in existing metrics and propose a full reference Reconstructed Background image Quality Index (RBQI) that combines color and structural information at multiple scales using a probability summation model to predict the perceived quality in the reconstructed background image given a reference image. To compare the performance of the proposed quality index with existing image quality assessment measures, we construct two different datasets consisting of reconstructed background images and corresponding subjective scores. The quality assessment measures are evaluated by correlating their objective scores with human subjective ratings. The correlation results show that the proposed RBQI outperforms all the existing approaches. Additionally, the constructed datasets and the corresponding subjective scores provide a benchmark to evaluate the performance of future metrics that are developed to evaluate the perceived quality of reconstructed background images. Comment: Associated source code: https://github.com/ashrotre/RBQI, Associated Database: https://drive.google.com/drive/folders/1bg8YRPIBcxpKIF9BIPisULPBPcA5x-Bk?usp=sharing (Email for permissions at: ashrotreasuedu)

    Monitoring Processes in Visual Search Enhanced by Professional Experience: The Case of Orange Quality-Control Workers

    Visual search tasks have often been used to investigate how cognitive processes change with expertise. Several studies have shown visual experts' advantages in detecting objects related to their expertise. Here, we tried to extend these findings by investigating whether professional search experience could boost top-down monitoring processes involved in visual search, independently of advantages specific to objects of expertise. To this end, we recruited a group of quality-control workers employed in citrus farms. Given the specific features of this type of job, we expected that the extensive employment of monitoring mechanisms during orange selection could enhance these mechanisms even in search situations in which orange-related expertise is not applicable. To test this hypothesis, we compared the performance of our experimental group and of a well-matched control group on a computerized visual search task. In one block the target was an orange (expertise target) while in the other block the target was a Smurfette doll (neutral target). The a priori hypothesis was that quality-controllers would show an advantage in those situations in which monitoring was especially involved, that is, when deciding the presence/absence of the target required a more extensive inspection of the search array. Results were consistent with our hypothesis. Quality-controllers were faster in those conditions that extensively required monitoring processes, specifically, the Smurfette-present and both target-absent conditions. No differences emerged in the orange-present condition, which turned out to rely mainly on bottom-up processes. These results suggest that top-down processes in visual search can be enhanced through immersive real-life experience beyond visual expertise advantages.

    Physical Representation-based Predicate Optimization for a Visual Analytics Database

    Querying the content of images, video, and other non-textual data sources requires expensive content extraction methods. Modern extraction techniques are based on deep convolutional neural networks (CNNs) and can classify objects within images with astounding accuracy. Unfortunately, these methods are slow: processing a single image can take about 10 milliseconds on modern GPU-based hardware. As massive video libraries become ubiquitous, running a content-based query over millions of video frames is prohibitive. One promising approach to reduce the runtime cost of queries of visual content is to use a hierarchical model, such as a cascade, where simple cases are handled by an inexpensive classifier. Prior work has sought to design cascades that optimize the computational cost of inference by, for example, using smaller CNNs. However, we observe that there are critical factors besides the inference time that dramatically impact the overall query time. Notably, by treating the physical representation of the input image as part of our query optimization (that is, by including image transforms, such as resolution scaling or color-depth reduction, within the cascade) we can optimize data handling costs and enable drastically more efficient classifier cascades. In this paper, we propose Tahoma, which generates and evaluates many potential classifier cascades that jointly optimize the CNN architecture and input data representation. Our experiments on a subset of ImageNet show that Tahoma's input transformations speed up cascades by up to 35 times. We also find up to a 98x speedup over the ResNet50 classifier with no loss in accuracy, and a 280x speedup if some accuracy is sacrificed. Comment: Camera-ready version of the paper submitted to ICDE 2019, In Proceedings of the 35th IEEE International Conference on Data Engineering (ICDE 2019)
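
The general shape of a classifier cascade with an input transform can be sketched as follows. This is a toy illustration, not Tahoma's implementation: the "image" is a flat list of intensities, the transform simply subsamples it, and both classifiers are hypothetical stand-ins based on mean brightness.

```python
def downscale(pixels):
    """Stand-in for resolution scaling: keep every fourth pixel."""
    return pixels[::4]

def cheap_classifier(pixels):
    """Hypothetical inexpensive model: returns (label, confidence)."""
    mean = sum(pixels) / len(pixels)
    label = "bright" if mean >= 128 else "dark"
    confidence = abs(mean - 128) / 128  # far from the boundary means confident
    return label, confidence

def expensive_classifier(pixels):
    """Hypothetical costly model, assumed to always be trusted."""
    mean = sum(pixels) / len(pixels)
    return ("bright" if mean >= 128 else "dark"), 1.0

def run_cascade(image, stages, fallback):
    """Try each (transform, classifier, threshold) stage; fall back if none is confident."""
    for transform, classifier, threshold in stages:
        label, confidence = classifier(transform(image))
        if confidence >= threshold:
            return label, classifier.__name__
    label, _ = fallback(image)
    return label, fallback.__name__

stages = [(downscale, cheap_classifier, 0.5)]
easy_case = run_cascade([250] * 16, stages, expensive_classifier)
hard_case = run_cascade([130] * 16, stages, expensive_classifier)
print(easy_case, hard_case)
```

Easy inputs are answered by the cheap stage on the reduced representation, and only ambiguous inputs reach the expensive model on the full input, which is where the savings in such a design would come from.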