607 research outputs found

    Geometry-Aware Neighborhood Search for Learning Local Models for Image Reconstruction

    Local learning of sparse image models has proven to be very effective for solving inverse problems in many computer vision applications. To learn such models, the data samples are often clustered using the K-means algorithm with the Euclidean distance as a dissimilarity metric. However, the Euclidean distance may not always be a good dissimilarity measure for comparing data samples lying on a manifold. In this paper, we propose two algorithms for determining a local subset of training samples from which a good local model can be computed for reconstructing a given input test sample, where we take into account the underlying geometry of the data. The first algorithm, called Adaptive Geometry-driven Nearest Neighbor search (AGNN), is an adaptive scheme which can be seen as an out-of-sample extension of the replicator graph clustering method for local model learning. The second method, called Geometry-driven Overlapping Clusters (GOC), is a less complex nonadaptive alternative for training subset selection. The proposed AGNN and GOC methods are evaluated in image super-resolution, deblurring and denoising applications and shown to outperform spectral clustering, soft clustering, and geodesic distance based subset selection in most settings. Comment: 15 pages, 10 figures and 5 tables

    Advancement in Denoising MRI Images via 3D-GAN Model with Direction Coupled Magnitude Histogram Consistency Loss

    Diagnostic analysis of medical images is essential for recognizing and understanding a wide range of medical problems. This work introduces the Direction Coupled Magnitude Histogram (DCMH), a novel structural image descriptor intended to improve diagnostic accuracy. A distinguishing feature of DCMH is its ability to capture edge information at any orientation within a frame, so that fine detail can be expressed through various gradient features. The proposed method applies a cartoon-texture-based textural loss and a DCMH-based structural loss to identify and analyse structural and textural information during denoising. A major contribution is improved interpretability of images, achieved by emphasizing the structural aspects inherent to the image. On average, the proposed DCMH_3D_GAN achieves an SSIM of 0.972995 and a PSNR of 48.74, highlighting the effectiveness of the DCMH-based method in enhancing medical image diagnosis. The structural loss improves image interpretability and can support a more precise diagnosis. The DCMH-based approach, which combines texture loss and structural components, is a promising development in healthcare image processing that may enable better patient care through enhanced diagnostic abilities.
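
For readers unfamiliar with the PSNR figure quoted above, the following is a minimal sketch of the standard peak signal-to-noise ratio definition, 10 * log10(peak^2 / MSE). It is not taken from the paper: the function name `psnr`, the flat-list pixel representation, and the default peak value of 255 are assumptions made for illustration, and the abstract does not state which peak value or units its authors used.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    `peak` is the maximum possible pixel value (assumed 255 for 8-bit data).
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the denoised estimate.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)
```

Higher values indicate that the estimate is closer to the reference; a real evaluation would typically flatten a 2D or 3D volume into such a sequence or use an array library instead.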

    Colour-based image retrieval algorithms based on compact colour descriptors and dominant colour-based indexing methods

    Content-based image retrieval (CBIR) has been one of the most active research areas of the last two decades, yet the field is still young. This study addresses three CBIR performance problems: inaccurate image retrieval, high complexity of feature extraction, and degraded retrieval after database indexing. These problems make CBIR difficult to apply on limited-resource devices (such as mobile devices). Therefore, the main objective of this thesis is to improve the performance of CBIR. The images’ Dominant Colours (DCs) are selected as the key feature for this purpose because they are compact and compatible with the human visual system. Semantic image retrieval is proposed to solve the retrieval inaccuracy problem by concentrating on the objects in the images. The effect of the image background is reduced, to put more focus on the object, by assigning weights to the object DCs and the background DCs. The accuracy improvement ratio reaches up to 50% over the compared methods. A weighting-DCs framework is proposed to generalize this technique, and it is demonstrated by applying it to many colour descriptors. To reduce the high computational and memory complexity of the colour Correlogram, a compact representation of the Correlogram is proposed. Additionally, the similarity measure of an existing DC-based Correlogram is adapted to improve its accuracy. Both methods are combined to produce a colour descriptor that is promising in terms of time and memory complexity. As a result, accuracy is increased by up to 30% over the existing methods and the memory space is decreased to less than 10% of its original size. A framework for converting the abundance of colours into a few DCs is proposed to generalize the DC concept. In addition, two DC-based indexing techniques are proposed to address the time problem, using the RGB and perceptual LUV colour spaces. Both methods reduce the search space to less than 25% of the database size while preserving the same accuracy.

    Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes

    Humans have long been recorded in a variety of forms since antiquity. For example, sculptures and paintings were the primary media for depicting human beings before the invention of cameras. However, most current human-centric computer vision tasks like human pose estimation and human image generation focus exclusively on natural images in the real world. Artificial humans, such as those in sculptures, paintings, and cartoons, are commonly neglected, making existing models fail in these scenarios. As an abstraction of life, art incorporates humans in both natural and artificial scenes. We take advantage of it and introduce the Human-Art dataset to bridge related tasks in natural and artificial scenarios. Specifically, Human-Art contains 50k high-quality images with over 123k person instances from 5 natural and 15 artificial scenarios, which are annotated with bounding boxes, keypoints, self-contact points, and text information for humans represented in both 2D and 3D. It is, therefore, comprehensive and versatile for various downstream tasks. We also provide a rich set of baseline results and detailed analyses for related tasks, including human detection, 2D and 3D human pose estimation, image generation, and motion transfer. As a challenging dataset, we hope Human-Art can provide insights for relevant research and open up new research questions. Comment: CVPR202

    Recent Progress in Image Deblurring

    This paper comprehensively reviews the recent development of image deblurring, including non-blind/blind and spatially invariant/variant deblurring techniques. These techniques share the same objective of inferring a latent sharp image from one or several corresponding blurry images, while the blind deblurring techniques are also required to derive an accurate blur kernel. Considering the critical role of image restoration in modern imaging systems to provide high-quality images under complex conditions such as motion, undesirable lighting, and imperfect system components, image deblurring has attracted growing attention in recent years. From the viewpoint of how to handle ill-posedness, which is a crucial issue in deblurring tasks, existing methods can be grouped into five categories: Bayesian inference frameworks, variational methods, sparse representation-based methods, homography-based modeling, and region-based methods. Despite the progress achieved, image deblurring, especially in the blind case, is limited in its success by complex application conditions, which make the blur kernel hard to obtain and spatially variant. We provide a holistic understanding and deep insight into image deblurring in this review. An analysis of the empirical evidence for representative methods, practical issues, and a discussion of promising future directions are also presented. Comment: 53 pages, 17 figures

    COMMUNITY DETECTION AND INFLUENCE MAXIMIZATION IN ONLINE SOCIAL NETWORKS

    Detecting and clustering data and users into communities on the social web is an important and complex problem for developing smart marketing models in changing and evolving social ecosystems. These marketing models are driven by individual decisions to purchase a product, which are influenced by friends and acquaintances. This leads to novel marketing models that view users as members of online social network communities, rather than the traditional view of marketing to individuals. This thesis starts by examining models that detect communities in online social networks. An enhanced community detection approach that clusters similar nodes together is then suggested. Social relationships play an important role in determining user behavior. For example, a user might purchase a product that his/her friend recently bought. Such a phenomenon is called social influence and is used to study how far the action of one user can affect the behaviors of others. Next, an original metric is introduced to compute the influential power of social network users based on logs of common actions, in order to infer a probabilistic influence propagation model. Finally, combining the community detection algorithm with the suggested influence propagation approach yields a new influence maximization model that identifies and uses the most influential users within their communities. In doing so, we employed a fuzzy logic based technique to determine the key users who drive this influence in their communities and diffuse a certain behavior. This original approach contrasts with previous influence propagation models, which did not exploit similarity among members of communities to maximize influence propagation. The performance results show that the model activates a higher number of overall nodes in contemporary social networks, starting from a smaller set of key users, as compared to existing landmark approaches, which influence fewer nodes yet employ a larger set of key users.

    An audio-visual approach to web video categorization

    In this paper we address the issue of automatic video genre categorization of web media using an audio-visual approach. To this end, we propose content descriptors which exploit audio, temporal structure and color information. The potential of our descriptors is experimentally validated both from the perspective of a classification system and as an information retrieval approach. Validation is carried out on a real scenario, namely on more than 288 hours of video footage and 26 video genres specific to the blip.tv media platform. Additionally, to reduce the semantic gap, we propose a new relevance feedback technique which is based on hierarchical clustering. Experimental tests show that retrieval performance can be significantly increased in this case, becoming comparable to that obtained with high-level semantic textual descriptors.

    A unified model for recognition and prediction using a compressed internal timeline

    It has long been understood that there is a deep connection between time and memory. From episodic memory in humans to conditioning tasks in animals, temporal relationships play a crucial role in memory performance. While recognition memory is a subset of episodic memory, most recognition memory models disregard information about time and assume that memory is a composite store with a noisy record of items and their associations. Another class of models posits that memory depends on temporal representations in which ‘what’ and ‘when’ information is stored conjointly. Using three experiments, I found evidence for serially accessing memory (scanning) in both short-term and long-term memory and in predicting the future. These findings support the hypothesis that memories are stored in temporal representations. In Experiment 1, I hypothesized that scanning in a judgment-of-recency task is due to a compressed temporal representation. In 107 healthy young adults, response times depended only on the lag to the target and varied sub-linearly with lag. This result was consistent with the hypothesis. In Experiment 2, the hypothesis was that memory search on a long-term recognition task is driven by serially scanning a compressed representation. In a continuous recognition paradigm with 88 healthy young adults across three studies, the time at which information starts becoming accessible varied as a function of the logarithm of the lag. This result suggests that information in long-term memory is stored in a compressed representation that can be accessed using a serial backward scan. In Experiment 3, I tested the hypothesis that our ability to access what is going to happen a few seconds in the future is similar to our ability to access the immediate past. Sixty healthy young adults performed a relative order judgment task for future events. 
The response times in this novel judgment-of-imminence task showed that a search through a prospective memory representation was serial and closely paralleled the serial search observed in the judgment-of-recency task (Experiment 1). Together, these results suggest that it is possible to generate a temporally ordered representation that can be scanned to access the past and the future.

    Mathematical Approaches for Image Enhancement Problems

    This thesis develops novel techniques for several image enhancement problems using well-established mathematical tools for image processing, such as wavelet transforms, partial differential equations, and variational models. Three subtopics are covered. First, a color image denoising framework is introduced that achieves high-quality denoising results by considering correlations between color components, while allowing existing denoising approaches to be plugged in flexibly. Second, a new and efficient framework for image contrast and color enhancement in the compressed wavelet domain is proposed. The proposed approach is capable of enhancing both global and local contrast and brightness as well as preserving color consistency. The framework does not require an inverse transform for image enhancement, since linear scale factors are applied directly to both scaling and wavelet coefficients in the compressed domain, which results in high computational efficiency. Noise in the image can also be reduced efficiently by introducing wavelet shrinkage terms adaptively at different scales. The proposed method can enhance a wavelet-coded image efficiently, with high image quality and less noise or other artifacts. The experimental results show that the proposed method produces encouraging results both visually and numerically compared to some existing approaches. Finally, the image inpainting problem is discussed. A literature review, psychological analysis, and challenges of the image inpainting problem and related topics are described. An inpainting algorithm using energy minimization and texture mapping is proposed. The Mumford-Shah energy minimization model detects and preserves edges in the inpainting domain by detecting both the main structure and the detailed edges. This approach uses a faster hierarchical level set method and guarantees convergence independent of initial conditions. 
The estimated segmentation results in the inpainting domain are stored in a segmentation map, which is referenced by a texture mapping algorithm for filling textured regions. We also propose an inpainting algorithm using the wavelet transform, which is expected to give better global structure estimation of the unknown region in addition to shape and texture properties, since wavelet transforms have been used for various image analysis problems owing to their multi-resolution properties and decoupling characteristics.

    Textural Difference Enhancement based on Image Component Analysis

    In this thesis, we propose a novel image enhancement method to magnify the textural differences in images with respect to human visual characteristics. The method is intended as a preprocessing step to improve the performance of texture-based image segmentation algorithms. We propose to calculate the six Tamura texture features (coarseness, contrast, directionality, line-likeness, regularity and roughness) using novel measurements. Each feature follows its original interpretation of a certain texture characteristic, but is measured by local low-level features, e.g., the direction of the local edges, the dynamic range of the local pixel intensities, and the kurtosis and skewness of the local image histogram. A discriminant texture feature selection method based on principal component analysis (PCA) is then proposed to find the most representative characteristics for describing textural differences in the image. We decompose the image into pairwise components that represent the texture characteristics strongly and weakly, respectively. A set of wavelet-based soft thresholding methods is proposed as the dictionaries of morphological component analysis (MCA) to sparsely highlight the characteristics strongly and weakly from the image. The wavelet-based thresholding methods are proposed in pairs, so each of the resulting pairwise components exhibits one characteristic either strongly or weakly. We propose various wavelet-based manipulation methods to enhance the components separately. For each component representing a certain texture characteristic, a non-linear function is proposed to manipulate the wavelet coefficients of the component so that the corresponding characteristic is accentuated independently while having little effect on other characteristics. Furthermore, the above three methods are combined into a unified framework for image enhancement. 
Firstly, the texture characteristics that differentiate the textures in the image are found. Secondly, the image is decomposed into components exhibiting these texture characteristics respectively. Thirdly, each component is manipulated to accentuate the corresponding texture characteristics exhibited there. After re-combining these manipulated components, the image is enhanced with the textural differences magnified with respect to the selected texture characteristics. The proposed textural difference enhancement method is used prior to both grayscale and colour image segmentation algorithms. The resulting improvements in the performance of different segmentation algorithms demonstrate the potential of the proposed textural difference enhancement method.
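
The wavelet-based soft thresholding mentioned in the abstract above builds on a standard operator, which can be sketched as follows. This is only the generic soft-thresholding rule, sign(c) * max(|c| - t, 0), applied to a list of coefficients. It is not the thesis's paired thresholding scheme, and the function name `soft_threshold` and the plain-list input are assumptions made for illustration.

```python
def soft_threshold(coeffs, threshold):
    """Apply generic soft thresholding to a sequence of wavelet coefficients.

    Coefficients with magnitude at or below `threshold` become 0.0;
    the rest are shrunk toward zero by `threshold`, keeping their sign.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    shrunk = []
    for c in coeffs:
        magnitude = abs(c) - threshold
        if magnitude <= 0:
            shrunk.append(0.0)  # small coefficient: treated as noise and removed
        else:
            shrunk.append(magnitude if c > 0 else -magnitude)
    return shrunk
```

For example, `soft_threshold([3.0, -0.5, -2.0, 1.0], 1.0)` returns `[2.0, 0.0, -1.0, 0.0]`. In practice the coefficients would come from a wavelet decomposition of the image, and the threshold would be chosen per scale.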