
    Color Transfer and Colorization Based on Textural Properties

    Get PDF
    This paper targets two related color manipulation problems: color transfer, which modifies an image's colors, and colorization, which adds colors to a greyscale image. Automatic methods for these two applications modify the input image using a reference that contains the desired colors. Previous approaches usually do not target both applications, and they suffer from two main limitations: possibly misleading associations between input and reference regions, and poor spatial coherence around image structures. In this paper, we propose a unified framework that uses the textural content of the images to guide the color transfer and colorization. Our method introduces an edge-aware texture descriptor based on region covariance, allowing for local color transformations. We show that our approach is able to produce results comparable to or better than state-of-the-art methods in both applications.
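
    A minimal sketch of a region-covariance texture descriptor of the kind this abstract builds on, assuming a simple five-feature pixel vector (position, intensity, gradient magnitudes); the edge-aware weighting and the full transfer pipeline are the paper's own contributions and are not reproduced here.

        import numpy as np
        from scipy.linalg import logm

        def region_covariance(patch):
            """Covariance descriptor of a grayscale patch: each pixel contributes
            a feature vector (x, y, I, |Ix|, |Iy|), and the patch is summarized
            by the 5x5 covariance of those vectors."""
            h, w = patch.shape
            ys, xs = np.mgrid[0:h, 0:w]
            iy, ix = np.gradient(patch.astype(np.float64))
            feats = np.stack([xs.ravel(), ys.ravel(), patch.ravel(),
                              np.abs(ix).ravel(), np.abs(iy).ravel()])
            return np.cov(feats)

        def covariance_distance(c1, c2, eps=1e-6):
            """Log-Euclidean distance, a common way to compare covariance
            descriptors, which live on the manifold of SPD matrices."""
            d = logm(c1 + eps * np.eye(len(c1))) - logm(c2 + eps * np.eye(len(c2)))
            return np.linalg.norm(d, "fro")

    Matching input and reference regions by nearest covariance descriptor is what allows the color transformation to be estimated locally rather than globally.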

    A Machine Learning Framework for Generating Photorealistic Photos of Real Time Objects using Adam Optimizer by a Generative Adversarial Network (GAN)

    Get PDF
    GAN training can produce new photographs that, to human observers, appear at least superficially authentic, with many realistic features. This paper discusses a number of intriguing GAN applications in order to help develop an understanding of the types of problems where GANs can be used and useful. The list is not exhaustive, but it includes numerous examples of GAN applications that have garnered media attention. This paper proposes a Framework for Generating Photorealistic Photos of real-time Objects (FGPPO) using the Adam optimizer with a generative adversarial network.
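
    The abstract names the two ingredients, a GAN and the Adam optimizer, without detailing the FGPPO architecture, so the following is only a generic PyTorch sketch of one GAN training step under Adam; the toy fully-connected networks and hyperparameters are placeholder assumptions.

        import torch
        import torch.nn as nn

        # Toy stand-ins for the generator and discriminator.
        G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
        D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

        opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
        opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
        bce = nn.BCEWithLogitsLoss()

        def train_step(real):  # real: (batch, 784), scaled to [-1, 1]
            z = torch.randn(real.size(0), 100)
            fake = G(z)

            # Discriminator update: push real toward 1, fake toward 0.
            opt_d.zero_grad()
            loss_d = (bce(D(real), torch.ones(real.size(0), 1))
                      + bce(D(fake.detach()), torch.zeros(real.size(0), 1)))
            loss_d.backward()
            opt_d.step()

            # Generator update: make the discriminator label fakes as real.
            opt_g.zero_grad()
            loss_g = bce(D(fake), torch.ones(real.size(0), 1))
            loss_g.backward()
            opt_g.step()
            return loss_d.item(), loss_g.item()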

    Weakly- and Self-Supervised Learning for Content-Aware Deep Image Retargeting

    Full text link
    This paper proposes a weakly- and self-supervised deep convolutional neural network (WSSDCNN) for content-aware image retargeting. Our network takes a source image and a target aspect ratio, and then directly outputs a retargeted image. Retargeting is performed through a shift map, which is a pixel-wise mapping from the source to the target grid. Our method implicitly learns an attention map, which leads to a content-aware shift map for image retargeting. As a result, discriminative parts in an image are preserved, while background regions are adjusted seamlessly. In the training phase, pairs of an image and its image-level annotation are used to compute content and structure losses. We demonstrate the effectiveness of our proposed method for a retargeting application with insightful analyses. Comment: 10 pages, 11 figures. To appear in ICCV 2017, Spotlight Presentation.
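
    The network that predicts the shift map is the paper's contribution; the sketch below only illustrates the mapping itself, i.e. how a pixel-wise horizontal shift map resamples a source image onto a narrower target grid. The function name and the nearest-neighbor sampling are assumptions for illustration.

        import numpy as np

        def apply_shift_map(src, shift):
            """Resample a source image onto the target grid via a shift map.

            src:   (H, W_src, 3) source image
            shift: (H, W_tgt) horizontal offsets; target pixel (y, x) is
                   taken from source pixel (y, x + shift[y, x]).
            """
            h, w_tgt = shift.shape
            ys, xs = np.mgrid[0:h, 0:w_tgt]
            src_x = np.clip(xs + shift, 0, src.shape[1] - 1).astype(int)
            return src[ys, src_x]

    A content-aware shift map compresses background columns more than salient ones, which is why discriminative regions survive the aspect-ratio change.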

    Artificial Intelligence in the Creative Industries: A Review

    Full text link
    This paper reviews the current state of the art in Artificial Intelligence (AI) technologies and applications in the context of the creative industries. A brief background of AI, and specifically Machine Learning (ML) algorithms, is provided, including Convolutional Neural Networks (CNNs), Generative Adversarial Networks (GANs), Recurrent Neural Networks (RNNs) and Deep Reinforcement Learning (DRL). We categorise creative applications into five groups related to how AI technologies are used: i) content creation, ii) information analysis, iii) content enhancement and post production workflows, iv) information extraction and enhancement, and v) data compression. We critically examine the successes and limitations of this rapidly advancing technology in each of these areas. We further differentiate between the use of AI as a creative tool and its potential as a creator in its own right. We foresee that, in the near future, machine learning-based AI will be adopted widely as a tool or collaborative assistant for creativity. In contrast, we observe that the successes of machine learning in domains with fewer constraints, where AI is the 'creator', remain modest. The potential of AI (or its developers) to win awards for its original creations in competition with human creatives is also limited, based on contemporary technologies. We therefore conclude that, in the context of creative industries, maximum benefit from AI will be derived where its focus is human centric -- where it is designed to augment, rather than replace, human creativity.

    Iterative, Deep Synthetic Aperture Sonar Image Segmentation

    Full text link
    Synthetic aperture sonar (SAS) systems produce high-resolution images of the seabed environment. Moreover, deep learning has demonstrated superior ability in finding robust features for automating imagery analysis. However, the success of deep learning is conditioned on having lots of labeled training data, and obtaining generous pixel-level annotations of SAS imagery is often practically infeasible. This challenge has thus far limited the adoption of deep learning methods for SAS segmentation. Algorithms exist to segment SAS imagery in an unsupervised manner, but they lack the benefit of state-of-the-art learning methods and their results leave significant room for improvement. In view of the above, we propose a new iterative algorithm for unsupervised SAS image segmentation combining superpixel formation, deep learning, and traditional clustering methods. We call our method Iterative Deep Unsupervised Segmentation (IDUS). IDUS is an unsupervised learning framework that can be divided into four main steps: 1) A deep network estimates class assignments. 2) Low-level image features from the deep network are clustered into superpixels. 3) Superpixels are clustered into class assignments (which we call pseudo-labels) using k-means. 4) The resulting pseudo-labels are used for loss backpropagation of the deep network prediction. These four steps are performed iteratively until convergence. A comparison of IDUS to current state-of-the-art methods on a realistic benchmark dataset for SAS image segmentation demonstrates the benefits of our proposal, even as IDUS incurs a much lower computational burden during inference (the actual labeling of a test image). Finally, we also develop a semi-supervised (SS) extension of IDUS called IDSS and demonstrate experimentally that it can further enhance performance while outperforming supervised alternatives that exploit the same labeled training imagery.
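
    Following the four numbered steps, here is a hedged sketch of one IDUS iteration; the superpixel routine (SLIC here), the feature shapes, and the net_update callback are illustrative assumptions, not the paper's exact components.

        import numpy as np
        from sklearn.cluster import KMeans
        from skimage.segmentation import slic

        def idus_iteration(image, deep_features, net_update,
                           n_classes=4, n_superpixels=400):
            """One iteration of the IDUS loop described in the abstract.

            deep_features: (H, W, C) low-level feature map from the network.
            net_update:    callback that backpropagates the pseudo-label map.
            """
            # Step 2: group pixels into superpixels.
            sp = slic(image, n_segments=n_superpixels, start_label=0)
            sp_feats = np.stack([deep_features[sp == i].mean(axis=0)
                                 for i in range(sp.max() + 1)])

            # Step 3: cluster superpixels into pseudo-labels with k-means.
            labels = KMeans(n_clusters=n_classes, n_init=10).fit_predict(sp_feats)
            pseudo = labels[sp]  # broadcast superpixel labels back to pixels

            # Step 4: train the network against the pseudo-labels
            # (step 1, the class estimate, is the network's own forward pass).
            net_update(pseudo)
            return pseudo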

    Applying the Distance Transform Method to Linear Discriminant Analysis for Skin Appearance in Skin Detection

    Full text link
    Skin detection plays an important role in a range of image processing applications, from face detection, face tracking, and pornographic content filtering to content-based image retrieval systems and various human-computer interaction domains. Color-based approaches can detect skin color well using a skin probability map (SPM) with Bayes' rule. However, SPM has problems detecting skin texture. Linear discriminant analysis (LDA) is a feature extraction algorithm; in skin detection it is used to extract skin texture features, which addresses the SPM problem. However, LDA itself has problems when used to extract skin texture features over different kernels. The distance transform (DT) is an algorithm that computes, for each pixel of a binary image, the distance to its nearest feature point, and it can overcome the problem in LDA. A combination of the SPM, LDA, and DT algorithms is proposed to improve the performance of skin appearance in skin detection. The proposed method is evaluated on the IBTD dataset. The results show that the proposed method yields a significant improvement in detection accuracy over SPM and LDA.
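
    A minimal sketch of the first stage, the skin probability map with Bayes' rule over color histograms; the LDA texture features and the distance-transform refinement that the abstract combines with it are not reproduced, and the histogram binning is an assumption.

        import numpy as np

        def skin_probability_map(img, skin_hist, nonskin_hist,
                                 prior_skin=0.5, bins=32):
            """Per-pixel P(skin | color) via Bayes' rule.

            skin_hist / nonskin_hist: (bins, bins, bins) normalized RGB
            histograms estimated from labeled training pixels.
            img: (H, W, 3) uint8 image.
            """
            idx = (img.astype(int) // (256 // bins)).reshape(-1, 3)
            p_skin = skin_hist[idx[:, 0], idx[:, 1], idx[:, 2]]
            p_non = nonskin_hist[idx[:, 0], idx[:, 1], idx[:, 2]]
            post = (p_skin * prior_skin) / (p_skin * prior_skin
                                            + p_non * (1 - prior_skin) + 1e-12)
            return post.reshape(img.shape[:2])

    Thresholding the map (e.g. at 0.5) gives the binary skin mask that the texture and distance-transform stages then refine.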

    Prospects for Theranostics in Neurosurgical Imaging: Empowering Confocal Laser Endomicroscopy Diagnostics via Deep Learning

    Get PDF
    Confocal laser endomicroscopy (CLE) is an advanced optical fluorescence imaging technology that has the potential to increase intraoperative precision, extend resection, and tailor surgery for malignant invasive brain tumors because of its subcellular dimension resolution. Despite its promising diagnostic potential, interpreting the gray tone fluorescence images can be difficult for untrained users. In this review, we provide a detailed description of a bioinformatical analysis methodology for CLE images that begins to assist the neurosurgeon and pathologist to rapidly connect on-the-fly intraoperative imaging, pathology, and surgical observation into a conclusionary system within the concept of theranostics. We present an overview of, and discuss, deep learning models for automatic detection of diagnostic CLE images, and we discuss the effect of various training regimes and ensemble modeling on the power of deep learning predictive models. Two major approaches reviewed in this paper include models that can automatically classify CLE images into diagnostic/nondiagnostic, glioma/nonglioma, or tumor/injury/normal categories, and models that can localize histological features on CLE images using weakly supervised methods. We also briefly review advances in the deep learning approaches used for CLE image analysis in other organs. Significant advances in the speed and precision of automated diagnostic frame selection would augment the diagnostic potential of CLE and improve operative workflow and integration into brain tumor surgery. Such technology and bioinformatics analytics lend themselves to improved precision, personalization, and theranostics in brain tumor treatment. Comment: See the final version published in Frontiers in Oncology: https://www.frontiersin.org/articles/10.3389/fonc.2018.00240/full
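
    As a concrete illustration of the frame classifiers the review surveys, here is a hedged transfer-learning sketch that fine-tunes a pretrained CNN into a diagnostic/nondiagnostic CLE frame classifier; the backbone, learning rate, and two-class head are assumptions, and the reviewed models and ensembles differ in their details.

        import torch
        import torch.nn as nn
        from torchvision import models

        # Fine-tune an ImageNet-pretrained backbone on CLE frames.
        net = models.resnet18(weights="IMAGENET1K_V1")
        net.fc = nn.Linear(net.fc.in_features, 2)  # diagnostic vs. nondiagnostic

        opt = torch.optim.Adam(net.parameters(), lr=1e-4)
        loss_fn = nn.CrossEntropyLoss()

        def train_batch(frames, labels):  # frames: (B, 3, 224, 224)
            opt.zero_grad()
            loss = loss_fn(net(frames), labels)
            loss.backward()
            opt.step()
            return loss.item()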

    Multi Traffic Scene Perception Using Support Vector Machine and Digital Image Processing

    Get PDF
    Traffic accidents are especially frequent in low-visibility conditions: rainy days, night, the rainy season, ice, and roads without street lighting. Current driver-assistance systems, however, are designed to operate in good weather. Classifying the weather from optical characteristics of the scene is an efficient way to extend the operating range of such vision systems. To improve computer vision in adverse weather environments, a multi-class weather classification system was built from multiple weather features and supervised learning. First, basic visual features are extracted from multiple traffic images and assembled into an eight-dimensional feature vector. Second, five supervised learning methods are used to train classifiers. Analysis of the extracted features indicates that they describe the images accurately, and the trained classifiers achieve high recognition accuracy and good adaptability. The proposed method provides a basis for improving forward vehicle detection under changing night-time illumination and for extending the usable field of view when driving on icy days. Image feature extraction is the most important process in pattern recognition and the most efficient way to simplify high-dimensional image data, because it is hard to obtain useful information directly from the M × N × 3 image matrix. Therefore, to perceive a multi-traffic scene, the key information must be extracted from the image.
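
    A hedged sketch of the feature-plus-SVM pipeline the abstract outlines, assuming an illustrative eight-dimensional descriptor (the abstract does not specify the actual features) fed to an RBF-kernel SVM via scikit-learn.

        import numpy as np
        from sklearn.svm import SVC
        from sklearn.preprocessing import StandardScaler
        from sklearn.pipeline import make_pipeline

        def weather_features(img):
            """Illustrative 8-D descriptor of a traffic image (placeholder
            features, not the paper's exact set)."""
            gray = img.mean(axis=2)
            gy, gx = np.gradient(gray)
            return np.array([
                gray.mean(), gray.std(),                  # brightness, contrast
                np.hypot(gx, gy).mean(),                  # mean edge strength
                img[..., 2].mean() - img[..., 0].mean(),  # blue-red color cast
                *np.histogram(gray, bins=4, range=(0, 255), density=True)[0],
            ])

        clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
        # X: (n_images, 8) stacked descriptors, y: weather class per image
        # clf.fit(X, y); clf.predict(weather_features(test_img)[None])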

    Perceptually inspired image estimation and enhancement

    Get PDF
    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Brain and Cognitive Sciences, 2009. Includes bibliographical references (p. 137-144). By Yuanzhen Li.

    In this thesis, we present three image estimation and enhancement algorithms inspired by human vision. In the first part of the thesis, we propose an algorithm for mapping one image to another based on the statistics of a training set. Many vision problems can be cast as image mapping problems, such as estimating reflectance from luminance, estimating shape from shading, and separating signal and noise. Such problems are typically under-constrained, and yet humans are remarkably good at solving them. Classic computational theories about the ability of the human visual system to solve such under-constrained problems attribute this feat to the use of some intuitive regularities of the world, e.g., surfaces tend to be piecewise constant. In recent years, there has been considerable interest in deriving more sophisticated statistical constraints from natural images, but because of the high-dimensional nature of images, representing and utilizing the learned models remains a challenge. Our techniques produce models that are very easy to store and to query. We show these techniques to be effective for a number of applications: removing noise from images, estimating a sharp image from a blurry one, decomposing an image into reflectance and illumination, and interpreting lightness illusions.

    In the second part of the thesis, we present an algorithm for compressing the dynamic range of an image while retaining important visual detail. The human visual system confronts a serious challenge with dynamic range, in that the physical world has an extremely high dynamic range, while neurons have low dynamic ranges. The human visual system performs dynamic range compression by applying automatic gain control, in both the retina and the visual cortex. Taking inspiration from that, we designed techniques that involve multi-scale subband transforms and smooth gain control on subband coefficients, and resemble the contrast gain control mechanism in the visual cortex. We show our techniques to be successful in producing dynamic-range-compressed images without compromising the visibility of detail or introducing artifacts. We also show that the techniques can be adapted for the related problem of "companding", in which a high dynamic range image is converted to a low dynamic range image and saved using fewer bits, and later expanded back to high dynamic range with minimal loss of visual quality.

    In the third part of the thesis, we propose a technique that enables a user to easily localize image and video editing by drawing a small number of rough scribbles. Image segmentation, usually treated as an unsupervised clustering problem, is extremely difficult to solve. With a minimal degree of user supervision, however, we are able to generate selection masks with good quality. Our technique learns a classifier using the user-scribbled pixels as training examples, and uses the classifier to classify the rest of the pixels into distinct classes. It then uses the classification results as per-pixel data terms, combines them with a smoothness term that respects color discontinuities, and generates better results than state-of-the-art algorithms for interactive segmentation.
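
    The second part's gain-control idea lends itself to a compact sketch: build a multi-scale subband (Laplacian pyramid) decomposition of log luminance and apply a smooth compressive gain to each coefficient. The pyramid depth, gain exponent, and use of scipy primitives are illustrative assumptions, not the thesis's exact transform.

        import numpy as np
        from scipy.ndimage import gaussian_filter, zoom

        def compress_dynamic_range(lum, levels=5, gamma=0.6, eps=1e-4):
            """Smooth gain control on Laplacian-pyramid coefficients: each
            band coefficient c is scaled by (|c| + eps)**(gamma - 1), which
            attenuates large (high-contrast) coefficients more than small
            ones while leaving fine detail relatively intact."""
            log_lum = np.log10(lum + eps)
            gauss = [log_lum]
            for _ in range(levels):  # Gaussian pyramid by blur + decimate
                gauss.append(gaussian_filter(gauss[-1], 2)[::2, ::2])
            out = gauss[-1]
            for g in reversed(gauss[:-1]):  # reconstruct with per-band gain
                up = zoom(out, (g.shape[0] / out.shape[0],
                                g.shape[1] / out.shape[1]), order=1)
                band = g - up
                out = up + band * (np.abs(band) + eps) ** (gamma - 1)
            return 10 ** out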