62 research outputs found
Color Transfer and Colorization Based on Textural Properties
This paper targets two related color manipulation problems: color transfer, which modifies an image's colors, and colorization, which adds colors to a greyscale image. Automatic methods for both applications modify the input image using a reference that contains the desired colors. Previous approaches usually do not target both applications and suffer from two main limitations: possible misleading associations between input and reference regions, and poor spatial coherence around image structures. In this paper, we propose a unified framework that uses the textural content of the images to guide the color transfer and colorization. Our method introduces an edge-aware texture descriptor based on region covariance, allowing for local color transformations. We show that our approach produces results comparable to or better than state-of-the-art methods in both applications.
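The region-covariance idea behind the texture descriptor can be illustrated with a minimal sketch. The per-pixel feature vector used here is a common choice from the region-covariance literature, not necessarily the paper's exact edge-aware variant:

```python
import numpy as np

def region_covariance(patch):
    """5x5 covariance descriptor of a grayscale patch, using the
    per-pixel feature vector (x, y, I, |dI/dx|, |dI/dy|)."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    gy, gx = np.gradient(patch.astype(float))
    feats = np.stack([xs.ravel(), ys.ravel(), patch.ravel(),
                      np.abs(gx).ravel(), np.abs(gy).ravel()])
    return np.cov(feats)  # symmetric 5x5 descriptor of the region

C = region_covariance(np.random.rand(16, 16))
print(C.shape)  # (5, 5)
```

Because the descriptor is a small symmetric matrix, regions can be compared with a matrix distance regardless of their pixel count, which is what makes local, per-region color transformations practical.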
A Machine Learning Framework for Generating Photorealistic Photos of Real Time Objects using Adam Optimizer by a Generative Adversarial Network (GAN)
Training on photographs can produce new photographs that, to human observers, appear at least superficially authentic, with many realistic features. This paper discusses a number of intriguing GAN applications to help develop an understanding of the types of problems where GANs can be used and useful. The list is not exhaustive, but it includes numerous examples of GAN applications that have garnered media attention. This paper proposes a Framework for Generating Photorealistic Photos of real-time Objects (FGPPO) using the Adam optimizer with Generative Adversarial Networks.
Weakly- and Self-Supervised Learning for Content-Aware Deep Image Retargeting
This paper proposes a weakly- and self-supervised deep convolutional neural
network (WSSDCNN) for content-aware image retargeting. Our network takes a
source image and a target aspect ratio, and then directly outputs a retargeted
image. Retargeting is performed through a shift map, which is a pixel-wise
mapping from the source to the target grid. Our method implicitly learns an
attention map, which leads to a content-aware shift map for image retargeting.
As a result, discriminative parts in an image are preserved, while background
regions are adjusted seamlessly. In the training phase, pairs of an image and
its image-level annotation are used to compute content and structure losses. We
demonstrate the effectiveness of our proposed method for a retargeting
application with insightful analyses. Comment: 10 pages, 11 figures. To appear in ICCV 2017, Spotlight Presentation
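The shift-map idea, a pixel-wise mapping from the target grid back into the source grid, can be sketched as follows. A hand-made shift map stands in for the network's learned, attention-guided one:

```python
import numpy as np

def apply_shift_map(src, shift):
    """Retarget `src` (H x W_src) to the width of `shift` (H x W_tgt).

    shift[y, x] is the horizontal offset added to target column x to
    find the source column: a pixel-wise target-to-source mapping.
    """
    h, wt = shift.shape
    out = np.empty((h, wt), dtype=src.dtype)
    for y in range(h):
        cols = np.clip(np.arange(wt) + shift[y], 0, src.shape[1] - 1)
        out[y] = src[y, cols]
    return out

src = np.arange(12).reshape(2, 6)        # 2 x 6 source image
shift = np.tile([0, 1, 2, 3], (2, 1))    # squeeze to width 4
print(apply_shift_map(src, shift))       # [[0 2 4 5], [6 8 10 11]]
```

In the paper the shift map is predicted by the network from the source image and target aspect ratio; content-aware behavior comes from the implicit attention map making the offsets small over discriminative regions and large over background.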
Artificial Intelligence in the Creative Industries: A Review
This paper reviews the current state of the art in Artificial Intelligence
(AI) technologies and applications in the context of the creative industries. A
brief background of AI, and specifically Machine Learning (ML) algorithms, is
provided, including Convolutional Neural Networks (CNNs), Generative Adversarial
Networks (GANs), Recurrent Neural Networks (RNNs) and Deep Reinforcement
Learning (DRL). We categorise creative applications into five groups related to
how AI technologies are used: i) content creation, ii) information analysis,
iii) content enhancement and post production workflows, iv) information
extraction and enhancement, and v) data compression. We critically examine the
successes and limitations of this rapidly advancing technology in each of these
areas. We further differentiate between the use of AI as a creative tool and
its potential as a creator in its own right. We foresee that, in the near
future, machine learning-based AI will be adopted widely as a tool or
collaborative assistant for creativity. In contrast, we observe that the
successes of machine learning in domains with fewer constraints, where AI is
the `creator', remain modest. The potential of AI (or its developers) to win
awards for its original creations in competition with human creatives is also
limited, based on contemporary technologies. We therefore conclude that, in the
context of creative industries, maximum benefit from AI will be derived where
its focus is human centric -- where it is designed to augment, rather than
replace, human creativity.
Iterative, Deep Synthetic Aperture Sonar Image Segmentation
Synthetic aperture sonar (SAS) systems produce high-resolution images of the
seabed environment. Moreover, deep learning has demonstrated superior ability
in finding robust features for automating imagery analysis. However, the
success of deep learning is conditioned on having lots of labeled training
data, but obtaining generous pixel-level annotations of SAS imagery is often
practically infeasible. This challenge has thus far limited the adoption of
deep learning methods for SAS segmentation. Algorithms exist to segment SAS
imagery in an unsupervised manner, but they lack the benefit of
state-of-the-art learning methods and the results present significant room for
improvement. In view of the above, we propose a new iterative algorithm for
unsupervised SAS image segmentation combining superpixel formation, deep
learning, and traditional clustering methods. We call our method Iterative Deep
Unsupervised Segmentation (IDUS). IDUS is an unsupervised learning framework
that can be divided into four main steps: 1) A deep network estimates class
assignments. 2) Low-level image features from the deep network are clustered
into superpixels. 3) Superpixels are clustered into class assignments (which we
call pseudo-labels) using k-means. 4) Resulting pseudo-labels are used for
loss backpropagation of the deep network prediction. These four steps are
performed iteratively until convergence. A comparison of IDUS to current
state-of-the-art methods on a realistic benchmark dataset for SAS image
segmentation demonstrates the benefits of our proposal even as the IDUS incurs
a much lower computational burden during inference (actual labeling of a test
image). Finally, we also develop a semi-supervised (SS) extension of IDUS
called IDSS and demonstrate experimentally that it can further enhance
performance while outperforming supervised alternatives that exploit the same
labeled training imagery. Comment: arXiv admin note: text overlap with arXiv:2107.1456
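Steps 2-4 of the IDUS loop can be sketched with generic stand-ins: random features replace the deep network's activations, and a tiny k-means replaces a library clusterer. All names here are illustrative:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Tiny k-means (Lloyd's algorithm), just for this sketch."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def idus_iteration(features, superpixel_ids, n_classes):
    """Steps 2-3 of IDUS: pool per-pixel features into superpixels,
    then cluster superpixels into pseudo-labels. The caller would
    train the deep network on these labels (step 4), re-extract
    features (step 1), and repeat until convergence."""
    n_sp = superpixel_ids.max() + 1
    sp_feats = np.stack([features[superpixel_ids == s].mean(axis=0)
                         for s in range(n_sp)])
    sp_labels = kmeans(sp_feats, n_classes)
    return sp_labels[superpixel_ids]   # broadcast back to pixels

feats = np.random.rand(100, 8)          # stand-in for deep features
sp_ids = np.repeat(np.arange(10), 10)   # 10 superpixels of 10 pixels
pseudo = idus_iteration(feats, sp_ids, n_classes=3)
print(pseudo.shape)  # (100,)
```

Note that every pixel in a superpixel receives the same pseudo-label, which is what regularizes the network's noisy per-pixel predictions between iterations.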
Application of the Distance Transform Method in Linear Discriminant Analysis for Skin Appearance in Skin Detection
Skin detection plays an important role in various image processing applications, from face detection and face tracking to pornographic content filtering, content-based image retrieval systems, and various human-computer interaction domains. Color-based approaches can detect skin color well using a skin probability map (SPM) with Bayes' rule. However, SPM has problems detecting skin texture. Linear discriminant analysis (LDA) is a feature extraction algorithm; in skin detection, it is used to extract skin texture features and can address the SPM problem. However, LDA itself has problems when used to extract skin texture features with different kernels. The distance transform (DT) is an algorithm that computes, for each pixel of a binary image, the distance to its nearest feature point; DT can overcome this limitation of LDA. A combination of the SPM, LDA, and DT algorithms is proposed to improve the performance of skin appearance in skin detection. The proposed method is evaluated on the IBTD dataset. The results show that the proposed method achieves a significant improvement in detection accuracy over SPM and LDA.
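The Bayes-rule skin probability map (SPM) component can be sketched as follows. The toy 8-bin-per-channel histograms stand in for ones that would be estimated from IBTD training data:

```python
import numpy as np

def skin_probability(pixels, skin_hist, nonskin_hist, p_skin=0.5):
    """P(skin | color) via Bayes' rule, with 8 histogram bins per
    RGB channel (256 / 8 = 32 intensity values per bin)."""
    b = (pixels // 32).astype(int)
    idx = (b[:, 0], b[:, 1], b[:, 2])
    num = skin_hist[idx] * p_skin
    den = num + nonskin_hist[idx] * (1.0 - p_skin)
    return num / np.maximum(den, 1e-12)

# Toy histograms: one "skin-colored" bin stands out.
skin = np.full((8, 8, 8), 1e-3)
skin[6, 4, 3] = 0.9
nonskin = np.full((8, 8, 8), 0.1)
px = np.array([[200, 150, 100],    # falls in the skin-colored bin
               [10, 10, 10]])      # dark, non-skin pixel
print(skin_probability(px, skin, nonskin))  # high, then near zero
```

Thresholding this map yields the binary skin mask; the paper's contribution is to refine that color-only decision with LDA texture features and a distance transform, which are not reproduced here.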
Prospects for Theranostics in Neurosurgical Imaging: Empowering Confocal Laser Endomicroscopy Diagnostics via Deep Learning
Confocal laser endomicroscopy (CLE) is an advanced optical fluorescence
imaging technology that has the potential to increase intraoperative precision,
extend resection, and tailor surgery for malignant invasive brain tumors
because of its subcellular dimension resolution. Despite its promising
diagnostic potential, interpreting the gray tone fluorescence images can be
difficult for untrained users. In this review, we provide a detailed
description of the bioinformatics analysis methodology for CLE images that begins
to assist the neurosurgeon and pathologist to rapidly connect on-the-fly
intraoperative imaging, pathology, and surgical observation into a
conclusive system within the concept of theranostics. We present an overview
and discuss deep learning models for automatic detection of the diagnostic CLE
images and discuss various training regimes and ensemble modeling effect on the
power of deep learning predictive models. Two major approaches reviewed in this
paper include the models that can automatically classify CLE images into
diagnostic/nondiagnostic, glioma/nonglioma, tumor/injury/normal categories and
models that can localize histological features on the CLE images using weakly
supervised methods. We also briefly review advances in the deep learning
approaches used for CLE image analysis in other organs. Significant advances in
speed and precision of automated diagnostic frame selection would augment the
diagnostic potential of CLE, improve operative workflow and integration into
brain tumor surgery. Such technology and bioinformatics analytics lend
themselves to improved precision, personalization, and theranostics in brain
tumor treatment. Comment: See the final version published in Frontiers in Oncology here:
https://www.frontiersin.org/articles/10.3389/fonc.2018.00240/ful
Multi Traffic Scene Perception Using Support Vector Machine and Digital Image Processing
Traffic accidents are especially frequent under adverse conditions such as rain, night driving, ice, and roads without street lighting. Current driver-assistance systems, however, are designed to operate in good weather. Classification is a way to make vision-enhancement methods more efficient under such conditions. To improve computer vision in adverse weather environments, a multi-class weather classification system is built from multiple weather features and supervised learning. First, basic visual features are extracted from multiple traffic images and combined into an eight-dimensional feature vector. Second, five supervised learning methods are used to train classifiers. Analysis of the extracted features indicates that they describe the images accurately, and the classifiers achieve high recognition rates and good adaptability. The proposed method provides a basis for forward-vehicle detection under changing nighttime illumination, as well as for extending the driver's field of view on icy days. Image feature extraction is the most important process in pattern recognition and the most effective way to simplify high-dimensional image data, since it is hard to obtain useful information directly from the M × N × 3 image matrix. Therefore, to perceive multi-traffic scenes, the key information must be extracted from the image.
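The reduction of an M × N × 3 image to a compact feature vector can be illustrated with a minimal sketch. These four stand-in features are illustrative only; the paper's eight-dimensional feature set is not specified here:

```python
import numpy as np

def weather_features(img):
    """Reduce an M x N x 3 RGB image (floats in [0, 1]) to a short
    feature vector; stand-ins for the paper's eight features."""
    gray = img.mean(axis=2)
    brightness = gray.mean()                       # day vs. night
    contrast = gray.std()                          # haze lowers contrast
    saturation = (img.max(axis=2) - img.min(axis=2)).mean()
    gy, gx = np.gradient(gray)
    edge_energy = np.hypot(gx, gy).mean()          # rain/fog blur edges
    return np.array([brightness, contrast, saturation, edge_energy])

img = np.random.rand(64, 64, 3)
print(weather_features(img).shape)  # (4,)
```

A classifier such as an SVM would then be trained on one such vector per labeled traffic image, which is far more tractable than learning directly from the raw pixel matrix.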
Perceptually inspired image estimation and enhancement
Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Brain and Cognitive Sciences, 2009. Includes bibliographical references (p. 137-144).

In this thesis, we present three image estimation and enhancement algorithms inspired by human vision. In the first part of the thesis, we propose an algorithm for mapping one image to another based on the statistics of a training set. Many vision problems can be cast as image mapping problems, such as estimating reflectance from luminance, estimating shape from shading, and separating signal from noise. Such problems are typically under-constrained, and yet humans are remarkably good at solving them. Classic computational theories about the ability of the human visual system to solve such under-constrained problems attribute this feat to the use of some intuitive regularities of the world, e.g., surfaces tend to be piecewise constant. In recent years, there has been considerable interest in deriving more sophisticated statistical constraints from natural images, but because of the high-dimensional nature of images, representing and utilizing the learned models remains a challenge. Our techniques produce models that are very easy to store and to query. We show these techniques to be effective for a number of applications: removing noise from images, estimating a sharp image from a blurry one, decomposing an image into reflectance and illumination, and interpreting lightness illusions.

In the second part of the thesis, we present an algorithm for compressing the dynamic range of an image while retaining important visual detail. The human visual system confronts a serious challenge with dynamic range, in that the physical world has an extremely high dynamic range, while neurons have low dynamic ranges. The human visual system performs dynamic range compression by applying automatic gain control, in both the retina and the visual cortex. Taking inspiration from that, we designed techniques that involve multi-scale subband transforms and smooth gain control on subband coefficients, and resemble the contrast gain control mechanism in the visual cortex. We show our techniques to be successful in producing dynamic-range-compressed images without compromising the visibility of detail or introducing artifacts. We also show that the techniques can be adapted for the related problem of "companding", in which a high dynamic range image is converted to a low dynamic range image and saved using fewer bits, and later expanded back to high dynamic range with minimal loss of visual quality.

In the third part of the thesis, we propose a technique that enables a user to easily localize image and video editing by drawing a small number of rough scribbles. Image segmentation, usually treated as an unsupervised clustering problem, is extremely difficult to solve. With a minimal degree of user supervision, however, we are able to generate selection masks of good quality. Our technique learns a classifier using the user-scribbled pixels as training examples, and uses the classifier to classify the rest of the pixels into distinct classes. It then uses the classification results as per-pixel data terms, combines them with a smoothness term that respects color discontinuities, and generates better results than state-of-the-art algorithms for interactive segmentation.

by Yuanzhen Li. Ph.D.
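The smooth gain control on subband coefficients described in the second part can be sketched in isolation. A single power-law gain curve stands in for the thesis's multi-scale subband machinery, and `sigma` and `gamma` are illustrative parameters:

```python
import numpy as np

def gain_control(coef, sigma=0.1, gamma=0.6):
    """Smoothly attenuate large-magnitude subband coefficients.

    gain = (|c| / sigma)^(gamma - 1): coefficients below sigma keep
    (roughly) unit gain, while larger ones are compressed, which
    reduces dynamic range without discarding fine detail.
    """
    mag = np.abs(coef) / sigma
    gain = np.power(mag + 1e-6, gamma - 1.0)
    return coef * np.minimum(gain, 1.0)

c = np.array([0.01, 0.1, 1.0, 10.0])
print(gain_control(c))  # small values pass through, large ones shrink
```

Applied per subband of a multi-scale transform and inverted, such a curve compresses overall range while preserving local contrast; inverting the gain (dividing instead of multiplying on decode) gives the "companding" variant.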