Search CORE

1,157 research outputs found

Medical image synthesis using generative adversarial networks: towards photo-realistic image synthesis

Author: Attia Mohamed Hassan
Publication venue: Deakin University, Deputy Vice-Chancellor Research Group, Institute for Intelligent Systems Research and Innovation
Publication date: 01/09/2018
Field of study

This proposed work addresses the photo-realism for synthetic images. We introduced a modified generative adversarial network: StencilGAN. It is a perceptually-aware generative adversarial network that synthesizes images based on overlaid labelled masks. This technique can be a prominent solution for the scarcity of the resources in the healthcare sector

Deakin Research Online

Intuitive, Interactive Beard and Hair Synthesis with Generative Models

Author: Ceylan Duygu
Chen Weikai
Chen Zhili
Echevarria Jose
Li Hao
Olszewski Kyle
Xing Jun
Publication venue
Publication date: 13/04/2020
Field of study

We present an interactive approach to synthesizing realistic variations in facial hair in images, ranging from subtle edits to existing hair to the addition of complex and challenging hair in images of clean-shaven subjects. To circumvent the tedious and computationally expensive tasks of modeling, rendering and compositing the 3D geometry of the target hairstyle using the traditional graphics pipeline, we employ a neural network pipeline that synthesizes realistic and detailed images of facial hair directly in the target image in under one second. The synthesis is controlled by simple and sparse guide strokes from the user defining the general structural and color properties of the target hairstyle. We qualitatively and quantitatively evaluate our chosen method compared to several alternative approaches. We show compelling interactive editing results with a prototype user interface that allows novice users to progressively refine the generated image to match their desired hairstyle, and demonstrate that our approach also allows for flexible and high-fidelity scalp hair synthesis.Comment: To be presented in the 2020 Conference on Computer Vision and Pattern Recognition (CVPR 2020, Oral Presentation). Supplementary video can be seen at: https://www.youtube.com/watch?v=v4qOtBATrv

arXiv.org e-Print Archive

Crossref

Scipedia

Object segmentation from low depth of field images and video sequences

Author: McDonnell Ian (Researcher in engineering)
Publication venue
Publication date
Field of study

This thesis addresses the problem of autonomous object segmentation. To do so the proposed segementation method uses some prior information, namely that the image to be segmented will have a low depth of field and that the object of interest will be more in focus than the background. To differentiate the object from the background scene, a multiscale wavelet based assessment is proposed. The focus assessment is used to generate a focus intensity map, and a sparse fields level set implementation of active contours is used to segment the object of interest. The initial contour is generated using a grid based technique. The method is extended to segment low depth of field video sequences with each successive initialisation for the active contours generated from the binary dilation of the previous frame's segmentation. Experimental results show good segmentations can be achieved with a variety of different images, video sequences, and objects, with no user interaction or input. The method is applied to two different areas. In the first the segmentations are used to automatically generate trimaps for use with matting algorithms. In the second, the method is used as part of a shape from silhouettes 3D object reconstruction system, replacing the need for a constrained background when generating silhouettes. In addition, not using a thresholding to perform the silhouette segmentation allows for objects with dark components or areas to be segmented accurately. Some examples of 3D models generated using silhouettes are shown

Warwick Research Archives Portal Repository

Applied Visualization in the Neurosciences and the Enhancement of Visualization through Computer Graphics

Author: Eichelbaum Sebastian
Publication venue
Publication date: 27/11/2014
Field of study

The complexity and size of measured and simulated data in many fields of science is increasing constantly. The technical evolution allows for capturing smaller features and more complex structures in the data. To make this data accessible by the scientists, efficient and specialized visualization techniques are required. Maximum efficiency and value for the user can only be achieved by adapting visualization to the specific application area and the specific requirements of the scientific field. Part I: In the first part of my work, I address the visualization in the neurosciences. The neuroscience tries to understand the human brain; beginning at its smallest parts, up to its global infrastructure. To achieve this ambitious goal, the neuroscience uses a combination of three-dimensional data from a myriad of sources, like MRI, CT, or functional MRI. To handle this diversity of different data types and sources, the neuroscience need specialized and well evaluated visualization techniques. As a start, I will introduce an extensive software called \"OpenWalnut\". It forms the common base for developing and using visualization techniques with our neuroscientific collaborators. Using OpenWalnut, standard and novel visualization approaches are available to the neuroscientific researchers too. Afterwards, I am introducing a very specialized method to illustrate the causal relation of brain areas, which was, prior to that, only representable via abstract graph models. I will finalize the first part of my work with an evaluation of several standard visualization techniques in the context of simulated electrical fields in the brain. The goal of this evaluation was clarify the advantages and disadvantages of the used visualization techniques to the neuroscientific community. We exemplified these, using clinically relevant scenarios. Part II: Besides the data preprocessing, which plays a tremendous role in visualization, the final graphical representation of the data is essential to understand structure and features in the data. The graphical representation of data can be seen as the interface between the data and the human mind. The second part of my work is focused on the improvement of structural and spatial perception of visualization -- the improvement of the interface. Unfortunately, visual improvements using computer graphics methods of the computer game industry is often seen sceptically. In the second part, I will show that such methods can be applied to existing visualization techniques to improve spatiality and to emphasize structural details in the data. I will use a computer graphics paradigm called \"screen space rendering\". Its advantage, amongst others, is its seamless applicability to nearly every visualization technique. I will start with two methods that improve the perception of mesh-like structures on arbitrary surfaces. Those mesh structures represent second-order tensors and are generated by a method named \"TensorMesh\". Afterwards I show a novel approach to optimally shade line and point data renderings. With this technique it is possible for the first time to emphasize local details and global, spatial relations in dense line and point data.In vielen Bereichen der Wissenschaft nimmt die Größe und Komplexität von gemessenen und simulierten Daten zu. Die technische Entwicklung erlaubt das Erfassen immer kleinerer Strukturen und komplexerer Sachverhalte. Um solche Daten dem Menschen zugänglich zu machen, benötigt man effiziente und spezialisierte Visualisierungswerkzeuge. Nur die Anpassung der Visualisierung auf ein Anwendungsgebiet und dessen Anforderungen erlaubt maximale Effizienz und Nutzen für den Anwender. Teil I: Im ersten Teil meiner Arbeit befasse ich mich mit der Visualisierung im Bereich der Neurowissenschaften. Ihr Ziel ist es, das menschliche Gehirn zu begreifen; von seinen kleinsten Teilen bis hin zu seiner Gesamtstruktur. Um dieses ehrgeizige Ziel zu erreichen nutzt die Neurowissenschaft vor allem kombinierte, dreidimensionale Daten aus vielzähligen Quellen, wie MRT, CT oder funktionalem MRT. Um mit dieser Vielfalt umgehen zu können, benötigt man in der Neurowissenschaft vor allem spezialisierte und evaluierte Visualisierungsmethoden. Zunächst stelle ich ein umfangreiches Softwareprojekt namens \"OpenWalnut\" vor. Es bildet die gemeinsame Basis für die Entwicklung und Nutzung von Visualisierungstechniken mit unseren neurowissenschaftlichen Kollaborationspartnern. Auf dieser Basis sind klassische und neu entwickelte Visualisierungen auch für Neurowissenschaftler zugänglich. Anschließend stelle ich ein spezialisiertes Visualisierungsverfahren vor, welches es ermöglicht, den kausalen Zusammenhang zwischen Gehirnarealen zu illustrieren. Das war vorher nur durch abstrakte Graphenmodelle möglich. Den ersten Teil der Arbeit schließe ich mit einer Evaluation verschiedener Standardmethoden unter dem Blickwinkel simulierter elektrischer Felder im Gehirn ab. Das Ziel dieser Evaluation war es, der neurowissenschaftlichen Gemeinde die Vor- und Nachteile bestimmter Techniken zu verdeutlichen und anhand klinisch relevanter Fälle zu erläutern. Teil II: Neben der eigentlichen Datenvorverarbeitung, welche in der Visualisierung eine enorme Rolle spielt, ist die grafische Darstellung essenziell für das Verständnis der Strukturen und Bestandteile in den Daten. Die grafische Repräsentation von Daten bildet die Schnittstelle zum Gehirn des Menschen. Der zweite Teile meiner Arbeit befasst sich mit der Verbesserung der strukturellen und räumlichen Wahrnehmung in Visualisierungsverfahren -- mit der Verbesserung der Schnittstelle. Leider werden viele visuelle Verbesserungen durch Computergrafikmethoden der Spieleindustrie mit Argwohn beäugt. Im zweiten Teil meiner Arbeit werde ich zeigen, dass solche Methoden in der Visualisierung angewendet werden können um den räumlichen Eindruck zu verbessern und Strukturen in den Daten hervorzuheben. Dazu nutze ich ein in der Computergrafik bekanntes Paradigma: das \"Screen Space Rendering\". Dieses Paradigma hat den Vorteil, dass es auf nahezu jede existierende Visualiserungsmethode als Nachbearbeitunsgschritt angewendet werden kann. Zunächst führe ich zwei Methoden ein, die die Wahrnehmung von gitterartigen Strukturen auf beliebigen Oberflächen verbessern. Diese Gitter repräsentieren die Struktur von Tensoren zweiter Ordnung und wurden durch eine Methode namens \"TensorMesh\" erzeugt. Anschließend zeige ich eine neuartige Technik für die optimale Schattierung von Linien und Punktdaten. Mit dieser Technik ist es erstmals möglich sowohl lokale Details als auch globale räumliche Zusammenhänge in dichten Linien- und Punktdaten zu erfassen

Qucosa - Publikationsserver der Universität Leipzig

A Survey on Ear Biometrics

Author: Abaza Ayman
Harrison Mary Ann F.
Hebert Christina
Nixon Mark
Ross Arun
Publication venue
Publication date: 01/02/2013
Field of study

Recognizing people by their ear has recently received significant attention in the literature. Several reasons account for this trend: first, ear recognition does not suffer from some problems associated with other non contact biometrics, such as face recognition; second, it is the most promising candidate for combination with the face in the context of multi-pose face recognition; and third, the ear can be used for human recognition in surveillance videos where the face may be occluded completely or in part. Further, the ear appears to degrade little with age. Even though, current ear detection and recognition systems have reached a certain level of maturity, their success is limited to controlled indoor conditions. In addition to variation in illumination, other open research problems include hair occlusion; earprint forensics; ear symmetry; ear classification; and ear individuality. This paper provides a detailed survey of research conducted in ear detection and recognition. It provides an up-to-date review of the existing literature revealing the current state-of-art for not only those who are working in this area but also for those who might exploit this new approach. Furthermore, it offers insights into some unsolved ear recognition problems as well as ear databases available for researchers

Southampton (e-Prints Soton)

Design of a High-Speed Architecture for Stabilization of Video Captured Under Non-Uniform Lighting Conditions

Author: Zhang Ming Zhu
Publication venue: ODU Digital Commons
Publication date: 01/01/2008
Field of study

Video captured in shaky conditions may lead to vibrations. A robust algorithm to immobilize the video by compensating for the vibrations from physical settings of the camera is presented in this dissertation. A very high performance hardware architecture on Field Programmable Gate Array (FPGA) technology is also developed for the implementation of the stabilization system. Stabilization of video sequences captured under non-uniform lighting conditions begins with a nonlinear enhancement process. This improves the visibility of the scene captured from physical sensing devices which have limited dynamic range. This physical limitation causes the saturated region of the image to shadow out the rest of the scene. It is therefore desirable to bring back a more uniform scene which eliminates the shadows to a certain extent. Stabilization of video requires the estimation of global motion parameters. By obtaining reliable background motion, the video can be spatially transformed to the reference sequence thereby eliminating the unintended motion of the camera. A reflectance-illuminance model for video enhancement is used in this research work to improve the visibility and quality of the scene. With fast color space conversion, the computational complexity is reduced to a minimum. The basic video stabilization model is formulated and configured for hardware implementation. Such a model involves evaluation of reliable features for tracking, motion estimation, and affine transformation to map the display coordinates of a stabilized sequence. The multiplications, divisions and exponentiations are replaced by simple arithmetic and logic operations using improved log-domain computations in the hardware modules. On Xilinx\u27s Virtex II 2V8000-5 FPGA platform, the prototype system consumes 59% logic slices, 30% flip-flops, 34% lookup tables, 35% embedded RAMs and two ZBT frame buffers. The system is capable of rendering 180.9 million pixels per second (mpps) and consumes approximately 30.6 watts of power at 1.5 volts. With a 1024×1024 frame, the throughput is equivalent to 172 frames per second (fps). Future work will optimize the performance-resource trade-off to meet the specific needs of the applications. It further extends the model for extraction and tracking of moving objects as our model inherently encapsulates the attributes of spatial distortion and motion prediction to reduce complexity. With these parameters to narrow down the processing range, it is possible to achieve a minimum of 20 fps on desktop computers with Intel Core 2 Duo or Quad Core CPUs and 2GB DDR2 memory without a dedicated hardware

Old Dominion University