98 research outputs found
Models of Visual Appearance for Analyzing and Editing Images and Videos
The visual appearance of an image is a complex function of factors such as scene geometry, material reflectances and textures, illumination, and the properties of the camera used to capture the image. Understanding how these factors interact to produce an image is a fundamental problem in computer vision and graphics. This dissertation examines two aspects of this problem: models of visual appearance that allow us to recover scene properties from images and videos, and tools that allow users to manipulate visual appearance in images and videos in intuitive ways. In particular, we look at these problems in three different applications. First, we propose techniques for compositing images that differ significantly in their appearance. Our framework transfers appearance between images by manipulating the different levels of a multi-scale decomposition of the image. This allows users to create realistic composites with minimal interaction in a number of different scenarios. We also discuss techniques for compositing and replacing facial performances in videos. Second, we look at the problem of creating high-quality still images from low-quality video clips. Traditional multi-image enhancement techniques accomplish this by inverting the camera’s imaging process. Our system incorporates feature weights into these image models to create results that have better resolution, noise, and blur characteristics, and summarize the activity in the video. Finally, we analyze variations in scene appearance caused by changes in lighting. We develop a model for outdoor scene appearance that allows us to recover radiometric and geometric information about the scene from images. We apply this model to a variety of visual tasks, including color-constancy, background subtraction, shadow detection, scene reconstruction, and camera geo-location.
We also show that the appearance of a Lambertian scene can be modeled as a combination of distinct three-dimensional illumination subspaces, a result that leads to novel bounds on scene appearance and a robust uncalibrated photometric stereo method.
Engineering and Applied Science
Recent Advances in Signal Processing
Signal processing is a critical task in the majority of new technological inventions and poses challenges in a variety of applications in both science and engineering. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian, favoring closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five areas depending on the application at hand: image processing, speech processing, communication systems, time-series analysis, and educational packages. The chapters are completely independent and self-contained, so the interested reader can choose any chapter and skip to another without losing continuity.
Performance evaluation of quality metrics for video quality assessment
Final-year degree project carried out in collaboration with Deutsche Telekom Laboratories.
Modeling and applications of the focus cue in conventional digital cameras
The focus of digital cameras plays a fundamental role in both the quality of the acquired images and the perception of the imaged scene. This thesis studies the focus cue in conventional cameras with focus control, such as cellphone cameras, photography cameras, webcams and the like. A deep review of the theoretical concepts behind focus in conventional cameras reveals that, despite its usefulness, the widely known thin lens model has several limitations for solving different focus-related problems in computer vision. In order to overcome these limitations, the focus profile model is introduced as an alternative to classic concepts, such as the near and far limits of the depth-of-field. The new concepts introduced in this dissertation are exploited for solving diverse focus-related problems, such as efficient image capture, depth estimation, visual cue integration and image fusion. The results obtained through an exhaustive experimental validation demonstrate the applicability of the proposed models.
On the generation of high dynamic range images: theory and practice from a statistical perspective
This dissertation studies the problem of high dynamic range (HDR) image generation from a statistical perspective. A thorough analysis of the camera acquisition process leads to a simplified yet realistic statistical model describing raw pixel values. The analysis and methods then proposed are based on this model. First, the theoretical performance bound of the problem is computed for the static case, where the acquisition conditions are controlled. Furthermore, a new method is proposed that, unlike previous methods, improves the reconstructed HDR image by taking into account the information carried by saturated samples. From a more practical perspective, two methods are proposed to generate HDR images in the more realistic and complex case where both objects and camera may exhibit motion. The first is a multi-image, patch-based method that simultaneously estimates and denoises the HDR image. The second is a single-image approach that makes use of a general restoration method to generate the HDR image. This general restoration method, applicable to a wide range of problems, constitutes the last contribution of this dissertation.
Noise, artifact and the uncanny in large scale digital photographic practice.
This dissertation explores the question: why, when encountering the products of many new technologies delivering information via new media, do I often experience a feeling of disquiet or estrangement? I use the example of laser-photographic printing to explore the issue through a program of practice-based research. The outcome of this line of enquiry includes an original contribution via three series of large-format digital photographic works: Presenting "The Amazing Kriels", Home At Last, and Pure.
In this thesis, which supports the main body of the research, that is, the practice-based research, I will briefly review the case for artefact as noise within photographic printing, articulate a significant difference between the artefact levels of traditional analogue and Lambda prints, present original dialogical evidence for estrangement in the latter, and identify it, via readings of Sigmund Freud's "The Uncanny" and McLuhan's "The Gadget Lover", as a function of the uncanny. I will propose an original rewriting of McLuhan's ideas of "hot" and "cool" media, as well as the cycles of irritation/mediation repression within McLuhan's media theory as a direction for future research, and relate them to a shift from large-scale analogue photographic printing to Lambda printing.
RGB-D Object Recognition for Deep Robotic Learning
In recent years, the success of deep learning techniques across a wide variety of problems in both computer vision and natural language processing has contributed to the application of deep artificial neural networks to robotic systems.
Thanks to RGB-D sensors that acquire depth information from real-world scenes, robotic systems are increasingly simplifying some of the common challenges in robotic vision. In RGB-D object recognition, a fundamental task for several robotic applications, given a CNN as the learning model and an RGB-D dataset, one often asks which depth preprocessing strategy yields the best classification accuracy. Another crucial question is whether depth information will notably increase the classifier's accuracy. This thesis attempts to answer these key questions. In particular, we discuss and compare the results obtained with three depth preprocessing strategies, each of which leads to a specific training scenario. These scenarios are evaluated on the CORe50 RGB-D dataset. Finally, this thesis demonstrates that, in the context of object recognition, using depth information significantly improves classification accuracy. To this end, our analysis indicates that the precision and completeness of the depth information, and possibly its segmentation strategy, play a fundamental role. Moreover, we show that training a CNN from scratch (as opposed to fine-tuning) can yield notable improvements in accuracy.
Deep learning in food category recognition
Integrating artificial intelligence with food category recognition has been a field of interest for research for the past few decades. It is potentially one of the next steps in revolutionizing human interaction with food. The modern advent of big data and the development of data-oriented fields like deep learning have provided advancements in food category recognition. With increasing computational power and ever-larger food datasets, the approach’s potential has yet to be realized. This survey provides an overview of methods that can be applied to various food category recognition tasks, including detecting type, ingredients, quality, and quantity. We survey the core components for constructing a machine learning system for food category recognition, including datasets, data augmentation, hand-crafted feature extraction, and machine learning algorithms. We place a particular focus on the field of deep learning, including the utilization of convolutional neural networks, transfer learning, and semi-supervised learning. We provide an overview of relevant studies to promote further developments in food category recognition for research and industrial applications.
MRC (MC_PC_17171); Royal Society (RP202G0230); BHF (AA/18/3/34220); Hope Foundation for Cancer Research (RM60G0680); GCRF (P202PF11); Sino-UK Industrial Fund (RP202G0289); LIAS (P202ED10; Data Science Enhancement Fund (P202RE237); Fight for Sight (24NN201); Sino-UK Education Fund (OP202006); BBSRC (RM32G0178B8
- …