49 research outputs found

    Data-Driven Image Restoration

    Get PDF
    Every day many images are taken by digital cameras, and people demand visually accurate and pleasing results. Noise and blur degrade images captured by modern cameras, and high-level vision tasks (such as segmentation, recognition, and tracking) require high-quality images. Therefore, image restoration, specifically image deblurring and image denoising, is a critical preprocessing step. A fundamental problem in image deblurring is to reliably recover the distinct spatial frequencies that have been suppressed by the blur kernel. Existing image deblurring techniques often rely on generic image priors that only help recover part of the frequency spectrum, such as the frequencies near the high end. To this end, we pose the following specific questions: (i) Does class-specific information offer an advantage over existing generic priors for image quality restoration? (ii) If a class-specific prior exists, how should it be encoded into a deblurring framework to recover attenuated image frequencies? Throughout this work, we devise a class-specific prior based on band-pass filter responses and incorporate it into a deblurring strategy. Specifically, we show that the subspace of band-pass filtered images and their intensity distributions serve as useful priors for recovering image frequencies. Next, we present a novel image denoising algorithm that uses an external, category-specific image database. In contrast to existing noisy image restoration algorithms, our method selects clean image “support patches” similar to the noisy patch from an external database. We employ a content-adaptive distribution model for each patch, deriving the parameters of the distribution from the support patches. Our objective function is composed of a Gaussian fidelity term that imposes category-specific information and a low-rank term that robustly encourages similarity between the noisy patch and the support patches. 
Finally, we propose to learn a fully-convolutional network model that consists of a Chain of Identity Mapping Modules (CIMM) for image denoising. The CIMM structure possesses two distinctive features that are important for the noise removal task. Firstly, each residual unit employs identity mappings as skip connections and receives pre-activated input to preserve the gradient magnitude propagated in both the forward and backward directions. Secondly, by utilising dilated kernels in the convolution layers of the residual branch, each neuron in the last convolution layer of each module can observe the full receptive field of the first layer.
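The receptive-field growth that dilated kernels provide can be illustrated with simple arithmetic. The sketch below (not the authors' code; the layer configurations are hypothetical) computes the receptive field of a stack of stride-1 convolution layers and shows how dilation widens it without adding layers:

```python
def receptive_field(layers):
    """Receptive field of a stack of stride-1 conv layers.

    Each layer is (kernel_size, dilation). The effective kernel
    size of a dilated convolution is dilation * (kernel - 1) + 1,
    and each layer grows the receptive field by that amount minus 1.
    """
    rf = 1
    for kernel, dilation in layers:
        rf += dilation * (kernel - 1)
    return rf

# Four 3x3 layers: plain vs. exponentially dilated (hypothetical configs)
plain   = [(3, 1)] * 4                       # receptive field 9
dilated = [(3, 1), (3, 2), (3, 4), (3, 8)]   # receptive field 31
print(receptive_field(plain), receptive_field(dilated))
```

With the same depth and parameter count, the dilated stack sees a far wider context, which is the property the CIMM residual branches exploit.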

    Statistical Modelling of Craniofacial Shape

    Get PDF
    With prior knowledge and experience, people can easily observe rich shape and texture variation for a certain type of object, such as human faces, cats or chairs, in both 2D and 3D images. This ability helps us recognise the same person, distinguish different kinds of creatures and sketch unseen samples of the same object class. The process of capturing this prior knowledge is mathematically interpreted as statistical modelling. The outcome is a morphable model, a vector-space representation of objects that captures the variation of shape and texture. This thesis presents research aimed at constructing 3D Morphable Models (3DMMs) of craniofacial shape and texture using new algorithms and processing pipelines to offer enhanced modelling abilities over existing techniques. In particular, we present several fully automatic modelling approaches and apply them to a large dataset of 3D images of the human head, the Headspace dataset, thus generating the first public shape-and-texture 3DMM of the full human head. We call this the Liverpool-York Head Model, reflecting the sites of data collection and statistical modelling, respectively. We also explore craniofacial symmetry and asymmetry in template morphing and statistical modelling. We propose a Symmetry-aware Coherent Point Drift (SA-CPD) algorithm, which mitigates the tangential sliding problem seen in competing morphing algorithms. Based on the symmetry-constrained correspondence output of SA-CPD, we present a symmetry-factored statistical modelling method for craniofacial shape. We also propose an iterative refinement process for a 3DMM of the human ear that employs data augmentation, and we merge the proposed ear 3DMM with the full head model. Since craniofacial clinicians like to look at head profiles, we propose a new pipeline to build a 2D morphable model of the craniofacial sagittal profile and augment it with profile models from frontal and top-down views. 
Our models and data are made publicly available online for research purposes.

    Proceedings of the 38th International Workshop on Statistical Modelling

    Get PDF

    Object Detection with Active Sample Harvesting

    Get PDF
    The work presented in this dissertation lies in the domains of image classification, object detection, and machine learning. Whether training image classifiers or object detectors, the learning phase consists of finding an optimal boundary between populations of samples. In practice, not all samples are equally important: some examples are trivially classified and contribute little to training, while others, close to the boundary or misclassified, are the ones that truly matter. Similarly, the images from which the samples originate are not all rich in informative samples. However, most training procedures select samples and images uniformly or weight them equally. The common thread of this dissertation is how to efficiently find the informative samples/images for training. Although we never consider all possible samples "in the world", our purpose is to select samples in a smarter manner, without looking at all the available ones. The framework adopted in this work organises the data (samples or images) in a tree that reflects the statistical regularities of the training samples, by putting "similar" samples in the same branch. Each leaf carries a sample and a weight related to the "importance" of the corresponding sample, and each internal node carries statistics about the weights below it. The tree is used to select the next sample/image for training by applying a sampling policy, and the "importance" weights are updated accordingly, to bias the sampling towards informative samples/images in future iterations. Our experiments show that, in the various applications, properly focusing on informative images or informative samples improves the learning phase, either by reaching better performance faster or by reducing the training loss faster.
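The tree structure described above can be sketched as a binary sum tree: leaves hold per-sample importance weights, internal nodes hold the sum of the weights below, so drawing a sample in proportion to its weight and updating a weight are both O(log n). This is an illustrative sketch under that reading of the abstract, not the dissertation's actual data structure (which groups "similar" samples per branch):

```python
import random

class SumTree:
    """Leaves hold sample weights; internal node i holds the sum of
    its children 2i and 2i+1, so the root (index 1) is the total."""
    def __init__(self, n):
        self.n = n
        self.tree = [0.0] * (2 * n)  # tree[n:] are the leaves

    def update(self, i, weight):
        """Set the weight of sample i and refresh ancestor sums."""
        i += self.n
        self.tree[i] = weight
        while i > 1:
            i //= 2
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def sample(self):
        """Draw a sample index with probability proportional to weight."""
        r = random.uniform(0.0, self.tree[1])
        i = 1
        while i < self.n:  # descend until a leaf is reached
            if r < self.tree[2 * i]:
                i = 2 * i            # go left
            else:
                r -= self.tree[2 * i]
                i = 2 * i + 1        # go right
        return i - self.n

tree = SumTree(4)
tree.update(2, 5.0)   # only sample 2 has weight, so it is always drawn
```

After an update, future draws are biased towards the reweighted sample, which is the feedback loop the dissertation uses to focus training on informative examples.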

    Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

    Get PDF
    The present paper explores the technical efficiency of four hotels from the Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking of these four hotel units located in Portugal is established using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiencies in the estimation process, enabling investigation of the main causes of inefficiency. Several suggestions for efficiency improvement are offered for each hotel studied.

    Proceedings of the 35th International Workshop on Statistical Modelling : July 20-24, 2020, Bilbao, Basque Country, Spain

    Get PDF
    466 p. The International Workshop on Statistical Modelling (IWSM) is a reference workshop for promoting statistical modelling and applications of Statistics, in a broad sense, for researchers, academics and industrialists. Unfortunately, the global COVID-19 pandemic did not allow the 35th edition of the IWSM to be held in Bilbao in July 2020. Despite the situation, and following the spirit of the Workshop and the Statistical Modelling Society, we are delighted to bring you this proceedings book of extended abstracts.


    Multi-task near-field perception for autonomous driving using surround-view fisheye cameras

    Get PDF
    The formation of eyes led to the big bang of evolution. The dynamics changed from a primitive organism waiting for food to come into contact with it to an organism that actively seeks food using visual sensors. The human eye is one of the most sophisticated developments of evolution, but it still has defects. Over millions of years, humans have evolved a biological perception algorithm capable of driving cars, operating machinery, piloting aircraft, and navigating ships. Automating these capabilities for computers is critical for various applications, including self-driving cars, augmented reality, and architectural surveying. Near-field visual perception in the context of self-driving cars covers the environment in a range of 0-10 meters with 360° coverage around the vehicle. It is a critical decision-making component in the development of safer automated driving. Recent advances in computer vision and deep learning, in conjunction with high-quality sensors such as cameras and LiDARs, have fueled mature visual perception solutions. Until now, far-field perception has been the primary focus. Another significant issue is the limited processing power available for developing real-time applications. Because of this bottleneck, there is frequently a trade-off between performance and run-time efficiency. We concentrate on the following issues in order to address them: 1) Developing near-field perception algorithms with high performance and low computational complexity for various visual perception tasks, such as geometric and semantic tasks, using convolutional neural networks. 
2) Using Multi-Task Learning to overcome computational bottlenecks by sharing initial convolutional layers between tasks and developing optimisation strategies that balance the tasks.
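The layer-sharing idea in point 2 can be sketched structurally: the shared encoder is evaluated once per image and its features feed every task head, with per-task weights balancing the combined loss. This is a toy illustration only; the thesis uses convolutional networks, stood in for here by plain functions, and the weights and transforms are hypothetical:

```python
# Count encoder invocations to show the computation is shared, not repeated.
calls = {"encoder": 0}

def shared_encoder(x):
    """Stand-in for the shared initial convolutional layers."""
    calls["encoder"] += 1
    return [v * 2 for v in x]  # hypothetical feature transform

def geometric_head(features):
    """Stand-in for a geometric task head (e.g. depth/motion)."""
    return sum(features)

def semantic_head(features):
    """Stand-in for a semantic task head (e.g. segmentation)."""
    return max(features)

def multi_task_forward(x, w_geo=0.5, w_sem=0.5):
    feats = shared_encoder(x)  # computed once, consumed by both heads
    return w_geo * geometric_head(feats) + w_sem * semantic_head(feats)

out = multi_task_forward([1, 2, 3])  # encoder runs once for both tasks
```

The single encoder pass is where the computational saving comes from; the task-balancing weights (here fixed at 0.5/0.5) are what the optimisation strategies in the thesis tune.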

    Volumetric Estimation of Cystic Macular Edema in OCT Scans

    Get PDF
    The analysis of retinal Spectral Domain Optical Coherence Tomography (SDOCT) images by trained medical professionals can provide useful insights into various diseases. It is the most popular method of retinal imaging due to its non-invasive nature and the useful information it provides for making an accurate diagnosis; however, there is a clear lack of publicly available data for researchers in the domain. In this report, a deep learning approach for automating the segmentation of cystic macular edema (fluid) in retinal OCT B-scan images is presented and subsequently used for volumetric analysis of OCT scans. The solution is a fast and accurate semantic segmentation network that uses a shortened encoder-decoder UNet-like architecture with an integrated DenseASPP module and Attention Gate to produce an accurate and refined retinal fluid segmentation map. The network is evaluated against both publicly and privately available datasets; on the former, it achieved a Dice coefficient of 0.804, making it the current best-performing approach on that dataset, and on the very small and challenging private dataset it achieved a score of 0.691. To address the lack of publicly available data in this domain, a Graphical User Interface that aims to semi-automate the labelling process of OCT images was also created, greatly simplifying dataset creation and potentially increasing labelled data production.
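The Dice coefficient used to report the 0.804 and 0.691 scores above is a standard overlap measure for segmentation masks. A minimal sketch (flattened binary masks assumed, not the report's evaluation code):

```python
def dice_coefficient(pred, target):
    """Dice coefficient between two binary masks:
    2|A ∩ B| / (|A| + |B|); 1.0 when both masks are empty."""
    intersection = sum(p and t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 2.0 * intersection / total if total else 1.0

pred   = [1, 1, 0, 0, 1]
target = [1, 0, 0, 1, 1]
score = dice_coefficient(pred, target)  # 2*2/(3+3) ≈ 0.667
```

A Dice score of 0.804 thus means the predicted fluid region overlaps the ground-truth region substantially more than it misses it, on average across the public dataset.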

    Semi-Automated Labelling of Cystoid Macular Edema in OCT Scans

    Get PDF
    The analysis of retinal Spectral Domain Optical Coherence Tomography (SDOCT) images by trained medical professionals can provide useful insights into various diseases. It is the most popular method of retinal imaging due to its non-invasive nature and the useful information it provides for making an accurate diagnosis; however, there is a clear lack of publicly available data for researchers in the domain. In this report, a deep learning approach for automating the segmentation of Cystoid Macular Edema (fluid) in retinal OCT B-scan images is presented and subsequently used for volumetric analysis of OCT scans. The solution is a fast and accurate semantic segmentation network that uses a shortened encoder-decoder UNet-like architecture with an integrated DenseASPP module and Attention Gate to produce an accurate and refined retinal fluid segmentation map. The network is evaluated against both publicly and privately available datasets; on the former, it achieved a Dice coefficient of 0.804, making it the current best-performing approach on that dataset, and on the very small and challenging private dataset it achieved a score of 0.691. To address the lack of publicly available data in this domain, a Graphical User Interface that aims to semi-automate the labelling process of OCT images was also created, greatly simplifying dataset creation and potentially increasing labelled data production.