
    An evaluation of partial differential equations based digital inpainting algorithms

    Partial differential equations (PDEs) have been used to model various phenomena and tasks across scientific and engineering endeavours. This thesis is devoted to modelling image inpainting through numerical implementations of certain PDEs. The main objectives of image inpainting include reconstructing damaged parts and filling in regions in which data or colour information is missing. Different automatic and semi-automatic approaches to image inpainting have been developed, including PDE-based, texture-synthesis-based, exemplar-based, and hybrid approaches. Various challenges remain unresolved in reconstructing large missing regions and/or missing areas with highly textured surroundings. Our main aim is to address such challenges by developing new advanced schemes, with a particular focus on using PDEs of different orders to preserve continuity of textural and geometric information in the surroundings of missing regions.

    We first investigated the problem of partial colour restoration in an image region whose greyscale channel is intact. A PDE-based solution is known that is modelled as minimising the total variation of gradients in the different colour channels. We extend the applicability of this model to partial inpainting in other three-channel colour spaces (such as RGB, where information is missing in any two of the three channels) simply by exploiting the known linear/affine relationships between different colouring models in the derivation of a modified PDE solution, obtained via the Euler-Lagrange minimisation of the corresponding gradient Total Variation (TV); the classical single-channel form of this model is sketched below. We also developed two TV models based on the relations between the greyscale and colour channels, using the Laplacian operator and the directional derivatives of gradients. The corresponding Euler-Lagrange minimisation yields two new PDEs of different orders for partial colourisation. We implemented these solutions in both the spatial and frequency domains. We measure the success of these models by evaluating known image quality measures in inpainted regions for sufficiently large datasets and scenarios. The results reveal that our schemes compare well with existing algorithms, but inpainting large regions remains a challenge.

    Secondly, we investigate the Total Inpainting (TI) problem, where all colour channels are missing in an image region. Reviewing and implementing existing PDE-based total inpainting methods reveals that high-order PDEs, applied to each colour channel separately, perform well but are influenced by the size of the region and the quantity of texture surrounding it. Here we developed a TI scheme that benefits from our partial inpainting approach and applies two PDE methods to recover the missing regions in the image. First, we extract the (Y, Cb, Cr) channels of the image outside the missing region, apply the above PDE methods to reconstruct the missing region in the luminance channel (Y), and then use the colourisation method to recover the missing (Cb, Cr) colours in the region. We demonstrate that, compared to existing TI algorithms, our proposed method (using two PDE methods) performs well when tested on large datasets of natural and face images. Furthermore, this improves understanding of the impact of texture in the surrounding areas on inpainting and opens new research directions.
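    For reference, a standard formulation of the TV inpainting model referred to above is sketched here; this is the classical single-channel version, and the multi-channel functionals developed in the thesis may differ in detail.

```latex
% Classical TV inpainting: recover u inside the missing region D \subset \Omega
% from the known data u_0 on \Omega \setminus D.
\min_{u}\; \int_{\Omega} |\nabla u| \, dx
\qquad \text{subject to} \qquad u = u_0 \ \text{on} \ \Omega \setminus D .

% The Euler--Lagrange equation of this functional is the second-order,
% curvature-driven PDE solved (e.g. by gradient descent) inside D:
\operatorname{div}\!\left( \frac{\nabla u}{|\nabla u|} \right) = 0
\quad \text{in } D .
```
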
    Thirdly, we investigate existing Exemplar-Based Inpainting (EBI) methods, which do not use PDEs but simultaneously propagate texture and structure into the missing region by finding similar patches within the rest of the image and copying them onto the boundary of the missing region. The order of patch propagation is determined by a priority function, and similarity is determined by matching criteria. We exploit recently emerging Topological Data Analysis (TDA) tools to create innovative EBI schemes, referred to as TEBI. TDA studies the shapes of data/objects in order to quantify image texture in terms of connectivity and closeness properties of certain data landmarks. Such quantifications help determine the appropriate size of patch propagation, and are used to modify the patch propagation priority function using the geometrical properties of the curvature of isophotes, and to improve the patch-matching criteria by calculating correlation coefficients in the spatial, gradient and Laplacian domains. The performance of this TEBI method is tested on natural dataset images, resulting in improved inpainting when compared with other EBI methods (a sketch of the classical priority computation that TEBI builds on appears after this abstract).

    Fourthly, recent hybrid inpainting techniques are reviewed, and a number of highly performing, innovative hybrid techniques are proposed that combine high-order PDE methods with the TEBI method for the simultaneous rebuilding of missing texture and structure regions in an image. Such a hybrid scheme first decomposes the image into texture and structure components, and the missing regions in these components are then recovered by the TEBI and PDE-based methods respectively. The performance of our hybrid schemes is compared with two existing hybrid algorithms.

    Fifthly, we turn our attention to inpainting large missing regions and develop an innovative inpainting scheme that uses the concept of seam carving to reduce this problem to that of inpainting a smaller missing region, which can be dealt with efficiently using the inpainting schemes developed above. Seam carving resizes images in a content-aware manner, for both reduction and expansion, without affecting information-rich image regions. The missing region of the seam-carved version is recovered by the TEBI method; the original image size is then restored by re-inserting the removed seams, and the missing parts of those seams are repaired using a high-order PDE inpainting scheme. The benefits of this approach in dealing with large missing regions are demonstrated.

    Extensive performance testing of the developed inpainting methods shows that they significantly outperform existing inpainting methods for such a challenging task. However, performance is still not acceptable when recovering large missing regions in images with high texture and structure, and hence we identify the remaining challenges to be investigated in the future. We also extend our work by investigating recently developed deep learning based image/video colourisation, with the aim of overcoming its limitations and shortcomings. Finally, we describe our ongoing research into using TDA to detect the recently growing and serious “malicious” use of inpainting to create fake images/videos.
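    As a concrete reference for the exemplar-based approach described in the third part, here is a minimal sketch of the classical (Criminisi-style) patch-priority computation that TEBI modifies. The function names are illustrative, and the TDA- and curvature-based modifications from the thesis are not reproduced.

```python
import numpy as np

def patch(arr, p, half=4):
    """The (2*half + 1)^2 window of `arr` centred on pixel p = (row, col).

    Boundary handling is omitted for brevity; a real implementation must
    clip the window at the image edges.
    """
    r, c = p
    return arr[r - half:r + half + 1, c - half:c + half + 1]

def priority(confidence, isophote, normal, p, alpha=255.0, half=4):
    """Criminisi-style priority P(p) = C(p) * D(p) for a fill-front pixel p.

    confidence : per-pixel confidence map (1 where known, lower where filled)
    isophote   : 2-vector, the image gradient at p rotated by 90 degrees
    normal     : 2-vector, unit normal to the fill front at p
    alpha      : normalisation factor (255 for 8-bit greyscale images)
    """
    win = patch(confidence, p, half)
    C = win.sum() / win.size                           # confidence term C(p)
    D = abs(float(np.dot(isophote, normal))) / alpha   # data (structure) term D(p)
    return C * D

# Patches are filled in decreasing priority order; each target patch is
# replaced by its best match (e.g. minimum sum of squared differences)
# taken from the known part of the image.
```
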

    Class-incremental lifelong object learning for domestic robots

    Traditionally, robots have been confined to settings where they operate in isolation, in highly controlled and structured environments, to execute well-defined, non-varying tasks. As a result, they usually operate without the need to perceive their surroundings or to adapt to changing stimuli. However, as robots start to move towards human-centred environments and share physical space with people, there is an urgent need to endow them with the flexibility to learn and adapt, given the changing nature of the stimuli they receive and the evolving requirements of their users. Standard machine learning is not suitable for these types of applications because it operates under the assumption that data samples are independent and identically distributed, and it requires access to all the data in advance. If either of these assumptions is broken, the model fails catastrophically: either it does not learn, or it forgets all that was previously learned. Therefore, different strategies are required to address this problem.

    The focus of this thesis is lifelong object learning, whereby a model is able to learn from data that becomes available over time. In particular, we address the problem of class-incremental learning, with an emphasis on algorithms that can enable interactive learning with a user. In class-incremental learning, models learn from sequential data batches, where each batch ideally contains samples from a single class. The emphasis on interactive learning capabilities poses additional requirements in terms of the speed with which model updates are performed, as well as how the interaction is handled.

    The work presented in this thesis can be divided into two main lines of work. First, we propose two versions of a lifelong learning algorithm composed of a feature extractor based on pre-trained residual networks, an array of growing self-organising networks, and a classifier. Self-organising networks are able to adapt their structure based on the input data distribution and learn representative prototypes of the data. These prototypes can then be used to train a classifier. The proposed approaches are evaluated on various benchmarks under several conditions, and the results show that they outperform competing approaches in each case. Second, we propose a robot architecture to address lifelong object learning through interactions with a human partner using natural language. The architecture consists of an object segmentation, tracking and preprocessing pipeline, a dialogue system, and a learning module based on the algorithm developed in the first part of the thesis. Finally, the thesis also includes an exploration of the contributions that different preprocessing operations make to performance when learning from both RGB and depth images.
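    As a rough illustration of the first line of work (a frozen pre-trained feature extractor feeding a growing store of class prototypes that back a nearest-prototype classifier), here is a minimal sketch. The thesis uses growing self-organising networks, for which the simple random prototype selection below is only a stand-in, and all class and method names are illustrative.

```python
import numpy as np

class IncrementalPrototypeLearner:
    """Nearest-prototype class-incremental learner.

    `extract` is a frozen feature extractor (e.g. a pre-trained residual
    network with its classification head removed) that maps an image to a
    1-D feature vector. The prototype store only grows as new class
    batches arrive, so earlier classes are never overwritten.
    """

    def __init__(self, extract):
        self.extract = extract
        self.prototypes = []   # list of (label, feature_vector) pairs

    def learn_class_batch(self, images, label, n_prototypes=8):
        feats = np.stack([self.extract(im) for im in images])
        # Simplification: randomly chosen prototypes instead of a growing
        # self-organising network as in the thesis.
        idx = np.random.choice(len(feats),
                               size=min(n_prototypes, len(feats)),
                               replace=False)
        for f in feats[idx]:
            self.prototypes.append((label, f))

    def predict(self, image):
        # Classify by the label of the nearest stored prototype.
        f = self.extract(image)
        dists = [np.linalg.norm(f - p) for _, p in self.prototypes]
        return self.prototypes[int(np.argmin(dists))][0]
```
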

    Smart vision in system-on-chip applications

    In the last decade, the ability to design and manufacture integrated circuits with higher transistor densities has led to the integration of complete systems on a single silicon die, commonly referred to as Systems-on-Chip (SoC). As SoC processes can incorporate multiple technologies, it is now feasible to produce single-chip camera systems with embedded image processing, known as Imagers-on-Chip (IoC). The development of IoCs is complicated by the mixture of digital and analogue components and the high cost of prototyping these designs using silicon processes. There are currently no re-usable prototyping platforms that specifically address the needs of IoC development. This thesis details a new prototyping platform aimed specifically at the development of low-cost, mass-market IoC applications. FPGA technology was utilised to implement a frame-based processing architecture suitable for supporting a range of real-time imaging and machine vision applications. To demonstrate the effectiveness of the prototyping platform, an example object counting and highlighting application was developed and functionally verified in real time. A high-level IoC cost model was formulated to calculate the cost of manufacturing prototyped applications as a single IoC. This highlighted the need for careful analysis of optical issues, embedded imager array size and the silicon process used, to ensure the desired IoC unit cost is achieved. A modified version of the FPGA architecture, which would improve DSP performance, is also proposed.
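    The abstract does not spell out the high-level IoC cost model; the sketch below shows the textbook per-good-die manufacturing cost estimate (dies per wafer combined with a negative-binomial yield model) that such a model would typically build on. All parameter values are placeholders, not figures from the thesis.

```python
import math

def dies_per_wafer(wafer_diam_mm, die_area_mm2):
    """Classic approximation for the number of usable dies on a circular wafer."""
    r = wafer_diam_mm / 2.0
    return int(math.pi * r**2 / die_area_mm2
               - math.pi * wafer_diam_mm / math.sqrt(2.0 * die_area_mm2))

def die_yield(die_area_mm2, defects_per_mm2, alpha=3.0):
    """Negative-binomial defect yield model."""
    return (1.0 + defects_per_mm2 * die_area_mm2 / alpha) ** (-alpha)

def cost_per_good_die(wafer_cost, wafer_diam_mm, die_area_mm2, defects_per_mm2):
    n = dies_per_wafer(wafer_diam_mm, die_area_mm2)
    return wafer_cost / (n * die_yield(die_area_mm2, defects_per_mm2))

# Placeholder example: a 30 mm^2 imager die on a 200 mm wafer. A larger
# embedded imager array grows the die area, which hurts both the die count
# and the yield -- the trade-off the abstract's cost model highlights.
print(cost_per_good_die(wafer_cost=1500.0, wafer_diam_mm=200,
                        die_area_mm2=30.0, defects_per_mm2=0.002))
```
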

    The Influence of Intrinsic Perceptual Cues on Navigation and Route Selection in Virtual Environments

    The principal aims of this thesis were to investigate the influence of intrinsic navigational cues in virtual environments and video games. Modern video games offer complex environments that may reflect real-world spaces or represent landscapes from fantasy and fiction. The coherent design of these spaces can promote natural navigational flow without the need for extraneous guidance such as maps and arrows. The methods that designers use to create natural flow are complex and stratified, utilising principles rooted in urban architectural design and navigational cues that are intrinsic to real-world wayfinding scenarios. The studies presented in this thesis analysed not only these commonly used architectural cues but also the potential for reinforcing them through the addition of lighting, visual and auditory cues. The primary focus of this thesis was a systematic, quantitatively rooted analysis of the impact lighting has on navigation and of the levels at which variance in lighting makes a quantifiable difference to navigational choices within a virtual environment. The findings offer clear guidance as to the influence that lighting has within virtual environments, and specify the thresholds at which the inclusion of guidance lighting begins to affect navigational choices and the levels at which players become conscious of these cues. The thesis also analyses the temporal thresholds for the detection of changes in contrast, hue and texture within an environment. The relationships of other intrinsic cues, such as the potential reinforcement or cue-competition effects of audio and other visual cues (for instance, motion), are also quantitatively analysed. These findings are distilled into a series of heuristic design principles that augment those underpinning architectural and environmental design, for instance by suggesting saliency levels for lighting cues or by reinforcing existing cues via supporting audio guidance.

    Using Deep Learning to Explore Ultra-Large Scale Astronomical Datasets

    In every field that deep learning has infiltrated, we have seen a reduction in the use of specialist knowledge, replaced instead with knowledge automatically derived from data. We have already seen this process play out in many ‘applied deep learning’ fields such as computer Go, protein folding, natural language processing, and computer vision. This thesis argues that astronomy is no different from these applied deep learning fields. To this end, the introduction serves as a historical background on astronomy’s ‘three waves’ of increasingly automated connectionism: initial work on multilayer perceptrons within astronomy required manually selected emergent properties as input; the second wave coincided with the dissemination of convolutional neural networks and recurrent neural networks, models in which the multilayer perceptron’s manually selected inputs are replaced with raw data ingestion; and in the current third wave we are seeing the removal of human supervision altogether, with deep learning methods inferring labels and knowledge directly from the data.

    §2, §3, and §4 of this thesis explore these waves through application. In §2 I show that a convolutional/recurrent encoder/decoder network is capable of emulating a complicated semi-manual galaxy processing pipeline. I find that this ‘Pix2Prof’ neural network can satisfactorily carry out this task over 100x faster than the method it emulates. §3 and §4 explore the application of deep generative models to astronomical simulation. §3 uses a generative adversarial network to generate mock deep-field surveys, and finds it capable of generating mock images that are statistically indistinguishable from the real thing. Likewise, §4 demonstrates that a diffusion model is capable of generating galaxy images that are both qualitatively and quantitatively indistinguishable from the training set. The main benefit of these deep learning based simulations is that they do not rely on possibly flawed (or incomplete) physical knowledge of their subjects and observation processes. Also, once trained, they are capable of rapidly generating very large amounts of mock data.

    §5 looks to the future and predicts that we will soon enter a fourth wave of astronomical connectionism. If astronomy follows in the footsteps of other applied deep learning fields, we will see the removal of expertly crafted deep learning models, to be replaced with fine-tuned versions of an all-encompassing ‘foundation’ model. As part of this fourth wave I argue for a symbiosis between astronomy and connectionism. This symbiosis is predicated on astronomy’s relative data wealth and contemporary deep learning’s enormous data appetite; many ultra-large datasets in machine learning are proprietary or of poor quality, and so astronomy as a whole could develop and provide a high-quality multimodal public dataset. In turn, this dataset could be used to train an astronomical foundation model suitable for state-of-the-art downstream tasks. Due to foundation models’ hunger for data and compute, a single astronomical research group could not bring about such a model alone. Therefore, I conclude that astronomy as a whole has a slim chance of keeping up with a research pace set by the Big Tech goliaths—that is, unless we follow the examples of EleutherAI and HuggingFace and pool our resources in a grassroots, open-source fashion.
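    As a hypothetical illustration of the convolutional/recurrent encoder/decoder pattern described for Pix2Prof, here is a minimal PyTorch sketch that summarises an image into a context vector and unrolls it into a 1-D profile, one value per step. The actual Pix2Prof architecture, training procedure and hyperparameters differ; all names below are illustrative.

```python
import torch
import torch.nn as nn

class ConvEncoderGRUDecoder(nn.Module):
    """Minimal image-to-sequence model: a CNN summarises a single-channel
    galaxy image into a context vector, and a GRU unrolls that context
    into a 1-D profile of fixed length."""

    def __init__(self, hidden=128, steps=100):
        super().__init__()
        self.steps = steps
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, hidden),
        )
        self.decoder = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, image):
        # Encode the image into the GRU's initial hidden state.
        h = self.encoder(image).unsqueeze(0)
        y = image.new_zeros(image.size(0), 1, 1)   # start token
        outputs = []
        for _ in range(self.steps):
            out, h = self.decoder(y, h)
            y = self.head(out)                     # next profile value
            outputs.append(y)
        return torch.cat(outputs, dim=1).squeeze(-1)   # (batch, steps)

# Usage: profiles = ConvEncoderGRUDecoder()(torch.randn(4, 1, 64, 64))
```
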

    Remote Sensing and Geosciences for Archaeology

    This book collects more than 20 papers, written by renowned experts and scientists from across the globe, that showcase state-of-the-art and forefront research in archaeological remote sensing and the use of geoscientific techniques to investigate archaeological records and cultural heritage. Very high resolution satellite images from optical and radar space-borne sensors, airborne multi-spectral images, ground-penetrating radar, terrestrial laser scanning, 3D modelling, and Geographic Information Systems (GIS) are among the techniques used in the archaeological studies published in this book. The reader can learn how to use these instruments and sensors, also in combination, to investigate cultural landscapes, discover new sites, reconstruct paleo-landscapes, augment the knowledge of monuments, and assess the condition of heritage at risk. Case studies scattered across Europe, Asia and America are presented: from the UNESCO World Heritage Site of the Lines and Geoglyphs of Nasca and Palpa to heritage under threat in the Middle East and North Africa, from coastal heritage in the intertidal flats of the German North Sea to Early Neolithic settlements in Thessaly. Beginners will learn robust research methodologies and take inspiration; mature scholars will certainly derive inputs for new research and applications.

    Animated proportional Venn diagrams: a study into their description, construction and business application

    Anecdotal observation of the way in which data visualisation techniques are utilised to present relationships in data to audiences informed the author's view that data visualisation had not evolved to utilise the capabilities of ubiquitous business computer equipment. In an information-rich but attention-poor business environment, a search for a new tool was undertaken to supplement the techniques available to help audiences understand statistical relationships in presentation data. This search resulted in the development of a practical software tool based on animated Venn diagrams (Dvenn) that attempts to exploit the inherent human ability to perceive quantities visually, a faculty described herein as visual numeracy. The exploitation of this faculty is considered here to be a valuable aid to group understanding of business presentation data. The development of the tool was an essential part of the research undertaken, and the resulting software forms a significant portion of this practice-based research. The aim of the software development was to produce a readily accessible tool that could be utilised in a non-specialist business environment to better facilitate an honest, shared understanding of numerical data between a presenter and their audience. The development of the tool progressed through a number of iterations, and the software that accompanies this work is an important component that needs to be viewed in conjunction with the text. The final version was tested with undergraduate university students in an attempt to validate the efficacy of the data visualisation technique, against the mature yardstick of scatter-plots. Interestingly, the correlations presented by scatter-plot were not as readily identified as might have been assumed; however, the results for the Dvenn tests did not support the technique for widespread adoption. Nevertheless, further research into the best method of harnessing visual numeracy would seem to be justified.
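    The Dvenn software itself is distributed with the thesis rather than reproduced here; as a hypothetical illustration of the core geometry any area-proportional Venn renderer must solve, the sketch below derives circle radii from set sizes and numerically finds the centre distance whose lens (overlap) area matches the intersection size. All names are illustrative.

```python
import math
from scipy.optimize import brentq

def lens_area(d, r1, r2):
    """Area of the intersection of two circles with radii r1, r2
    whose centres are d apart."""
    if d >= r1 + r2:
        return 0.0                                # disjoint circles
    if d <= abs(r1 - r2):
        return math.pi * min(r1, r2) ** 2         # one circle inside the other
    a1 = r1**2 * math.acos((d**2 + r1**2 - r2**2) / (2 * d * r1))
    a2 = r2**2 * math.acos((d**2 + r2**2 - r1**2) / (2 * d * r2))
    tri = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2)
                          * (d - r1 + r2) * (d + r1 + r2))
    return a1 + a2 - tri

def venn_layout(size_a, size_b, size_ab):
    """Radii and centre distance for an area-proportional two-set Venn
    diagram, with one unit of set size drawn as one unit of area."""
    r1 = math.sqrt(size_a / math.pi)
    r2 = math.sqrt(size_b / math.pi)
    d = brentq(lambda d: lens_area(d, r1, r2) - size_ab,
               abs(r1 - r2) + 1e-9, r1 + r2 - 1e-9)
    return r1, r2, d

# Example: |A| = 100, |B| = 60, |A ∩ B| = 25 (arbitrary units of area).
print(venn_layout(100.0, 60.0, 25.0))
```
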