106 research outputs found

    Reducible Dictionaries for Single Image Super-Resolution based on Patch Matching and Mean Shifting

    A single-image super-resolution (SR) method is proposed. The method uses a dictionary generated from pairs of high-resolution (HR) images and their corresponding low-resolution (LR) representations. First, the HR images and their LR counterparts are divided into HR and LR patches, respectively, which are collected into separate dictionaries. When performing SR, the distance between each patch of the input LR image and every LR patch in the LR dictionary is computed. The LR dictionary patch at minimum distance is selected, and its counterpart from the HR dictionary is passed through an illumination enhancement process. This technique significantly reduces the noticeable change of illumination between neighboring patches in the super-resolved image. The enhanced HR patch becomes the corresponding patch of the super-resolved image. Finally, to remove the blocking effect caused by merging the patches, the obtained HR image is averaged with a bicubically interpolated version of the input. Quantitative and qualitative analyses show the superiority of the proposed technique over conventional and state-of-the-art methods.
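The pipeline described in this abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function names, patch size, scale factor, plain decimation as the LR-generation step, and the mean shift standing in for the illumination enhancement are all assumptions.

```python
import numpy as np

def build_dictionaries(hr_images, scale=2, lr_patch=4):
    """Split each HR image and its LR version into aligned patch pairs."""
    hr_patch = lr_patch * scale
    lr_dict, hr_dict = [], []
    for hr in hr_images:
        lr = hr[::scale, ::scale]  # crude downsampling for illustration
        for i in range(0, lr.shape[0] - lr_patch + 1, lr_patch):
            for j in range(0, lr.shape[1] - lr_patch + 1, lr_patch):
                lr_dict.append(lr[i:i + lr_patch, j:j + lr_patch])
                hr_dict.append(hr[i * scale:i * scale + hr_patch,
                                  j * scale:j * scale + hr_patch])
    return np.stack(lr_dict), np.stack(hr_dict)

def super_resolve(lr_img, lr_dict, hr_dict, scale=2, lr_patch=4):
    """Paste, for each input LR patch, the HR mate of its nearest LR dictionary patch."""
    out = np.zeros((lr_img.shape[0] * scale, lr_img.shape[1] * scale))
    for i in range(0, lr_img.shape[0] - lr_patch + 1, lr_patch):
        for j in range(0, lr_img.shape[1] - lr_patch + 1, lr_patch):
            p = lr_img[i:i + lr_patch, j:j + lr_patch]
            d = np.sum((lr_dict - p) ** 2, axis=(1, 2))  # distance to every LR patch
            best = hr_dict[np.argmin(d)].astype(float)
            # mean shift as a stand-in for the paper's illumination enhancement
            best += p.mean() - best.mean()
            out[i * scale:(i + lr_patch) * scale,
                j * scale:(j + lr_patch) * scale] = best
    return out
```

The abstract's final step, averaging this result with a bicubic interpolation of the input to suppress blocking, is omitted here for brevity.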

    Analysis of remote sensing images captured with a liquid lens using super-resolution methods

    The electronic version of this thesis does not include the publications. In this thesis, both hardware and software solutions for image enhancement were studied. On the hardware side, a new liquid lens design with a dielectric elastomer stack actuator (DESA) membrane located directly in the optical path was demonstrated. Two prototypes with two different DESA, with active areas of 40 and 20 mm in diameter, were developed. The lens performance was consistent with the mechanics of elastomer deformation and the relative changes in focal length. A laser beam was used to show the change in the meniscus and to measure the focal length of the lens. The experimental results demonstrate that a voltage in the range of 50 to 750 V is required to create a change in the meniscus.
    On the software side, a new satellite image enhancement system was proposed. The proposed technique decomposed the noisy input image into various frequency subbands using the dual-tree complex wavelet transform (DT-CWT). After removing the noise with the LA-BSF technique, the resolution was enhanced by employing the discrete wavelet transform (DWT) and interpolating the high-frequency subband images. The original image was interpolated with half of the interpolation factor used for the high-frequency subband images, and the super-resolved image was reconstructed using the inverse DWT (IDWT).
    A novel single-image SR method based on a dictionary generated from pairs of HR images and their corresponding LR representations was also proposed. First, the HR and LR pairs were divided into patches to build HR and LR dictionaries, respectively. The initial HR representation of an input LR image was obtained by combining HR patches, chosen from the HR dictionary so that their corresponding LR patches are closest to the patches of the input LR image. Each selected HR patch was then passed through an illumination enhancement process in order to reduce the noticeable change of illumination between neighboring patches in the super-resolved image. To reduce the blocking effect, the average of the obtained SR image and the bicubically interpolated image was calculated.
    New sampling kernels that yield a sharper super-resolved image were also proposed. To demonstrate their effectiveness, the resolution enhancement techniques of [83] and [50] were adopted. The super-resolved image was obtained by combining the HR images produced by each of the proposed kernels using the alpha blending technique. The proposed techniques and kernels were compared with various conventional and state-of-the-art techniques; the quantitative test results and visual assessment of the final image quality show their superiority over conventional and state-of-the-art techniques.
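The alpha-blending step used above to combine the HR images produced by the different kernels amounts to a weighted sum of equally sized images. The function name and the choice of weights below are illustrative assumptions; the thesis's actual kernels are not reproduced.

```python
import numpy as np

def alpha_blend(images, alphas):
    """Weighted combination of equally sized images; the weights must sum to 1."""
    alphas = np.asarray(alphas, dtype=float)
    assert np.isclose(alphas.sum(), 1.0), "blending weights must sum to one"
    stack = np.stack([np.asarray(im, dtype=float) for im in images])
    # contract the weight vector against the image stack: sum_k alpha_k * image_k
    return np.tensordot(alphas, stack, axes=1)
```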

    Learning Invariant Representations of Images for Computational Pathology


    Intelligent Biosignal Processing in Wearable and Implantable Sensors

    This reprint provides a collection of papers illustrating the state of the art in smart processing of data coming from wearable, implantable or portable sensors. Each paper presents the design, the databases used, the methodological background, the results obtained, and their interpretation for biomedical applications. Illustrative examples include brain–machine interfaces for medical rehabilitation, the evaluation of sympathetic nerve activity, a novel automated diagnostic tool based on ECG data to diagnose COVID-19, machine learning-based hypertension risk assessment using photoplethysmography and electrocardiography signals, Parkinsonian gait assessment using machine learning tools, a thorough analysis of compressive sensing of ECG signals, the development of a nanotechnology application for decoding vagus-nerve activity, the detection of liver dysfunction using a wearable electronic nose system, prosthetic hand control using surface electromyography, epileptic seizure detection using a CNN, and premature ventricular contraction detection using deep metric learning. This reprint thus presents significant clinical applications as well as valuable new research issues, providing current illustrations of this new field of research by addressing the promises, challenges, and hurdles associated with the synergy of biosignal processing and AI through 16 pertinent studies. Covering a wide range of research and application areas, this book is an excellent resource for researchers, physicians, academics, and PhD or master's students working on (bio)signal and image processing, AI, biomaterials, biomechanics, and biotechnology with applications in medicine.

    Visual saliency computation for image analysis

    Visual saliency computation is about detecting and understanding salient regions and elements in a visual scene. Algorithms for visual saliency computation can give clues to where people will look in images, what objects are visually prominent in a scene, etc. Such algorithms could be useful in a wide range of applications in computer vision and graphics. In this thesis, we study the following visual saliency computation problems. 1) Eye Fixation Prediction. Eye fixation prediction aims to predict where people look in a visual scene. For this problem, we propose a Boolean Map Saliency (BMS) model which leverages the global surroundedness cue using a Boolean map representation. We draw a theoretical connection between BMS and the Minimum Barrier Distance (MBD) transform to provide insight into our algorithm. Experimental results show that BMS compares favorably with state-of-the-art methods on seven benchmark datasets. 2) Salient Region Detection. Salient region detection entails computing a saliency map that highlights the regions of dominant objects in a scene. We propose a salient region detection method based on the MBD transform, presenting a fast approximate MBD transform algorithm with an error-bound analysis. Powered by this fast MBD transform algorithm, our method can run at about 80 FPS and achieve state-of-the-art performance on four benchmark datasets. 3) Salient Object Detection. Salient object detection aims to localize each salient object instance in an image. We propose a method using a Convolutional Neural Network (CNN) model for proposal generation and a novel subset optimization formulation for bounding box filtering. In experiments, our subset optimization formulation consistently outperforms heuristic bounding box filtering baselines, such as non-maximum suppression, and our method substantially outperforms previous methods on three challenging datasets. 4) Salient Object Subitizing.
We propose a new visual saliency computation task, called Salient Object Subitizing, which is to predict the existence and the number of salient objects in an image using holistic cues. To this end, we present an image dataset of about 14K everyday images that were annotated using an online crowdsourcing marketplace. We show that an end-to-end trained CNN subitizing model can achieve promising performance without requiring any localization process. A method is proposed to further improve the training of the CNN subitizing model by leveraging synthetic images. 5) Top-down Saliency Detection. Unlike the aforementioned tasks, top-down saliency detection entails generating task-specific saliency maps. We propose a weakly supervised top-down saliency detection approach by modeling the top-down attention of a CNN image classifier. We propose Excitation Backprop and the concept of contrastive attention to generate highly discriminative top-down saliency maps. Our top-down saliency detection method achieves superior performance in weakly supervised localization tasks on challenging datasets. The usefulness of our method is further validated in the text-to-region association task, where it provides state-of-the-art performance using only weakly labeled web images for training.
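The global-surroundedness cue behind the BMS model can be illustrated with a toy single-channel sketch: threshold the image at several levels, and in each Boolean map keep only the activated regions that do not touch the image border. The full BMS model includes further steps (morphological opening, map normalisation, colour channels) that are omitted here, and all names are assumptions.

```python
from collections import deque
import numpy as np

def _border_reachable(bmap):
    """Mark True cells of a Boolean map reachable from the border (4-connectivity)."""
    h, w = bmap.shape
    seen = np.zeros_like(bmap, dtype=bool)
    q = deque((i, j) for i in range(h) for j in range(w)
              if (i in (0, h - 1) or j in (0, w - 1)) and bmap[i, j])
    for i, j in q:
        seen[i, j] = True
    while q:  # breadth-first flood fill inward from the border
        i, j = q.popleft()
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if 0 <= ni < h and 0 <= nj < w and bmap[ni, nj] and not seen[ni, nj]:
                seen[ni, nj] = True
                q.append((ni, nj))
    return seen

def bms_saliency(img, thresholds):
    """Average, over thresholded Boolean maps, of the regions not touching the border."""
    sal = np.zeros(img.shape, dtype=float)
    for t in thresholds:
        for bmap in (img > t, img <= t):  # each map and its complement
            sal += bmap & ~_border_reachable(bmap)
    return sal / (2 * len(thresholds))
```

A bright blob surrounded by background scores highly because, in the maps where it is activated, it is disconnected from the border; background regions always reach the border and score zero.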

    The Future of Information Sciences : INFuture2015 : e-Institutions – Openness, Accessibility, and Preservation


    Composing interaction within sound and image in digital technologies

    This thesis investigates the entwined relationship between the creative process of composition and the development of technological frameworks, specifically software development, as parallel practices in digital-interactive contexts. Drawing on the tenets of intermediality, notably the writing of Elleström, Nelson, Bay-Cheng, and Kattenbelt, this work aims to explore and analyse the resonances and possibilities for renegotiating our perceptions of temporality, authorship and the construction of experience. This interrogation of digital-intermedial composition consists of three practical research projects and a three-chapter written thesis that addresses the theoretical and practical concerns of a creative process exploring the notion of ‘composing experience.’ The reflexive relationship between composition and digital technologies is the focus of this research, yet further theoretical concepts arise from the central inquiry later in the thesis. A key methodology in my research has been finding the balance between writing, analysis and practical engagement with the work. This is a Practice-as-Research PhD, and as such a complex interaction between theoretical and practical elements defines my inquiry, something reflected in the writing of this thesis. Chapter One seeks to locate the core aspects and processes of my own work within the field of contemporary practice, looking notably at the work of artists involved in digital-interactive work and composing with sound and image. The chapter looks specifically at the validity of creating interactive works from single-data-stream input devices, such as gaming controllers, and at how these interfaces should be ‘mapped’ (Elleström) to effective points of interaction in the context of the audience's experience. Chapter Two charts the linear journey of my practical projects, beginning with Comrade Coffee (Donovan 2010) and my exploration of interdisciplinarity.
My second research project, Inter-activity (Donovan 2011), details the shift in my research focus from interdisciplinarity to intermedial process in constructing work in digital-interactive contexts. The basis of my final work, Digital Spaces (Donovan 2012), is set up, for its exploration in Chapter Three, through an analysis of the system's early development and of different methodological approaches, including gamification. Chapter Three is split into four sections and focuses on the conceptual development and analysis of my research, primarily through Digital Spaces and the theoretical issues emerging from these contexts. The thesis concludes by exploring the validity and functionality of a meta-compositional process and the composition of experience as methodological and ideological focuses for creative arts practice in digital-interactive contexts.

    Deep invariant feature learning for remote sensing scene classification

    Image classification, the core task in computer vision, has advanced at a breakneck pace. This is largely attributable to the recent growth of deep learning techniques, which have surpassed conventional statistical methods on a plethora of benchmarks and can even outperform humans on specific image classification tasks. Despite exceeding alternative techniques, deep learning models have several apparent disadvantages that prevent them from being deployed for general-purpose use. Specifically, deep learning always requires a considerable amount of well-annotated data to circumvent over-fitting and the lack of prior knowledge. However, manually labelled data is expensive to acquire and cannot capture as much variation as the real world. Consequently, deep learning models usually fail when confronted with variations underrepresented in the training data. This is the main reason why deep learning models remain barely satisfactory on challenging image recognition tasks containing nuisance variations, such as Remote Sensing Scene Classification (RSSC). The classification of remote sensing scene images is the procedure of assigning semantic labels to satellite images that contain complicated variations in, for example, texture and appearance. Algorithms for effectively understanding and recognising remote sensing scene images have the potential to be employed in a broad range of applications, such as urban planning, Land Use and Land Cover (LULC) determination, natural hazard detection, vegetation mapping, and environmental monitoring. This inspires us to design frameworks that can automatically predict precise labels for satellite images. In our research project, we identify and define the challenges facing the RSSC community compared with general scene image recognition tasks. Specifically, we summarise the problems from the following perspectives: 1) visual-semantic ambiguity: the discrepancy between visual features and semantic concepts; 2) variations: intra-class diversity and inter-class similarity; 3) cluttered backgrounds; 4) the small size of the training set; and 5) unsatisfactory classification accuracy on large-scale datasets.
To address these challenges, we explore a way to dynamically expand the capacity to incorporate prior knowledge by transforming the input data, so that globally invariant second-order features can be learned from the transformed data to improve RSSC performance. First, we devise a recurrent transformer network (RTN) to progressively discover the discriminative regions of input images and learn the corresponding second-order features. The model is optimised with a pairwise ranking loss so that localising discriminative parts and learning the corresponding features reinforce each other. Second, we observed that existing remote sensing image datasets lack ontological structure. We therefore propose a multi-granularity canonical appearance pooling (MG-CAP) model that automatically discovers the implied hierarchical structure of a dataset and produces covariance features containing multi-grained information. Third, we explore a way to improve the discriminative power of the second-order features. To this end, we present a covariance feature embedding (CFE) model that improves the distinctiveness of covariance pooling through suitable matrix normalisation methods and a low-norm cosine similarity loss that accurately measures distances between high-dimensional features. Finally, we improve RSSC performance while using fewer model parameters: an invariant deep compressible covariance pooling (IDCCP) model is presented to boost classification accuracy, and its generalisability is proved using group theory and manifold optimisation techniques.
All of the proposed frameworks can be optimised in an end-to-end manner and are well supported by GPU acceleration. We conduct extensive experiments on well-known remote sensing scene image datasets to demonstrate the clear improvements of our proposed methods over state-of-the-art approaches.
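The second-order (covariance) features that recur in the models above can be illustrated with a minimal sketch: a C-channel feature map is summarised by the C × C covariance of its spatial feature vectors, followed by an eigenvalue-log normalisation of the kind commonly applied to such descriptors. The shapes and the choice of normalisation are assumptions for illustration, not the exact formulations of RTN, MG-CAP, CFE, or IDCCP.

```python
import numpy as np

def covariance_pool(feat):
    """feat: (C, H, W) feature map -> (C, C) covariance descriptor."""
    c = feat.shape[0]
    x = feat.reshape(c, -1)                   # each column is one spatial position
    x = x - x.mean(axis=1, keepdims=True)     # centre the spatial feature vectors
    return x @ x.T / (x.shape[1] - 1)

def log_normalise(cov, eps=1e-5):
    """Matrix logarithm via eigendecomposition (cov is symmetric PSD)."""
    vals, vecs = np.linalg.eigh(cov)
    return (vecs * np.log(vals + eps)) @ vecs.T
```

In the models above, such descriptors are computed from CNN feature maps and fed to a classifier; the log-style normalisation keeps the eigenvalue spectrum well conditioned.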