6,048 research outputs found

    Unified Multi-Modal Image Synthesis for Missing Modality Imputation

    Full text link
    Multi-modal medical images provide complementary soft-tissue characteristics that aid in the screening and diagnosis of diseases. However, limited scanning time, image corruption and various imaging protocols often result in incomplete multi-modal images, thus limiting the usage of multi-modal data for clinical purposes. To address this issue, in this paper, we propose a novel unified multi-modal image synthesis method for missing modality imputation. Our method overall takes a generative adversarial architecture, which aims to synthesize missing modalities from any combination of available ones with a single model. To this end, we specifically design a Commonality- and Discrepancy-Sensitive Encoder for the generator to exploit both modality-invariant and specific information contained in input modalities. The incorporation of both types of information facilitates the generation of images with consistent anatomy and realistic details of the desired distribution. Besides, we propose a Dynamic Feature Unification Module to integrate information from a varying number of available modalities, which enables the network to be robust to random missing modalities. The module performs both hard integration and soft integration, ensuring the effectiveness of feature combination while avoiding information loss. Verified on two public multi-modal magnetic resonance datasets, the proposed method is effective in handling various synthesis tasks and shows superior performance compared to previous methods.Comment: 10 pages, 9 figure

    VIVE3D: Viewpoint-Independent Video Editing using 3D-Aware GANs

    Full text link
    We introduce VIVE3D, a novel approach that extends the capabilities of image-based 3D GANs to video editing and is able to represent the input video in an identity-preserving and temporally consistent way. We propose two new building blocks. First, we introduce a novel GAN inversion technique specifically tailored to 3D GANs by jointly embedding multiple frames and optimizing for the camera parameters. Second, besides traditional semantic face edits (e.g. for age and expression), we are the first to demonstrate edits that show novel views of the head enabled by the inherent properties of 3D GANs and our optical flow-guided compositing technique to combine the head with the background video. Our experiments demonstrate that VIVE3D generates high-fidelity face edits at consistent quality from a range of camera viewpoints which are composited with the original video in a temporally and spatially consistent manner.Comment: CVPR 2023. Project webpage and video available at http://afruehstueck.github.io/vive3

    Projected Multi-Agent Consensus Equilibrium (PMACE) for Distributed Reconstruction with Application to Ptychography

    Full text link
    Multi-Agent Consensus Equilibrium (MACE) formulates an inverse imaging problem as a balance among multiple update agents such as data-fitting terms and denoisers. However, each such agent operates on a separate copy of the full image, leading to redundant memory use and slow convergence when each agent affects only a small subset of the full image. In this paper, we extend MACE to Projected Multi-Agent Consensus Equilibrium (PMACE), in which each agent updates only a projected component of the full image, thus greatly reducing memory use for some applications.We describe PMACE in terms of an equilibrium problem and an equivalent fixed point problem and show that in most cases the PMACE equilibrium is not the solution of an optimization problem. To demonstrate the value of PMACE, we apply it to the problem of ptychography, in which a sample is reconstructed from the diffraction patterns resulting from coherent X-ray illumination at multiple overlapping spots. In our PMACE formulation, each spot corresponds to a separate data-fitting agent, with the final solution found as an equilibrium among all the agents. Our results demonstrate that the PMACE reconstruction algorithm generates more accurate reconstructions at a lower computational cost than existing ptychography algorithms when the spots are sparsely sampled

    SC-VAE: Sparse Coding-based Variational Autoencoder

    Full text link
    Learning rich data representations from unlabeled data is a key challenge towards applying deep learning algorithms in downstream supervised tasks. Several variants of variational autoencoders have been proposed to learn compact data representaitons by encoding high-dimensional data in a lower dimensional space. Two main classes of VAEs methods may be distinguished depending on the characteristics of the meta-priors that are enforced in the representation learning step. The first class of methods derives a continuous encoding by assuming a static prior distribution in the latent space. The second class of methods learns instead a discrete latent representation using vector quantization (VQ) along with a codebook. However, both classes of methods suffer from certain challenges, which may lead to suboptimal image reconstruction results. The first class of methods suffers from posterior collapse, whereas the second class of methods suffers from codebook collapse. To address these challenges, we introduce a new VAE variant, termed SC-VAE (sparse coding-based VAE), which integrates sparse coding within variational autoencoder framework. Instead of learning a continuous or discrete latent representation, the proposed method learns a sparse data representation that consists of a linear combination of a small number of learned atoms. The sparse coding problem is solved using a learnable version of the iterative shrinkage thresholding algorithm (ISTA). Experiments on two image datasets demonstrate that our model can achieve improved image reconstruction results compared to state-of-the-art methods. Moreover, the use of learned sparse code vectors allows us to perform downstream task like coarse image segmentation through clustering image patches.Comment: 15 pages, 11 figures, and 3 table

    Deep Learning for Scene Flow Estimation on Point Clouds: A Survey and Prospective Trends

    Get PDF
    Aiming at obtaining structural information and 3D motion of dynamic scenes, scene flow estimation has been an interest of research in computer vision and computer graphics for a long time. It is also a fundamental task for various applications such as autonomous driving. Compared to previous methods that utilize image representations, many recent researches build upon the power of deep analysis and focus on point clouds representation to conduct 3D flow estimation. This paper comprehensively reviews the pioneering literature in scene flow estimation based on point clouds. Meanwhile, it delves into detail in learning paradigms and presents insightful comparisons between the state-of-the-art methods using deep learning for scene flow estimation. Furthermore, this paper investigates various higher-level scene understanding tasks, including object tracking, motion segmentation, etc. and concludes with an overview of foreseeable research trends for scene flow estimation

    Statistical-dynamical analyses and modelling of multi-scale ocean variability

    Get PDF
    This thesis aims to provide a comprehensive analysis of multi-scale oceanic variabilities using various statistical and dynamical tools and explore the data-driven methods for correct statistical emulation of the oceans. We considered the classical, wind-driven, double-gyre ocean circulation model in quasi-geostrophic approximation and obtained its eddy-resolving solutions in terms of potential vorticity anomaly and geostrophic streamfunctions. The reference solutions possess two asymmetric gyres of opposite circulations and a strong meandering eastward jet separating them with rich eddy activities around it, such as the Gulf Stream in the North Atlantic and Kuroshio in the North Pacific. This thesis is divided into two parts. The first part discusses a novel scale-separation method based on the local spatial correlations, called correlation-based decomposition (CBD), and provides a comprehensive analysis of mesoscale eddy forcing. In particular, we analyse the instantaneous and time-lagged interactions between the diagnosed eddy forcing and the evolving large-scale PVA using the novel `product integral' characteristics. The product integral time series uncover robust causality between two drastically different yet interacting flow quantities, termed `eddy backscatter'. We also show data-driven augmentation of non-eddy-resolving ocean models by feeding them the eddy fields to restore the missing eddy-driven features, such as the merging western boundary currents, their eastward extension and low-frequency variabilities of gyres. In the second part, we present a systematic inter-comparison of Linear Regression (LR), stochastic and deep-learning methods to build low-cost reduced-order statistical emulators of the oceans. We obtain the forecasts on seasonal and centennial timescales and assess them for their skill, cost and complexity. We found that the multi-level linear stochastic model performs the best, followed by the ``hybrid stochastically-augmented deep learning models''. The superiority of these methods underscores the importance of incorporating core dynamics, memory effects and model errors for robust emulation of multi-scale dynamical systems, such as the oceans.Open Acces

    Image classification over unknown and anomalous domains

    Get PDF
    A longstanding goal in computer vision research is to develop methods that are simultaneously applicable to a broad range of prediction problems. In contrast to this, models often perform best when they are specialized to some task or data type. This thesis investigates the challenges of learning models that generalize well over multiple unknown or anomalous modes and domains in data, and presents new solutions for learning robustly in this setting. Initial investigations focus on normalization for distributions that contain multiple sources (e.g. images in different styles like cartoons or photos). Experiments demonstrate the extent to which existing modules, batch normalization in particular, struggle with such heterogeneous data, and a new solution is proposed that can better handle data from multiple visual modes, using differing sample statistics for each. While ideas to counter the overspecialization of models have been formulated in sub-disciplines of transfer learning, e.g. multi-domain and multi-task learning, these usually rely on the existence of meta information, such as task or domain labels. Relaxing this assumption gives rise to a new transfer learning setting, called latent domain learning in this thesis, in which training and inference are carried out over data from multiple visual domains, without domain-level annotations. Customized solutions are required for this, as the performance of standard models degrades: a new data augmentation technique that interpolates between latent domains in an unsupervised way is presented, alongside a dedicated module that sparsely accounts for hidden domains in data, without requiring domain labels to do so. In addition, the thesis studies the problem of classifying previously unseen or anomalous modes in data, a fundamental problem in one-class learning, and anomaly detection in particular. While recent ideas have been focused on developing self-supervised solutions for the one-class setting, in this thesis new methods based on transfer learning are formulated. Extensive experimental evidence demonstrates that a transfer-based perspective benefits new problems that have recently been proposed in anomaly detection literature, in particular challenging semantic detection tasks

    Cost-effective non-destructive testing of biomedical components fabricated using additive manufacturing

    Get PDF
    Biocompatible titanium-alloys can be used to fabricate patient-specific medical components using additive manufacturing (AM). These novel components have the potential to improve clinical outcomes in various medical scenarios. However, AM introduces stability and repeatability concerns, which are potential roadblocks for its widespread use in the medical sector. Micro-CT imaging for non-destructive testing (NDT) is an effective solution for post-manufacturing quality control of these components. Unfortunately, current micro-CT NDT scanners require expensive infrastructure and hardware, which translates into prohibitively expensive routine NDT. Furthermore, the limited dynamic-range of these scanners can cause severe image artifacts that may compromise the diagnostic value of the non-destructive test. Finally, the cone-beam geometry of these scanners makes them susceptible to the adverse effects of scattered radiation, which is another source of artifacts in micro-CT imaging. In this work, we describe the design, fabrication, and implementation of a dedicated, cost-effective micro-CT scanner for NDT of AM-fabricated biomedical components. Our scanner reduces the limitations of costly image-based NDT by optimizing the scanner\u27s geometry and the image acquisition hardware (i.e., X-ray source and detector). Additionally, we describe two novel techniques to reduce image artifacts caused by photon-starvation and scatter radiation in cone-beam micro-CT imaging. Our cost-effective scanner was designed to match the image requirements of medium-size titanium-alloy medical components. We optimized the image acquisition hardware by using an 80 kVp low-cost portable X-ray unit and developing a low-cost lens-coupled X-ray detector. Image artifacts caused by photon-starvation were reduced by implementing dual-exposure high-dynamic-range radiography. For scatter mitigation, we describe the design, manufacturing, and testing of a large-area, highly-focused, two-dimensional, anti-scatter grid. Our results demonstrate that cost-effective NDT using low-cost equipment is feasible for medium-sized, titanium-alloy, AM-fabricated medical components. Our proposed high-dynamic-range strategy improved by 37% the penetration capabilities of an 80 kVp micro-CT imaging system for a total x-ray path length of 19.8 mm. Finally, our novel anti-scatter grid provided a 65% improvement in CT number accuracy and a 48% improvement in low-contrast visualization. Our proposed cost-effective scanner and artifact reduction strategies have the potential to improve patient care by accelerating the widespread use of patient-specific, bio-compatible, AM-manufactured, medical components

    Modified Stacked Autoencoder Using Adaptive Morlet Wavelet for Intelligent Fault Diagnosis of Rotating Machinery

    Get PDF
    Intelligent fault diagnosis techniques play an important role in improving the abilities of automated monitoring, inference, and decision making for the repair and maintenance of machinery and processes. In this article, a modified stacked autoencoder (MSAE) that uses adaptive Morlet wavelet is proposed to automatically diagnose various fault types and severities of rotating machinery. First, the Morlet wavelet activation function is utilized to construct an MSAE to establish an accurate nonlinear mapping between the raw nonstationary vibration data and different fault states. Then, the nonnegative constraint is applied to enhance the cost function to improve sparsity performance and reconstruction quality. Finally, the fruit fly optimization algorithm is used to determine the adjustable parameters of the Morlet wavelet to flexibly match the characteristics of the analyzed data. The proposed method is used to analyze the raw vibration data collected from a sun gear unit and a roller bearing unit. Experimental results show that the proposed method is superior to other state-of-the-art methods

    From wallet to mobile: exploring how mobile payments create customer value in the service experience

    Get PDF
    This study explores how mobile proximity payments (MPP) (e.g., Apple Pay) create customer value in the service experience compared to traditional payment methods (e.g. cash and card). The main objectives were firstly to understand how customer value manifests as an outcome in the MPP service experience, and secondly to understand how the customer activities in the process of using MPP create customer value. To achieve these objectives a conceptual framework is built upon the Grönroos-Voima Value Model (Grönroos and Voima, 2013), and uses the Theory of Consumption Value (Sheth et al., 1991) to determine the customer value constructs for MPP, which is complimented with Script theory (Abelson, 1981) to determine the value creating activities the consumer does in the process of paying with MPP. The study uses a sequential exploratory mixed methods design, wherein the first qualitative stage uses two methods, self-observations (n=200) and semi-structured interviews (n=18). The subsequent second quantitative stage uses an online survey (n=441) and Structural Equation Modelling analysis to further examine the relationships and effect between the value creating activities and customer value constructs identified in stage one. The academic contributions include the development of a model of mobile payment services value creation in the service experience, introducing the concept of in-use barriers which occur after adoption and constrains the consumers existing use of MPP, and revealing the importance of the mobile in-hand momentary condition as an antecedent state. Additionally, the customer value perspective of this thesis demonstrates an alternative to the dominant Information Technology approaches to researching mobile payments and broadens the view of technology from purely an object a user interacts with to an object that is immersed in consumers’ daily life
    • …
    corecore