19 research outputs found

    Denoising Diffusion Probabilistic Models for Generation of Realistic Fully-Annotated Microscopy Image Data Sets

    Full text link
    Recent advances in computer vision have led to significant progress in the generation of realistic image data, with denoising diffusion probabilistic models proving to be a particularly effective method. In this study, we demonstrate that diffusion models can effectively generate fully-annotated microscopy image data sets through an unsupervised and intuitive approach, using rough sketches of desired structures as the starting point. The proposed pipeline helps to reduce the reliance on manual annotations when training deep learning-based segmentation approaches and enables the segmentation of diverse datasets without the need for human annotations. This approach holds great promise in streamlining the data generation process and enabling a more efficient and scalable training of segmentation models, as we show in the example of different practical experiments involving various organisms and cell types. Comment: 9 pages, 2 figures

    Thermal Analysis of Building Roofs with Latent Heat Storage for Reduction in Energy Consumption and CO2 Emissions: An Experimental and Numerical Research

    No full text
    In green energy buildings, air conditioning costs can be lowered through careful planning of the building’s envelope. This article investigates several strategically designed phase change material (PCM) roof envelopes for savings on air conditioning costs, CO2 emission abatement, and payback timeframes in hot–arid and warm-temperate climates, taking into account unsteady heat transfer characteristics and cooling and heating degree–hours. This is accomplished by using six different PCM–RCC (reinforced cement concrete) roof envelope cases (RCC roof with PCM layer on the outer side, RCC roof with PCM layer in the center, RCC roof with PCM layer on the inside, RCC roof with PCM layers placed on the outside and center, RCC roof with PCM layers placed on the center and inside, and RCC roof with PCM layers placed on the outer side and inside) with three PCMs (FS29 (form stable mixture), HS29 (hydrated salt), and OM29 (organic mixture)). PCM thermophysical characteristics are experimentally measured. The analytical results are experimentally validated. In hot–arid and warm-temperate regions, the HS29 PCM layer installed on the outside of the RCC roof saved the most on air conditioning expenses, at 6.29 and 6.61 $/m2, respectively. This configuration also achieved the greatest carbon mitigation, 300.55 kg of CO2/year and 281.58 kg of CO2/year, with faster payback periods. PCM roof envelopes are the most energy-efficient option for green buildings.

    Quantitative results of synthetic data.

    No full text
    PSNR values presented as boxplots, calculated between real image data and synthetic image data generated from corresponding silver truth and segmentation masks. Those silver truth segmentations are generated with automated approaches and validated to be reliable for training purposes. Whiskers range from the 5th to the 95th quantile, median values are indicated as orange lines, mean values are depicted as green triangles, and boxes represent the interquartile range. The involved datasets are 3D Caenorhabditis elegans (CE) [19], 2D Mouse Stem Cells (GOWT1) [19], 2D HeLa Cells (HeLa) [19], 3D Nuclei and Membranes of Danio rerio (DRNuc, DRMem) [21], 2D+t mitotic progression in Mouse Stem Cells (HeLa+t) [24] and 2D overlapping cervical cancer cells (Cerv) [39]. Using corresponding sketches and the optimized settings of tstart = 400 and σ = 1, the backward process was used to replicate corresponding image samples, and PSNR values were calculated to assess similarity between synthetic and real versions. The Danio rerio multi-channel data and the temporal HeLa data present special cases, which demonstrate limitations of the proposed approach.
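
The PSNR comparison described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `psnr` and the `data_range` parameter are assumptions, and images are assumed to be scaled to a known intensity range.

```python
import numpy as np


def psnr(real, synthetic, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape.

    `data_range` is the assumed maximum possible intensity span
    (1.0 for images scaled to [0, 1]).
    """
    real = np.asarray(real, dtype=np.float64)
    synthetic = np.asarray(synthetic, dtype=np.float64)
    if real.shape != synthetic.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((real - synthetic) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))
```

Higher values indicate that the synthetic image is closer to its real counterpart; the boxplots in the figure summarize such values per dataset.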

    Latent feature representation.

    No full text
    2D feature representations obtained with t-SNE from the latent representation of an autoencoder for real 3D Arabidopsis thaliana image data [18] and corresponding synthetic data generated from sketches of manual annotations. Additionally, feature representations obtained for raw sketches serve as a reference. Since this data set contains large-scale image data, each image stack is partitioned into patches to reduce computational demand. During application, a feature representation for each single patch is obtained and averaged to derive an overall feature description of the entire image stack. Results indicate that the diffusion model learns the average distribution of real image data, since latent representations of synthetic data are enclosed by representations obtained for real image data. Sketch representations form a more distinct cluster, further supporting the realism of synthetic image data.
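
The patch-wise averaging described in this caption can be sketched as below. The autoencoder itself is not specified in the source, so `encode` is a hypothetical callable that maps a patch to a feature vector, and `toy_encoder` is only a stand-in used for illustration. Non-overlapping patches are an assumption as well.

```python
import itertools

import numpy as np


def split_into_patches(stack, patch_shape):
    """Split an array into non-overlapping patches, dropping incomplete borders."""
    stack = np.asarray(stack)
    counts = [s // p for s, p in zip(stack.shape, patch_shape)]
    patches = []
    for idx in itertools.product(*(range(c) for c in counts)):
        slices = tuple(slice(i * p, (i + 1) * p) for i, p in zip(idx, patch_shape))
        patches.append(stack[slices])
    return patches


def stack_descriptor(stack, patch_shape, encode):
    """Average the per-patch feature vectors into one descriptor for the stack."""
    patches = split_into_patches(stack, patch_shape)
    if not patches:
        raise ValueError("patch_shape is larger than the image stack")
    features = np.stack([encode(patch) for patch in patches])
    return features.mean(axis=0)


def toy_encoder(patch):
    """Stand-in for an autoencoder's latent code (hypothetical): mean and std."""
    return np.array([patch.mean(), patch.std()])
```

The resulting per-stack descriptors would then be passed to a dimensionality reduction method such as t-SNE to obtain the 2D representation shown in the figure.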

    Pipeline optimization.

    No full text
    (a) Noisy data created by the forward process from either real images or sketches needs to be sufficiently similar to allow for the generation of realistic image data in the backward process, assessed by histograms. For the backward process, peak signal-to-noise ratio (PSNR) and zero-normalized cross-correlation (ZNCC) are used as metrics to assess the realism of image data generated from different starting points tstart and sketch blurring factors σ. (b) Overlays of generated image data (red) and annotation masks (green) show how structural correlation diminishes with increasing tstart in regions of low contrast, while manual annotation inaccuracies, present even in regions of high contrast, do not appear in simulated data.
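
The ZNCC metric named in this caption can be sketched as follows. This is a minimal global (whole-image) version under the standard definition; the function name `zncc` and the handling of constant images are assumptions, not details from the source.

```python
import numpy as np


def zncc(a, b, eps=1e-12):
    """Zero-normalized cross-correlation between two images of equal shape.

    Returns a value in [-1, 1]; 1 means the images are identical up to
    a positive linear intensity change (gain and offset).
    """
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("images must have the same number of pixels")
    a0 = a - a.mean()
    b0 = b - b.mean()
    denom = np.sqrt(np.sum(a0 ** 2) * np.sum(b0 ** 2))
    if denom < eps:
        # At least one image is constant; correlation is undefined, report 0.
        return 0.0
    return float(np.sum(a0 * b0) / denom)
```

Unlike PSNR, ZNCC is insensitive to global brightness and contrast differences, which is why the two metrics complement each other when comparing settings of tstart and σ.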

    Special cases of generated image data.

    No full text
    Examples of real image data and corresponding synthetic image data generated from corresponding annotations of overlapping cells [39] (top). Moreover, examples of varying background illumination in C. elegans [19, 20] were generated by adding indications within the sketches (bottom). This demonstrates the intuitive strength of the proposed approach, as more complex scenes of overlapping cells can be realistically generated and position-dependent texture characteristics can be straightforwardly imposed and controlled.

    Pipeline overview.

    No full text
    (a) The whole pipeline involves training a diffusion model on real image data and applying it to obtained structures to generate fully-annotated image datasets, which are then used to train models that segment the real data. (b) During application of DDPMs, annotations are automatically turned into coarse sketches for a subsequent application of the forward process, to achieve a realistic generation of the corresponding image data.
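
Step (b) can be illustrated with a small NumPy sketch: an annotation mask is blurred into a coarse sketch and then noised with the closed-form DDPM forward process up to a chosen start step. Several details here are assumptions rather than the authors' settings: the linear noise schedule and its endpoints, the 1000-step horizon, the rescaling to [-1, 1], zero padding at the borders, and all function names.

```python
import numpy as np


def gaussian_kernel1d(sigma, truncate=3.0):
    """Normalized 1D Gaussian kernel."""
    radius = max(1, int(truncate * sigma + 0.5))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def blur_sketch(mask, sigma=1.0):
    """Separable Gaussian blur with zero padding.

    Assumes every axis is at least as long as the kernel.
    """
    kernel = gaussian_kernel1d(sigma)
    out = np.asarray(mask, dtype=np.float64)
    for axis in range(out.ndim):
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="same"), axis, out
        )
    return out


def linear_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def forward_diffuse(x0, t_start, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) in closed form (t_start is 1-indexed)."""
    a = alpha_bar[t_start - 1]
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise


rng = np.random.default_rng(0)
mask = np.zeros((32, 32))
mask[10:20, 12:22] = 1.0                          # toy annotation mask
sketch = 2.0 * blur_sketch(mask, sigma=1.0) - 1.0  # rescale to [-1, 1]
x_t = forward_diffuse(sketch, 400, linear_alpha_bar(), rng)
```

A trained diffusion model would then run the backward process from `x_t` at step 400 down to 0, so that the coarse structure of the sketch is kept while realistic texture is generated.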

    Segmentation accuracy on further datasets.

    No full text
    Solid lines show results obtained for segmentation models solely trained on synthetic data evaluated on the sparse manually annotated ground truth. Dotted lines show determined accuracies when considering the silver truth annotations as predictions (not provided for T. castaneum), and comparing results against the ground truth. Data splits, ground truth and silver truth were provided by the Cell Tracking Challenge [19]. Note that manual annotations are only provided for a small fraction of cells visible within the image data, and annotated cells often focus on the most challenging regions, which are typically difficult to generate.

    Application Examples.

    No full text
    (a) Real image samples and fully-synthetic image samples generated by the diffusion model using simulated structures. (b) The Cellpose segmentation approach [28] is trained on synthetic datasets and applied to real image data to generate results (red overlay) without requiring human-generated annotations. Intersection-over-Union (IoU) scores obtained for a publicly available generalist model trained on a large collection of manually annotated image data (blue) and the model trained on synthetic data (orange) are shown as violin plots with indications of median values (black bar). All datasets are publicly available from [10, 18, 19, 21, 24].
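
The IoU score used in panel (b) can be sketched for a pair of binary masks as shown below. Whether the source computes IoU per image or per matched cell instance is not stated, so this foreground-versus-background version is an assumption, as are the function name `iou` and the convention for two empty masks.

```python
import numpy as np


def iou(pred, target):
    """Intersection-over-Union between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks are empty; treated as a perfect match by convention.
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)
```

Collecting one such score per test image (or per matched cell) yields the distributions that the violin plots summarize.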