242 research outputs found
Diffusion-based generation of Histopathological Whole Slide Images at a Gigapixel scale
We present a novel diffusion-based approach to generate synthetic
histopathological Whole Slide Images (WSIs) at an unprecedented gigapixel
scale. Synthetic WSIs have many potential applications: they can augment
training datasets to enhance the performance of many computational pathology
applications, they allow the creation of synthesized copies of datasets that
can be shared without violating privacy regulations, and they can facilitate
learning representations of WSIs without requiring data annotations. Despite
this variety of applications, no existing deep-learning-based method generates
WSIs at their typically high resolutions, mainly due to the high computational
complexity. Therefore, we propose a novel coarse-to-fine sampling scheme to
tackle image generation of high-resolution WSIs. In this scheme, we increase
the resolution of an initial low-resolution image to a high-resolution WSI.
Specifically, a diffusion model sequentially adds fine details to images and
increases their resolution. In our experiments, we train our method with WSIs
from the TCGA-BRCA dataset. In addition to quantitative evaluations, we also
perform a user study with pathologists. The study results suggest that our
generated WSIs resemble the structure of real WSIs.
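The coarse-to-fine idea can be summarized in a few lines. The sketch below is a minimal illustration of the sampling loop, assuming a trained super-resolution diffusion model behind the hypothetical refine() function; a placeholder is used here so the loop runs end to end, and the sizes are illustrative rather than taken from the paper.

```python
# Minimal coarse-to-fine sampling sketch (illustrative, not the paper's code).
import numpy as np

def upsample(img: np.ndarray, factor: int = 2) -> np.ndarray:
    """Nearest-neighbour upsampling as a stand-in for moving to the next,
    higher resolution level."""
    return img.repeat(factor, axis=0).repeat(factor, axis=1)

def refine(img: np.ndarray) -> np.ndarray:
    """Placeholder for the diffusion model's reverse process, which would add
    tissue-level fine detail conditioned on the coarser image."""
    return img  # a trained diffusion model would denoise/refine here

def coarse_to_fine(low_res: np.ndarray, levels: int) -> np.ndarray:
    """Grow a low-resolution seed into a high-resolution WSI-like array by
    repeatedly upsampling and refining."""
    img = low_res
    for _ in range(levels):
        img = refine(upsample(img))
    return img

# Example: a 256x256 seed grown over 4 levels to 4096x4096.
seed = np.random.rand(256, 256, 3).astype(np.float32)
wsi = coarse_to_fine(seed, levels=4)
print(wsi.shape)  # (4096, 4096, 3)
```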
A Survey on Deep Learning in Medical Image Analysis
Deep learning algorithms, in particular convolutional networks, have rapidly
become a methodology of choice for analyzing medical images. This paper reviews
the major deep learning concepts pertinent to medical image analysis and
summarizes over 300 contributions to the field, most of which appeared in the
last year. We survey the use of deep learning for image classification, object
detection, segmentation, registration, and other tasks and provide concise
overviews of studies per application area. Open challenges and directions for
future research are discussed.
Comment: Revised survey includes an expanded discussion section and a reworked
introductory section on common deep architectures. Added missed papers from
before Feb 1st 2017.
Advances in Image Processing, Analysis and Recognition Technology
For many decades, researchers have been trying to make computer analysis of images as effective as human vision. For this purpose, many algorithms and systems have been created. The whole process covers various stages, including image processing, representation and recognition. The results of this work can be applied to many computer-assisted areas of everyday life. They improve particular activities and provide handy tools, which are sometimes only for entertainment but quite often significantly increase our safety. In fact, the range of practical implementations of image processing algorithms is particularly wide. Moreover, the rapid growth in computing power and efficiency has allowed for the development of more sophisticated and effective algorithms and tools. Although significant progress has been made so far, many issues remain, resulting in the need to develop novel approaches.
Deep Representation Learning with Limited Data for Biomedical Image Synthesis, Segmentation, and Detection
Biomedical imaging requires accurate expert annotation and interpretation that can aid medical staff and clinicians in automating differential diagnosis and solving underlying health conditions. With the advent of deep learning, it has become the standard for reaching expert-level performance in non-invasive biomedical imaging tasks when trained with large image datasets. However, when large publicly available datasets are lacking, training a deep learning model to learn intrinsic representations becomes harder. Representation learning with limited data has introduced new learning techniques, such as Generative Adversarial Networks, Semi-supervised Learning, and Self-supervised Learning, that can be applied to various biomedical applications.

For example, ophthalmologists use color funduscopy (CF) and fluorescein angiography (FA) to diagnose retinal degenerative diseases. However, fluorescein angiography requires injecting a dye, which can cause adverse reactions in patients. To alleviate this, a non-invasive technique needs to be developed that can synthesize fluorescein angiography images from fundus images. Similarly, color funduscopy and optical coherence tomography (OCT) are also utilized to semantically segment the vasculature and fluid build-up in spatial and volumetric retinal imaging, which can help with the future prognosis of diseases. Although many automated techniques have been proposed for medical image segmentation, the main drawback is the models' precision in pixel-wise predictions.

Another critical challenge in the biomedical imaging field is accurately segmenting and quantifying the dynamic behavior of calcium signals in cells. Calcium imaging is a widely utilized approach to studying subcellular calcium activity and cell function; however, large datasets have yielded a profound need for fast, accurate, and standardized analyses of calcium signals. For example, image sequences of calcium signals in colonic pacemaker cells (interstitial cells of Cajal, ICC) suffer from motion artifacts and high periodic and sensor noise, making it difficult to accurately segment and quantify calcium signal events. Moreover, it is time-consuming and tedious to annotate such a large volume of calcium image stacks or videos and extract their associated spatiotemporal maps.

To address these problems, we propose various deep representation learning architectures that utilize limited labels and annotations to address the critical challenges in these biomedical applications. To this end, we detail our proposed semi-supervised, generative adversarial network, and transformer-based architectures for individual learning tasks such as retinal image-to-image translation, vessel and fluid segmentation from fundus and OCT images, breast micro-mass segmentation, and sub-cellular calcium event tracking from videos and spatiotemporal map quantification. We also illustrate two multi-modal multi-task learning frameworks with applications that can be extended to other domains of biomedical applications. The main idea is to incorporate each of these as individual modules into our proposed multi-modal frameworks to solve the existing challenges with 1) fluorescein angiography synthesis, 2) retinal vessel and fluid segmentation, 3) breast micro-mass segmentation, and 4) dynamic quantification of calcium imaging datasets.
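As a concrete illustration of one of the listed modules, the following is a minimal sketch of paired image-to-image translation (fundus to FA) with a conditional GAN in the spirit of pix2pix; the tiny networks, random tensors, and loss weight below are illustrative assumptions, not the thesis architectures.

```python
# Toy conditional GAN training step for fundus -> FA synthesis (illustrative).
import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    """Maps a 3-channel fundus image to a 1-channel FA-like image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

class TinyDiscriminator(nn.Module):
    """Scores (fundus, FA) pairs as real or synthesized, PatchGAN-style."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 16, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, fundus, fa):
        return self.net(torch.cat([fundus, fa], dim=1))

G, D = TinyGenerator(), TinyDiscriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

fundus = torch.rand(2, 3, 64, 64)   # stand-in for a color funduscopy batch
fa_real = torch.rand(2, 1, 64, 64)  # stand-in for the paired FA batch

# Discriminator step: real pairs vs. generated pairs.
fa_fake = G(fundus).detach()
pred_real, pred_fake = D(fundus, fa_real), D(fundus, fa_fake)
d_loss = bce(pred_real, torch.ones_like(pred_real)) + \
         bce(pred_fake, torch.zeros_like(pred_fake))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: fool the discriminator and stay close to the real FA (L1).
fa_fake = G(fundus)
pred_fake = D(fundus, fa_fake)
g_loss = bce(pred_fake, torch.ones_like(pred_fake)) + 100.0 * l1(fa_fake, fa_real)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
print(float(d_loss), float(g_loss))
```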
Widening the view angle of auto-multiscopic display, denoising low brightness light field data and 3D reconstruction with delicate details
This doctoral thesis will present the results of my work on widening the viewing angle
of the auto-multiscopic display, denoising light field data captured in low-light
circumstances, and reconstructing the subject surface with delicate details from
microscopy image sets.
Automultiscopic displays carefully control the distribution of emitted light over
space, direction (angle) and time so that even a static image displayed can encode
parallax across viewing directions (light field). This allows simultaneous observation by
multiple viewers, each perceiving 3D from their own (correct) perspective. Currently,
the illusion can only be effectively maintained over a narrow range of viewing angles.
We propose and analyze a simple solution to widen the range of viewing angles for
automultiscopic displays that use parallax barriers. We insert a refractive medium, with
a high refractive index, between the display and parallax barriers. The inserted medium
warps the exitant light field in a way that increases the potential viewing angle. We
analyze the consequences of this warp and build a prototype with a 93% increase in
the effective viewing angle. Additionally, we developed an integral image synthesis
method that can efficiently address the refraction introduced by the inserted medium
without the use of ray tracing.
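The widening effect follows directly from Snell's law: a ray that leaves a pixel, crosses the high-index gap, and exits through a barrier slit refracts to a larger angle in air. The sketch below computes this exit half-angle for an illustrative (hypothetical) pixel-to-slit geometry, not the prototype's actual dimensions.

```python
# Snell's-law illustration of how a high-index gap widens the exit angle.
import math

def exit_half_angle(offset_mm: float, gap_mm: float, n: float) -> float:
    """Half-angle (degrees) at which a ray from a pixel, crossing a gap of
    refractive index n to a barrier slit offset laterally by offset_mm,
    leaves the display into air."""
    theta_inside = math.atan2(offset_mm, gap_mm)  # angle inside the medium
    s = n * math.sin(theta_inside)                # Snell: n*sin(in) = 1*sin(out)
    if s >= 1.0:
        return 90.0                               # grazing exit
    return math.degrees(math.asin(s))

offset, gap = 1.0, 3.0
for n in (1.0, 1.5):  # air vs. a glass-like medium
    print(f"n={n}: exit half-angle = {exit_half_angle(offset, gap, n):.1f} deg")
# With n=1.5 the same pixel-to-slit geometry exits at a wider angle in air,
# which is the effect exploited to enlarge the viewing zone.
```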
Capturing a light field image with a short exposure time is preferable for eliminating
motion blur, but it also leads to low brightness in a low-light environment, which
results in a low signal-to-noise ratio. Most light field denoising methods apply regular
2D image denoising methods directly to the sub-aperture images of a 4D light field,
but this is not suitable for focused light field data, whose sub-aperture image
resolution is too low for regular denoising methods. Therefore, we propose a deep
learning denoising method based on the micro-lens images of a focused light field,
which denoises the depth map and the original micro-lens image set simultaneously,
and we achieve high-quality totally focused images from the low-light focused light
field data.
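To see why the sub-aperture images of a focused (plenoptic 2.0-style) light field are too small for ordinary 2D denoisers, the toy NumPy example below contrasts the micro-lens images with a sub-aperture view extracted from the same simulated raw capture; the grid and patch sizes are illustrative assumptions.

```python
# Micro-lens images vs. sub-aperture images from a simulated focused light field.
import numpy as np

lenses_y, lenses_x = 40, 60   # micro-lens grid
px = 10                       # pixels behind each micro-lens (per axis)
raw = np.random.rand(lenses_y * px, lenses_x * px)  # simulated raw capture

# Micro-lens images: one (px x px) patch per lens -- the data our method keeps.
micro = raw.reshape(lenses_y, px, lenses_x, px).transpose(0, 2, 1, 3)
print(micro.shape)            # (40, 60, 10, 10)

# Sub-aperture image for angular position (u, v): one pixel from every lens.
u, v = 4, 4
sub_aperture = micro[:, :, u, v]
print(sub_aperture.shape)     # (40, 60) -- too small for regular 2D denoisers
```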
In areas like digital museums and remote research, 3D reconstruction of subjects
with delicate details is desired, and technology like 3D reconstruction based on
macro photography has been used successfully for various purposes. We intend to
push it further by using a microscope rather than a macro lens, which should be
able to capture the microscopy-level details of the subject. We design and
implement a scanning method based on a robotic arm that is able to capture a
microscopy image set from a curved surface, and a 3D reconstruction method
suitable for the microscopy image set.
Gait recognition and understanding based on hierarchical temporal memory using 3D gait semantic folding
Gait recognition and understanding systems have shown a wide-ranging application prospect. However, their use of unstructured data from images and video has affected their performance; e.g., they are easily influenced by multiple views, occlusion, clothing, and object-carrying conditions. This paper addresses these problems using realistic 3-dimensional (3D) human structural data and a sequential pattern learning framework with a top-down attention modulating mechanism based on Hierarchical Temporal Memory (HTM). First, an accurate 2-dimensional (2D) to 3D human body pose and shape semantic parameter estimation method is proposed, which exploits the advantages of an instance-level body parsing model and a virtual dressing method. Second, by using gait semantic folding, the estimated body parameters are encoded using a sparse 2D matrix to construct the structural gait semantic image. In order to achieve time-based gait recognition, an HTM network is constructed to obtain the sequence-level gait sparse distribution representations (SL-GSDRs). A top-down attention mechanism is introduced to deal with various conditions, including multiple views, by refining the SL-GSDRs according to prior knowledge. The proposed gait learning model not only helps gait recognition tasks overcome the difficulties in real application scenarios but also provides the structured gait semantic images for visual cognition. Experimental analyses on the CMU MoBo, CASIA B, TUM-IITKGP, and KY4D datasets show a significant performance gain in terms of accuracy and robustness.
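A minimal sketch of the "sparse 2D matrix" encoding step follows, assuming each normalized body semantic parameter activates a short run of bits in its own row, in the spirit of semantic folding and sparse distributed representations; the parameter names and encoding details are illustrative, not the paper's exact scheme.

```python
# Toy sparse 2D encoding of body semantic parameters for one gait frame.
import numpy as np

def encode_gait_frame(params: dict, width: int = 64, active_bits: int = 3) -> np.ndarray:
    """Return a (num_params x width) binary matrix; each parameter in [0, 1]
    activates a small contiguous run of bits, so the matrix stays sparse."""
    names = sorted(params)
    sdr = np.zeros((len(names), width), dtype=np.uint8)
    for row, name in enumerate(names):
        value = float(np.clip(params[name], 0.0, 1.0))
        start = int(round(value * (width - active_bits)))
        sdr[row, start:start + active_bits] = 1
    return sdr

# Illustrative (hypothetical) normalized pose/shape parameters for one frame.
frame = {"left_knee_angle": 0.35, "right_knee_angle": 0.62,
         "torso_lean": 0.48, "stride_phase": 0.10}
semantic_image = encode_gait_frame(frame)
print(semantic_image.shape, int(semantic_image.sum()))  # (4, 64), 3 bits per row
```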
Smart Cities: Inverse Design of 3D Urban Procedural Models with Traffic and Weather Simulation
Urbanization, the demographic transition from rural to urban, has changed how we envision and share the world. Whereas just one-fourth of the population lived in cities one hundred years ago, now more than half does, and this ratio is expected to grow in the near future. Creating more sustainable, accessible, safe, and enjoyable cities has become an imperative.