792 research outputs found
Learning Equivariant Representations
State-of-the-art deep learning systems often require large amounts of data
and computation. For this reason, leveraging known or unknown structure of the
data is paramount. Convolutional neural networks (CNNs) are successful examples
of this principle, their defining characteristic being the shift-equivariance.
By sliding a filter over the input, when the input shifts, the response shifts
by the same amount, exploiting the structure of natural images where semantic
content is independent of absolute pixel positions. This property is essential
to the success of CNNs in audio, image and video recognition tasks. In this
thesis, we extend equivariance to other kinds of transformations, such as
rotation and scaling. We propose equivariant models for different
transformations defined by groups of symmetries. The main contributions are (i)
polar transformer networks, achieving equivariance to the group of similarities
on the plane, (ii) equivariant multi-view networks, achieving equivariance to
the group of symmetries of the icosahedron, (iii) spherical CNNs, achieving
equivariance to the continuous 3D rotation group, (iv) cross-domain image
embeddings, achieving equivariance to 3D rotations for 2D inputs, and (v)
spin-weighted spherical CNNs, generalizing the spherical CNNs and achieving
equivariance to 3D rotations for spherical vector fields. Applications include
image classification, 3D shape classification and retrieval, panoramic image
classification and segmentation, shape alignment and pose estimation. What
these models have in common is that they leverage symmetries in the data to
reduce sample and model complexity and improve generalization performance. The
advantages are more significant on (but not limited to) challenging tasks where
data is limited or input perturbations such as arbitrary rotations are present
Graph Relation Distillation for Efficient Biomedical Instance Segmentation
Instance-aware embeddings predicted by deep neural networks have
revolutionized biomedical instance segmentation, but its resource requirements
are substantial. Knowledge distillation offers a solution by transferring
distilled knowledge from heavy teacher networks to lightweight yet
high-performance student networks. However, existing knowledge distillation
methods struggle to extract knowledge for distinguishing instances and overlook
global relation information. To address these challenges, we propose a graph
relation distillation approach for efficient biomedical instance segmentation,
which considers three essential types of knowledge: instance-level features,
instance relations, and pixel-level boundaries. We introduce two graph
distillation schemes deployed at both the intra-image level and the inter-image
level: instance graph distillation (IGD) and affinity graph distillation (AGD).
IGD constructs a graph representing instance features and relations,
transferring these two types of knowledge by enforcing instance graph
consistency. AGD constructs an affinity graph representing pixel relations to
capture structured knowledge of instance boundaries, transferring
boundary-related knowledge by ensuring pixel affinity consistency. Experimental
results on a number of biomedical datasets validate the effectiveness of our
approach, enabling student models with less than parameters and less
than inference time while achieving promising performance compared to
teacher models
Spatial-Temporal Deep Embedding for Vehicle Trajectory Reconstruction from High-Angle Video
Spatial-temporal Map (STMap)-based methods have shown great potential to
process high-angle videos for vehicle trajectory reconstruction, which can meet
the needs of various data-driven modeling and imitation learning applications.
In this paper, we developed Spatial-Temporal Deep Embedding (STDE) model that
imposes parity constraints at both pixel and instance levels to generate
instance-aware embeddings for vehicle stripe segmentation on STMap. At pixel
level, each pixel was encoded with its 8-neighbor pixels at different ranges,
and this encoding is subsequently used to guide a neural network to learn the
embedding mechanism. At the instance level, a discriminative loss function is
designed to pull pixels belonging to the same instance closer and separate the
mean value of different instances far apart in the embedding space. The output
of the spatial-temporal affinity is then optimized by the mutex-watershed
algorithm to obtain final clustering results. Based on segmentation metrics,
our model outperformed five other baselines that have been used for STMap
processing and shows robustness under the influence of shadows, static noises,
and overlapping. The designed model is applied to process all public NGSIM
US-101 videos to generate complete vehicle trajectories, indicating a good
scalability and adaptability. Last but not least, the strengths of the scanline
method with STDE and future directions were discussed. Code, STMap dataset and
video trajectory are made publicly available in the online repository. GitHub
Link: shorturl.at/jklT0
Learning Instance Segmentation from Sparse Supervision
Instance segmentation is an important task in many domains of automatic image processing, such as self-driving cars, robotics and microscopy data analysis. Recently, deep learning-based algorithms have brought image segmentation close to human performance. However, most existing models rely on dense groundtruth labels for training, which are expensive, time consuming and often require experienced annotators to perform the labeling. Besides the annotation burden, training complex high-capacity neural networks depends upon non-trivial expertise in the choice and tuning of hyperparameters, making the adoption of these models challenging for researchers in other fields.
The aim of this work is twofold. The first is to make the deep learning segmentation methods accessible to non-specialist. The second is to address the dense annotation problem by developing instance segmentation methods trainable with limited groundtruth data.
In the first part of this thesis, I bring state-of-the-art instance segmentation methods closer to non-experts by developing PlantSeg: a pipeline for volumetric segmentation of light microscopy images of biological tissues into cells. PlantSeg comes with a large repository of pre-trained models and delivers highly accurate results on a variety of samples and image modalities. We exemplify its usefulness to answer biological questions in several collaborative research projects.
In the second part, I tackle the dense annotation bottleneck by introducing SPOCO, an instance segmentation method, which can be trained from just a few annotated objects. It demonstrates strong segmentation performance on challenging natural and biological benchmark datasets at a very reduced manual annotation cost and delivers state-of-the-art results on the CVPPP benchmark.
In summary, my contributions enable training of instance segmentation models with limited amounts of labeled data and make these methods more accessible for non-experts, speeding up the process of quantitative data analysis
Recommended from our members
Data harmonisation for information fusion in digital healthcare: A state-of-the-art systematic review, meta-analysis and future research directions.
Removing the bias and variance of multicentre data has always been a challenge in large scale digital healthcare studies, which requires the ability to integrate clinical features extracted from data acquired by different scanners and protocols to improve stability and robustness. Previous studies have described various computational approaches to fuse single modality multicentre datasets. However, these surveys rarely focused on evaluation metrics and lacked a checklist for computational data harmonisation studies. In this systematic review, we summarise the computational data harmonisation approaches for multi-modality data in the digital healthcare field, including harmonisation strategies and evaluation metrics based on different theories. In addition, a comprehensive checklist that summarises common practices for data harmonisation studies is proposed to guide researchers to report their research findings more effectively. Last but not least, flowcharts presenting possible ways for methodology and metric selection are proposed and the limitations of different methods have been surveyed for future research
Data harmonisation for information fusion in digital healthcare: A state-of-the-art systematic review, meta-analysis and future research directions
Removing the bias and variance of multicentre data has always been a challenge in large scale digital healthcare studies, which requires the ability to integrate clinical features extracted from data acquired by different scanners and protocols to improve stability and robustness. Previous studies have described various computational approaches to fuse single modality multicentre datasets. However, these surveys rarely focused on evaluation metrics and lacked a checklist for computational data harmonisation studies. In this systematic review, we summarise the computational data harmonisation approaches for multi-modality data in the digital healthcare field, including harmonisation strategies and evaluation metrics based on different theories. In addition, a comprehensive checklist that summarises common practices for data harmonisation studies is proposed to guide researchers to report their research findings more effectively. Last but not least, flowcharts presenting possible ways for methodology and metric selection are proposed and the limitations of different methods have been surveyed for future research
Computer Vision Approaches for Mapping Gene Expression onto Lineage Trees
This project concerns studying the early development of living organisms. This period is accompanied by dynamic morphogenetic events. There is an increase in the number of cells, changes in the shape of cells and specification of cell fate during this time. Typically, in order to capture the dynamic morphological changes, one can employ a form of microscopy imaging such as Selective Plane Illumination Microscopy (SPIM) which offers a single-cell resolution across time, and hence allows observing the positions, velocities and trajectories of most cells in a developing embryo. Unfortunately, the dynamic genetic activity which underlies these morphological changes and influences cellular fate decision, is captured only as static snapshots and often requires processing (sequencing or imaging) multiple distinct individuals. In order to set the stage for characterizing the factors which influence cellular fate, one must bring the data arising from the above-mentioned static snapshots of multiple individuals and the data arising from SPIM imaging of other distinct individual(s) which characterizes the changes in morphology, into the same frame of reference.
In this project, a computational pipeline is established, which achieves the aforementioned goal of mapping data from these various imaging modalities and specimens to a canonical frame of reference. This pipeline relies on the three core building blocks of Instance Segmentation, Tracking and Registration. In this dissertation work, I introduce EmbedSeg which is my solution to performing instance segmentation of 2D and 3D (volume) image data. Next, I introduce LineageTracer which is my solution to performing tracking of a time-lapse (2d+t, 3d+t) recording. Finally, I introduce PlatyMatch which is my solution to performing registration of volumes. Errors from the application of these building blocks accumulate which produces a noisy observation estimate of gene expression for the digitized cells in the canonical frame of reference. These noisy estimates are processed to infer the underlying hidden state by using a Hidden Markov Model (HMM) formulation. Lastly, for wider dissemination of these methods, one requires an effective visualization strategy. A few details about the employed approach are also discussed in the dissertation work.
The pipeline was designed keeping imaging volume data in mind, but can easily be extended to incorporate other data modalities, if available, such as single cell RNA Sequencing (scRNA-Seq) (more details are provided in the Discussion chapter). The methods elucidated in this dissertation would provide a fertile playground for several experiments and analyses in the future. Some of such potential experiments and current weaknesses of the computational pipeline are also discussed additionally in the Discussion Chapter
- …