Comments on 'Fast and scalable search of whole-slide images via self-supervised deep learning'
Chen et al. [Chen2022] recently published the article 'Fast and scalable
search of whole-slide images via self-supervised deep learning' in Nature
Biomedical Engineering. The authors call their method 'self-supervised image
search for histology', SISH for short. We express our concerns that SISH is
an incremental modification of Yottixel, that it uses MinMax binarization
without citing the original works, and that it rests on the misnomer
'self-supervised image search'. We also point to several other concerns
regarding the experiments and comparisons performed by Chen et al.
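For readers unfamiliar with the binarization technique mentioned above, a much-simplified sketch in the spirit of MinMax binarization (an illustrative reading, not the original published algorithm) encodes whether consecutive feature values rise or fall:

```python
def minmax_binarize(features):
    """Binarize a real-valued feature vector by the sign of successive
    differences: a rising step maps to 1, a non-rising step to 0.
    Illustrative simplification only, not the exact MinMax algorithm."""
    return [1 if b > a else 0 for a, b in zip(features, features[1:])]

code = minmax_binarize([0.2, 0.5, 0.1, 0.4, 0.4])
# rises, falls, rises, flat -> [1, 0, 1, 0]
```

The resulting bit string can be compared with cheap bitwise operations, which is what makes such codes attractive for large-scale image search.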
Learning Binary and Sparse Permutation-Invariant Representations for Fast and Memory Efficient Whole Slide Image Search
Learning suitable whole-slide image (WSI) representations for efficient
retrieval systems is a non-trivial task. The WSI embeddings obtained from
current methods lie in Euclidean space, which is not ideal for efficient WSI
retrieval. Furthermore, most current methods require high GPU memory due to
the simultaneous processing of multiple sets of patches. To address these
challenges, we propose a novel framework for learning binary and sparse WSI
representations utilizing deep generative modelling and Fisher Vectors. We
introduce new loss functions for learning sparse and binary
permutation-invariant WSI representations that employ instance-based
training, achieving better memory efficiency. The learned WSI
representations are validated on The Cancer Genome Atlas (TCGA) and
Liver-Kidney-Stomach (LKS) datasets. The proposed method outperforms
Yottixel (a recent search engine for histopathology images) in terms of both
retrieval accuracy and speed. Further, we achieve competitive performance
against the state of the art on the public LKS benchmark for WSI
classification.
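The practical payoff of binary WSI representations is that retrieval reduces to Hamming-distance comparisons, i.e., an XOR followed by a popcount. A minimal sketch (the integer bit-codes and the archive layout are illustrative assumptions, not the paper's implementation):

```python
def hamming(a, b):
    # Hamming distance between two integer bit-codes: XOR, then popcount.
    return bin(a ^ b).count("1")

def nearest_slide(query_code, archive):
    """Return the slide id whose binary code is closest to the query.
    `archive` is a hypothetical {slide_id: bit_code} index."""
    return min(archive, key=lambda sid: hamming(query_code, archive[sid]))

match = nearest_slide(0b1010, {"slide_a": 0b1010, "slide_b": 0b0101})
# slide_a matches exactly (distance 0)
```

Because XOR and popcount are single machine instructions on modern CPUs, this comparison is far cheaper than Euclidean distance over float embeddings, which is the speed advantage the abstract claims.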
Histopathology Slide Indexing and Search: Are We There Yet?
The search and retrieval of digital histopathology slides is an important
task that has yet to be solved. In this case study, we investigate the clinical
readiness of three state-of-the-art histopathology slide search engines,
Yottixel, SISH, and RetCCL, on three patients with solid tumors. We provide a
qualitative assessment of each model's performance in providing retrieval
results that are reliable and useful to pathologists. We found that all three
image search engines fail to produce consistently reliable results and have
difficulties in capturing granular and subtle features of malignancy, limiting
their diagnostic accuracy. Based on our findings, we also propose a minimal
set of requirements to further advance the development of accurate and
reliable histopathology image search engines for successful clinical adoption.
Ranking Loss and Sequestering Learning for Reducing Image Search Bias in Histopathology
Recently, deep learning has started to play an essential role in healthcare
applications, including image search in digital pathology. Despite recent
progress in computer vision, significant issues remain for image search in
histopathology archives. A well-known problem is AI bias and lack of
generalization. A more particular shortcoming of deep models is their
ignorance of search functionality. The former affects every model, the
latter only search and matching. Due to the lack of ranking-based learning,
researchers must train models on classification error and then reuse the
resulting embedding for image search. Moreover, deep models appear to be
prone to internal bias even when trained on a large image repository drawn
from various hospitals. This paper proposes two novel ideas to improve image
search performance. First, we use a ranking loss function to guide feature
extraction toward the matching-oriented nature of search. By forcing the
model to learn the ranking of matched outputs, representation learning is
customized toward image search instead of learning a class label. Second, we
introduce the concept of sequestering learning to enhance the generalization
of feature extraction. By excluding the images of the input hospital from
the matched outputs, i.e., sequestering the input domain, institutional bias
is reduced. The proposed ideas are implemented and validated on the largest
public dataset of whole-slide images. The experiments demonstrate superior
results compared to the state of the art.
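The ranking-based training contrasted with classification above can be illustrated with a standard triplet margin loss; this is a generic sketch of ranking supervision, not the paper's exact loss, and the embeddings and margin below are hypothetical:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet ranking loss: push the matched (positive) embedding closer
    to the anchor than the mismatched (negative) one by at least `margin`.
    A generic stand-in for ranking-based learning, not the paper's loss."""
    d_pos = np.linalg.norm(anchor - positive)   # anchor-to-match distance
    d_neg = np.linalg.norm(anchor - negative)   # anchor-to-mismatch distance
    return max(0.0, d_pos - d_neg + margin)

# Well-separated triplet: zero loss.
ok = triplet_loss(np.zeros(2), np.zeros(2), np.array([2.0, 0.0]))  # 0.0
```

Minimizing such a loss directly optimizes the ordering of retrieval results, which is exactly the property a search embedding needs and a classification loss does not enforce.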
Learning Discriminative Representations for Gigapixel Images
Digital images of tumor tissue are important diagnostic and prognostic tools for pathologists. Recent advancements in digital pathology have led to an abundance of digitized histopathology slides, called whole-slide images. Computational analysis of whole-slide images is a challenging task as they are generally gigapixel files, often one or more gigabytes in size. However, these computational methods provide a unique opportunity to improve the objectivity and accuracy of diagnostic interpretations in histopathology. Recently, deep learning has been successful in characterizing images for vision-based applications in multiple domains. But its applications are relatively less explored in the histopathology domain, mostly due to the following two challenges. Firstly, there is difficulty in scaling deep learning methods for processing large gigapixel histopathology images. Secondly, there is a lack of diversified and labeled datasets due to privacy constraints as well as workflow and technical challenges in the healthcare sector. The main goal of this dissertation is to explore and develop deep models to learn discriminative representations of whole-slide images while overcoming the existing challenges. A three-staged approach was considered in this research. In the first stage, a framework called Yottixel is proposed. It represents a whole-slide image as a set of multiple representative patches, called a mosaic. The mosaic enables convenient processing and compact representation of an entire high-resolution whole-slide image. Yottixel allows faster retrieval of similar whole-slide images within large archives of digital histopathology images. Such retrieval technology enables pathologists to tap into past diagnostic data on demand. Yottixel is validated on the largest public archive of whole-slide images (The Cancer Genome Atlas), achieving promising results.
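The mosaic idea described above can be sketched in a few lines: once patches have been grouped (Yottixel clusters them by color and spatial features; the precomputed cluster labels here are an assumption for brevity), one representative per group stands in for the whole slide:

```python
def build_mosaic(patches, cluster_labels):
    """Yottixel-style mosaic sketch: given patches already grouped (e.g.,
    by k-means over color/texture features), keep one representative patch
    per cluster so an entire gigapixel WSI reduces to a small patch set.
    Illustrative only; the real pipeline also samples within clusters."""
    mosaic = {}
    for patch, label in zip(patches, cluster_labels):
        mosaic.setdefault(label, patch)  # first patch seen in each cluster
    return list(mosaic.values())

# Four patches in two clusters collapse to two representatives.
reps = build_mosaic(["p0", "p1", "p2", "p3"], [0, 1, 0, 1])
# -> ["p0", "p1"]
```

Searching over a handful of mosaic patches instead of tens of thousands of raw patches is what makes retrieval over large archives tractable.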
Yottixel is an unsupervised method, which limits its performance on specific tasks, especially when a labeled (or partially labeled) dataset is available. In the second stage, multi-instance learning (MIL) is used to enhance cancer subtype prediction through weakly-supervised training. Three MIL methods have been proposed, each improving upon the previous one. The first is based on memory-based models, the second uses attention-based models, and the third uses graph neural networks. All three methods are incorporated into Yottixel to classify entire whole-slide images with no pixel-level annotations. Access to large-scale and diversified datasets is a primary driver of the advancement and adoption of machine learning technologies. However, healthcare has many restrictive rules around data sharing, limiting research and model development. In the final stage, a federated learning scheme called ProxyFL is developed that enables collaborative training of Yottixel among multiple healthcare organizations without centralizing sensitive medical data. The combined research across all three stages of the Ph.D. has resulted in a holistic and practical framework for learning discriminative and compact representations of whole-slide images in digital pathology.
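The federated setting described above can be illustrated with the simplest possible baseline, federated averaging; note that ProxyFL itself exchanges proxy models peer-to-peer rather than averaging weights on a server, so this sketch shows only the general principle that weights, never patient data, leave each hospital:

```python
import numpy as np

def federated_average(local_weights):
    """One generic federated-averaging step: each hospital trains on its
    own slides and only the resulting model weights are pooled. This is
    the FedAvg baseline, NOT ProxyFL's proxy-model exchange scheme."""
    return np.mean(np.stack(local_weights), axis=0)

# Two hospitals' weight vectors combine into one shared model.
shared = federated_average([np.array([1.0, 3.0]), np.array([3.0, 1.0])])
# -> array([2.0, 2.0])
```

Even this naive scheme respects the data-sharing constraint the abstract emphasizes: raw whole-slide images never cross institutional boundaries.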
Multi-Magnification Search in Digital Pathology
This research study investigates the effect of magnification on content-based image search in digital pathology archives and proposes using a multi-magnification image representation. Image search in large archives of digital pathology slides provides researchers and medical professionals with an opportunity to match records of current and past patients and to learn from already diagnosed and treated cases. When working with microscopes, pathologists switch between different magnification levels while examining tissue specimens to find and evaluate various morphological features. Inspired by the conventional pathology workflow, this thesis investigates several magnification levels in digital pathology and their combinations to minimize the gap between AI-enabled image search methods and clinical settings. This thesis suggests two approaches for combining magnification levels and compares their performance. The first approach obtains a single-vector deep feature representation for a WSI, whereas the second approach works with a multi-vector deep feature representation. The proposed content-based search framework does not rely on any pixel-level annotation and potentially applies to millions of unlabelled (raw) WSIs. This thesis proposes using binary masks generated by U-Net as the primary step of patch preparation to locate tissue regions in a WSI. As a part of this thesis, a multi-magnification dataset of histopathology patches is created by applying the proposed patch preparation method to more than 8,000 WSIs from the TCGA repository.
The performance of both multi-magnification search (MMS) methods is evaluated by investigating the top three most similar WSIs to a query WSI found by the search. The search is considered successful if two out of three matched cases have the same malignancy subtype as the query WSI. Experimental search results across tumors of several anatomical sites at different magnification levels, i.e., 20×, 10×, and 5× magnifications and their combinations, are reported in this thesis.
The experiments verify that cell-level information at the highest magnification is essential for searching for diagnostic purposes. In contrast, low-magnification information may improve this assessment depending on the tumor type. Both proposed search methods generally performed more accurately at 20× magnification or at combinations of 20× with 10×, 5×, or both. The multi-magnification search approach achieved up to an 11% increase in F1-score when searching among some tumor types, including urinary tract and brain tumor subtypes, compared to single-magnification image search.
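The two-out-of-three success criterion used in the evaluation above is simple enough to state in code (the subtype labels below are hypothetical examples):

```python
def search_successful(query_subtype, top3_subtypes):
    """Success criterion from the thesis: a query counts as a hit when at
    least two of the top-3 retrieved slides share its malignancy subtype."""
    return sum(s == query_subtype for s in top3_subtypes) >= 2

hit = search_successful("glioma", ["glioma", "glioma", "meningioma"])   # True
miss = search_successful("glioma", ["meningioma", "glioma", "sarcoma"])  # False
```

This majority-vote criterion is stricter than top-1 accuracy but more forgiving than requiring all three matches to agree, which suits a decision-support setting where a pathologist reviews the returned cases.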
KimiaNet: Training a Deep Network for Histopathology using High-Cellularity
With the recent progress in deep learning, one of the common approaches to represent images is extracting deep features. A primitive way to do this is by using off-the-shelf models. However, these features can be improved through fine-tuning, or even by training a network from scratch on domain-specific images. This desirable task is hindered by the lack of annotated or labeled images in the field of histopathology.
In this thesis, a new network, namely KimiaNet, is proposed that uses an existing dense topology but is tailored for generating informative and discriminative deep features from histopathology images for image representation. This model is trained based on the existing DenseNet-121 architecture but by using more than 240,000 image patches of 1000 × 1000 pixels acquired at 20× magnification.
Considering the high cost of histopathology image annotation, which makes the idea impractical at a large scale, a high-cellularity mosaic approach is suggested, which can be used as a weak or soft labeling method. Patches used for training KimiaNet are extracted from 7,126 whole-slide images of formalin-fixed paraffin-embedded (FFPE) biopsy samples, spanning 30 cancer subtypes and publicly available through The Cancer Genome Atlas (TCGA) repository.
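The high-cellularity weak-labeling idea above can be sketched as follows; the cellularity scores and the 20% keep-fraction are illustrative assumptions, not values from the thesis:

```python
def high_cellularity_patches(patches, cellularity, keep_frac=0.2):
    """Weak-labeling sketch: keep the most cell-dense patches of a slide
    and assign them the slide-level diagnosis as a training label.
    `cellularity` is assumed precomputed (e.g., nuclei density);
    `keep_frac` is an illustrative threshold, not the thesis's value."""
    ranked = sorted(zip(patches, cellularity),
                    key=lambda pc: pc[1], reverse=True)
    k = max(1, int(len(ranked) * keep_frac))   # keep at least one patch
    return [patch for patch, _ in ranked[:k]]

# The two densest of five patches survive at keep_frac=0.4.
kept = high_cellularity_patches(["a", "b", "c", "d", "e"],
                                [0.1, 0.9, 0.5, 0.2, 0.8], keep_frac=0.4)
# -> ["b", "e"]
```

The rationale is that tumor-dense regions most reliably reflect the slide-level diagnosis, so restricting training to them turns a slide label into a usable, if noisy, patch label.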
The quality of features generated by KimiaNet is tested via two types of image search: (i) given a query slide, searching among all slides and finding the ones with a tissue type similar to the query's, and (ii) searching among slides within the query slide's tumor type and finding slides with the same cancer subtype as the query slide's. Compared to the pre-trained DenseNet-121 and the fine-tuned versions, KimiaNet achieved predominantly the best results for both search modes.
In order to get an intuition of how effective training from scratch is for the expressiveness of the deep features, the deep features of randomly selected patches from each cancer subtype are extracted using both KimiaNet and pre-trained DenseNet-121 and visualized after reducing their dimensionality using t-distributed Stochastic Neighbor Embedding (t-SNE). This visualization illustrates that for KimiaNet the instances of each class can easily be distinguished from the others, while for pre-trained DenseNet the instances of almost all classes are mixed together. This comparison is another verification of how discriminative training with domain-specific images has made the features.
Also, four simpler networks, made up of repetitions of convolutional, batch-normalization, and Rectified Linear Unit (ReLU) layers (CBR networks), are implemented and compared against KimiaNet to check whether the network design could be further simplified. The experiments demonstrated that KimiaNet features are by far better than those of the CBR networks, which validates DenseNet-121 as a good candidate for KimiaNet's architecture.
End-to-End Whole Slide Image Classification and Search using Set Representations
Due to the sheer size of gigapixel whole-slide images (WSIs), it is difficult to apply deep learning and computer vision algorithms in digital pathology. Traditional approaches rely on extracting meaningful patches from a WSI and obtaining a representation for each patch individually. This approach fails to incorporate the inherent information shared among the set of extracted patches. In this thesis, we approach the problem of WSI representation by using Set Transformers, a neural network architecture capable of incorporating the element-wise interactions of sets to obtain one global representation. We show, through extensive experiments, the representation capabilities of our method by outperforming existing patch-based approaches on search and classification tasks.
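The key property a set-based WSI representation needs is permutation invariance: shuffling the patches must not change the slide-level vector. A minimal stand-in for the Set Transformer's pooling-by-multihead-attention (PMA) block, with a single hypothetical learned seed query and no multi-head machinery, illustrates this:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(patch_feats, seed):
    """Pool a set of patch embeddings (n, d) into one WSI vector by
    attending against a learned seed query of shape (d,). A minimal,
    single-head sketch of Set Transformer pooling; `seed` would be a
    trained parameter in the real model."""
    scores = patch_feats @ seed / np.sqrt(patch_feats.shape[1])
    weights = softmax(scores)      # attention weights over set elements
    return weights @ patch_feats   # weighted sum -> one (d,) vector

x = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
seed = np.array([1.0, 0.0])
# Reordering the patch set leaves the pooled vector unchanged.
same = np.allclose(attention_pool(x, seed), attention_pool(x[::-1], seed))
```

Because the weighted sum is symmetric in the set elements, the pooled representation depends only on which patches are present, not on the order they were extracted in, which is exactly the invariance the abstract's set formulation provides.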