Human machine collaboration for foreground segmentation in images and videos
Foreground segmentation is defined as the problem of generating pixel level foreground masks for all the objects in a given image or video. Accurate foreground segmentations in images and videos have several potential applications such as improving search, training richer object detectors, image synthesis and re-targeting, scene and activity understanding, video summarization, and post-production video editing.
One effective way to solve this problem is human-machine collaboration. The main idea is to let humans guide the segmentation process through some partial supervision. As humans, we are extremely good at perception and can easily identify the foreground regions. Computers, on the other hand, lack this capability, but are extremely good at continuously processing large volumes of data at the lowest level of detail with great efficiency. Bringing these complementary strengths together can lead to systems which are accurate and cost-effective at the same time. However, in any such human-machine collaboration system, cost effectiveness and higher accuracy are competing goals. While more involvement from humans can certainly lead to higher accuracy, it also leads to increased cost both in terms of time and money. On the other hand, relying more on machines is cost-effective, but algorithms are still nowhere near human-level performance. Balancing this cost versus accuracy trade-off holds the key behind success for such a hybrid system.
In this thesis, I develop foreground segmentation algorithms which effectively and efficiently make use of human guidance for accurately segmenting foreground objects in images and videos. The algorithms developed in this thesis actively reason about the best modalities or interactions through which a user can provide guidance to the system for generating accurate segmentations. At the same time, these algorithms are also capable of prioritizing human guidance on instances where it is most needed. Finally, when structural similarity exists within data (e.g., adjacent frames in a video or similar images in a collection), the algorithms developed in this thesis are capable of propagating information from instances which have received human guidance to the ones which did not. Together, these characteristics result in a substantial savings in human annotation cost while generating high quality foreground segmentations in images and videos.
In this thesis, I consider three categories of segmentation problems, all of which can greatly benefit from human-machine collaboration. First, I consider the problem of interactive image segmentation. In traditional interactive methods, a human annotator provides a coarse spatial annotation (e.g., a bounding box or freehand outline) around the object of interest to obtain a segmentation. The mode of manual annotation used affects both accuracy and ease of use. Whereas existing methods assume a fixed form of input no matter the image, in this thesis I propose a data-driven algorithm which learns whether an interactive segmentation method will succeed if initialized with a given annotation mode. This allows us to predict the modality that will be sufficiently strong to yield a high quality segmentation for a given image and results in large savings in annotation costs. I also propose a novel interactive segmentation algorithm called Click Carving which can accurately segment objects in images and videos using a very simple form of human interaction: point clicks. It outperforms several state-of-the-art methods and requires only a fraction of the human effort in comparison.
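The idea of predicting which annotation mode is "sufficiently strong" can be illustrated with a small sketch. It assumes a learned model has already produced, for one image, a success probability per annotation mode; the function name `choose_annotation_mode`, the mode names, the costs, and the threshold are all hypothetical and are not taken from the thesis.

```python
def choose_annotation_mode(success_prob, cost, threshold=0.5):
    """Pick the cheapest annotation mode predicted to yield a good segmentation.

    success_prob: dict mode -> predicted probability that segmentation succeeds
                  when initialized with that mode (assumed output of a learned model).
    cost:         dict mode -> annotation cost, e.g. seconds of human time (assumed).
    Falls back to the mode with the highest predicted success if none pass.
    """
    # Modes whose predicted success clears the threshold.
    adequate = [m for m in success_prob if success_prob[m] >= threshold]
    if adequate:
        # Among adequate modes, prefer the one that costs the least.
        return min(adequate, key=lambda m: cost[m])
    # No mode is predicted to suffice: use the strongest one available.
    return max(success_prob, key=lambda m: success_prob[m])


# Illustrative (made-up) numbers for a single image.
probs = {"bounding_box": 0.4, "freehand_outline": 0.8, "tight_polygon": 0.95}
costs = {"bounding_box": 7, "freehand_outline": 20, "tight_polygon": 54}
mode = choose_annotation_mode(probs, costs)
```

Here the bounding box is cheapest but predicted to fail, so the sketch settles on the freehand outline rather than paying for the most expensive mode.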
Second, I consider the problem of segmenting images in a weakly supervised image collection. Here, we are given a collection of images all belonging to the same object category and the goal is to jointly segment the common object from all the images. For this, I develop a stagewise active approach to segmentation propagation: in each stage, the images that appear most valuable for human annotation are actively determined and labeled by human annotators, then the foreground estimates are revised in all unlabeled images accordingly. In order to identify images that, once annotated, will propagate well to other examples, I introduce an active selection procedure that operates on the joint segmentation graph over all images. It prioritizes human intervention for those images that are uncertain and influential in the graph, while also mutually diverse. Building on this, I also introduce the problem of measuring compatibility between image pairs for joint segmentation. I show that restricting the joint segmentation to only compatible image pairs results in an improved joint segmentation performance.
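As a rough illustration of selecting images that are uncertain, influential, and mutually diverse, the sketch below greedily scores candidates and penalizes similarity to images already chosen. The scoring formula, the weight, and the names `select_images`, `uncertainty`, and `similarity` are assumptions made for illustration; the thesis's actual selection procedure is not specified here.

```python
def select_images(uncertainty, similarity, budget, diversity_weight=1.0):
    """Greedily pick `budget` image indices for human annotation.

    uncertainty: list of per-image uncertainty scores in [0, 1] (assumed given).
    similarity:  symmetric matrix (list of lists) of pairwise similarities in
                 [0, 1], standing in for edge weights of the joint graph.
    """
    n = len(uncertainty)
    # Influence proxy: total edge weight connecting an image to the others.
    influence = [sum(similarity[i][j] for j in range(n) if j != i) for i in range(n)]
    max_inf = max(influence) or 1.0
    selected = []
    while len(selected) < min(budget, n):
        best, best_score = None, float("-inf")
        for i in range(n):
            if i in selected:
                continue
            # Redundancy: similarity to the closest already-selected image.
            redundancy = max((similarity[i][s] for s in selected), default=0.0)
            score = uncertainty[i] * (influence[i] / max_inf) - diversity_weight * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return selected


# Images 0 and 1 are near-duplicates; only one of them should be chosen.
unc = [0.9, 0.85, 0.2, 0.8]
sim = [
    [1.0, 0.95, 0.1, 0.2],
    [0.95, 1.0, 0.1, 0.2],
    [0.1, 0.1, 1.0, 0.1],
    [0.2, 0.2, 0.1, 1.0],
]
picked = select_images(unc, sim, budget=2)
```

With these made-up scores, image 0 is picked first, and the diversity penalty then steers the second pick away from its near-duplicate (image 1) toward image 3.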
Finally, I propose a semi-supervised approach for segmentation propagation in video. Given human supervision in some frames of a video, this information can be propagated through time. The main challenge is that the foreground object may move quickly in the scene while its appearance and shape evolve over time. To address this, I propose a higher order supervoxel label consistency potential which leverages bottom-up supervoxels to enforce long-range temporal consistency during propagation. I also introduce the notion of a generic pixel-level objectness in images and videos by training a deep neural network which uses appearance and motion to automatically assign a score to each pixel capturing its likelihood to be an "object" or "background". I show that the human guidance in the semi-supervised propagation algorithm can be further augmented with the generic pixel-objectness scores to obtain an even more accurate foreground segmentation in videos.
Throughout, I provide extensive evaluation on challenging datasets and compare with many state-of-the-art methods and other baselines, validating the strengths of the proposed algorithms. The outcomes across several different experiments show that the proposed human-machine collaboration algorithms achieve accurate segmentation of foreground objects in images and videos while saving a large amount of human annotation effort.
Computer Science
Informative sample generation using class aware generative adversarial networks for classification of chest Xrays
Training robust deep learning (DL) systems for disease detection from medical
images is challenging due to limited images covering different disease types
and severity. The problem is especially acute where there is a severe class
imbalance. We propose an active learning (AL) framework that uses a Bayesian
neural network to select the most informative samples for training our model.
Informative samples are then used within a novel class aware generative
adversarial network (CAGAN) to generate realistic chest X-ray images for data
augmentation by transferring characteristics from one class label to another.
Experiments show our proposed AL framework is able to achieve state-of-the-art
performance by using about of the full dataset, thus saving significant
time and effort over conventional methods.
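One common way to rank samples by informativeness with an approximately Bayesian network is to run several stochastic forward passes per image and measure the entropy of the averaged prediction. The sketch below assumes that setup; the abstract does not state which acquisition criterion the authors use, and the names `predictive_entropy` and `most_informative` are hypothetical.

```python
import math


def predictive_entropy(prob_samples):
    """Entropy of the mean class distribution over several stochastic passes.

    prob_samples: list of probability vectors, one per pass (for example
    Monte Carlo dropout passes), all for the same input image.
    """
    n_classes = len(prob_samples[0])
    mean = [sum(p[c] for p in prob_samples) / len(prob_samples) for c in range(n_classes)]
    # Skip zero-probability classes, whose contribution to entropy is zero.
    return -sum(p * math.log(p) for p in mean if p > 0)


def most_informative(pool, k):
    """Return the k sample ids with the highest predictive entropy.

    pool: dict mapping sample id -> list of probability vectors.
    """
    ranked = sorted(pool, key=lambda sid: predictive_entropy(pool[sid]), reverse=True)
    return ranked[:k]


# Made-up predictions for three unlabeled images, two passes each.
pool = {
    "a": [[0.99, 0.01], [0.98, 0.02]],  # confident
    "b": [[0.6, 0.4], [0.4, 0.6]],      # passes disagree
    "c": [[0.9, 0.1], [0.8, 0.2]],      # mildly uncertain
}
chosen = most_informative(pool, k=2)
```

The selected ids would then be labeled and, in the pipeline described above, passed on to the generative augmentation step.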
Analysis of Sub-Cortical Morphology in Benign Epilepsy with Centrotemporal Spikes
ABSTRACT
In Canada, epilepsy affects approximately 5 to 8 children per 3222 aged from 2 to 37 years in the overall population. Fifteen to 47% of these children have benign epilepsy with centrotemporal spikes (BECTS), making BECTS the most common benign childhood focal epileptic syndrome. Initially, BECTS was considered benign among other epilepsies since it was generally reported that cognitive abilities were preserved or brought back to normal during remission. However, some studies have found cognitive and behavioral deficits, which may well persist even after remission. Given neurocognitive differences among children with BECTS and normal controls, the question is whether subtle morphometric variations in brain structures are also present in these patients, and whether they explain variations in cognitive indices. In fact, despite the accumulating evidence of a neurodevelopmental etiology in BECTS, little is known about underlying structural alterations. In this respect, proposing advanced neuroimaging methods will allow for quantitative assessment of variations in brain morphology associated with this neurological disorder. In addition, studying the brain morphological development and its relationship with cognition may help elucidate the neuroanatomical basis of cognitive deficits. Therefore, the focus of this thesis is to provide a set of tools for analyzing the subtle sub-cortical morphological alterations in different diseases, such as benign epilepsy with centrotemporal spikes.
The methodology adopted in this thesis led to addressing three specific research objectives. The first step develops a new automated framework for segmenting subcortical structures on MR images. The second step designs a new approach based on spectral correspondence to precisely capture shape variability in epileptic individuals. The third step finds the association between brain morphological changes and cognitive indices.
The first contribution aims more specifically at automatic segmentation of sub-cortical structures in a groupwise multi-atlas coregistration and cosegmentation process. Contrary to the standard multi-atlas segmentation approaches, the proposed method obtains the final segmentation using a population-wise registration, while Convolutional Neural Network (CNN)-based priors are incorporated in the energy formulation as a discriminative image representation. Thus, this method exploits more sophisticated learned representations to drive the coregistration process. Furthermore, given a set of target volumes, the developed method computes the segmentation probabilities individually, and then segments all the volumes simultaneously. Therefore, the burden of providing an appropriate ground truth subset to perform multi-atlas segmentation is removed. Promising results demonstrate the potential of our method on two different datasets containing annotations of sub-cortical structures. The importance of reliable label estimations is also highlighted, motivating the use of deep neural nets to replace ground truth annotations in coregistration with minimal loss in performance.
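For context, the standard multi-atlas segmentation that the paragraph above contrasts with typically warps several labeled atlases onto a target and fuses their labels voxel by voxel. The sketch below shows plain majority-vote fusion only, assuming the atlases are already registered and flattened into equal-length label lists; it is the conventional baseline, not the groupwise method proposed in the thesis, and the name `majority_vote_fusion` is hypothetical.

```python
from collections import Counter


def majority_vote_fusion(warped_labels):
    """Fuse per-voxel labels from several registered atlases by majority vote.

    warped_labels: list of label lists, one per atlas, all of the same length.
    Ties are resolved in favor of the label seen first among the atlases.
    """
    n_voxels = len(warped_labels[0])
    if any(len(labels) != n_voxels for labels in warped_labels):
        raise ValueError("all atlases must label the same number of voxels")
    fused = []
    for v in range(n_voxels):
        # Count how many atlases vote for each label at this voxel.
        votes = Counter(labels[v] for labels in warped_labels)
        fused.append(votes.most_common(1)[0][0])
    return fused


# Three toy atlases labeling four voxels (0 = background).
atlases = [
    [0, 1, 1, 2],
    [0, 1, 0, 2],
    [1, 1, 0, 0],
]
fused = majority_vote_fusion(atlases)
```

This kind of fusion depends on choosing a suitable set of labeled atlases, which is the burden the paragraph above says the proposed approach removes.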
The second contribution intends to capture shape variability between two populations of surfaces using groupwise morphological analysis. The proposed method exploits spectral representation for establishing surface correspondences, since matching is simpler in the spectral domain than in the conventional Euclidean space. The designed framework integrates mean curvature-based spectral matching into a groupwise subcortical shape analysis pipeline. Experimental analysis on a real clinical dataset showed that the extracted group differences were consistent with the findings of other clinical studies, while the shape analysis outputs were created in a computationally efficient manner.
Finally, the third contribution establishes the association between sub-cortical morphological alterations in children with benign epilepsy and cognitive indices. This study detects putamen and caudate changes in children with left, right, or bilateral BECTS relative to age- and gender-matched healthy individuals. In addition, the association of structural volumetric and shape differences with cognition is investigated. The findings confirm putamen and caudate shape alterations in children with BECTS. Also, our results suggest that variation in sub-cortical shape affects cognitive functions. More importantly, this study demonstrates that shape alterations and their relation with cognition depend on the side of epilepsy focus.
This project enabled us to investigate whether new methods would allow automatic processing of neuroimaging information from children afflicted with BECTS and detection of subtle morphological variations in their sub-cortical structures. In addition, the results obtained in this thesis allowed us to conclude that there is an association between morphological variations and cognition with respect to the side of seizure focus.
OSC-CO2: coattention and cosegmentation framework for plant state change with multiple features
Cosegmentation and coattention are extensions of traditional segmentation methods aimed at detecting a common object (or objects) in a group of images. Current cosegmentation and coattention methods are ineffective for objects, such as plants, that change their morphological state while being captured in different modalities and views. Object State Change using Coattention-Cosegmentation (OSC-CO2) is an end-to-end unsupervised deep-learning framework that enhances traditional segmentation techniques by processing, analyzing, selecting, and combining suitable segmentation results that may contain most of the target object's pixels, and then displaying a final segmented image. The framework leverages coattention-based convolutional neural networks (CNNs) and cosegmentation-based dense Conditional Random Fields (CRFs) to address segmentation accuracy in high-dimensional plant imagery with evolving plant objects. The efficacy of OSC-CO2 is demonstrated using plant growth sequences imaged with infrared, visible, and fluorescence cameras in multiple views using a remote sensing, high-throughput phenotyping platform, and is evaluated using Jaccard index and precision measures. We also introduce CosegPP+, a dataset that is structured and can provide quantitative information on the efficacy of our framework. Results show that OSC-CO2 outperformed state-of-the-art segmentation and cosegmentation methods by improving segmentation accuracy by 3% to 45%.
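The two evaluation measures named above have standard definitions for binary masks: the Jaccard index is the intersection over the union of predicted and true foreground, and precision is the fraction of predicted foreground that is correct. A minimal sketch follows, assuming masks are given as equal-sized 2-D lists of 0/1 values; the function name and the handling of empty masks are conventions chosen here, not details from the paper.

```python
def jaccard_and_precision(pred, truth):
    """Compute the Jaccard index and precision for two binary masks.

    pred, truth: equal-sized 2-D lists of 0/1 values (1 = foreground).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1   # foreground predicted and present
            elif p:
                fp += 1   # foreground predicted but absent
            elif t:
                fn += 1   # foreground present but missed
    union = tp + fp + fn
    # Convention assumed here: two empty masks count as perfect overlap.
    jaccard = tp / union if union else 1.0
    # Convention assumed here: no predicted foreground gives precision 0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return jaccard, precision


pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
jaccard, precision = jaccard_and_precision(pred, truth)
```

In this toy case two pixels are correct, one is a false positive, and one is missed, giving a Jaccard index of 2/4 and a precision of 2/3.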