
    Dynamically balanced online random forests for interactive scribble-based segmentation

    Interactive scribble-and-learning-based segmentation is attractive for its good performance and reduced number of user interactions. Scribbles for foreground and background are often imbalanced, and with the arrival of new scribbles the imbalance ratio may change considerably. Failing to deal with imbalanced training data and a changing imbalance ratio may reduce segmentation sensitivity and accuracy. We propose a generic Dynamically Balanced Online Random Forest (DyBa ORF) to deal with these problems, combining a dynamically balanced online Bagging method with a tree growing and shrinking strategy to update the random forests. We validated DyBa ORF on UCI machine learning data sets and applied it to two different clinical applications: 2D segmentation of the placenta from fetal MRI and of adult lungs from radiographic images. Experiments show that it outperforms traditional ORF in dealing with imbalanced data with a changing imbalance ratio, while maintaining comparable accuracy and higher efficiency than its offline counterpart. Our results demonstrate that DyBa ORF is more suitable than existing ORF for learning-based interactive image segmentation.
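    The balancing idea behind dynamically balanced online Bagging can be illustrated in a few lines: each incoming sample is shown to every tree a Poisson-distributed number of times, and scaling the Poisson rate by the inverse frequency of the sample's class counteracts imbalance as new scribbles arrive. This is a minimal illustrative sketch, not the authors' DyBa ORF implementation; the helper names and the inverse-frequency rule are assumptions.

```python
import math
import random
from collections import Counter

def poisson_draw(lam, rng):
    # Sample from Poisson(lam) with Knuth's method (assumes lam > 0).
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def balanced_poisson_rate(labels, new_label, base_lambda=1.0, rng=None):
    """Return a class-balanced Poisson rate for an incoming sample and the
    number of times one tree would train on it. Hypothetical helper: the
    rate is base_lambda scaled by the inverse frequency of the class."""
    rng = rng or random.Random(0)
    counts = Counter(labels)
    counts[new_label] += 1  # include the incoming sample itself
    total = sum(counts.values())
    # rarer classes get a higher rate, so trees see them more often
    rate = base_lambda * total / (len(counts) * counts[new_label])
    return rate, poisson_draw(rate, rng)
```

With 90 background and 10 foreground scribble pixels seen so far, a new foreground sample gets a rate above 1 (replicated more often) and a new background sample a rate below 1, so each tree's effective training set stays roughly balanced as the imbalance ratio drifts.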

    ECONet: Efficient Convolutional Online Likelihood Network for Scribble-based Interactive Segmentation

    Automatic segmentation of lung lesions associated with COVID-19 in CT images requires a large amount of annotated volumes. Annotations mandate expert knowledge and are time-intensive to obtain through fully manual segmentation methods. Additionally, lung lesions have large inter-patient variations, with some pathologies having a visual appearance similar to healthy lung tissue. This poses a challenge when applying existing semi-automatic interactive segmentation techniques to data labelling. To address these challenges, we propose an efficient convolutional neural network (CNN) that can be learned online while the annotator provides scribble-based interaction. To accelerate learning from only the samples labelled through user interactions, a patch-based approach is used for training the network. Moreover, we use a weighted cross-entropy loss to address the class imbalance that may result from user interactions. During online inference, the learned network is applied to the whole input volume using a fully convolutional approach. We compare our proposed method with the state of the art using synthetic scribbles and show that it outperforms existing methods on the task of annotating lung lesions associated with COVID-19, achieving a 16% higher Dice score while reducing execution time by 3× and requiring 9,000 fewer scribble-labelled voxels. Due to the online learning aspect, our approach adapts quickly to user input, resulting in high-quality segmentation labels. Source code for ECONet is available at: https://github.com/masadcv/ECONet-MONAILabel. Comment: Accepted at MIDL 202
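    The weighted cross-entropy idea can be stated compactly: each labelled voxel's log-loss is scaled by a per-class weight (for instance the inverse label frequency), so a handful of minority-class scribbles is not drowned out by the majority class. A sketch under assumptions; ECONet's exact loss and weighting scheme may differ:

```python
import numpy as np

def weighted_cross_entropy(probs, targets, class_weights):
    """Weighted cross-entropy over scribble-labelled samples.
    probs: (N, C) softmax outputs; targets: (N,) integer labels;
    class_weights: (C,) per-class weights, e.g. inverse label frequency."""
    eps = 1e-12  # guard against log(0)
    n = probs.shape[0]
    picked = probs[np.arange(n), targets]  # probability of the true class
    w = class_weights[targets]             # weight attached to each sample
    return float(-(w * np.log(picked + eps)).sum() / w.sum())
```

Raising the weight of a class on which the network is less confident raises the loss, which is exactly the pressure against class imbalance that the abstract describes.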

    Minimally Interactive Segmentation with Application to Human Placenta in Fetal MR Images

    Placenta segmentation from fetal Magnetic Resonance (MR) images is important for fetal surgical planning. However, accurate segmentation results are difficult to achieve with automatic methods, due to sparse acquisition, inter-slice motion, and the widely varying position and shape of the placenta among pregnant women. Interactive methods have been widely used to obtain more accurate and robust results. A good interactive segmentation method should achieve high accuracy, minimize user interactions with low variability among users, and be computationally fast. Exploiting recent advances in machine learning, I explore a family of new interactive methods for placenta segmentation from fetal MR images. I investigate the combination of user interactions with learning from a single image or from a large set of images. For learning from a single image, I propose novel Online Random Forests to efficiently leverage user interactions for the segmentation of 2D and 3D fetal MR images. I also investigate co-segmentation of multiple volumes of the same patient with 4D Graph Cuts. For learning from a large set of images, I first propose a deep learning-based framework that combines user interactions with Convolutional Neural Networks (CNNs) based on geodesic distance transforms to achieve accurate segmentation and good interactivity. I then propose image-specific fine-tuning to make CNNs adaptive to different individual images and able to segment previously unseen objects. Experimental results show that the proposed algorithms outperform traditional interactive segmentation methods in terms of accuracy and interactivity. Therefore, they might be suitable for segmentation of the placenta in planning systems for fetal and maternal surgery, and for rapid characterization of the placenta from MR images. I also demonstrate that they can be applied to the segmentation of other organs from 2D and 3D images.
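    The geodesic distance transform that couples the user interactions to the CNN can be sketched as a Dijkstra pass over the image grid: the distance from the scribble seeds grows with both the spatial step and the intensity change along the path, so it stays small inside homogeneous regions. A minimal 2D sketch of the idea, not the thesis implementation (the particular cost mix is an assumption):

```python
import heapq

def geodesic_distance(image, seeds):
    """Geodesic distance from scribble seeds on a 2D intensity grid.
    Dijkstra over 4-neighbours; each step costs 1 plus the absolute
    intensity difference, so paths prefer homogeneous regions."""
    h, w = len(image), len(image[0])
    INF = float("inf")
    dist = [[INF] * w for _ in range(h)]
    pq = []
    for r, c in seeds:
        dist[r][c] = 0.0
        heapq.heappush(pq, (0.0, r, c))
    while pq:
        d, r, c = heapq.heappop(pq)
        if d > dist[r][c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w:
                step = 1.0 + abs(image[nr][nc] - image[r][c])
                if d + step < dist[nr][nc]:
                    dist[nr][nc] = d + step
                    heapq.heappush(pq, (d + step, nr, nc))
    return dist
```

Crossing a strong intensity edge costs far more than moving within a flat region, which is what makes the transform useful for propagating scribble labels to similar neighbouring tissue.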

    Knee cartilage segmentation using multi purpose interactive approach

    Interactive models incorporate expert interpretation and automated segmentation. However, cartilage has a complicated structure, the indistinct tissue contrast in magnetic resonance images of the knee complicates image review, and existing interactive methods are sensitive to various technical problems, such as the bi-label segmentation problem, the shortcut problem, and image noise. Moreover, the redundancy issue caused by non-cartilage labelling has never been tackled. Therefore, Bi-Bezier Curve Contrast Enhancement is developed to improve the visual quality of magnetic resonance images by considering brightness preservation and contrast enhancement control. Then, a Multipurpose Interactive Tool is developed to handle users’ interaction through a Label Insertion Point approach. An Approximate Non-Cartilage Labelling system is developed to generate computerized non-cartilage labels, while preserving cartilage for expert labelling. Both computerized and interactive labels initialize a Random Walks based segmentation model. To evaluate the contrast enhancement techniques, the Measure of Enhancement (EME), Absolute Mean Brightness Error (AMBE), and Feature Similarity Index (FSIM) are used. The results suggest that Bi-Bezier Curve Contrast Enhancement outperforms existing methods in terms of contrast enhancement control (EME = 41.44±1.06), brightness distortion (AMBE = 14.02±1.29), and image quality (FSIM = 0.92±0.02). Besides, implementation of the Approximate Non-Cartilage Labelling model has demonstrated significant efficiency improvements in segmenting normal cartilage (61s±8s, P = 3.52 × 10⁻⁵) and diseased cartilage (56s±16s, P = 1.4 × 10⁻⁴). Finally, the proposed labelling model has high Dice values (normal: 0.94±0.022, P = 1.03 × 10⁻⁹; abnormal: 0.92±0.051, P = 4.94 × 10⁻⁶) and is found to be beneficial to the interactive model (+0.12).
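    Of the three evaluation metrics, AMBE is the simplest to state: the absolute difference between the mean brightness of the original and the enhanced image, so lower values indicate better brightness preservation. A minimal sketch of that definition (grayscale images as 2D lists are an assumption of this illustration):

```python
def ambe(original, enhanced):
    """Absolute Mean Brightness Error between two equally sized
    grayscale images: |mean(original) - mean(enhanced)|."""
    def mean(img):
        pixels = [p for row in img for p in row]
        return sum(pixels) / len(pixels)
    return abs(mean(original) - mean(enhanced))
```

An enhancement that brightens every pixel by a constant shifts the mean by that constant and scores poorly, while one that redistributes contrast around the same mean scores near zero.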

    Generalizations of the Multicut Problem for Computer Vision

    Graph decomposition has always been a very important concept in machine learning and computer vision. Many tasks, such as image and mesh segmentation, community detection in social networks, object tracking, and human pose estimation, can be formulated as graph decomposition problems. The multicut problem in particular is a popular model for optimizing over the decompositions of a given graph. Its main advantage is that no prior knowledge about the number of components or their sizes is required. However, it has several limitations, which we address in this thesis. First, the multicut problem only allows specifying a cost or reward for putting two direct neighbours into distinct components, which limits the expressiveness of the cost function. We introduce special edges into the graph that allow defining a cost or reward for putting any two vertices into distinct components, while preserving the original set of feasible solutions. We show that this considerably improves the quality of image and mesh segmentations. Second, the multicut problem is notoriously NP-hard for general graphs, which limits its application to small super-pixel graphs. We define and implement two primal feasible heuristics to solve the problem. They do not provide any guarantees on the runtime or the quality of solutions, but in practice show good convergence behaviour. We perform an extensive comparison on multiple graphs of different sizes and properties. Third, we extend the multicut framework by introducing node labels, so that we can jointly optimize for graph decomposition and node classification by means of exactly the same optimization algorithm, thus eliminating the need to hand-tune optimizers for a particular task. To prove its universality, we applied it to diverse computer vision tasks, including human pose estimation, multiple object tracking, and instance-aware semantic segmentation.
We show that we can improve the results over the prior art using exactly the same data as in the original works. Finally, we employ multicuts in two applications: 1) a client-server tool for interactive video segmentation, where, after pre-processing of the video, a user draws strokes on several frames and a time-coherent segmentation of the entire video is performed on the fly; and 2) a method for simultaneous segmentation and tracking of living cells in microscopy data. This task is challenging as cells split, and our algorithm accounts for this by creating parental hierarchies. We also present results on multiple model fitting: we find models in data heavily corrupted by noise by using higher-order multicuts to find the components that define these models. We introduce an interesting extension that allows our optimization to pick better hyperparameters for each discovered model. In summary, this thesis extends the multicut problem in different directions, proposes algorithms for its optimization, and applies it to novel data and settings.
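    The multicut objective the thesis builds on can be stated compactly: given real-valued edge costs, the cost of a decomposition is the total cost of the edges whose endpoints land in different components; negative costs reward cutting, positive costs reward joining, and no prior on the number of components is needed. A toy evaluator (not an optimizer) makes that concrete:

```python
def multicut_cost(edges, component):
    """Multicut objective for a given graph decomposition.
    edges: iterable of (u, v, cost); component: dict node -> component id.
    An edge contributes its cost exactly when it is cut, i.e. when its
    endpoints lie in different components."""
    return sum(cost for u, v, cost in edges if component[u] != component[v])
```

For a triangle with costs a-b = -2, b-c = 3, and a-c = 1, separating a from {b, c} cuts the a-b and a-c edges for a total of -1, which beats leaving the graph in one piece (cost 0).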

    Analysis and Synthesis of Interactive Video Sprites

    In this thesis, we explore how video, an extremely compelling medium that is traditionally consumed passively, can be transformed into interactive experiences, and what is preventing content creators from using it for this purpose. Film captures extremely rich and dynamic information but, due to the sheer amount of data and the drastic change in content appearance over time, is problematic to work with. Content creators are willing to invest time and effort to design and capture video, so why not to manipulate and interact with it? We hypothesize that people can help and be helped by automatic video processing and synthesis algorithms when they are given the right tools. Computer games are a very popular interactive medium where players engage with dynamic content in compelling and intuitive ways. The first contribution of this thesis is an in-depth exploration of the modes of interaction that enable game-like video experiences. Through active discussions with game developers, we identify both how to assist content creators and how their creations can be dynamically interacted with by players. We present concepts, explore algorithms, and design tools that together enable interactive video experiences. Our findings concerning processing videos and interacting with filmed content come together in this thesis' second major contribution. We present a new medium of expression where video elements can be looped, merged, and triggered interactively. Static-camera videos are converted into loopable sequences that can be controlled in real time in response to simple end-user requests. We present novel algorithms and interactive tools that enable our new medium of expression. Our human-in-the-loop system gives the user progressively more creative control over the video content as they invest more effort, and artists help us evaluate it.
Monocular, static-camera videos are a good fit for looping algorithms, but they have been limited to two-dimensional applications, as pixels are reshuffled in space and time on the image plane. The final contribution of this thesis breaks through this barrier by allowing users to interact with filmed objects in a three-dimensional manner. Our novel object tracking algorithm extends existing 2D bounding box trackers with 3D information, such as a well-fitting bounding volume, which in turn enables a new breed of interactive video experiences. The filmed content becomes a three-dimensional playground, as users are free to move the virtual camera or the tracked objects and see them from novel viewpoints.

    Interactive computer vision through the Web

    Computer vision is the computational science aiming at reproducing and improving the ability of human vision to understand its environment. In this thesis, we focus on two fields of computer vision, namely image segmentation and visual odometry, and we show the positive impact that interactive Web applications provide on each. The first part of this thesis focuses on image annotation and segmentation. We introduce the image annotation problem and the challenges it brings for large, crowdsourced datasets. Many interactions have been explored in the literature to help segmentation algorithms. The most common consist in designating contours, bounding boxes around objects, or interior and exterior scribbles. When crowdsourcing, annotation tasks are delegated to a non-expert public, sometimes on cheaper devices such as tablets. In this context, we conducted a user study showing the advantages of the outlining interaction over scribbles and bounding boxes. Another challenge of crowdsourcing is the distribution medium. While evaluating an interaction in a small user study does not require a complex setup, distributing an annotation campaign to thousands of potential users is a different matter. Thus, we describe how the Elm programming language helped us build a reliable image annotation Web application. A highlights tour of its functionalities and architecture is provided, as well as a guide on how to deploy it to crowdsourcing services such as Amazon Mechanical Turk. The application is completely open-source and available online. In the second part of this thesis, we present our open-source direct visual odometry library. In that endeavor, we provide an evaluation of other open-source RGB-D camera tracking algorithms and show that our approach performs as well as the currently available alternatives. The visual odometry problem relies on geometry tools and optimization techniques traditionally requiring much processing power to perform at real-time frame rates.
Since we aspire to run those algorithms directly in the browser, we review past and present technologies enabling high-performance computation on the Web. In particular, we detail how to target the new WebAssembly standard from the C++ and Rust programming languages. Our library was started from scratch in the Rust programming language, which then allowed us to easily port it to WebAssembly. Thanks to this property, we are able to showcase a visual odometry Web application with multiple types of interaction available. A timeline enables one-dimensional navigation along the video sequence. Pairs of image points can be picked on two 2D thumbnails of the image sequence to realign cameras and correct drift. Colors are also used to identify parts of the 3D point cloud, which can be selected to reinitialize camera positions. Combining these interactions enables improvements in the tracking and 3D point reconstruction results.

    Segmentation mutuelle d'objets d'intĂ©rĂȘt dans des sĂ©quences d'images stĂ©rĂ©o multispectrales

    The automated video surveillance systems currently deployed around the world are still quite far, in terms of capabilities, from the ones that have inspired countless science fiction works over the past few years. One of the reasons behind this lag in development is the lack of low-level tools that allow raw image data to be processed directly in the field. This preprocessing is used to reduce the amount of information transferred to centralized servers that have to interpret the captured visual content for further use. The identification of objects of interest in raw images based on motion is an example of a preprocessing step that might be required by a large system. However, in a surveillance context, the preprocessing method can seldom rely on an appearance or shape model to recognize these objects, since their exact nature cannot be known in advance. This complicates the elaboration of low-level image processing methods. In this thesis, we present different methods that detect and segment objects of interest from video sequences in a fully unsupervised fashion. We first explore monocular video segmentation approaches based on background subtraction. These approaches are based on the idea that the background of an observed scene can be modeled over time, and that any drastic variation in appearance that is not predicted by the model actually reveals the presence of an intruding object. The main challenge that must be met by background subtraction methods is that their model should be able to adapt to dynamic changes in scene conditions. The designed methods must also remain sensitive to the emergence of new objects of interest despite this increased robustness to predictable dynamic scene behaviors. We propose two methods that introduce different modeling techniques to improve background appearance description in an illumination-invariant way, and that analyze local background persistence to improve the detection of temporarily stationary objects. We also introduce new feedback mechanisms used to adjust the hyperparameters of our methods based on the observed dynamics of the scene and the quality of the generated output.
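    The background-subtraction principle described above can be sketched as a running-average model: each pixel whose deviation from the model exceeds a threshold is flagged as foreground, and the model then adapts toward the new frame so it follows gradual scene changes. A minimal single-channel sketch; the thesis methods use far richer models and feedback-controlled hyperparameters:

```python
def background_subtract(model, frame, alpha=0.05, threshold=30):
    """One step of running-average background subtraction.
    model, frame: 2D lists of gray levels. Returns (foreground mask,
    updated model). alpha controls how fast the model adapts; threshold
    sets the deviation at which a pixel is declared foreground."""
    mask, new_model = [], []
    for m_row, f_row in zip(model, frame):
        # a pixel is foreground when it deviates strongly from the model
        mask.append([abs(f - m) > threshold for m, f in zip(m_row, f_row)])
        # the model then drifts toward the observed frame
        new_model.append([(1 - alpha) * m + alpha * f
                          for m, f in zip(m_row, f_row)])
    return mask, new_model
```

The tension the abstract describes is visible even in this sketch: a large alpha quickly absorbs intruding objects into the background (losing temporarily stationary objects), while a small alpha reacts poorly to illumination changes, hence the feedback mechanisms proposed in the thesis.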

    Ecosystemic Evolution Feeded by Smart Systems

    Information Society is advancing along a route of ecosystemic evolution. ICT and Internet advancements, together with the progression of the systemic approach to the enhancement and application of Smart Systems, are grounding such an evolution. The needed approach is therefore expected to evolve by increasingly fitting the basic requirements of a significant general enhancement of human and social well-being, within all spheres of life (public, private, professional). This implies enhancing and exploiting the net-living virtual space, to make it a virtuous, beneficial integration of the real-life space. Meanwhile, the contextual evolution of smart cities aims at strongly empowering that ecosystemic approach by enhancing and diffusing net-living benefits over our own lived territory, while also incisively targeting a new, stable socio-economic local development, according to social, ecological, and economic sustainability requirements. This territorial focus matches a new glocal vision, which enables a more effective diffusion of benefits in terms of well-being, thus moderating the current global vision primarily fed by a global-scale market development view. Basic technological advancements thus have to be pursued at the system level. They include system architecting for the virtualization of functions, data integration and sharing, flexible basic service composition, and end-service personalization viability, for the operation and interoperation of smart systems, supporting effective net-living advancements in all application fields. Increasing, and essentially mandatory, importance must also be given to human–technical and social–technical factors, as well as to the associated need to empower the cross-disciplinary approach for related research and innovation.
The prospected ecosystemic impact also implies proactive social participation, as well as coping with possible negative effects of net-living in terms of social exclusion and isolation, which require incisive actions for a conformal socio-cultural development. In this regard, the speed, continuity, and expected long-term duration of innovation processes, pushed by basic technological advancements, make ecosystemic requirements stricter. This evolution also requires a new approach, targeting development of the needed basic and vocational education for net-living, which is to be considered as an engine for the development of the related ‘new living know-how’, as well as of the conformal ‘new making know-how’.