Methodology for extensive evaluation of semiautomatic and interactive segmentation algorithms using simulated Interaction models

Author: Haque S M Rafizul
Publication venue: 'University of Saskatchewan Library'
Publication date: 21/09/2016

The performance of semiautomatic and interactive segmentation (SIS) algorithms is usually evaluated by employing a small number of human operators to segment the images. The human operators typically provide the approximate location of objects of interest and their boundaries in an interactive phase, which is followed by an automatic phase where the segmentation is performed under the constraints of the operator-provided guidance. The segmentation results produced from this small set of interactions do not represent the true capability and potential of the algorithm being evaluated. For example, due to inter-operator variability, human operators may make choices that yield either overestimated or underestimated results. Their choices may also not be realistic when compared to how the algorithm is used in the field, since interaction may be influenced by operator fatigue and lapses in judgement. Other drawbacks of using human operators to assess SIS algorithms include human error, the lack of available expert users, and the expense. A methodology for evaluating segmentation performance is proposed here which uses simulated interaction models to programmatically generate large numbers of interactions, ensuring the presence of interactions throughout the object region. These interactions are used to segment the objects of interest, and the resulting segmentations are then analysed using statistical methods. The large number of interactions generated by simulated interaction models captures the variability in the set of user interactions by treating every pixel inside the object region as a potential interaction location with equal probability.
Because of the enormous amount of computation required for the enormous number of possible interactions, uniform sampling of interactions at regular intervals is used to generate a subset of all possible interactions that can still represent the diverse pattern of the entire set. Categorization of interactions into different groups, based on the position of the interaction inside the object region and the texture properties of the image region where the interaction is located, provides the opportunity for fine-grained analysis of algorithm performance based on these two criteria. Application of statistical hypothesis testing makes the analysis more accurate, scientific and reliable in comparison to conventional evaluation of semiautomatic segmentation algorithms. The proposed methodology has been demonstrated by two case studies through implementation of seven different algorithms using three different types of interaction modes, making a total of nine segmentation applications to assess the efficacy of the methodology. Application of this methodology has revealed in-depth, fine details about the performance of the segmentation algorithms which existing methods could not achieve due to the absence of a large, unbiased set of interactions. Practical application of the methodology to a number of algorithms and diverse interaction modes has shown its feasibility and generality. Development of this methodology into an application for automatic evaluation of the performance of SIS algorithms looks very promising for users of image segmentation.
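
The uniform sampling of interaction locations described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `sample_interactions`, the `step` parameter, and the toy binary mask are assumptions made for illustration, and an interaction is reduced here to a single seed pixel.

```python
def sample_interactions(mask, step):
    """Return (row, col) seed points taken every `step` pixels inside the object.

    `mask` is a list of rows; a truthy value marks an object pixel.
    With step=1 every object pixel becomes a candidate interaction.
    """
    points = []
    for r in range(0, len(mask), step):
        for c in range(0, len(mask[r]), step):
            if mask[r][c]:
                points.append((r, c))
    return points


# Hypothetical 6x6 binary mask: 1 marks object pixels.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

all_points = sample_interactions(mask, 1)  # every object pixel
subset = sample_interactions(mask, 2)      # regular grid with spacing 2
```

Each sampled point would then be passed to the segmentation algorithm as one simulated interaction, and the resulting segmentations compared statistically.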

eCommons@USASK

University of Saskatchewan Research Archive

Combining Shape and Learning for Medical Image Analysis

Author: Alvén Jennifer
Publication date: 01/01/2020

Automatic methods with the ability to make accurate, fast and robust assessments of medical images are in high demand in medical research and clinical care. Excellent automatic algorithms are characterized by speed, allowing for scalability, and an accuracy comparable to an expert radiologist. They should produce morphologically and physiologically plausible results while generalizing well to unseen and rare anatomies. Still, there are few, if any, applications where today's automatic methods succeed in meeting these requirements. The focus of this thesis is two tasks essential for enabling automatic medical image assessment: medical image segmentation and medical image registration. Medical image registration, i.e. aligning two separate medical images, is used as an important sub-routine in many image analysis tools as well as in image fusion, disease progress tracking and population statistics. Medical image segmentation, i.e. delineating anatomically or physiologically meaningful boundaries, is used for both diagnostic and visualization purposes in a wide range of applications, e.g. in computer-aided diagnosis and surgery. The thesis comprises five papers addressing medical image registration and/or segmentation for a diverse set of applications and modalities, i.e. pericardium segmentation in cardiac CTA, brain region parcellation in MRI, multi-organ segmentation in CT, heart ventricle segmentation in cardiac ultrasound and tau PET registration. The five papers propose competitive registration and segmentation methods enabled by machine learning techniques, e.g. random decision forests and convolutional neural networks, as well as by shape modelling, e.g. multi-atlas segmentation and conditional random fields.

Chalmers Research

Crowdsourcing in Computer Vision

Author: Fei-Fei Li
Grauman Kristen
Kovashka Adriana
Russakovsky Olga
Publication venue: 'Now Publishers'
Publication date: 01/01/2016

Computer vision systems require large amounts of manually annotated data to properly learn challenging visual concepts. Crowdsourcing platforms offer an inexpensive method to capture human knowledge and understanding for a vast number of visual perception tasks. In this survey, we describe the types of annotations computer vision researchers have collected using crowdsourcing, and how they have ensured that this data is of high quality while annotation effort is minimized. We begin by discussing data collection on both classic (e.g., object recognition) and recent (e.g., visual story-telling) vision tasks. We then summarize key design decisions for creating effective data collection interfaces and workflows, and present strategies for intelligently selecting the most important data instances to annotate. Finally, we conclude with some thoughts on the future of crowdsourcing in computer vision. Comment: A 69-page meta review of the field, Foundations and Trends in Computer Graphics and Vision, 201

arXiv.org e-Print Archive

Crossref

Doctor of Philosophy

Author: Widanagamaachchi Wathsala
Publication venue: University of Utah
Publication date: 01/01/2017

A broad range of applications capture dynamic data at an unprecedented scale. Independent of the application area, finding intuitive ways to understand the dynamic aspects of these increasingly large data sets remains an interesting and, to some extent, unsolved research problem. Generically, dynamic data sets can be described by some, often hierarchical, notion of a feature of interest that exists at each moment in time, and those features evolve across time. Consequently, exploring the evolution of these features is considered to be one natural way of studying these data sets. Usually, this process entails the ability to: 1) define and extract features from each time step in the data set; 2) find their correspondences over time; and 3) analyze their evolution across time. However, due to the large data sizes, visualizing the evolution of features in a comprehensible manner and performing interactive changes are challenging. Furthermore, feature evolution details are often unmanageably large and complex, making it difficult to identify the temporal trends in the underlying data. Additionally, many existing approaches develop these components in a specialized and standalone manner, thus failing to address the general task of understanding feature evolution across time. This dissertation demonstrates that interactive exploration of feature evolution can be achieved in a non-domain-specific manner so that it can be applied across a wide variety of application domains. In particular, a novel generic visualization and analysis environment is introduced that couples a multiresolution unified spatiotemporal representation of features with progressive layout and visualization strategies for studying feature evolution across time. This flexible framework enables on-the-fly changes to feature definitions, their correspondences, and other arbitrary attributes while providing an interactive view of the resulting feature evolution details.
Furthermore, to reduce the visual complexity within the feature evolution details, several subselection-based and localized, per-feature parameter value-based strategies are also enabled. The utility and generality of this framework are demonstrated by using several large-scale dynamic data sets.

The University of Utah: J. Willard Marriott Digital Library

Two and three dimensional segmentation of multimodal imagery

Author: Vantaram Sreenath Rao
Publication venue: RIT Scholar Works
Publication date: 01/10/2012

The role of segmentation in image understanding and analysis, computer vision, pattern recognition, remote sensing and medical imaging has grown significantly in recent years due to accelerated scientific advances in the acquisition of image data. This low-level analysis protocol is critical to numerous applications, with the primary goal of expediting and improving the effectiveness of subsequent high-level operations by providing a condensed and pertinent representation of image information. In this research, we propose a novel unsupervised segmentation framework for facilitating meaningful segregation of 2-D/3-D image data across multiple modalities (color, remote-sensing and biomedical imaging) into non-overlapping partitions using several spatial-spectral attributes. Initially, our framework exploits the information obtained from detecting edges inherent in the data. To this effect, by using a vector gradient detection technique, pixels without edges are grouped and individually labeled to partition some initial portion of the input image content. Pixels that contain higher gradient densities are included by the dynamic generation of segments as the algorithm progresses to generate an initial region map. Subsequently, texture modeling is performed, and the obtained gradient, texture and intensity information, along with the initial partition map, is used to perform a multivariate refinement procedure that fuses groups with similar characteristics, yielding the final output segmentation. Experimental results obtained in comparison to published, state-of-the-art segmentation techniques for color as well as multi/hyperspectral imagery demonstrate the advantages of the proposed method. Furthermore, to achieve improved computational efficiency, we propose an extension of this methodology in a multi-resolution framework, demonstrated on color images.
Finally, this research also encompasses a 3-D extension of the aforementioned algorithm demonstrated on medical (Magnetic Resonance Imaging / Computed Tomography) volumes
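
The initial step in this abstract, grouping and labeling pixels without edges, can be sketched roughly as follows. This is only an illustration under stated assumptions: a precomputed scalar gradient map stands in for the vector gradient technique the abstract mentions, the threshold and the 4-connected flood fill are illustrative choices, and `label_low_gradient` is a hypothetical name.

```python
from collections import deque


def label_low_gradient(grad, threshold):
    """Label 4-connected groups of pixels whose gradient is below `threshold`.

    Returns (labels, count); label 0 marks high-gradient (edge) pixels that
    this sketch leaves unassigned.
    """
    rows, cols = len(grad), len(grad[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grad[r][c] >= threshold or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not labels[ny][nx]
                            and grad[ny][nx] < threshold):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Hypothetical gradient map: a strong vertical edge in the third column.
grad = [
    [0, 0, 9, 0],
    [0, 0, 9, 0],
    [0, 0, 9, 0],
]
labels, count = label_low_gradient(grad, 5)
```

In the described framework, the pixels left unlabeled here would be absorbed into segments in later stages; that part is not shown.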

RIT Scholar Works

Adaptive video delivery using semantics

Author: Steiger Olivier
Publication venue: Lausanne, EPFL
Publication date: 16/03/2005

The diffusion of network appliances such as cellular phones, personal digital assistants and hand-held computers has created the need to personalize the way media content is delivered to the end user. Moreover, recent devices, such as digital radio receivers with graphics displays, and new applications, such as intelligent visual surveillance, require novel forms of video analysis for content adaptation and summarization. To cope with these challenges, we propose an automatic method for the extraction of semantics from video, and we present a framework that exploits these semantics in order to provide adaptive video delivery. First, an algorithm that relies on motion information to extract multiple semantic video objects is proposed. The algorithm operates in two stages. In the first stage, a statistical change detector produces the segmentation of moving objects from the background. This process is robust with regard to camera noise and does not need manual tuning along a sequence or for different sequences. In the second stage, feedback between an object partition and a region partition is used to track individual objects along the frames. These interactions allow us to cope with multiple, deformable objects, occlusions, splitting, appearance and disappearance of objects, and complex motion. Subsequently, semantics are used to prioritize visual data in order to improve the performance of adaptive video delivery. The idea behind this approach is to organize the content so that a particular network or device does not inhibit the main content message. Specifically, we propose two new video adaptation strategies. The first strategy combines semantic analysis with a traditional frame-based video encoder. Background simplifications resulting from this approach do not penalize overall quality at low bitrates. The second strategy uses metadata to efficiently encode the main content message.
The metadata-based representation of the objects' shape and motion suffices to convey the meaning and action of a scene when the objects are familiar. The impact of different video adaptation strategies is then quantified with subjective experiments. We ask a panel of human observers to rate the quality of adapted video sequences on a normalized scale. From these results, we further derive an objective quality metric, the semantic peak signal-to-noise ratio (SPSNR), that accounts for different image areas and for their relevance to the observer in order to reflect the focus of attention of the human visual system. Finally, we determine the adaptation strategy that provides maximum value for the end user by maximizing the SPSNR for given client resources at the time of delivery. By combining semantic video analysis and adaptive delivery, the solution presented in this dissertation permits the distribution of video in complex media environments and supports a large variety of content-based applications.

Infoscience - École polytechnique fédérale de Lausanne

Deep Learning for Free-Hand Sketch: A Survey

Author: Hospedales Timothy M.
Song Yi-Zhe
Wang Liang
Xiang Tao
Xu Peng
Yin Qiyue
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2022

Free-hand sketches are highly illustrative, and have been widely used by humans to depict objects or stories from ancient times to the present. The recent prevalence of touchscreen devices has made sketch creation a much easier task than ever and consequently made sketch-oriented applications increasingly popular. The progress of deep learning has immensely benefited free-hand sketch research and applications. This paper presents a comprehensive survey of the deep learning techniques oriented at free-hand sketch data, and the applications that they enable. The main contents of this survey include: (i) A discussion of the intrinsic traits and unique challenges of free-hand sketch, to highlight the essential differences between sketch data and other data modalities, e.g., natural photos. (ii) A review of the developments of free-hand sketch research in the deep learning era, by surveying existing datasets, research topics, and the state-of-the-art methods through a detailed taxonomy and experimental evaluation. (iii) Promotion of future work via a discussion of bottlenecks, open problems, and potential research directions for the community. Comment: This paper is accepted by IEEE TPAM

arXiv.org e-Print Archive

Edinburgh Research Explorer

DR-NTU (Digital Repository of NTU)