190 research outputs found

    Content-driven superpixels and their applications

    This thesis develops a new superpixel algorithm, Content-Driven Superpixels (CDS), that gives excellent visual reconstruction of the original image. It achieves high stability across multiple random initialisations by producing superpixels that correspond directly to local image complexity: superpixels are grown and then divided according to image variation. Existing analysis was not sufficient to take these properties into account, so new measures of oversegmentation are introduced, providing new insight into the optimum superpixel representation. As a consequence of the algorithm's design, CDS has properties that have eluded previous attempts, such as initialisation invariance and stability. The completely unsupervised nature of CDS makes it highly suitable for tasks such as application to a database containing images of unknown complexity. These new superpixel properties have enabled new applications of superpixel pre-processing: image segmentation, image compression, scene classification, and focus detection. In addition, a new method of objectively analysing regions of focus has been developed using light-field photography.

    Analyzing pulmonary abnormality with superpixel-based graph neural networks in chest X-ray

    In recent years, graph-based deep learning has gained prominence, yet its potential in medical diagnosis remains relatively unexplored. Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance in areas such as computer vision, particularly for grid-like data such as images. However, they require a huge dataset to reach top-level performance, and a challenge arises when learning from the inherently irregular, unordered nature of physiological data. This thesis focuses primarily on abnormality screening: classifying a Chest X-Ray (CXR) as tuberculosis positive or negative using Graph Neural Networks (GNNs) that operate on Region Adjacency Graphs (RAGs), in which each superpixel serves as a dedicated graph node. For graph classification, provided that the classes are distinct enough, GNNs can often classify graphs using the graph structure alone. This study asks whether incorporating node features, such as coordinate points and pixel intensity, along with the structured data representing the graph can enhance the learning process. By integrating residual and concatenation structures, the methodology captures essential features and relationships among superpixels, thereby contributing to advances in tuberculosis identification. The best performance achieved was an accuracy of 0.80 and an AUC of 0.79, through the union of state-of-the-art neural network architectures and innovative graph-based representations. This work introduces a new perspective on medical image analysis.
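
To make the graph construction in the abstract above concrete, here is a minimal sketch of turning a superpixel label map into a region adjacency graph whose nodes carry centroid coordinates and mean intensity. It is an illustration under stated assumptions, not the thesis's implementation: the function name `build_rag`, the use of 4-connectivity, and the choice of centroid and mean intensity as node features are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical types for illustration: one feature triple per superpixel,
# and undirected edges stored as (smaller_label, larger_label).
Features = Dict[int, Tuple[float, float, float]]
Edges = Set[Tuple[int, int]]


def build_rag(labels: List[List[int]],
              image: List[List[float]]) -> Tuple[Features, Edges]:
    """Build a region adjacency graph from a superpixel label map.

    Each node feature is (mean row, mean column, mean intensity).
    Two superpixels are linked if they share a 4-connected pixel border
    (an assumption made for this sketch).
    """
    height, width = len(labels), len(labels[0])
    sums: Dict[int, List[float]] = {}  # label -> [row sum, col sum, intensity sum, count]
    edges: Edges = set()
    for r in range(height):
        for c in range(width):
            lab = labels[r][c]
            acc = sums.setdefault(lab, [0.0, 0.0, 0.0, 0.0])
            acc[0] += r
            acc[1] += c
            acc[2] += image[r][c]
            acc[3] += 1
            # Checking only the right and lower neighbours visits each border once.
            for rr, cc in ((r, c + 1), (r + 1, c)):
                if rr < height and cc < width and labels[rr][cc] != lab:
                    other = labels[rr][cc]
                    edges.add((min(lab, other), max(lab, other)))
    features = {lab: (a[0] / a[3], a[1] / a[3], a[2] / a[3])
                for lab, a in sums.items()}
    return features, edges
```

The returned `features` and `edges` could then be handed to a graph learning library of choice; the label map itself would come from any superpixel algorithm.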

    Benchmark evaluation of object segmentation proposal

    In this research, we provide an in-depth analysis and evaluation of four recent segmentation proposal algorithms on the PASCAL VOC benchmark. The principal goal of this study is to investigate these object detection proposal methods in an unbiased evaluation framework. Despite their widespread application, the relative strengths and weaknesses of different segmentation proposal methods are mostly unclear in previous work. This thesis provides additional insights into the segmentation proposal methods. To evaluate the quality of proposals, we plot recall as a function of the average number of regions per image. We also discuss the PASCAL VOC 2012 object categories on which the methods perform well and the instances in which they suffer from low recall. Experimental evaluation reveals that, despite differing in operational nature, all segmentation proposal methods generally share similar strengths and weaknesses. The analysis also shows how one could select a proposal generation method based on object attributes. Finally, we show that recall can be improved by merging the proposals of different algorithms. Experimental evaluation shows that this merging approach outperforms the individual algorithms in terms of both precision and recall.
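
The evaluation described above, recall as a function of the average number of regions per image, can be sketched as follows. This is a hedged illustration rather than the thesis's protocol: masks are represented as sets of pixel coordinates, a ground-truth object counts as recalled when some proposal reaches an IoU of at least 0.5 (an assumed threshold), and the names `mask_iou`, `recall_at_budget`, and `recall_curve` are hypothetical.

```python
from typing import List, Sequence, Set, Tuple

Mask = Set[Tuple[int, int]]                  # pixel coordinates covered by a region
ImageEntry = Tuple[List[Mask], List[Mask]]   # (ground-truth masks, ranked proposals)


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel-set masks."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def recall_at_budget(dataset: Sequence[ImageEntry], budget: int,
                     iou_threshold: float = 0.5) -> Tuple[float, float]:
    """Return (average regions per image, recall) when each image keeps
    at most `budget` of its top-ranked proposals."""
    hits = total = used = 0
    for gts, proposals in dataset:
        top = proposals[:budget]
        used += len(top)
        for gt in gts:
            total += 1
            if any(mask_iou(gt, p) >= iou_threshold for p in top):
                hits += 1
    avg_regions = used / len(dataset) if dataset else 0.0
    recall = hits / total if total else 0.0
    return avg_regions, recall


def recall_curve(dataset: Sequence[ImageEntry],
                 budgets: Sequence[int]) -> List[Tuple[float, float]]:
    """Points of the recall-versus-average-regions curve, one per budget."""
    return [recall_at_budget(dataset, b) for b in budgets]
```

Under this sketch, merging the proposals of several methods amounts to combining their ranked lists per image before calling `recall_curve`; how the merged list should be ordered is left open here.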

    Visual object category discovery in images and videos

    The current trend in visual recognition research is to place a strict division between the supervised and unsupervised learning paradigms, which is problematic for two main reasons. On the one hand, supervised methods require training data for each and every category that the system learns; training data may not always be available and is expensive to obtain. On the other hand, unsupervised methods must determine the optimal visual cues and distance metrics that distinguish one category from another to group images into semantically meaningful categories; however, for unlabeled data, these are unknown a priori. I propose a visual category discovery framework that transcends the two paradigms and learns accurate models with few labeled exemplars. The main insight is to automatically focus on the prevalent objects in images and videos, and learn models from them for category grouping, segmentation, and summarization. To implement this idea, I first present a context-aware category discovery framework that discovers novel categories by leveraging context from previously learned categories. I devise a novel object-graph descriptor to model the interaction between a set of known categories and the unknown to-be-discovered categories, and group regions that have similar appearance and similar object-graphs. I then present a collective segmentation framework that simultaneously discovers the segmentations and groupings of objects by leveraging the shared patterns in the unlabeled image collection. It discovers an ensemble of representative instances for each unknown category, and builds top-down models from them to refine the segmentation of the remaining instances. Finally, building on these techniques, I show how to produce compact visual summaries for first-person egocentric videos that focus on the important people and objects.
The system leverages novel egocentric and high-level saliency features to predict important regions in the video, and produces a concise visual summary that is driven by those regions. I compare against existing state-of-the-art methods for category discovery and segmentation on several challenging benchmark datasets. I demonstrate that we can discover visual concepts more accurately by focusing on the prevalent objects in images and videos, and show clear advantages of departing from the status quo division between the supervised and unsupervised learning paradigms. The main impact of my thesis is that it lays the groundwork for building large-scale visual discovery systems that can automatically discover visual concepts with minimal human supervision.

    A comparative study of algorithms for automatic segmentation of dermoscopic images

    Melanoma is the most common as well as the most dangerous type of skin cancer. Nevertheless, it can be effectively treated if detected early. Dermoscopy is one of the major non-invasive imaging techniques for the diagnosis of skin lesions. Computer-aided diagnosis based on the processing of dermoscopic images aims to reduce the subjectivity and time-consuming analysis associated with traditional diagnosis. The first step of automatic diagnosis is image segmentation. This project implements and evaluates several methods for the automatic segmentation of lesion regions in dermoscopic images, along with the corresponding image preprocessing and postprocessing phases. The developed algorithms include methods based on different state-of-the-art techniques. The main groups of techniques selected for study and implementation are thresholding-based methods, region-based methods, segmentation based on deformable models, and a newly proposed approach based on the bag-of-words model. The implemented methods incorporate modifications to better adapt to features associated with dermoscopic images. Each implemented method was applied to a database of 724 dermoscopic images. The output of the automatic segmentation procedure for each image was compared with the corresponding manual segmentation in order to evaluate performance. The algorithms were compared on the basis of the resulting evaluation metrics. The best results were achieved by the combination of region-based segmentation based on the multi-region adaptation of the k-means algorithm and the sub…

    Automated taxiing for unmanned aircraft systems

    Over the last few years, the concept of civil Unmanned Aircraft System(s) (UAS) has been realised, with small UASs commonly used in industries such as law enforcement, agriculture and mapping. With increased development in other areas, such as logistics and advertisement, the size and range of civil UAS is likely to grow. Taken to the logical conclusion, it is likely that large-scale UAS will be operating in civil airspace within the next decade. Although the airborne operations of civil UAS have already gathered much research attention, work is also required to determine how UAS will function when on the ground. Motivated by the assumption that large UAS will share ground facilities with manned aircraft, this thesis describes the preliminary development of an Automated Taxiing System (ATS) for UAS operating at civil aerodromes. To allow the ATS to function on the majority of UAS without the need for additional hardware, a visual sensing approach has been chosen, with the majority of work focusing on monocular image processing techniques. The purpose of the computer vision system is to provide direct sensor data which can be used to validate the vehicle's position, in addition to detecting potential collision risks. As aerospace regulations require the most robust and reliable algorithms for control, any methods which are not fully definable or explainable will not be suitable for real-world use. Therefore, non-deterministic methods and algorithms with hidden components (such as Artificial Neural Networks (ANNs)) have not been used. Instead, the visual sensing is achieved through a semantic segmentation, with separate segmentation and classification stages. Segmentation is performed using superpixels and reachability clustering to divide the image into single-content clusters. Each cluster is then classified using multiple types of image data, probabilistically fused within a Bayesian network.
The data set for testing has been provided by BAE Systems, allowing the system to be trained and tested on real-world aerodrome data. The system has demonstrated good performance on this limited dataset, accurately detecting both collision risks and terrain features for use in navigation.