43 research outputs found

    Deformable MR Prostate Segmentation via Deep Feature Learning and Sparse Patch Matching

    Automatic and reliable segmentation of the prostate is an important but difficult task for various clinical applications such as prostate cancer radiotherapy. The main challenges for accurate MR prostate localization lie in two aspects: (1) inhomogeneous and inconsistent appearance around the prostate boundary, and (2) large shape variation across patients. To tackle these two problems, we propose a new deformable MR prostate segmentation method that unifies deep feature learning with sparse patch matching. First, instead of directly using handcrafted features, we propose to learn the latent feature representation from prostate MR images with a stacked sparse auto-encoder (SSAE). Since the deep learning algorithm learns the feature hierarchy from the data, the learned features are often more concise and effective than handcrafted features in describing the underlying data. To improve the discriminability of the learned features, we further refine the feature representation in a supervised fashion. Second, based on the learned features, a sparse patch matching method is proposed to infer a prostate likelihood map by transferring the prostate labels from multiple atlases to the new prostate MR image. Finally, a deformable segmentation is used to integrate a sparse shape model with the prostate likelihood map to produce the final segmentation. The proposed method has been extensively evaluated on a dataset that contains 66 T2-weighted prostate MR images. Experimental results show that the deep-learned features are more effective than the handcrafted features in guiding MR prostate segmentation. Moreover, our method outperforms other state-of-the-art segmentation methods.

    BagStack Classification for Data Imbalance Problems with Application to Defect Detection and Labeling in Semiconductor Units

    Although machine learning supports the development of computer vision applications by shortening the development cycle, finding a general learning algorithm that solves a wide range of applications is still bounded by the "no free lunch" theorem. The search for the right algorithm to solve a specific problem is driven by the problem itself, the data availability and many other requirements. Automated visual inspection (AVI) systems represent a major part of these challenging computer vision applications. They are gaining growing interest in the manufacturing industry as a way to detect defective products and keep them from reaching customers. Defect detection and classification in semiconductor units is challenging because the manufacturing process introduces a range of acceptable variations. Optical inspection systems typically introduce further variations through changes in lighting conditions and misalignment of the imaged units, which makes defect detection more challenging still. In this thesis, a BagStack classification framework is proposed, which makes use of stacking and bagging concepts to handle both variance and bias errors. The classifier is designed to handle the data imbalance and overfitting problems by: adaptively transforming the multi-class classification problem into multiple binary classification problems; applying a bagging approach to train a set of base learners for each specific problem; adaptively specifying the number of base learners assigned to each problem; adaptively specifying the number of samples to use from each class; applying a novel data-imbalance-aware cross-validation technique to generate the meta-data while taking into account the data imbalance problem at the meta-data level; and, finally, using a multi-response random forest regression classifier as a meta-classifier.
The BagStack classifier makes use of multiple features to solve the defect classification problem. To detect defects, a locally adaptive statistical background modeling approach is proposed. The proposed BagStack classifier outperforms state-of-the-art image classification techniques on our dataset in terms of overall classification accuracy and average per-class classification accuracy. The proposed detection method achieves high performance on the considered dataset in terms of recall and precision. Dissertation/Thesis. Doctoral Dissertation, Computer Engineering, 201
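The first steps described in this abstract (splitting a multi-class problem into binary problems and bagging with class-balanced samples) can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the fixed number of bags, and the rule of drawing equal numbers of positives and negatives are all assumptions made here for illustration, whereas the thesis chooses these quantities adaptively. The stacking stage and the meta-classifier are not shown.

```python
import random

def one_vs_rest_labels(labels, positive_class):
    """Relabel a multi-class problem as binary: 1 for the chosen class, 0 for the rest."""
    return [1 if y == positive_class else 0 for y in labels]

def balanced_bootstrap(binary_labels, rng):
    """Draw sample indices with replacement, equally many positives and negatives.

    Assumption: the per-class sample count is the size of the smaller class.
    """
    pos = [i for i, y in enumerate(binary_labels) if y == 1]
    neg = [i for i, y in enumerate(binary_labels) if y == 0]
    n = min(len(pos), len(neg))
    return [rng.choice(pos) for _ in range(n)] + [rng.choice(neg) for _ in range(n)]

def build_bags(labels, n_bags, seed=0):
    """For every class, build n_bags balanced index bags for its one-vs-rest problem."""
    rng = random.Random(seed)
    bags = {}
    for cls in sorted(set(labels)):
        binary = one_vs_rest_labels(labels, cls)
        bags[cls] = [balanced_bootstrap(binary, rng) for _ in range(n_bags)]
    return bags

# Hypothetical, heavily imbalanced defect labels (class names are made up).
labels = ["ok"] * 8 + ["scratch"] * 2 + ["void"] * 2
bags = build_bags(labels, n_bags=3)
```

Each bag would then be used to train one base learner for its binary problem; balancing inside the bag is one simple way to keep the majority class from dominating every learner.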

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, such as computer vision (CV), speech recognition, and natural language processing. Although remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, RS inevitably draws from many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL. Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensin

    Binding information in short-term memory: evidence from healthy individuals, Alzheimer's Disease and other clinical populations

    Memory binding is a cognitive process that enables complex objects to be stored or retrieved coherently during perception, learning, or action. Binding functions are aimed at reducing the misattribution of the features of objects in crowded and changing sensory contexts, ensuring accurate representation in visual working memory. Binding is a relatively new concept in working memory research. However, as an integrative function it provides a rich context in which to investigate the mechanisms underlying memory deterioration. In this PhD project, a range of experimental temporary binding paradigms were used to investigate whether some of the memory impairments observed in patients with Alzheimer’s Disease could be accounted for by deficits in this memory function. A set of neuropsychological tasks were used to investigate binding operations across memory domains (i.e., verbal and nonverbal), sensory modalities (i.e., visual and auditory), types of information (e.g., objects and colours), and retrieval processes (i.e., recognition and recall) in healthy individuals, Alzheimer’s Disease patients and other clinical populations. The results suggest that the efficiency of short-term memory to store bound complex events depends on the nature of the information presented (e.g., type of information bound into objects) (Chapter 2). Short-term memory seems to be equipped with relatively separate mechanisms to store integrated objects and individual features (Chapter 4). It was also observed that the binding properties of short-term memory apply to healthy young and older people, and are functions which are preserved in the elderly (Chapter 3). In two additional experimental chapters (5 and 6) the preserved binding abilities of older people were compared with temporary binding in Alzheimer’s Disease. The latter group showed a very large impairment in binding that was distinct from their impairments in memory for individual features. 
These findings suggest that memory binding tasks could reliably separate the cognitive changes in normal ageing from those linked with Alzheimer’s Disease. Moreover, the results of Chapter 7 suggested that memory binding tasks may detect memory changes in people who will develop Alzheimer’s Disease (i.e., asymptomatic carriers of the gene defect E280A of the Presenilin-1 gene) almost 10 years before the average age of onset. These results are relevant to our understanding of short-term memory and to the memory models currently available. Finally, it is suggested that the constructs of memory binding may increase the sensitivity of current assessment procedures for people at risk of developing Alzheimer’s Disease.

    A novel voting classifier for electric vehicles population at different locations using Al-Biruni earth radius optimization algorithm

    The rising popularity of electric vehicles (EVs) can be attributed to their positive impact on the environment and their ability to lower operational expenses. Nevertheless, determining the most suitable EV types for a specific site remains difficult, mostly because of the wide range of consumer preferences and the inherent limitations of EVs. This study introduces a new voting classifier model that incorporates the Al-Biruni earth radius optimization algorithm, which is derived from the stochastic fractal search. The model aims to predict the optimal EV type for a given location by considering factors such as user preferences, availability of charging infrastructure, and distance to the destination. The proposed classification methodology uses ensemble learning and is divided into two distinct stages: pre-classification and classification. During the pre-classification stage, data preprocessing converts unprocessed data into a refined, systematic, and well-arranged format that is appropriate for subsequent analysis or modeling. During the classification phase, a majority vote ensemble learning method is used to categorize unlabeled data properly and efficiently. This method consists of three independent classifiers. The efficacy and efficiency of the suggested method are demonstrated through simulation experiments. The results indicate that the collaborative classification method performs very well and consistently in classifying EV populations. In comparison to similar classification approaches, the suggested method demonstrates improved performance in terms of assessment metrics such as accuracy, sensitivity, specificity, and F-score. The improvements observed in these metrics are 91.22%, 94.34%, 89.5%, and 88.5%, respectively. These results highlight the overall effectiveness of the proposed method.
Hence, the suggested approach is seen as more favorable for implementing the voting classifier in the context of the EV population across different geographical areas.
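As a rough illustration of the classification phase described above, the sketch below combines the votes of three classifiers by simple majority and computes the four reported metrics for a binary task. The vote vectors and ground-truth labels are invented toy data, the function names are hypothetical, and the Al-Biruni earth radius optimization step from the paper is not modeled.

```python
from collections import Counter

def majority_vote(votes):
    """Return the label chosen by most classifiers for one sample.

    With three binary voters a tie cannot occur; in general, ties go to
    the label that appears first in the vote sequence.
    """
    return Counter(votes).most_common(1)[0][0]

def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (recall), specificity and F-score for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f_score = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
    }

# Toy predictions from three hypothetical base classifiers on six samples.
votes_a = [1, 0, 1, 1, 0, 0]
votes_b = [1, 1, 0, 1, 0, 0]
votes_c = [0, 0, 1, 1, 1, 0]
y_true = [1, 0, 1, 1, 0, 1]

ensemble = [majority_vote(v) for v in zip(votes_a, votes_b, votes_c)]
metrics = binary_metrics(y_true, ensemble)
```

On this toy data the ensemble corrects the isolated mistakes that individual classifiers make on the first five samples, which is the usual motivation for majority voting over independent classifiers.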

    Vulnerable road users and connected autonomous vehicles interaction: a survey

    Within the vehicular traffic ecosystem there is a group of users known as Vulnerable Road Users (VRUs), which includes pedestrians, cyclists, and motorcyclists, among others. Connected autonomous vehicles (CAVs), in turn, combine communication technologies that keep the vehicle connected at all times with automated technologies that assist or replace the human driver during the driving process. Autonomous vehicles are being envisioned as a viable way to reduce road accidents, providing a generally safe environment for all road users and especially for the most vulnerable. One of the problems facing autonomous vehicles is the need for mechanisms that facilitate their safe and efficient integration not only within the mobility environment but also into road society. In this paper, we analyze and discuss how this integration can take place, reviewing the work that has been developed in recent years in each of the stages of vehicle-human interaction, analyzing the challenges faced by vulnerable users, and proposing solutions that contribute to solving these challenges. This work was partially funded by the Ministry of Economy, Industry, and Competitiveness of Spain under Grant: Supervision of drone fleet and optimization of commercial operations flight plans, PID2020-116377RB-C21. Peer Reviewed. Postprint (published version

    Target classification in multimodal video

    The presented thesis focuses on enhancing scene segmentation and target recognition methodologies via the mobilisation of contextual information. The algorithms developed to achieve this goal utilise multi-modal sensor information collected across varying scenarios, from controlled indoor sequences to challenging rural locations. Sensors are chiefly colour band and long wave infrared (LWIR), enabling persistent surveillance capabilities across all environments. In the drive to develop effectual algorithms towards the outlined goals, key obstacles are identified and examined: the recovery of background scene structure from foreground object 'clutter'; employing contextual foreground knowledge to circumvent training a classifier when labeled data is not readily available; creating a labeled LWIR dataset to train a convolutional neural network (CNN) based object classifier; and the viability of spatial context to address long range target classification when big data solutions are not enough. For an environment displaying frequent foreground clutter, such as a busy train station, we propose an algorithm exploiting foreground object presence to segment underlying scene structure that is not often visible. If such a location is outdoors and surveyed by an infra-red (IR) and visible band camera set-up, scene context and contextual knowledge transfer allow reasonable class predictions to be determined for thermal signatures within the scene. Furthermore, a labeled LWIR image corpus is created to train an infrared object classifier, using a CNN approach. The trained network demonstrates effective classification accuracy of 95% over 6 object classes. However, performance is not sustained for IR targets acquired at long range: signal quality is low and classification accuracy drops. This is addressed by mobilising spatial context to affect network class scores, restoring robust classification capability.