43 research outputs found
Deformable MR Prostate Segmentation via Deep Feature Learning and Sparse Patch Matching
Automatic and reliable segmentation of the prostate is an important but difficult task for various clinical applications such as prostate cancer radiotherapy. The main challenges for accurate MR prostate localization lie in two aspects: (1) inhomogeneous and inconsistent appearance around the prostate boundary, and (2) the large shape variation across different patients. To tackle these two problems, we propose a new deformable MR prostate segmentation method that unifies deep feature learning with sparse patch matching. First, instead of directly using handcrafted features, we propose to learn the latent feature representation from prostate MR images with a stacked sparse auto-encoder (SSAE). Since the deep learning algorithm learns the feature hierarchy from the data, the learned features are often more concise and effective than handcrafted features in describing the underlying data. To improve the discriminability of the learned features, we further refine the feature representation in a supervised fashion. Second, based on the learned features, a sparse patch matching method is proposed to infer a prostate likelihood map by transferring the prostate labels from multiple atlases to the new prostate MR image. Finally, a deformable segmentation is used to integrate a sparse shape model with the prostate likelihood map to achieve the final segmentation. The proposed method has been extensively evaluated on a dataset of 66 T2-weighted prostate MR images. Experimental results show that the deep-learned features are more effective than the handcrafted features in guiding MR prostate segmentation. Moreover, our method shows superior performance to other state-of-the-art segmentation methods.
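The label-transfer step described in this abstract can be illustrated with a minimal sketch. It assumes the sparse matching weights for one voxel have already been computed by some sparse solver (not shown), and it uses a plain weighted average of atlas labels, which is an illustrative simplification rather than the paper's exact formulation. The function name `prostate_likelihood` and the sample values are hypothetical.

```python
def prostate_likelihood(weights, atlas_labels):
    """Weighted label transfer for a single voxel.

    weights      -- non-negative sparse coefficients, one per atlas patch
                    (assumed to come from a sparse solver, not shown here)
    atlas_labels -- 1 if the atlas patch centre is prostate, else 0
    """
    if len(weights) != len(atlas_labels):
        raise ValueError("one weight per atlas patch is required")
    total = sum(weights)
    if total == 0:
        return 0.0  # no atlas patch matched; treat the voxel as background
    return sum(w * l for w, l in zip(weights, atlas_labels)) / total


# Four candidate atlas patches; only two received a non-zero weight.
weights = [0.0, 0.75, 0.0, 0.25]
labels = [1, 1, 0, 0]
likelihood = prostate_likelihood(weights, labels)
print(likelihood)
```

Repeating this for every voxel would give a likelihood map of the kind the abstract says is later combined with the shape model.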
BagStack Classification for Data Imbalance Problems with Application to Defect Detection and Labeling in Semiconductor Units
Despite the fact that machine learning supports the development of computer vision applications by shortening the development cycle, finding a general learning algorithm that solves a wide range of applications is still bounded by the “no free lunch theorem”. The search for the right algorithm to solve a specific problem is driven by the problem itself, the data availability and many other requirements.
Automated visual inspection (AVI) systems represent a major part of these challenging computer vision applications. They are gaining growing interest in the manufacturing industry to detect defective products and keep these from reaching customers. The process of defect detection and classification in semiconductor units is challenging due to different acceptable variations that the manufacturing process introduces. Other variations are also typically introduced when using optical inspection systems due to changes in lighting conditions and misalignment of the imaged units, which makes the defect detection process more challenging.
In this thesis, a BagStack classification framework is proposed, which makes use of stacking and bagging concepts to handle both variance and bias errors. The classifier is designed to handle the data imbalance and overfitting problems by adaptively transforming the
multi-class classification problem into multiple binary classification problems, applying a bagging approach to train a set of base learners for each specific problem, adaptively specifying the number of base learners assigned to each problem, adaptively specifying the number of samples to use from each class, applying a novel data-imbalance aware cross-validation technique to generate the meta-data while taking into account the data imbalance problem at the meta-data level and, finally, using a multi-response random forest regression classifier as a meta-classifier. The BagStack classifier makes use of multiple features to solve the defect classification problem. In order to detect defects, a locally adaptive statistical background modeling is proposed. The proposed BagStack classifier outperforms state-of-the-art image classification techniques on our dataset in terms of overall classification accuracy and average per-class classification accuracy. The proposed detection method achieves high performance on the considered dataset in terms of recall and precision.
Dissertation/Thesis. Doctoral Dissertation, Computer Engineering, 201
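As a rough illustration of the first two ideas in this abstract (turning a multi-class problem into binary problems, and bagging with a controlled number of samples per class), the sketch below draws balanced bootstrap samples for each one-vs-rest problem. It is not the thesis's adaptive scheme: the fixed bag count, the equal per-side sample size, the function name `one_vs_rest_bags`, and the defect labels are all assumptions made for the example.

```python
import random


def one_vs_rest_bags(labels, target, n_bags, rng):
    """Return n_bags lists of sample indices, each balanced between the
    target class (positives) and all other classes (negatives)."""
    pos = [i for i, y in enumerate(labels) if y == target]
    neg = [i for i, y in enumerate(labels) if y != target]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    size = min(len(pos), len(neg))  # samples drawn from each side per bag
    bags = []
    for _ in range(n_bags):
        # Bootstrap: sample with replacement, the same number per side.
        bags.append(rng.choices(pos, k=size) + rng.choices(neg, k=size))
    return bags


# Hypothetical imbalanced label set: many good units, few defects.
labels = ["ok"] * 8 + ["scratch"] * 2 + ["void"] * 2
rng = random.Random(0)
bags = {c: one_vs_rest_bags(labels, c, n_bags=3, rng=rng)
        for c in sorted(set(labels))}
```

Each bag would then train one base learner, and the base learners' outputs would form the meta-data passed to a meta-classifier.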
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensing
Binding information in short-term memory: evidence from healthy individuals, Alzheimer's Disease and other clinical populations
Memory binding is a cognitive process that enables complex objects to be
stored or retrieved coherently during perception, learning, or action. Binding
functions are aimed at reducing the misattribution of the features of objects
in crowded and changing sensory contexts, ensuring accurate representation
in visual working memory. Binding is a relatively new concept in working
memory research. However, as an integrative function it provides a rich
context in which to investigate the mechanisms underlying memory
deterioration. In this PhD project, a range of experimental temporary binding
paradigms were used to investigate whether some of the memory
impairments observed in patients with Alzheimer’s Disease could be
accounted for by deficits in this memory function. A set of
neuropsychological tasks were used to investigate binding operations across
memory domains (i.e., verbal and nonverbal), sensory modalities (i.e., visual
and auditory), types of information (e.g., objects and colours), and retrieval
processes (i.e., recognition and recall) in healthy individuals, Alzheimer’s
Disease patients and other clinical populations. The results suggest that the
efficiency of short-term memory to store bound complex events depends on
the nature of the information presented (e.g., type of information bound into
objects) (Chapter 2). Short-term memory seems to be equipped with
relatively separate mechanisms to store integrated objects and individual
features (Chapter 4). It was also observed that the binding properties of
short-term memory apply to healthy young and older people, and are
functions which are preserved in the elderly (Chapter 3). In two additional
experimental chapters (5 and 6) the preserved binding abilities of older
people were compared with temporary binding in Alzheimer’s Disease. The
latter group showed a very large impairment in binding that was distinct
from their impairments in memory for individual features. These findings suggest that memory binding tasks could reliably separate the cognitive
changes in normal ageing from those linked with Alzheimer’s Disease.
Moreover, the results of Chapter 7 suggested that memory binding tasks may
detect memory changes in people who will develop Alzheimer’s Disease (i.e.,
asymptomatic carriers of the gene defect E280A of the Presenilin-1 gene)
almost 10 years before the average age of onset. These results are relevant to
our understanding of short-term memory and to the memory models
currently available. Finally, it is suggested that the constructs of memory
binding may increase the sensitivity of current assessment procedures for
people at risk of developing Alzheimer’s Disease.
A novel voting classifier for electric vehicles population at different locations using Al-Biruni earth radius optimization algorithm
The rising popularity of electric vehicles (EVs) can be attributed to their positive impact on the environment and their ability to lower operational expenses. Nevertheless, determining the most suitable EV types for a specific site remains difficult, mostly due to the wide range of consumer preferences and the inherent limits of EVs. This study introduces a new voting classifier model that incorporates the Al-Biruni earth radius optimization algorithm, which is derived from the stochastic fractal search. The model aims to predict the optimal EV type for a given location by considering factors such as user preferences, availability of charging infrastructure, and distance to the destination. The proposed classification methodology uses ensemble learning and is divided into two distinct stages: pre-classification and classification. During the pre-classification stage, data preprocessing converts unprocessed data into a refined, systematic, and well-arranged format that is appropriate for subsequent analysis or modeling. During the classification phase, a majority vote ensemble learning method is used to categorize unlabeled data accurately and efficiently. This method consists of three independent classifiers. The efficacy and efficiency of the suggested method are demonstrated through simulation experiments. The results indicate that the collaborative classification method performs very well and consistently in classifying EV populations. In comparison to similar classification approaches, the suggested method demonstrates improved performance in terms of assessment metrics such as accuracy, sensitivity, specificity, and F-score. The improvements observed in these metrics are 91.22%, 94.34%, 89.5%, and 88.5%, respectively. These results highlight the overall effectiveness of the proposed method.
Hence, the suggested approach is seen as more favorable for implementing the voting classifier in the context of the EV population across different geographical areas.
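The majority vote over three independent classifiers mentioned in this abstract can be sketched as follows. The abstract does not say how ties are resolved, so breaking ties in favour of the earliest classifier is an assumption, as are the class labels (`BEV`, `PHEV`, `HEV`) and the sample predictions. The optimization algorithm itself is not modelled here.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most classifiers.

    Ties are broken in favour of the earliest classifier in the list
    (an assumption; the source does not specify a tie rule).
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:  # first label that reaches the top count wins
        if counts[label] == best:
            return label


# Hypothetical outputs of three classifiers for four locations.
clf_a = ["BEV", "PHEV", "BEV", "HEV"]
clf_b = ["BEV", "BEV", "PHEV", "PHEV"]
clf_c = ["PHEV", "BEV", "BEV", "BEV"]
ensemble = [majority_vote(votes) for votes in zip(clf_a, clf_b, clf_c)]
print(ensemble)
```

With three voters and more than two classes, a three-way split is possible (as in the last location above), which is why an explicit tie rule is needed.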
Vulnerable road users and connected autonomous vehicles interaction: a survey
There is a group of users within the vehicular traffic ecosystem known as Vulnerable Road Users (VRUs). VRUs include pedestrians, cyclists, and motorcyclists, among others. Connected autonomous vehicles (CAVs), in turn, combine communication technologies that keep the vehicle connected at all times with automation technologies that assist or replace the human driver during the driving process. Autonomous vehicles are envisioned as a viable way to address road accidents, providing a generally safe environment for all road users, especially the most vulnerable. One of the problems facing autonomous vehicles is to generate mechanisms that facilitate their integration, not only within the mobility environment but also into the road society, in a safe and efficient way. In this paper, we analyze and discuss how this integration can take place, reviewing the work that has been developed in recent years in each of the stages of the vehicle-human interaction, analyzing the challenges of vulnerable users and proposing solutions that contribute to solving these challenges.
This work was partially funded by the Ministry of Economy, Industry, and Competitiveness of Spain under Grant: Supervision of drone fleet and optimization of commercial operations flight plans, PID2020-116377RB-C21.
Peer Reviewed. Postprint (published version)
Target classification in multimodal video
The presented thesis focuses on enhancing scene segmentation and target recognition methodologies via the mobilisation of contextual information. The algorithms developed to achieve this goal utilise multi-modal sensor information collected across varying scenarios, from controlled indoor sequences to challenging rural locations. Sensors are chiefly colour band and long wave infrared (LWIR), enabling persistent surveillance capabilities across all environments. In the drive to develop effectual algorithms towards the outlined goals, key obstacles are identified and examined: the recovery of background scene structure from foreground object ‘clutter’, employing contextual foreground knowledge to circumvent training a classifier when labeled data is not readily available, creating a labeled LWIR dataset to train a convolutional neural network (CNN) based object classifier, and the viability of spatial context to address long range target classification when big data solutions are not enough. For an environment displaying frequent foreground clutter, such as a busy train station, we propose an algorithm exploiting foreground object presence to segment underlying scene structure that is not often visible. If such a location is outdoors and surveyed by an infra-red (IR) and visible band camera set-up, scene context and contextual knowledge transfer allow reasonable class predictions for thermal signatures within the scene to be determined. Furthermore, a labeled LWIR image corpus is created to train an infrared object classifier, using a CNN approach. The trained network demonstrates a classification accuracy of 95% over 6 object classes. However, this performance is not sustained for IR targets acquired at long range, where low signal quality causes classification accuracy to drop. This is addressed by mobilising spatial context to affect network class scores, restoring robust classification capability.
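One simple way spatial context could influence network class scores is to multiply each score by a location-dependent prior and renormalise. The abstract does not describe the actual mechanism, so this Bayesian-style reweighting, the function name `apply_spatial_prior`, the class names, and all numbers below are assumptions for illustration only.

```python
def apply_spatial_prior(scores, prior):
    """Reweight per-class scores by a spatial prior and renormalise.

    scores -- network class probabilities for one detection
    prior  -- assumed probability of each class at that image location
    """
    if len(scores) != len(prior):
        raise ValueError("scores and prior must cover the same classes")
    weighted = [s * p for s, p in zip(scores, prior)]
    total = sum(weighted)
    if total == 0:
        raise ValueError("prior rules out every class with non-zero score")
    return [w / total for w in weighted]


classes = ["person", "vehicle", "animal"]  # hypothetical class names
scores = [0.5, 0.3, 0.2]   # weak long-range prediction favouring "person"
prior = [0.2, 0.6, 0.2]    # e.g. the detection lies on a road region
adjusted = apply_spatial_prior(scores, prior)
best = classes[adjusted.index(max(adjusted))]
print(best)
```

In this example the low-confidence "person" prediction is overturned in favour of "vehicle" because the location prior strongly supports it.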