150 research outputs found
Twin identification over viewpoint change: A deep convolutional neural network surpasses humans
Deep convolutional neural networks (DCNNs) have achieved human-level accuracy
in face identification (Phillips et al., 2018), though it is unclear how
accurately they discriminate highly similar faces. Here, humans and a DCNN
performed a challenging face-identity matching task that included identical
twins. Participants (N=87) viewed pairs of face images of three types:
same-identity, general imposter pairs (different identities from similar
demographic groups), and twin imposter pairs (identical twin siblings). The
task was to determine whether the pairs showed the same person or different
people. Identity comparisons were tested in three viewpoint-disparity
conditions: frontal to frontal, frontal to 45-degree profile, and frontal to
90-degree profile. Accuracy for discriminating matched-identity pairs from
twin-imposters and general imposters was assessed in each viewpoint-disparity
condition. Humans were more accurate for general-imposter pairs than
twin-imposter pairs, and accuracy declined with increased viewpoint disparity
between the images in a pair. A DCNN trained for face identification (Ranjan et
al., 2018) was tested on the same image pairs presented to humans. Machine
performance mirrored the pattern of human accuracy, but with performance at or
above all humans in all but one condition. Human and machine similarity scores
were compared across all image-pair types. This item-level analysis showed that
human and machine similarity ratings correlated significantly in six of nine
image-pair types [range r=0.38 to r=0.63], suggesting general accord between
the perception of face similarity by humans and the DCNN. These findings also
contribute to our understanding of DCNN performance for discriminating
high-resemblance faces, demonstrate that the DCNN performs at a level at or
above humans, and suggest a degree of parity between the features used by
humans and the DCNN.
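The item-level analysis described above can be sketched as follows. This is an illustrative stand-in, not the study's actual pipeline: the embedding dimensionality, the toy data, and the function names are assumptions; similarity scores are computed as cosine similarity between face embeddings and then correlated with (here, simulated) human ratings.

```python
import numpy as np

def cosine_similarity(a, b):
    """Similarity score between two face embeddings."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def item_level_correlation(machine_scores, human_ratings):
    """Pearson correlation between machine similarity scores and
    mean human similarity ratings over the same image pairs."""
    m = np.asarray(machine_scores, dtype=float)
    h = np.asarray(human_ratings, dtype=float)
    m = (m - m.mean()) / m.std()
    h = (h - h.mean()) / h.std()
    return float(np.mean(m * h))

# Toy data: 5 image pairs, 128-d embeddings, stand-in human ratings
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(5, 128))   # embeddings of first image in each pair
emb_b = rng.normal(size=(5, 128))   # embeddings of second image in each pair
machine = [cosine_similarity(a, b) for a, b in zip(emb_a, emb_b)]
human = rng.normal(size=5)          # stand-in for mean human ratings
r = item_level_correlation(machine, human)
```

With real data, `r` would be computed separately within each of the nine image-pair types.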
Leveraging Expert Models for Training Deep Neural Networks in Scarce Data Domains: Application to Offline Handwritten Signature Verification
This paper introduces a novel approach to leverage the knowledge of existing
expert models for training new Convolutional Neural Networks, on domains where
task-specific data are limited or unavailable. The presented scheme is applied
in offline handwritten signature verification (OffSV) which, akin to other
biometric applications, suffers from inherent data limitations due to
regulatory restrictions. The proposed Student-Teacher (S-T) configuration
utilizes feature-based knowledge distillation (FKD), combining graph-based
similarity for local activations with global similarity measures to supervise
the student's training, using only handwritten text data. Remarkably, the models
trained using this technique exhibit comparable, if not superior, performance
to the teacher model across three popular signature datasets. More importantly,
these results are attained without employing any signatures during the feature
extraction training process. This study demonstrates the efficacy of leveraging
existing expert models to overcome data scarcity challenges in OffSV and
potentially other related domains.
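A minimal sketch of what a feature-based distillation objective of this kind might look like. The paper's exact graph-based formulation is not reproduced here; `fkd_loss`, the 0.5 weighting, and the use of a batch-wise cosine-similarity "graph" as the local term are illustrative assumptions.

```python
import numpy as np

def similarity_graph(feats):
    """Batch-wise cosine-similarity graph over feature vectors (B, D)."""
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    return f @ f.T

def fkd_loss(student_feats, teacher_feats, alpha=0.5):
    """Illustrative distillation objective: match the student's batch
    similarity graph to the teacher's (local, graph-based term) and
    penalize direct feature distance (global term)."""
    local = np.mean((similarity_graph(student_feats)
                     - similarity_graph(teacher_feats)) ** 2)
    global_term = np.mean((student_feats - teacher_feats) ** 2)
    return alpha * local + (1.0 - alpha) * global_term

# Identical student and teacher features yield zero loss
x = np.random.default_rng(1).normal(size=(8, 64))
```

Note that such a loss needs only paired student/teacher activations, which is why non-signature data (handwritten text) suffices for supervision.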
Maximizing Model Generalization for Machine Condition Monitoring with Self-Supervised Learning and Federated Learning
Deep Learning (DL) can diagnose faults and assess machine health from raw
condition monitoring data without manually designed statistical features.
However, practical manufacturing applications remain extremely difficult for
existing DL methods. Machine data is often unlabeled and from very few health
conditions (e.g., only normal operating data). Furthermore, models often
encounter shifts in domain as process parameters change and new categories of
faults emerge. Traditional supervised learning may struggle to learn compact,
discriminative representations that generalize to these unseen target domains
since it depends on having plentiful classes to partition the feature space
with decision boundaries. Transfer Learning (TL) with domain adaptation
attempts to adapt these models to unlabeled target domains but assumes similar
underlying structure that may not be present if new faults emerge. This study
proposes focusing on maximizing the feature generality on the source domain and
applying TL via weight transfer to copy the model to the target domain.
Specifically, Self-Supervised Learning (SSL) with Barlow Twins may produce more
discriminative features for monitoring health conditions than supervised
learning by focusing on semantic properties of the data. Furthermore, Federated
Learning (FL) for distributed training may also improve generalization by
efficiently expanding the effective size and diversity of training data by
sharing information across multiple client machines. Results show that Barlow
Twins outperforms supervised learning in an unlabeled target domain with
emerging motor faults when the source training data contains very few distinct
categories. Incorporating FL may also provide a slight advantage by diffusing
knowledge of health conditions between machines.
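The Barlow Twins objective referenced above can be sketched as follows (this is the standard published form of the loss; the toy batch and the `lam` value are illustrative, and the study's actual networks and data are not reproduced).

```python
import numpy as np

def barlow_twins_loss(z1, z2, lam=5e-3):
    """Barlow Twins objective: push the cross-correlation matrix of two
    augmented views' embeddings toward the identity, which makes the
    views agree while decorrelating feature dimensions."""
    n, d = z1.shape
    z1 = (z1 - z1.mean(axis=0)) / z1.std(axis=0)   # standardize per dimension
    z2 = (z2 - z2.mean(axis=0)) / z2.std(axis=0)
    c = (z1.T @ z2) / n                            # (D, D) cross-correlation
    on_diag = np.sum((np.diag(c) - 1.0) ** 2)      # invariance term
    off_diag = np.sum(c ** 2) - np.sum(np.diag(c) ** 2)  # redundancy term
    return on_diag + lam * off_diag

# Identical views give a zero invariance term; disagreeing views do not
z = np.random.default_rng(0).normal(size=(32, 16))
loss_same = barlow_twins_loss(z, z)
loss_flip = barlow_twins_loss(z, -z)
```

Because the objective never references class labels, it can be trained on unlabeled condition-monitoring data with very few distinct health categories.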
Self-supervised Learning in Remote Sensing: A Review
In deep learning research, self-supervised learning (SSL) has received great
attention, triggering interest within both the computer vision and remote
sensing communities. While SSL has seen considerable success in computer
vision, most of its potential in the domain of earth observation remains
untapped.
In this paper, we provide an introduction to, and a review of the concepts and
latest developments in SSL for computer vision in the context of remote
sensing. Further, we provide a preliminary benchmark of modern SSL algorithms
on popular remote sensing datasets, verifying the potential of SSL in remote
sensing and providing an extended study on data augmentations. Finally, we
identify a list of promising directions of future research in SSL for earth
observation (SSL4EO) to pave the way for fruitful interaction of both domains.
Comment: Accepted by IEEE Geoscience and Remote Sensing Magazine. 32 pages, 22
content pages.
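SSL pipelines of the kind benchmarked above generate two augmented "views" of each image; for remote sensing tiles, random cropping and flipping are typical choices. A minimal sketch, where the tile size, crop size, and function name are assumptions rather than the review's actual augmentation study:

```python
import numpy as np

def random_crop_flip(img, crop, rng):
    """Two standard SSL augmentations applied to one image tile:
    random square crop followed by a random horizontal flip."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop + 1)
    left = rng.integers(0, w - crop + 1)
    out = img[top:top + crop, left:left + crop]
    if rng.random() < 0.5:
        out = out[:, ::-1]      # horizontal flip
    return out

rng = np.random.default_rng(0)
tile = rng.normal(size=(64, 64, 3))   # stand-in for a satellite image tile
view1 = random_crop_flip(tile, 48, rng)
view2 = random_crop_flip(tile, 48, rng)  # second view for an SSL objective
```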
Human alignment of neural network representations
Today's computer vision models achieve human or near-human level performance
across a wide variety of vision tasks. However, their architectures, data, and
learning algorithms differ in numerous ways from those that give rise to human
vision. In this paper, we investigate the factors that affect the alignment
between the representations learned by neural networks and human mental
representations inferred from behavioral responses. We find that model scale
and architecture have essentially no effect on the alignment with human
behavioral responses, whereas the training dataset and objective function both
have a much larger impact. These findings are consistent across three datasets
of human similarity judgments collected using two different tasks. Linear
transformations of neural network representations learned from behavioral
responses from one dataset substantially improve alignment with human
similarity judgments on the other two datasets. In addition, we find that some
human concepts such as food and animals are well-represented by neural networks
whereas others such as royal or sports-related objects are not. Overall,
although models trained on larger, more diverse datasets achieve better
alignment with humans than models trained on ImageNet alone, our results
indicate that scaling alone is unlikely to be sufficient to train neural
networks with conceptual representations that match those used by humans.
Comment: Accepted for publication at ICLR 202
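One simple way to realize a learned linear transformation that improves alignment with human similarity judgments is ridge regression over per-dimension reweighted dot products. This is an illustrative stand-in for the probing described above, not the paper's exact procedure; the function names, dimensions, and toy data are assumptions.

```python
import numpy as np

def fit_similarity_weights(emb, pairs, ratings, lam=1e-3):
    """Ridge regression learning per-dimension weights w so that the
    reweighted dot product sum_k w_k * x_ik * x_jk predicts human
    similarity ratings for image pairs (i, j)."""
    X = np.stack([emb[i] * emb[j] for i, j in pairs])   # (P, D) design matrix
    y = np.asarray(ratings, dtype=float)
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def predict_similarity(emb, pairs, w):
    return np.array([w @ (emb[i] * emb[j]) for i, j in pairs])

# Toy data: ratings generated from a hidden weighting, then recovered
rng = np.random.default_rng(0)
emb = rng.normal(size=(12, 6))          # 12 items, 6-d representations
pairs = [(i, j) for i in range(12) for j in range(i + 1, 12)]
w_true = rng.normal(size=6)
ratings = predict_similarity(emb, pairs, w_true)
w_hat = fit_similarity_weights(emb, pairs, ratings)
```

As in the paper's transfer result, a transform fitted on one set of judgments could then be applied unchanged to pairs from a different dataset.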
A Comprehensive Overview of Computational Nuclei Segmentation Methods in Digital Pathology
In the cancer diagnosis pipeline, digital pathology plays an instrumental
role in the identification, staging, and grading of malignant areas on biopsy
tissue specimens. High resolution histology images are subject to high variance
in appearance, arising either from the acquisition devices or the H&E
staining process. Nuclei segmentation is an important task, as it detects the
nuclei cells over background tissue and gives rise to the topology, size, and
count of nuclei which are determinant factors for cancer detection. Yet, it is
a fairly time-consuming task for pathologists, with reportedly high
subjectivity. Computer Aided Diagnosis (CAD) tools empowered by modern
Artificial Intelligence (AI) models enable the automation of nuclei
segmentation. This can reduce the subjectivity in analysis and reading time.
This paper provides an extensive review, beginning with earlier works that use
traditional image processing techniques and reaching up to modern approaches
following the Deep Learning (DL) paradigm. Our review also focuses on the weak
supervision aspect of the problem, motivated by the fact that annotated data is
scarce. At the end, the advantages of different models and types of supervision
are thoroughly discussed. Furthermore, we try to extrapolate and envision how
future research lines will potentially develop, so as to minimize the need for
labeled data while maintaining high performance. Future methods should
emphasize efficient and explainable models with a transparent underlying
process so that physicians can trust their output.
Comment: 47 pages, 27 figures, 9 tables.
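As a concrete contrast to the DL approaches surveyed, a traditional image-processing baseline for nuclei detection can be as simple as intensity thresholding followed by connected-component counting. A minimal sketch, in which the threshold, connectivity, and toy image are illustrative assumptions:

```python
import numpy as np
from collections import deque

def label_nuclei(image, threshold):
    """Classical baseline: threshold a grayscale image, then label
    4-connected foreground components as nucleus candidates via BFS.
    Returns the label map and the component count."""
    mask = image > threshold
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            if mask[i, j] and labels[i, j] == 0:
                count += 1
                labels[i, j] = count
                q = deque([(i, j)])
                while q:                       # flood-fill one component
                    y, x = q.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                                and mask[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = count
                            q.append((ny, nx))
    return labels, count

# Toy image with two separated bright blobs ("nuclei")
img = np.zeros((8, 8))
img[1:3, 1:3] = 1.0
img[5:7, 5:7] = 1.0
labels, count = label_nuclei(img, 0.5)
```

The component count, sizes, and positions are exactly the topology, size, and count cues the abstract identifies as determinant factors for cancer detection; real H&E images of course demand far more robust pipelines.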