A Novel Self-Learning Framework for Bladder Cancer Grading Using Histopathological Images
In recent years, the incidence and mortality of bladder cancer have increased
significantly. Currently, two subtypes are distinguished based on tumour
growth: non-muscle-invasive (NMIBC) and muscle-invasive bladder cancer (MIBC).
In this work, we focus on the MIBC subtype because it has the worst prognosis
and can spread to adjacent organs. We present a self-learning framework to
grade bladder cancer from histological images stained via immunohistochemical
techniques. Specifically, we propose a novel Deep Convolutional Embedded
Attention Clustering (DCEAC) which allows classifying histological patches into
different severity levels of the disease, according to the patterns established
in the literature. The proposed DCEAC model follows a two-step fully
unsupervised learning methodology to discern between non-tumour, mild and
infiltrative patterns from high-resolution samples of 512x512 pixels. Our
system outperforms previous clustering-based methods by including a
convolutional attention module, which allows refining the features of the
latent space before the classification stage. The proposed network exceeds
state-of-the-art approaches by 2-3% across different metrics, achieving a final
average accuracy of 0.9034 in a multi-class scenario. Furthermore, the reported
class activation maps evidence that our model is able to learn by itself the
same patterns that clinicians consider relevant, without incurring prior
annotation steps. This represents a breakthrough in muscle-invasive bladder
cancer grading, bridging the gap with models trained on labelled data.
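The embedded-clustering stage can be illustrated with the soft-assignment step used in deep embedded clustering, where patch embeddings are softly assigned to cluster centroids via a Student's t kernel. This is a minimal NumPy sketch of that generic step, not the authors' DCEAC architecture; the 2-D embeddings and centroids are toy stand-ins for the latent space:

```python
import numpy as np

def soft_assign(z, centroids, alpha=1.0):
    """Student's t soft assignment used in deep embedded clustering:
    q_ij is proportional to (1 + ||z_i - mu_j||^2 / alpha)^-((alpha+1)/2)."""
    # squared distances between each embedding and each centroid
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened target distribution P that emphasises confident assignments."""
    w = (q ** 2) / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

# toy embeddings scattered around three hypothetical severity clusters
rng = np.random.default_rng(0)
centroids = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
z = np.vstack([c + 0.3 * rng.standard_normal((10, 2)) for c in centroids])
q = soft_assign(z, centroids)
p = target_distribution(q)
labels = q.argmax(axis=1)  # cluster index per patch embedding
```

In training, the network would be updated to pull Q towards the sharpened target P; here the assignment simply recovers the toy clusters.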
Pluralistic Image Completion
Most image completion methods produce only one result for each masked input,
although there may be many reasonable possibilities. In this paper, we present
an approach for \textbf{pluralistic image completion} -- the task of generating
multiple diverse plausible solutions for image completion. A major
challenge faced by learning-based approaches is that there is usually only one
ground-truth training instance per label; as a result, sampling from
conditional VAEs still leads to minimal diversity. To overcome this, we propose a novel and
probabilistically principled framework with two parallel paths. One is a
reconstructive path that utilizes the single given ground truth to obtain a prior
distribution of missing parts and rebuild the original image from this
distribution. The other is a generative path for which the conditional prior is
coupled to the distribution obtained in the reconstructive path. Both are
supported by GANs. We also introduce a new short+long term attention layer that
exploits distant relations among decoder and encoder features, improving
appearance consistency. When tested on datasets with buildings (Paris), faces
(CelebA-HQ), and natural images (ImageNet), our method not only generates
higher-quality completion results but also produces multiple diverse plausible
outputs.
Comment: 21 pages, 16 figures
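The core operation of an attention layer that lets decoder positions mix in encoder features can be sketched as plain scaled dot-product attention. This is a generic stand-in, not the paper's exact short+long term layer; the feature matrices are random toys:

```python
import numpy as np

def attend(queries, keys, values):
    """Scaled dot-product attention: each decoder position aggregates
    encoder features weighted by similarity, so distant (visible)
    regions can inform the completion of masked regions."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)       # (nq, nk) similarities
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=1, keepdims=True)            # softmax over encoder positions
    return w @ values, w

rng = np.random.default_rng(1)
enc = rng.standard_normal((16, 8))  # encoder features (visible regions)
dec = rng.standard_normal((4, 8))   # decoder features (masked regions)
out, w = attend(dec, enc, enc)      # out: one mixed feature per decoder position
```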
Mobile sensor data anonymization
Data from motion sensors such as accelerometers and gyroscopes embedded in our devices can reveal secondary, undesired private information about our activities. This information can be used for malicious purposes, such as user identification by application developers. To address this problem, we propose a data transformation mechanism that enables a device to share data for specific applications (e.g. monitoring daily activities) without revealing private user information (e.g. user identity). We formulate this anonymization process based on an information-theoretic approach and propose a new multi-objective loss function for training convolutional auto-encoders (CAEs) to provide a practical approximation to our anonymization problem. This loss function forces the transformed data to minimize the information about the user's identity as well as the data distortion, in order to preserve application-specific utility. Our training process regulates the encoder to disregard user-identifiable patterns and tunes the decoder to shape the final output independently of the users in the training set. A trained CAE can then be deployed on a user's mobile device to anonymize sensor data before sharing it with an app, even for users who are not included in the training dataset. The results, on a dataset of 24 users for activity recognition, show a promising utility-privacy trade-off on the transformed data, with an accuracy for activity recognition of over 92% while reducing the chance of identifying a user to less than 7%.
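The multi-objective idea, trading reconstruction distortion against identity information, can be sketched as a hypothetical two-term loss. The weighting scheme and the uniform-posterior privacy term below are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def anonymization_loss(x, x_hat, id_probs, lam_d=1.0, lam_p=1.0):
    """Hypothetical two-term objective in the spirit of the paper:
    keep reconstruction distortion low (utility) while pushing the
    identity classifier's posterior towards uniform (privacy)."""
    distortion = np.mean((x - x_hat) ** 2)
    # privacy term: cross-entropy between a uniform target and the
    # predicted identity distribution (minimised when identity is ambiguous)
    uniform = np.full_like(id_probs, 1.0 / id_probs.shape[-1])
    privacy = -np.mean(np.sum(uniform * np.log(id_probs + 1e-12), axis=-1))
    return lam_d * distortion + lam_p * privacy

# a uniform identity posterior (maximal anonymity) scores better than a peaked one
x = np.zeros((5, 3))
p_uniform = np.full((5, 4), 0.25)
p_peaked = np.tile(np.array([0.97, 0.01, 0.01, 0.01]), (5, 1))
```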
ConsInstancy: learning instance representations for semi-supervised panoptic segmentation of concrete aggregate particles
We present a semi-supervised method for panoptic segmentation based on ConsInstancy regularisation, a novel strategy for semi-supervised learning. It leverages completely unlabelled data by enforcing consistency between predicted instance representations and semantic segmentations during training in order to improve segmentation performance. To this end, we also propose new types of instance representations that can be predicted in one simple forward pass through a fully convolutional network (FCN), delivering a convenient and simple-to-train framework for panoptic segmentation. More specifically, we propose the prediction of a three-dimensional instance orientation map as an intermediate representation and two complementary distance transform maps as the final representation, providing unique instance representations for panoptic segmentation. We test our method on two challenging data sets of both hardened and fresh concrete, the latter introduced by the authors in this paper, demonstrating the effectiveness of our approach and outperforming the results achieved by state-of-the-art methods for semi-supervised segmentation. In particular, we show that by leveraging completely unlabelled data in our semi-supervised approach, the achieved overall accuracy (OA) is increased by up to 5% compared to entirely supervised training using only labelled data. Furthermore, we exceed the OA achieved by state-of-the-art semi-supervised methods by up to 1.5%.
Circumpapillary OCT-Focused Hybrid Learning for Glaucoma Grading Using Tailored Prototypical Neural Networks
Glaucoma is one of the leading causes of blindness worldwide and Optical
Coherence Tomography (OCT) is the quintessential imaging technique for its
detection. Unlike most of the state-of-the-art studies focused on glaucoma
detection, in this paper, we propose, for the first time, a novel framework for
glaucoma grading using raw circumpapillary B-scans. In particular, we set out a
new OCT-based hybrid network which combines hand-driven and deep learning
algorithms. An OCT-specific descriptor is proposed to extract hand-crafted
features related to the retinal nerve fibre layer (RNFL). In parallel, an
innovative CNN is developed using skip-connections to include tailored residual
and attention modules to refine the automatic features of the latent space. The
proposed architecture is used as a backbone to conduct a novel few-shot
learning based on static and dynamic prototypical networks. The k-shot paradigm
is redefined, giving rise to a supervised end-to-end system that provides
substantial improvements discriminating between healthy, early and advanced
glaucoma samples. The training and evaluation processes of the dynamic
prototypical network are addressed using two fused databases acquired with the
Heidelberg Spectralis system. Validation and testing results reach a
categorical accuracy of 0.9459 and 0.8788 for glaucoma grading, respectively.
Besides, the high performance reported by the proposed model for glaucoma
detection deserves a special mention. The findings from the class activation
maps are directly in line with the clinicians' opinion since the heatmaps
pointed out the RNFL as the most relevant structure for glaucoma diagnosis.
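The prototypical-network classification step can be sketched as nearest-prototype assignment over mean class embeddings; this is the standard prototypical formulation, with toy 2-D vectors standing in for the backbone's latent features:

```python
import numpy as np

def prototypes(support, labels, n_classes):
    """Class prototype = mean embedding of that class's support samples."""
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_classes)])

def classify(queries, protos):
    """Assign each query embedding to its nearest prototype (Euclidean)."""
    d2 = ((queries[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)

# toy few-shot episode: 3 classes (healthy, early, advanced), 5 shots each
rng = np.random.default_rng(2)
means = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
support = np.vstack([m + 0.2 * rng.standard_normal((5, 2)) for m in means])
labels = np.repeat(np.arange(3), 5)
protos = prototypes(support, labels, 3)
queries = means + 0.1 * rng.standard_normal((3, 2))
pred = classify(queries, protos)
```

In a k-shot episode the prototypes are recomputed per episode, which is what makes the paradigm trainable end to end.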
Deep-seeded Clustering for Unsupervised Valence-Arousal Emotion Recognition from Physiological Signals
Emotions play a significant role in the cognitive processes of the human
brain, such as decision making, learning and perception. The use of
physiological signals, combined with emerging machine learning methods, has
been shown to lead to more objective, reliable and accurate emotion recognition.
Supervised learning methods have dominated the attention of the research
community, but the challenge of collecting the needed labels makes emotion
recognition difficult in large-scale semi-controlled or uncontrolled
experiments. Unsupervised methods are increasingly being explored; however,
sub-optimal signal feature selection and label identification limit their
accuracy and applicability. This article proposes an unsupervised deep
cluster framework for emotion recognition from physiological and psychological
data. Tests on the open benchmark data set WESAD show that deep k-means and
deep c-means distinguish the four quadrants of Russell's circumplex model of
affect with an overall accuracy of 87%. Seeding the clusters with the subject's
subjective assessments helps to circumvent the need for labels.
Comment: 7 pages, 1 figure, 2 tables
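Seeding clusters with the subject's assessments can be illustrated with a k-means variant whose centroids start at the seed points, so each cluster index keeps a fixed semantic meaning (one per valence-arousal quadrant). This is a simplified sketch of the seeding idea, not the paper's deep k-means/c-means pipeline:

```python
import numpy as np

def seeded_kmeans(x, seeds, n_iter=20):
    """k-means whose centroids are initialised from seed points
    (e.g. features of the subject's self-assessed samples)."""
    centroids = seeds.copy()
    for _ in range(n_iter):
        d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(axis=1)          # nearest-centroid assignment
        for k in range(len(centroids)):
            if (assign == k).any():
                centroids[k] = x[assign == k].mean(axis=0)
    return assign, centroids

# toy features around the four valence-arousal quadrants
rng = np.random.default_rng(3)
quads = np.array([[1.0, 1.0], [-1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]])
x = np.vstack([q + 0.2 * rng.standard_normal((20, 2)) for q in quads])
seeds = quads + 0.3 * rng.standard_normal(quads.shape)  # noisy self-assessments
assign, _ = seeded_kmeans(x, seeds)
```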
MTS-LOF: Medical Time-Series Representation Learning via Occlusion-Invariant Features
Medical time series data are indispensable in healthcare, providing critical
insights for disease diagnosis, treatment planning, and patient management. The
exponential growth in data complexity, driven by advanced sensor technologies,
has presented challenges related to data labeling. Self-supervised learning
(SSL) has emerged as a transformative approach to address these challenges,
eliminating the need for extensive human annotation. In this study, we
introduce a novel framework for Medical Time Series Representation Learning,
known as MTS-LOF. MTS-LOF leverages the strengths of contrastive learning and
Masked Autoencoder (MAE) methods, offering a unique approach to representation
learning for medical time series data. By combining these techniques, MTS-LOF
enhances the potential of healthcare applications by providing more
sophisticated, context-rich representations. Additionally, MTS-LOF employs a
multi-masking strategy to facilitate occlusion-invariant feature learning. This
approach allows the model to create multiple views of the data by masking
portions of it. By minimizing the discrepancy between the representations of
these masked patches and the fully visible patches, MTS-LOF learns to capture
rich contextual information within medical time series datasets. The results of
experiments conducted on diverse medical time series datasets demonstrate the
superiority of MTS-LOF over other methods. These findings hold promise for
significantly enhancing healthcare applications by improving representation
learning. Furthermore, our work delves into the integration of joint-embedding
SSL and MAE techniques, shedding light on the intricate interplay between
temporal and structural dependencies in healthcare data. This understanding is
crucial, as it allows us to grasp the complexities of healthcare data analysis.
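The multi-masking strategy can be sketched as generating several randomly occluded views of a series and measuring each view's discrepancy against the full view. Here the raw signals stand in for learned representations, which is a deliberate simplification of the paper's objective:

```python
import numpy as np

def masked_views(x, n_views, mask_ratio, rng):
    """Create several randomly occluded copies of a time series,
    the multi-masking idea behind occlusion-invariant learning."""
    views = []
    for _ in range(n_views):
        v = x.copy()
        idx = rng.choice(len(x), size=int(mask_ratio * len(x)), replace=False)
        v[idx] = 0.0  # occlude a random subset of timesteps
        views.append(v)
    return views

def discrepancy(a, b):
    """Mean squared discrepancy between two (surrogate) representations."""
    return np.mean((a - b) ** 2)

rng = np.random.default_rng(4)
series = np.sin(np.linspace(0, 6.28, 100))
views = masked_views(series, n_views=3, mask_ratio=0.3, rng=rng)
# training would minimise the discrepancy between masked and full views
losses = [discrepancy(v, series) for v in views]
```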
HTM approach to image classification, sound recognition and time series forecasting
Master's dissertation in Biomedical Engineering
The introduction of Machine Learning (ML) into the resolution of problems
typically associated with human behaviour has brought great expectations for
the future. Indeed, the development of machines capable of learning in a way
similar to humans could open grand perspectives for diverse areas such as
healthcare, the banking sector, retail, and any other area in which the
constant attention of a person dedicated to solving a problem could be avoided;
furthermore, problems that are still beyond human capabilities to solve are now
at the disposal of intelligent machines, bringing new possibilities for the
development of humankind.
ML algorithms, specifically Deep Learning (DL) methods, still lack broad
acceptance by the community, even though they are present in various systems we
use on a daily basis. This lack of confidence, which is mandatory before
systems can be allowed to make big, important decisions with great impact on
everyday life, is due to the difficulty of understanding the learning
mechanisms and the predictions that result from them: some algorithms present
themselves as "black boxes", translating an input into an output while not
being fully transparent to the outside. Another complication arises when one
takes into account that these algorithms are trained for a specific task and
according to the training cases found during their development, making them
more susceptible to error in a real environment; one can argue that they do not
constitute a true Artificial Intelligence (AI).
Following this line of thought, this dissertation aims to study a new theory,
Hierarchical Temporal Memory (HTM), which can be placed in the area of Machine
Intelligence (MI), an area that studies how software systems can learn in a way
identical to the learning of a human being. HTM is still a young theory that
rests on the present understanding of the functioning of the human neocortex
and is under constant development; at the moment, the theory holds that the
neocortex regions are organized in a hierarchical structure, forming a memory
system capable of recognizing spatial and temporal patterns. In the course of
this project, an analysis was made of the functioning of the theory and its
applicability
to the various tasks typically solved with ML algorithms, such as image
classification, sound recognition and time series forecasting. At the end of
this dissertation, after evaluating the different results obtained with the
various approaches, it was possible to conclude that, even though these results
were positive, the theory still needs to mature, not only in its theoretical
basis but also in the development of software libraries and frameworks, in
order to capture the attention of the AI community.
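HTM's spatial pattern recognition can be illustrated with a minimal spatial pooler, the HTM component that maps an input bit vector to a sparse distributed representation (SDR). This sketch keeps only overlap scoring and global inhibition, omitting the theory's learning rules (permanence updates) and boosting:

```python
import numpy as np

def spatial_pooler(input_bits, synapses, threshold, n_active):
    """Minimal HTM-style spatial pooler: each column's overlap is the
    number of its connected synapses that see an active input bit;
    the n_active columns with the highest overlap win (global inhibition),
    yielding a sparse distributed representation."""
    overlaps = (synapses & input_bits).sum(axis=1)   # overlap per column
    overlaps = np.where(overlaps >= threshold, overlaps, 0)
    winners = np.argsort(overlaps)[::-1][:n_active]  # top columns win
    sdr = np.zeros(len(synapses), dtype=bool)
    sdr[winners[overlaps[winners] > 0]] = True       # drop sub-threshold wins
    return sdr

# three columns over four input bits; column 0 overlaps twice, column 2 once
input_bits = np.array([1, 1, 0, 0], dtype=bool)
synapses = np.array([[1, 1, 0, 0],
                     [0, 0, 1, 1],
                     [1, 0, 1, 0]], dtype=bool)
sdr = spatial_pooler(input_bits, synapses, threshold=1, n_active=2)
```

In the full theory the winning columns' synapse permanences would then be reinforced towards the active bits, which is how the pooler learns spatial patterns over time.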
- …