    A Novel Self-Learning Framework for Bladder Cancer Grading Using Histopathological Images

    Recently, bladder cancer has increased significantly in both incidence and mortality. Two subtypes are currently distinguished by tumour growth: non-muscle-invasive (NMIBC) and muscle-invasive bladder cancer (MIBC). In this work, we focus on the MIBC subtype because it has the worst prognosis and can spread to adjacent organs. We present a self-learning framework to grade bladder cancer from histological images stained via immunohistochemical techniques. Specifically, we propose a novel Deep Convolutional Embedded Attention Clustering (DCEAC) model which classifies histological patches into different severity levels of the disease, according to the patterns established in the literature. The proposed DCEAC model follows a two-step, fully unsupervised learning methodology to discern between non-tumour, mild and infiltrative patterns in high-resolution samples of 512x512 pixels. Our system outperforms previous clustering-based methods by including a convolutional attention module, which refines the features of the latent space before the classification stage. The proposed network exceeds state-of-the-art approaches by 2-3% across different metrics, achieving a final average accuracy of 0.9034 in a multi-class scenario. Furthermore, the reported class activation maps show that our model learns by itself the same patterns that clinicians consider relevant, without requiring prior annotation steps. This represents a breakthrough in muscle-invasive bladder cancer grading, bridging the gap with models trained on labelled data.
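    The clustering mechanism the abstract describes (a convolutional encoder whose attention-refined latent codes are softly assigned to learnable centroids) can be sketched as below. Layer sizes, the attention design, and all names are illustrative assumptions, not the authors' code; the soft assignment uses the Student's t kernel common in deep embedded clustering.

```python
# A minimal sketch (not the authors' code) of deep embedded clustering
# with a convolutional attention bottleneck, loosely following the
# DCEAC description; all layer sizes and names are illustrative.
import torch
import torch.nn as nn

class AttentionEncoder(nn.Module):
    def __init__(self, latent_dim=64, n_clusters=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # simple channel-attention gate that re-weights latent features
        self.attention = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(64, 64, 1), nn.Sigmoid()
        )
        self.project = nn.Linear(64, latent_dim)
        # one learnable centroid per pattern (non-tumour, mild, infiltrative)
        self.centroids = nn.Parameter(torch.randn(n_clusters, latent_dim))

    def forward(self, x):
        h = self.features(x)
        h = h * self.attention(h)              # refine features before clustering
        z = self.project(h.mean(dim=(2, 3)))   # global average pool -> latent code
        # soft cluster assignment with a Student's t kernel (as in DEC)
        q = (1.0 + torch.cdist(z, self.centroids).pow(2)).reciprocal()
        return q / q.sum(dim=1, keepdim=True)

q = AttentionEncoder()(torch.randn(2, 3, 512, 512))  # two 512x512 patches
print(q.shape)  # (2, 3) soft assignments over the three severity patterns
```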

    Pluralistic Image Completion

    Most image completion methods produce only one result for each masked input, although there may be many reasonable possibilities. In this paper, we present an approach for pluralistic image completion -- the task of generating multiple and diverse plausible solutions for image completion. A major challenge faced by learning-based approaches is that there is usually only one ground truth training instance per label. As such, sampling from conditional VAEs still leads to minimal diversity. To overcome this, we propose a novel and probabilistically principled framework with two parallel paths. One is a reconstructive path that utilizes the single given ground truth to obtain a prior distribution over the missing parts and rebuild the original image from this distribution. The other is a generative path for which the conditional prior is coupled to the distribution obtained in the reconstructive path. Both are supported by GANs. We also introduce a new short+long term attention layer that exploits distant relations between decoder and encoder features, improving appearance consistency. When tested on datasets with buildings (Paris), faces (CelebA-HQ), and natural images (ImageNet), our method not only generates higher-quality completion results but also provides multiple and diverse plausible outputs. (Comment: 21 pages, 16 figures)
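    A hedged sketch of the two-path coupling described above: the reconstructive path infers a distribution over the missing region from the full image, while the generative path's conditional prior (computed from the masked image alone) is pulled toward it with a KL term. The encoders, latent size, and masking here are placeholder assumptions, not the paper's architecture.

```python
# Hedged sketch of the dual-path coupling: placeholder fully-connected
# "encoders" stand in for the paper's networks; latent size 32 is an
# arbitrary assumption.
import torch
from torch import nn
from torch.distributions import Normal, kl_divergence

enc_full   = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64 * 3, 2 * 32))
enc_masked = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64 * 3, 2 * 32))

def coupling_loss(full_img, masked_img):
    mu_q, logvar_q = enc_full(full_img).chunk(2, dim=1)       # reconstructive path
    mu_p, logvar_p = enc_masked(masked_img).chunk(2, dim=1)   # generative path
    q = Normal(mu_q, logvar_q.mul(0.5).exp())  # posterior from the ground truth
    p = Normal(mu_p, logvar_p.mul(0.5).exp())  # conditional prior from masked input
    # pull the learned conditional prior toward the reconstructive distribution
    return kl_divergence(q, p).sum(dim=1).mean()

full = torch.rand(4, 3, 64, 64)
masked = full * (torch.rand(4, 1, 64, 64) > 0.3)  # random occlusion mask
print(coupling_loss(full, masked))
```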

    Mobile sensor data anonymization

    Data from motion sensors such as accelerometers and gyroscopes embedded in our devices can reveal secondary, undesired private information about our activities. This information can be used for malicious purposes such as user identification by application developers. To address this problem, we propose a data transformation mechanism that enables a device to share data for specific applications (e.g. monitoring daily activities) without revealing private user information (e.g. user identity). We formulate this anonymization process based on an information-theoretic approach and propose a new multi-objective loss function for training convolutional auto-encoders (CAEs) to provide a practical approximation to our anonymization problem. This loss function forces the transformed data to minimize the information about the user's identity as well as the distortion of the data, in order to preserve application-specific utility. Our training process regulates the encoder to disregard user-identifiable patterns and tunes the decoder to shape the final output independently of the users in the training set. A trained CAE can then be deployed on a user's mobile device to anonymize sensor data before sharing it with an app, even for users who are not included in the training dataset. The results, on a dataset of 24 users for activity recognition, show a promising utility-privacy trade-off on the transformed data, with an accuracy for activity recognition over 92% while reducing the chance of identifying a user to less than 7%.
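    The multi-objective loss can be sketched as follows, assuming a 1-D convolutional auto-encoder over fixed-length sensor windows: one term preserves activity information, one pushes an identity classifier toward the uniform distribution, and one bounds distortion. Weights, shapes, and head designs are illustrative, not the paper's exact formulation.

```python
# Sketch of a multi-objective anonymization loss for a 1-D convolutional
# auto-encoder; weights, shapes and heads are illustrative assumptions.
import torch
import torch.nn.functional as F
from torch import nn

window, channels, n_acts, n_users = 128, 6, 4, 24
cae = nn.Sequential(  # transforms raw sensor windows into anonymized ones
    nn.Conv1d(channels, 16, 5, padding=2), nn.ReLU(),
    nn.Conv1d(16, channels, 5, padding=2),
)
act_head = nn.Sequential(nn.Flatten(), nn.Linear(window * channels, n_acts))
id_head  = nn.Sequential(nn.Flatten(), nn.Linear(window * channels, n_users))

def anonymization_loss(x, act_labels, w_act=1.0, w_id=1.0, w_dist=0.1):
    x_anon = cae(x)
    utility = F.cross_entropy(act_head(x_anon), act_labels)  # keep activity info
    # privacy: push the identity posterior toward the uniform distribution
    privacy = -F.log_softmax(id_head(x_anon), dim=1).mean()
    distortion = F.mse_loss(x_anon, x)                       # bound data distortion
    return w_act * utility + w_id * privacy + w_dist * distortion

x = torch.randn(8, channels, window)
print(anonymization_loss(x, torch.randint(0, n_acts, (8,))))
```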

    ConsInstancy: learning instance representations for semi-supervised panoptic segmentation of concrete aggregate particles

    We present a semi-supervised method for panoptic segmentation based on ConsInstancy regularisation, a novel strategy for semi-supervised learning. It leverages completely unlabelled data by enforcing consistency between predicted instance representations and semantic segmentations during training in order to improve segmentation performance. To this end, we also propose new types of instance representations that can be predicted in one simple forward pass through a fully convolutional network (FCN), delivering a convenient and simple-to-train framework for panoptic segmentation. More specifically, we propose the prediction of a three-dimensional instance orientation map as an intermediate representation and two complementary distance transform maps as the final representation, providing unique instance representations for panoptic segmentation. We test our method on two challenging data sets of both hardened and fresh concrete, the latter introduced in this paper, demonstrating the effectiveness of our approach and outperforming state-of-the-art methods for semi-supervised segmentation. In particular, we show that leveraging completely unlabelled data in our semi-supervised approach increases the overall accuracy (OA) by up to 5% compared to entirely supervised training using only labelled data. Furthermore, we exceed the OA achieved by state-of-the-art semi-supervised methods by up to 1.5%.
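    As a rough illustration of the "two complementary distance transform maps", the sketch below builds per-instance targets with SciPy: the distance of each pixel to the instance boundary, and its complement peaking at the boundary. The authors' exact definition may differ.

```python
# One plausible construction (SciPy) of two complementary
# distance-transform targets per instance; the authors' exact
# definition may differ.
import numpy as np
from scipy import ndimage

def distance_targets(instance_map):
    inner = np.zeros(instance_map.shape, dtype=np.float32)
    outer = np.zeros(instance_map.shape, dtype=np.float32)
    for inst_id in np.unique(instance_map):
        if inst_id == 0:                          # 0 = background
            continue
        mask = instance_map == inst_id
        d = ndimage.distance_transform_edt(mask)  # distance to the boundary
        inner[mask] = d[mask]                     # peaks at the instance centre
        outer[mask] = d[mask].max() - d[mask]     # complement, peaks at the boundary
    return inner, outer

toy = np.zeros((8, 8), dtype=int)
toy[2:6, 2:6] = 1                                 # one square "particle"
inner, outer = distance_targets(toy)
print(inner.max(), outer.max())
```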

    Circumpapillary OCT-Focused Hybrid Learning for Glaucoma Grading Using Tailored Prototypical Neural Networks

    Glaucoma is one of the leading causes of blindness worldwide, and Optical Coherence Tomography (OCT) is the quintessential imaging technique for its detection. Unlike most state-of-the-art studies, which focus on glaucoma detection, in this paper we propose, for the first time, a novel framework for glaucoma grading using raw circumpapillary B-scans. In particular, we set out a new OCT-based hybrid network which combines hand-driven and deep learning algorithms. An OCT-specific descriptor is proposed to extract hand-crafted features related to the retinal nerve fibre layer (RNFL). In parallel, an innovative CNN is developed using skip connections to include tailored residual and attention modules that refine the automatic features of the latent space. The proposed architecture is used as a backbone to conduct a novel few-shot learning approach based on static and dynamic prototypical networks. The k-shot paradigm is redefined, giving rise to a supervised end-to-end system which provides substantial improvements in discriminating between healthy, early and advanced glaucoma samples. The training and evaluation of the dynamic prototypical network are addressed on two fused databases acquired with the Heidelberg Spectralis system. Validation and testing reach a categorical accuracy of 0.9459 and 0.8788 for glaucoma grading, respectively. In addition, the proposed model reports a notably high performance for glaucoma detection. The findings from the class activation maps are directly in line with clinicians' opinion, since the heatmaps point to the RNFL as the most relevant structure for glaucoma diagnosis.
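    The prototypical-network stage can be illustrated with a minimal episode: class prototypes are the mean support embeddings and queries are scored by negative squared distance. The backbone producing the embeddings is a placeholder here, not the paper's RNFL-aware hybrid network.

```python
# Minimal prototypical-network episode; the 128-d embeddings would come
# from the paper's hybrid backbone, which is not reproduced here.
import torch
import torch.nn.functional as F

def episode_logits(support, support_labels, query, n_classes=3):
    # prototypes = mean support embedding per class (healthy/early/advanced)
    prototypes = torch.stack([
        support[support_labels == c].mean(dim=0) for c in range(n_classes)
    ])
    return -torch.cdist(query, prototypes).pow(2)  # closer prototype = higher logit

support = torch.randn(9, 128)                      # a 3-way, 3-shot support set
labels  = torch.tensor([0, 0, 0, 1, 1, 1, 2, 2, 2])
logits  = episode_logits(support, labels, torch.randn(5, 128))
print(F.softmax(logits, dim=1))                    # (5, 3) class posteriors
```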

    Deep-seeded Clustering for Unsupervised Valence-Arousal Emotion Recognition from Physiological Signals

    Emotions play a significant role in the cognitive processes of the human brain, such as decision making, learning and perception. Combined with emerging machine learning methods, the use of physiological signals has been shown to lead to more objective, reliable and accurate emotion recognition. Supervised learning methods have dominated the attention of the research community, but the challenge of collecting the needed labels makes emotion recognition difficult in large-scale semi-controlled or uncontrolled experiments. Unsupervised methods are increasingly being explored; however, sub-optimal signal feature selection and label identification challenge their accuracy and applicability. This article proposes an unsupervised deep clustering framework for emotion recognition from physiological and psychological data. Tests on the open benchmark data set WESAD show that deep k-means and deep c-means distinguish the four quadrants of Russell's circumplex model of affect with an overall accuracy of 87%. Seeding the clusters with the subject's subjective assessments helps to circumvent the need for labels. (Comment: 7 pages, 1 figure, 2 tables)
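    The seeding idea lends itself to a short sketch: initial centroids are taken as the mean embedding of the samples each subject self-assessed into a quadrant, and ordinary k-means then refines them. The embeddings and the deep feature extractor are assumed to exist elsewhere; names are illustrative.

```python
# Sketch of seeding k-means with self-assessed quadrant labels; the deep
# feature extractor producing `embeddings` is assumed to exist elsewhere.
import numpy as np
from sklearn.cluster import KMeans

def seeded_kmeans(embeddings, seed_quadrants, n_quadrants=4):
    # one seed centroid per quadrant of Russell's circumplex model
    seeds = np.stack([
        embeddings[seed_quadrants == q].mean(axis=0) for q in range(n_quadrants)
    ])
    km = KMeans(n_clusters=n_quadrants, init=seeds, n_init=1).fit(embeddings)
    return km.labels_

z = np.random.randn(200, 16)                 # stand-in latent features
q = np.random.randint(0, 4, size=200)        # stand-in subjective assessments
print(np.bincount(seeded_kmeans(z, q)))      # cluster sizes
```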

    MTS-LOF: Medical Time-Series Representation Learning via Occlusion-Invariant Features

    Medical time series data are indispensable in healthcare, providing critical insights for disease diagnosis, treatment planning, and patient management. The exponential growth in data complexity, driven by advanced sensor technologies, has presented challenges related to data labeling. Self-supervised learning (SSL) has emerged as a transformative approach to address these challenges, eliminating the need for extensive human annotation. In this study, we introduce a novel framework for Medical Time Series Representation Learning, known as MTS-LOF. MTS-LOF leverages the strengths of contrastive learning and Masked Autoencoder (MAE) methods, offering a unique approach to representation learning for medical time series data. By combining these techniques, MTS-LOF enhances the potential of healthcare applications by providing more sophisticated, context-rich representations. Additionally, MTS-LOF employs a multi-masking strategy to facilitate occlusion-invariant feature learning, creating multiple views of the data by masking portions of it. By minimizing the discrepancy between the representations of these masked patches and the fully visible patches, MTS-LOF learns to capture rich contextual information within medical time series datasets. Experiments conducted on diverse medical time series datasets demonstrate the superiority of MTS-LOF over other methods. These findings hold promise for significantly enhancing healthcare applications by improving representation learning. Furthermore, our work delves into the integration of joint-embedding SSL and MAE techniques, shedding light on the intricate interplay between temporal and structural dependencies in healthcare data. This understanding is crucial, as it allows us to grasp the complexities of healthcare data analysis.
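    The multi-masking strategy can be sketched as follows: several randomly masked views of the same series are encoded and each is pulled toward the representation of the fully visible series. The encoder, mask ratio, and cosine objective below are placeholder assumptions, not the published MTS-LOF configuration.

```python
# Sketch of occlusion-invariant training via multi-masking; encoder,
# mask ratio and the cosine objective are placeholder assumptions.
import torch
import torch.nn.functional as F
from torch import nn

encoder = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32))

def multi_mask_loss(x, n_views=4, mask_ratio=0.5):
    target = encoder(x)                      # representation of the visible series
    loss = 0.0
    for _ in range(n_views):
        mask = (torch.rand_like(x) > mask_ratio).float()
        view = encoder(x * mask)             # one occluded view of the same series
        # minimize the discrepancy between masked and visible representations
        loss = loss + (1 - F.cosine_similarity(view, target.detach(), dim=1)).mean()
    return loss / n_views

x = torch.randn(16, 64)                      # 16 windows of a medical time series
print(multi_mask_loss(x))
```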

    HTM approach to image classification, sound recognition and time series forecasting

    Master's dissertation in Biomedical Engineering. The introduction of Machine Learning (ML) into the orbit of problems typically associated with human behaviour has brought great expectations for the future. Indeed, the possible development of machines capable of learning in a way similar to humans could open grand perspectives for diverse areas like healthcare, the banking sector, retail, and any other area in which the constant attention of a person dedicated to solving a problem could be avoided; furthermore, problems that are still beyond human reach are now at the disposal of intelligent machines, bringing new possibilities for human development. ML algorithms, and specifically Deep Learning (DL) methods, still lack broader acceptance by the community, even though they are present in many systems we use on a daily basis. This confidence, which is mandatory before systems are allowed to make important decisions with great impact on everyday life, is undermined by the difficulty of understanding the learning mechanisms and the predictions that result from them: some algorithms present themselves as "black boxes", translating an input into an output while not being transparent to the outside. A further complication arises when one considers that these algorithms are trained for a specific task, in accordance with the training cases found during their development, making them more susceptible to error in a real environment; one can argue that they therefore do not constitute a true Artificial Intelligence (AI). Following this line of thought, this dissertation studies a new theory, Hierarchical Temporal Memory (HTM), which belongs to the area of Machine Intelligence (MI), the study of how software systems can learn in a way identical to that of a human being. HTM is still a young theory that rests on the present understanding of the functioning of the human neocortex and is under constant development; at the moment, the theory holds that the zones of the neocortex are organized in a hierarchical structure, forming a memory system capable of recognizing spatial and temporal patterns. In the course of this project, the functioning of the theory and its applicability to various tasks typically solved with ML algorithms, such as image classification, sound recognition and time series forecasting, were analysed. After evaluating the results obtained with the different approaches, it was possible to conclude that, even though the results were positive, the theory still needs to mature, not only in its theoretical basis but also in the development of software libraries and frameworks, in order to capture the attention of the AI community.
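    For readers unfamiliar with HTM, a toy version of the spatial pooler's core step (overlap scoring followed by k-winners-take-all sparsification) is sketched below; real HTM implementations such as Numenta's NuPIC add boosting, learning, and topology, so this is only a schematic illustration.

```python
# Toy spatial-pooler step: overlap scoring + k-winners-take-all; real HTM
# implementations (e.g. Numenta's NuPIC) add boosting, learning, topology.
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_columns, sparsity = 256, 128, 0.02
# each column is connected to a random subset of the input bits
connections = rng.random((n_columns, n_inputs)) < 0.1

def spatial_pool(input_bits):
    overlap = connections.astype(int) @ input_bits.astype(int)
    k = max(1, int(sparsity * n_columns))
    active = np.zeros(n_columns, dtype=bool)
    active[np.argsort(overlap)[-k:]] = True  # the k best-matching columns fire
    return active                            # a sparse distributed representation

sdr = spatial_pool(rng.random(n_inputs) < 0.05)
print(int(sdr.sum()), "active columns out of", n_columns)
```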