Representation Learning With Convolutional Neural Networks
Deep learning methods have achieved great success in Computer Vision and Natural Language Processing. The rapidly developing field of deep learning has recently turned to the question of how to learn meaningful and effective representations of data. This matters because the performance of machine learning approaches depends heavily on the choice and quality of the data representation, and different kinds of representation entangle and hide the different explanatory factors of variation behind the data.
In this dissertation, we focus on representation learning with deep neural networks for different data formats including text, 3D polygon shapes, and brain fiber tracts.
First, we propose a topic-based word representation learning approach for text classification. The proposed approach takes the global semantic relationships between words over the whole corpus into consideration and encodes these relationships into distributed vector representations with the continuous Skip-gram model. The learned representations, which capture a large number of precise syntactic and semantic word relationships, are taken as input to Convolutional Neural Networks for classification. Our experimental results show the effectiveness of the proposed method on indexing of biomedical articles, behavior code annotation of clinical text fragments, and classification of newsgroups.
Second, we present a 3D polygon shape representation learning framework for shape segmentation. We propose the Directionally Convolutional Network (DCN), which extends convolution operations from images to the polygon mesh surface while remaining rotation-invariant. Based on the proposed DCN, we learn effective shape representations from raw geometric features and then classify each face of a given polygon into predefined semantic parts. Through extensive experiments, we demonstrate that our framework outperforms the current state of the art.
Third, we propose to learn effective and meaningful representations for brain fiber tracts using deep learning frameworks. We handle the highly unbalanced dataset by introducing an asymmetrical loss function that treats easily classified samples and hard-to-classify ones differently. This keeps the training loss from being dominated by the easy samples and makes training more efficient. In addition, we learn more effective and meaningful representations by introducing a deeper network and metric learning approaches. Furthermore, we propose to improve the interpretability of our framework by introducing an attention mechanism. Our experimental results show that our proposed framework significantly outperforms the current gold standard on the real-world dataset.
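The abstract does not give the form of its asymmetrical loss. As a rough illustration of the idea only, the sketch below uses the focal loss, a well-known loss that down-weights easily classified samples; it is a stand-in chosen here, not necessarily the dissertation's method. The function names and the `gamma` value are assumptions.

```python
import math


def focal_loss(p_true: float, gamma: float = 2.0) -> float:
    """Focal-style loss for one sample.

    p_true is the predicted probability of the sample's true class.
    The factor (1 - p_true) ** gamma shrinks the loss of confident
    (easy) predictions far more than that of uncertain (hard) ones.
    """
    p = min(max(p_true, 1e-12), 1.0)  # clamp to avoid log(0)
    return -((1.0 - p) ** gamma) * math.log(p)


def cross_entropy(p_true: float) -> float:
    """Plain cross-entropy, i.e. the focal loss with gamma = 0."""
    return focal_loss(p_true, gamma=0.0)


# Hypothetical predictions: one easy sample and one hard sample.
for name, p in [("easy", 0.95), ("hard", 0.30)]:
    print(name, cross_entropy(p), focal_loss(p))
```

With `gamma = 2`, the easy sample keeps (1 - 0.95)^2 = 0.25% of its cross-entropy loss while the hard sample keeps (1 - 0.30)^2 = 49%, so a large pool of easy samples contributes much less to the total.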
Resolving Structure in Human Brain Organization: Identifying Mesoscale Organization in Weighted Network Representations
Human brain anatomy and function display a combination of modular and
hierarchical organization, suggesting the importance of both cohesive
structures and variable resolutions in the facilitation of healthy cognitive
processes. However, tools to simultaneously probe these features of brain
architecture require further development. We propose and apply a set of methods
to extract cohesive structures in network representations of brain connectivity
using multi-resolution techniques. We employ a combination of soft
thresholding, windowed thresholding, and resolution in community detection
that enables us to identify and isolate structures associated with different
weights. One such mesoscale structure is bipartivity, which quantifies the
extent to which the brain is divided into two partitions with high connectivity
between partitions and low connectivity within partitions. A second,
complementary mesoscale structure is modularity, which quantifies the extent to
which the brain is divided into multiple communities with strong connectivity
within each community and weak connectivity between communities. Our methods
lead to multi-resolution curves of these network diagnostics over a range of
spatial, geometric, and structural scales. For statistical comparison, we
contrast our results with those obtained for several benchmark null models. Our
work demonstrates that multi-resolution diagnostic curves capture complex
organizational profiles in weighted graphs. We apply these methods to the
identification of resolution-specific characteristics of healthy weighted graph
architecture and altered connectivity profiles in psychiatric disease.
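As a minimal sketch of two ingredients named in this abstract, the code below applies a windowed threshold to a weighted adjacency matrix and computes the standard Newman modularity of a given partition. The half-open window convention, the function names, and the toy graph are assumptions made for illustration; the paper's own definitions and its bipartivity measure are not reproduced here.

```python
def windowed_threshold(weights, low, high):
    """Keep edges whose weight w satisfies low <= w < high; zero the rest."""
    return [[w if low <= w < high else 0.0 for w in row] for row in weights]


def modularity(weights, communities):
    """Newman modularity Q of a partition of an undirected weighted graph.

    weights is a symmetric adjacency matrix (list of lists) and
    communities[i] is the community label of node i.
    """
    strength = [sum(row) for row in weights]  # weighted degree of each node
    two_m = sum(strength)                     # twice the total edge weight
    if two_m == 0:
        return 0.0
    q = 0.0
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            if communities[i] == communities[j]:
                q += w - strength[i] * strength[j] / two_m
    return q / two_m


def build_graph(n, edges):
    """Build a symmetric adjacency matrix from (i, j, weight) triples."""
    a = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        a[i][j] = a[j][i] = w
    return a


# Toy graph: two strongly connected triangles joined by one weak edge.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
         (2, 3, 0.2)]
graph = build_graph(6, edges)
labels = [0, 0, 0, 1, 1, 1]

print(modularity(graph, labels))
print(modularity(windowed_threshold(graph, 0.5, 1.5), labels))
```

Sweeping the window bounds and recomputing the diagnostic at each step is one simple way to trace a curve of modularity against edge weight.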
A Vietnamese Handwritten Text Recognition Pipeline for Tetanus Medical Records
Machine learning techniques are successful for optical character recognition tasks, especially in recognizing handwriting. However, recognizing Vietnamese handwriting is challenging because of its six additional distinctive tonal symbols and vowels. The challenge is amplified by the handwriting of health workers in an emergency care setting, where staff are under constant pressure to record the well-being of patients. In this study, we aim to digitize the handwriting of Vietnamese health workers. We develop a complete handwritten text recognition pipeline that receives scanned documents, detects and enhances the handwritten text areas of interest, transcribes the images into computer text, and finally auto-corrects invalid words and terms to achieve high accuracy. From experiments with medical documents written by 30 doctors and nurses from the Tetanus Emergency Care unit at the Hospital for Tropical Diseases, we obtain promising results of 2% and 12% for Character Error Rate and Word Error Rate, respectively.
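For readers unfamiliar with the two reported metrics, the sketch below shows the usual way Character Error Rate and Word Error Rate are computed: the Levenshtein edit distance between reference and transcription, divided by the reference length in characters or words. The NFC normalization step is an assumption added here so that composed and decomposed Vietnamese diacritics compare equal; the paper's exact evaluation procedure is not stated in the abstract.

```python
import unicodedata


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def _nfc(text):
    """Normalize to NFC so each accented letter is a single code point."""
    return unicodedata.normalize("NFC", text)


def cer(reference, hypothesis):
    """Character Error Rate: character edits / reference length."""
    ref, hyp = _nfc(reference), _nfc(hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(reference, hypothesis):
    """Word Error Rate: word edits / number of reference words."""
    ref, hyp = _nfc(reference).split(), _nfc(hypothesis).split()
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


print(cer("kitten", "sitting"), wer("the cat sat", "the cat sit"))
```

Because a single wrong character makes the whole word wrong, WER is typically higher than CER on the same output, which is consistent with the 2% and 12% figures reported.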
CUTS: A Fully Unsupervised Framework for Medical Image Segmentation
In this work we introduce CUTS (Contrastive and Unsupervised Training for
Segmentation), the first fully unsupervised deep learning framework for medical
image segmentation, facilitating the use of the vast majority of imaging data
that is not labeled or annotated. Segmenting medical images into regions of
interest is a critical task for facilitating both patient diagnoses and
quantitative research. A major limiting factor in this segmentation is the lack
of labeled data, as getting expert annotations for each new set of imaging data
or task can be expensive, labor intensive, and inconsistent across annotators:
thus, we utilize self-supervision based on pixel-centered patches from the
images themselves. Our unsupervised approach is based on a training objective
with both contrastive learning and autoencoding aspects. Previous contrastive
learning approaches for medical image segmentation have focused on image-level
contrastive training, rather than our intra-image patch-level approach, or have
used this as a pre-training task where the network needed further supervised
training afterwards. By contrast, we build the first entirely unsupervised
framework that operates at the pixel-centered-patch level. Specifically, we add
novel augmentations and a patch reconstruction loss, and introduce a new pixel
clustering and identification framework. Our model achieves improved results on
several key medical imaging tasks, as verified by held-out expert annotations
on the task of segmenting geographic atrophy (GA) regions of images of the
retina.
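The abstract describes a training objective with both contrastive and autoencoding aspects but does not give its form. The sketch below is one plausible shape for such an objective, assuming an InfoNCE-style contrastive term on patch embeddings and a mean-squared-error reconstruction term mixed by a hypothetical `weight`; the function names, the temperature, and the toy vectors are all assumptions rather than the CUTS implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss: low when the anchor is closer to its positive
    than to any of the negatives."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def mse(patch, reconstruction):
    """Mean squared error between a patch and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(patch, reconstruction)) / len(patch)


def combined_loss(anchor, positive, negatives, patch, reconstruction,
                  weight=0.5, temperature=0.5):
    """Weighted mix of the contrastive and reconstruction terms."""
    contrastive = info_nce(anchor, positive, negatives, temperature)
    return weight * contrastive + (1.0 - weight) * mse(patch, reconstruction)


# Toy embeddings for a pixel-centered patch, a similar patch from the
# same image, and two dissimilar patches (all values are made up).
anchor, positive = [1.0, 0.0], [0.9, 0.1]
negatives = [[0.0, 1.0], [-1.0, 0.2]]
patch, reconstruction = [0.2, 0.4, 0.6], [0.25, 0.35, 0.6]
print(combined_loss(anchor, positive, negatives, patch, reconstruction))
```

The contrastive term pulls embeddings of similar patches together, while the reconstruction term forces each embedding to retain enough information to rebuild its own patch.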