SAGE: Sequential Attribute Generator for Analyzing Glioblastomas using Limited Dataset
While deep learning approaches have shown remarkable performance in many
imaging tasks, most of these methods rely on availability of large quantities
of data. Medical image data, however, is scarce and fragmented. Generative
Adversarial Networks (GANs) have recently been very effective in handling such
datasets by generating more data. If the datasets are very small, however, GANs
cannot learn the data distribution properly, resulting in less diverse or
low-quality results. One such limited dataset is that for the concurrent gain
of chromosomes 19 and 20 (19/20 co-gain), a mutation with positive prognostic
value in Glioblastomas (GBM). In this paper, we detect imaging biomarkers for
the mutation to streamline the extensive and invasive prognosis pipeline. Since
this mutation is relatively rare and its dataset is therefore small, we propose a novel
generative framework, the Sequential Attribute GEnerator (SAGE), that
generates detailed tumor imaging features while learning from a limited
dataset. Experiments show that SAGE not only generates high-quality tumors
when compared to the standard Deep Convolutional GAN (DC-GAN) and Wasserstein GAN
with Gradient Penalty (WGAN-GP), but also captures the imaging biomarkers
accurately.
Multilingual Twitter Sentiment Classification: The Role of Human Annotators
What are the limits of automated Twitter sentiment classification? We analyze
a large set of manually labeled tweets in different languages, use them as
training data, and construct automated classification models. It turns out that
the quality of classification models depends much more on the quality and size
of training data than on the type of the model trained. Experimental results
indicate that there is no statistically significant difference between the
performance of the top classification models. We quantify the quality of
training data by applying various annotator agreement measures, and identify
the weakest points of different datasets. We show that the model performance
approaches the inter-annotator agreement when the size of the training set is
sufficiently large. However, it is crucial to regularly monitor the self- and
inter-annotator agreements since this improves the training datasets and
consequently the model performance. Finally, we show that there is strong
evidence that humans perceive the sentiment classes (negative, neutral, and
positive) as ordered.
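As a rough illustration of what an annotator agreement measure computes, the sketch below implements Cohen's kappa for two annotators labeling the same tweets. This is an assumption for illustration only: the abstract does not say which agreement measures were used, and the function name `cohen_kappa` and the label strings are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators over the same items.

    Illustrative only: the abstract does not name the agreement
    measures used, so this choice of measure is an assumption.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four tweets from two annotators.
annotator_1 = ["negative", "neutral", "positive", "positive"]
annotator_2 = ["negative", "neutral", "positive", "neutral"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Kappa treats the labels as unordered categories. Since the abstract reports evidence that the sentiment classes are perceived as ordered, a measure that weights disagreements by distance between classes may be the better fit in practice.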
Improving acoustic vehicle classification by information fusion
We present an information fusion approach for ground vehicle classification based on the emitted acoustic signal. Many acoustic factors can contribute to the classification accuracy of working ground vehicles. Classification relying on a single feature set may lose some useful information if its underlying sound production model is not comprehensive. To improve classification accuracy, we consider an information fusion diagram, in which various aspects of an acoustic signature are taken into account and emphasized separately by two different feature extraction methods. The first set of features aims to represent internal sound production, and a number of harmonic components are extracted to characterize the factors related to the vehicle’s resonance. The second set of features is extracted based on a computationally effective discriminatory analysis, and a group of key frequency components are selected by mutual information, accounting for the sound production from the vehicle’s exterior parts. In correspondence with this structure, we further put forward a modified Bayesian fusion algorithm, which takes advantage of matching each specific feature set with its favored classifier. To assess the proposed approach, experiments are carried out based on a data set containing acoustic signals from different types of vehicles. Results indicate that the fusion approach can effectively increase classification accuracy compared to that achieved using each individual feature set alone. The Bayesian-based decision-level fusion is found to outperform a feature-level fusion approach.
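To make the idea of decision-level fusion concrete, the sketch below combines the class posteriors of two classifiers with the standard Bayesian product rule, assuming the feature sets are conditionally independent given the class. This is not the modified Bayesian fusion algorithm from the abstract, whose details are not given here. The function name `fuse_posteriors`, the class names, and all probabilities are hypothetical.

```python
def fuse_posteriors(posteriors, priors):
    """Combine per-classifier class posteriors with the product rule.

    Assumes the feature sets are conditionally independent given the
    class, so P(c | x1..xK) is proportional to
    P(c) * prod_k [P(c | xk) / P(c)].
    This is a textbook rule, not the paper's modified algorithm.
    """
    scores = {}
    for c, prior in priors.items():
        score = prior
        for p in posteriors:
            score *= p[c] / prior
        scores[c] = score
    total = sum(scores.values())
    if total == 0:
        raise ValueError("all fused scores are zero")
    # Normalize so the fused posteriors sum to one.
    return {c: s / total for c, s in scores.items()}


# Hypothetical two-class example with made-up numbers.
priors = {"truck": 0.5, "car": 0.5}
harmonic_classifier = {"truck": 0.6, "car": 0.4}   # harmonic features
frequency_classifier = {"truck": 0.7, "car": 0.3}  # key-frequency features
fused = fuse_posteriors([harmonic_classifier, frequency_classifier], priors)
decision = max(fused, key=fused.get)
```

Fusing at the decision level lets each feature set keep its own classifier, which is the property the abstract highlights. Feature-level fusion would instead concatenate both feature sets into one vector for a single classifier.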
An Annotated Corpus for Machine Reading of Instructions in Wet Lab Protocols
We describe an effort to annotate a corpus of natural language instructions
consisting of 622 wet lab protocols to facilitate automatic or semi-automatic
conversion of protocols into a machine-readable format and benefit biological
research. Experimental results demonstrate the utility of our corpus for
developing machine learning approaches to shallow semantic parsing of
instructional texts. We make our annotated Wet Lab Protocol Corpus available to
the research community.
A Large-Scale Study of Modern Code Review and Security in Open Source Projects.
- …