A Survey on Bayesian Deep Learning
A comprehensive artificial intelligence system needs to not only perceive the
environment with different `senses' (e.g., seeing and hearing) but also infer
the world's conditional (or even causal) relations and corresponding
uncertainty. The past decade has seen major advances in many perception tasks
such as visual object recognition and speech recognition using deep learning
models. For higher-level inference, however, probabilistic graphical models
with their Bayesian nature are still more powerful and flexible. In recent
years, Bayesian deep learning has emerged as a unified probabilistic framework
to tightly integrate deep learning and Bayesian models. In this general
framework, the perception of text or images using deep learning can boost the
performance of higher-level inference and in turn, the feedback from the
inference process is able to enhance the perception of text or images. This
survey provides a comprehensive introduction to Bayesian deep learning and
reviews its recent applications on recommender systems, topic models, control,
etc. Besides, we also discuss the relationship and differences between Bayesian
deep learning and other related topics such as Bayesian treatment of neural
networks.
Comment: To appear in ACM Computing Surveys (CSUR) 202
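One widely used approximation to the Bayesian treatment of neural networks mentioned in this abstract is Monte Carlo dropout: keeping dropout active at prediction time and averaging several stochastic forward passes to obtain a predictive mean and an uncertainty estimate. The sketch below illustrates the idea on a tiny fixed-weight network; the network itself is a toy assumption, not a model from the survey.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny one-hidden-layer network with fixed random weights (illustration only).
W1 = rng.normal(size=(1, 32))
W2 = rng.normal(size=(32, 1))

def forward(x, drop_p=0.5):
    """One stochastic forward pass: dropout stays ON at prediction time."""
    h = np.tanh(x @ W1)
    mask = rng.random(h.shape) > drop_p   # Bernoulli dropout mask
    h = h * mask / (1.0 - drop_p)         # inverted-dropout scaling
    return h @ W2

x = np.array([[0.3]])
# Repeated stochastic passes approximate samples from the predictive distribution.
samples = np.concatenate([forward(x) for _ in range(200)])
mean, std = samples.mean(), samples.std()  # predictive mean and uncertainty
```

The spread `std` across passes serves as the "corresponding uncertainty" the abstract refers to: inputs far from the training data tend to produce more disagreement between stochastic passes.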
Modular Machine Learning Methods for Computer-Aided Diagnosis of Breast Cancer
The purpose of this study was to improve breast cancer diagnosis by reducing the number of benign biopsies performed. To this end, we investigated modular and ensemble systems of machine learning methods for computer-aided diagnosis (CAD) of breast cancer. A modular system partitions the input space into smaller domains, each of which is handled by a local model. An ensemble system uses multiple models for the same cases and combines the models' predictions.
Five supervised machine learning techniques (LDA, SVM, BP-ANN, CBR, CART) were trained to predict the biopsy outcome from mammographic findings (BIRADS™) and patient age based on a database of 2258 cases mixed from multiple institutions. The generalization of the models was tested on a second set of 2177 cases. Clusters were identified in the database using a priori knowledge and unsupervised learning methods (agglomerative hierarchical clustering followed by K-Means, SOM, AutoClass). The performance of the global models over the clusters was examined, and local models were trained for the clusters.
While some local models were superior to some global models, we were unable to build a modular CAD system that was better than the global BP-ANN model. The ensemble systems based on simplistic combination schemes did not result in significant improvements, and more complicated combination schemes were found to be unduly optimistic. One of the most striking results of this dissertation was that CAD systems trained on a mixture of lesion types performed much better on masses than on calcifications. Our study of the institutional effects suggests that models built on cases mixed between institutions may overcome some of the weaknesses of models built on cases from a single institution. Notably, each of the unsupervised methods identified a cluster of younger women with well-circumscribed or obscured, oval-shaped masses that accounted for the majority of the BP-ANN's recommendations for follow-up. From the cluster analysis and the CART models, we determined a simple diagnostic rule that performed comparably to the global BP-ANN. Approximately 98% sensitivity could be maintained while providing approximately 26% specificity. This should be compared to the clinical status quo of 100% sensitivity and 0% specificity on this database of indeterminate cases already referred to biopsy.
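The soft-voting ensemble idea described above, averaging the predicted probabilities of several classifiers and then sweeping the decision threshold to trade sensitivity against specificity, can be sketched as follows. The data here is synthetic (a stand-in for the mammography database, which is not available), and the two classifiers are illustrative choices rather than the dissertation's exact models.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the mammography data (features + malignant/benign label).
X, y = make_classification(n_samples=600, n_features=6, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Soft voting: average the member models' predicted malignancy probabilities.
models = [LogisticRegression(max_iter=1000), DecisionTreeClassifier(max_depth=4)]
probs = np.mean([m.fit(Xtr, ytr).predict_proba(Xte)[:, 1] for m in models], axis=0)

# Choose a threshold that keeps ~98% of malignant cases above it, mirroring
# the 98% sensitivity / 26% specificity operating point discussed in the text.
thr = np.quantile(probs[yte == 1], 0.02)
pred = probs >= thr
sensitivity = pred[yte == 1].mean()      # fraction of malignant cases flagged
specificity = (~pred)[yte == 0].mean()   # fraction of benign cases spared biopsy
```

Each benign case below the threshold corresponds to a biopsy potentially avoided, which is exactly the clinical goal stated at the top of the abstract.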
Feature Augmentation for Improved Topic Modeling of Youtube Lecture Videos using Latent Dirichlet Allocation
The application of topic models to text mining of educational data, and more specifically to the text data obtained from lecture videos, is an area of research which is largely unexplored yet holds great potential. This work seeks to find empirical evidence for an improvement in topic modeling by pre-extracting bigram tokens and adding them as additional features in the Latent Dirichlet Allocation (LDA) algorithm, a widely recognized topic modeling technique. The dataset considered for analysis is a collection of transcripts of video lectures on Machine Learning scraped from YouTube. Using the cosine similarity distance measure as a metric, the experiment showed a statistically significant improvement in topic model performance against the baseline topic model which did not use extra features, thus confirming the hypothesis. By introducing explainable features before modeling and using deep learning based text representation only at the post-modeling evaluation stage, the overall model interpretability is retained. This empowers educators and researchers alike to not only benefit from the LDA model in their own fields but also to play a substantial role in efforts to improve model performance. It also sets the direction for future work which could use the feature augmented topic model as the input to other more common text mining tasks like document categorization and information retrieval.
A methodology for contextual recommendation using artificial neural networks
“A thesis submitted to the University of Bedfordshire, in partial fulfilment of the requirements for the degree of Doctor of Philosophy”.
Recommender systems are an advanced form of software applications, more specifically
decision-support systems, that efficiently assist the users in finding items of their interest.
Recommender systems have been applied to many domains from music to e-commerce,
movies to software services delivery and tourism to news by exploiting available information
to predict and provide recommendations to the end user. The suggestions generated by recommender
systems tend to narrow down the list of items which a user may overlook due to the
huge variety of similar items or users’ lack of experience in the particular domain of interest.
While the performance of traditional recommender systems, which rely on relatively simpler
information such as content and users’ filters, is widely accepted, their predictive capability
degrades when the local context of the user and situated actions play a significant role in the
final decision. Therefore, the acceptance and incorporation of the user’s context as a significant
feature, and the development of recommender systems utilising this premise, have become an active
area of research requiring further investigation of the underlying algorithms and methodology.
This thesis focuses on categorisation of contextual and non-contextual features within
the domain of context-aware recommender system and their respective evaluation. Further,
application of the Multilayer Perceptron Model (MLP) for generating predictions and ratings
from the contextual and non-contextual features for contextual recommendations is presented
with support from relevant literature and empirical evaluation. An evaluation of specifically
employing artificial neural networks (ANNs) in the proposed methodology is also presented.
The work emphasizes both algorithms and methodology with three points of consideration:
contextual features and ratings of particular items/movies are exploited in several representations
to improve the accuracy of recommendation process using artificial neural networks
(ANNs), context features are combined with user-features to further improve the accuracy of
a context-aware recommender system; and lastly, a combination of the item/movie features
is investigated within the recommendation process. The proposed approach is evaluated on
the LDOS-CoMoDa dataset and the results are compared with state-of-the-art approaches
from relevant published literature.
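The core modelling step described above, feeding concatenated user, item, and context features into a multilayer perceptron to predict ratings, can be sketched as follows. The feature encoding and data are hypothetical stand-ins (the LDOS-CoMoDa dataset is not reproduced here), and the MLP configuration is an illustrative choice, not the thesis's exact architecture.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n = 400

# Hypothetical encodings: user features, item/movie features, and context
# features (e.g. time, mood, companion in LDOS-CoMoDa) as numeric vectors.
user = rng.normal(size=(n, 3))
item = rng.normal(size=(n, 3))
context = rng.normal(size=(n, 2))

# Concatenate all three feature groups into one input vector per interaction.
X = np.hstack([user, item, context])
# Synthetic ratings on a roughly 1-5 scale, linearly related to the features.
ratings = 3 + 0.3 * (X @ rng.normal(size=8)) + rng.normal(scale=0.1, size=n)

mlp = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
mlp.fit(X, ratings)
pred = mlp.predict(X)  # predicted context-aware ratings
```

Dropping the `context` columns from `X` yields the non-contextual baseline, so the two variants can be compared directly, mirroring the contextual vs. non-contextual evaluation the abstract describes.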