Diagnostic Prediction Using Discomfort Drawings with IBTM
In this paper, we explore the possibility of applying machine learning to make
diagnostic predictions using discomfort drawings. A discomfort drawing is an
intuitive way for patients to express discomfort and pain related symptoms.
These drawings have proven to be an effective method to collect patient data
and make diagnostic decisions in real-life practice. A dataset from real-world
patient cases is collected for which medical experts provide diagnostic labels.
Next, we use a factorized multimodal topic model, Inter-Battery Topic Model
(IBTM), to train a system that can make diagnostic predictions given an unseen
discomfort drawing. The number of output diagnostic labels is determined by
using mean-shift clustering on the discomfort drawing. Experimental results
show reasonable predictions of diagnostic labels given an unseen discomfort
drawing. Additionally, we generate synthetic discomfort drawings with IBTM
given a diagnostic label, which results in typical cases of symptoms. The
positive result indicates a significant potential of machine learning to be
used for parts of the pain diagnostic process and to be a decision support
system for physicians and other health care personnel.
Comment: Presented at 2016 Machine Learning and Healthcare Conference (MLHC
2016), Los Angeles, C
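The abstract above says that the number of output labels is determined by mean-shift clustering on the drawing, but it does not say how the drawing is encoded or which kernel and bandwidth are used. The following is only a minimal sketch of the general idea, assuming a drawing can be reduced to a list of marked 2D points and using a flat kernel. The function name `mean_shift_modes`, the coordinates, and the bandwidth are all hypothetical and are not taken from the paper.

```python
import math

def mean_shift_modes(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift: return the distinct modes the points converge to."""
    modes = []
    for x, y in points:
        for _ in range(max_iter):
            # Points within one bandwidth of the current estimate.
            near = [(px, py) for px, py in points
                    if math.hypot(px - x, py - y) <= bandwidth]
            if not near:  # defensive guard against floating-point edge cases
                break
            nx = sum(p[0] for p in near) / len(near)
            ny = sum(p[1] for p in near) / len(near)
            shift = math.hypot(nx - x, ny - y)
            x, y = nx, ny
            if shift < tol:
                break
        # Keep the converged point only if it is not close to a known mode.
        if not any(math.hypot(mx - x, my - y) < bandwidth / 2 for mx, my in modes):
            modes.append((x, y))
    return modes

# Hypothetical marked points forming two separate discomfort regions.
drawing_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
                  (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
modes = mean_shift_modes(drawing_points, bandwidth=1.0)
num_labels = len(modes)  # number of modes, used here as the number of labels
```

In this sketch the number of modes stands in for the number of diagnostic labels to output. Mean shift needs no preset cluster count, which is presumably why it suits this step, though the bandwidth still has to be chosen.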
Empirical prior latent Dirichlet allocation model
In this study, an empirical prior latent Dirichlet allocation (epLDA) model is presented that uses a latent semantic indexing framework to derive, from data, the priors required for topic computation. The parameters of the priors so obtained are related to the parameters of the conventional LDA model through an exponential function. The model was implemented and tested on benchmark data, where it achieves a prediction accuracy of 92.15%. The epLDA model consistently outperforms the conventional LDA model on different datasets, by an average of 6.33% in accuracy; this demonstrates the advantage of using side information obtained from data for the computation of the mixture components.
Keywords: latent Dirichlet allocation; semantic indexing; empirical prior; hidden structures; Prediction accurac
An analysis of the abstracts presented at the annual meetings of the Society for Neuroscience from 2001 to 2006
Annual meeting abstracts published by scientific societies often contain rich arrays of information that can be computationally mined and distilled to elucidate the state and dynamics of the subject field. We extracted and processed abstract data from the Society for Neuroscience (SFN) annual meeting abstracts during the period 2001-2006 in order to gain an objective view of contemporary neuroscience. An important first step in the process was the application of data cleaning and disambiguation methods to construct a unified database, since the data were too noisy to be of full utility in the raw form initially available. Using natural language processing, text mining, and other data analysis techniques, we then examined the demographics and structure of the scientific collaboration network, the dynamics of the field over time, major research trends, and the structure of the sources of research funding. Some interesting findings include a high geographical concentration of neuroscience research in the northeastern United States, a surprisingly large transient population (66% of the authors appear in only one of the six studied years), the central role played by the study of neurodegenerative disorders in the neuroscience community, and an apparent growth of behavioral/systems neuroscience with a corresponding shrinkage of cellular/molecular neuroscience over the six-year period. The results from this work will prove useful for scientists, policy makers, and funding agencies seeking to gain a complete and unbiased picture of the community structure and body of knowledge encapsulated by a specific scientific domain.
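The transient-population figure in the abstract above (the share of authors who appear in only one studied year) is a simple statistic once author names have been disambiguated. Here is a minimal sketch of that computation, assuming the cleaned data is available as (author, year) pairs. The function name `transient_fraction` and the sample records are hypothetical and do not come from the SFN dataset.

```python
from collections import defaultdict

def transient_fraction(records):
    """Fraction of authors who appear in exactly one distinct year."""
    years_by_author = defaultdict(set)
    for author, year in records:
        years_by_author[author].add(year)
    single_year = sum(1 for years in years_by_author.values() if len(years) == 1)
    return single_year / len(years_by_author)

# Hypothetical, already-disambiguated (author, year) pairs.
sample_records = [
    ("A", 2001), ("A", 2002),   # appears in two years
    ("B", 2003),                # one year
    ("C", 2004), ("C", 2004),   # two abstracts, same year: still one year
    ("D", 2001), ("D", 2006),   # appears in two years
]
fraction = transient_fraction(sample_records)
```

Years are stored in a set so that several abstracts by the same author in one year count as a single appearance.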