On Knowledge Discovery Experimented with Otoneurological Data
Diagnosis of otoneurological diseases can be challenging due to similar and
overlapping symptoms that can also vary over time. Thus, systems to support and
aid diagnosis of vertiginous patients are considered beneficial. This study continues
refinement of an otoneurological decision support system ONE and its knowledge
base. The aim of the study is to improve the classification accuracy of nine
otoneurological diseases in real world situations by applying machine learning
methods to knowledge discovery in the otoneurological domain.
The dissertation is divided into three phases: fitness value formation
for attribute values, attribute weighting and classification task redefinition. The first
phase concentrates on the knowledge update of the ONE with the domain experts
and on the knowledge discovery method that forms the fitness values for the values
of the attributes. The knowledge base of the ONE needed an update due to changes
made to the data collection questionnaire. The effect of machine-learnt fitness values on
classification is examined, and classification results are compared to the knowledge
set by the experts and their combinations. The classification performance of the nearest
pattern method of the ONE is compared to the k-nearest neighbour method (k-NN)
and Naïve Bayes (NB). The second phase concentrates on the attribute weighting.
Scatter method and instance-based learning algorithms IB4 and IB1w are applied in
the attribute weighting. These machine learnt attribute weights in addition to the
weights defined by the domain experts and equal weighting are tested with the
classification method of the ONE and attribute weighted k-NN with One-vs-All
classifiers (wk-NN OVA). A genetic algorithm (GA) approach is also examined in the
attribute weighting. The machine learnt weight sets are utilized as a starting point
with the GA. Populations (the weight sets) are evaluated with the classification
method of the ONE, the wk-NN OVA and attribute weighted k-NN using
neighbour's class-based attribute weighting (cwk-NN). In the third phase, the effect
of the classification task redefinition is examined. The multi-class classification task
is separated into several binary classification tasks. The binary classification is studied
without attribute weighting with the k-NN and support vector machines (SVM)
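As a rough illustration of the attribute weighting and One-vs-All ideas named in the abstract above, the following is a minimal sketch, not the ONE's nearest pattern method and not the dissertation's implementation. It assumes a weighted Euclidean distance and uses hypothetical toy data; the function names, the per-class weight dictionary, and all values are assumptions made for illustration only.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Euclidean distance in which each attribute is scaled by its weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def knn_predict(train, labels, query, weights, k=3):
    """Majority vote among the k training cases closest to the query."""
    ranked = sorted(range(len(train)),
                    key=lambda i: weighted_distance(train[i], query, weights))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def one_vs_all_predict(train, labels, query, class_weights, k=3):
    """Run one binary k-NN per class (class vs. rest), each with its own
    attribute weights, and return the class with the highest share of
    positive neighbours together with all per-class scores."""
    scores = {}
    for target, weights in class_weights.items():
        is_target = [label == target for label in labels]
        ranked = sorted(range(len(train)),
                        key=lambda i: weighted_distance(train[i], query, weights))
        scores[target] = sum(is_target[i] for i in ranked[:k]) / k
    return max(scores, key=scores.get), scores


# Hypothetical toy data: attribute 0 separates the classes, attribute 1 is noise
# on a larger scale.
train = [(0.0, 10.0), (0.1, 0.0), (1.0, 9.0), (0.9, 1.0)]
labels = ["A", "A", "B", "B"]
query = (0.05, 9.0)

equal = knn_predict(train, labels, query, weights=(1.0, 1.0), k=1)
weighted = knn_predict(train, labels, query, weights=(1.0, 0.0), k=1)
ova_label, ova_scores = one_vs_all_predict(
    train, labels, query,
    class_weights={"A": (1.0, 0.0), "B": (1.0, 0.0)}, k=3)
print(equal, weighted, ova_label, ova_scores)
```

With equal weights the noisy attribute dominates the distance; setting its weight to zero lets the informative attribute decide. In practice the weights would come from experts or from a learning procedure rather than being set by hand.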
Machine Learning Techniques for Differential Diagnosis of Vertigo and Dizziness: A Review.
Vertigo is a sensation of movement that results from disorders of the inner ear balance organs and their central connections, with aetiologies that are often benign and sometimes serious. An individual who develops vertigo can be effectively treated only after a correct diagnosis of the underlying vestibular disorder is reached. Recent advances in artificial intelligence promise novel strategies for the diagnosis and treatment of patients with this common symptom. Human analysts may experience difficulties manually extracting patterns from large clinical datasets. Machine learning techniques can be used to visualize, understand, and classify clinical data to create a computerized, faster, and more accurate evaluation of vertiginous disorders. Practitioners can also use them as a teaching tool to gain knowledge and valuable insights from medical data. This paper provides a review of the literature from 1999 to 2021 using various feature extraction and machine learning techniques to diagnose vertigo disorders. This paper aims to provide a better understanding of the work done thus far and to provide future directions for research into the use of machine learning in vertigo diagnosis
Development and validation of a classification algorithm to diagnose and differentiate spontaneous episodic vertigo syndromes: results from the DizzyReg patient registry
BACKGROUND Spontaneous episodic vertigo syndromes, namely vestibular migraine (VM) and Menière's disease (MD), are difficult to differentiate, even for an experienced clinician. In the presence of complex diagnostic information, automated systems can support human decision making. Recent developments in machine learning might facilitate bedside diagnosis of VM and MD.
METHODS The data for this study originate from the prospective patient registry of the German Centre for Vertigo and Balance Disorders, a specialized tertiary treatment center at the University Hospital Munich. The classification task was to differentiate cases of VM and MD from other vestibular disease entities. Deep Neural Networks (DNN) and Boosted Decision Trees (BDT) were used for classification.
RESULTS A total of 1357 patients were included (mean age 52.9, SD 15.9, 54.7% female), 9.9% with MD and 15.6% with VM. DNN models yielded an accuracy of 98.4 ± 0.5%, a precision of 96.3 ± 3.9%, and a sensitivity of 85.4 ± 3.9% for VM, and an accuracy of 98.0 ± 1.0%, a precision of 90.4 ± 6.2% and a sensitivity of 89.9 ± 4.6% for MD. BDT yielded an accuracy of 84.5 ± 0.5%, precision of 51.8 ± 6.1%, sensitivity of 16.9 ± 1.7% for VM, and an accuracy of 93.3 ± 0.7%, precision 76.0 ± 6.7%, sensitivity 41.7 ± 2.9% for MD.
CONCLUSION The correct diagnosis of spontaneous episodic vestibular syndromes is challenging in clinical practice. Modern machine learning methods might be the basis for developing systems that assist practitioners and clinicians in their daily treatment decisions
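The accuracy, precision, and sensitivity figures reported above all derive from the four cells of a binary confusion matrix. The sketch below shows the arithmetic under stated assumptions: the counts are hypothetical and are not taken from the DizzyReg registry. They are chosen only to illustrate how a rare target class can combine high accuracy with low sensitivity.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision and sensitivity from confusion-matrix counts.

    tp/fn: diseased cases classified correctly / missed
    fp/tn: non-diseased cases flagged wrongly / classified correctly
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical counts: 1000 patients, 156 of whom have the target diagnosis.
metrics = binary_metrics(tp=26, fp=24, fn=130, tn=820)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because most patients do not have the target diagnosis, the large number of true negatives keeps accuracy high even though most true cases are missed, which is why sensitivity and precision are reported alongside accuracy.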
Development and testing of a data-based diagnostic algorithm for vestibular disorders in general-practitioner-centred care
With a lifetime prevalence of about 30% and an increasing incidence with age, dizziness is one of the most common symptoms and represents a severe impairment of daily life for patients [1]. The often unclear symptoms and a lack of experience among general practitioners frequently lead to incorrect classification and thus to unsuccessful attempts at treatment [2]. The diagnosis is complicated by the fact that symptoms of different vertigo diseases often overlap [4]. Therefore, it is of interest to provide general practitioners with means to classify dizziness more reliably in a first step. However, the initial approaches to classifying vertigo disorders with the simple models available at the beginning were not promising enough to be pursued further [9].
The question of this work was therefore to investigate whether these methods in their more developed forms can differentiate vestibular disorders based on symptom-oriented patient characteristics. It was also important to determine which vestibular symptoms might be particularly relevant to predictive accuracy.
In both studies, data from the DizzyReg patient database of the German Centre for Vertigo and Balance Disorders (Deutsches Schwindel- und Gleichgewichtszentrum) were used [14]. From the multitude of machine learning models [13, 18], Deep Neural Networks [13] and Boosted Decision Trees [29] were chosen for differentiating Menière's disease and vestibular migraine because of their performance in other medical applications [9]. For classification with the greatest possible transparency and for determining the relevance of variables, we selected Classification and Regression Trees (CART) and Random Forests. CART offer the advantage of providing a visual representation that mimics human decision making [25].
Averaged over all five imputed datasets, a DNN for Menière's disease yielded an F-measure of 55.5% (accuracy 91.4%). In comparison, a DNN for vestibular migraine achieved an F-measure of 36.8% (accuracy 81.8%). Boosted Decision Trees trained on vestibular migraine yielded an F-measure of only 27.6% and an accuracy of 84.5%. Eight variables were determined to be relevant from the experiments with CART. The overall accuracy when classifying all seven diagnoses was 42.2%.
DNN achieved very good classification results but did not provide the required transparency due to the inherent structure of the model. The algorithms used in the simultaneous classification of the most common diagnoses are based on a transparent approach. They can be used for initial triage of patients but must be followed by clinical examination of vestibular and ocular functions to improve the accuracy of the diagnosis. More research is needed to make a final assessment
Computational models and approaches for lung cancer diagnosis
The success of treatment of patients with cancer depends on establishing an accurate diagnosis. To this end, the aim of this study is to develop novel lung cancer diagnostic models. New algorithms are proposed to analyse the biological data and extract knowledge that assists in achieving accurate diagnosis results
Development and benchmarking a novel scatter search algorithm for learning probabilistic graphical models in healthcare
Small healthcare datasets are widespread, and building accurate inference
models from them is difficult. Many machine learning algorithms exist, but many are black boxes. Explainable models in healthcare are essential, so that healthcare practitioners can understand the developed model and incorporate domain knowledge into it. Probabilistic graphical models offer a visual way to represent relationships between data. Here we develop a new scatter search algorithm to learn Bayesian networks. This machine learning approach is applied to three case studies to assess its effectiveness in comparison with traditional machine learning techniques.
First, a new scatter search approach is presented to construct the structure of a Bayesian
network. Statistical tests are used to build small directed acyclic graphs combined in an
iterative process to build up multiple larger graphs. Probability distributions are fitted as the
graphs are built up. These graphs are then scored based on classification performance. Once no new solutions can be found, the algorithm finishes.
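The step of combining small directed acyclic graphs into larger ones can be sketched as follows. This is only an illustration of one way to merge edge sets while keeping the result acyclic; it is not the thesis's scatter search algorithm. The statistical tests, the fitting of probability distributions, and the scoring by classification performance are all omitted, and the function names and node labels are hypothetical.

```python
def has_path(edges, start, goal):
    """Depth-first search: is there a directed path from start to goal?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(child for parent, child in edges if parent == node)
    return False


def combine_dags(base, extra):
    """Add the edges of `extra` to `base` one at a time, skipping any edge
    that would close a directed cycle, so the result stays a DAG."""
    combined = set(base)
    for parent, child in sorted(extra):
        # Adding parent -> child creates a cycle if child already reaches parent.
        if parent == child or has_path(combined, child, parent):
            continue
        combined.add((parent, child))
    return combined


# Hypothetical small graphs over nodes A-D, given as (parent, child) edges.
small_graph_1 = {("A", "B"), ("B", "C")}
small_graph_2 = {("C", "A"), ("C", "D")}
larger_graph = combine_dags(small_graph_1, small_graph_2)
print(sorted(larger_graph))
```

Here the edge C to A is rejected because A already reaches C, while C to D is accepted. In a full search, each candidate graph built this way would then be scored and kept or discarded.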
The first study looks at the effectiveness of the scatter search constructed Bayesian network against other machine learning algorithms in the same class. These algorithms are
benchmarked against standard datasets from the UCI Machine Learning Repository, which has many published studies.
The second study assesses the effectiveness of the scatter search Bayesian network for
classifying ovarian cancer patients. Multiple other machine learning algorithms were applied
alongside the Bayesian network. All data from this study were collected by clinicians from
the Aneurin Bevan University Health Board. The study concluded that machine-learning
techniques could be applied to classify patients based on early indicators.
The third and final study looked into applying machine learning techniques to no-show breast cancer follow-up patients. Once again, the scatter search Bayesian network was used alongside other machine learning approaches. Socio-demographic and socio-economic factors involving low to middle-income families were used in this study with feature selection techniques to improve machine learning performance. It was found that machine learning, when
used with feature selection, could classify no-show patients with reasonable accuracy
Stratification of patient subgroups using high-dimensional and time-series observations
Precision medicine and patient stratification are expanding as a result of
innovations in high-throughput technologies applied to clinical medicine.
Stratification can explain differences in disease trajectories and outcomes in
heterogeneous cohorts. Thus, approaches employed for patient treatment can
be tailored by taking into account individual variabilities and specificities.
This thesis focuses on clustering approaches and how they can be applied to
both single time points and time-series high-dimensional data for the
identification of disease subtypes defined by distinct mechanisms, also called
endotypes, in complex and/or heterogeneous diseases. Multiple carefully
selected clustering strategies were compared to highlight which would produce
the most relevant stratification in terms of mathematical robustness and
biological meaning, both of which were quantified using standardised methods.
More specifically, this strategy was applied to time-series multi-omics data
from a cohort of patients with acute pancreatitis, an inflammatory disease of
the pancreas. Using this high-dimensional multi-omics data as well as routine
lab and clinical measurements, the cohort was stratified into four subgroups.
Findings from the analysis of acute pancreatitis data showed that two of the
four subgroups could be detected in another syndrome, acute respiratory
distress syndrome, suggesting that inflammatory signatures are comparable
between diseases.
With the aim of applying these principles to other diseases and using
preliminary results from other studies suggesting that relevant subgroups
might be highlighted, data from inflammatory bowel disease and Parkinson's
disease cohorts were analysed. Results from our analyses confirmed that
disease knowledge could be gained using this approach.
Work from this thesis provides novel approaches for the application and
evaluation of stratification methods. Furthermore, results may constitute a
basis for the development of tailored treatment approaches for acute
pancreatitis, acute respiratory distress syndrome, inflammatory bowel disease
and Parkinson's disease. Also, the observation of commonalities between
distinct inflammatory diseases will broaden the perspectives when analysing
disease data and more specifically, in biomarker discovery and drug
development processes