
    Bioinformatics Applications Based On Machine Learning

    The great advances in information technology (IT) have implications for many sectors, such as bioinformatics, and have considerably increased their possibilities. This book presents a collection of 11 original research papers, all related to the application of IT-related techniques within the bioinformatics sector: from new applications created by adapting and applying existing techniques to new methodologies created to solve existing problems.

    Explainable clinical decision support system: opening black-box meta-learner algorithm expert's based

    Mathematical optimization methods are the basic mathematical tools of artificial intelligence theory. In machine learning and deep learning, the examples from which algorithms learn (the training data) are processed through sophisticated cost functions, which may admit closed-form solutions or require approximations. The interpretability of the models used, and their relative transparency as opposed to the opacity of black boxes, is related to how the algorithm learns, which occurs through the optimization and minimization of the errors the machine makes in the learning process. In particular, the present work introduces a new method for determining the weights in an ensemble model, supervised or unsupervised, based on the well-known Analytic Hierarchy Process (AHP). The method rests on the idea that behind the choice among the different candidate algorithms for a machine learning problem there is an expert who controls the decision-making process. The expert assigns a complexity score to each algorithm (based on the complexity-interpretability trade-off), from which the weight with which each model contributes to the training and prediction phases is determined. In addition, different methods are presented to evaluate the performance of these algorithms and to explain how each feature in the model contributes to the prediction of the outputs. The interpretability techniques used in machine learning are also combined with the AHP-based method in the context of clinical decision support systems, in order to make the (black-box) algorithms and their results interpretable and explainable. Clinical decision-makers can then take controlled decisions, in line with the "right to explanation" introduced by the legislator, since they bear civil and legal responsibility for choices made in the clinical field using systems based on artificial intelligence.
    Equally central is the interaction between the expert who controls the algorithm construction process and the domain expert, in this case the clinician. Three applications on real data are implemented, both with methods known in the literature and with those proposed in this work: one concerns cervical cancer, another diabetes, and the last a specific pathology developed by HIV-infected individuals. All applications are supported by plots, tables and explanations of the results, implemented through Python libraries. The main case study of this thesis, regarding HIV-infected individuals, concerns an unsupervised ensemble problem: a series of clustering algorithms is run on a set of features, and their outputs are used in turn as meta-features to provide a label for each cluster. The meta-features and labels obtained by choosing the best algorithm are used to train a logistic regression meta-learner, which is then analysed with explainability methods to quantify the contribution each algorithm made in the training phase. Logistic regression is chosen as the meta-learner classifier because it provides appreciable results and because its estimated coefficients are easy to explain.
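The AHP step described in this abstract can be sketched concretely: an expert's pairwise judgements about candidate algorithms are collected in a Saaty-style reciprocal comparison matrix, the ensemble weights are taken from its principal eigenvector, and a consistency ratio checks that the judgements are coherent. The matrix entries, the three-algorithm setup and the random-index excerpt below are illustrative assumptions, not values from the thesis.

```python
import numpy as np

# Hypothetical Saaty-style pairwise comparison matrix for three candidate
# algorithms: entry [i, j] says how strongly algorithm i is preferred to j
# (reciprocal below the diagonal). Values are invented for illustration.
A = np.array([
    [1.0, 3.0, 5.0],
    [1/3, 1.0, 3.0],
    [1/5, 1/3, 1.0],
])

def ahp_weights(pairwise):
    """Principal-eigenvector weights of a reciprocal comparison matrix."""
    vals, vecs = np.linalg.eig(pairwise)
    k = np.argmax(vals.real)              # Perron (largest) eigenvalue
    w = np.abs(vecs[:, k].real)
    return w / w.sum()                    # normalise so weights sum to 1

def consistency_ratio(pairwise):
    """Saaty consistency ratio; below 0.1 is conventionally acceptable."""
    n = pairwise.shape[0]
    lam = np.max(np.linalg.eigvals(pairwise).real)
    ci = (lam - n) / (n - 1)              # consistency index
    ri = {3: 0.58, 4: 0.90, 5: 1.12}[n]   # random-index table (excerpt)
    return ci / ri

w = ahp_weights(A)
print(w, consistency_ratio(A))
```

The first algorithm, judged least complex, receives the largest ensemble weight; checking the consistency ratio guards against incoherent expert judgements before the weights enter the training and prediction phases.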

    Statistical methods to evaluate disease outcome diagnostic accuracy of multiple biomarkers with application to HIV and TB research.

    Doctor of Philosophy in Statistics. University of KwaZulu-Natal, Pietermaritzburg, 2015. One challenge in clinical medicine is the correct diagnosis of disease. Medical researchers invest considerable time and effort in improving the accuracy of disease diagnosis, and, following from this, diagnostic tests are important components of modern medical practice. The receiver operating characteristic (ROC) curve is a statistical tool commonly used to describe the discriminatory accuracy and performance of a diagnostic test; a popular summary index of discriminatory accuracy is the area under the ROC curve (AUC). In medical research, scientists simultaneously evaluate hundreds of biomarkers, and a critical challenge is combining biomarkers into models that give insight into disease. In infectious disease, biomarkers are often evaluated in the host as well as in the microorganism or virus causing the infection, adding further complexity to the analysis. Beyond providing an improved understanding of factors associated with infection and disease development, combinations of relevant markers are important to the diagnosis and treatment of disease. Taken together, this extends the role of the statistical analyst and presents many novel and major challenges. This thesis discusses strategies and issues in using statistical data analysis to address the diagnosis problem of selecting and combining multiple markers to estimate the predictive accuracy of test results. We also consider different methodologies to address missing data and to improve predictive accuracy in the presence of incomplete data. The thesis is divided into five parts. The first part introduces the theory behind the methods used in this work. The second part places emphasis on so-called classic ROC analysis, which is applied to cross-sectional data.
    The main aim of this chapter is to address the problem of how to select and combine multiple markers and to evaluate the appropriateness of certain techniques used in estimating the area under the ROC curve (AUC). Logistic regression models offer a simple method for combining markers. We applied resampling methods to adjust for the over-fitting associated with model selection, and simulated several multivariate models to evaluate the performance of the resampling approaches in this setting. We applied these methods to data collected from a study of tuberculosis immune reconstitution inflammatory syndrome (TB-IRIS) in Cape Town, South Africa. Baseline levels of five biomarkers were evaluated, and we used this dataset to assess whether a combination of these biomarkers could accurately discriminate between TB-IRIS and non-TB-IRIS patients, applying AUC analysis and resampling methods. The third part is concerned with time-dependent ROC analysis with an event-time outcome and a comparative analysis of techniques applied to incomplete covariates. Three methods are assessed and investigated, namely mean imputation, nearest-neighbour hot-deck imputation and multivariate imputation by chained equations (MICE). These methods were used together with the bootstrap and cross-validation to estimate the time-dependent AUC using a non-parametric approach and a Cox model. We simulated several models to evaluate the performance of the resampling approaches and imputation methods, and applied the above methods to a real data set. The fourth part applies more advanced variable selection methods to predict patient survival using time-dependent ROC analysis. The least absolute shrinkage and selection operator (LASSO) Cox model is applied to estimate the bootstrap cross-validated, .632 and .632+ bootstrap AUCs for a TBM/HIV data set from KwaZulu-Natal in South Africa.
    We also suggest the use of ridge-Cox regression to estimate the AUC and two-level bootstrapping to estimate the variances of the AUC, and we evaluate these suggested methods. The last part of the research is an application study using genetic HIV data from rural KwaZulu-Natal to evaluate sequence ambiguities as a biomarker to predict recent infection in HIV patients.
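The core recipe of the second part (combine markers with logistic regression, then correct the apparent AUC for over-fitting by resampling) can be sketched in a few lines. The data below are simulated, not the TB-IRIS measurements, and the plain gradient-descent fit and Harrell-style bootstrap optimism correction are minimal stand-ins for the thesis's machinery.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: two biomarkers, both modestly elevated in cases
# (illustrative effect sizes, not from the thesis data).
n = 200
y = rng.integers(0, 2, n)
X = rng.normal(0, 1, (n, 2)) + y[:, None] * np.array([0.8, 0.5])

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney U statistic."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Plain gradient-descent logistic regression with an intercept."""
    Xb = np.hstack([np.ones((len(X), 1)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

w = fit_logistic(X, y)
score = np.hstack([np.ones((n, 1)), X]) @ w      # combined risk score
apparent = auc(score, y)

# Bootstrap optimism correction: the apparent AUC is biased upward because
# the combination rule is evaluated on the data it was fitted to.
optimism, B = 0.0, 50
for _ in range(B):
    idx = rng.integers(0, n, n)                  # bootstrap resample
    wb = fit_logistic(X[idx], y[idx])
    s_orig = np.hstack([np.ones((n, 1)), X]) @ wb
    s_boot = np.hstack([np.ones((n, 1)), X[idx]]) @ wb
    optimism += auc(s_boot, y[idx]) - auc(s_orig, y)
corrected = apparent - optimism / B
print(round(apparent, 3), round(corrected, 3))
```

The corrected estimate is the apparent AUC minus the average optimism across bootstrap refits, which is the standard way resampling adjusts for model-selection over-fitting.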

    Emergency Department Management: Data Analytics for Improving Productivity and Patient Experience

    The onset of big data, typically defined by its volume, velocity, and variety, is transforming the healthcare industry. This research utilizes data from over 23 million emergency department (ED) visits between January 2010 and December 2017, handled by physicians and advanced practice providers from a large national emergency physician group. This group has provided ED services to health systems for several years, and each essay addresses operational challenges faced by its management team. The first essay focuses on physician performance: we ask how to evaluate performance across multiple sites and work to understand the relationships between patient flow, patient complexity, and patient experience. Specifically, an evaluation system to assess physician performance across multiple facilities is proposed, the relationship between productivity and patient experience scores is explored, and the drivers of patient flow and complexity are simultaneously identified. The second essay explores the relationship between physician performance and malpractice claims, investigating whether physicians' practice patterns change after they are named in a malpractice lawsuit. Overall, the results indicate that the likelihood of being named in a malpractice claim is largely a function of how long a physician has practiced. Furthermore, physician practice patterns remain consistent after a physician is sued, but patient experience scores increase among sued physicians after the lawsuit is filed. Such insights help management address the issue of medical malpractice claims. The final essay takes a closer look at the relationship between advanced practice providers (APPs) and physicians: can EDs better utilize APPs to reduce waiting times and improve patient flow?
    A systematic data-driven approach incorporating descriptive, predictive, and prescriptive analyses is employed to provide recommendations for ED provider staffing practices.
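The prescriptive piece of a staffing analysis like this is often grounded in queueing approximations. The sketch below uses a textbook M/M/c Erlang-C model to find the smallest provider count whose mean queueing delay meets a target; the arrival rate, service rate and wait target are invented for illustration and are not drawn from the ED data set.

```python
import math

def erlang_c_wait_prob(arrival_rate, service_rate, c):
    """Erlang-C probability that an arriving patient has to wait (M/M/c)."""
    a = arrival_rate / service_rate            # offered load in Erlangs
    rho = a / c
    if rho >= 1:
        return 1.0                             # unstable queue: everyone waits
    idle_terms = sum(a**k / math.factorial(k) for k in range(c))
    wait_term = a**c / (math.factorial(c) * (1 - rho))
    return wait_term / (idle_terms + wait_term)

def min_providers(arrival_rate, service_rate, target_wait):
    """Smallest staffing level c whose mean queueing delay is <= target_wait."""
    c = int(arrival_rate / service_rate) + 1   # start just above the load
    while True:
        p_wait = erlang_c_wait_prob(arrival_rate, service_rate, c)
        mean_wait = p_wait / (c * service_rate - arrival_rate)
        if mean_wait <= target_wait:
            return c, mean_wait
        c += 1

# Hypothetical shift: 10 arrivals/hour, each provider treats 2.5
# patients/hour, target mean delay of 15 minutes (0.25 hours).
c, w = min_providers(10.0, 2.5, 0.25)
print(c, round(w, 3))
```

In practice the descriptive and predictive stages would supply the arrival and service rates per hour of day; the prescriptive stage then repeats a calculation like this per shift block.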

    Transformation of graphical models to support knowledge transfer

    Human experts are able to flexibly adjust their decision behaviour to the situation at hand. This capability pays off particularly when decisions must be made under limited resources such as time restrictions. In such situations it is advantageous to adapt the underlying knowledge representation and to use decision models at different levels of abstraction. Furthermore, human experts can incorporate not only uncertain information but also vague perceptions into decision making. Classical decision-theoretic models are based on the concept of rationality, whereby decision behaviour is prescribed by the principle of maximum expected utility: for each observation, an optimal decision function prescribes the action that maximizes expected utility. Modern graph-based methods such as Bayesian networks and influence diagrams make decision-theoretic methods attractive from a modelling perspective. Their main disadvantage is complexity: finding an optimal decision can be very expensive, and inference in decision networks is NP-hard. This dissertation aims to combine the advantages of decision-theoretic models with rule-based systems by transforming a decision-theoretic model into a fuzzy rule-based system. Fuzzy rule bases can be evaluated efficiently, can approximate non-linear functional dependencies and guarantee the interpretability of the resulting behaviour model. A new transformation process generates rule-based representations from decision models, providing an efficient implementation architecture while representing knowledge explicitly and intelligibly. 
    First, an agent can apply a newly introduced parameterized structure learning algorithm to identify the structure of a Bayesian network. Preference learning and the specification of probability information then allow decision and utility nodes to be modelled, yielding a consolidated decision-theoretic model. A transformation algorithm finally compiles a rule base from this model, with an approximation measure quantifying the expected utility loss as a quality criterion. The practicality of the concept is demonstrated on a condition-monitoring example for a rotation spindle.
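The compile-and-measure idea (approximate a decision model with a small fuzzy rule base, then score the approximation by expected utility loss) can be illustrated with a toy one-dimensional policy. The optimal policy, the triangular fuzzy sets and the quadratic utility below are assumptions for illustration, not the dissertation's actual models.

```python
import numpy as np

# Stand-in for a decision-theoretic model: given observation x in [0, 1],
# the utility of action a is U(a | x) = -(a - opt(x))**2, so the optimal
# policy is opt itself (shape chosen arbitrarily for the sketch).
def opt(x):
    return 0.5 * np.sin(2.5 * x) + 0.5

def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    return np.maximum(np.minimum((x - a) / (b - a + 1e-12),
                                 (c - x) / (c - b + 1e-12)), 0.0)

# Compile three rules "IF x is LOW/MID/HIGH THEN act like the optimal policy
# at the fuzzy set's peak" -- a crude zero-order Takagi-Sugeno rule base.
peaks = np.array([0.0, 0.5, 1.0])
consequents = opt(peaks)

def fuzzy_policy(x):
    mu = np.stack([tri(x, -0.5, 0.0, 0.5),
                   tri(x, 0.0, 0.5, 1.0),
                   tri(x, 0.5, 1.0, 1.5)])
    return (mu * consequents[:, None]).sum(0) / mu.sum(0)  # weighted average

# Approximation quality as expected utility loss over the observation space:
# with quadratic utility, E[U(opt) - U(rules)] is the mean squared deviation.
xs = np.linspace(0, 1, 201)
loss = np.mean((fuzzy_policy(xs) - opt(xs)) ** 2)
print(round(loss, 4))
```

Adding more fuzzy sets shrinks the expected utility loss, which mirrors the dissertation's trade-off between rule-base size and approximation quality.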

    A Novel Ontology and Machine Learning Driven Hybrid Clinical Decision Support Framework for Cardiovascular Preventative Care

    Clinical risk assessment of chronic illnesses is a challenging and complex task which requires the use of standardised clinical practice guidelines and documentation procedures in order to ensure consistent and efficient patient care. Conventional cardiovascular decision support systems have significant limitations, including inflexibility in dealing with complex clinical processes, hard-wired rigid architectures based on branching logic, and an inability to handle legacy patient data without significant software engineering work. In light of these challenges, we propose a novel ontology- and machine learning-driven hybrid clinical decision support framework for cardiovascular preventative care. The ontology-inspired approach provides a foundation for information collection, knowledge acquisition and decision support capabilities, and aims to develop context-sensitive decision support solutions based on ontology engineering principles. The proposed framework incorporates an ontology-driven clinical risk assessment and recommendation system (ODCRARS) and a machine learning-driven prognostic system (MLDPS), integrated as a complete system to provide a cardiovascular preventative care solution. The framework has been developed under the close supervision of clinical domain experts from both UK and US hospitals and is capable of handling multiple cardiovascular diseases. It comprises two novel key components: (1) the ODCRARS and (2) the MLDPS. The ODCRARS is developed under the close supervision of consultant cardiologists Professor Calum MacRae from Harvard Medical School and Professor Stephen Leslie from Raigmore Hospital in Inverness, UK. It comprises several components, including: (a) ontology-driven, intelligent, context-aware information collection for conducting patient interviews, driven by a novel clinical questionnaire ontology.
    (b) A patient semantic profile, generated from patient medical records collated during patient interviews (conducted through the ontology-driven, context-aware adaptive information collection component). The semantic transformation of patients' medical data is carried out through a novel patient semantic profile ontology in order to give patient data an intrinsic meaning and to alleviate interoperability issues with third-party healthcare systems. (c) Ontology-driven clinical decision support, comprising a recommendation ontology and a NICE/expert-driven clinical rules engine. The recommendation ontology is developed using clinical rules provided by the consultant cardiologist from the US hospital, and utilises the patient semantic profile for lab test and medication recommendations. A clinical rules engine implements a cardiac risk assessment mechanism for various cardiovascular conditions and is also used to control the patient flow within the integrated cardiovascular preventative care solution. The machine learning-driven prognostic system is developed iteratively using state-of-the-art feature selection and machine learning techniques. A prognostic model development process is exploited for the development of the MLDPS, based on clinical case studies in the cardiovascular domain; an additional clinical case study in the breast cancer domain is carried out for development and validation purposes. The prognostic model development process is general enough to handle a variety of healthcare datasets, which will enable researchers to develop cost-effective and evidence-based clinical decision support systems. The proposed framework also provides a learning mechanism, realised through the exchange of patient data between the MLDPS and the ODCRARS.
    The machine learning-driven prognostic system is validated using Raigmore Hospital's RACPC, heart disease and breast cancer clinical case studies.
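A declarative clinical rules engine of the kind the ODCRARS describes can be sketched as data plus a tiny evaluator: each rule pairs a condition on the patient record with a recommendation, and the engine returns every recommendation whose condition fires. The rule names, record fields and thresholds below are invented placeholders, not NICE guidance or the framework's actual rules.

```python
# Each rule is a named condition over a patient record plus the advice to
# emit when the condition holds. All values here are illustrative only.
RULES = [
    {"name": "high_bp",
     "when": lambda p: p["systolic_bp"] >= 140,
     "advice": "recommend ambulatory blood-pressure monitoring"},
    {"name": "high_chol",
     "when": lambda p: p["total_cholesterol"] > 5.0,
     "advice": "recommend fasting lipid profile"},
    {"name": "smoker",
     "when": lambda p: p["smoker"],
     "advice": "refer to smoking-cessation service"},
]

def assess(patient):
    """Return the advice of every rule whose condition the record satisfies."""
    return [r["advice"] for r in RULES if r["when"](patient)]

patient = {"systolic_bp": 152, "total_cholesterol": 4.2, "smoker": True}
print(assess(patient))
```

Keeping the rules as data rather than branching code is what lets such an engine sit beside an ontology: rules can be generated from, or validated against, the recommendation ontology without rewriting the evaluator.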