Program
11th Annual Research and Engagement Day Program with descriptions and schedule of events
eXplainable Artificial Intelligence (XAI) in aging clock models
eXplainable Artificial Intelligence (XAI) is a rapidly progressing field of
machine learning that aims to unravel the predictions of complex models. XAI is
especially needed in sensitive applications, e.g. in health care, where
diagnoses, recommendations, and treatment choices may rely on decisions made by
artificial intelligence systems. AI approaches have also become widely used in
aging research, in particular for developing biological clock models and
identifying biomarkers of aging and age-related diseases. However, the
potential of XAI here has yet to be fully appreciated. We discuss the
application of XAI to the development of "aging clocks" and present a
comprehensive analysis of the literature, categorized by its focus on
particular physiological systems.
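As an illustration only (not a method from the abstract): a toy "aging clock" fitted on synthetic biomarker data can be probed with permutation feature importance, one of the simplest model-agnostic XAI techniques. All data and feature names here are made up.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 500
biomarker = rng.normal(size=n)   # a synthetic age-related marker
noise = rng.normal(size=n)       # an irrelevant measurement

# Chronological age driven by the biomarker plus measurement noise.
age = 50 + 10 * biomarker + rng.normal(scale=2, size=n)

X = np.column_stack([biomarker, noise])
clock = Ridge().fit(X, age)

# Permutation importance: how much does shuffling each feature
# degrade the clock's predictions (R^2 by default)?
result = permutation_importance(clock, X, age, n_repeats=10, random_state=0)
print(result.importances_mean)  # the informative biomarker dominates
```

The same recipe applies unchanged to nonlinear clock models, which is what makes permutation-based explanations a common starting point in this literature.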
A dynamic risk score for early prediction of cardiogenic shock using machine learning
Myocardial infarction and heart failure are major cardiovascular diseases
that affect millions of people in the US. The morbidity and mortality are
highest among patients who develop cardiogenic shock. Early recognition of
cardiogenic shock is critical. Prompt implementation of treatment measures can
prevent the deleterious spiral of ischemia, low blood pressure, and reduced
cardiac output due to cardiogenic shock. However, early identification of
cardiogenic shock has been challenging due to human providers' inability to
process the enormous amount of data in the cardiac intensive care unit (ICU)
and lack of an effective risk stratification tool. We developed a deep
learning-based risk stratification tool, called CShock, for patients admitted
into the cardiac ICU with acute decompensated heart failure and/or myocardial
infarction to predict the onset of cardiogenic shock. To develop and validate
CShock, we annotated cardiac ICU datasets with physician adjudicated outcomes.
CShock achieved an area under the receiver operating characteristic curve
(AUROC) of 0.820, which substantially outperformed CardShock (AUROC 0.519), a
well-established risk score for cardiogenic shock prognosis. CShock was
externally validated in an independent patient cohort and achieved an AUROC of
0.800, demonstrating its generalizability to other cardiac ICUs.
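Schematically, an AUROC comparison like the one reported above can be reproduced on synthetic risk scores; the numbers below are illustrative and unrelated to the actual CShock or CardShock models.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=1000)  # 1 = developed cardiogenic shock

# An informative score shifts with the outcome; a weak one barely does.
informative = labels + rng.normal(size=1000)
weak = 0.05 * labels + rng.normal(size=1000)

auc_strong = roc_auc_score(labels, informative)
auc_weak = roc_auc_score(labels, weak)
print(auc_strong, auc_weak)  # well above 0.5 vs. near 0.5
```

An AUROC of 0.5 corresponds to random ranking of patients, which is why the gap between 0.820 and 0.519 is meaningful.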
Gradient boosting in automatic machine learning: feature selection and hyperparameter optimization
The goal of automatic machine learning (AutoML) is to automate all aspects of model selection in (supervised) predictive modeling. This thesis deals with gradient boosting techniques in the context of AutoML, with a focus on gradient tree boosting and component-wise gradient boosting. Both techniques share a common methodology, but their goals are quite different: while gradient tree boosting is widely used in machine learning as a powerful prediction algorithm, the strength of component-wise gradient boosting lies in feature selection and the modeling of high-dimensional data. Extensions of component-wise gradient boosting to multidimensional prediction functions are considered as well. The challenge of hyperparameter optimization for these algorithms is discussed with a focus on Bayesian optimization and efficient early-stopping strategies. The difficulty of this optimization is shown by a large-scale random search over the hyperparameters of machine learning algorithms, whose results can form the foundation of new AutoML and meta-learning approaches. Furthermore, advanced feature selection strategies are summarized and a new method based on shadow features is introduced. Finally, an AutoML approach based on these results and best practices for feature selection and hyperparameter optimization is proposed, with the goal of simplifying and stabilizing AutoML while maintaining high prediction accuracy. It is compared to AutoML approaches that use much more complex search spaces and ensembling techniques.
Four software packages for the statistical programming language R have been newly developed or extended as part of this thesis: mlrMBO, a general framework for Bayesian optimization; autoxgboost, an automatic machine learning framework that heavily utilizes gradient tree boosting; compboost, a modular framework for component-wise boosting written in C++; and gamboostLSS, a framework for component-wise boosting of generalized additive models for location, scale and shape.
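The shadow-feature idea for feature selection can be sketched as follows: permuted copies of each feature serve as a noise baseline, and a real feature is kept only if its importance beats the strongest shadow. This is a simplified single-pass sketch on synthetic data, not the thesis's actual method.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 400
X = rng.normal(size=(n, 4))
# Only the first two columns carry signal; the last two are noise.
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(scale=0.5, size=n)

# Shadow features: independently permuted copies of each column,
# which by construction have no relation to y.
shadows = np.column_stack([rng.permutation(X[:, j]) for j in range(X.shape[1])])
X_aug = np.hstack([X, shadows])

model = GradientBoostingRegressor(random_state=0).fit(X_aug, y)
importances = model.feature_importances_
real_imp, shadow_imp = importances[:4], importances[4:]

# Keep only features that beat the strongest shadow.
selected = np.where(real_imp > shadow_imp.max())[0]
print(selected)
```

In practice such schemes are run over repeated permutations to stabilize the threshold; the single pass here is only meant to show the mechanism.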
If interpretability is the answer, what is the question?
Due to the ability to model even complex dependencies, machine learning (ML) can be used to tackle a broad range of (high-stakes) prediction problems. The complexity of the resulting models comes at the cost of transparency, meaning that it is difficult to understand the model by inspecting its parameters.
This opacity is considered problematic since it hampers the transfer of knowledge from the model, undermines the agency of individuals affected by algorithmic decisions, and makes it more challenging to expose non-robust or unethical behaviour.
To tackle the opacity of ML models, the field of interpretable machine learning (IML) has emerged. The field is motivated by the idea that if we could understand the model's behaviour -- either by making the model itself interpretable or by inspecting post-hoc explanations -- we could also expose unethical and non-robust behaviour, learn about the data generating process, and restore the agency of affected individuals. IML is not only a highly active area of research, but the developed techniques are also widely applied in both industry and the sciences.
Despite the popularity of IML, the field faces fundamental criticism, questioning whether IML actually helps in tackling the aforementioned problems of ML and even whether it should be a field of research in the first place:
First and foremost, IML is criticised for lacking a clear goal and, thus, a clear definition of what it means for a model to be interpretable. On a similar note, the meaning of existing methods is often unclear, and thus they may be misunderstood or even misused to hide unethical behaviour. Moreover, estimating conditional-sampling-based techniques poses a significant computational challenge.
With the contributions included in this thesis, we tackle these three challenges for IML.
We join a range of work by arguing that the field struggles to define and evaluate "interpretability" because incoherent interpretation goals are conflated. However, the different goals can be disentangled such that coherent requirements can inform the derivation of the respective target estimands. We demonstrate this with the examples of two interpretation contexts: recourse and scientific inference.
To tackle the misinterpretation of IML methods, we suggest deriving formal interpretation rules that link explanations to aspects of the model and data. In our work, we specifically focus on interpreting feature importance. Furthermore, we collect interpretation pitfalls and communicate them to a broader audience.
To efficiently estimate conditional-sampling-based interpretation techniques, we propose two methods that leverage the dependence structure in the data to simplify the estimation problems for Conditional Feature Importance (CFI) and SAGE.
A causal perspective proved to be vital in tackling these challenges: first, because IML problems such as algorithmic recourse are inherently causal; second, because causality helps to disentangle the different aspects of model and data and, therefore, to distinguish the insights that different methods provide; and third, because algorithms developed for causal structure learning can be leveraged for the efficient estimation of conditional-sampling-based IML methods.
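A simplified sketch of conditional-sampling-based importance, in the spirit of CFI but not the thesis's estimator: replace a feature with a draw from an approximate conditional distribution given the other features, and measure the increase in prediction loss. Here the conditional is approximated by a linear Gaussian model on synthetic data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + np.sqrt(1 - 0.9**2) * rng.normal(size=n)  # correlated with x1
y = x1 + rng.normal(scale=0.3, size=n)                    # only x1 affects y

X = np.column_stack([x1, x2])
model = LinearRegression().fit(X, y)
base_mse = np.mean((model.predict(X) - y) ** 2)

def cfi(j):
    # Approximate the conditional distribution of feature j given the
    # rest with a linear Gaussian model, then resample from it instead
    # of permuting marginally.
    others = np.delete(X, j, axis=1)
    cond = LinearRegression().fit(others, X[:, j])
    resid = X[:, j] - cond.predict(others)
    X_tilde = X.copy()
    X_tilde[:, j] = cond.predict(others) + rng.permutation(resid)
    return np.mean((model.predict(X_tilde) - y) ** 2) - base_mse

c0, c1 = cfi(0), cfi(1)
print(c0, c1)  # x1 matters even conditionally; x2 does not
```

Unlike marginal permutation, this keeps the perturbed feature consistent with its correlated neighbours, which is exactly what distinguishes conditional from marginal importance.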
Values of blockchain for risk-averse high-tech manufacturers under government’s carbon target environmental taxation policies
Today, high-tech industries such as consumer electronics commonly face government rules on carbon emissions. Among these rules, the carbon emission tax and the extended producer responsibility (EPR) tax are two important measures. Using blockchain, policy makers can better determine the carbon target environmental taxation (CTET) policy with accurate information. In this paper, based on the mean-variance framework, we study the values of blockchain for risk-averse high-tech manufacturers under the government's CTET policy. To be specific, the government first determines the optimal CTET policy; the high-tech manufacturer then reacts and determines its optimal production quantity. We analytically prove that the CTET policy relies simply on the setting of the optimal EPR tax. Then, in the absence of blockchain, we consider the case in which the government does not know the manufacturer's degree of risk aversion for sure, and derive the expected value of using blockchain for the high-tech manufacturers. We study when it is wise for the high-tech manufacturer and the government to implement blockchain. To check for robustness, we consider two extended models covering, respectively, the situations in which blockchain incurs non-trivial costs and in which an alternative risk measure is used. We analytically show that most of the qualitative findings remain valid.
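For illustration only, a minimal numeric sketch of a mean-variance objective for a risk-averse manufacturer, U(q) = E[profit] - k * Var[profit], maximized over the production quantity. All parameters (price, cost, tax, risk aversion, demand distribution) are invented and this is not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters: selling price, unit cost, EPR tax per unit.
price, unit_cost, epr_tax = 10.0, 4.0, 1.0
k = 0.01                                                 # degree of risk aversion
demand = rng.normal(loc=100.0, scale=15.0, size=10_000)  # random demand

def utility(q):
    # Mean-variance objective: U(q) = E[profit] - k * Var[profit].
    sales = np.minimum(q, demand)
    profit = price * sales - (unit_cost + epr_tax) * q
    return profit.mean() - k * profit.var()

quantities = np.arange(60, 141)
best_q = quantities[np.argmax([utility(q) for q in quantities])]
print(best_q)
```

Under these made-up numbers the risk-averse optimum falls below the risk-neutral newsvendor quantity (the median demand of about 100), since the variance penalty discourages overproduction.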
Open Source Business Models and Synthetic Biology
The software industry has successfully utilized open source business models, notably with software such as Android and Linux. Open source business models allow individuals to collaborate and share information without fear that the shared information will be commercially misused. Given the similarities between software source code and genetic sequences, innovators in the field of synthetic biology feel that open source business models can further innovation in synthetic biology in a similar manner. However, when determining whether to join an open source project, practitioners must first identify whether such a project will be beneficial to their goals. This Comment discusses the benefits and risks associated with open source business models as applied to synthetic biology, as well as possible solutions to some of the risks identified. It concludes with suggestions for solving some of the issues associated with open source business models, with the goal of furthering current open source initiatives.
End-to-End Machine Learning Frameworks for Medicine: Data Imputation, Model Interpretation and Synthetic Data Generation
Tremendous successes in machine learning have been achieved in a variety of applications, such as image classification and language translation, via supervised learning frameworks. Recently, with the rapid growth of electronic health records (EHR), machine learning researchers have gained immense opportunities to adopt these successful supervised learning frameworks in diverse clinical applications. To properly employ machine learning frameworks for medicine, we need to handle the special properties of EHR data and clinical applications: (1) extensive missing data, (2) model interpretation, and (3) privacy of the data. This dissertation addresses these properties to construct end-to-end machine learning frameworks for clinical decision support. We focus on the following three problems: (1) how to deal with incomplete data (data imputation), (2) how to explain the decisions of the trained model (model interpretation), and (3) how to generate synthetic data for safer sharing of private clinical data (synthetic data generation). To handle these problems appropriately, we propose novel machine learning algorithms for both static and longitudinal settings. For data imputation, we propose modified Generative Adversarial Networks and Recurrent Neural Networks to accurately impute missing values and return complete data for applying state-of-the-art supervised learning models. For model interpretation, we utilize the actor-critic framework to estimate the feature importance of the trained model's decisions at the instance level. We expand this algorithm into an active sensing framework that recommends which observations to measure and when.
For synthetic data generation, we extend well-known Generative Adversarial Network frameworks from the static setting to the longitudinal setting and propose a novel differentially private synthetic data generation framework. To demonstrate the utility of the proposed models, we evaluate them on various real-world medical datasets, including cohorts in intensive care units, wards, and primary care hospitals. We show that the proposed algorithms consistently outperform the state of the art in handling missing data, understanding the trained model, and generating private synthetic data, all of which are critical for building end-to-end machine learning frameworks for medicine.
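Not the GAN-based methods from the dissertation: a minimal impute-then-predict pipeline on synthetic data, sketching where imputation sits in an end-to-end clinical ML workflow.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # synthetic binary outcome

# Knock out 20% of the entries completely at random.
X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.2] = np.nan

# Impute, then classify: the imputer fills NaNs with column means,
# so the downstream model always sees complete data.
pipe = make_pipeline(SimpleImputer(strategy="mean"), LogisticRegression())
pipe.fit(X_missing, y)
acc = pipe.score(X_missing, y)
print(acc)
```

Mean imputation is the crudest baseline the dissertation improves on; its point here is only to show the imputation stage as a first-class component of the pipeline rather than a preprocessing afterthought.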