On the Generation of Realistic and Robust Counterfactual Explanations for Algorithmic Recourse
The recent widespread deployment of machine learning algorithms presents many new challenges. Machine learning algorithms are usually opaque and can be particularly difficult to interpret. When humans are involved, algorithmic and automated decisions can negatively impact people's lives. Therefore, end users would like to be protected against potential harm. One popular way to achieve this is to provide end users with algorithmic recourse, which gives those negatively affected by algorithmic decisions the opportunity to reverse unfavorable decisions, e.g., from a loan denial to a loan acceptance. In this thesis, we design recourse algorithms to meet various end-user needs. First, we propose methods for generating realistic recourses. We use generative models to suggest recourses likely to occur under the data distribution. To this end, we shift the recourse action from the input space to the generative model's latent space, allowing us to generate counterfactuals that lie in regions with data support. Second, we observe that small changes to the recourses prescribed to end users are likely to invalidate the suggested recourse once it is noisily implemented in practice. Motivated by this observation, we design methods for generating robust recourses and for assessing the robustness of recourse algorithms to data deletion requests. Third, the lack of a commonly used code base for counterfactual explanation and algorithmic recourse algorithms, together with the vast array of evaluation measures in the literature, makes it difficult to compare the performance of different algorithms. To solve this problem, we provide an open-source benchmarking library that streamlines the evaluation process and can be used for benchmarking, rapidly developing new methods, and setting up new experiments. In summary, our work contributes to more reliable interaction between end users and machine-learned models by covering fundamental aspects of the recourse process, and it suggests new solutions for generating realistic and robust counterfactual explanations for algorithmic recourse.
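The latent-space idea above can be sketched in a few lines: rather than perturbing the rejected input directly, we search over a latent code and decode it, so the counterfactual stays in regions with data support. The toy decoder, linear classifier, and step sizes below are illustrative assumptions for this sketch, not the thesis's actual models.

```python
def decoder(z):
    # Toy "generative model": a fixed linear map from latent to input space.
    return [z[0] + 0.5 * z[1], 0.5 * z[0] + z[1]]

def classifier_score(x):
    # Toy linear classifier; score > 0 means "loan accepted".
    return x[0] - x[1] - 0.5

def latent_recourse(z0, steps=200, lr=0.05, eps=1e-4):
    """Finite-difference gradient ascent on the classifier score, taken
    through the decoder so the counterfactual stays on the data manifold."""
    z = list(z0)
    for _ in range(steps):
        if classifier_score(decoder(z)) > 0:
            break
        grad = []
        for i in range(len(z)):
            zp, zm = list(z), list(z)
            zp[i] += eps
            zm[i] -= eps
            grad.append((classifier_score(decoder(zp))
                         - classifier_score(decoder(zm))) / (2 * eps))
        z = [zi + lr * gi for zi, gi in zip(z, grad)]
    return z

z0 = [-1.0, 0.5]                            # latent code of a rejected applicant
z_cf = latent_recourse(z0)
print(classifier_score(decoder(z0)) <= 0)   # original decision: denied
print(classifier_score(decoder(z_cf)) > 0)  # counterfactual: accepted
```

Because the search happens in latent space, the decoded counterfactual is constrained to outputs the generative model can produce, which is the mechanism behind "regions with data support".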
Measuring the impact of COVID-19 on hospital care pathways
Care pathways in hospitals around the world reported significant disruption during the recent COVID-19 pandemic, but measuring the actual impact is more problematic. Process mining can be useful for hospital management to measure the conformance of real-life care to what might be considered normal operations. In this study, we aim to demonstrate that process mining can be used to investigate process changes associated with complex disruptive events. We studied perturbations to accident and emergency (A&E) and maternity pathways in a UK public hospital during the COVID-19 pandemic. Coincidentally, the hospital had implemented a Command Centre approach for patient-flow management, affording an opportunity to study both the planned improvement and the disruption due to the pandemic. Our study proposes and demonstrates a method for measuring and investigating the impact of such planned and unplanned disruptions affecting hospital care pathways. We found that during the pandemic, both A&E and maternity pathways had measurable reductions in the mean length of stay and a measurable drop in the percentage of pathways conforming to normative models. There were no distinctive patterns in the monthly mean values of either length of stay or conformance throughout the phases of the installation of the hospital's new Command Centre approach. Due to a deficit in the available A&E data, the findings for A&E pathways could not be interpreted.
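The conformance measurement described above can be illustrated with a minimal sketch: a normative care pathway as an ordered event model, a per-trace conformance check, and a mean length-of-stay statistic. The event names, timestamps, and exact-sequence model are invented for illustration; real process-mining tools use richer models (e.g., Petri nets) and alignment-based conformance checking.

```python
# Normative model: the expected order of events in an A&E-style pathway.
NORMATIVE = ["arrival", "triage", "treatment", "discharge"]

def conforms(trace, model=NORMATIVE):
    """A trace conforms if its events follow the model order exactly."""
    return [event for event, _ in trace] == model

def mean_length_of_stay(traces):
    """Length of stay = timestamp of last event minus first (hours here)."""
    stays = [trace[-1][1] - trace[0][1] for trace in traces]
    return sum(stays) / len(stays)

traces = [
    [("arrival", 0), ("triage", 1), ("treatment", 3), ("discharge", 6)],
    [("arrival", 0), ("treatment", 2), ("discharge", 4)],  # skipped triage
]
conformance_rate = sum(conforms(t) for t in traces) / len(traces)
print(conformance_rate)             # 0.5: half the pathways conform
print(mean_length_of_stay(traces))  # 5.0 hours
```

Computing these two statistics per month is enough to plot the kind of before/during-pandemic comparison the study describes.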
Named Entity Resolution in Personal Knowledge Graphs
Entity Resolution (ER) is the problem of determining when two entities refer to the same underlying entity. The problem has been studied for over 50 years and has most recently taken on new importance in an era of large, heterogeneous 'knowledge graphs' published on the Web and used widely in domains as wide-ranging as social media, e-commerce, and search. This chapter will discuss the specific problem of named ER in the context of personal knowledge graphs (PKGs). We begin with a formal definition of the problem and the components necessary for doing high-quality and efficient ER. We also discuss some challenges that are expected to arise for Web-scale data. Next, we provide a brief literature review, with a special focus on how existing techniques can potentially apply to PKGs. We conclude the chapter by covering some applications, as well as promising directions for future research.
Comment: To appear as a book chapter by the same name in an upcoming (Oct. 2023) book 'Personal Knowledge Graphs (PKGs): Methodology, tools and applications' edited by Tiwari et al.
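As a hedged illustration of the components mentioned above for high-quality and efficient ER, the sketch below pairs a cheap blocking step, which limits the candidate pairs, with a string-similarity matcher. The blocking key, threshold, and names are illustrative choices for this sketch, not techniques prescribed by the chapter.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

def block_key(name):
    # Cheap blocking: first letter of the surname-like last token, so only
    # names within the same block are ever compared pairwise.
    return name.split()[-1][0].lower()

def match_entities(names, threshold=0.85):
    blocks = defaultdict(list)
    for name in names:
        blocks[block_key(name)].append(name)
    matches = []
    for group in blocks.values():
        for a, b in combinations(group, 2):
            # Similarity stage: edit-distance-style ratio on lowercased names.
            if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold:
                matches.append((a, b))
    return matches

names = ["John Smith", "Jon Smith", "Jane Doe", "J. Smith", "Alice Doe"]
print(match_entities(names))  # [('John Smith', 'Jon Smith')]
```

Blocking is what makes ER efficient at scale: instead of all n(n-1)/2 comparisons, the expensive similarity function runs only inside each block.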
Undergraduate and Graduate Course Descriptions, 2023 Spring
Wright State University undergraduate and graduate course descriptions from Spring 2023
Instance-Based Lossless Summarization of Knowledge Graph With Optimized Triples and Corrections (IBA-OTC)
Knowledge graph (KG) summarization facilitates efficient information retrieval when exploring complex structural data. Summarization removes redundant data and thereby saves computational time during data retrieval, storage space, and in-memory visualization, but it must also preserve the information and structure of the original graph. State-of-the-art approaches summarize a given KG by preserving its structure at the cost of information loss, while approaches that do not preserve the underlying structure compromise the summarization ratio by focusing only on the compression of specific regions. As a result, these approaches either fail to preserve the original facts or wrongly predict inferred information. To solve these problems, we present a novel framework for generating a lossless summary that preserves the structure through super signatures and their corresponding corrections. The proposed approach summarizes only the naturally overlapping instances while maintaining their information and preserving the underlying Resource Description Framework (RDF) graph. The resulting summary is composed of triples with positive, negative, and star corrections that are optimized by the smart invocation of two novel functions, namely merge and disperse. To evaluate the effectiveness of our proposed approach, we perform experiments on nine publicly available real-world knowledge graphs and obtain a summarization ratio better than that of state-of-the-art approaches by a margin of 10% to 30%, while achieving completeness, correctness, and compactness. In this way, the retrieval of common events and groups by queries is accelerated in the resulting graph.
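A minimal sketch of the correction-based lossless idea, under invented data and function names (not the paper's API): a supernode's signature records the edges shared by all grouped entities, positive corrections re-add edges the signature misses, and negative corrections delete edges it over-generates, so the original RDF graph can be reconstructed exactly.

```python
def summarize(triples, group):
    """Merge the entities in `group` into one supernode 'S'."""
    shared = set.intersection(*[
        {(p, o) for s, p, o in triples if s == e} for e in group])
    signature = {("S", p, o) for p, o in shared}
    # Negative corrections: edges the signature implies but that don't exist.
    neg = {(e, p, o) for e in group for p, o in shared} - set(triples)
    # Positive corrections: edges of grouped entities not in the signature.
    pos = {(s, p, o) for s, p, o in triples
           if s in group and (p, o) not in shared}
    rest = {t for t in triples if t[0] not in group}
    return signature | rest, pos, neg, group

def expand(summary, pos, neg, group):
    """Losslessly reconstruct the original graph from summary + corrections."""
    out = set()
    for s, p, o in summary:
        if s == "S":
            out |= {(e, p, o) for e in group}  # expand the supernode
        else:
            out.add((s, p, o))
    return (out - neg) | pos

kg = {("a", "type", "Doctor"), ("b", "type", "Doctor"),
      ("a", "worksAt", "H1"), ("b", "worksAt", "H1"),
      ("b", "heads", "Ward3"), ("H1", "locatedIn", "Leeds")}
summary, pos, neg, group = summarize(kg, {"a", "b"})
print(expand(summary, pos, neg, group) == kg)  # True: lossless round trip
```

The round-trip check is exactly the "lossless" guarantee: whatever compression the signature achieves, the correction sets make reconstruction exact.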
Geographic information extraction from texts
A large volume of unstructured texts containing valuable geographic information is available online. This information, provided implicitly or explicitly, is useful not only for scientific studies (e.g., spatial humanities) but also for many practical applications (e.g., geographic information retrieval). Although great progress has been achieved in geographic information extraction from texts, there are still unsolved challenges and issues, ranging from methods, systems, and data to applications and privacy. Therefore, this workshop will provide a timely opportunity to discuss recent advances, new ideas, and concepts, and also to identify research gaps in geographic information extraction.
Towards a mesoscale rheology model for aqueous particulate suspensions
Particulate suspensions are ubiquitous and diverse; pharmaceutical formulations, biological fluids, magma and foodstuffs are just a few of numerous examples. In many cases, the flow behaviour (rheology) of the suspension is critical to its function. A key rheological property is viscosity, a measure of a substance's resistance to flow. This work aims to understand molecular-level mechanisms responsible for determining flow behaviour in moderately dense suspensions of 35% particles by volume (i.e., volume fraction 0.35). The industrial application of interest to this thesis is catalysis, namely the 'washcoat', a key component in the performance of catalytic converters. A typical washcoat formulation is an aqueous suspension comprising a high surface-area support powder and an active catalyst material, together with organic additives and certain salts used to optimise properties of the washcoat, including its flow behaviour. Of these components, this work investigates 'salt-specific effects', i.e. the influence of differing salt types. Investigation is conducted at molecular and macroscopic resolution via simulations and experiments, respectively. The research approach probes the constituents of a suspension: the aqueous phase, the particle-aqueous phase interface, and particle interactions. Molecular dynamics simulations are employed as the foundation of this analysis, with experiments, namely rheology, nuclear magnetic resonance and dynamic light scattering, utilised alongside. A final set of rheology experiments is conducted on particulate suspensions of 35% volume fraction, in pure water and in the aqueous salt solutions of interest. At all stages of analysis, results suggest that macroscopic behaviours are a cumulative manifestation of phenomena at molecular resolution. However, such phenomena are varied; the challenge lies in identifying which mechanisms are relevant to the behaviour of interest, how they work together, and how they manifest cumulatively. Working towards a mesoscale rheology model for aqueous particulate suspensions, results are discussed in terms of input for such a model, which would predict rheology as a function of particle loading, ionic strength and possibly other factors in future work.
Leveraging literals for knowledge graph embeddings
Knowledge graphs (KGs) represent structured facts composed of entities and the relations between them. To maximise the efficiency of KG applications, it is advantageous to transform KGs into a low-dimensional vector space. KGs follow the Open World Assumption (OWA), i.e., missing information is regarded as potentially possible, which often limits their use in real-world application scenarios. Link prediction (LP) for completing KGs is therefore of great importance. LP can be performed in two different modes, transductive and inductive: the former requires that all entities in the test data are present in the training data, whereas the latter also allows previously unseen entities in the test data. This thesis investigates the use of literals in transductive and inductive LP, since KGs contain numerous numeric and textual literals that carry essential semantics. Dedicated benchmark datasets are introduced to evaluate these LP methods.
In particular, a novel KG embedding (KGE) method, RAILD, is proposed, which uses textual literals together with contextual graph information for LP. RAILD aims to close the existing research gap of learning embeddings for relations unseen during training. To this end, an architecture is proposed that combines language models (LMs) with network embeddings. Powerful pre-trained LMs such as BERT are fine-tuned for LP using textual descriptions of entities and relations. Moreover, a new algorithm, WeiDNeR, is introduced to generate a relation network that serves to learn graph-based relation embeddings with a network embedding model. The vector representations of these relations are combined for LP. In addition, another novel embedding model, LitKGE, is presented, which uses numeric literals for transductive LP. It aims to generate numeric features for entities through graph traversal. For this purpose, a further algorithm, WeiDNeR_Extended, is introduced, which produces a network of object and datatype properties. Numeric entity features are then generated from the property paths extracted from this network.
Furthermore, the use of a multilingual LM to encode entity descriptions in different natural languages for LP is investigated. For the evaluation of the KGE models, the benchmark datasets LiterallyWikidata and Wikidata68K were created. The promising results obtained with the proposed models open up interesting questions for future research in the field of KGEs and their downstream applications.
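The graph-traversal idea behind numeric feature generation can be sketched as follows: follow a path of object properties from an entity and end at a datatype property, using the literal found there as a numeric feature for the starting entity. The toy KG and property path are invented for this sketch and are not taken from LitKGE.

```python
# Toy KG: object properties link entities; datatype properties hold literals.
kg = {
    ("berlin", "capitalOf"): "germany",
    ("paris", "capitalOf"): "france",
    ("germany", "population"): 83.2,   # numeric literal (millions)
    ("france", "population"): 67.8,
}

def feature(entity, path):
    """Traverse object properties along `path`, ending at a datatype
    property; return the literal reached, or None if the path breaks."""
    node = entity
    for prop in path:
        node = kg.get((node, prop))
        if node is None:
            return None
    return node

# Feature "population of the country this city is capital of":
print(feature("berlin", ["capitalOf", "population"]))  # 83.2
print(feature("paris", ["capitalOf", "population"]))   # 67.8
```

Features produced this way can then be normalised and concatenated with structural embeddings before scoring candidate links.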