A knowledge graph-supported information fusion approach for multi-faceted conceptual modelling
It has become progressively more evident that a single data source cannot comprehensively capture the variability of a multi-faceted concept, such as product design, driving behaviour or human trust, which has diverse semantic orientations. Multi-faceted conceptual modelling is therefore often conducted on multi-sourced data covering the indispensable aspects, and information fusion is frequently applied to cope with the resulting high dimensionality and data heterogeneity. The consideration of intra-facet relationships is equally indispensable. In this context, a knowledge graph (KG), which can aggregate the relationships of multiple aspects through semantic associations, is exploited to facilitate multi-faceted conceptual modelling based on heterogeneous, semantically rich data. First, rules of fault mechanism are extracted from the existing domain knowledge repository, and node attributes are extracted from the multi-sourced data. Through abstraction and tokenisation of the existing knowledge repository and the concept-centric data, the rules of fault mechanism are symbolised and integrated with the node attributes, which serve as the entities of a concept-centric knowledge graph (CKG). Subsequently, the process data are transformed into a stack of temporal graphs under the CKG backbone. Lastly, a graph convolutional network (GCN) is applied to extract temporal and attribute correlation features from the graphs, and a temporal convolutional network (TCN) is built for conceptual modelling using these features. The effectiveness of the proposed approach and the close synergy between the KG-supported approach and multi-faceted conceptual modelling are demonstrated and substantiated in a case study using real-world data.
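As a rough illustration of the pipeline this abstract describes, the sketch below applies a one-layer GCN to each graph in a temporal stack and feeds the pooled per-step features to a dilated 1-D convolution standing in for the TCN. It is a minimal sketch in PyTorch; the layer sizes, mean-pooling and two-level dilation are illustrative assumptions, not details taken from the paper.

```python
# Minimal GCN-then-TCN sketch in PyTorch (illustrative, not the paper's model).
import torch
import torch.nn as nn

class SimpleGCNLayer(nn.Module):
    """One graph convolution: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, a_hat, h):
        # a_hat: (n_nodes, n_nodes) normalised adjacency; h: (n_nodes, in_dim)
        return torch.relu(a_hat @ self.linear(h))

class GCNTCN(nn.Module):
    """Apply a GCN to each temporal graph, then a dilated 1-D conv over time."""
    def __init__(self, attr_dim, hidden_dim, out_dim):
        super().__init__()
        self.gcn = SimpleGCNLayer(attr_dim, hidden_dim)
        self.tcn = nn.Sequential(
            nn.Conv1d(hidden_dim, hidden_dim, kernel_size=3, dilation=1, padding=1),
            nn.ReLU(),
            nn.Conv1d(hidden_dim, hidden_dim, kernel_size=3, dilation=2, padding=2),
            nn.ReLU(),
        )
        self.head = nn.Linear(hidden_dim, out_dim)

    def forward(self, a_hats, node_feats):
        # a_hats: (T, n, n) adjacencies; node_feats: (T, n, attr_dim)
        per_step = [self.gcn(a, x).mean(dim=0) for a, x in zip(a_hats, node_feats)]
        seq = torch.stack(per_step, dim=1).unsqueeze(0)   # (1, hidden_dim, T)
        out = self.tcn(seq).mean(dim=2)                   # pool over time
        return self.head(out)

# Toy usage with a 5-node CKG observed over 8 time steps.
T, n, d = 8, 5, 4
a = torch.eye(n).repeat(T, 1, 1)        # stand-in for normalised adjacencies
x = torch.randn(T, n, d)                # stand-in for node attributes
print(GCNTCN(d, 16, 2)(a, x).shape)     # torch.Size([1, 2])
```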
Creating granular climate zones for future-proof building design in the UK
Data availability: datasets related to this article can be found at https://catalogue.ceda.ac.uk, hosted at the CEDA archive.
Climate zones play an important role in promoting climate-responsive building design and in implementing climate-specific prescriptions in national building standards and regulations. Existing studies on climate zoning are subject to several limitations, namely the inability to distinguish microclimates and the lack of consideration of climate change. In this research, we propose a two-tiered ensemble clustering method for identifying granular climate zones from projections of the future climate. The first tier identifies primary climate zones using a combination of climatic features and geographical locations, whereas the second tier identifies distinct local variations within each primary climate zone using temperature-related features. The proposed ensemble clustering model is applied to the UK to create a mapping of granular climate zones for future-proofing building design. The method identified 14 distinct primary zones and distinguished microclimates at a range of scales, from large urban areas, such as the Greater London Area, to national parks, such as Dartmoor and the Pennines. The identified mapping resolves two major obstacles in the creation and use of weather data for building performance assessment in the UK: the lack of guidance for selecting weather files, and the absence of a scientific rationale for representing the UK climate using the current 14 locations.
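The two-tier structure can be pictured with a short sketch. Here plain k-means stands in for the paper's ensemble clusterer, and the synthetic feature columns, the 14 primary zones and the number of sub-zones per zone are illustrative assumptions.

```python
# Two-tiered clustering sketch (illustrative stand-in for the ensemble method).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Hypothetical grid cells: [latitude, longitude, mean temp, precipitation].
cells = rng.normal(size=(1000, 4))

# Tier 1: primary zones from climatic features plus geographical location.
tier1_feats = StandardScaler().fit_transform(cells)
primary = KMeans(n_clusters=14, n_init=10, random_state=0).fit_predict(tier1_feats)

# Tier 2: local variations within each primary zone, temperature features only.
granular = np.empty(len(cells), dtype=object)
for zone in np.unique(primary):
    mask = primary == zone
    temp_feats = StandardScaler().fit_transform(cells[mask][:, 2:3])
    k = min(3, int(mask.sum()))  # illustrative number of sub-zones
    sub = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(temp_feats)
    granular[mask] = [f"{zone}.{s}" for s in sub]

print(len(np.unique(granular)), "granular zones")
```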
Interpreting Black-Box Models: A Review on Explainable Artificial Intelligence
Recent years have seen tremendous growth in Artificial Intelligence (AI)-based methodological development across a broad range of domains. In this rapidly evolving field, a large number of methods are being reported using machine learning (ML) and deep learning (DL) models. The majority of these models are inherently complex and lack explanations of their decision-making process, causing them to be termed 'black-box' models. One of the major bottlenecks to adopting such models in mission-critical application domains, such as banking, e-commerce, healthcare, and public services and safety, is the difficulty of interpreting them. Owing to the rapid proliferation of these AI models, explaining their learning and decision-making processes is becoming harder, which calls for transparency and ready interpretability. Finding flaws in these black-box models, so as to reduce their false negative and false positive outcomes, also remains difficult and inefficient. Aiming to collate the current state of the art in interpreting black-box models, this study provides a comprehensive analysis of explainable AI (XAI) models. The development of XAI is reviewed meticulously through careful selection and analysis of the current state-of-the-art XAI research. The study also provides a comprehensive and in-depth evaluation of XAI frameworks and their efficacy, to serve as a starting point in XAI for applied and theoretical researchers. Towards the end, it highlights emerging and critical issues pertaining to XAI research, showcasing major, model-specific trends for better explanation, enhanced transparency, and improved prediction accuracy.
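To make the notion of post-hoc interpretation concrete, the sketch below implements permutation feature importance, one of the simplest model-agnostic techniques covered in the XAI literature; the gradient-boosted model and synthetic data are illustrative stand-ins for any black-box.

```python
# Permutation feature importance: a simple model-agnostic XAI technique.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=500, n_features=6, random_state=0)
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)
baseline = black_box.score(X, y)

rng = np.random.default_rng(0)
for j in range(X.shape[1]):
    X_perm = X.copy()
    rng.shuffle(X_perm[:, j])          # break the feature's link to the target
    drop = baseline - black_box.score(X_perm, y)
    print(f"feature {j}: importance approx. {drop:.3f}")
```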
On the Generation of Realistic and Robust Counterfactual Explanations for Algorithmic Recourse
The recent widespread deployment of machine learning algorithms presents many new challenges. Machine learning algorithms are usually opaque and can be particularly difficult to interpret. When humans are involved, algorithmic and automated decisions can negatively impact people's lives, so end users would like to be protected against potential harm. One popular way to achieve this is to provide end users with algorithmic recourse, which gives those negatively affected by algorithmic decisions the opportunity to reverse unfavorable decisions, e.g., from a loan denial to a loan acceptance. In this thesis, we design recourse algorithms to meet various end-user needs. First, we propose methods for the generation of realistic recourses. We use generative models to suggest recourses likely to occur under the data distribution. To this end, we shift the recourse action from the input space to the generative model's latent space, allowing us to generate counterfactuals that lie in regions with data support. Second, we observe that small changes applied to the recourses prescribed to end users are likely to invalidate the suggested recourse once it is noisily implemented in practice. Motivated by this observation, we design methods for the generation of robust recourses and for assessing the robustness of recourse algorithms to data deletion requests. Third, the lack of a commonly used code base for counterfactual explanation and algorithmic recourse algorithms, together with the vast array of evaluation measures in the literature, makes it difficult to compare the performance of different algorithms. To solve this problem, we provide an open-source benchmarking library that streamlines the evaluation process and can be used for benchmarking, rapidly developing new methods, and setting up new experiments. In summary, our work contributes to a more reliable interaction between end users and machine-learned models by covering fundamental aspects of the recourse process, and it suggests new solutions towards generating realistic and robust counterfactual explanations for algorithmic recourse.
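The latent-space idea in the first contribution can be sketched compactly: instead of perturbing the input directly, one optimises a latent code and decodes it, so the counterfactual stays on the generative model's manifold. The networks, loss weights and step counts below are illustrative assumptions, not the thesis's actual models.

```python
# Latent-space counterfactual search sketch in PyTorch (illustrative).
import torch
import torch.nn as nn

torch.manual_seed(0)
decoder = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 5))     # z -> x
classifier = nn.Sequential(nn.Linear(5, 16), nn.ReLU(), nn.Linear(16, 1))  # x -> logit

x_factual = torch.randn(5)                 # instance that received a denial
z = torch.zeros(2, requires_grad=True)     # would come from an encoder in practice
optimizer = torch.optim.Adam([z], lr=0.05)

for _ in range(200):
    optimizer.zero_grad()
    x_cf = decoder(z)
    # Push the classifier towards the favorable class (logit > 0) while
    # keeping the counterfactual close to the factual instance.
    loss = torch.relu(0.5 - classifier(x_cf)) + 0.1 * (x_cf - x_factual).norm()
    loss.backward()
    optimizer.step()

x_cf = decoder(z).detach()
print("favorable now:", bool(classifier(x_cf) > 0))
```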
Generalized network-based dimensionality analysis
Network analysis opens new horizons for data analysis methods, as the results of ever-developing network science can be integrated into classical data analysis techniques. This paper presents the generalized version of network-based dimensionality reduction and analysis (NDA). The main contributions of this paper are as follows: (1) the proposed generalized network-based dimensionality analysis (GNDA) method handles low-dimensional, high-sample-size (LDHSS) and high-dimensional, low-sample-size (HDLSS) settings at the same time, and, compared with existing methods, we show that only the proposed GNDA method adequately estimates the number of latent variables (LVs); (2) the proposed GNDA considers any symmetric or nonsymmetric similarity function between indicators (i.e., variables or observations) to specify LVs; (3) the proposed prefiltering and resolution parameters provide a hierarchical version of GNDA with which to check the robustness of LVs. The proposed GNDA method is compared with traditional dimensionality reduction methods on various simulated and real-world datasets.
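The network-based idea can be illustrated briefly: indicators become nodes, edges carry a similarity (here absolute correlation above a prefiltering threshold), and communities in the resulting graph are read as latent variables. The threshold, similarity choice and community algorithm below are illustrative assumptions rather than the paper's exact specification.

```python
# Correlation-network community detection as a stand-in for NDA/GNDA.
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

rng = np.random.default_rng(0)
# Two hypothetical latent variables driving six observed indicators.
lv = rng.normal(size=(300, 2))
X = np.column_stack([lv[:, 0]] * 3 + [lv[:, 1]] * 3) + 0.3 * rng.normal(size=(300, 6))

corr = np.corrcoef(X, rowvar=False)
G = nx.Graph()
G.add_nodes_from(range(X.shape[1]))
threshold = 0.5                      # prefiltering parameter (illustrative)
for i in range(X.shape[1]):
    for j in range(i + 1, X.shape[1]):
        if abs(corr[i, j]) > threshold:
            G.add_edge(i, j, weight=abs(corr[i, j]))

# Modularity communities play the role of latent-variable groups; the
# resolution parameter controls their granularity, as in the paper.
communities = greedy_modularity_communities(G, weight="weight", resolution=1.0)
print([sorted(c) for c in communities])   # expect {0,1,2} and {3,4,5}
```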
Robustness, Heterogeneity and Structure Capturing for Graph Representation Learning and its Application
Graph neural networks (GNNs) are potent methods for graph representation learning (GRL), which extract knowledge from complicated (graph-)structured data in various real-world scenarios. However, GRL still faces many challenges. First, GNN-based node classification may deteriorate substantially by overlooking the possibility of noisy data in graph structures, as models wrongly treat the relations among nodes in the input graphs as ground truth. Second, nodes and edges have different types in the real world, and it is essential to capture this heterogeneity in graph representation learning. Next, relations among nodes are not restricted to pairwise relations, and it is necessary to capture such complex relations accordingly. Finally, the absence of structural encodings, such as positional information, deteriorates the performance of GNNs. This thesis proposes novel methods to address the aforementioned problems:
1. Bayesian Graph Attention Network (BGAT): developed for situations with scarce data, this method addresses the influence of spurious edges. Incorporating Bayesian principles into the graph attention mechanism enhances robustness, leading to competitive performance against benchmarks (Chapter 3).
2. Neighbour Contrastive Heterogeneous Graph Attention Network (NC-HGAT): by enhancing a cutting-edge self-supervised heterogeneous graph neural network model (HGAT) with neighbour contrastive learning, this method addresses heterogeneity and uncertainty simultaneously. Extra attention to edge relations in heterogeneous graphs also aids subsequent classification tasks (Chapter 4).
3. A novel ensemble learning framework is introduced for predicting stock price movements. It adeptly captures both group-level and pairwise relations, leading to notable advancements over the existing state of the art. The integration of hypergraph and graph models, coupled with the utilisation of auxiliary data via GNNs before a recurrent neural network (RNN), provides a deeper understanding of long-term dependencies between similar entities in multivariate time series analysis (Chapter 5).
4. A novel framework for graph structure learning is introduced, segmenting graphs into distinct patches. By harnessing the capabilities of transformers and integrating positional encoding techniques, this approach robustly captures intricate structural information within a graph, resulting in a more comprehensive understanding of its underlying patterns (Chapter 6); the positional-encoding idea is sketched below.
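For the positional encodings mentioned in item 4, the sketch below computes Laplacian-eigenvector encodings, a standard structural encoding for graph transformers; it is not necessarily the thesis's exact method.

```python
# Laplacian-eigenvector positional encodings (standard technique, illustrative).
import numpy as np

def laplacian_positional_encoding(adj: np.ndarray, k: int) -> np.ndarray:
    """Return the k non-trivial eigenvectors of the symmetric normalised
    Laplacian as per-node positional features."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(lap)        # eigenvalues in ascending order
    return eigvecs[:, 1:k + 1]              # drop the trivial first eigenvector

# Toy usage: a 6-node ring graph, 2-dimensional encodings.
n = 6
adj = np.zeros((n, n))
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0
print(laplacian_positional_encoding(adj, k=2))
```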
A sequence-based machine learning model for predicting antigenic distance for H3N2 influenza virus
Introduction: Seasonal influenza A H3N2 viruses are constantly changing, reducing the effectiveness of existing vaccines. As a result, the World Health Organization (WHO) needs to update the vaccine strains frequently to match the antigenicity of emerging H3N2 variants. Traditional assessments of antigenicity rely on serological methods, which are both labor-intensive and time-consuming. Although numerous computational models aim to simplify antigenicity determination, they either lack a robust quantitative linkage between antigenicity and viral sequences or focus restrictively on selected features.
Methods: Here, we propose a novel computational method to predict antigenic distances from multiple features, integrating not only viral sequence attributes but also four distinct categories of sequence-derived features that significantly affect viral antigenicity.
Results: This method exhibits low error in virus antigenicity prediction and achieves superior accuracy in discerning antigenic drift. Using this method, we investigated the evolution of the H3N2 influenza viruses and identified a total of 21 major antigenic clusters from 1968 to 2022.
Discussion: Interestingly, our predicted antigenic map aligns closely with the antigenic map generated from serological data. Our method is thus a promising tool for detecting antigenic variants and guiding the selection of vaccine candidates.
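A compact sketch of the sequence-based setup: featurise a pair of aligned HA sequences and regress an antigenic distance on those features. The mismatch-indicator encoding, the random-forest regressor and the synthetic data below are illustrative assumptions; the paper's four feature categories are not reproduced here.

```python
# Sequence-pair regression sketch for antigenic distance (illustrative).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
rng = np.random.default_rng(0)

def pair_features(seq_a: str, seq_b: str) -> np.ndarray:
    """0/1 mismatch indicator per aligned position."""
    return np.array([a != b for a, b in zip(seq_a, seq_b)], dtype=float)

# Synthetic aligned HA fragments and a toy 'antigenic distance' that grows
# with the number of substitutions.
length, n_pairs = 50, 400
pairs, dists = [], []
for _ in range(n_pairs):
    s1 = "".join(rng.choice(list(AMINO_ACIDS), size=length))
    s2 = list(s1)
    n_mut = int(rng.integers(0, 10))
    for pos in rng.choice(length, size=n_mut, replace=False):
        s2[pos] = rng.choice(list(AMINO_ACIDS))
    pairs.append(pair_features(s1, "".join(s2)))
    dists.append(n_mut + rng.normal(scale=0.5))

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(np.array(pairs), np.array(dists))
print("R^2 on training pairs:", round(model.score(np.array(pairs), np.array(dists)), 2))
```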
Modified fuzzy rough set technique with stacked autoencoder model for magnetic resonance imaging based breast cancer detection
Breast cancer is the most common cancer in women, and early detection reduces the mortality rate. Magnetic resonance imaging (MRI) is effective for analyzing breast cancer, but abnormalities are hard to identify in the images, and manual breast cancer detection in MRI is inefficient; therefore, a deep learning-based system is implemented in this manuscript. Initially, visual quality is improved using region growing and adaptive histogram equalization (AHE); the breast lesion is then segmented by Otsu thresholding with a morphological transform. Next, features are extracted from the segmented lesion, and a modified fuzzy rough set technique is proposed to reduce the dimensionality of the extracted features, which decreases the system's complexity and computational time. The active features are fed to a stacked autoencoder for classifying the benign and malignant classes. The results demonstrate that the proposed model attained classification accuracies of 99% and 99.22% on the benchmark datasets, higher than those of the comparative classifiers: decision tree, naïve Bayes, random forest and k-nearest neighbor (KNN). These results indicate that the proposed model screens and detects breast lesions effectively, assisting clinicians in timely and effective therapeutic intervention.
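The classification stage can be sketched as a stacked autoencoder pretrained by reconstruction and fine-tuned with a supervised head. Layer sizes, feature dimension and training details below are illustrative assumptions, with greedy layer-wise pretraining condensed into a single reconstruction stage.

```python
# Stacked-autoencoder classifier sketch in PyTorch (illustrative).
import torch
import torch.nn as nn

torch.manual_seed(0)
n_feats = 32                                   # reduced feature vector size
encoder = nn.Sequential(nn.Linear(n_feats, 16), nn.ReLU(), nn.Linear(16, 8), nn.ReLU())
decoder = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, n_feats))
head = nn.Linear(8, 2)                         # benign vs malignant

X = torch.randn(256, n_feats)                  # stand-in for selected features
y = torch.randint(0, 2, (256,))

# Stage 1: unsupervised reconstruction pretraining of the stacked layers.
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(decoder(encoder(X)), X)
    loss.backward()
    opt.step()

# Stage 2: supervised fine-tuning of encoder plus classification head.
opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(head(encoder(X)), y)
    loss.backward()
    opt.step()

print("train accuracy:", (head(encoder(X)).argmax(1) == y).float().mean().item())
```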
Multidisciplinary perspectives on Artificial Intelligence and the law
This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence ('AI') and the law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics. Although AI was initially allowed to develop largely without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. It is an outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, and includes contributions by leading scholars in the fields of technology, ethics and the law.