868 research outputs found

    Uncertainty and Interpretability Studies in Soft Computing with an Application to Complex Manufacturing Systems

    Get PDF
    In systems modelling and control theory, the benefits of applying neural networks have been extensively studied. Particularly in manufacturing processes, such as the prediction of mechanical properties of heat treated steels. However, modern industrial processes usually involve large amounts of data and a range of non-linear effects and interactions that might hinder their model interpretation. For example, in steel manufacturing the understanding of complex mechanisms that lead to the mechanical properties which are generated by the heat treatment process is vital. This knowledge is not available via numerical models, therefore an experienced metallurgist estimates the model parameters to obtain the required properties. This human knowledge and perception sometimes can be imprecise leading to a kind of cognitive uncertainty such as vagueness and ambiguity when making decisions. In system classification, this may be translated into a system deficiency - for example, small input changes in system attributes may result in a sudden and inappropriate change for class assignation. In order to address this issue, practitioners and researches have developed systems that are functional equivalent to fuzzy systems and neural networks. Such systems provide a morphology that mimics the human ability of reasoning via the qualitative aspects of fuzzy information rather by its quantitative analysis. Furthermore, these models are able to learn from data sets and to describe the associated interactions and non-linearities in the data. However, in a like-manner to neural networks, a neural fuzzy system may suffer from a lost of interpretability and transparency when making decisions. This is mainly due to the application of adaptive approaches for its parameter identification. Since the RBF-NN can be treated as a fuzzy inference engine, this thesis presents several methodologies that quantify different types of uncertainty and its influence on the model interpretability and transparency of the RBF-NN during its parameter identification. Particularly, three kind of uncertainty sources in relation to the RBF-NN are studied, namely: entropy, fuzziness and ambiguity. First, a methodology based on Granular Computing (GrC), neutrosophic sets and the RBF-NN is presented. The objective of this methodology is to quantify the hesitation produced during the granular compression at the low level of interpretability of the RBF-NN via the use of neutrosophic sets. This study also aims to enhance the disitnguishability and hence the transparency of the initial fuzzy partition. The effectiveness of the proposed methodology is tested against a real case study for the prediction of the properties of heat-treated steels. Secondly, a new Interval Type-2 Radial Basis Function Neural Network (IT2-RBF-NN) is introduced as a new modelling framework. The IT2-RBF-NN takes advantage of the functional equivalence between FLSs of type-1 and the RBF-NN so as to construct an Interval Type-2 Fuzzy Logic System (IT2-FLS) that is able to deal with linguistic uncertainty and perceptions in the RBF-NN rule base. This gave raise to different combinations when optimising the IT2-RBF-NN parameters. Finally, a twofold study for uncertainty assessment at the high-level of interpretability of the RBF-NN is provided. On the one hand, the first study proposes a new methodology to quantify the a) fuzziness and the b) ambiguity at each RU, and during the formation of the rule base via the use of neutrosophic sets theory. The aim of this methodology is to calculate the associated fuzziness of each rule and then the ambiguity related to each normalised consequence of the fuzzy rules that result from the overlapping and to the choice with one-to-many decisions respectively. On the other hand, a second study proposes a new methodology to quantify the entropy and the fuzziness that come out from the redundancy phenomenon during the parameter identification. To conclude this work, the experimental results obtained through the application of the proposed methodologies for modelling two well-known benchmark data sets and for the prediction of mechanical properties of heat-treated steels conducted to publication of three articles in two peer-reviewed journals and one international conference

    Taxonomy, Semantic Data Schema, and Schema Alignment for Open Data in Urban Building Energy Modeling

    Full text link
    Urban Building Energy Modeling (UBEM) is a critical tool to provide quantitative analysis on building decarbonization, sustainability, building-to-grid integration, and renewable energy applications on city, regional, and national scales. Researchers usually use open data as inputs to build and calibrate UBEM. However, open data are from thousands of sources covering various perspectives of weather, building characteristics, etc. Besides, a lack of semantic features of open data further increases the engineering effort to process information to be directly used for UBEM as inputs. In this paper, we first reviewed open data types used for UBEM and developed a taxonomy to categorize open data. Based on that, we further developed a semantic data schema for each open data category to maintain data consistency and improve model automation for UBEM. In a case study, we use three popular open data to show how they can be automatically processed based on the proposed schematic data structure using large language models. The accurate results generated by large language models indicate the machine-readability and human-interpretability of the developed semantic data schema

    An overview of recent distributed algorithms for learning fuzzy models in Big Data classification

    Get PDF
    AbstractNowadays, a huge amount of data are generated, often in very short time intervals and in various formats, by a number of different heterogeneous sources such as social networks and media, mobile devices, internet transactions, networked devices and sensors. These data, identified as Big Data in the literature, are characterized by the popular Vs features, such as Value, Veracity, Variety, Velocity and Volume. In particular, Value focuses on the useful knowledge that may be mined from data. Thus, in the last years, a number of data mining and machine learning algorithms have been proposed to extract knowledge from Big Data. These algorithms have been generally implemented by using ad-hoc programming paradigms, such as MapReduce, on specific distributed computing frameworks, such as Apache Hadoop and Apache Spark. In the context of Big Data, fuzzy models are currently playing a significant role, thanks to their capability of handling vague and imprecise data and their innate characteristic to be interpretable. In this work, we give an overview of the most recent distributed learning algorithms for generating fuzzy classification models for Big Data. In particular, we first show some design and implementation details of these learning algorithms. Thereafter, we compare them in terms of accuracy and interpretability. Finally, we argue about their scalability

    A Generalized Look at Federated Learning: Survey and Perspectives

    Full text link
    Federated learning (FL) refers to a distributed machine learning framework involving learning from several decentralized edge clients without sharing local dataset. This distributed strategy prevents data leakage and enables on-device training as it updates the global model based on the local model updates. Despite offering several advantages, including data privacy and scalability, FL poses challenges such as statistical and system heterogeneity of data in federated networks, communication bottlenecks, privacy and security issues. This survey contains a systematic summarization of previous work, studies, and experiments on FL and presents a list of possibilities for FL across a range of applications and use cases. Other than that, various challenges of implementing FL and promising directions revolving around the corresponding challenges are provided.Comment: 9 pages, 2 figure

    Language modelling for clinical natural language understanding and generation

    Get PDF
    One of the long-standing objectives of Artificial Intelligence (AI) is to design and develop algorithms for social good including tackling public health challenges. In the era of digitisation, with an unprecedented amount of healthcare data being captured in digital form, the analysis of the healthcare data at scale can lead to better research of diseases, better monitoring patient conditions and more importantly improving patient outcomes. However, many AI-based analytic algorithms rely solely on structured healthcare data such as bedside measurements and test results which only account for 20% of all healthcare data, whereas the remaining 80% of healthcare data is unstructured including textual data such as clinical notes and discharge summaries which is still underexplored. Conventional Natural Language Processing (NLP) algorithms that are designed for clinical applications rely on the shallow matching, templates and non-contextualised word embeddings which lead to limited understanding of contextual semantics. Though recent advances in NLP algorithms have demonstrated promising performance on a variety of NLP tasks in the general domain with contextualised language models, most of these generic NLP algorithms struggle at specific clinical NLP tasks which require biomedical knowledge and reasoning. Besides, there is limited research to study generative NLP algorithms to generate clinical reports and summaries automatically by considering salient clinical information. This thesis aims to design and develop novel NLP algorithms especially clinical-driven contextualised language models to understand textual healthcare data and generate clinical narratives which can potentially support clinicians, medical scientists and patients. The first contribution of this thesis focuses on capturing phenotypic information of patients from clinical notes which is important to profile patient situation and improve patient outcomes. The thesis proposes a novel self-supervised language model, named Phenotypic Intelligence Extraction (PIE), to annotate phenotypes from clinical notes with the detection of contextual synonyms and the enhancement to reason with numerical values. The second contribution is to demonstrate the utility and benefits of using phenotypic features of patients in clinical use cases by predicting patient outcomes in Intensive Care Units (ICU) and identifying patients at risk of specific diseases with better accuracy and model interpretability. The third contribution is to propose generative models to generate clinical narratives to automate and accelerate the process of report writing and summarisation by clinicians. This thesis first proposes a novel summarisation language model named PEGASUS which surpasses or is on par with the state-of-the-art performance on 12 downstream datasets including biomedical literature from PubMed. PEGASUS is further extended to generate medical scientific documents from input tabular data.Open Acces

    Query Understanding in the Age of Large Language Models

    Full text link
    Querying, conversing, and controlling search and information-seeking interfaces using natural language are fast becoming ubiquitous with the rise and adoption of large-language models (LLM). In this position paper, we describe a generic framework for interactive query-rewriting using LLMs. Our proposal aims to unfold new opportunities for improved and transparent intent understanding while building high-performance retrieval systems using LLMs. A key aspect of our framework is the ability of the rewriter to fully specify the machine intent by the search engine in natural language that can be further refined, controlled, and edited before the final retrieval phase. The ability to present, interact, and reason over the underlying machine intent in natural language has profound implications on transparency, ranking performance, and a departure from the traditional way in which supervised signals were collected for understanding intents. We detail the concept, backed by initial experiments, along with open questions for this interactive query understanding framework.Comment: Accepted to GENIR(SIGIR'23

    Quantum Long Short-Term Memory (QLSTM) vs Classical LSTM in Time Series Forecasting: A Comparative Study in Solar Power Forecasting

    Full text link
    Accurately forecasting solar power generation is crucial in the global progression towards sustainable energy systems. In this study, we conduct a meticulous comparison between Quantum Long Short-Term Memory (QLSTM) and classical Long Short-Term Memory (LSTM) models for solar power production forecasting. Our controlled experiments reveal promising advantages of QLSTMs, including accelerated training convergence and substantially reduced test loss within the initial epoch compared to classical LSTMs. These empirical findings demonstrate QLSTM's potential to swiftly assimilate complex time series relationships, enabled by quantum phenomena like superposition. However, realizing QLSTM's full capabilities necessitates further research into model validation across diverse conditions, systematic hyperparameter optimization, hardware noise resilience, and applications to correlated renewable forecasting problems. With continued progress, quantum machine learning can offer a paradigm shift in renewable energy time series prediction. This pioneering work provides initial evidence substantiating quantum advantages over classical LSTM, while acknowledging present limitations. Through rigorous benchmarking grounded in real-world data, our study elucidates a promising trajectory for quantum learning in renewable forecasting. Additional research and development can further actualize this potential to achieve unprecedented accuracy and reliability in predicting solar power generation worldwide.Comment: 17 pages, 8 figure

    A Systematic Review for Transformer-based Long-term Series Forecasting

    Full text link
    The emergence of deep learning has yielded noteworthy advancements in time series forecasting (TSF). Transformer architectures, in particular, have witnessed broad utilization and adoption in TSF tasks. Transformers have proven to be the most successful solution to extract the semantic correlations among the elements within a long sequence. Various variants have enabled transformer architecture to effectively handle long-term time series forecasting (LTSF) tasks. In this article, we first present a comprehensive overview of transformer architectures and their subsequent enhancements developed to address various LTSF tasks. Then, we summarize the publicly available LTSF datasets and relevant evaluation metrics. Furthermore, we provide valuable insights into the best practices and techniques for effectively training transformers in the context of time-series analysis. Lastly, we propose potential research directions in this rapidly evolving field

    Deep Learning Systems with Linguistic Interpretability in Manufacturing Image Classifications

    Get PDF
    Machine learning has received considerable attention in recent decades in data-driven modelling systems and methods. Machine Learning focuses on applied maths and computing algorithms for creating ‘computational machines’ that can learn to imitate system behaviours automatically. Unlike traditional system modelling methods (physics-based, numerical etc.), machine learning does not require a dynamic process model but sufficient data, including input and output data of a specific system. It thus could get high prediction accuracy but lack interpretability. A method to add transparency to deep CNNs is adding a fuzzy logic radius basis function to specific CNN structures named RBF-CNN. With the deletion of the defuzzy layer, a more generalised form was introduced, namely ND-RBF-CNN. Both RBF-CNN and ND-RBF-CNN were benchmarked for linguistic fuzzy rules' prediction accuracy and interpretability. Both structures demonstrated good interpretability at a small cost of prediction accuracy. To improve the prediction accuracy, a general RBF layer initialisation methodology was explored
    • …
    corecore