40 research outputs found

    Domain Knowledge Guided Testing and Training of Neural Networks

    The extensive impact of Deep Neural Networks (DNNs) on various industrial applications and research areas within the last decade cannot be overstated. However, they are also subject to notable limitations, namely their vulnerability to various forms of security attacks and their need for excessive data - especially for particular types of DNNs such as generative adversarial networks (GANs). Tackling the former challenge, researchers have proposed several testing, analysis, and verification (TAV) methods for DNNs. However, current state-of-the-art DNN TAV methods are either not scalable to industrial-sized DNNs or are not expressible (i.e., cannot test DNNs for a rich set of properties). On the other hand, making GANs more data-efficient is an open area of research, and can potentially lead to improvements in training time and costs. In this work, I address these issues by leveraging domain knowledge - task-specific knowledge provided as an additional source of information - in order to better test and train DNNs. In particular, I present Constrained Gradient Descent (CGD), a novel algorithm (and a resultant tool called CGDTest) that leverages domain knowledge (in the form of logical constraints) to create a DNN TAV method that is both scalable and expressible. Further, I introduce a novel gradient descent method (and a resultant GAN referred to as xAI-GAN) that leverages domain knowledge (provided in the form of neuron importance) to train GANs to be more data-efficient. Through empirical evaluation, I show that both tools improve over current state-of-the-art methods in their respective applications. This thesis highlights the potential of leveraging domain knowledge to mitigate DNN weaknesses and paves the way for further research in this area.

    From Shallow to Deep Interactions Between Knowledge Representation, Reasoning and Machine Learning (Kay R. Amel group)

    53 pages; Kay R. Amel is the pen name of the working group “Apprentissage et Raisonnement” of the GDR (“Groupement De Recherche”) named “Aspects Formels et Algorithmiques de l’Intelligence Artificielle”, CNRS, France (https://www.gdria.fr/presentation/). This paper proposes a tentative and original survey of meeting points between Knowledge Representation and Reasoning (KRR) and Machine Learning (ML), two areas which have been developing quite separately in the last three decades. Some common concerns are identified and discussed, such as the types of representation used, the roles of knowledge and data, the lack or the excess of information, or the need for explanations and causal understanding. Then some methodologies combining reasoning and learning are reviewed (such as inductive logic programming, neuro-symbolic reasoning, formal concept analysis, rule-based representations and ML, uncertainty in ML, or case-based reasoning and analogical reasoning), before discussing examples of synergies between KRR and ML (including topics such as belief functions on regression, EM algorithm versus revision, the semantic description of vector representations, the combination of deep learning with high level inference, knowledge graph completion, declarative frameworks for data mining, or preferences and recommendation). This paper is the first step of a work in progress aiming at a better mutual understanding of research in KRR and ML, and how they could cooperate.

    Authenticity Online: Using Webnography to Address Phenomenological Concerns

    In this paper, I describe a webnography-based approach to exploring the authenticity of being in online spaces. Early studies held the prevailing view that online communities were exotic places, fundamentally different from the norms of everyday communication, but the issue of authenticity still demands inquiry. Using Heidegger's categories of angst and resoluteness as moods of authentic existence, I argue that the extent of authenticity in being online can be assessed using ethnography.

    English machine reading comprehension: new approaches to answering multiple-choice questions

    Reading comprehension is often tested by measuring a person or system’s ability to answer questions about a given text. Machine reading comprehension datasets have proliferated in recent years, particularly for the English language. The aim of this thesis is to investigate and improve data-driven approaches to automatic reading comprehension. Firstly, I provide a full classification of question and answer types for the reading comprehension task. I also present a systematic overview of English reading comprehension datasets (over 50 datasets). I observe that the majority of questions were created using crowdsourcing and the most popular data source is Wikipedia. There is also a lack of why, when, and where questions. Additionally, I address the question “What makes a dataset difficult?” and highlight the difference between datasets created for people and datasets created for machine reading comprehension. Secondly, focusing on multiple-choice question answering, I propose a computationally light method for answer selection based on string similarities and logistic regression. At the time (December 2017), the proposed approach showed the best performance on two datasets (MovieQA and MCQA: IJCNLP 2017 Shared Task 5 Multi-choice Question Answering in Examinations) outperforming some CNN-based methods. Thirdly, I investigate methods for Boolean Reading Comprehension tasks including the use of Knowledge Graph (KG) information for answering questions. I provide an error analysis of a transformer model’s performance on the BoolQ dataset. This reveals several important issues such as unstable model behaviour and some issues with the dataset itself. Experiments with incorporating knowledge graph information into a baseline transformer model do not show a clear improvement due to a combination of the model’s ability to capture new information, inaccuracies in the knowledge graph, and imprecision in entity linking. 
    Finally, I develop a Boolean Reading Comprehension dataset based on spontaneously user-generated questions and reviews, which is extremely close to a real-life question-answering scenario. I provide a classification of question difficulty and establish a transformer-based baseline for the newly proposed dataset.

    Flipping All Courses on a Semester: Students' Reactions and Recommendations
