Modular lifelong machine learning
Deep learning has drastically improved the state-of-the-art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem. The overall training cost further increases when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems, which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021). As a result, they neglect some knowledge transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a long sequence of problems remains a challenge.
Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to reuse only the subset of modules that are useful for the task at hand.
This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning, and its ability to address the aforementioned shortcomings of other methods. We show that a modular approach can achieve more of the desired LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.
First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures.
Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of a neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations.
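A minimal sketch can illustrate the search over module combinations that such frameworks perform (all names and the toy numeric "modules" below are hypothetical stand-ins for pre-trained neural modules; the actual systems use program synthesis and probabilistic models rather than exhaustive enumeration):

```python
from itertools import product

def compose(modules):
    """Chain a sequence of modules into one function, applied left to right."""
    def network(x):
        for module in modules:
            x = module(x)
        return x
    return network

def search_compositions(library, depth, evaluate):
    """Score every depth-length combination of library modules on the
    new problem and return the best-performing composition."""
    best_score, best_path = float("-inf"), None
    for path in product(library, repeat=depth):
        score = evaluate(compose(path))
        if score > best_score:
            best_score, best_path = score, path
    return best_path, best_score

# Toy "library" of previously learned transformations.
library = [lambda x: x + 1, lambda x: x * 2, lambda x: -x]
# Toy "problem": find a composition mapping the input 3 to 8.
path, score = search_compositions(
    library, depth=2, evaluate=lambda f: -abs(f(3) - 8))
```

Because the number of combinations grows exponentially with depth, a practical modular LML system must replace this exhaustive loop with guided search, which is precisely the role of PICLE's probabilistic models.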
Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improved anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods.
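For intuition, one classic multi-fidelity scheme is successive halving, sketched below in Python; this is a generic illustration under invented toy assumptions, not the model-based method developed in the thesis:

```python
import random

def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Evaluate many configurations at a small budget, then repeatedly
    keep the top 1/eta fraction and multiply the budget by eta."""
    survivors, budget = list(configs), min_budget
    while len(survivors) > 1:
        survivors.sort(key=lambda c: evaluate(c, budget), reverse=True)
        survivors = survivors[:max(1, len(survivors) // eta)]
        budget *= eta
    return survivors[0]

# Toy objective: a configuration's true quality is its value, and a
# larger budget yields a less noisy estimate of it.
def evaluate(config, budget):
    return config - random.random() / budget

best = successive_halving(range(8), evaluate)
```

A model-based component would additionally fit a surrogate over the observed (configuration, budget, score) triples to decide which configurations to evaluate next, rather than keeping survivors by rank alone.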
Overall, this thesis identifies a number of important LML properties, which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer.
Towards A Practical High-Assurance Systems Programming Language
Writing correct and performant low-level systems code is a notoriously demanding job, even for experienced developers. To make matters worse, formally reasoning about its correctness properties introduces yet another level of complexity to the task, requiring considerable expertise in both systems programming and formal verification. Without appropriate tools providing abstraction and automation, development can be extremely costly due to the sheer complexity of these systems and the nuances within them.
Cogent is designed to alleviate the burden on developers when writing and verifying systems code. It is a high-level functional language with a certifying compiler, which automatically proves the correctness of the compiled code and also provides a purely functional abstraction of the low-level program to the developer. Equational reasoning techniques can then be used to prove functional correctness properties of the program on top of this abstract semantics, which is notably less laborious than directly verifying the C code.
To make Cogent a more approachable and effective tool for developing real-world systems, we further strengthen the framework by extending the core language and its ecosystem. Specifically, we enrich the language to allow users to control the memory representation of algebraic data types, while retaining the automatic proof with a data layout refinement calculus. We repurpose existing tools in a novel way and develop an intuitive foreign function interface, which provides users with a seamless experience when using Cogent in conjunction with native C. We augment the Cogent ecosystem with a property-based testing framework, which helps developers better understand the impact formal verification has on their programs and enables a progressive approach to producing high-assurance systems. Finally, we explore refinement type systems, which we plan to incorporate into Cogent for more expressiveness and better integration of systems programmers with the verification process.
Fairness Testing: A Comprehensive Survey and Analysis of Trends
Unfair behaviors of Machine Learning (ML) software have garnered increasing attention and concern among software engineers. To tackle this issue, extensive research has been dedicated to conducting fairness testing of ML software, and this paper offers a comprehensive survey of existing studies in this field. We collect 100 papers and organize them based on the testing workflow (i.e., how to test) and testing components (i.e., what to test). Furthermore, we analyze the research focus, trends, and promising directions in the realm of fairness testing. We also identify widely-adopted datasets and open-source tools for fairness testing.
Self-Supervised Learning to Prove Equivalence Between Straight-Line Programs via Rewrite Rules
We target the problem of automatically synthesizing proofs of semantic equivalence between two programs made of sequences of statements. We represent programs using abstract syntax trees (ASTs), where a given set of semantics-preserving rewrite rules can be applied on a specific AST pattern to generate a transformed and semantically equivalent program. In our system, two programs are equivalent if there exists a sequence of applications of these rewrite rules that leads to rewriting one program into the other. We propose a neural network architecture based on a transformer model to generate proofs of equivalence between program pairs. The system outputs a sequence of rewrites, and the validity of the sequence is simply checked by verifying it can be applied. If no valid sequence is produced by the neural network, the system reports the programs as non-equivalent, ensuring by design that no programs may be incorrectly reported as equivalent. Our system is fully implemented for a given grammar which can represent straight-line programs with function calls and multiple types. To efficiently train the system to generate such sequences, we develop an original incremental training technique, named self-supervised sample selection. We extensively study the effectiveness of this novel training approach on proofs of increasing complexity and length. Our system, S4Eq, achieves 97% proof success on a curated dataset of 10,000 pairs of equivalent programs.
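The by-construction soundness check described above, replaying the proposed rewrite sequence and rejecting it if any step fails to apply, can be sketched in Python (the tuple-encoded ASTs and the two rewrite rules are hypothetical toy examples, not S4Eq's actual grammar or rule set):

```python
def check_proof(source, target, proof):
    """Replay a candidate sequence of rewrite rules on the source AST.
    The proof certifies equivalence only if every rule applies and the
    final AST equals the target, so an invalid sequence can never cause
    two programs to be reported as equivalent."""
    ast = source
    for rule in proof:
        ast = rule(ast)
        if ast is None:  # the rule's pattern did not match: reject
            return False
    return ast == target

# Toy semantics-preserving rules over tuple-encoded expressions.
def comm_add(ast):
    """a + b  ->  b + a"""
    if isinstance(ast, tuple) and ast[0] == "+":
        return ("+", ast[2], ast[1])
    return None

def mul2_to_add(ast):
    """a * 2  ->  a + a"""
    if isinstance(ast, tuple) and ast[0] == "*" and ast[2] == 2:
        return ("+", ast[1], ast[1])
    return None

# 'x * 2' rewrites to 'x + x'; a further commutation leaves it equal.
ok = check_proof(("*", "x", 2), ("+", "x", "x"), [mul2_to_add, comm_add])
```

In the system described above, the neural network proposes the rule sequence and this cheap replay acts as the verifier; only sequences that check out are reported, which is what rules out false positives.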
Machine Learning Research Trends in Africa: A 30 Years Overview with Bibliometric Analysis Review
In this paper, a critical bibliometric analysis study is conducted, coupled with an extensive literature survey on recent developments and associated applications in machine learning research with a perspective on Africa. The presented bibliometric analysis study consists of 2761 machine learning-related documents, of which 98% were articles with at least 482 citations, published in 903 journals during the past 30 years. Furthermore, the collated documents were retrieved from the Science Citation Index EXPANDED, comprising research publications from 54 African countries between 1993 and 2021. The bibliometric study visualizes the current landscape and future trends in machine learning research and its applications, to facilitate future collaborative research and knowledge exchange among authors from different research institutions scattered across the African continent.
The Disputation: The Enduring Representations in William Holman Hunt's 'The Finding of the Saviour in the Temple', 1860
This interdisciplinary thesis problematizes the Jewish presence in the painting The Finding of the Saviour in the Temple (1860) by William Holman Hunt. This 'Jewish presence' refers to characters within the painting, Jews who posed for the picture, and the painting's portrayal of Judaism. The thesis takes a phenomenological and hermeneutical approach to The Finding, providing careful description and interpretation of what appears in the painting. It situates the painting within a newly configured genre of disputation paintings depicting the Temple scene from the Gospel of Luke (2:47-52). It asks two questions. Why does The Finding look the way it does? And how did Holman Hunt know how to create the picture? Under the rubric of the first question, it explores and challenges customary accounts of the painting, explicitly challenging the over-reliance upon F.G. Stephens's pamphlet. Additionally, it examines Pre-Raphaelite and Victorian religious contexts and brings hitherto unacknowledged artistic contexts to the fore. The second question examines less apparent influences through an analysis of the originary Lukan narrative in conjunction with the under-examined genre of Temple 'disputation' paintings, and a legacy of scholarly and religious disputation. This demonstrates a discourse of disputation informing The Finding over and above the biblical narrative. In showing that this discourse strongly correlates with the painting's objectifying and spectacular properties, this thesis provides a new way to understand The Finding's orientalism, which is further revealed in its typological critical reworking of two Christian medieval and renaissance paintings. As a demonstration of the discourse, the thesis includes an examination of Jewish artists who addressed the theme of disputation overtly or obliquely, thereby engaging with and challenging the assumptions upon which the disputation rests.
An Agile Musicology: Improvisation in Corporate Management and Lean Startups
The last decade of the twentieth century saw a proliferation of publications that use jazz as a metaphor for corporate management, arguing that in the contemporary knowledge economy, jazz is superior to the symphonic model that governed mid-century factory floors. As the literature on the jazz metaphor, and organizational improvisation more broadly, continued to develop into the twenty-first century, another managerial methodology became widely adopted by entrepreneurs: agile. While agile has yet to be fully theorized as an improvisatory practice, it shares several core tenets with the models promoted by organizational improvisation scholars, including the use of small teams, an emphasis on feedback, and an openness to change. In this dissertation, I argue that agile methods, and the adjacent lean methodology, are inherently improvisatory, and that understanding them as improvisatory offers opportunities not only for their deployment within growing businesses, but also for adoption at scale in large corporations.
I draw on an array of disciplinary perspectives, including management science, organizational studies, musicology, and critical improvisation studies, as well as a wide range of sources, from peer-reviewed journal publications to trade manuals. Each chapter builds upon the last: a substantial and critical review of the jazz metaphor literature is followed by a dissection of its main themes under a musicological lens; after securing the foundations of organizational improvisation, the next chapter reveals the improvisatory nature of agile and lean startup practices and links them to concepts discussed within the jazz metaphor literature. Drawing on insights from large-scale improvisatory musical practices, the final chapter reveals how improvisation, as a set of practices shared between corporate management and agile methodologies, provides avenues for agile to be scaled up as startups grow or for its widespread adoption within established companies.
Beyond invisibility: The position and role of the literary translator in the digital paratextual space
This thesis presents a new theoretical framework through which to analyse the visibility of literary translators in the digital materials that present translations to readers, referred to throughout as paratextual spaces. Central to this model is the argument that paratextual 'visibility' must be understood as including both the way translators and their labour are presented to readers, defined here as their position, and also their role in the establishment of that position. Going beyond Lawrence Venuti's concept of invisibility as an inevitably negative position to be fought against, this thesis instead establishes paratextual visibility as a complex negotiation between the agency of individual translators, the needs of a publishing house and the interests of readers.
The value of this approach is demonstrated through a case study examining the visibility of translator Jamie Bulloch in the digital spaces surrounding his English-language translations of two novels by German author Timur Vermes: Look Who's Back and The Hungry and the Fat. This analysis finds that even though Bulloch played an early role in creating the publisher's paratextual materials, publisher MacLehose Press prioritised making the novels' German origins and the foreignness of the texts visible over Bulloch's status as the translator, or his translatorship. Bulloch's limited visibility in the publisher-created materials was then reproduced in digital paratexts created by readers and third parties such as retailer Amazon, despite his attempts to interact with readers and perform his translatorship in digital spaces such as Twitter. Rather than challenging Bulloch's limited visibility, then, digital spaces served to amplify it. This thesis therefore finds that the translator's active participation in the promotion of their work does not always equate to increased visibility, thus demonstrating the need to go beyond Venuti's invisibility and towards understanding the multifaceted roles played by translators in presenting literary texts to new audiences.
From wallet to mobile: exploring how mobile payments create customer value in the service experience
This study explores how mobile proximity payments (MPP) (e.g., Apple Pay) create customer value in the service experience compared to traditional payment methods (e.g., cash and card). The main objectives were firstly to understand how customer value manifests as an outcome in the MPP service experience, and secondly to understand how the customer activities in the process of using MPP create customer value. To achieve these objectives, a conceptual framework is built upon the Grönroos-Voima Value Model (Grönroos and Voima, 2013), and uses the Theory of Consumption Value (Sheth et al., 1991) to determine the customer value constructs for MPP, which is complemented with Script theory (Abelson, 1981) to determine the value-creating activities the consumer performs in the process of paying with MPP.
The study uses a sequential exploratory mixed methods design, wherein the first, qualitative stage uses two methods, self-observations (n=200) and semi-structured interviews (n=18). The subsequent second, quantitative stage uses an online survey (n=441) and Structural Equation Modelling analysis to further examine the relationships and effects between the value-creating activities and customer value constructs identified in stage one. The academic contributions include the development of a model of mobile payment services value creation in the service experience, introducing the concept of in-use barriers, which occur after adoption and constrain consumers' existing use of MPP, and revealing the importance of the mobile in-hand momentary condition as an antecedent state. Additionally, the customer value perspective of this thesis demonstrates an alternative to the dominant Information Technology approaches to researching mobile payments and broadens the view of technology from purely an object a user interacts with to an object that is immersed in consumers' daily life.