Modular lifelong machine learning
Deep learning has drastically improved the state-of-the-art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem. The overall training cost further increases when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems, which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021). As a result, they neglect some knowledge transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a large sequence of problems remains a challenge.
Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to reuse only the subset of modules that is useful for the task at hand.
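As a rough illustration of this idea (a minimal PyTorch sketch, not code from the thesis; the module names are invented), a network can be written as a composition of modules, with a previously trained module frozen and reused alongside a newly trained one:

```python
# Minimal sketch of modular reuse: a network is a composition of modules,
# and a pretrained module can be frozen and reused on a new problem.
import torch.nn as nn

class ConvFeatures(nn.Module):          # module 1: input-specific encoder
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(4), nn.Flatten())
    def forward(self, x):
        return self.net(x)

class Classifier(nn.Module):            # module 2: task-specific head
    def __init__(self, n_classes):
        super().__init__()
        self.net = nn.Linear(8 * 4 * 4, n_classes)
    def forward(self, h):
        return self.net(h)

# Reuse a pretrained encoder on a new task: freeze it, train only the head.
encoder = ConvFeatures()                # assume loaded from a module library
for p in encoder.parameters():
    p.requires_grad = False             # frozen, so its old tasks are unaffected
head = Classifier(n_classes=10)         # new module trained for the new task
model = nn.Sequential(encoder, head)
```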
This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning, and its ability to address the aforementioned shortcomings of other methods. We show that a modular approach can achieve more of the desired LML properties than previous work has demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.
First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures.
Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of a neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations.
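PICLE's actual probabilistic models are not reproduced here, but the flavour of probabilistic search over module combinations can be illustrated with a best-first search in which each candidate path is scored by the sum of hypothetical per-layer log-probabilities:

```python
# Illustrative stand-in for probabilistic search over module combinations:
# score a path (one module per layer) by the sum of per-position log-probs
# and expand candidates best-first. PICLE's actual models are richer.
import heapq
import math

# Hypothetical fit scores: p[layer][module] ~ how well each module suits the
# new problem at that layer (e.g. estimated from its input distribution).
p = [
    {"conv_a": 0.7, "conv_b": 0.3},             # layer 0 candidates
    {"mlp_a": 0.5, "mlp_b": 0.4, "new": 0.1},   # layer 1 candidates
]

def best_first(p, k=3):
    """Return the k most probable module paths without full enumeration."""
    heap = [(0.0, ())]                  # (negative log-prob, partial path)
    results = []
    while heap and len(results) < k:
        neg_lp, path = heapq.heappop(heap)
        if len(path) == len(p):         # complete path
            results.append((math.exp(-neg_lp), path))
            continue
        for mod, prob in p[len(path)].items():
            heapq.heappush(heap, (neg_lp - math.log(prob), path + (mod,)))
    return results

for prob, path in best_first(p):
    print(f"{path}: {prob:.2f}")
```

Because extending a path only ever adds cost, the first complete paths popped from the heap are the most probable ones, so promising combinations surface without enumerating the whole library.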
Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improvement in anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods.
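As a hedged sketch of the general recipe (not the thesis's algorithm), one can combine a successive-halving fidelity schedule with a toy "surrogate" that proposes new configurations near the incumbent; the objective and all numbers below are invented for illustration:

```python
# Toy marriage of multi-fidelity scheduling with model-based proposals:
# evaluate many configs cheaply, keep the best half at each rung, and refill
# some freed slots with proposals near the best configuration seen so far.
import random

def objective(lr, budget):
    """Stand-in for validation loss after `budget` epochs of training."""
    return (lr - 0.1) ** 2 + random.gauss(0, 0.01) / budget

def propose(incumbent, n):
    """Model-based proposal (toy): perturb the best configuration found."""
    return [max(1e-4, random.gauss(incumbent, 0.05)) for _ in range(n)]

configs = [random.uniform(1e-4, 1.0) for _ in range(16)]
for budget in (1, 2, 4, 8):                     # increasing fidelity
    scores = sorted((objective(lr, budget), lr) for lr in configs)
    survivors = [lr for _, lr in scores[: max(1, len(scores) // 2)]]
    incumbent = survivors[0]
    configs = survivors + propose(incumbent, len(survivors) // 2)
print(f"best learning rate ~ {incumbent:.3f}")
```

Evaluating cheaply first is what improves anytime performance: a usable incumbent exists after the very first rung and is refined as the budget grows.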
Overall, this thesis identifies a number of important LML properties, not all of which have been attained by past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer.
The State of the Art in Deep Learning Applications, Challenges, and Future Prospects: A Comprehensive Review of Flood Forecasting and Management
Floods are a devastating natural calamity that may seriously harm both infrastructure and people. Accurate flood forecasts and control are essential to lessen these effects and safeguard populations. With its capacity to handle massive amounts of data and produce accurate forecasts, deep learning has emerged as a potent tool for improving flood prediction and control. The current state of deep learning applications in flood forecasting and management is thoroughly reviewed in this work. The review discusses a variety of subjects, such as the data sources utilized, the deep learning models used, and the assessment measures adopted to judge their efficacy. It assesses current approaches critically and points out their advantages and disadvantages. The article also examines challenges with data accessibility, the interpretability of deep learning models, and ethical considerations in flood prediction. The review also describes potential directions for deep learning research to enhance flood predictions and control. Incorporating uncertainty estimates into forecasts, integrating many data sources, developing hybrid models that mix deep learning with other methodologies, and enhancing the interpretability of deep learning models are a few of these. These research goals can help deep learning models become more precise and effective, which will result in better flood control plans and forecasts. Overall, this review is a useful resource for academics and professionals working on the topic of flood forecasting and management. By reviewing the current state of the art, emphasizing difficulties, and outlining potential areas for future study, it lays a solid basis. Communities may better prepare for and lessen the destructive effects of floods by implementing cutting-edge deep learning algorithms, thereby protecting people and infrastructure.
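One of the research directions listed above, incorporating uncertainty estimates into forecasts, can be illustrated with Monte Carlo dropout on a toy LSTM water-level forecaster; the model, shapes, and numbers below are assumptions for illustration, not taken from the reviewed literature:

```python
# Illustrative sketch: uncertainty estimates via Monte Carlo dropout on a
# toy LSTM forecaster of the next water-level reading.
import torch
import torch.nn as nn

class Forecaster(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=32, batch_first=True)
        self.drop = nn.Dropout(0.2)
        self.head = nn.Linear(32, 1)
    def forward(self, x):                       # x: (batch, time, 1)
        h, _ = self.lstm(x)
        return self.head(self.drop(h[:, -1]))   # next-step water level

model = Forecaster()
model.train()                                # keep dropout active at test time
x = torch.randn(1, 48, 1)                    # e.g. 48 hours of gauge readings
samples = torch.stack([model(x) for _ in range(100)])
mean, std = samples.mean(), samples.std()    # forecast and its spread
print(f"forecast {mean:.2f} +/- {2 * std:.2f}")
```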
Endogenous measures for contextualising large-scale social phenomena: a corpus-based method for mediated public discourse
This work presents an interdisciplinary methodology for developing endogenous measures of group membership through analysis of pervasive linguistic patterns in public discourse. Focusing on political discourse, this work critiques the conventional approach to the study of political participation, which is premised on decontextualised, exogenous measures to characterise groups. Considering the theoretical and empirical weaknesses of decontextualised approaches to large-scale social phenomena, this work suggests that contextualisation using endogenous measures might provide a complementary perspective to mitigate such weaknesses.
This work develops a sociomaterial perspective on political participation in mediated discourse as affiliatory action performed through language. While the affiliatory function of language is often performed consciously (such as statements of identity), this work is concerned with unconscious features (such as patterns in lexis and grammar). This work argues that pervasive patterns in such features that emerge through socialisation are resistant to change and manipulation, and thus might serve as endogenous measures of sociopolitical contexts, and thus of groups.
In terms of method, the work takes a corpus-based approach to the analysis of data from the Twitter messaging service, whereby patterns in users' speech are examined statistically in order to trace potential community membership. The method is applied in the US state of Michigan during the second half of 2018; 6 November was the date of the midterm (i.e. non-Presidential) elections in the United States. The corpus is assembled from the original posts of 5,889 users, who are nominally geolocalised to 417 municipalities. These users are clustered according to pervasive language features. Comparing the linguistic clusters according to the municipalities they represent reveals regular sociodemographic differentials across clusters. This is understood as an indication of social structure, suggesting that endogenous measures derived from pervasive patterns in language may indeed offer a complementary, contextualised perspective on large-scale social phenomena.
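The clustering step can be illustrated with a minimal sketch, assuming function-word frequencies as the pervasive features and k-means as the clustering algorithm (the thesis's actual feature set and method may differ):

```python
# Illustrative sketch: represent each user by weighted frequencies of
# pervasive, largely unconscious features (function words here) and group
# users with k-means. Feature list, k, and the toy texts are assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that", "it", "for"]

users = {
    "user_a": "the results of the vote in the county ...",  # all posts, joined
    "user_b": "it is for us to decide that it matters ...",
}

vec = TfidfVectorizer(vocabulary=FUNCTION_WORDS)  # only pervasive features
X = vec.fit_transform(users.values())
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(dict(zip(users, labels)))   # cluster assignment per user
```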
An exploration of adherence and persistence in overactive bladder and other long-term conditions
Background and aims
Overactive bladder (OAB) is a common, bothersome, and chronic condition associated with symptoms of urinary urgency, incontinence, increased daytime micturition frequency and nocturia. Despite the condition exerting a significant burden on quality of life, adherence and persistence behaviours in OAB are particularly poor in comparison with other long-term conditions. The aims of the present work were to explore themes relating to medicine-taking behaviours in OAB and other long-term conditions and to suggest ways to improve them.
Methods
A systematic literature review was undertaken to understand the current landscape of qualitative work exploring adherence and persistence with OAB patients. A qualitative study involving 1:1 semi-structured interviews was conducted with OAB patients to explore the context and drivers for adherence and persistence behaviours using thematic analysis. A comparative analysis was then undertaken with qualitative papers exploring medicine-taking behaviours in a chronic bowel condition, type II diabetes, and multimorbidity to explore the themes identified in the OAB study for convergence and divergence in other conditions and to contextualise the learnings from the former study.
Results
The systematic literature review revealed a gap in the literature of qualitative exploration of adherence and persistence behaviours in OAB patients. The OAB study found a range of drivers for non-adherent behaviours, including a perceived lack of treatment efficacy, side effects, unclear instructions, and drug and condition hierarchies, as well as the rich context within which these themes sit. The comparative analysis study supported the findings of the OAB study, demonstrating evidence of key themes transcending conditions, including a perceived lack of treatment efficacy and side effects, as well as nuances associated with the OAB experience.
Conclusions
The present work has identified key drivers for non-adherent behaviours in OAB patients and sets out a number of recommendations categorised within the World Health Organisation's five dimensions of adherence. These include addressing the poor understanding and illness perception of OAB by patients and others by improving the provision and availability of information, as well as the work of patient support groups; scrutiny of the support within primary care for OAB patients before and after diagnosis; and the encouragement of realistic expectations of the condition and treatment, with mindful use of the prescriber's language at the point of prescribing. The present work has further highlighted the utility of conceptual models of adherence, such as COM-B and the NCF, in understanding medicine-taking behaviours in the context of OAB.
Inclusive Intelligent Learning Management System Framework - Application of Data Science in Inclusive Education
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science.

Being a disabled student, the author faced higher education with a handicap; the experience of studying during COVID-19 confinement periods matched the findings in recent research about the importance of digital accessibility through more e-learning-intensive academic experiences. Narrative and systematic literature reviews provided context in the World Health Organization's International Classification of Functioning, Disability and Health, the legal and standards framework, and the state of the art in information and communication technology. Assessing Portuguese higher education institutions' websites revealed that only outlying institutions had implemented nearly perfect websites in terms of accessibility.
A gap was therefore identified between how accessible Portuguese higher education websites are, the needs of all students, including those with disabilities, and even the minimum legal accessibility requirements for digital products and services provided by public or publicly funded organizations.
Identifying a problem in society and exploring the scientific base of knowledge for context and state of the art was the first stage of the Design Science Research methodology, which was followed by development and validation cycles of an Inclusive Intelligent Learning Management System Framework. The framework blends contributions from various Data Science study fields with accessibility-guideline-compliant interface design and accessibility compliance assessment of uploaded content.
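The compliance assessment the framework describes is much broader than any single rule, but a minimal sketch, assuming uploaded HTML content and the WCAG requirement that images carry alternative text, might look as follows:

```python
# Minimal illustration of automated content-accessibility checking: flag
# uploaded HTML whose images lack alternative text. The framework's real
# assessment covers far more than this one rule.
from bs4 import BeautifulSoup

def missing_alt_text(html: str) -> list[str]:
    """Return the sources of <img> tags with no usable alt attribute."""
    soup = BeautifulSoup(html, "html.parser")
    return [img.get("src", "?") for img in soup.find_all("img")
            if not img.get("alt", "").strip()]

html = '<p>Lecture 1</p><img src="graph.png"><img src="logo.png" alt="Logo">'
print(missing_alt_text(html))   # ['graph.png']
```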
Validation was provided by a focus group whose inputs were considered for the version presented in this dissertation. As it was not the purpose of the research to deliver a complete implementation of the framework, and consistent data to make all the modules interact with each other was lacking, the most relevant modules were tested with open data as a proof of concept.
The rigor cycle of DSR started with the inclusion of the previous thesis in the Atlântica University Institute Scientific Repository and is to be completed with the publication of this thesis, and of the findings of the already-started PhD, in relevant journals and conferences.
Identifying Student Profiles Within Online Judge Systems Using Explainable Artificial Intelligence
Online Judge (OJ) systems are typically considered within programming-related courses as they yield fast and objective assessments of the code developed by the students. Such an evaluation generally provides a single decision based on a rubric, most commonly whether the submission successfully accomplished the assignment. Nevertheless, since in an educational context such information may be deemed insufficient, it would be beneficial for both the student and the instructor to receive additional feedback about the overall development of the task. This work aims to tackle this limitation by considering the further exploitation of the information gathered by the OJ and automatically inferring feedback for both the student and the instructor. More precisely, we consider the use of learning-based schemes, particularly Multi-Instance Learning and classical Machine Learning formulations, to model student behaviour. Besides, Explainable Artificial Intelligence is contemplated to provide human-understandable feedback. The proposal has been evaluated considering a case study comprising 2,500 submissions from roughly 90 different students from a programming-related course in a Computer Science degree. The results obtained validate the proposal: the model is capable of significantly predicting the user outcome (either passing or failing the assignment) solely based on the behavioural pattern inferred from the submissions provided to the OJ. Moreover, the proposal is able to identify prone-to-fail student groups and profiles, as well as other relevant information, which eventually serves as feedback to both the student and the instructor.

This work has been partially funded by the "Programa Redes-I3CE de investigacion en docencia universitaria del Instituto de Ciencias de la Educacion (REDES-I3CE-2020-5069)" of the University of Alicante. The third author is supported by grant APOSTD/2020/256 from "Programa I+D+I de la Generalitat Valenciana".
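As a hedged, classical-ML stand-in for the behavioural modelling described above (the paper's Multi-Instance Learning formulation and feature set are not reproduced), each student's submissions can be aggregated into a fixed-length vector and used to predict the pass/fail outcome, with feature importances as a crude, human-readable explanation:

```python
# Toy behavioural model: aggregate a student's OJ submissions into simple
# features and predict the assignment outcome. Features and data are invented.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def behaviour_features(submissions):
    """submissions: list of (seconds_before_deadline, verdict_ok) pairs."""
    times = np.array([t for t, _ in submissions], dtype=float)
    oks = np.array([ok for _, ok in submissions], dtype=float)
    return [len(submissions), times.mean(), oks.mean()]

X = np.array([behaviour_features(s) for s in [
    [(86400, 0), (82000, 0), (80000, 1)],   # starts early, iterates, passes
    [(1200, 0), (600, 0)],                  # last-minute attempts, fails
]])
y = np.array([1, 0])                        # 1 = passed the assignment

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
# Feature importances offer coarse, human-readable feedback signals; the
# paper's XAI component would go beyond this.
print(dict(zip(["n_submissions", "mean_lead_time", "ok_rate"],
               clf.feature_importances_.round(2))))
```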
Syntactic change during the anglicisation of Scots: insights from the Parsed Corpus of Scottish Correspondence
Variation and change in syntax is particularly challenging to measure quantitatively, as such investigation requires syntactically annotated (parsed) corpora; a parsed digital corpus allows for retrieval of all instances of a construction or particular word order in a fraction of the time it would take to retrieve the same information by hand. Compared to English, research on syntactic change in the history of Scots has been limited, in part due to the lack of such a resource. To meet this need, this thesis presents the new Parsed Corpus of Scottish Correspondence (PCSC), consisting of 270,000 words of parsed data from the Helsinki Corpus of Scottish Correspondence 1540-1750 (Meurman-Solin and VARIENG 2017), and demonstrates the process of turning strings of words into searchable clause tokens using a combination of automated and manual methods.

The PCSC provides data from the 16th to the 18th century, a previous blind spot within Scots syntax research despite being a highly interesting time period to investigate; these centuries saw a shift in the relationship between Scots and English, as English started to exert influence over Scots as a more socio-politically prestigious variety; consequently, salient Scots features were increasingly replaced by English ones in writing. Thus, the 16th-18th century marks a period of great change in Scots, as it went from being a more distinct variety on a standardisation trajectory to the mixed variety we encounter in Scotland today.
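A minimal sketch of why parsed data makes such retrieval fast, assuming Penn-style bracketed trees and an invented tag set rather than the PCSC's actual annotation scheme:

```python
# Hedged illustration: with bracketed clause tokens, every instance of a
# construction (here, do-support) can be retrieved programmatically rather
# than by reading the texts by hand. Tag names are assumptions.
from nltk import Tree

# Toy parsed clause: "he did not write" with auxiliary DO.
clause = Tree.fromstring(
    "(IP (NP (PRO he)) (DOD did) (NEG not) (VB write))")

def has_do_support(tree):
    """Match clauses containing a form of auxiliary DO (tag DOD assumed)."""
    return any(sub.label() == "DOD" for sub in tree.subtrees())

print(has_do_support(clause))   # True
```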
Using the new parsed data from the PCSC, I present results from three case studies on syntactic change in 16th- to 18th-century Scots, thus beginning to fill the gaps in our knowledge of this period. The findings of the case studies reveal the transformative nature of Scots syntax in the 16th to 18th century, as the language undergoes dramatic changes in its subject-verb agreement system through the decline of the Northern Subject Rule and the rise of do-support, and further rearrangement in the verbal paradigm through the rise of verbal -ing in both participial and gerundive function. On assessing whether these changes can be attributed to influence from English, or whether they are simply parallel developments in closely related language varieties, it is found that the nature of contact between Scots and English in the 16th-18th century, and the timing in which the changes take place, speak in favour of these changes being contact-induced. However, further fine-grained investigation into the functions and distribution of the features involved, in Scots compared to English, will be needed before firmer conclusions can be drawn regarding the origin of the changes.
Toward Annotation Efficiency in Biased Learning Settings for Natural Language Processing
The goal of this thesis is to improve the feasibility of building applied NLP systems for more diverse and niche real-world use-cases of extracting structured information from text. A core factor in determining this feasibility is the cost of manually annotating enough unbiased labeled data to achieve a desired level of system accuracy, and our goal is to reduce this cost. We focus on reducing this cost by making contributions in two directions: (1) easing the annotation burden by leveraging high-level expert knowledge in addition to labeled examples, thus making approaches more annotation-efficient; and (2) mitigating known biases in cheaper, imperfectly labeled real-world datasets so that we may use them to our advantage. A central theme of this thesis is that high-level expert knowledge about the data and task can allow for biased labeling processes that focus experts on only manually labeling aspects of the data that cannot be easily labeled through cheaper means. This combination allows for more accurate models with less human effort. We conduct our research on this general topic through three diverse problems with immediate applications to real-world settings.
First, we study an applied problem in biased text classification. We encounter a rare-event text classification system that has been deployed for several years. We are tasked with improving this system's performance using only the severely biased incidental feedback provided by the experts over years of system use. We develop a method that combines importance weighting and an unlabeled data imputation scheme that exploits the selection bias of the feedback to train an unbiased classifier without requiring additional labeled data. We experimentally demonstrate that this method considerably improves the system performance.
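A minimal sketch of the importance-weighting half of such a method (the imputation scheme is not shown, and the data and propensities below are invented): examples observed with selection probability p are reweighted by 1/p, so the reweighted sample is unbiased in expectation.

```python
# Inverse-propensity weighting: correct for feedback that is observed with
# known (or estimated) probability p(select | x) by weighting each example
# by 1 / p when fitting the classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.2], [0.9], [0.4], [0.8]])   # features of feedback examples
y = np.array([0, 1, 0, 1])                   # expert labels from system use
p_select = np.array([0.9, 0.1, 0.8, 0.2])    # estimated selection probability

clf = LogisticRegression()
clf.fit(X, y, sample_weight=1.0 / p_select)  # inverse-propensity weights
```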
Second, we tackle an applied problem in named entity recognition (NER) concerning learning tagging models from data that have very low recall for annotated entities. To solve this issue we propose a novel loss, the Expected Entity Ratio (EER), that uses an uncertain estimate of the proportion of entities in the data to counteract the false-negative bias in the data, encouraging the model to have the correct ratio of entities in expectation. We justify the principles of our approach by providing theory that shows it recovers the true tagging distribution under mild conditions. Additionally we provide extensive empirical results that show it to be practically useful. Empirically, we find that it meets or exceeds performance of state-of-the-art baselines across a variety of languages, annotation scenarios, and amounts of labeled data. We also show that, when combined with our approach, a novel sparse annotation scheme can outperform exhaustive annotation for modest annotation budgets.
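A hedged sketch of the idea behind such a loss, with names and the exact functional form assumed rather than taken from the paper: compute the expected fraction of entity tokens from the model's marginal tag probabilities and penalise deviation from the prior ratio.

```python
# Expected-entity-ratio-style penalty (illustrative form): push the model's
# expected fraction of entity tokens toward a prior target ratio.
import torch

def eer_penalty(tag_probs, entity_ids, target_ratio, weight=1.0):
    """tag_probs: (tokens, tags) marginals; entity_ids: entity tag indices."""
    expected_ratio = tag_probs[:, entity_ids].sum(dim=1).mean()
    return weight * (expected_ratio - target_ratio) ** 2

logits = torch.randn(200, 5, requires_grad=True)   # 200 tokens, 5 tags
tag_probs = logits.softmax(dim=-1)
loss = eer_penalty(tag_probs, entity_ids=[1, 2, 3, 4], target_ratio=0.15)
loss.backward()   # added to the usual supervised loss on the sparse labels
```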
Third, we study the challenging problem of syntactic parsing in low-resource languages. We approach the problem from a cross-lingual perspective, building on a state-of-the-art transfer-learning approach that underperforms on "distant" languages that have little to no representation in the training corpus. Motivated by the field of syntactic typology, we introduce a general method called Expected Statistic Regularization (ESR) to regularize the parser on distant languages according to their expected typological syntax statistics. We also contribute general approaches for estimating the loss supervision parameters from the task formalism or small amounts of labeled data. We present seven broad classes of descriptive statistic families and provide extensive experimental evidence showing that using these statistics for regularization is complementary to deep learning approaches in low-resource transfer settings.
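The same expectation trick can be sketched for ESR, here with an invented typological statistic (the fraction of head-final arcs under a parser's arc marginals) and a tolerance interval; the paper's exact statistics and penalty form may differ.

```python
# ESR-flavoured penalty (illustrative): zero when the expected statistic
# falls inside a typologically plausible interval, quadratic hinge outside.
import torch

def esr_penalty(stat, low, high):
    """Quadratic hinge outside the tolerance interval [low, high]."""
    return (torch.clamp(low - stat, min=0) ** 2
            + torch.clamp(stat - high, min=0) ** 2)

# P(head j for dependent i) for a 30-token sentence (toy marginals).
arc_marginals = torch.rand(30, 30, requires_grad=True).softmax(dim=-1)
# Head-final arcs: the head index comes after the dependent index (j > i).
head_final = torch.triu(arc_marginals, diagonal=1).sum() / arc_marginals.size(0)
loss = esr_penalty(head_final, low=0.6, high=0.8)   # expected typological range
loss.backward()
```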
In conclusion, this thesis contributes approaches for reducing the annotation cost of building applied NLP systems through the use of high-level expert knowledge to impart additional learning signal on models and cope with cheaper biased data. We publish implementations of our methods and results, so that they may facilitate future research and applications. It is our hope that the frameworks proposed in this thesis will help to democratize access to NLP for producing structured information from text in wider-reaching applications by making them faster and cheaper to build.