38 research outputs found
Unlocking the Pragmatics of Emoji: Evaluation of the Integration of Pragmatic Markers for Sarcasm Detection
Emojis have become an integral element of online communication, serving as a powerful, under-utilised resource for enhancing pragmatic understanding in NLP. Previous works have highlighted their potential to improve more complex tasks, such as the identification of figurative literary devices including sarcasm, owing to their role in conveying tone within text. However, the present state of the art does not consider emoji or adequately address sarcastic markers such as sentiment incongruence. This work aims to integrate these concepts to generate more robust solutions for sarcasm detection, leveraging enhanced pragmatic features from both emoji and text tokens. This was achieved by establishing methodologies for sentiment feature extraction from emojis and an in-depth statistical evaluation of the features which characterise sarcastic text on Twitter. The current convention for generating training data, which implements weak labelling using hashtags or keywords, was evaluated against a human-annotated baseline; postulated validity concerns were confirmed, as statistical evaluation found that the content features deviated significantly from the baseline, highlighting potential validity concerns for many prominent works on the topic to date. Organically labelled sarcastic tweets containing emojis were crowdsourced by means of a survey to ensure valid outcomes for the sarcasm detection model. Given the established importance of both semantic and sentiment information, a novel sentiment-aware attention mechanism was constructed to enhance pattern recognition, balancing core features of sarcastic text: sentiment incongruence and context. This work establishes a framework for emoji feature extraction, a key roadblock cited in the literature for their use in NLP tasks.
The proposed sarcasm detection pipeline successfully performs the task using a GRU neural network with sentiment-aware attention, achieving an accuracy of 73% and showing promising indications of model robustness, as part of a framework which is easily scalable to include any future emojis released. Both the enhanced sentiment information supplementing context and the consideration of emoji were found to improve outcomes for the task.
Implicit indefinite objects at the syntax-semantics-pragmatics interface: a probabilistic model of acceptability judgments
Optionally transitive verbs, whose Patient participant is semantically obligatory but
syntactically optional (e.g., to eat, to drink, to write), deviate from the transitive prototype
defined by Hopper and Thompson (1980). Following Fillmore (1986), unexpressed objects
may be either indefinite (referring to prototypical Patients of a verb, whose actual entity
is unknown or irrelevant) or definite (with a referent available in the immediate intra- or
extra-linguistic context). This thesis centered on indefinite null objects, which the literature
argues to be a gradient, non-categorical phenomenon possible with virtually any transitive
verb (in different degrees depending on the verb semantics), favored or hindered by several
semantic, aspectual, pragmatic, and discourse factors. In particular, the probabilistic
model of the grammaticality of indefinite null objects hereby discussed takes into account
a continuous factor (semantic selectivity, as a proxy to object recoverability) and four
binary factors (telicity, perfectivity, iterativity, and manner specification).
This work was inspired by Medina (2007), who modeled the effect of three predictors
(semantic selectivity, telicity, and perfectivity) on the grammaticality of indefinite null
objects (as gauged via Likert-scale acceptability judgments elicited from native speakers
of English) within the framework of Stochastic Optimality Theory. In her variant of the
framework, the constraints get floating rankings based on the input verb’s semantic
selectivity, which she modeled via the Selectional Preference Strength measure by Resnik
(1993, 1996). I expanded Medina’s model by modeling implicit indefinite objects in two
languages (English and Italian), by using three different measures of semantic selectivity
(Resnik’s SPS; Behavioral PISA, inspired by Medina’s Object Similarity measure; and
Computational PISA, a novel similarity-based measure by Cappelli and Lenci (2020)
based on distributional semantics), and by adding iterativity and manner specification as
new predictors in the model.
Both the English and the Italian five-predictor models based on Behavioral PISA explain
almost half of the variance in the data, improving on the Medina-like three-predictor
models based on Resnik’s SPS. Moreover, they have a comparable range of predicted
object-dropping probabilities (30-100% in English, 30-90% in Italian), and the predictors
perform consistently with theoretical literature on object drop. Indeed, in both models,
atelic imperfective iterative manner-specified inputs are the most likely to drop their
object (between 80% and 90%), while telic perfective non-iterative manner-unspecified
inputs are the least likely (between 30% and 40%). The constraint re-ranking probabilities
are always directly proportional to semantic selectivity, with the exception of Telic End
in Italian. Both models show a main effect of telicity, but the second most relevant factor
in the model is perfectivity in English and manner specification in Italian.
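As a rough illustration of the semantic selectivity predictor named in this abstract, Resnik's Selectional Preference Strength is commonly stated as the Kullback-Leibler divergence between the distribution of object classes given a verb and the prior distribution of object classes. The sketch below assumes that formulation; the class inventory, the probabilities, and the function name are invented for the example and are not taken from the thesis.

```python
import math

def selectional_preference_strength(p_class_given_verb, p_class):
    """KL divergence D(P(c|v) || P(c)) in bits, summed over object classes c."""
    total = 0.0
    for c, p_cv in p_class_given_verb.items():
        if p_cv > 0.0:  # terms with zero probability contribute nothing
            total += p_cv * math.log2(p_cv / p_class[c])
    return total

# Toy, made-up distributions over three hypothetical object classes.
prior = {"food": 0.2, "artifact": 0.5, "abstraction": 0.3}
eat = {"food": 0.9, "artifact": 0.05, "abstraction": 0.05}   # selective verb
see = {"food": 0.2, "artifact": 0.5, "abstraction": 0.3}     # matches the prior

sps_eat = selectional_preference_strength(eat, prior)
sps_see = selectional_preference_strength(see, prior)
```

Under these toy numbers the selective verb receives a higher score than the verb whose objects follow the prior, which is the sense in which selectivity can stand in for object recoverability.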
Quarantine interceptions & transparency in horticultural supply chains: causes and outcomes in Uganda, a qualitative case study
The horticultural industry in sub-Saharan Africa (SSA) has witnessed unprecedented growth in recent years, fuelled by increased demand for temperate fruits and vegetables in the European market. In Uganda, the introduction of Non-Traditional Agricultural Exports (NTAEs) in the post-civil war era (mid-1980s onwards) as an export diversification strategy met with limited success, attributed to agronomical, logistical, and institutional challenges that resulted in a relatively small and fragile horticultural industry serving a limited market specialised in the ethnic/exotic food trade. However, in recent years, increased demand from the diaspora has created new opportunities for Uganda's ethnic/exotic horticultural exports in a buoyant industry that has increased fourfold over the last two decades. Meanwhile, this renewed opportunity is threatened by EU/UK legislation targeting the introduction and spread of organisms considered harmful to the environment. The threat is manifested in the interception and destruction of consignments found to be infested by regulated organisms, notably the false codling moth (FCM). While common to all SSA countries infested by the FCM, interceptions have been particularly high for Uganda over the last seven years, a period coinciding with the boom in its horticultural industry.
Based on an instrumental case study design consisting of semi-structured interviews, document reviews, and participant observations, this research investigates the cause of interceptions in the Ugandan Horticultural Export Supply Chain (fresh fruits and vegetables) and their relationship to the concept of transparency, which is increasingly core to agri-food chains. In line with the Global Value Chain (GVC) approach (Gereffi, 1999; 2005), it examines the response and outcomes resulting from attempts to comply with international public standards governing agricultural supply chains.
Findings indicate that a combination of environmental (e.g., regulatory), people (e.g., literacy levels of Outgrowers), process (e.g., bureaucracy) and technological (e.g., lack of IT infrastructure) factors work together as inhibitors of transparency and account for the rising wave of interceptions.
Uganda's response to interceptions, described in this study as the regulated backward integration of supply chain relationships through the mandatory registration of producers, is yielding results in the form of enhanced capability development and supply chain transparency, a process described in the GVC literature as process upgrading. In so doing, the research contributes to the literature on supply chain transparency while suggesting a renewed focus of GVC research on the role of public standards (as opposed to private governance) in the upgrading and integration of developing countries in the world economy. The research is limited by the lack of a quantitative approach to validating findings that are essentially qualitative in nature. Future research involves the validation of a transparency inhibitor matrix for the prioritisation of improvement initiatives in a quantitative study, as well as an investigation of opportunities for improving Uganda's phytosanitary certification process with distributed ledger technology.
Optimal Experimental Design Applied to Models of Microbial Gene Regulation
Microbial gene expression is a comparatively well understood process, but regulatory interactions between genes can give rise to complicated behaviours. Regulatory networks can exhibit strong context dependence, time-varying interactions and multiple equilibria. The qualitative diagrammatic models often used in biology are not well suited to reasoning about such intricate dynamics. Fortunately, mathematics offers a natural language to model gene regulation because it can quantify the various system inter-dependencies with much greater clarity and precision. This added clarity makes models of microbial gene regulation a valuable tool for studying both natural and synthetic gene regulatory systems.
However, models are only as good as the knowledge and assumptions they are built on. Specifically, all models depend on unknown parameters -- constants that quantify specific rates and interaction strengths within the regulatory system. In systems biology, parameters are generally fit, rather than measured directly, because their values depend on the state of the microbial host. This fitting requires collecting observations of the modeled system. Exactly what is measured, how many times, and under what experimental conditions defines an experimental design. The experimental design is intimately linked to the accuracy of any resulting parameter estimates for a model, but determining which experimental design will be useful for fitting can be difficult. Optimal experimental design (OED) provides a set of statistical techniques that can be used to make design choices that improve parameter estimation accuracy.
In this thesis I examine the use of OED methods applied to models of microbial gene regulation. I have specifically focused on optimal design methods that combine asymptotic parametric accuracy objectives, based on the Fisher information matrix, with relaxed formulations of the design optimization problem. I have applied these OED methods to three biological case studies. (1) I have used these methods to implement a multiple-shooting optimal control algorithm for optimal design of dynamic experiments. This algorithm was applied to a novel model of transcriptional regulation that accounts for the microbial host's physiological context. Optimal experiments were derived for estimating sequence-specific regulatory parameters and host-specific physiological parameters. (2) I have used OED methods to formulate an optimal sample scheduling algorithm for dynamic induction experiments. This algorithm was applied to a model of an optogenetic induction system -- an important tool for dynamic gene expression studies. The value of sampling schedules within dynamic experiments was examined by comparing optimal and naive schedules. (3) I derived an optimal experimental procedure for fitting a steady-state model of single-cell observations from a bistable regulatory motif. This system included a stochastic model of gene expression, and the OED methods made use of the linear noise approximation to derive a tractable design algorithm. In addition to these case studies, I also introduce the NLOED software package. The package can perform optimal design and a number of other fitting and diagnostic procedures on both static and dynamic multi-input multi-output models. The package makes use of automatic differentiation for efficient computation, offers a flexible modeling interface, and will make OED more accessible to the wider biological community.
Overall, the main contributions of this thesis include: developing novel OED methods for a variety of gene regulatory scenarios, studying optimal experimental design properties for these scenarios, and implementing open-source numerical software for a variety of OED problems in systems biology.
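To make the idea of a Fisher-information objective with a relaxed design concrete, here is a minimal sketch that is not drawn from the thesis or from the NLOED package. It assumes a hypothetical two-parameter saturating response y = a*u/(K+u) with unit-variance Gaussian noise, treats the design as a set of weights over candidate input levels, and improves the log-determinant of the Fisher information matrix (D-optimality) with the classical multiplicative weight update. The model, nominal parameter values, candidate grid, and all function names are assumptions made for the example.

```python
import math

def sensitivities(u, a, K):
    """Gradient of the response y = a*u/(K+u) with respect to (a, K)."""
    return (u / (K + u), -a * u / (K + u) ** 2)

def fisher_information(weights, grads):
    """Weighted 2x2 FIM, sum_i w_i * g_i g_i^T, returned as (m11, m12, m22)."""
    m11 = m12 = m22 = 0.0
    for w, (g1, g2) in zip(weights, grads):
        m11 += w * g1 * g1
        m12 += w * g1 * g2
        m22 += w * g2 * g2
    return m11, m12, m22

def log_det(fim):
    """Log-determinant of a symmetric 2x2 matrix given as (m11, m12, m22)."""
    m11, m12, m22 = fim
    return math.log(m11 * m22 - m12 * m12)

def d_optimal_weights(grads, iterations=2000):
    """Multiplicative update w_i <- w_i * d_i / p with p = 2 parameters,
    where d_i = g_i^T M(w)^{-1} g_i and M(w) is the weighted FIM."""
    n = len(grads)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        m11, m12, m22 = fisher_information(weights, grads)
        det = m11 * m22 - m12 * m12
        updated = []
        for w, (g1, g2) in zip(weights, grads):
            # Quadratic form with the explicit inverse of the 2x2 matrix.
            d = (m22 * g1 * g1 - 2.0 * m12 * g1 * g2 + m11 * g2 * g2) / det
            updated.append(w * d / 2.0)
        weights = updated
    return weights

# Hypothetical nominal parameter values and candidate input levels.
a_nominal, K_nominal = 1.0, 1.0
candidates = [0.1 * i for i in range(1, 101)]
grads = [sensitivities(u, a_nominal, K_nominal) for u in candidates]

uniform = [1.0 / len(candidates)] * len(candidates)
optimized = d_optimal_weights(grads)
gain = (log_det(fisher_information(optimized, grads))
        - log_det(fisher_information(uniform, grads)))
```

Because the sensitivities depend on the nominal parameter values, a design obtained this way is only locally optimal; the weights tend to concentrate on a small number of input levels, which can then be rounded to a replicate count for each condition.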
The Intuition of Knowing: Its Biological Function and Natural Triggering-Conditions
Over the last hundred years, competing and incompatible positions in relation to basic problems of knowledge and the use of the verb ‘to know’ have multiplied; and the prospect of a consensus solution emerging with respect to any of the problems has not seemed particularly good. We have a Gordian knot. Even so, I suggest that we also have a way to cut it. This will involve identifying why the cognitive mechanism that produces our intuitions of knowing evolved and was maintained (by natural selection), i.e., identifying the ‘teleonomic function’ of that cognitive mechanism. Also, it will involve predicting, on the basis of this teleonomic function, the triggering-conditions of these natural knowledge intuitions. In this thesis, I develop a general theory of the origin, function and triggering-conditions of knowledge intuitions that will allow us to cut that knot. That theory follows basic biological theory (including that which pertains to natural altruism) and also signal detection theory.
My theory identifies a number of different circumstances under which the triggering-conditions of knowledge intuitions are different. Strikingly, these different circumstances (and their associated triggering-conditions) map onto the different competing and incompatible epistemological positions to which I referred. This suggests that these positions are all correct within the boundaries of one of the circumstances that my theory identifies, and that the Gordian knot is largely the result of epistemologists claiming universal applicability of a theory that in fact only applies under particular circumstances. We cut the knot by specifying the different circumstances under which each of the different epistemological positions will hold, and the reason we should expect it to hold in just these circumstances, in light of the teleonomic function of knowledge intuitions.
Unravelling the insulin signalling pathway using mechanistic modelling
Type two diabetes affects 5% of the world's population and is increasing in prevalence. A key precursor to this disease is insulin resistance, which is characterised by a loss of responsiveness to insulin in liver, muscle and adipose tissue. This thesis focuses on understanding insulin signalling using the 3T3-L1 adipocyte cell model. Computational modelling was used to generate quantitative predictions in the signalling pathways of the adipocyte, many of which are mediated by enzymatic reactions. This study began by comparing existing enzyme kinetic models and evaluating their applicability to insulin signalling in particular. From this understanding, we developed an improved enzyme kinetic model, the differential quasi-steady state model (dQSSA), that avoids the reactant stationary assumption used in the Michaelis-Menten model. The dQSSA was found to more accurately model the behaviours of enzymes in large in silico systems, and in various coenzyme-inhibited and non-inhibited reactions in vitro. To apply the dQSSA, the SigMat software package was developed in the MATLAB environment to construct mathematical models from qualitative descriptions of networks. After the robustness of the package was verified, it was used to construct a basic model of the insulin signalling pathway. This model was trained against experimental temporal data at 1 nM and 100 nM doses of insulin. It revealed that the simple description of Akt activation, which displays an overshoot behaviour, was insufficient to describe the kinetics of substrate phosphorylation, which does not display the overshoot behaviour. The model was expanded to include Akt translocation and the individual phosphorylation at the 308 and 473 residues. This model resolved the discrepancy and predicts that Akt substrates are only accessible to Akt localised in the cytosol and that PIP3 sequestration of cytosolic Akt acts as a negative feedback.
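The limitation of the Michaelis-Menten model mentioned in this abstract can be illustrated with a small numerical comparison. The sketch below is not the dQSSA and is not taken from SigMat; it simply integrates the full mass-action scheme E + S <-> C -> E + P next to the Michaelis-Menten rate law with forward Euler steps, using rate constants and concentrations that are assumptions chosen for the example.

```python
def simulate(e0, s0, kf=10.0, kr=1.0, kcat=1.0, t_end=0.5, dt=1e-4):
    """Integrate the full mass-action model and the Michaelis-Menten
    reduction side by side; return the product formed under each."""
    km = (kr + kcat) / kf            # Michaelis constant
    s, c, p_full = s0, 0.0, 0.0      # full model: substrate, complex, product
    s_mm, p_mm = s0, 0.0             # reduced model: substrate, product
    for _ in range(int(round(t_end / dt))):
        e = e0 - c                   # free enzyme from conservation
        bind = kf * e * s - kr * c   # net binding flux
        cat = kcat * c               # catalytic flux
        s += dt * (-bind)
        c += dt * (bind - cat)
        p_full += dt * cat
        rate = kcat * e0 * s_mm / (km + s_mm)  # Michaelis-Menten rate law
        s_mm -= dt * rate
        p_mm += dt * rate
    return p_full, p_mm

# Enzyme scarce relative to substrate: the reduction tracks the full model.
p_full_low, p_mm_low = simulate(e0=0.01, s0=10.0)

# Enzyme in excess of substrate: the reduction overestimates product formation.
p_full_high, p_mm_high = simulate(e0=10.0, s0=1.0)
```

When the enzyme is abundant, most of the substrate is held in the complex, which the reduced rate law ignores. That regime is common in signalling networks where kinases and their substrates occur at comparable concentrations.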
Sintaksička i informacijsko-strukturalna obeležja glagolske fraze u staroengleskom (Syntactic and Information-Structural Properties of the Verb Phrase in Old English)
This thesis deals with the alternation in the position of the finite and the non-finite verb in Old English, specifically, with the alternation between finite verb-final and finite verb-non-final embedded clauses, and the object–verb (OV) vs. verb–object (VO) alternation in the non-finite verb phrase. The central proposal is that information-structural factors underlie most of the Old English word order patterns, including these alternations. What influences the surface position of the finite verb in embedded clauses is the discourse status of the proposition. Verb-final clauses are pragmatically presupposed, while non-final verb position signals pragmatic assertion. The OV/VO alternation does not reflect competing structures/grammars, but rather focus marking strategies on the VP material, reflected in VO orders. We therefore propose a multi-layered model of information structure, according to which topic/background-focus structures are represented at three different levels, whereby the following types of focus are distinguished: sentence focus, predicate focus and ‘new information’ focus. We also present a mechanism of their interaction and syntactic encoding in Old English. Two important insights emerge from this analysis. First, Old English is a discourse configurational language. Second, at least some discourse configurational languages do not syntactically mark each individual information-structural interpretation of sentence elements. Rather, it seems that the syntax reflects information-structural (IS) marking of a larger constituent, leaving specific resolutions to the context.