45 research outputs found

Multimodal communicative competence in second language contexts

Author: Royce TD
Publication venue: 'Informa UK Limited'
Publication date: 01/12/2013

Towards Machine-Assisted Meta Studies of Astrophysical Data From the Scientific Literature

Author: Crossland Thomas David
Publication venue: UCL (University College London)
Publication date: 28/02/2023

We develop a new model for automatic extraction of reported measurements from the astrophysical literature, utilising modern Natural Language Processing techniques. We begin with a rules-based model for keyword-search-based extraction, and then proceed to develop artificial neural network models for full entity and relation extraction from free text. This process also requires the creation of hand-annotated datasets selected from the available astrophysical literature for training and validation purposes. We use a set of cosmological parameters to examine the model's ability to identify information relating to a specific parameter and to illustrate its capabilities, using the Hubble constant as a primary case study due to the well-documented history of that parameter. Our results correctly highlight the current tension present in measurements of the Hubble constant and recover the 3.5σ discrepancy, demonstrating that the models are useful for meta-studies of astrophysical measurements from a large number of publications. From the other cosmological parameter results we can clearly observe the historical trends in the reported values of these quantities over the past two decades, and see the impacts of landmark publications on our understanding of cosmology. The outputs of these models, when applied to the article abstracts present in the arXiv repository, constitute a database of over 231,000 astrophysical numerical measurements, relating to over 61,000 different symbolic parameter representations. Here a measurement refers to the combination of a numerical value and an identifier (i.e. a name or symbol) to give it physical meaning. We present an online interface (Numerical Atlas) to allow users to query and explore this database, based on parameter names and symbolic representations, and download the resulting datasets for their own research uses.

UCL Discovery

Proceedings of the workshop on language technology for normalisation of less-resourced languages (SaLTMiL 8 - AfLaT 2012)

Author: De Pauw Guy; de Schryver Gilles-Maurice; Forcada Mike L; Sarasola Kepa; Tyers Francis M; Wagacha Peter W
Publication venue: European Language Resources Association
Publication date: 01/01/2012

Ghent University Academic Bibliography

Theoretical and empirical arguments for the reassessment of the notion of paradigm

Publication date: 01/01/2022

The volume discusses the breadth of applications for an extended notion of paradigm. Paradigms in this sense are not only tools of morphological description but constitute the inherent structure of grammar. Grammatical paradigms are structural sets forming holistic, semiotic structures with an informational value of their own. We argue that as such, paradigms are a part of speaker knowledge and provide necessary structuring for grammaticalization processes. The papers discuss theoretical as well as conceptual questions and explore different domains of grammatical phenomena, ranging from grammaticalization, morphology, and cognitive semantics to modality, aiming to illustrate what the concept of grammatical paradigms can and cannot (yet) explain.

Institutional Repository of the Freie Universität Berlin

MULTI-MODAL TASK INSTRUCTIONS TO ROBOTS BY NAIVE USERS

Author: WOLF JOERG CHRISTIAN
Publication venue: 'University of Plymouth'
Publication date: 01/01/2008

This thesis presents a theoretical framework for the design of user-programmable robots. The objective of the work is to investigate multi-modal unconstrained natural instructions given to robots in order to design a learning robot. A corpus-centred approach is used to design an agent that can reason, learn and interact with a human in a natural unconstrained way. The corpus-centred design approach is formalised and developed in detail. It requires the developer to record a human during interaction and analyse the recordings to find instruction primitives. These are then implemented in a robot. The focus of this work has been on how to combine speech and gesture using rules extracted from the analysis of a corpus. A multi-modal integration algorithm is presented that can use timing and semantics to group, match and unify gesture and language. The algorithm always achieves correct pairings on a corpus and initiates questions to the user in cases of ambiguity or missing information. The domain of card games has been investigated because of its variety of games, which are rich in rules and contain sequences. A further focus of the work is on the translation of rule-based instructions. Most multi-modal interfaces to date have only considered sequential instructions. The combination of frame-based reasoning, a knowledge base organised as an ontology and a problem solver engine is used to store these rules. Understanding rule instructions, which contain conditional and imaginary situations, requires an agent with complex reasoning capabilities. A test system of the agent implementation is also described. Tests to confirm the implementation by playing back the corpus are presented. Furthermore, deployment test results with the implemented agent and human subjects are presented and discussed. The tests showed that the rate of errors that are due to sentences not being defined in the grammar does not decrease at an acceptable rate when new grammar is introduced. This was particularly the case for complex verbal rule instructions, which can be expressed in a large variety of ways.

CiteSeerX

Plymouth Electronic Archive and Research Library

OpenGrey Repository

Paradigms regained

Publication venue: Language Science Press
Publication date: 22/04/2022

Directory of Open Access Books (DOAB)

Towards the Repayment of Self-Admitted Technical Debt

Author: Sierra Giancarlo
Publication date: 29/01/2019

Technical Debt is a metaphor used to express sub-optimal source code implementations that are introduced for short-term benefits and that often must be paid back later, at an increased cost. In recent years, various empirical studies have focused on investigating source code comments that indicate Technical Debt, often referred to as Self-Admitted Technical Debt (SATD). In this thesis, we survey research work on SATD, analyzing characteristics of current approaches and techniques for SATD, dividing the literature into three categories: detection, comprehension, and repayment. To set the stage for novel and improved work on SATD, we compile tools, resources, and data sets made publicly available. We also identify areas that lack investigation, outline open challenges, and discuss potential future research avenues. From the literature survey, we conclude that most findings and contributions have focused on techniques to identify, classify, and comprehend SATD. Few studies focused on the repayment or management of SATD, which is an essential goal of studying technical debt for software maintenance. Therefore, we perform an empirical study towards SATD repayment. We conducted a preliminary online survey with developers to understand the elements they consider when prioritizing SATD. With the acquired knowledge from the survey responses and previous literature work, we select metrics to estimate SATD repayment effort. We examine SATD instances found in software systems to see how they have been repaid and investigate the possibility of using historical data at the time of SATD introduction as indicators for SATD that should be addressed. We find two SATD repayment effort metrics that can be consistently modeled in our studied projects and surface the best early indicators for important SATD.

Concordia University Research Repository

Paradigms regained

OAPEN Library

Collateral adjectives in English and related Issues

Author: Koshiishi Tetsuya
Publication venue: The University of Edinburgh
Publication date: 01/01/2009

Edinburgh Research Archive

Statistical modelling of lexical and syntactic complexity of postgraduate academic writing: a genre and corpus-based study of EFL, ESL, and English L1 M.A. dissertations

Author: Nasseri Maryam
Publication date: 21/07/2021

This research is an interdisciplinary study that adopts the principles of corpus linguistics and the methods of quantitative linguistics and statistical modelling to analyse the rhetorical sections of MA dissertations written by EFL, ESL, and English L1 postgraduate students. A discipline-specific corpus was analysed for 22 lexical and 11 syntactic complexity measures using three natural language processing tools [LCA-AW, TAALED, Coh-Metrix] to find differences between academic texts written by English L1 vs. L2 students and to investigate the relationship between these linguistic indices. Structural factor analyses, as well as the two statistical modelling methods of linear mixed-effects modelling and supervised machine learning predictive classification modelling, were then employed to verify the existing classification of the complexity indices, to explore their further dimensions, to investigate the effects of English language background and rhetorical sections on the production of lexically and syntactically complex texts, and finally to identify models that can best classify group membership and membership of the rhetorical sections based on the values of these measures. This investigation resulted in more than 20 specific findings with important implications for academic writing assessment of English L1 vs. L2, for academic writing research on rhetorical sections of English academic texts, for academic writing instruction, especially materials development and syllabus design in EFL contexts and academic immersion programmes, for the measure-testing and selection processes, and for methodological aspects of statistical modelling in corpus-based academic studies.

University of Birmingham Research Archive, E-theses Repository