
    Dialogue as Data in Learning Analytics for Productive Educational Dialogue

    This paper provides a novel, conceptually driven stance on the contemporary analytic challenges faced in treating dialogue as a form of data across on- and offline sites of learning. In prior research, preliminary steps have been taken to detect occurrences of such dialogue using automated analysis techniques. Such advances have the potential to foster effective dialogue using learning analytic techniques that scaffold it, give feedback on it, and provide pedagogic contexts that promote it. However, translating much prior learning-science research to online contexts is complex, requiring the operationalization of constructs theorized in different contexts (often face-to-face) and based on different datasets and structures (often spoken dialogue). In this paper, we explore what could constitute the effective analysis of productive online dialogues, arguing that it requires consideration of three key facets of the dialogue: features indicative of productive dialogue; the unit of segmentation; and the interplay of features and segmentation with the temporal underpinning of learning contexts. The paper thus foregrounds key considerations regarding the analysis of dialogue data in emerging learning analytics environments, for both learning-science and computationally oriented researchers.
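    The three facets can be made concrete with a small illustration. The sketch below (Python) segments a transcript into speaker turns, attaches a timestamp for temporal analysis, and computes simple per-turn indicator features; the cue lists and field names are illustrative placeholders, not the validated constructs from the learning-science literature the paper draws on.

        from dataclasses import dataclass

        @dataclass
        class Turn:
            speaker: str
            text: str
            start: float  # seconds from session start; supports temporal analysis

        # Placeholder cue lists standing in for theorized indicators of
        # productive dialogue (e.g., explicit reasoning, elicitation).
        REASONING_CUES = ("because", "so that", "therefore")
        ELICITATION_CUES = ("what do you think", "why", "how")

        def turn_features(turn: Turn) -> dict:
            """One feature record per turn; the turn is the unit of segmentation here."""
            text = turn.text.lower()
            return {
                "speaker": turn.speaker,
                "start": turn.start,
                "reasoning": any(cue in text for cue in REASONING_CUES),
                "elicitation": any(cue in text for cue in ELICITATION_CUES),
                "n_words": len(text.split()),
            }

    Segmenting by turn is only one possible choice; the paper's point is that the feature set, the segmentation unit, and the temporal structure have to be chosen together.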

    Writer adaptation for offline text recognition: An exploration of neural network-based methods

    Handwriting recognition has seen significant success with the use of deep learning. However, a persistent shortcoming of neural networks is that they are not well-equipped to deal with shifting data distributions. In the field of handwritten text recognition (HTR), this shows itself in poor recognition accuracy for writers that are not similar to those seen during training. An ideal HTR model should be adaptive to new writing styles in order to handle the vast number of possible writing styles. In this paper, we explore how HTR models can be made writer adaptive by using only a handful of examples from a new writer (e.g., 16 examples) for adaptation. Two HTR architectures are used as base models, using a ResNet backbone along with either an LSTM or Transformer sequence decoder. Using these base models, two methods are considered to make them writer adaptive: 1) model-agnostic meta-learning (MAML), an algorithm commonly used for tasks such as few-shot classification, and 2) writer codes, an idea originating from automatic speech recognition. Results show that an HTR-specific version of MAML known as MetaHTR improves performance compared to the baseline, with a 1.4 to 2.0 improvement in word error rate (WER). The improvement due to writer adaptation is between 0.2 and 0.7 WER, where a deeper model seems to lend itself better to adaptation using MetaHTR than a shallower model. However, applying MetaHTR to larger HTR models or sentence-level HTR may become prohibitive due to its high computational and memory requirements. Lastly, writer codes based on learned features or Hinge statistical features did not lead to improved recognition performance.
    Comment: 21 pages including appendices, 6 figures, 10 tables
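    As a rough illustration of the adaptation step, the following sketch (Python/PyTorch) runs a few inner-loop gradient updates on a handful of labelled lines from a new writer, in the spirit of MAML-style test-time adaptation. It assumes a meta-trained model whose forward pass returns per-timestep logits suitable for CTC; the function and parameter names are hypothetical, not the MetaHTR implementation.

        import copy

        import torch
        import torch.nn.functional as F

        def adapt_to_writer(base_model, support_images, support_targets,
                            target_lengths, inner_lr=1e-3, inner_steps=3):
            """Few-shot adaptation on e.g. 16 lines from an unseen writer."""
            model = copy.deepcopy(base_model)  # keep meta-trained weights intact
            opt = torch.optim.SGD(model.parameters(), lr=inner_lr)
            for _ in range(inner_steps):
                # Assumed output shape: (T, N, C) per-timestep logits.
                log_probs = F.log_softmax(model(support_images), dim=-1)
                input_lengths = torch.full((support_images.size(0),),
                                           log_probs.size(0), dtype=torch.long)
                loss = F.ctc_loss(log_probs, support_targets,
                                  input_lengths, target_lengths)
                opt.zero_grad()
                loss.backward()
                opt.step()
            return model  # writer-adapted copy, used for recognition

    MetaHTR additionally meta-trains the base model so that these few steps actually help; the memory cost the abstract mentions comes from backpropagating through this inner loop during meta-training.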

    Cognitive Grammar in Contemporary Fiction

    This book proposes an extension of Cognitive Grammar (Langacker 1987, 1991, 2008) towards a cognitive discourse grammar, through the unique environment that literary stylistic application offers. Drawing upon contemporary research in cognitive stylistics (Text World Theory, deixis and mind-modelling, amongst others), the volume scales up central Cognitive Grammar concepts (such as construal, grounding, the reference point model and action chains) in order to explore the attenuation of experience – and how it is simulated – in literary reading. In particular, it considers a range of contemporary texts by Neil Gaiman, Jennifer Egan, Jonathan Safran Foer, Ian McEwan and Paul Auster. This application builds upon previous work that adopts Cognitive Grammar for literary analysis and provides the first extended account of Cognitive Grammar in contemporary fiction.

    Copyright in Utilitarian Objects: Beneath Metaphysics


    Bayesian hierarchical modeling for the forensic evaluation of handwritten documents

    The analysis of handwritten evidence has been used widely in courts in the United States since the 1930s (Osborn, 1946). Traditional evaluations are conducted by trained forensic examiners. More recently, there has been a movement toward objective and probability-based evaluation of evidence, and a variety of governing bodies have made explicit calls for research to support the scientific underpinnings of the field (National Research Council, 2009; President's Council of Advisors on Science and Technology (US), 2016; National Institute of Standards and Technology). This body of work makes contributions to help satisfy those needs for the evaluation of handwritten documents. We develop a framework to evaluate a questioned writing sample against a finite set of genuine writing samples from known sources. Our approach is fully automated, reducing the opportunity for cognitive biases to enter the analysis pipeline through regular examiner intervention. Our methods are able to handle all writing styles together, and result in estimated probabilities of writership based on parametric modeling. We contribute open-source datasets, code, and algorithms. A document is prepared for evaluation by first being scanned and stored as an image file. The image is processed and the text within is decomposed into a sequence of disjoint graphical structures. The graphs serve as the smallest unit of writing we will consider, and features extracted from them are used as data for modeling. Chapter 2 describes the image processing steps and introduces a distance measure for the graphs. The distance measure is used in a K-means clustering algorithm (Forgy, 1965; Lloyd, 1982; Gan and Ng, 2017), which results in a clustering template with 40 exemplar structures. The primary feature we extract from each graph is a cluster assignment. We do so by comparing each graph to the template and making assignments based on the exemplar to which each graph is most similar in structure. The cluster assignment feature is used for a writer identification exercise using a Bayesian hierarchical model on a small set of 27 writers. In Chapter 3 we incorporate new data sources and a larger number of writers in the clustering algorithm to produce an updated template. A mixture component is added to the hierarchical model and we explore the relationship between a writer's estimated mixing parameter and their writing style. In Chapter 4 we expand the hierarchical model to include other graph-based features, in addition to cluster assignments. We incorporate an angular feature with support on the polar coordinate system into the hierarchical modeling framework using a circular probability density function. The new model is applied and tested in three applications.
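    A minimal sketch of the cluster-assignment step follows (Python), with assumed names and a Euclidean distance standing in for the graph distance measure developed in Chapter 2 (the real measure compares graphical structures, not plain feature vectors):

        import numpy as np

        def assign_clusters(graph_features, exemplars):
            """Assign each decomposed writing graph to its nearest exemplar.

            graph_features: (n_graphs, d) per-graph feature vectors (assumed).
            exemplars:      (40, d) array, one row per template exemplar.
            """
            dists = np.linalg.norm(
                graph_features[:, None, :] - exemplars[None, :, :], axis=-1)
            return dists.argmin(axis=1)

        def cluster_counts(assignments, n_clusters=40):
            # Per-document occupancy counts over the 40-cluster template:
            # the kind of count data a hierarchical model of writership takes.
            return np.bincount(assignments, minlength=n_clusters)

    The estimated probabilities of writership then come from fitting the hierarchical model to such counts for the known writers and scoring the questioned document against each.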

    Early aspects: aspect-oriented requirements engineering and architecture design

    This paper reports on the third Early Aspects: Aspect-Oriented Requirements Engineering and Architecture Design Workshop, held in Lancaster, UK, on March 21, 2004. The workshop included a presentation session and working sessions in which particular topics on early aspects were discussed. The primary goal of the workshop was to focus on the challenges of defining methodical software development processes for aspects from early in the software life cycle, and to explore the potential of proposed methods and techniques to scale up to industrial applications.

    SYNTHNOTES: TOWARDS SYNTHETIC CLINICAL TEXT GENERATION

    SynthNotes is a statistical natural language generation tool for creating realistic medical text notes for use by researchers in clinical language processing. Currently, advances in medical analytics research face barriers due to patient privacy concerns, which limit the number of researchers who have access to valuable data. Furthermore, privacy protections restrict the computing environments where data can be processed, which often imposes prohibitive costs on researchers. The generation method described here provides domain-independent statistical methods for learning to generate text by extracting and ranking templates from a training corpus. The primary contribution of this work is automating template selection and text generation through classic machine learning methods. SynthNotes removes the need for human domain experts to construct templates, which can be time-intensive and expensive. Furthermore, by using machine learning methods, this approach leads to greater realism and variability in the generated notes than could be achieved through classical language generation methods.
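    A toy sketch of the template-extraction-and-fill idea (Python); the masking regex and slot name are stand-ins for the learned extraction and ranking the paper describes, where slots for clinical entities would come from an NER step rather than a number pattern:

        import random
        import re
        from collections import Counter

        def extract_templates(corpus_sentences):
            # Mask surface values so repeated sentence frames collapse into
            # one template; frequency serves as a crude ranking signal.
            masked = (re.sub(r"\d[\d./:-]*", "<NUM>", s) for s in corpus_sentences)
            return Counter(masked)

        def generate_note(templates, slot_values, n_sentences=3, top_k=20):
            ranked = [t for t, _ in templates.most_common(top_k)]
            chosen = random.sample(ranked, k=min(n_sentences, len(ranked)))
            return " ".join(t.replace("<NUM>", random.choice(slot_values))
                            for t in chosen)

    Sampling from ranked templates with varied slot fillers is what gives the generated notes more variability than a fixed hand-built template set.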