35,283 research outputs found
Hierarchical Character-Word Models for Language Identification
Social media messages' brevity and unconventional spelling pose a challenge
to language identification. We introduce a hierarchical model that learns
character and contextualized word-level representations for language
identification. Our method performs well against strong base- lines, and can
also reveal code-switching
Troping the Enemy: Metaphor, Culture, and the Big Data Black Boxes of National Security
This article considers how cultural understanding is being brought into the work of the Intelligence Advanced Research Projects Activity (IARPA), through an analysis of its Metaphor program. It examines the type of social science underwriting this program, unpacks implications of the agency’s conception of metaphor for understanding so-called cultures of interest, and compares IARPA’s to competing accounts of how metaphor works to create cultural meaning. The article highlights some risks posed by key deficits in the Intelligence Community\u27s (IC) approach to culture, which relies on the cognitive linguistic theories of George Lakoff and colleagues. It also explores the problem of the opacity of these risks for analysts, even as such predictive cultural analytics are becoming a part of intelligence forecasting. This article examines the problem of information secrecy in two ways, by unpacking the opacity of “black box,” algorithm-based social science of culture for end users with little appreciation of their potential biases, and by evaluating the IC\u27s nontransparent approach to foreign cultures, as it underwrites national security assessments
The added value of implementing the Planet Game scenario with Collage and Gridcole
This paper discusses the suitability and the added value of Collage and Gridcole when contrasted with other solutions participating in the ICALT 2006 workshop titled “Comparing educational modelling languages on a case study.” In this workshop each proposed solution was challenged to implement a Computer-Supported Collaborative Learning situation (CSCL) posed by the workshop’s organizers. Collage is a pattern-based authoring tool for the creation of CSCL scripts compliant with IMS Learning Design (IMS LD). These IMS LD scripts can be enacted by the Gridcole tailorable CSCL system. The analysis presented in the paper is organized as a case study which considers the data recorded in the workshop discussion as well the information reported in the workshop contributions. The results of this analysis show how Collage and Gridcole succeed in implementing the scenario and also point out some significant advantages in terms of design reusability and generality, user-friendliness, and enactment flexibility
The Palaeographical Method under the Light of a Digital Approach
This paper has the twofold aim of reflecting upon a humanities computing approach to palaeography, and of making such reflections - together with its related experimental results - fruitful at the implementation level. Firstly, the paper explores the methodological issues related to the use of a digital tool to support the palaeographical analysis of medieval handwriting. It claims that humanities computing methods can assist in making explicit those processes of the palaeographical research that encompass detailed analyses, in particular of the handwriting and, more generally, of other idiosyncratic features of written cultural artefacts. Thus, palaeographical tools are to be contextualised and used within a broader methodological framework where their role is to mediate the vision, the comparison, the representation, the analysis and the interpretation of these objects. Secondly, the paper attempts to evaluate the experimentations carried out with a specific software and, in so doing, to test a humanities computing approach to palaeography at a practical level, so as to direct future implementations. Some of these implementations have already been carried out by the current developers of the application in question with whom the author collaborates closely, while others are still in progress and in need of future iterative refinements
Presenting GECO : an eyetracking corpus of monolingual and bilingual sentence reading
This paper introduces GECO, the Ghent Eye-tracking Corpus, a monolingual and bilingual corpus of eye-tracking data of participants reading a complete novel. English monolinguals and Dutch-English bilinguals read an entire novel, which was presented in paragraphs on the screen. The bilinguals read half of the novel in their first language, and the other half in their second language. In this paper we describe the distributions and descriptive statistics of the most important reading time measures for the two groups of participants. This large eye-tracking corpus is perfectly suited for both exploratory purposes as well as more directed hypothesis testing, and it can guide the formulation of ideas and theories about naturalistic reading processes in a meaningful context. Most importantly, this corpus has the potential to evaluate the generalizability of monolingual and bilingual language theories and models to reading of long texts and narratives
- …