54 research outputs found
Men Are from Mars, Women Are from Venus: Evaluation and Modelling of Verbal Associations
We present a quantitative analysis of human word association pairs and study
the types of relations presented in the associations. We put our main focus on
the correlation between response types and respondent characteristics such as
occupation and gender by contrasting syntagmatic and paradigmatic associations.
Finally, we propose a personalised distributed word association model and show
the importance of incorporating demographic factors into the models commonly
used in natural language processing.
Comment: AIST 2017 camera-read
Intellectual Disability, Literacy, and Assistive Technology in the Community College Setting
This pilot project utilized a mixed-methodology approach to investigating assistive technology (AT) with students enrolled in a college program for students with intellectual disabilities. Its purpose was the provision of AT-based software, training, and support to examine changes related to academic learning and independence. Twelve college student participants were assessed using pre- and post-intervention assessment tools. Six student participants and five front-line Learning Facilitators also participated in focus groups. Focus group data were analyzed inductively, resulting in seven emergent themes; assessment data were analyzed statistically and showed an upward trend in scores over time
Computational Sociolinguistics: A Survey
Language is a social phenomenon and variation is inherent to its social
nature. Recently, there has been a surge of interest within the computational
linguistics (CL) community in the social dimension of language. In this article
we present a survey of the emerging field of "Computational Sociolinguistics"
that reflects this increased interest. We aim to provide a comprehensive
overview of CL research on sociolinguistic themes, featuring topics such as the
relation between language and social identity, language use in social
interaction and multilingual communication. Moreover, we demonstrate the
potential for synergy between the research communities involved, by showing how
the large-scale data-driven methods that are widely used in CL can complement
existing sociolinguistic studies, and how sociolinguistics can inform and
challenge the methods and assumptions employed in CL studies. We hope to convey
the possible benefits of a closer collaboration between the two communities and
conclude with a discussion of open challenges.
Comment: To appear in Computational Linguistics. Accepted for publication:
18th February, 201
Two-part Negation in Yang Zhuang
The negation system of Yang Zhuang includes two standard negators and an aspectual negator, all of which occur before the verb; the negator meiz nearly always co-occurs with a clause-final particle nauq, which can also stand as a single-word negative response to a question. Although it is tempting to analyze nauq with a meaning beyond simply negation, this is difficult to do synchronically. Comparison with neighboring Tai languages suggests that this construction represents one stage in Jespersen's Cycle, whereby a negator is augmented with a second element, after which the second element becomes associated with negation; this element subsequently replaces the historical negator. A Jespersen's Cycle analysis also explains the occurrence of nauq as a preverbal negator in some neighboring Zhuang languages
Stylistics versus Statistics: A corpus linguistic approach to combining techniques in forensic authorship analysis using Enron emails
This thesis empirically investigates how a corpus linguistic approach can address the main theoretical and methodological challenges facing the field of forensic authorship analysis. Linguists approach the problem of questioned authorship from the theoretical position that each person has their own distinctive idiolect (Coulthard 2004: 431). However, the notion of idiolect has come under scrutiny in forensic linguistics over recent years for being too abstract to be of practical use (Grant 2010; Turell 2010). At the same time, two competing methodologies have developed in authorship analysis. On the one hand, there are qualitative stylistic approaches, and on the other there are statistical ‘stylometric’ techniques. This study uses a corpus of over 60,000 emails and 2.5 million words written by 176 employees of the former American company Enron to tackle these issues in the contexts of both authorship attribution (identifying authors using linguistic evidence) and author profiling (predicting authors’ social characteristics using linguistic evidence).
Analyses reveal that even in shared communicative contexts, and when using very common lexical items, individual Enron employees produce distinctive collocation patterns and lexical co-selections. In turn, these idiolectal elements of linguistic output can be captured and quantified by word n-grams (strings of n words). An attribution experiment is performed using word n-grams to identify the authors of anonymised email samples. Results of the experiment are encouraging, and it is argued that the approach developed here offers a means by which stylistic and statistical techniques can complement each other. Finally, quantitative and qualitative analyses are combined in the sociolinguistic profiling of Enron employees by gender and occupation. Current author profiling research is exclusively statistical in nature. However, the findings here demonstrate that when statistical results are augmented by qualitative evidence, the complex relationship between language use and author identity can be more accurately observed
Proceedings of the First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014): Porto, Portugal
Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014). Porto (Portugal), August 27-28, 2014