2,599 research outputs found

Unsupervised Learning of Style-sensitive Word Vectors

Authors: Akama Reina, Inui Kentaro, Kobayashi Sosuke, Watanabe Kento, Yokoi Sho
Publication date: 01/01/2018

This paper presents the first study aimed at capturing stylistic similarity between words in an unsupervised manner. We propose extending the continuous bag of words (CBOW) model (Mikolov et al., 2013) to learn style-sensitive word vectors using a wider context window, under the assumption that the style of all the words in an utterance is consistent. In addition, we introduce a novel task to predict lexical stylistic similarity and create a benchmark dataset for this task. Our experiment with this dataset supports our assumption and demonstrates that the proposed extensions contribute to the acquisition of style-sensitive word embeddings.

Comment: 7 pages, Accepted at The 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018)
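As a rough illustration of the "wider context window" idea in this abstract, the sketch below contrasts a narrow CBOW-style window with a window spanning the whole utterance. The function name `context_words` and its `window` parameter are hypothetical, and this is not the authors' implementation; it only shows how the set of context words changes when the window is widened.

```python
def context_words(utterance, index, window=None):
    """Return the context words for the target at `index`.

    `window` is a hypothetical parameter: an integer gives a symmetric
    CBOW-style window; None widens the context to the entire utterance,
    reflecting the assumption that style is consistent within an utterance.
    """
    if window is None:
        left, right = 0, len(utterance)
    else:
        left = max(0, index - window)
        right = min(len(utterance), index + window + 1)
    # Keep every word in the span except the target itself.
    return [
        word
        for i, word in enumerate(utterance[left:right], start=left)
        if i != index
    ]


utterance = ["well", "i", "reckon", "that", "is", "splendid"]
narrow = context_words(utterance, 2, window=1)  # neighbours only
wide = context_words(utterance, 2)              # whole utterance
```

With the narrow window the target "reckon" sees only its immediate neighbours, while the wide setting also exposes it to "splendid", a word that may carry the same stylistic signal.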

arXiv.org e-Print Archive

Crossref

Stylistic variation over 200 years of court proceedings according to gender and social class

Author: Degaetano-Ortlieb Stefania
Publication venue: Saarländische Universitäts- und Landesbibliothek
Publication date: 01/01/2018

We present an approach to detect stylistic variation across social variables (here: gender and social class), also considering diachronic change in language use. To detect stylistic variation, we use relative entropy, which measures the difference between probability distributions at different linguistic levels (here: lexis and grammar). In addition, relative entropy lets us determine which linguistic units are related to stylistic variation.

This research is funded by the German Research Foundation (Deutsche Forschungsgemeinschaft) under grants SFB1102: Information Density and Linguistic Encoding (www.sfb1102.uni-saarland.de) and the start-up grant for research projects from Saarland University.
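Relative entropy (Kullback-Leibler divergence) between two distributions P and Q is the sum over units of P(u) * log2(P(u) / Q(u)). The sketch below is one minimal way to compute it over unigram counts and to expose the per-unit contributions. The function name `relative_entropy`, the add-one smoothing, and the toy token lists are assumptions for illustration, not details taken from the paper.

```python
import math
from collections import Counter


def relative_entropy(tokens_a, tokens_b, smoothing=1.0):
    """Return (D(A || B) in bits, per-unit contributions).

    Additive smoothing (an assumption made here) keeps every probability
    non-zero so that the logarithm is always defined.
    """
    vocab = set(tokens_a) | set(tokens_b)
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    total_a = len(tokens_a) + smoothing * len(vocab)
    total_b = len(tokens_b) + smoothing * len(vocab)

    contributions = {}
    for unit in vocab:
        p = (counts_a[unit] + smoothing) / total_a
        q = (counts_b[unit] + smoothing) / total_b
        # Pointwise term: large positive values mark units typical of A.
        contributions[unit] = p * math.log2(p / q)
    return sum(contributions.values()), contributions


# Toy data, invented purely for illustration.
group_a = ["sir", "sir", "the"]
group_b = ["the", "the", "mate"]
divergence, per_unit = relative_entropy(group_a, group_b)
most_distinctive = max(per_unit, key=per_unit.get)
```

Sorting the per-unit contributions is one simple way to see which units drive the divergence, which corresponds to the abstract's point that relative entropy can indicate the linguistic units related to stylistic variation.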

Crossref

Universaar

Computational Sociolinguistics: A Survey

Authors: de Jong Franciska, Doğruöz A. Seza, Nguyen Dong, Rosé Carolyn P.
Publication date: 01/01/2016

Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language. In this article we present a survey of the emerging field of "Computational Sociolinguistics" that reflects this increased interest. We aim to provide a comprehensive overview of CL research on sociolinguistic themes, featuring topics such as the relation between language and social identity, language use in social interaction and multilingual communication. Moreover, we demonstrate the potential for synergy between the research communities involved, by showing how the large-scale data-driven methods that are widely used in CL can complement existing sociolinguistic studies, and how sociolinguistics can inform and challenge the methods and assumptions employed in CL studies. We hope to convey the possible benefits of a closer collaboration between the two communities and conclude with a discussion of open challenges.

Comment: To appear in Computational Linguistics. Accepted for publication: 18th February, 201

arXiv.org e-Print Archive

Crossref

Ghent University Academic Bibliography

EUR Research Repository

University of Twente Research Information

Profiling Dutch Authors on Twitter: Discovering Political Preference and Income Level

Authors: Melein Léon, Plank Barbara, van Dalen Reinard
Publication date: 01/01/2017

ARTS repository - University of Groningen

Dissertations of the University of Groningen

U Ok Hun?: The digital commodification of white woman style

Author: Ilbury Christian
Publication venue: Wiley
Publication date: 02/04/2022

Edinburgh Research Explorer

Predicting Authorship and Author Traits from Keystroke Dynamics

Author: Plank Barbara
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2018

Crossref

The IT University of Copenhagen's Repository

Listening between the Lines: Learning Personal Attributes from Conversations

Authors: Mirza Paramita, Tigunova Anna, Weikum Gerhard, Yates Andrew
Publication date: 01/01/2019

Open-domain dialogue agents must be able to converse about many topics while incorporating knowledge about the user into the conversation. In this work we address the acquisition of such knowledge, for personalization in downstream Web applications, by extracting personal attributes from conversations. This problem is more challenging than the established task of information extraction from scientific publications or Wikipedia articles, because dialogues often give merely implicit cues about the speaker. We propose methods for inferring personal attributes, such as profession, age or family status, from conversations using deep learning. Specifically, we propose several Hidden Attribute Models, which are neural networks leveraging attention mechanisms and embeddings. Our methods are trained on a per-predicate basis to output rankings of object values for a given subject-predicate combination (e.g., ranking the doctor and nurse professions high when speakers talk about patients, emergency rooms, etc.). Experiments with various conversational texts, including Reddit discussions, movie scripts and a collection of crowdsourced personal dialogues, demonstrate the viability of our methods and their superior performance compared to state-of-the-art baselines.

Comment: published in WWW'1

arXiv.org e-Print Archive

MPG.PuRe