
    When do Words Matter? Understanding the Impact of Lexical Choice on Audience Perception using Individual Treatment Effect Estimation

    Studies across many disciplines have shown that lexical choice can affect audience perception. For example, how users describe themselves in a social media profile can affect their perceived socio-economic status. However, we lack general methods for estimating the causal effect of lexical choice on the perception of a specific sentence. While randomized controlled trials may provide good estimates, they do not scale to the potentially millions of comparisons necessary to consider all lexical choices. Instead, in this paper, we offer two classes of methods to estimate the effect on perception of changing one word to another in a given sentence. The first class of algorithms builds upon quasi-experimental designs to estimate individual treatment effects from observational data. The second class treats treatment effect estimation as a classification problem. We conduct experiments with three data sources (Yelp, Twitter, and Airbnb), finding that the algorithmic estimates align well with those produced by randomized controlled trials. Additionally, we find that it is possible to transfer treatment effect classifiers across domains and still maintain high accuracy. Comment: AAAI_201

    Learning Topic-Sensitive Word Representations

    Distributed word representations are widely used for modeling words in NLP tasks. Most existing models generate one representation per word and do not consider the different meanings of a word. We present two approaches to learning multiple topic-sensitive representations per word using a Hierarchical Dirichlet Process. We observe that by modeling topics and integrating topic distributions for each document, we obtain representations that are able to distinguish between different meanings of a given word. Our models yield statistically significant improvements for the lexical substitution task, indicating that commonly used single word representations, even when combined with contextual information, are insufficient for this task. Comment: 5 pages, 1 figure, Accepted at ACL 201

    Deconstructing comprehensibility: identifying the linguistic influences on listeners' L2 comprehensibility ratings

    Comprehensibility, a major concept in second language (L2) pronunciation research that denotes listeners’ perceptions of how easily they understand L2 speech, is central to interlocutors’ communicative success in real-world contexts. Although comprehensibility has been modeled in several L2 oral proficiency scales, for example, the Test of English as a Foreign Language (TOEFL) or the International English Language Testing System (IELTS), shortcomings of existing scales (e.g., vague descriptors) reflect limited empirical evidence as to which linguistic aspects influence listeners’ judgments of L2 comprehensibility at different ability levels. To address this gap, a mixed-methods approach was used in the present study to gain a deeper understanding of the linguistic aspects underlying listeners’ L2 comprehensibility ratings. First, speech samples of 40 native French learners of English were analyzed using 19 quantitative speech measures, including segmental, suprasegmental, fluency, lexical, grammatical, and discourse-level variables. These measures were then correlated with 60 native English listeners’ scalar judgments of the speakers’ comprehensibility. Next, three English as a second language (ESL) teachers provided introspective reports on the linguistic aspects of speech that they attended to when judging L2 comprehensibility. Following data triangulation, five speech measures were identified that clearly distinguished between L2 learners at different comprehensibility levels. Lexical richness and fluency measures differentiated between low-level learners; grammatical and discourse-level measures differentiated between high-level learners; and word stress errors discriminated between learners of all levels.

    Investigation into Human Preference between Common and Unambiguous Lexical Substitutions
