    A discriminative approach to grounded spoken language understanding in interactive robotics

    Spoken Language Understanding in Interactive Robotics provides computational models of human-machine communication based on vocal input. However, robots operate in specific environments and the correct interpretation of the spoken sentences depends on the physical, cognitive and linguistic aspects triggered by the operational environment. Grounded language processing should exploit both the physical constraints of the context and the knowledge assumptions of the robot. These include the subjective perception of the environment that explicitly affects linguistic reasoning. In this work, a standard linguistic pipeline for semantic parsing is extended toward a form of perceptually informed natural language processing that combines discriminative learning and distributional semantics. Empirical results show up to a 40% relative error reduction.

    Cultural Compass: Predicting Transfer Learning Success in Offensive Language Detection with Cultural Features

    The increasing ubiquity of language technology necessitates a shift towards considering cultural diversity in the machine learning realm, particularly for subjective tasks that rely heavily on cultural nuances, such as Offensive Language Detection (OLD). Current understanding underscores that these tasks are substantially influenced by cultural values; however, a notable gap exists in determining whether cultural features can accurately predict the success of cross-cultural transfer learning for such subjective tasks. Addressing this, our study delves into the intersection of cultural features and transfer learning effectiveness. The findings reveal that cultural value surveys indeed possess predictive power for cross-cultural transfer learning success in OLD tasks and that this power can be further improved using offensive word distance. Based on these results, we advocate for the integration of cultural information into datasets. Additionally, we recommend leveraging data sources rich in cultural information, such as surveys, to enhance cultural adaptability. Our research signifies a step forward in the quest for more inclusive, culturally sensitive language technologies. Comment: Findings of EMNLP 202

    Second-language acquisition and motivation: A literature review

    This literature review traces the development of motivation in second-language acquisition, a field that has evolved from basic associations between affective factors and second-language performance to nuanced accounts of how motivation is shaped by a learner’s subjective cognition. With this review, we see that motivation’s role has always been central to language learning, and the development of our understanding of this role has mirrored the development of our understanding of second-language acquisition’s psychological and cognitive aspects. Such understanding contributes to many areas of second-language pedagogy, developmental psychology, and applied linguistics, all of which are relevant to our practical research goals of maximizing student effectiveness in second-language learning.

    Building a Sentiment Corpus of Tweets in Brazilian Portuguese

    The large amount of data available in social media, forums and websites motivates research in several areas of Natural Language Processing, such as sentiment analysis. The popularity of the area due to its subjective and semantic characteristics motivates research on novel methods and approaches for classification. Hence, there is a high demand for datasets on different domains and different languages. This paper introduces TweetSentBR, a sentiment corpus for Brazilian Portuguese manually annotated with 15,000 sentences in the TV show domain. The sentences were labeled in three classes (positive, neutral and negative) by seven annotators, following literature guidelines for ensuring reliability of the annotation. We also ran baseline experiments on polarity classification using three machine learning methods, reaching 80.99% F-Measure and 82.06% accuracy in binary classification, and 59.85% F-Measure and 64.62% accuracy in three-point classification. Comment: Accepted for publication in 11th International Conference on Language Resources and Evaluation (LREC 2018)

    Voice technology and BBN

    The following research was discussed: (1) speech signal processing; (2) automatic speech recognition; (3) continuous speech understanding; (4) speaker recognition; (5) speech compression; (6) subjective and objective evaluation of speech communication systems; (7) measurement of the intelligibility and quality of speech when degraded by noise or other masking stimuli; (8) speech synthesis; (9) instructional aids for second-language learning and for training of the deaf; and (10) investigation of speech correlates of psychological stress. Experimental psychology, control systems, and human factors engineering, which are often relevant to the proper design and operation of speech systems, are described.

    Enhancing Sentiment Analysis Results through Outlier Detection Optimization

    When dealing with text data containing subjective labels like speaker emotions, inaccuracies or discrepancies among labelers are not uncommon. Such discrepancies can significantly affect the performance of machine learning algorithms. This study investigates the potential of identifying and addressing outliers in text data with subjective labels, aiming to enhance classification outcomes. We utilized the Deep SVDD algorithm, a one-class classification method, to detect outliers in nine text-based emotion and sentiment analysis datasets. By employing both a small-sized language model (DistilBERT base model with 66 million parameters) and non-deep learning machine learning algorithms (decision tree, KNN, Logistic Regression, and LDA) as the classifier, our findings suggest that the removal of outliers can lead to enhanced results in most cases. Additionally, as outliers in such datasets are not necessarily unlearnable, we experimented with a large language model, DeBERTa v3 large with 131 million parameters, which can capture very complex patterns in data. We continued to observe performance enhancements across multiple datasets. Comment: 11 pages, 5 figures

    The Relationship Between the Amount of Time Spent Writing with Computers and the Quality of Written Work.

    The decade of the 80's witnessed the introduction of a new method for teaching language arts called Whole Language. A whole language approach provides language instruction as the simultaneous, integrated teaching of reading, writing, speaking, and listening in a context that is both meaningful and purposeful for the learner. A new paradigm emerged, demonstrating that knowledge is internal and subjective, learning is constructing meaning, and teaching is a dynamic combination of coaching and facilitating (Hiebert, 1989, p. 62). The whole-language movement appears to embody this new paradigm in its most advanced development.

    Affect experience in natural language collected with smartphones

    Recent technological advancements in computerized text and speech analysis as well as machine learning methods have sparked a growing body of research investigating the algorithmic recognition of affect from the ubiquitous digital traces of natural language data and corresponding affect-linked language variations. Commercial interest in leveraging these new data with AI for affect inferences is also on the rise. However, due to the challenges associated with collecting data on subjective affect experience and corresponding language samples, previous research studies and commercial products have mostly relied on data sets from labelled text or enacted speech and, thereby, are focused on affect expression. This work leverages new smartphone-based data collection methods to collect self-reports on in-situ subjective affect experience and corresponding language samples in the wild to investigate between-person differences and within-person fluctuations in affect experience. The present dissertation aims to achieve three goals: (1) to investigate if between-person differences and within-person fluctuations in subjective affect experience are associated with and predictable from cues in spoken and written natural language, (2) to identify specific language characteristics, such as the use of specific word categories or voice parameters, that are associated with and predictive of affect experience, and (3) to analyze the influence of the context of language production on the associations and predictions of affect experience from natural language. This work comprises two empirical studies that analyze self-reports on subjective affect experience and natural language data collected with smartphones. Study 1 investigates predictions of between-person differences and within-person fluctuations in subjective momentary affect experience in more than 23,000 speech samples from over 1,000 participants in two data sets from Germany and the United States.
In contrast to voice acoustics, which contain limited predictive information for affective arousal, state-of-the-art word embeddings yield significant above-chance predictions for affective arousal and valence. Moreover, interpretable machine learning methods are used to identify those voice features (i.e., loudness and spectral features) that are most predictive of affect experience. Finally, the work suggests that affect predictions from voice cues from semi-structured free speech are superior to those from read-out predefined sentences and that the emotional sentiment of the spoken content has no effect on affect predictions from voice cues. Study 2 analyzes patterns in written language data logged through smartphones' keyboards to investigate how between-person differences and within-person fluctuations in affect experience manifest in and are predictable from logged text data across different time frames and communication contexts. From a data set of more than 10 million typed words, features regarding typing dynamics, word use based on word dictionaries, and emoji and emoticon use are computed. From the data, distinct affect-linked language variations across communication contexts (private messaging versus public posting) and time frames (trait, weekly, daily, momentary) are identified (e.g., the use of the 1st person singular). Predictions of affect experience from machine learning algorithms, however, are not significantly better than chance. Results of this study highlight the challenges of using occurrence counts, such as word dictionaries, for the assessment of subjective affect experience. By leveraging novel smartphone-based experience sampling and on-device language data collection in everyday life, the present dissertation shows how characteristics of spoken and written language are associated with and predictive of subjective affect experience.
Thereby, this work highlights the utility of smartphones for investigating subjective affect experience in natural language in the wild, overcoming the caveats of prior research methods. Prediction results, however, challenge the optimistic prediction performances reported in prior works on the recognition of affect expression experience. Using statistical methods from the areas of description, prediction, and explanation, the present dissertation also reveals specific affect-linked language characteristics. Finally, results underline the relevance of the context of language production for language characteristics and corresponding affect predictions. The promising applications and potential future directions of this technology come with multiple challenges with regard to the conceptualization of affect, interdisciplinarity, ethics, and data privacy and security. If these challenges can be overcome, natural language analysis based on data collected with smartphones represents a promising tool to monitor affective well-being and to advance the affective sciences.