
    Negative vaccine voices in Swedish social media

    Vaccinations are one of the most significant interventions in public health, but vaccine hesitancy raises concerns among a portion of the population in many countries, including Sweden. Since discussions on vaccine hesitancy often take place on social networking sites, data from Swedish social media are used to study and quantify the sentiment among the discussants on the vaccination-or-not topic during phases of the COVID-19 pandemic. Of all the posts analyzed, a majority showed a stronger negative sentiment, which prevailed throughout the examined period, with some spikes due to certain vaccine-related events distinguishable in the results. Sentiment analysis can be a valuable tool to track public opinions regarding the use, efficacy, safety, and importance of vaccination

    Marginal contrast in loanword phonology: Production and perception

    Though Dutch is usually described as lacking a voicing contrast at the velar place of articulation, due to intense language contact and heavy lexical borrowing, a contrast between /k/ and /g/ has recently been emerging. We explored the status of this contrast in Dutch speakers in both production and perception. We asked participants to produce loanwords containing a /g/ in the source language (e.g., goal) and found a range of productions, including a great many unadapted [g] tokens. We also tested the same speakers on their perception of the emerging [k] ~ [g] contrast and found that our participants were able to discriminate the emerging contrast well. We additionally explored the possibility that those speakers who use the new contrast more in production are also better at perceiving it, but we did not observe strong evidence of such a link. Overall, our results indicate that the adoption of the new sound is well advanced in the population we tested, but is still modulated by individual-level factors. We hold that contrasts emerging through borrowing, like other phonological contrasts, are subject to perceptual and functional constraints, and that these and other ‘marginal contrasts’ must be considered as full-fledged parts of phonology

    Hypocoristics: a derivational problem

    This study investigates the two major schools of linguistics, formal and functional. It takes earlier versions of Generative Theory as the representative of formal linguistics and contrasts them with Skousen’s computational model, which is taken as the representative of functional linguistics. Each theory is described and evaluated by considering how it can be used to analyse hypocoristic data. A description of hypocoristics for 165 names collected from Kuwaiti Arabic speakers was the basis for the analysis. The data were first given a general description to show how they can be accounted for in the two theories. The first approach was a rule-based approach used previously with Jordanian Arabic hypocoristics, which uses Semitic root-and-pattern morphology. The second approach was also rule-based and employed phonological processes to account for the derivation. The two were considered part of formal theories of analysis. The functional analysis, which uses a computational model that employs phonological features defined over statistically driven frequencies, was then used to model the data. An evaluation of the model, which showed low success rates, led to a change of model and to the presentation of an alternative hybrid model that utilises both rules and analogy. The model was inspired by a rule-based theory which was not fleshed out; analogy was used to flesh it out and place it within a usage-based theory of language. Finally, the thesis ends with an open evaluative stance, calling for further research on computational models from a computational perspective rather than a linguistic one

    New approach for Arabic named entity recognition on social media based on feature selection using genetic algorithm

    Many features can be extracted from the massive volume of data of different types that is available nowadays on social media. The growing demand for multimedia applications was an essential factor in this regard, particularly in the case of text data. Often, using the full feature set for each of these activities can be time-consuming and can also negatively impact performance. Because of the large number of features, it is challenging to find a subset that is useful for a given task. In this paper, we employ a feature selection approach using a genetic algorithm to identify the optimized feature set. Afterward, the best combination from the optimal feature set is used to identify and classify Arabic named entities (NEs) based on support vector machines. Experimental results show that our system reaches state-of-the-art performance for Arabic NER on social media and significantly outperforms previous systems

    Word Knowledge and Word Usage

    Word storage and processing define a multi-factorial domain of scientific inquiry whose thorough investigation goes well beyond the boundaries of traditional disciplinary taxonomies and requires the synergic integration of a wide range of methods, techniques, and empirical and experimental findings. The present book approaches a few central issues concerning the organization, structure and functioning of the Mental Lexicon by asking domain experts to look at common, central topics from complementary standpoints and to discuss the advantages of developing converging perspectives. The book will explore the connections between computational and algorithmic models of the mental lexicon, word frequency distributions and information-theoretical measures of word families, statistical correlations across psycho-linguistic and cognitive evidence, principles of machine learning, and integrative brain models of word storage and processing. The main goal of the book will be to map out the landscape of future research in this area, to foster the development of interdisciplinary curricula, and to help single-domain specialists understand and address issues and questions as they are raised in other disciplines

    Unsupervised learning of Arabic non-concatenative morphology

    Unsupervised approaches to learning the morphology of a language play an important role in computer processing of language from a practical and theoretical perspective, due to their minimal reliance on manually produced linguistic resources and human annotation. Such approaches have been widely researched for the problem of concatenative affixation, but less attention has been paid to the intercalated (non-concatenative) morphology exhibited by Arabic and other Semitic languages. The aim of this research is to learn the root and pattern morphology of Arabic, with accuracy comparable to manually built morphological analysis systems. The approach is kept free from human supervision or manual parameter settings, assuming only that roots and patterns intertwine to form a word. Promising results were obtained by applying a technique adapted from previous work in concatenative morphology learning, which uses machine learning to determine relatedness between words. The output, with probabilistic relatedness values between words, was then used to rank all possible roots and patterns to form a lexicon. Analysis using triliteral roots resulted in correct root identification accuracy of approximately 86% for inflected words. Although the machine learning-based approach is effective, it is conceptually complex. An alternative, simpler and computationally efficient approach was therefore devised to obtain morpheme scores based on comparative counts of roots and patterns. In this approach, root and pattern scores are defined in terms of each other in a mutually recursive relationship, converging to an optimized morpheme ranking. This technique gives slightly better accuracy while being conceptually simpler and more efficient. The approach, after further enhancements, was evaluated on a version of the Quranic Arabic Corpus, attaining a final accuracy of approximately 93%. A comparative evaluation shows this to be superior to two existing, well-used, manually built Arabic stemmers, thus demonstrating the practical feasibility of unsupervised learning of non-concatenative morphology

    One Model to Rule them all: Multitask and Multilingual Modelling for Lexical Analysis

    When learning a new skill, you take advantage of your preexisting skills and knowledge. For instance, if you are a skilled violinist, you will likely have an easier time learning to play cello. Similarly, when learning a new language you take advantage of the languages you already speak. For instance, if your native language is Norwegian and you decide to learn Dutch, the lexical overlap between these two languages will likely benefit your rate of language acquisition. This thesis deals with the intersection of learning multiple tasks and learning multiple languages in the context of Natural Language Processing (NLP), which can be defined as the study of computational processing of human language. Although these two types of learning may seem different on the surface, we will see that they share many similarities. The traditional approach in NLP is to consider a single task for a single language at a time. However, recent advances allow for broadening this approach by considering data for multiple tasks and languages simultaneously. This is an important approach to explore further, as the key to improving the reliability of NLP, especially for low-resource languages, is to take advantage of all relevant data whenever possible. In doing so, the hope is that in the long term, low-resource languages can benefit from the advances made in NLP, which are currently to a large extent reserved for high-resource languages. This, in turn, may then have positive consequences for, e.g., language preservation, as speakers of minority languages will be under less pressure to use high-resource languages. In the short term, answering the specific research questions posed should be of use to NLP researchers working towards the same goal. Comment: PhD thesis, University of Groningen

    Exceptionality and derived-environment effects: A comparison of Korean and Turkish
