13 research outputs found

    Sentiment Analysis Using Averaged Weighted Word Vector Features

    People use the world wide web heavily to share their experience with entities such as products, services, or travel destinations. Texts that provide online feedback in the form of reviews and comments are essential to consumer decisions. These comments are a valuable source that may be used to measure satisfaction with products or services. Sentiment analysis is the task of identifying opinions expressed in such text fragments. In this work, we develop two methods that combine different types of word vectors to learn and estimate the polarity of reviews. We build average review vectors from word vectors and add weights to these review vectors using word frequencies in positive and negative sensitivity-tagged reviews. We applied the methods to several datasets from different domains that are used as standard benchmarks for sentiment analysis. We ensemble the techniques with each other and with existing methods, and we compare them with the approaches in the literature. The results show that our approaches outperform the state-of-the-art success rates.
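    The weighted averaging described in this abstract can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the log-ratio weight below, the toy two-dimensional word vectors, and the function names `polarity_weights` and `review_vector` are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def polarity_weights(pos_reviews, neg_reviews):
    """Assumed weighting: words whose counts differ strongly between
    positive and negative reviews get a weight above 1."""
    pos_counts = Counter(w for r in pos_reviews for w in r.split())
    neg_counts = Counter(w for r in neg_reviews for w in r.split())
    vocab = set(pos_counts) | set(neg_counts)
    return {
        w: 1.0 + abs(math.log((pos_counts[w] + 1) / (neg_counts[w] + 1)))
        for w in vocab
    }


def review_vector(review, word_vectors, weights=None):
    """Weighted average of the vectors of the known words in a review.
    With weights=None this is the plain average."""
    tokens = [t for t in review.split() if t in word_vectors]
    if not tokens:
        return None
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for t in tokens:
        w = 1.0 if weights is None else weights.get(t, 1.0)
        weight_sum += w
        for i, x in enumerate(word_vectors[t]):
            total[i] += w * x
    return [x / weight_sum for x in total]


# Hypothetical toy vectors: the first dimension loosely encodes polarity.
word_vectors = {
    "great": [1.0, 0.0],
    "awful": [-1.0, 0.0],
    "the": [0.0, 1.0],
    "hotel": [0.0, 0.5],
}
pos = ["great hotel", "the great hotel"]
neg = ["awful hotel", "the awful hotel"]

weights = polarity_weights(pos, neg)
plain = review_vector("the great hotel", word_vectors)
weighted = review_vector("the great hotel", word_vectors, weights)
```

    In this toy setup the weighted vector leans further toward the polarity dimension than the plain average, because "great" occurs only in positive reviews while "the" and "hotel" occur equally in both. The resulting review vectors would then be fed to a classifier of one's choice.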

    Whole Genome Sequencing of Turkish Genomes Reveals Functional Private Alleles and Impact of Genetic Interactions with Europe, Asia and Africa

    Background: Turkey is a crossroads of major population movements throughout history and has been a hotspot of cultural interactions. Several studies have investigated the complex population history of Turkey through a limited set of genetic markers. However, to date, no study has assessed genetic variation at the whole genome level using whole genome sequencing. Here, we present whole genome sequences of 16 Turkish individuals resequenced at high coverage (32×-48×). Results: We show that the genetic variation of the contemporary Turkish population clusters with South European populations, as expected, but also shows signatures of a relatively recent contribution from ancestral East Asian populations. In addition, we document a significant enrichment of non-synonymous private alleles, consistent with recent observations in European populations. A number of variants associated with skin color and total cholesterol levels show frequency differentiation between the Turkish populations and European populations. Furthermore, we have analyzed the 17q21.31 inversion polymorphism region (MAPT locus) and found an increased allele frequency of 31.25% for the H1/H2 inversion polymorphism, compared with about 25% in European populations. Conclusion: This study provides the first map of common genetic variation from 16 western Asian individuals and thus helps fill an important geographical gap in analyzing natural human variation and human migration. Our data will help develop population-specific experimental designs for studies investigating disease associations and demographic history in Turkey.

    Edition 1.2 of the PARSEME Shared Task on Semi-supervised Identification of Verbal Multiword Expressions

    We present edition 1.2 of the PARSEME shared task on identification of verbal multiword expressions (VMWEs). Lessons learned from previous editions indicate that VMWEs have low ambiguity, and that the major challenge lies in identifying test instances never seen in the training data. Therefore, this edition focuses on unseen VMWEs. We have split annotated corpora so that the test corpora contain around 300 unseen VMWEs, and we provide non-annotated raw corpora to be used by complementary discovery methods. We released annotated and raw corpora in 14 languages, and this semi-supervised challenge attracted 7 teams who submitted 9 system results. This paper describes the effort of corpus creation, the task design, and the results obtained by the participating systems, especially their performance on unseen expressions.
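    The notion of an unseen VMWE can be sketched as follows. The abstract does not define it, so this sketch assumes that an expression counts as unseen when its multiset of lemmas does not occur among the annotated training expressions. The function names and the toy data are hypothetical.

```python
def lemma_key(lemmas):
    """Order-insensitive, case-insensitive key for a multiset of lemmas."""
    return tuple(sorted(lemma.lower() for lemma in lemmas))


def unseen_vmwes(train_vmwes, test_vmwes):
    """Return test expressions whose lemma multiset never appears in training."""
    seen = {lemma_key(e) for e in train_vmwes}
    return [e for e in test_vmwes if lemma_key(e) not in seen]


# Toy annotated expressions, each given as a list of lemmas.
train = [["take", "decision"], ["give", "up"]]
test = [["decision", "take"], ["kick", "bucket"]]

unseen = unseen_vmwes(train, test)
```

    Here "decision take" counts as seen despite the different word order, because only the lemma multiset is compared, while "kick bucket" is unseen.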

    Edition 1.1 of the PARSEME Shared Task on automatic identification of verbal multiword expressions

    This paper describes the PARSEME Shared Task 1.1 on automatic identification of verbal multiword expressions. We present the annotation methodology, focusing on changes from last year's shared task. Novel aspects include enhanced annotation guidelines, additional annotated data for most languages, corpora for some new languages, and new evaluation settings. Corpora were created for 20 languages, which are also briefly discussed. We report organizational principles behind the shared task and the evaluation metrics employed for ranking. The 17 participating systems, their methods and obtained results are also presented and analysed.

    The Dawn of the Human-Machine Era: A forecast of new and emerging language technologies.

    The 'human-machine era' is coming soon: a time when technology is integrated with our senses, not confined to mobile devices. The hardware will move from our hands into our eyes and ears. Intelligent eyewear and earwear will be able to translate another person's words, and make it look and sound like they were talking to you in your language. Technology will mediate what we see, hear and say, in real time. In addition, we will be having increasingly complex conversations with smart devices. This is not science fiction or marketing hype. These devices are currently in prototype, set for widespread consumer adoption in the coming years. All this will disrupt and transform our use and understanding of language use. Are we ready? A new EU 'COST Action' (https://cost.eu) research network 'Language in the Human-Machine Era' (LITHME), with members from 52 countries, explores how such technological advances are likely to change our everyday communication, and ultimately language itself. As a first major collaborative effort, LITHME has published an open access report 'The Dawn of the Human-Machine Era: A Forecast of New and Emerging Language Technologies': https://doi.org/10.17011/jyx/reports/20210518/1. Accessible to a wide audience, the report brings together insights from specialists in the fields of language technology and linguistic research. The forecast report was authored by 52 researchers, and edited by LITHME's Chair Dave Sayers (University of Jyväskylä, Finland), Vice-Chair Sviatlana Höhn (University of Luxembourg), and the Chair of LITHME's Computational Linguistics working group Rui Sousa Silva (University of Porto, Portugal). It describes the current state and probable futures of various language technologies – for written, spoken, haptic and signed modalities of language. The publication is intended to be both authoritative and accessible, aimed at language and technology professionals but also policymakers and the wider public. It describes how a range of new technologies will soon transform the way we use language, while discussing the software powering these advances behind the scenes, as well as consumer devices like Augmented Reality eyepieces and immersive Virtual Reality spaces. The report also shines a light on critical issues such as inequality of access to technologies, privacy and security, and new forms of deception and crime. It is a result of unique collaboration, as LITHME brings together people from different directions in language research who would not otherwise meet or collaborate. LITHME has eight thematic working groups; and members from each working group have contributed to the report.

    The Dawn of the Human-Machine Era: A forecast of new and emerging language technologies

    New language technologies are coming, thanks to the huge and competing private investment fuelling rapid progress; we can either understand and foresee their effects, or be taken by surprise and spend our time trying to catch up. This report sketches out some transformative new technologies that are likely to fundamentally change our use of language. Some of these may feel unrealistically futuristic or far-fetched, but a central purpose of this report - and the wider LITHME network - is to illustrate that these are mostly just the logical development and maturation of technologies currently in prototype. But will everyone benefit from all these shiny new gadgets? Throughout this report we emphasise a range of groups who will be disadvantaged and issues of inequality. Important issues of security and privacy will accompany new language technologies. A further caution is to re-emphasise the current limitations of AI. Looking ahead, we see many intriguing opportunities and new capabilities, but a range of other uncertainties and inequalities. New devices will enable new ways to talk, to translate, to remember, and to learn. But advances in technology will reproduce existing inequalities among those who cannot afford these devices, among the world’s smaller languages, and especially for sign language. Debates over privacy and security will flare and crackle with every new immersive gadget. We will move together into this curious new world with a mix of excitement and apprehension - reacting, debating, sharing and disagreeing as we always do. Plug in, as the human-machine era dawns.