132 research outputs found

    Similarities and differences of duplicity in English and Vietnamese

    Duplicity is a phenomenon that occurs when two or more words have the same tone or grammatical structure. This phenomenon occurs in many languages, including English and Vietnamese. In this article, the author examines the similarities and differences of duplicity in English and Vietnamese. As in other languages, in addition to the universal rules, English has many "abnormal" phenomena that strike users as "absurd" or "contradictory", including the phenomenon of coincidence. Showing the validity and value of these "unusual" structures is a difficult but extremely interesting problem that requires research and clarification. The article addresses the following issues: the theoretical basis; the similarities and differences of the phenomenon of duplicity in English and Vietnamese; and the application of duplicity in learning and teaching English and Vietnamese.


    This paper focuses on the Scratch programming language and traces its mathematical and linguistic features. Considering Scratch within a linguistic paradigm, with a focus on its structural, semantic, and syntactic features and the logic of its narration, this research attempts to clarify the specifics of the language and correlate them with features of the English language. The global integration of ideas and sciences underlines the crucial importance of the conglomerate of programming and language. Human-computer interfaces, software systems, and the development of various programming languages depend on the well-balanced structure, shape, logic, and appearance of the actual code. The dynamic characteristics of the Scratch programming environment sustain the creation of interactive and media-rich projects. The expansion of Scratch as a medium for animated stories, music videos, science projects, tutorials, and other content necessitates a multifaceted analysis of this programming environment and motivates research on Scratch from the mathematical and linguistic perspective as one possible projection on various aspects of the language.

    Brazilian funk as the herald of a new social order: a semiotic analysis of the internet music video “Beijinho no ombro” and its reception in social media

    The present article analyzes the music video “Beijinho no ombro”, a major Brazilian social media phenomenon that reached more than 9 million YouTube views in 3 months in 2014. It discusses the processes by which homologies between categories of expression and content are established (Hjelmslev’s so-called “commutations”) and suspended (the Danish linguist’s concept of “syncretism”; Hjelmslev, 2003) in the audiovisual text, and the effects of meaning created thereby. The analytical treatment also assimilates some of Éric Landowski’s contributions to the discussion of regimes of intersubjective interaction (Landowski, 1997, 2006) and their impact on the study of the so-called states of soul developed in depth by Greimas and Fontanille in their Sémiotique des passions (Greimas & Fontanille, 1993). The analysis is also intended to illustrate a methodological approach proposed by the author that may be applied to various corpora in the audiovisual repertory. This approach, a natural extension of Greimas’ treatment of the plane of content and Floch’s developments on the plane of expression (Floch, 1984, 1993), contributes a methodology that, departing from the figures of expression and their homologations and semi-symbolic relations with categories of content, then detects their projections in each of the three levels of the generative path.
Thus, not only can the role of the means of manifestation in the generation of effects of meaning be better evaluated, but the possibilities of a generative approach that includes the textual structures, rather than explicitly excluding them as the Dictionary of semiotics does (Greimas & Courtés, 1991:208), can also be further discussed.

    This! Identifying new sentiment slang through orthographic pleonasm online: Yasss slay gorg queen ilysm

    This is an accepted manuscript of an article published by IEEE in IEEE Intelligent Systems on 13 Sept 2021, available online: https://ieeexplore.ieee.org/document/9536263 The accepted version of the publication may differ from the final published version. Identifying neologisms is important for natural language processing of social web text when informal language is standard and youth slang is common. For example, failing to identify neologisms can reduce the accuracy of lexical sentiment analysis if opinions are frequently expressed in words that are too new to be in the sentiment dictionary. This article proposes a method based on orthographic pleonasm to identify emotion-related neologisms in the social web: finding the words with the largest number of different letter-repetition spelling variations. For this method, non-dictionary words are extracted from a large social web corpus, spelling standardisation is applied, and then words are ranked in decreasing order of spelling variation frequency. Words with the most spelling variations are then KWIC-analysed for semantic context. Applied to a collection of comments on YouTube influencers, this method found neologisms like slay and early as positive terms, mixed with traditional sentiment words, exclamations, and nouns. Although orthographic pleonasm was originally used to express the speaker’s rhythm and tone of voice, it is also used for initialisms in a way that is difficult to vocalise. The method is therefore a practical way to identify new sentiment slang, including both normal words and initialisms.
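
The ranking step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `standardise` and `rank_by_spelling_variation`, the toy token list, and the toy dictionary are hypothetical, and collapsing every run of a repeated character to a single character is only one assumed form of "spelling standardisation" (it would also merge legitimately doubled letters).

```python
import re
from collections import defaultdict


def standardise(word: str) -> str:
    """Assumed standardisation: lowercase, then collapse each run of a
    repeated character to one character (e.g. 'Yasssss' -> 'yas')."""
    return re.sub(r"(.)\1+", r"\1", word.lower())


def rank_by_spelling_variation(tokens, dictionary):
    """Group non-dictionary tokens by standardised form and rank the forms
    by how many distinct spellings were observed (most variants first)."""
    variants = defaultdict(set)
    for token in tokens:
        token = token.lower()
        if token in dictionary:
            continue  # keep only non-dictionary words
        variants[standardise(token)].add(token)
    # Sort by descending variant count, then alphabetically for stable output.
    return sorted(
        ((form, len(spellings)) for form, spellings in variants.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical toy data, for illustration only.
tokens = ["yasss", "yass", "yasssss", "slaaay", "slayyy", "the", "queen", "ilysm"]
dictionary = {"the", "queen"}
print(rank_by_spelling_variation(tokens, dictionary))
# [('yas', 3), ('slay', 2), ('ilysm', 1)]
```

In the method as described, the top-ranked forms would then be inspected in context (KWIC analysis) rather than accepted automatically.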

    Methods, algorithms, and software for detecting slang in online communities of social media

    Today, slang is an integral part of the language of Internet users in social networks. The meanings of slang words often define the key semantics of sentences and can play a critical role in natural language processing (NLP) applications. Thus, taking slang into account can improve the effectiveness of semantic analysis of texts. The purpose of this work is to develop methods and algorithms for the automated detection of the characteristic slang of online communities in Russian-language social media. Texts were classified as containing or not containing slang using neural networks.

    Exploring subdomain variation in biomedical language.

    BACKGROUND: Applications of Natural Language Processing (NLP) technology to biomedical texts have generated significant interest in recent years. In this paper we identify and investigate the phenomenon of linguistic subdomain variation within the biomedical domain, i.e., the extent to which different subject areas of biomedicine are characterised by different linguistic behaviour. While variation at a coarser domain level such as between newswire and biomedical text is well-studied and known to affect the portability of NLP systems, we are the first to conduct an extensive investigation into more fine-grained levels of variation. RESULTS: Using the large OpenPMC text corpus, which spans the many subdomains of biomedicine, we investigate variation across a number of lexical, syntactic, semantic and discourse-related dimensions. These dimensions are chosen for their relevance to the performance of NLP systems. We use clustering techniques to analyse commonalities and distinctions among the subdomains. CONCLUSIONS: We find that while patterns of inter-subdomain variation differ somewhat from one feature set to another, robust clusters can be identified that correspond to intuitive distinctions such as that between clinical and laboratory subjects. In particular, subdomains relating to genetics and molecular biology, which are the most common sources of material for training and evaluating biomedical NLP tools, are not representative of all biomedical subdomains. We conclude that an awareness of subdomain variation is important when considering the practical use of language processing applications by biomedical researchers

    Astrophysics Titles in Scientific American Magazine (1990-2014): Linguistic and Discourse Practices

    We analyze Astrophysics titles published in Scientific American Magazine in the period 1990-2014 and compare them with Astrophysics titles of specialized journals. Our main results show that titles published in Scientific American are short, clear, and direct, with low lexical density and little terminology. They mainly consist of simple and nominal constructions with few adjectives and compound groups. The predominance of nominal compounds and the high number of verbal titles and definite articles imply that popularized science titles mainly deal with global and well-established concepts. Pragmatic and rhetorical strategies are common in Astrophysics Scientific American titles in order to appeal to multiple audiences and invite them to use their cultural background knowledge to grasp the actual meaning. Although pragmatic and rhetorical mechanisms overlap in some titles, rhetorical devices seem to prevail over pragmatic ones. All in all, however, both types of devices reveal a growing trend over time.

    Attitudes towards English usage in the late modern period: the case of phrasal verbs

    Phrasal verbs are an intrinsic part of Late Modern English, and are found in both informal and colloquial language (check out, listen up) and more formal styles (a thesis might set out some problems and then sum up the main points). They are highly productive: 'up' can be added to almost any verb to signify goal or end-point (read up, finish up, eat up, meet up, fatten up); and once a phrasal verb has been coined, a conversion often follows (for example, the verb 'phone in' was first recorded in 1946, and the noun 'phone-in' in 1967; 'dumb down' was coined in 1933, and we read of 'dumbed-down' material in 1982). Perhaps because of their pervasiveness, phrasal verbs are frequently criticized (although occasionally praised) in Late Modern English texts about language. The purpose of this thesis is to examine such attitudes in three strands. Firstly, over one hundred language texts (grammars, dictionaries, and usage manuals, among others, from 1750 to 1970) were examined to discover how phrasal verbs were recognized and classified in Late Modern English. Secondly, these materials were analyzed in order to find out how attitudes towards phrasal verbs in English developed in relation to broader attitudes towards language in the Late Modern period. Thirdly, phrasal verb usage in A Representative Corpus of Historical English Registers, a corpus of British and American English from 1650 to 1990, was analyzed to determine how such attitudes affect usage. It will be shown that attitudes towards phrasal verbs reflect various strands of language ideology, including opinions about Latinate as opposed to native vocabulary; ideals relating to etymology, polysemy, and redundancy; reactions to neologisms; and attitudes towards language variety. Furthermore, it will be suggested that in the case of certain redundant combinations such as 'return back' and 'raise up', proscriptions of phrasal verbs did have an effect on their usage in the Late Modern period

    Automated Extraction of Protein Mutation Impacts from the Biomedical Literature

    Mutations as sources of evolution have long been the focus of attention in the biomedical literature. Accessing mutational information and its impact on protein properties facilitates research in various domains, such as enzymology and pharmacology. However, manually reading through the rich and fast-growing repository of biomedical literature is expensive and time-consuming. A number of manually curated databases, such as BRENDA (http://www.brenda-enzymes.org), try to index and provide this information; yet the provided data seems to be incomplete. Thus, there is a growing need for automated approaches to extract this information. In this work, we present a system to automatically extract and summarize impact information from protein mutations. Our system's extraction module is split into subtasks: organism analysis, mutation detection, protein property extraction, and impact analysis. Organisms, as sources of proteins, need to be extracted to help disambiguate genes and proteins. Thus, our system extracts and grounds organisms to NCBI. We detect mutation series to correctly ground our detected impacts. Our system also extracts the affected protein properties as well as the magnitude of the effects. The output of our system is populated to an OWL-DL ontology, which can then be queried to provide structured information. The performance of the system is evaluated on both external and internal corpora and databases. The results show the reliability of the approaches. Our organism extraction system achieves a precision and recall of 95% and 94% and a grounding accuracy of 97.5% on the OT corpus. On the manually annotated Linnaeus-100 corpus, the results show a precision and recall of 99% and 97% and a grounding accuracy of 97.4%. In the impact detection task, our system achieves a precision and recall of 70.4%-71.8% and 71.2%-71.3% on manually annotated documents. Our system grounds the detected impacts with an accuracy of 70.1%-71.7% on the manually annotated documents and a precision and recall of 57%-57.5% and 82.5%-84.2% against the BRENDA data.