65 research outputs found
The Strengths and Pitfalls of Large-Scale Text Mining for Literary Studies
This paper gives an overview of the opportunities and challenges of using large-scale text mining to answer research questions from the humanities in general and literary studies in particular. We discuss a data-intensive research methodology and how different views of digital text affect the answers to research questions. We then turn to results derived from text mining, how these results can be evaluated, and how they relate to hypotheses and research questions. Finally, we discuss some pitfalls of computational literary analysis and give pointers as to how they can be avoided.
Peer reviewed
Computational modeling of semantic change
In this chapter we provide an overview of computational modeling of semantic change using large and semi-large textual corpora. We aim to provide a key for the interpretation of relevant methods and evaluation techniques, and also provide insights into important aspects of the computational study of semantic change. We discuss the pros and cons of different classes of models with respect to the properties of the data from which one wishes to model semantic change, and which avenues are available to evaluate the results.
Comment: This chapter has been submitted to the Routledge Handbook of Historical Linguistics, 2nd Edition.
Time-Out: Temporal Referencing for Robust Modeling of Lexical Semantic Change
Code produced for this paper is available at: https://github.com/Garrafao/TemporalReferencing
State-of-the-art models of lexical semantic change detection suffer from noise stemming from vector space alignment. We have empirically tested the Temporal Referencing method for lexical semantic change and show that, by avoiding alignment, it is less affected by this noise. We show that, trained on a diachronic corpus, the skip-gram with negative sampling architecture with Temporal Referencing outperforms alignment models on a synthetic task as well as a manual test set. We introduce a principled way to simulate lexical semantic change and to systematically control for possible biases.
Peer reviewed
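To make the Temporal Referencing idea concrete, the sketch below shows one way it can be implemented with gensim's skip-gram with negative sampling: occurrences of target words are replaced by time-tagged tokens (e.g. word_t1, word_t2) so that a single model is trained on the whole diachronic corpus and no vector space alignment is needed; change is then read off as the cosine distance between a target's time-tagged vectors. This is a minimal sketch following the method as described in the abstract, not the authors' reference implementation (see the linked repository for that); the corpus files, target words, period labels and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of Temporal Referencing with gensim SGNS (not the reference
# implementation). Corpus paths, targets and period labels are assumptions.
from gensim.models import Word2Vec
from scipy.spatial.distance import cosine

targets = {"plane", "gay"}                                 # hypothetical target words
periods = {"t1": "corpus_t1.txt", "t2": "corpus_t2.txt"}   # hypothetical corpus slices

def tagged_sentences():
    """Yield tokenized sentences; target tokens get a time-period suffix,
    while all other tokens stay untagged, so context vectors are shared."""
    for period, path in periods.items():
        with open(path, encoding="utf-8") as f:
            for line in f:
                tokens = line.lower().split()
                yield [f"{t}_{period}" if t in targets else t for t in tokens]

# One SGNS model over the whole diachronic corpus: no post-hoc alignment step.
model = Word2Vec(sentences=list(tagged_sentences()),
                 vector_size=100, window=5, min_count=5, sg=1, negative=5)

# Degree of change per target = cosine distance between its time-tagged vectors.
for w in targets:
    w1, w2 = f"{w}_t1", f"{w}_t2"
    if w1 in model.wv and w2 in model.wv:
        print(w, cosine(model.wv[w1], model.wv[w2]))
```

Because all non-target tokens are shared across periods, the time-tagged vectors of a target live in the same space by construction, which is what removes the alignment step and the noise it introduces.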
The DURel Annotation Tool: Human and Computational Measurement of Semantic Proximity, Sense Clusters and Semantic Change
We present the DURel tool, which implements the annotation of semantic proximity between uses of words in an online, open-source interface. The tool supports standardized human annotation as well as computational annotation, building on recent advances with Word-in-Context models. Annotator judgments are clustered with automatic graph clustering techniques and visualized for analysis. This makes it possible to measure word senses with simple and intuitive micro-task judgments between use pairs, requiring minimal preparation effort. The tool offers additional functionalities to compare the agreement between annotators, guaranteeing the inter-subjectivity of the obtained judgments, and to calculate summary statistics giving insights into sense frequency distributions, semantic variation, or changes of senses over time.
Comment: EACL demo paper, 7 pages.
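As a rough illustration of the clustering step described above, the sketch below builds a small word usage graph from pairwise proximity judgments (on a DURel-style 1-4 relatedness scale), keeps edges judged as related, and groups uses into sense clusters, from which a sense frequency distribution can be read off. It is not the DURel tool's own pipeline: networkx's greedy modularity clustering stands in here for the graph clustering used by the tool, and the judgment data is invented for illustration.

```python
# Sketch: cluster pairwise use judgments into sense clusters. NOT the DURel
# pipeline; greedy modularity clustering is a simple stand-in, and the
# judgments below are invented for illustration.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# (use_id_1, use_id_2, mean proximity judgment on a 1-4 relatedness scale)
judgments = [
    ("u1", "u2", 4.0), ("u1", "u3", 3.5), ("u2", "u3", 4.0),   # one sense
    ("u4", "u5", 3.5),                                          # another sense
    ("u1", "u4", 1.0), ("u2", "u5", 1.5), ("u3", "u4", 1.0),   # cross-sense pairs
]

# Build a word usage graph, keeping only edges judged as related (> 2.5),
# so that clusters correspond to groups of uses sharing a sense.
G = nx.Graph()
G.add_nodes_from({u for pair in judgments for u in pair[:2]})
for u, v, score in judgments:
    if score > 2.5:
        G.add_edge(u, v, weight=score)

clusters = list(greedy_modularity_communities(G, weight="weight"))

# Sense frequency distribution: relative size of each inferred cluster.
total = G.number_of_nodes()
for i, cluster in enumerate(clusters, 1):
    print(f"sense {i}: uses={sorted(cluster)} freq={len(cluster) / total:.2f}")
```

Comparing such sense frequency distributions computed over uses drawn from different time periods is one way the clustered judgments can be turned into statements about semantic variation or change.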
Computational approaches to semantic change
Semantic change — how the meanings of words change over time — has preoccupied scholars since well before modern linguistics emerged in the late 19th and early 20th century, ushering in a new methodological turn in the study of language change. Compared to changes in sound and grammar, semantic change is the least understood. Ever since, the study of semantic change has progressed steadily, accumulating a vast store of knowledge for over a century, encompassing many languages and language families. Historical linguists also early on realized the potential of computers as research tools, with papers at the very first international conferences in computational linguistics in the 1960s. Such computational studies still tended to be small-scale, method-oriented, and qualitative. However, recent years have witnessed a sea-change in this regard. Big-data empirical quantitative investigations are now coming to the forefront, enabled by enormous advances in storage capability and processing power. Diachronic corpora have grown beyond imagination, defying exploration by traditional manual qualitative methods, and language technology has become increasingly data-driven and semantics-oriented. These developments present a golden opportunity for the empirical study of semantic change over both long and short time spans. A major challenge presently is to integrate the hard-earned knowledge and expertise of traditional historical linguistics with cutting-edge methodology explored primarily in computational linguistics. The idea for the present volume came out of a concrete response to this challenge. The 1st International Workshop on Computational Approaches to Historical Language Change (LChange'19), at ACL 2019, brought together scholars from both fields. This volume offers a survey of this exciting new direction in the study of semantic change, a discussion of the many remaining challenges that we face in pursuing it, and considerably updated and extended versions of a selection of the contributions to the LChange'19 workshop, addressing both more theoretical problems — e.g., discovery of "laws of semantic change" — and practical applications, such as information retrieval in longitudinal text archives.