360 research outputs found

A Comparison of Feature-Based and Neural Scansion of Poetry

Authors: Agirrezabal Manex; Alegria Iñaki; Hulden Mans
Publication date: 01/01/2017

Automatic analysis of poetic rhythm is a challenging task that involves linguistics, literature, and computer science. When the language to be analyzed is known, rule-based systems or data-driven methods can be used. In this paper, we analyze poetic rhythm in English and Spanish. We show that the representations of data learned from character-based neural models are more informative than the ones from hand-crafted features, and that a Bi-LSTM+CRF model produces state-of-the-art accuracy on scansion of poetry in two languages. Results also show that information about whole-word structure, and not just independent syllables, is highly informative for performing scansion. Comment: RANLP 201

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System

ALBERTI, a Multilingual Domain Specific Language Model for Poetry Analysis

Authors: de la Rosa Javier; González-Blanco Elena; Pozo Álvaro Pérez; Ros Salvador
Publication date: 03/07/2023

The computational analysis of poetry is limited by the scarcity of tools to automatically analyze and scan poems. In multilingual settings, the problem is exacerbated, as scansion and rhyme systems exist only for individual languages, making comparative studies very challenging and time-consuming. In this work, we present Alberti, the first multilingual pre-trained large language model for poetry. Through domain-specific pre-training (DSP), we further trained multilingual BERT on a corpus of over 12 million verses from 12 languages. We evaluated its performance on two structural poetry tasks: Spanish stanza type classification, and metrical pattern prediction for Spanish, English, and German. In both cases, Alberti outperforms multilingual BERT and other transformer-based models of similar sizes, and even achieves state-of-the-art results for German when compared to rule-based systems, demonstrating the feasibility and effectiveness of DSP in the poetry domain. Comment: Accepted for publication at SEPLN 2023: 39th International Conference of the Spanish Society for Natural Language Processin

arXiv.org e-Print Archive

Automatic Recognition of Arabic Poetry Meter from Speech Signal using Long Short-term Memory and Support Vector Machine

Author: Al-Talabani Abdulbasit K.
Publication venue: 'Koya University'
Publication date: 14/04/2020

The recognition of the poetry meter in spoken lines is a natural language processing application that aims to identify the pattern of stressed and unstressed syllables in a line of a poem. State-of-the-art studies include few works on the automatic recognition of Arud meters, all of which are text-based models; none is voice-based. Poetry meter recognition is not easy for an ordinary reader and is very difficult for a listener, so it is usually performed manually by experts. This paper proposes a model to detect the poetry meter from a single spoken line (“Bayt”) of an Arabic poem. The data used in this work comprise 230 samples collected from 10 Arabic poems, covering three meters and read by two speakers. The work extracts linear prediction cepstrum coefficient and Mel frequency cepstral coefficient (MFCC) features as a time-series input to the proposed long short-term memory (LSTM) classifier, and computes a global feature set from statistics of the features across all frames to feed the support vector machine (SVM) classifier. The results show that the SVM model achieves the highest accuracy in the speaker-dependent approach, improving results by 3% compared to the state-of-the-art studies, whereas in the speaker-independent approach, the MFCC features with the LSTM exceed the other proposed models.

ARO-THE SCIENTIFIC JOURNAL OF KOYA UNIVERSITY

Versification and Authorship Attribution

Authors: Plecháč Petr; Šeļa Artjoms
Publication venue: 'Charles University in Prague, Karolinum Press'
Publication date: 01/01/2021

The technique known as contemporary stylometry uses different methods, including machine learning, to discover a poem’s author based on features like the frequencies of words and character n-grams. However, there is one potential textual fingerprint stylometry tends to ignore: versification, or the very making of language into verse. Using poetic texts in three different languages (Czech, German, and Spanish), Petr Plecháč asks whether versification features like rhythm patterns and types of rhyme can help determine authorship. He then tests his findings on two unsolved literary mysteries. In the first, Plecháč distinguishes the parts of the Elizabethan verse play The Two Noble Kinsmen written by William Shakespeare from those written by his coauthor, John Fletcher. In the second, he seeks to solve a case of suspected forgery: how authentic was a group of poems first published as the work of the nineteenth-century Russian author Gavriil Stepanovich Batenkov? This book of poetic investigation should appeal to literary sleuths the world over.

CU Digital Repository

Directory of Open Access Books (DOAB)

Computational Stylistics in Poetry, Prose, and Drama

Publication venue: 'Walter de Gruyter GmbH'
Publication date: 30/01/2023

The contributions in this edited volume approach poetry, narrative, and drama from the perspective of Computational Stylistics. They exemplify methods of computational textual analysis and explore the possibility of computational generation of literary texts. The volume presents a range of computational and Natural Language Processing applications to literary studies, such as motif detection, network analysis, machine learning, and deep learning.

Directory of Open Access Books (DOAB)

OAPEN Library

Towards a Philological Metric through a Topological Data Analysis Approach

Authors: González Díaz Rocío; Gutiérrez Naranjo Miguel Ángel; Paluzo Hidalgo Eduardo
Publication venue: 'SAGE Publications'
Publication date: 01/01/2019

The canon of Baroque Spanish literature has been thoroughly studied with philological techniques. The major representatives of the poetry of this epoch are Francisco de Quevedo and Luis de Góngora y Argote. Literary experts commonly classify them into two different streams: Quevedo belongs to Conceptismo and Góngora to Culteranismo. Traditionally, although Quevedo is considered the most representative figure of Conceptismo, Lope de Vega is also considered to be, at least, closely related to this literary trend. In this paper, we use Topological Data Analysis techniques to provide a first approach to a metric distance between the literary styles of these poets. As a consequence, we reach results that are consistent with the literary experts’ criteria, locating the literary style of Lope de Vega closer to that of Quevedo than to that of Góngora.

idUS. Depósito de Investigación Universidad de Sevilla