A Comparison of Feature-Based and Neural Scansion of Poetry
Automatic analysis of poetic rhythm is a challenging task that involves
linguistics, literature, and computer science. When the language to be analyzed
is known, rule-based systems or data-driven methods can be used. In this paper,
we analyze poetic rhythm in English and Spanish. We show that the
representations of data learned from character-based neural models are more
informative than the ones from hand-crafted features, and that a
Bi-LSTM+CRF model produces state-of-the-art accuracy on scansion of poetry in
two languages. Results also show that information about whole-word
structure, and not just independent syllables, is highly informative for
performing scansion.
Comment: RANLP 201
ALBERTI, a Multilingual Domain Specific Language Model for Poetry Analysis
The computational analysis of poetry is limited by the scarcity of tools to
automatically analyze and scan poems. In a multilingual setting, the problem
is exacerbated as scansion and rhyme systems only exist for individual
languages, making comparative studies very challenging and time consuming. In
this work, we present Alberti, the first multilingual pre-trained
large language model for poetry. Through domain-specific pre-training (DSP), we
further trained multilingual BERT on a corpus of over 12 million verses from 12
languages. We evaluated its performance on two structural poetry tasks: Spanish
stanza type classification, and metrical pattern prediction for Spanish,
English and German. In both cases, Alberti outperforms multilingual
BERT and other transformer-based models of similar sizes, and even achieves
state-of-the-art results for German when compared to rule-based systems,
demonstrating the feasibility and effectiveness of DSP in the poetry domain.
Comment: Accepted for publication at SEPLN 2023: 39th International Conference
of the Spanish Society for Natural Language Processin
Automatic Recognition of Arabic Poetry Meter from Speech Signal using Long Short-term Memory and Support Vector Machine
The recognition of poetry meter in spoken lines is a natural language processing application that aims to identify the pattern of stressed and unstressed syllables in a line of a poem. State-of-the-art studies include few works on the automatic recognition of Arud meters, all of which are text-based models; none is voice-based. Poetry meter recognition is not easy for an ordinary reader and is very difficult for a listener, so it is usually performed manually by experts. This paper proposes a model to detect the poetry meter from a single spoken line (“Bayt”) of an Arabic poem. The work uses 230 samples collected from 10 Arabic poems, covering three meters and read by two speakers. Linear prediction cepstrum coefficient and Mel frequency cepstral coefficient (MFCC) features are extracted as time-series input to the proposed long short-term memory (LSTM) classifier. In addition, a global feature set, computed from statistics of the features across all frames, feeds a support vector machine (SVM) classifier. The results show that the SVM model achieves the highest accuracy in the speaker-dependent approach, improving results by 3% compared to state-of-the-art studies, whereas in the speaker-independent approach the MFCC features with LSTM outperform the other proposed models.
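The global feature set described in this abstract can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the abstract does not say which statistics are used, so the choice of mean, standard deviation, minimum and maximum is an assumption, and the function name `global_features` is hypothetical. The sketch collapses a variable-length sequence of per-frame coefficient vectors (for example MFCCs) into one fixed-length vector of the kind an SVM could take as input.

```python
from statistics import mean, pstdev


def global_features(frames):
    """Collapse per-frame coefficient vectors into one fixed-length vector.

    `frames` is a list of equal-length lists, one per audio frame.
    The four statistics below are an assumed choice for illustration;
    the abstract only says "some statistics of the features".
    """
    if not frames:
        raise ValueError("need at least one frame")
    # Transpose so that each tuple holds one coefficient across all frames.
    columns = list(zip(*frames))
    stats = []
    for col in columns:
        stats.extend([mean(col), pstdev(col), min(col), max(col)])
    return stats


# Two frames with two coefficients each yield 2 * 4 = 8 global features.
vector = global_features([[1.0, 2.0], [3.0, 6.0]])
```

The output length depends only on the number of coefficients per frame, not on the number of frames, which is what lets spoken lines of different durations share one classifier input size.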
Versification and Authorship Attribution
The technique known as contemporary stylometry uses different methods, including machine learning, to discover a poem’s author based on features like the frequencies of words and character n-grams. However, there is one potential textual fingerprint stylometry tends to ignore: versification, or the very making of language into verse. Using poetic texts in three different languages (Czech, German, and Spanish), Petr Plecháč asks whether versification features like rhythm patterns and types of rhyme can help determine authorship. He then tests his findings on two unsolved literary mysteries. In the first, Plecháč distinguishes the parts of the Elizabethan verse play The Two Noble Kinsmen written by William Shakespeare from those written by his coauthor, John Fletcher. In the second, he seeks to solve a case of suspected forgery: how authentic was a group of poems first published as the work of the nineteenth-century Russian author Gavriil Stepanovich Batenkov? This book of poetic investigation should appeal to literary sleuths the world over.
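One of the conventional stylometric features named in this description, character n-gram frequency, can be sketched as below. This is a minimal illustration rather than the book's method: the function name `char_ngram_freqs`, the lowercasing step and the default of n = 3 are all assumptions made for the example.

```python
from collections import Counter


def char_ngram_freqs(text, n=3):
    """Return relative frequencies of overlapping character n-grams.

    Lowercasing and the default n=3 are illustrative assumptions,
    not choices taken from the source.
    """
    text = text.lower()
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    if total == 0:
        return {}
    counts = Counter(grams)
    return {gram: count / total for gram, count in counts.items()}


# "abab" contains the bigrams "ab", "ba", "ab".
freqs = char_ngram_freqs("abab", n=2)
```

Frequency profiles like this one, computed per text, are the kind of feature vector an authorship classifier can compare across candidate authors.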
Computational Stylistics in Poetry, Prose, and Drama
The contributions in this edited volume approach poetry, narrative, and drama from the perspective of Computational Stylistics. They exemplify methods of computational textual analysis and explore the possibility of computational generation of literary texts. The volume presents a range of computational and Natural Language Processing applications to literary studies, such as motif detection, network analysis, machine learning, and deep learning.
Towards a Philological Metric through a Topological Data Analysis Approach
The canon of Spanish Baroque literature has been thoroughly studied with philological techniques.
The major representatives of the poetry of this period are Francisco de Quevedo and Luis de Góngora
y Argote. Literary experts commonly assign them to two different currents: Quevedo
belongs to Conceptismo and Góngora to Culteranismo. Traditionally, although Quevedo
is considered the most representative figure of Conceptismo, Lope de Vega is also considered to be, at
least, closely related to this literary trend. In this paper, we use Topological Data Analysis techniques
to provide a first approach to a metric distance between the literary styles of these poets. As a
result, we obtain findings consistent with the literary experts’ criteria, locating the literary style of
Lope de Vega closer to that of Quevedo than to that of Góngora.