34 research outputs found

    Morphological complexity of the word

    This publication is freely accessible with the permission of the rights owner under an Alliance licence or a national licence (funded by the DFG, German Research Foundation). Although there is much linguistic work on the morphological complexity of words, no study has tried to scale their physical form. In this paper we first present a scaling of a word's morphological complexity that considers reduplication, compounding, derivation, inflection, and suppletivism. We then show some quantitative analyses of morphological complexity with respect to arc, motif, and distances of words, using a Slovak poem as an example.

    Drawing Elena Ferrante's Profile. Workshop Proceedings, Padova, 7 September 2017

    Elena Ferrante is an internationally acclaimed Italian novelist whose real identity has been kept secret by E/O publishing house for more than 25 years. Owing to her popularity, major Italian and foreign newspapers have long tried to discover her real identity. However, only a few attempts have been made to foster a scientific debate on her work. In 2016, Arjuna Tuzzi and Michele Cortelazzo led an Italian research team that conducted a preliminary study and collected a well-founded, large corpus of Italian novels comprising 150 works published in the last 30 years by 40 different authors. Moreover, they shared their data with a select group of international experts on authorship attribution, profiling, and analysis of textual data: Maciej Eder and Jan Rybicki (Poland), Patrick Juola (United States), Vittorio Loreto and his research team, Margherita Lalli and Francesca Tria (Italy), George Mikros (Greece), Pierre Ratinaud (France), and Jacques Savoy (Switzerland). The chapters of this volume report the results of this endeavour that were first presented during the international workshop Drawing Elena Ferrante's Profile in Padua on 7 September 2017 as part of the 3rd IQLA-GIAT Summer School in Quantitative Analysis of Textual Data. The fascinating research findings suggest that Elena Ferrante’s work definitely deserves “many hands” as well as an extensive effort to understand her distinct writing style and the reasons for her worldwide success.

    SHOE: The extraction of hierarchical structure for machine learning of natural language


    Corrections of Zipf's and Heaps' Laws Derived from Hapax Rate Models

    The article introduces corrections to Zipf's and Heaps' laws based on systematic models of the hapax rate. The derivation rests on two assumptions. The first is the standard urn model, which predicts that marginal frequency distributions for shorter texts look as if word tokens were sampled blindly from a given longer text. The second posits that the rate of hapaxes is a simple function of the text size. Four such functions are discussed: the constant model, the Davis model, the linear model, and the logistic model. It is shown that the logistic model yields the best fit. Comment: 42 pages, 7 figures, 3 tables
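
    The urn-model assumption in the abstract above can be illustrated with a minimal sketch. It is not the article's own code. It assumes that "sampled blindly" means drawing tokens without replacement, so each word type's count in the sample is hypergeometric, and it takes the hapax rate to be hapax types divided by distinct types, which the abstract does not define. The function names are hypothetical.

```python
from collections import Counter
from math import comb


def hapax_rate(tokens):
    """Observed hapax rate, taken here (an assumption) as
    hapax types divided by distinct types."""
    counts = Counter(tokens)
    if not counts:
        return 0.0
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / len(counts)


def expected_hapaxes_urn(tokens, n):
    """Expected number of hapaxes among n tokens drawn without
    replacement from `tokens` (urn-model assumption)."""
    total = len(tokens)
    if not 0 < n <= total:
        raise ValueError("n must satisfy 0 < n <= len(tokens)")
    denom = comb(total, n)
    # For a type with frequency f, the hypergeometric probability of
    # being drawn exactly once is f * C(total - f, n - 1) / C(total, n).
    return sum(f * comb(total - f, n - 1) / denom
               for f in Counter(tokens).values())


def expected_types_urn(tokens, n):
    """Expected number of distinct types among n tokens drawn
    without replacement from `tokens` (urn-model assumption)."""
    total = len(tokens)
    if not 0 < n <= total:
        raise ValueError("n must satisfy 0 < n <= len(tokens)")
    denom = comb(total, n)
    # A type with frequency f is missing from the sample with
    # probability C(total - f, n) / C(total, n).
    return sum(1 - comb(total - f, n) / denom
               for f in Counter(tokens).values())


if __name__ == "__main__":
    toy = "a b a c".split()  # toy text, not data from the article
    for n in range(1, len(toy) + 1):
        print(n, expected_types_urn(toy, n), expected_hapaxes_urn(toy, n))
```

    Dividing the expected hapax count by the expected type count at each sample size gives a rough picture of how the hapax rate changes with text size under this assumption. The four models named in the abstract are not reproduced here, since the abstract does not give their formulas.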

    As good as it gets? Unrepresented litigant and courtroom dynamics: a case study

    This paper examines the pragmatic competence of unrepresented litigants in court. In doing so, it engages larger themes, including lay understanding of the law and the major challenge that judges in an adversarial legal system face in balancing offers of assistance with maintaining judicial neutrality. The research reported in the paper involved 72 hours of observation during a 14-day trial in a Hong Kong appellate court. The litigant in the case had represented herself previously in at least three lawsuits over a period of ten years. She had initiated each action, and two cases had gone to appeal. Besides this extensive litigation experience, the litigant was a highly educated professional, capable of speaking fluently in the professional language of the proceedings; she had also seemingly devoted a lot of time to researching and preparing her cases. These characteristics mark her out as among the unrepresented litigants best prepared to deal with the obstacles presented by legal procedures and courtroom requirements. The study contributes to the field in two main respects: 1) Perspective. Previous studies of unrepresented litigants have tended to take a top-down approach, looking at litigant behaviour from the perspective of a judge or lawyer. This study uses discourse data obtained from courtroom observation in an attempt to understand the trial from the litigant’s perspective. 2) Access to justice. The stereotypical unrepresented litigant has low income and literacy, and makes obvious mistakes in court. The litigant in this case study represents the other end of the litigant-in-person spectrum. Her courtroom struggles expose obstacles that all unrepresented litigants face, which are not easily overcome even after repeated experience of the legal system. The data show persistent misconceptions regarding the law, and reveal tensions between lay understanding of justice and the institutional delivery of legal outcomes by the courts.
In recent years there has been a rapid rise in the number of unrepresented litigants in Hong Kong, as in many other jurisdictions. Better understanding of courtroom dynamics created by the behaviour of such litigants may help prevent interruptions in courtroom proceedings and improve overall public access to justice.

    Quantifying Interpreting Types: Language Sequence Mirrors Cognitive Load Minimization in Interpreting Tasks

    Most interpreting theories claim that different interpreting types should involve varied processing mechanisms and procedures. However, few studies have examined their underlying differences. Even though some previous results based on quantitative approaches show that different interpreting types yield outputs with varying lexical and syntactic features, the grammatical parsing approach is limited. Language sequences that form without relying on parsing or on processing with a specific linguistic approach or grammar outperform other quantitative approaches in revealing the sequential behavior of language production. As a unit of language sequences that is not bound to any grammar, the frequency motif can visualize the local distribution of content and function words, and can also statistically classify languages and identify text types. Thus, the current research investigates the distribution, length, and position-dependent properties of frequency motifs across different interpreting outputs in order to capture their sequential generation behaviors. It is found that the distribution, the length, and certain position-dependent properties of these language sequences differ significantly between simultaneous and consecutive interpreting output. The features of frequency motifs show that both types of interpreting output are produced in a manner that abides by the least effort principle. The current research suggests that interpreting types can be differentiated through this type of sequential language unit, and it offers evidence of how different task features mediate the sequential organization of interpreting output under different demands to achieve cognitive load minimization.

    Verse diversification: Frequencies and variations of verse types in Vana kannel and Kalevipoeg

    The present study concentrates on specific linguistic aspects of traditional Estonian poetic texts. Focusing on the verse structure of the traditional folk songs of Vana kannel and of the individually edited and authored epic poem Kalevipoeg, it analyzes the length of verse lines, the length of the words in these verses, and the relation between verse and word length, with the aim of studying verse variability in detail. Given that there are specific rules of verse and word length organization, as well as regular relations between them, the focus is on sequences of words of different lengths, which result in different verse types. Theoretical and empirical evidence is provided that, in addition to existing regularities, verse variability, too, follows specific rules which can be modelled in terms of a diversification process.

    Max-Planck-Institute for Psycholinguistics: Annual Report 2003
