442 research outputs found

    Software as theory: a case study in the domain of text analysis

    This article proposes a reflection on a specific way of envisioning and valorising the scholarly contribution of scientific software, namely by making explicit the model of data analysis that underlies it. It seeks to illustrate this way of studying a software construct by applying it to a particular text analysis program. Fundamental aspects of this program's design (input and output, data structures, process model, and user interface) are reviewed and discussed from the point of view of their implications in terms of theoretical commitments to a specific conception of text and text analysis. The conclusions of this case study notably emphasise the central role of user modelling in the assessment of scientific software's epistemological contribution, as well as the necessity of extending the proposed approach to a broader range of software applications.

    Markov associativities


    On the robust measurement of inflectional diversity

    Lexical diversity measures are notoriously sensitive to variations of sample size, and recent approaches to this issue typically involve the computation of the average variety of lexical units in random subsamples of fixed size. This methodology has been further extended to measures of inflectional diversity such as the average number of wordforms per lexeme, also known as the mean size of paradigm (MSP) index. In this contribution we argue that, while random sampling can indeed be used to increase the robustness of inflectional diversity measures, using a fixed subsample size is only justified under the hypothesis that the corpora that we compare have the same degree of lexematic diversity. In the more general case where they may have differing degrees of lexematic diversity, a more sophisticated strategy can and should be adopted. A novel approach to the measurement of inflectional diversity is proposed, aiming to cope not only with variations of sample size, but also with variations of lexematic diversity. The robustness of this new method is empirically assessed, and the results show that while there is still room for improvement, the proposed methodology considerably attenuates the impact of lexematic diversity discrepancies on the measurement of inflectional diversity.
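    The baseline method that the abstract builds on can be sketched in a few lines: the MSP index is the number of distinct wordforms divided by the number of distinct lexemes, averaged over random subsamples of fixed size. This is a minimal toy illustration of that baseline, not the paper's extended method for handling differing lexematic diversity; the corpus and function names are invented for the example.

```python
import random

def msp(tokens):
    """Mean size of paradigm: distinct wordforms per distinct lexeme.
    Each token is a (lemma, wordform) pair."""
    lexemes = {lemma for lemma, _ in tokens}
    wordforms = {(lemma, form) for lemma, form in tokens}
    return len(wordforms) / len(lexemes)

def subsampled_msp(tokens, size, n_samples=1000, seed=0):
    """Average MSP over random subsamples of fixed size, the standard
    remedy for sample-size sensitivity discussed in the abstract."""
    rng = random.Random(seed)
    return sum(msp(rng.sample(tokens, size)) for _ in range(n_samples)) / n_samples

# toy corpus of (lemma, wordform) pairs
corpus = [("run", "run"), ("run", "runs"), ("run", "ran"),
          ("cat", "cat"), ("cat", "cats"), ("dog", "dog")]
print(msp(corpus))  # 6 wordforms / 3 lexemes = 2.0
```

    As the abstract argues, fixing `size` implicitly assumes the compared corpora have similar lexematic diversity; the proposed method relaxes that assumption.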

    Learning phonological categories

    This paper describes in detail several explicit computational methods for approaching such questions in phonology as the vowel/consonant distinction, the nature of vowel harmony systems, and syllable structure, appealing solely to distributional information. Beginning with the vowel/consonant distinction, we consider a method for its discovery by the Russian linguist Sukhotin, and compare it to two newer methods of more general interest, both computational and theoretical, today. The first is based on spectral decomposition of matrices, allowing for dimensionality reduction in a finely controlled way, and the second is based on finding parameters for maximum likelihood in a hidden Markov model. While all three methods work for discovering the fairly robust vowel/consonant distinction, we extend the newer ones to the discovery of vowel harmony, and in the case of the probabilistic model, to the discovery of some aspects of syllable structure, and offer an evaluation of the results.
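    The first of the three methods, Sukhotin's algorithm, is simple enough to sketch directly: count adjacencies between distinct letters, then repeatedly promote the letter with the largest positive adjacency sum to vowel status and discount its neighbours, on the assumption that vowels tend to sit next to consonants rather than next to other vowels. This is a generic sketch of the published algorithm, not the paper's own code.

```python
from collections import defaultdict

def sukhotin(words):
    """Sukhotin's algorithm: classify letters as vowels from adjacency
    counts alone. Same-letter pairs are ignored (zero diagonal)."""
    adj = defaultdict(int)
    letters = set()
    for w in words:
        letters.update(w)
        for a, b in zip(w, w[1:]):
            if a != b:
                adj[a, b] += 1
                adj[b, a] += 1
    sums = {c: sum(adj[c, d] for d in letters) for c in letters}
    vowels = set()
    while True:
        # pick the unclassified letter with the largest adjacency sum
        c = max(letters - vowels, key=lambda x: sums[x], default=None)
        if c is None or sums[c] <= 0:
            break
        vowels.add(c)
        # discount adjacencies to the newly found vowel
        for d in letters - vowels:
            sums[d] -= 2 * adj[c, d]
    return vowels

print(sorted(sukhotin(["banana", "tomato", "potato"])))  # → ['a', 'o']
```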

    Segmentation and Clustering of Textual Sequences: a Typological Approach

    The long-term goal of this research is to develop a program able to produce an automatic segmentation and categorization of textual sequences into discourse types. In this preliminary contribution, we present the construction of an algorithm which takes a segmented text as input and attempts to produce a categorization of sequences, such as narrative, argumentative, descriptive and so on. This work also aims to investigate a possible convergence between the typological approach developed in particular in the field of text and discourse analysis in French by Adam (2008) and Bronckart (1997) and unsupervised statistical learning.
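    The pipeline the abstract describes — pre-segmented text in, unsupervised category labels out — can be sketched with any clustering method over per-segment feature vectors. The sketch below uses plain k-means and two invented surface cues (past-tense-like endings, connectives) purely for illustration; the feature set, cue lists, and function names are assumptions, not the authors' design.

```python
import random

def kmeans(vectors, k, iters=50, seed=0):
    """Plain k-means (Lloyd's algorithm) over tuples of floats."""
    rng = random.Random(seed)
    centers = rng.sample(vectors, k)
    dist = lambda v, c: sum((a - b) ** 2 for a, b in zip(v, c))
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[min(range(k), key=lambda j: dist(v, centers[j]))].append(v)
        centers = [tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return [min(range(k), key=lambda j: dist(v, centers[j])) for v in vectors]

def features(segment):
    """Toy discourse-type cues: rates of past-tense-like words and
    connectives (illustrative only, not the paper's feature set)."""
    words = segment.lower().split()
    past = sum(w.endswith("ed") for w in words) / len(words)
    conn = sum(w in {"because", "therefore", "however"} for w in words) / len(words)
    return (past, conn)

segments = ["he walked home and opened the door",
            "she laughed and waved as the train departed",
            "therefore this claim must be false because it contradicts the data",
            "however the argument therefore fails on its own terms"]
labels = kmeans([features(s) for s in segments], 2)
```

    The cluster labels are arbitrary integers; mapping them to types such as "narrative" or "argumentative" would require the kind of typological interpretation the abstract attributes to Adam (2008) and Bronckart (1997).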

    A comparison of age-standardised event rates for acute and chronic coronary heart disease in metropolitan and regional/remote Victoria: a retrospective cohort study

    Background: Acute and chronic coronary heart disease (CHD) pose different burdens on health-care services and require different prevention and treatment strategies. Trends in acute and chronic CHD event rates can guide service implementation. This study evaluated changes in acute and chronic CHD event rates in metropolitan and regional/remote Victoria.
    Methods: Victorian hospital admitted episodes with a principal diagnosis of acute CHD or chronic CHD were identified from 2005 to 2012. Acute and chronic CHD age-standardised event rates were calculated in metropolitan and regional/remote Victoria. Poisson log-link linear regression was used to estimate annual change in acute and chronic CHD event rates.
    Results: Acute CHD age-standardised event rates decreased annually by 2.9 % (95 % CI, −4.3 to −1.4 %) in metropolitan Victoria and 1.7 % (95 % CI, −3.2 to −0.1 %) in regional/remote Victoria. In comparison, chronic CHD age-standardised event rates increased annually by 4.8 % (95 % CI, +3.0 to +6.5 %) in metropolitan Victoria and 3.1 % (95 % CI, +1.3 to +4.9 %) in regional/remote Victoria. On average, age-standardised event rates for regional/remote Victoria were 30.3 % (95 % CI, 23.5 to 37.2 %) higher for acute CHD and 55.3 % (95 % CI, 47.1 to 63.5 %) higher for chronic CHD compared to metropolitan Victoria from 2005 to 2012.
    Conclusion: Annual decreases in acute CHD age-standardised event rates might reflect improvements in primary prevention, while annual increases in chronic CHD age-standardised event rates suggest a need to improve secondary prevention strategies. Consistently higher acute and chronic CHD age-standardised event rates were evident in regional/remote Victoria compared to metropolitan Victoria from 2005 to 2012.
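    The core calculation behind these comparisons, direct age standardisation, weights each age band's event rate by a standard population's age distribution so that regions with different age structures become comparable. A minimal sketch with invented two-band numbers (the study's actual age bands, standard population, and data are not given in the abstract):

```python
def age_standardised_rate(events, person_years, std_pop):
    """Direct age standardisation: weight age-specific rates by a
    standard population's age distribution; result per 100,000."""
    assert len(events) == len(person_years) == len(std_pop)
    total_std = sum(std_pop)
    rate = sum((e / py) * (w / total_std)
               for e, py, w in zip(events, person_years, std_pop))
    return rate * 100_000

# hypothetical region: young band (10 events / 100,000 p-y),
# old band (90 events / 50,000 p-y); standard population 60:40
print(age_standardised_rate([10, 90], [100_000, 50_000], [60, 40]))  # → 78.0
```

    The annual percent changes quoted above come from a Poisson log-link regression of event counts on calendar year, where the annual change is exp(β) − 1 for the year coefficient β.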