4 research outputs found

    Developing an Image-Based Classifier for Detecting Poetic Content in Historic Newspaper Collections

    Developing an Image-Based Classifier for Detecting Poetic Content in Historic Newspaper Collections details and analyzes the first stage of work of the Image Analysis for Archival Discovery project team. Our team is investigating the use of image analysis to identify poetic content in historic newspapers. The project seeks both to augment the study of literary history, by drawing attention to the magnitude of poetry published in newspapers and by making that poetry more readily available for study, and to advance work on the use of digital images in facilitating discovery in digital libraries and other digitized collections. We have recently completed the process of training our classifier for identifying poetic content, and as we prepare to move into the deployment stage, we are making available our methods for classification and testing in order to promote further research and discussion. The precision and recall values achieved during the training (90.58% precision; 79.4% recall) and testing (74.92% precision; 61.84% recall) stages are encouraging. In addition to discussing why such an approach is needed and relevant and situating our project alongside related work, this paper analyzes preliminary results, which support the feasibility and viability of our approach to detecting poetic content in historic newspaper collections.

    Positive data clustering using finite inverted dirichlet mixture models

    In this thesis we present an unsupervised algorithm for learning finite mixture models from multivariate positive data. This kind of data appears naturally in many applications, yet it has not been adequately addressed in the past. The mixture model is based on the inverted Dirichlet distribution, which offers a good representation of positive non-Gaussian data. The proposed approach estimates the parameters of an inverted Dirichlet mixture by maximum likelihood (ML), using the Newton-Raphson method. We also develop an approach, based on the Minimum Message Length (MML) criterion, to select the optimal number of clusters to represent the data using such a mixture. Experimental results are presented using artificial histograms and real data sets. The challenging problem of software module classification is investigated within the proposed statistical framework, also

    ALOS-2/PALSAR-2 Calibration, Validation, Science and Applications

    Twelve edited original papers presenting the latest state-of-the-art results on topics ranging from calibration, validation, and science to a wide range of applications using ALOS-2/PALSAR-2. We hope you will find them useful for your future research.