70,548 research outputs found

    Sparse Radial Sampling LBP for Writer Identification

    Full text link
    In this paper we present the use of Sparse Radial Sampling Local Binary Patterns, a variant of Local Binary Patterns (LBP) for text-as-texture classification. By adapting and extending the standard LBP operator to the particularities of text we get a generic text-as-texture classification scheme and apply it to writer identification. In experiments on CVL and ICDAR 2013 datasets, the proposed feature-set demonstrates State-Of-the-Art (SOA) performance. Among the SOA, the proposed method is the only one that is based on dense extraction of a single local feature descriptor. This makes it fast and applicable at the earliest stages in a DIA pipeline without the need for segmentation, binarization, or extraction of multiple features.Comment: Submitted to the 13th International Conference on Document Analysis and Recognition (ICDAR 2015

    A Comparative Study of Two State-of-the-Art Feature Selection Algorithms for Texture-Based Pixel-Labeling Task of Ancient Documents

    Get PDF
    International audienceRecently, texture features have been widely used for historical document image analysis. However, few studies have focused exclusively on feature selection algorithms for historical document image analysis. Indeed, an important need has emerged to use a feature selection algorithm in data mining and machine learning tasks, since it helps to reduce the data dimensionality and to increase the algorithm performance such as a pixel classification algorithm. Therefore, in this paper we propose a comparative study of two conventional feature selection algorithms, genetic algorithm and ReliefF algorithm, using a classical pixel-labeling scheme based on analyzing and selecting texture features. The two assessed feature selection algorithms in this study have been applied on a training set of the HBR dataset in order to deduce the most selected texture features of each analyzed texture-based feature set. The evaluated feature sets in this study consist of numerous state-of-the-art texture features (Tamura, local binary patterns, gray-level run-length matrix, auto-correlation function, gray-level co-occurrence matrix, Gabor filters, Three-level Haar wavelet transform, three-level wavelet transform using 3-tap Daubechies filter and three-level wavelet transform using 4-tap Daubechies filter). In our experiments, a public corpus of historical document images provided in the context of the historical book recognition contest (HBR2013 dataset: PRImA, Salford, UK) has been used. Qualitative and numerical experiments are given in this study in order to provide a set of comprehensive guidelines on the strengths and the weaknesses of each assessed feature selection algorithm according to the used texture feature set

    Quantitative Perspectives on Fifty Years of the Journal of the History of Biology

    Get PDF
    Journal of the History of Biology provides a fifty-year long record for examining the evolution of the history of biology as a scholarly discipline. In this paper, we present a new dataset and preliminary quantitative analysis of the thematic content of JHB from the perspectives of geography, organisms, and thematic fields. The geographic diversity of authors whose work appears in JHB has increased steadily since 1968, but the geographic coverage of the content of JHB articles remains strongly lopsided toward the United States, United Kingdom, and western Europe and has diversified much less dramatically over time. The taxonomic diversity of organisms discussed in JHB increased steadily between 1968 and the late 1990s but declined in later years, mirroring broader patterns of diversification previously reported in the biomedical research literature. Finally, we used a combination of topic modeling and nonlinear dimensionality reduction techniques to develop a model of multi-article fields within JHB. We found evidence for directional changes in the representation of fields on multiple scales. The diversity of JHB with regard to the representation of thematic fields has increased overall, with most of that diversification occurring in recent years. Drawing on the dataset generated in the course of this analysis, as well as web services in the emerging digital history and philosophy of science ecosystem, we have developed an interactive web platform for exploring the content of JHB, and we provide a brief overview of the platform in this article. As a whole, the data and analyses presented here provide a starting-place for further critical reflection on the evolution of the history of biology over the past half-century.Comment: 45 pages, 14 figures, 4 table

    How users assess web pages for information-seeking

    Get PDF
    In this paper, we investigate the criteria used by online searchers when assessing the relevance of web pages for information-seeking tasks. Twenty four participants were given three tasks each, and indicated the features of web pages which they employed when deciding about the usefulness of the pages in relation to the tasks. These tasks were presented within the context of a simulated work-task situation. We investigated the relative utility of features identified by participants (web page content,structure and quality), and how the importance of these features is affected by the type of information-seeking task performed and the stage of the search. The results of this study provide a set of criteria used by searchers to decide about the utility of web pages for different types of tasks. Such criteria can have implications for the design of systems that use or recommend web pages

    Analysis of science textbook pictures about energy and pupils' readings of them

    Get PDF
    This article outlines the findings of the part of the "Science Teacher Training in an Information Society" (STTIS) project concerned with describing the possible difficulties the pupils have when "reading" science textbook pictures about "energy". Six documents were selected on the basis that they had some of the textual/graphical features previously identified by the project as potentially presenting difficulties to pupils. The pupils' readings of these were investigated using a questionnaire and a follow-up interview. The analysis of three of the documents and of twelve pupils' readings of them is reported in this paper. The results confirm the hypothesis that the "reading" of science textbook pictures is not at all trivial for pupils and conclude that teachers need to spend time and effort talking through the meaning of the images with them. They also suggest that the list of textual/graphical features used in this research is a good starting point for this kind of critical examination

    Digital Image Access & Retrieval

    Get PDF
    The 33th Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation with the bulk of the conference focusing on indexing and retrieval.published or submitted for publicatio

    Animating the evolution of software

    Get PDF
    The use and development of open source software has increased significantly in the last decade. The high frequency of changes and releases across a distributed environment requires good project management tools in order to control the process adequately. However, even with these tools in place, the nature of the development and the fact that developers will often work on many other projects simultaneously, means that the developers are unlikely to have a clear picture of the current state of the project at any time. Furthermore, the poor documentation associated with many projects has a detrimental effect when encouraging new developers to contribute to the software. A typical version control repository contains a mine of information that is not always obvious and not easy to comprehend in its raw form. However, presenting this historical data in a suitable format by using software visualisation techniques allows the evolution of the software over a number of releases to be shown. This allows the changes that have been made to the software to be identified clearly, thus ensuring that the effect of those changes will also be emphasised. This then enables both managers and developers to gain a more detailed view of the current state of the project. The visualisation of evolving software introduces a number of new issues. This thesis investigates some of these issues in detail, and recommends a number of solutions in order to alleviate the problems that may otherwise arise. The solutions are then demonstrated in the definition of two new visualisations. These use historical data contained within version control repositories to show the evolution of the software at a number of levels of granularity. Additionally, animation is used as an integral part of both visualisations - not only to show the evolution by representing the progression of time, but also to highlight the changes that have occurred. Previously, the use of animation within software visualisation has been primarily restricted to small-scale, hand generated visualisations. However, this thesis shows the viability of using animation within software visualisation with automated visualisations on a large scale. In addition, evaluation of the visualisations has shown that they are suitable for showing the changes that have occurred in the software over a period of time, and subsequently how the software has evolved. These visualisations are therefore suitable for use by developers and managers involved with open source software. In addition, they also provide a basis for future research in evolutionary visualisations, software evolution and open source development
    • …
    corecore