
    Digital Libraries, Intelligent Data Analytics, and Augmented Description: A Demonstration Project

    From July 16 to November 8, 2019, the Aida digital libraries research team at the University of Nebraska-Lincoln collaborated with the Library of Congress on “Digital Libraries, Intelligent Data Analytics, and Augmented Description: A Demonstration Project.” This demonstration project sought to (1) develop and investigate the viability and feasibility of textual and image-based data analytics approaches to support and facilitate discovery; (2) understand technical tools and requirements for the Library of Congress to improve access and discovery of its digital collections; and (3) enable the Library of Congress to plan for future possibilities. In pursuit of these goals, we focused our work on two areas: extracting and foregrounding visual content from Chronicling America (chroniclingamerica.loc.gov) and applying a series of image processing and machine learning methods to minimally processed manuscript collections featured in By the People (crowd.loc.gov). We undertook a series of explorations and investigated a range of issues and challenges related to machine learning and the Library’s collections. This final report details the explorations, addresses the social and technical challenges raised by them that are critical context for the development of machine learning in the cultural heritage sector, and makes several recommendations to the Library of Congress as it plans for future possibilities. We propose two top-level recommendations. First, the Library should focus the weight of its machine learning efforts and energies on social and technical infrastructures for the development of machine learning in cultural heritage organizations, research libraries, and digital libraries. Second, we recommend that the Library invest in continued, ongoing, intentional explorations and investigations of particular machine learning applications to its collections. Both of these top-level recommendations map to the three goals of the Library’s 2019 digital strategy. Within each top-level recommendation, we offer three more concrete, short- and medium-term recommendations. Under social and technical infrastructures, they include: (1) Develop a statement of values or principles that will guide how the Library of Congress pursues the use, application, and development of machine learning for cultural heritage. (2) Create and scope a machine learning roadmap for the Library that looks both internally to the Library of Congress and its needs and goals and externally to the larger cultural heritage and other research communities. (3) Focus efforts on developing ground truth sets and benchmarking data and making these easily available. Nested under the recommendation to support ongoing explorations and investigations, we recommend that the Library: (4) Join the Library of Congress’s emergent efforts in machine learning with its existing expertise and leadership in crowdsourcing, combining these areas as “informed crowdsourcing” as appropriate. (5) Sponsor challenges for teams to create additional metadata for digital collections in the Library of Congress and, as part of these challenges, require teams to engage across a range of social and technical questions and problem areas. (6) Continue to create and support opportunities for researchers to partner in substantive ways with the Library of Congress on machine learning explorations. Each of these recommendations speaks to the investigation and challenge areas identified by Thomas Padilla in Responsible Operations: Data Science, Machine Learning, and AI in Libraries. This demonstration project, through its explorations, discussion, and recommendations, shows the potential of machine learning toward a variety of goals and use cases, and it argues that the technology itself will not be the hardest part of this work. The hardest part will be the myriad challenges of undertaking this work in ways that are socially and culturally responsible, while also upholding the responsibility to make the Library of Congress’s materials available in timely and accessible ways. Fortunately, the Library of Congress is in a remarkable position to advance machine learning for cultural heritage organizations, through its size, the diversity of its collections, and its commitment to digital strategy.

    Advanced document data extraction techniques to improve supply chain performance

    In this thesis, a novel machine learning technique to extract text-based information from scanned images has been developed. This information extraction is performed in the context of scanned invoices and bills used in financial transactions. These financial transactions contain a considerable amount of data that must be extracted, refined, and stored digitally before it can be used for analysis. Converting this data into a digital format is often a time-consuming process. Automation and data optimisation show promise as methods for reducing the time required and the cost of Supply Chain Management (SCM) processes, especially Supplier Invoice Management (SIM), Financial Supply Chain Management (FSCM) and Supply Chain procurement processes. This thesis uses a cross-disciplinary approach involving Computer Science and Operational Management to explore the benefit of automated invoice data extraction in business and its impact on SCM. The study adopts a multimethod approach based on empirical research, surveys, and interviews conducted with selected companies. The expert system developed in this thesis focuses on two distinct areas of research: Text/Object Detection and Text Extraction. For Text/Object Detection, the Faster R-CNN model was analysed. While this model yields outstanding results in terms of object detection, it is limited by poor performance when image quality is low. The Generative Adversarial Network (GAN) model is proposed in response to this limitation. The GAN model consists of a generator network implemented with the help of the Faster R-CNN model and a discriminator that relies on PatchGAN. The output of the GAN model is text data with bounding boxes. For text extraction from the bounding boxes, a novel data extraction framework was designed, consisting of various processes including XML processing in the case of an existing OCR engine, bounding box pre-processing, text clean-up, OCR error correction, spell checking, type checking, pattern-based matching, and finally a learning mechanism for automating future data extraction. Fields that the system extracts successfully are provided in key-value format. The efficiency of the proposed system was validated using existing datasets such as SROIE and VATI. Real-time data was validated using invoices collected by two companies that provide invoice automation services in various countries. Currently, these scanned invoices are sent to an OCR system such as OmniPage, Tesseract, or ABBYY FRE to extract text blocks, and a rule-based engine is then used to extract relevant data. While this methodology is robust, the companies surveyed were not satisfied with its accuracy and thus sought out new, optimised solutions. To confirm the results, the engines were used to return XML-based files with text and metadata identified. The output XML data was then fed into this new system for information extraction. This system uses the existing OCR engine and a novel, self-adaptive, learning-based OCR engine. This new engine is based on the GAN model for better text identification. Experiments were conducted on various invoice formats to further test and refine its extraction capabilities. For cost optimisation and the analysis of spend classification, additional data were provided by another company in London that holds expertise in reducing its clients' procurement costs. This data was fed into our system to obtain a deeper level of spend classification and categorisation. This helped the company reduce its reliance on human effort and allowed for greater efficiency compared with performing similar tasks manually using Excel sheets and Business Intelligence (BI) tools. The intention behind the development of this novel methodology was twofold: first, to develop a solution that does not depend on any specific OCR technology, and second, to increase information extraction accuracy over that of existing methodologies. Finally, the thesis evaluates the real-world need for the system and the impact it would have on SCM. This newly developed method is generic and can extract text from any given invoice, making it a valuable tool for optimising SCM. In addition, the system uses a template-matching approach to ensure the quality of the extracted information.
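
    To make the detect-then-extract shape of such a pipeline concrete, the following minimal Python sketch pairs an off-the-shelf Faster R-CNN detector with Tesseract OCR and toy pattern-based matching. It is not the thesis's GAN-enhanced system: the pretrained torchvision weights are trained on general objects rather than invoice text, and the function names (detect_regions, extract_key_values) and field patterns are invented here purely for illustration.

    # Minimal illustrative sketch of a detect-then-extract invoice pipeline.
    # The off-the-shelf torchvision Faster R-CNN stands in for the GAN-enhanced
    # detector described above, and pytesseract stands in for the OCR stage.
    import re

    import torch
    import torchvision
    from torchvision.transforms.functional import to_tensor
    from PIL import Image
    import pytesseract


    def detect_regions(image: Image.Image, score_threshold: float = 0.7):
        """Return candidate bounding boxes [x1, y1, x2, y2] above a score threshold."""
        model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
        model.eval()
        with torch.no_grad():
            prediction = model([to_tensor(image)])[0]
        keep = prediction["scores"] >= score_threshold
        return prediction["boxes"][keep].tolist()


    def extract_key_values(image: Image.Image) -> dict:
        """Crop each detected region, OCR it, and apply toy pattern-based matching."""
        fields = {}
        for box in detect_regions(image):
            crop = image.crop(tuple(int(v) for v in box))
            text = pytesseract.image_to_string(crop).strip()
            # The thesis adds clean-up, spell checking, type checks, and a learning
            # mechanism on top of this step; two regexes illustrate the idea here.
            if re.search(r"invoice\s*(no|number)", text, re.IGNORECASE):
                fields["invoice_number"] = text
            elif re.search(r"\btotal\b", text, re.IGNORECASE):
                fields["total"] = text
        return fields


    if __name__ == "__main__":
        print(extract_key_values(Image.open("invoice.png").convert("RGB")))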

    The effect(s) of word processing software on the quality of the composing process, product, and attitudes of adult academic ESL (English as a second language) writers

    The focus of this study was on the effect of word processing on the quality of the composing process, product, and attitudes of adult academic ESL writers. Twenty adult ESL students, comprising an ‘intact’ EAP (English for Academic Purposes) group, completed a number of written assignments as part of their ESL unit, using either word processing or conventional ‘pen and paper’ composition methods. Their handwritten and word-processed work was analysed and compared through the use of a holistic/analytic scale of writing quality. In addition to this analysis of the ‘finished product’, texts were analysed in terms of the frequency, nature and extent of revisions made within the composition process. Statistical analysis of the writing quality and revision data, together with audio-taped verbal protocols from selected subjects, interviews, and observational notes, was used to determine the effect(s) of word processing on the composing process, product and attitudes of these subjects. The data indicate that word processing does improve writing quality and that it also influences revising behaviours and subject attitudes towards writing. There does not appear, for these subjects, to have been any significant correlation between revision and writing quality.

    Understanding Optical Music Recognition

    For over 50 years, researchers have been trying to teach computers to read music notation, a task referred to as Optical Music Recognition (OMR). However, this field is still difficult to access for new researchers, especially those without a significant musical background: few introductory materials are available and, furthermore, the field has struggled with defining itself and building a shared terminology. In this work, we address these shortcomings by (1) providing a robust definition of OMR and its relationship to related fields, (2) analyzing how OMR inverts the music encoding process to recover the musical notation and the musical semantics from documents, and (3) proposing a taxonomy of OMR, most notably including a novel taxonomy of applications. Additionally, we discuss how deep learning affects modern OMR research, as opposed to the traditional pipeline. Based on this work, the reader should be able to attain a basic understanding of OMR: its objectives, its inherent structure, its relationship to other fields, the state of the art, and the research opportunities it affords.

    The implications of handwritten text recognition for accessing the past at scale

    Before Handwritten Text Recognition (HTR), manuscripts were costly to convert to machine-processable text for research and analysis. With HTR now achieving high levels of accuracy, we ask what near-future behaviour, interaction, experience, values and infrastructures may emerge when HTR is applied to historical documents. When combined with mass digitisation of GLAM (galleries, libraries, archives and museums) content, how will HTR’s application, use, and affordances generate new knowledge of the past and affect our information environment? This paper’s findings emerge from a literature review surveying current understanding of the impact of HTR, in order to explore emerging issues over the coming decade. We aim to deconstruct the simplistic narrative that the speed, efficiency, and scale of HTR will “transform scholarship in the archives” (Muehlberger et al., 2019: 955), providing a more nuanced consideration of its application, possibilities, and opportunities. In doing so, our recommendations will assist researchers, data and platform providers, memory institutions and data scientists in understanding how the results of HTR interact with the wider information environment. We find that HTR supports the creation of accurate transcriptions from historical manuscripts and the enhancement of existing datasets. HTR facilitates access to a greater range of materials, including endangered languages, enabling a new focus on personal and private materials (diaries, letters), increasing access to historical voices not usually incorporated into the historical record, and increasing the scale and heterogeneity of available material. The production of general training models leads to a virtuous digitisation circle in which similar datasets are easier, and therefore more likely, to be produced. This creates a requirement for processes that facilitate the storage and discoverability of HTR-generated content, and for memory institutions to rethink search and access to collections. Challenges include HTR’s dependency on digitisation, its relation to archival history and omission, and the entrenchment of bias in data sources. The paper details several near-future issues, including: the potential of HTR as the basis of automated metadata extraction; the integration of advanced Artificial Intelligence (AI) processes (including Large Language Models (LLMs) and generative AI) into HTR systems; legal and moral issues such as copyright, privacy and data ethics, which are challenged by the use of HTR; how individual contributions to shared HTR models can be credited; and the environmental costs of HTR infrastructure. We identify the need for greater collaboration between communities, including historians, information scientists, and data scientists, to navigate these issues, and for further skills support to allow non-specialist audiences to make the most of HTR. Data literacy will become increasingly important, as will building frameworks to establish data sharing, data consent, and reuse principles, particularly in building open repositories to share models and datasets. Finally, we suggest that an understanding of how HTR is changing the information environment is a crucial aspect of future technological development.

    Modern Information Systems

    The development of modern information systems is a demanding task. New technologies and tools are designed, implemented and brought to market on a daily basis, and user needs change dramatically fast, so the IT industry strives to keep its systems efficient and adaptable in order to remain competitive and up to date. The realization of modern information systems with rich characteristics and functionalities, implemented for specific areas of interest, is thus a fact of our demanding digital society, and this is the main scope of this book. The book therefore aims to present a number of innovative and recently developed information systems. It is titled "Modern Information Systems" and includes 8 chapters. It may assist researchers in studying the innovative functions of modern systems in various areas such as health, telematics, and knowledge management. It can also assist young students in grasping the new research tendencies in information systems development.

    Human and Artificial Intelligence

    Although tremendous advances have been made in recent years, many real-world problems still cannot be solved by machines alone. Hence, integration between Human Intelligence and Artificial Intelligence is needed. However, several challenges make this integration complex. The aim of this Special Issue was to provide a large and varied collection of high-level contributions presenting novel approaches and solutions to address the above issues. This Special Issue contains 14 papers (13 research papers and 1 review paper) that deal with various topics related to human–machine interactions and cooperation. Most of these works concern different aspects of recommender systems, which are among the most widespread decision support systems. The domains covered range from healthcare to movies and from biometrics to cultural heritage. However, there are also contributions on vocal assistants and smart interactive technologies. In summary, each paper included in this Special Issue represents a step towards a future of human–machine interaction and cooperation. We hope readers enjoy these articles and find inspiration for their research activities.

    Towards Second and Third Generation Web-Based Multimedia

    First generation Web content encodes information in handwritten (HTML) Web pages. Second generation Web content generates HTML pages on demand, e.g. by filling in templates with content retrieved dynamically from a database or by transforming structured documents using style sheets (e.g. XSLT). Third generation Web pages will make use of rich markup (e.g. XML) along with metadata (e.g. RDF) schemes to make the content not only machine readable but also machine processable, a necessary prerequisite to the Semantic Web. While text-based content on the Web is already rapidly approaching the third generation, multimedia content is still trying to catch up with second generation techniques. Multimedia document processing has a number of fundamentally different requirements from text which make it more difficult to incorporate within the document processing chain. In particular, multimedia transformation uses different document and presentation abstractions, its formatting rules cannot be based on text-flow, it requires feedback from the formatting back-end, and it is hard to describe in the functional style of current style languages. We state the requirements for second generation processing of multimedia and describe how these have been incorporated in our prototype multimedia document transformation environment, Cuypers. The system overcomes a number of the restrictions of text-flow based tool sets by integrating a number of conceptually distinct processing steps in a single runtime execution environment. We motivate the need for these different processing steps, describe each in turn (semantic structure, communicative device, qualitative constraints, quantitative constraints, final form presentation), and illustrate our approach by means of an example. We conclude by discussing the models and techniques required for the creation of third generation multimedia content.
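
    As a small illustration of the second generation approach mentioned above (generating HTML by transforming structured documents with style sheets), the following Python sketch applies an XSLT stylesheet to an XML document using lxml. The slideshow document and the stylesheet are invented for illustration and are not taken from the Cuypers system; they simply show the template-driven transformation step that the abstract contrasts with third generation, metadata-rich markup.

    # Illustrative second generation transformation: a structured XML document is
    # rendered to HTML by an XSLT stylesheet (document and stylesheet are invented
    # examples, not artefacts of the Cuypers system described above).
    from lxml import etree

    XML_DOC = b"""\
    <slideshow>
      <slide title="Introduction">A text-only slide.</slide>
      <slide title="Results">A slide that would reference an image.</slide>
    </slideshow>"""

    XSLT_SHEET = b"""\
    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/slideshow">
        <html>
          <body>
            <xsl:for-each select="slide">
              <h2><xsl:value-of select="@title"/></h2>
              <p><xsl:value-of select="."/></p>
            </xsl:for-each>
          </body>
        </html>
      </xsl:template>
    </xsl:stylesheet>"""

    # Compile the stylesheet, apply it to the document, and print the HTML result.
    transform = etree.XSLT(etree.XML(XSLT_SHEET))
    html_result = transform(etree.XML(XML_DOC))
    print(etree.tostring(html_result, pretty_print=True).decode())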
