15 research outputs found

    Post-OCR Paragraph Recognition by Graph Convolutional Networks

    Paragraphs are an important class of document entities. We propose a new approach for paragraph identification by spatial graph convolutional neural networks (GCN) applied on OCR text boxes. Two steps, namely line splitting and line clustering, are performed to extract paragraphs from the lines in OCR results. Each step uses a beta-skeleton graph constructed from bounding boxes, where the graph edges provide efficient support for graph convolution operations. With only pure layout input features, the GCN model size is 3~4 orders of magnitude smaller compared to R-CNN based models, while achieving comparable or better accuracies on PubLayNet and other datasets. Furthermore, the GCN models show good generalization from synthetic training data to real-world images, and good adaptivity for variable document styles.

    Large language models in machine translation

    This paper reports on the benefits of large-scale statistical language modeling in machine translation. A distributed infrastructure is proposed which we use to train on up to 2 trillion tokens, resulting in language models having up to 300 billion n-grams. It is capable of providing smoothed probabilities for fast, single-pass decoding. We introduce a new smoothing method, dubbed Stupid Backoff, that is inexpensive to train on large data sets and approaches the quality of Kneser-Ney Smoothing as the amount of training data increases.

    Advanced Television Research Program

    Contains reports on ten research projects. National Science Foundation Grant MIP 87-14969; National Science Foundation Fellowship; Advanced Television Research Program; AT&T Bell Laboratories Doctoral Support Program; Kodak Fellowship; U.S. Air Force - Electronic Systems Division Contract F1 9628-89-K-004

    Advanced Television Research Program

    Contains an introduction and reports on twelve research projects. Advanced Television Research Program; National Science Foundation Grant MIP 87-14969; National Science Foundation Fellowship; Kodak Fellowship

    Search and Retrieval—search processes General Terms

    An advanced visual interface system is presented for fluid interaction in a personal digital library system. The system employs a zoomable planar representation of a collection using hybrid continuous/quantum treemap visualizations to facilitate navigation while minimizing cognitive load. By providing both fluidity and a means of reading documents within the same visualization, the system obliterates the traditional boundary separating the acquisition of materials from their use. In addition, the system provides a means of streamlining and largely automating the addition of new documents into a collection. The system is particularly well suited to user tasks which, in the physical world, are normally carried out by laying out a set of related documents on a physical desk, namely those tasks that require frequent and rapid transfer of attention from one document in the collection to another. Discussed are the design and implementation of the system as well as its relationship to previous work.

    Making UpLib Useful: Personal Document Engineering

    Any new system must provide significant advantages to users for them to adopt it over their existing practices. In this paper, we discuss changes made over the last two years of use of the UpLib personal digital library system, to provide those advantages in the realm of document management. These changes are concentrated in the document acquisition phase, where document analysis is performed and databases of document information are prepared. However, some changes have been made in the areas of document management and document usage, primarily to allow users better interaction with the improved document projections.

    Fluid Interface for Personal Digital Libraries

    An advanced interface is presented for fluid interaction in a personal digital library system. The system employs a zoomable planar representation of a collection using hybrid continuous/quantum treemap visualizations to facilitate navigation while minimizing cognitive load. The system is particularly well suited to user tasks which, in the physical world, are normally carried out by laying out a set of related documents on a physical desk, namely those tasks that require frequent and rapid transfer of attention from one document in the collection to another. Discussed are the design and implementation of the system as well as its relationship to previous work.