
    Techniques for document image processing in compressed domain

    The main objective of image compression is usually taken to be the minimization of storage space. However, as images are accessed ever more frequently, it is becoming increasingly important to process the compressed representation directly. This work investigates techniques that can be applied directly and efficiently to digital information encoded by a given compression algorithm. Lossless compression schemes and information-processing algorithms for binary document images and text data are two closely related areas, bridged by the fast processing of coded data. The compressed domains addressed in this work, the ITU fax standards and the JBIG standard, are two major schemes used for document compression. Based on ITU Group IV, a modified coding scheme, MG4, which exploits the two-dimensional correlation between scan lines, is developed. The MG4 coding principle and its feature-preserving behavior in the compressed domain are examined from the viewpoints of compression efficiency and processing flexibility of image operations. Two popular bi-level image coding schemes, run-length and Group IV, are compared with MG4 in three respects: compression complexity, compression ratio, and feasibility of compressed-domain algorithms. In particular, for connected component extraction, skew detection, and rotation, MG4 shows a significant speed advantage over conventional algorithms. Techniques for processing JBIG-encoded images directly in the compressed domain, or concurrently while they are being decoded, are also proposed and generalized.

    In the second part of this work, the possibility of facilitating image processing in the wavelet transform domain is investigated. Textured images can be distinguished from one another by examining their wavelet transforms. The basic idea is that highly textured regions can be segmented using feature vectors extracted from high-frequency bands, based on the observation that textured images have large energies in both the high and middle frequencies, while images in which the grey level varies smoothly are dominated by the low-frequency channels of the wavelet transform domain. Building on this observation, a new method is developed and implemented to detect textures and abnormalities in document images using polynomial wavelets. Segmentation experiments indicate that this approach is superior to traditional methods in terms of memory space and processing time.
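
    As a rough illustration of the subband-energy idea described above, the following sketch computes per-block energy in the detail (high/middle-frequency) subbands and flags high-energy blocks as textured. It is not the thesis's implementation: it uses standard Daubechies wavelets via the PyWavelets package rather than polynomial wavelets, and the block size and threshold are arbitrary assumptions.

        # Minimal sketch: classify image blocks as textured vs. smooth by the energy
        # of their high/middle-frequency wavelet subbands. Assumes PyWavelets (pywt)
        # and NumPy; wavelet choice, block size, and threshold are illustrative only.
        import numpy as np
        import pywt

        def detail_energy(block, wavelet="db2", levels=2):
            """Return mean energy of all detail (non-approximation) subbands."""
            coeffs = pywt.wavedec2(block, wavelet, level=levels)
            # coeffs[0] is the low-frequency approximation; skip it.
            energy = 0.0
            for (cH, cV, cD) in coeffs[1:]:
                energy += np.sum(cH**2) + np.sum(cV**2) + np.sum(cD**2)
            return energy / block.size

        def segment_textured_blocks(image, block=32, threshold=50.0):
            """Label each non-overlapping block as textured (True) or smooth (False)."""
            h, w = image.shape
            labels = np.zeros((h // block, w // block), dtype=bool)
            for i in range(h // block):
                for j in range(w // block):
                    patch = image[i*block:(i+1)*block, j*block:(j+1)*block].astype(float)
                    labels[i, j] = detail_energy(patch) > threshold
            return labels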

    An approach to summarize video data in compressed domain

    Thesis (Master)--Izmir Institute of Technology, Electronics and Communication Engineering, Izmir, 2007. Includes bibliographical references (leaves: 54-56). Text in English; abstract in Turkish and English. x, 59 leaves.

    The requirements to represent digital video and images efficiently and feasibly have drawn great effort in research, development, and standardization over the past 20 years. These efforts target a vast range of applications such as video on demand, digital TV/HDTV broadcasting, multimedia video databases, and surveillance. Moreover, applications demand ever more efficient collections of algorithms to enable lower bit rates with acceptable quality, depending on application requirements. Today, most video content, whether stored or transmitted, is in compressed form. The increase in the amount of video data being shared has attracted researchers to the interrelated problems of video summarization, indexing, and abstraction. In this study, scene cut detection in the emerging ISO/ITU-T H.264/AVC coded bit stream is realized by extracting spatio-temporal prediction information directly in the compressed domain. The syntax and semantics and the parsing and decoding processes of the H.264/AVC bit stream are analyzed to detect scene information. Various video test data are constructed using the Joint Video Team's test model JM encoder, and implementations are made on the JM decoder. The output of the study is scene information to support video summarization, skimming, and indexing applications that use the new-generation H.264/AVC video.
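
    The thesis extracts prediction information inside a modified JM decoder. As a generic illustration of how compressed-domain prediction statistics can signal a scene cut, the hypothetical sketch below assumes per-frame macroblock-type counts have already been exported; the data layout, field names, and threshold are assumptions for the example, not the thesis's actual decision criteria.

        # Hypothetical sketch: flag scene cuts from per-frame macroblock statistics
        # exported by a modified decoder. A high fraction of intra-coded macroblocks
        # in a predicted (P/B) frame suggests temporal prediction failed, which often
        # coincides with a scene change. Threshold and data layout are assumed.
        from dataclasses import dataclass
        from typing import List

        @dataclass
        class FrameStats:
            frame_no: int
            frame_type: str      # "I", "P", or "B"
            intra_mbs: int       # macroblocks coded with intra prediction
            total_mbs: int       # total macroblocks in the frame

        def detect_scene_cuts(frames: List[FrameStats],
                              intra_ratio_thresh: float = 0.6) -> List[int]:
            """Return frame numbers where a scene cut is likely."""
            cuts = []
            for f in frames:
                if f.frame_type == "I":
                    continue  # periodic I-frames alone are not evidence of a cut
                if f.total_mbs and f.intra_mbs / f.total_mbs >= intra_ratio_thresh:
                    cuts.append(f.frame_no)
            return cuts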

    Digital Image Access & Retrieval

    The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.

    Realizing EDGAR: eliminating information asymmetries through artificial intelligence analysis of SEC filings

    The U.S. Securities and Exchange Commission (SEC) maintains a publicly accessible database of all required filings of all publicly traded companies. Known as EDGAR (Electronic Data Gathering, Analysis, and Retrieval), this database contains documents ranging from annual reports of major companies to personal disclosures of senior managers. However, the common user, and particularly the retail investor, is overwhelmed by the deluge of information rather than empowered by it. EDGAR as it currently functions entrenches the information asymmetry between these retail investors and the large financial institutions with which they often trade. With substantial research staffs and budgets, coupled with an industry standard of "playing both sides" of a transaction, these investors "in the know" lead price fluctuations while others must follow. In general, this thesis applies recent technological advancements to the development of software tools that derive valuable insights from EDGAR documents quickly enough to be useful. While numerous commercial products of this kind already exist, all come with significant price tags and many still rely on substantial human involvement to derive such insights. Recent years, however, have seen an explosion in the fields of Machine Learning (ML) and Natural Language Processing (NLP), which show promise in automating many of these functions with greater efficiency. ML aims to develop software that learns parameters from large datasets, as opposed to traditional software that merely applies a programmer's logic. NLP aims to read, understand, and generate language naturally, an area where recent ML advancements have proven particularly adept. Specifically, this thesis serves as an exploratory study in applying recent advancements in ML and NLP to the vast range of documents contained in the EDGAR database. While algorithms will likely never replace the hordes of research analysts that now saturate securities markets, nor erase the advantages that accrue to large and diverse trading desks, they do hold the potential to provide small yet significant insights at little cost. This study first examines methods for document acquisition from EDGAR, with a focus on a baseline efficiency sufficient for the real-time trading needs of market participants. Next, it applies recent advancements in ML and NLP, specifically recurrent neural networks, to the task of standardizing financial statements across different filers. Finally, the conclusion contextualizes these findings in an environment of continued technological and commercial evolution.
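
    As a rough illustration of the document-acquisition step (not the thesis's own pipeline), the sketch below lists a filer's recent submissions via EDGAR's public submissions endpoint on data.sec.gov. The endpoint layout, JSON field names, contact string, and example CIK are assumptions made for the example and should be verified against current SEC documentation.

        # Illustrative sketch of pulling a filer's recent submissions from EDGAR.
        # Endpoint and field names reflect the public data.sec.gov submissions API
        # as commonly documented; treat them, and the example CIK, as assumptions.
        import requests

        SEC_HEADERS = {"User-Agent": "research-example contact@example.com"}  # SEC asks for a contact UA

        def recent_filings(cik: int, form_type: str = "10-K"):
            """Yield (filing_date, accession_number, primary_document) for recent filings."""
            url = f"https://data.sec.gov/submissions/CIK{cik:010d}.json"
            data = requests.get(url, headers=SEC_HEADERS, timeout=30).json()
            recent = data["filings"]["recent"]
            for form, date, acc, doc in zip(recent["form"], recent["filingDate"],
                                            recent["accessionNumber"], recent["primaryDocument"]):
                if form == form_type:
                    yield date, acc, doc

        if __name__ == "__main__":
            # 320193 is Apple Inc.'s CIK, used purely as an example identifier.
            for date, acc, doc in recent_filings(320193):
                print(date, acc, doc)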

    A survey of computer uses in music

    This thesis covers research into the mathematical basis inherent in music, including a review of projects related to optical character recognition (OCR) of musical symbols. Research was done on using fractals to create new pieces by assigning pitches to numbers. Existing musical pieces can be taken apart and reassembled, creating new ideas for composers. Musical notation understanding is covered, and its role in enabling a computer to recognize a music sheet for editing and reproduction purposes is explained. The first phase of a musical OCR was created in this thesis with the recognition of staff lines on a good-quality image. Modifications will need to be made to handle noise and tilted images that may result from scanning.
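
    Staff-line detection on a clean scan is commonly done with a horizontal projection profile. The sketch below is a generic illustration of that technique, not the thesis's implementation; the binarization threshold and row-fill criterion are arbitrary choices.

        # Generic sketch: locate candidate staff-line rows in a grayscale score image
        # using a horizontal projection profile. Works only on clean, non-tilted
        # scans, matching the "good quality image" assumption in the abstract.
        # Thresholds are illustrative, not taken from the thesis.
        import numpy as np

        def detect_staff_lines(gray_image: np.ndarray,
                               ink_threshold: int = 128,
                               row_fill_ratio: float = 0.5):
            """Return row indices likely to contain staff lines."""
            ink = gray_image < ink_threshold            # True where the pixel is dark
            row_counts = ink.sum(axis=1)                # dark pixels per row
            min_count = row_fill_ratio * gray_image.shape[1]
            candidate_rows = np.where(row_counts >= min_count)[0]
            # Merge consecutive rows belonging to the same (thick) staff line.
            lines = []
            current = [candidate_rows[0]] if len(candidate_rows) else []
            for r in candidate_rows[1:]:
                if r == current[-1] + 1:
                    current.append(r)
                else:
                    lines.append(int(np.mean(current)))
                    current = [r]
            if current:
                lines.append(int(np.mean(current)))
            return lines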

    Fractal compression and analysis on remotely sensed imagery

    Remote sensing images contain a huge amount of geographical information and reflect the complexity of geographical features and spatial structures. As a means of observing and describing geographical phenomena, the rapid development of remote sensing has provided an enormous amount of geographical information. This massive information is very useful in a variety of applications, but its sheer bulk has grown beyond what can be analyzed and used efficiently and effectively. This uneven growth between the technologies for gathering information and those for analyzing it has created difficulties in storage, transfer, and processing. Fractal geometry provides a means of describing and analyzing the complexity of different geographical features in remotely sensed images. It also provides a more powerful tool for compressing remote sensing data than traditional methods. This study suggests, for the first time, applying fractals in this way to remotely sensed images. Based on fractal concepts, compression and decompression algorithms were developed and applied to Landsat TM images of eight study areas with different land cover types; the fidelity and efficiency of the algorithms and their relationship with the spatial complexity of the images were evaluated. Three research hypotheses were tested, and the fractal compression was compared with two commonly used compression methods, JPEG and WinZip. The effects of spatial complexity and pixel resolution on the compression rate were also examined. The results from this study show that the fractal compression method achieves a higher compression rate than JPEG and WinZip. As expected, higher compression rates were obtained from images of lower complexity and from images of lower spatial resolution (larger pixel size). This study shows that, in addition to fractals' use in measuring, describing, and simulating the roughness of landscapes in geography, fractal techniques are useful in remotely sensed image compression. Moreover, the compression technique can be seen as a new method of measuring diverse landscapes and geographical features. As such, this study has introduced a new and advantageous passageway for fractal applications and their important uses in remote sensing.
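
    For context, classic fractal image compression (partitioned iterated function systems) encodes each small "range" block by the best contractive affine map from a larger "domain" block. The sketch below shows that core matching step in a generic form; it is not the algorithm developed in the thesis, and the block sizes and brute-force search are illustrative only.

        # Generic sketch of the range-domain matching step in fractal compression:
        # each range block is approximated as s * D + o, where D is a downsampled
        # domain block and (s, o) are least-squares contrast/brightness parameters.
        import numpy as np

        def downsample2(block):
            """Average 2x2 neighborhoods so a domain block matches the range size."""
            return 0.25 * (block[::2, ::2] + block[1::2, ::2] +
                           block[::2, 1::2] + block[1::2, 1::2])

        def fit_affine(domain, rng):
            """Least-squares scale s and offset o minimizing ||s*domain + o - rng||."""
            d, r = domain.ravel(), rng.ravel()
            var = np.var(d)
            s = 0.0 if var == 0 else np.cov(d, r, bias=True)[0, 1] / var
            o = r.mean() - s * d.mean()
            err = np.sum((s * d + o - r) ** 2)
            return s, o, err

        def encode(image, range_size=8):
            """For each range block, record the best-matching domain block and map."""
            h, w = image.shape
            dsize = 2 * range_size
            code = []
            for ry in range(0, h - range_size + 1, range_size):
                for rx in range(0, w - range_size + 1, range_size):
                    rng = image[ry:ry+range_size, rx:rx+range_size].astype(float)
                    best = None
                    for dy in range(0, h - dsize + 1, dsize):
                        for dx in range(0, w - dsize + 1, dsize):
                            dom = downsample2(image[dy:dy+dsize, dx:dx+dsize].astype(float))
                            s, o, err = fit_affine(dom, rng)
                            if best is None or err < best[0]:
                                best = (err, dy, dx, s, o)
                    code.append(((ry, rx), best[1:]))
            return code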