
    Focused image search in the social Web

    Recently, social multimedia-sharing websites, which allow users to upload, annotate, and share online photo or video collections, have become increasingly popular. The user tags or annotations constitute the new multimedia metadata. We present an image search system that exploits both textual and visual image information. First, we use focused crawling and DOM-tree-based web data extraction methods to extract textual image features from social networking image collections. Second, we propose the concept of visual words to handle the image's visual content for fast indexing and searching. We also develop several user-friendly search options that allow users to query the index using words and image feature descriptions (visual words). The developed image search system tries to bridge the gap between scalable industrial image search engines, which are based on keyword search, and the slower content-based image retrieval systems developed mostly in academia and designed to search based on image content only. We have implemented a working prototype by crawling and indexing over 16,056 images from flickr.com, one of the most popular image-sharing websites. Our experimental results on this prototype confirm the efficiency and effectiveness of the proposed methods.
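    A minimal sketch of the "visual words" indexing idea described above: local image descriptors are clustered into a fixed vocabulary, and each image is indexed by the histogram of its nearest vocabulary entries. The descriptor extractor is left abstract, and the vocabulary size and use of k-means are illustrative assumptions, not details confirmed by the abstract.

    ```python
    # Hypothetical sketch: bag-of-visual-words indexing via k-means.
    # Descriptor extraction (e.g., local patches) is assumed to happen upstream.
    import numpy as np
    from sklearn.cluster import KMeans

    def build_vocabulary(descriptors: np.ndarray, n_words: int = 500) -> KMeans:
        """Cluster local descriptors (n_samples x n_dims) into visual words."""
        return KMeans(n_clusters=n_words, n_init=10).fit(descriptors)

    def visual_word_histogram(image_descriptors: np.ndarray, vocab: KMeans) -> np.ndarray:
        """Quantize one image's descriptors into a normalized word histogram."""
        words = vocab.predict(image_descriptors)
        hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
        return hist / max(hist.sum(), 1.0)
    ```

    Images with similar histograms can then be retrieved with any standard vector index, which is what lets such a system approach keyword-search speed while still matching on visual content.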

    Adaptive Methods for Robust Document Image Understanding

    A vast amount of digital document material is continuously being produced as part of major digitization efforts around the world. In this context, generic and efficient automatic solutions for document image understanding represent a stringent necessity. We propose a generic framework for document image understanding systems, usable for practically any document type available in digital form. Following the introduced workflow, we turn our attention to each of the processing stages in turn: quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation, and logical layout analysis. We review the state of the art in each area, identify current deficiencies, point out promising directions, and give specific guidelines for future investigation. We address some of the identified issues by means of novel algorithmic solutions, with special focus on generality, computational efficiency, and the exploitation of all available sources of information. More specifically, we introduce the following original methods: fully automatic detection of color reference targets in digitized material, accurate foreground extraction from color historical documents, font enhancement for hot-metal typeset prints, a solution to the document binarization problem that is theoretically optimal from both a computational-complexity and a threshold-selection point of view, layout-independent skew and orientation detection, a robust and versatile page segmentation method, a semi-automatic front-page detection algorithm, and a complete framework for article segmentation in periodical publications. The proposed methods are experimentally evaluated on large datasets consisting of real-life heterogeneous document scans. The obtained results show that a document understanding system combining these modules can robustly process a wide variety of documents with good overall accuracy.
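    The binarization stage above centers on threshold selection. As a point of reference only, the classical Otsu method below picks the global threshold that maximizes between-class variance; it is a well-known baseline, not the theoretically optimal algorithm the thesis itself proposes.

    ```python
    # Baseline for illustration: Otsu global thresholding on a uint8 image.
    import numpy as np

    def otsu_threshold(gray: np.ndarray) -> int:
        """Return the threshold maximizing between-class variance."""
        hist = np.bincount(gray.ravel(), minlength=256).astype(float)
        prob = hist / hist.sum()
        omega = np.cumsum(prob)                # class-0 probability up to each level
        mu = np.cumsum(prob * np.arange(256))  # cumulative mean
        mu_t = mu[-1]                          # global mean
        denom = omega * (1.0 - omega)
        denom[denom == 0] = np.nan             # undefined at degenerate thresholds
        sigma_b2 = (mu_t * omega - mu) ** 2 / denom
        return int(np.nanargmax(sigma_b2))

    # Usage: binary = gray > otsu_threshold(gray)
    ```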

    Two and three dimensional segmentation of multimodal imagery

    The role of segmentation in image understanding/analysis, computer vision, pattern recognition, remote sensing, and medical imaging has been significantly augmented in recent years by accelerated scientific advances in the acquisition of image data. This low-level analysis protocol is critical to numerous applications, with the primary goal of expediting and improving the effectiveness of subsequent high-level operations by providing a condensed and pertinent representation of image information. In this research, we propose a novel unsupervised segmentation framework for meaningfully segregating 2-D/3-D image data across multiple modalities (color, remote-sensing, and biomedical imaging) into non-overlapping partitions using several spatial-spectral attributes. Initially, our framework exploits the information obtained from detecting edges inherent in the data. To this end, using a vector gradient detection technique, pixels without edges are grouped and individually labeled to partition some initial portion of the input image content. Pixels with higher gradient densities are included by the dynamic generation of segments as the algorithm progresses, yielding an initial region map. Subsequently, texture modeling is performed, and the obtained gradient, texture, and intensity information, along with the initial partition map, is used in a multivariate refinement procedure that fuses groups with similar characteristics to yield the final segmentation. Experimental results, compared against published state-of-the-art segmentation techniques for color as well as multi/hyperspectral imagery, demonstrate the advantages of the proposed method. Furthermore, to achieve improved computational efficiency, we propose a multi-resolution extension of this methodology, demonstrated on color images. Finally, this research also encompasses a 3-D extension of the algorithm, demonstrated on medical (Magnetic Resonance Imaging / Computed Tomography) volumes.
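    A minimal sketch of the edge-driven initialization described above, under illustrative assumptions: a per-band Sobel gradient is combined across channels, low-gradient ("edge-free") pixels are grouped into connected regions, and high-gradient pixels are left unlabeled for the later region-growing stage. The exact gradient operator and threshold in the thesis may differ.

    ```python
    # Illustrative sketch of an edge-free initial region map for a multiband image.
    import numpy as np
    from scipy import ndimage

    def initial_region_map(image: np.ndarray, grad_thresh: float = 0.1) -> np.ndarray:
        """image: H x W x B array scaled to [0, 1]; returns int labels, 0 = unassigned."""
        grads = []
        for b in range(image.shape[2]):
            gx = ndimage.sobel(image[..., b], axis=1)
            gy = ndimage.sobel(image[..., b], axis=0)
            grads.append(np.hypot(gx, gy))
        grad = np.max(grads, axis=0)        # strongest edge response across bands
        flat = grad < grad_thresh           # pixels considered edge-free
        labels, _ = ndimage.label(flat)     # group them into connected initial regions
        return labels
    ```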

    Optimum Implementation of Compound Compression of a Computer Screen for Real-Time Transmission in Low Network Bandwidth Environments

    Remote working has become increasingly prevalent in recent times. A large part of remote working involves sharing computer screens between servers and clients. The image content presented when sharing computer screens consists of both natural camera-captured image data and computer-generated graphics and text. The attributes of natural camera-captured image data differ greatly from those of computer-generated image data. An image containing a mixture of both is known as a compound image. The research presented in this thesis focuses on the challenge of constructing a compound compression strategy that applies the best-fit compression algorithm to the mixed content found in a compound image. The research also involves analysis and classification of the types of data a given compound image may contain. While researching optimal types of compression, consideration is given to the computational overhead of a given algorithm, because the research is being developed for real-time systems such as cloud computing services, where latency has a detrimental impact on the end-user experience. Previous and current state-of-the-art video codecs have been researched, along with many of the most recent publications from academia, to design and implement a novel low-complexity compound compression algorithm suitable for real-time transmission. The compound compression algorithm utilises a mixture of lossless and lossy compression algorithms, with parameters that can be used to control its performance. An objective image quality assessment is needed to determine whether the proposed algorithm can produce an acceptable-quality image after processing. Traditional metrics such as the Peak Signal to Noise Ratio are used, along with a more modern, perceptually motivated approach known as the Structural Similarity Index, to define the quality of the decompressed image. Finally, the compression strategy is tested on a set of generated compound images. Using open-source software, the same images are compressed with the previous and current state-of-the-art video codecs to compare the three main metrics: compression ratio, computational complexity, and objective image quality.
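    A small sketch of the two objective quality metrics named above, for comparing a decompressed screen image against its original. PSNR is computed directly; SSIM reuses scikit-image's implementation. Single-channel uint8 images are assumed for simplicity.

    ```python
    # PSNR computed from the definition; SSIM delegated to scikit-image.
    import numpy as np
    from skimage.metrics import structural_similarity

    def psnr(original: np.ndarray, decoded: np.ndarray, peak: float = 255.0) -> float:
        """Peak Signal to Noise Ratio in dB for two same-shape arrays."""
        mse = np.mean((original.astype(float) - decoded.astype(float)) ** 2)
        return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

    def ssim(original: np.ndarray, decoded: np.ndarray) -> float:
        """Structural Similarity Index for single-channel uint8 images."""
        return structural_similarity(original, decoded, data_range=255)
    ```

    PSNR measures raw pixel error, while SSIM models local structure, which is why the latter tracks perceived quality better on the sharp-edged text and graphics regions of a compound image.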

    Artificial Intelligence Technology

    This open access book aims to give readers a basic outline of today's research and technology developments in artificial intelligence (AI), help them gain a general understanding of this trend, and familiarize them with the current research hotspots, as well as some of the fundamental and common theories and methodologies that are widely accepted in AI research and application. The book is written in comprehensible and plain language, featuring clearly explained theories and concepts and extensive analysis and examples. Some traditional findings are omitted from the narration while still providing a relatively comprehensive introduction to the evolution of artificial intelligence technology. The book provides a detailed elaboration of the basic concepts of AI and machine learning, as well as other relevant topics, including deep learning, deep learning frameworks, the Huawei MindSpore AI development framework, the Huawei Atlas computing platform, the Huawei AI open platform for smart terminals, and the Huawei CLOUD Enterprise Intelligence application platform. As the world's leading provider of ICT (information and communication technology) infrastructure and smart terminals, Huawei offers products ranging from digital data communication, cyber security, wireless technology, data storage, cloud computing, and smart computing to artificial intelligence.

    Extraction and representation of semantic information in digital media


    Efficient compression of motion compensated residuals


    Multispectral texture synthesis

    Synthesizing texture involves the ordering of pixels in a 2-D arrangement so as to display certain known spatial correlations, generally as described by a sample texture. In an abstract sense, these pixels could be grayscale values, RGB color values, or entire spectral curves. The focus of this work is to develop a practical synthesis framework that maintains this abstract view while synthesizing texture with high spectral dimension, effectively achieving spectral invariance. The principal idea is to use a single monochrome texture synthesis step to capture the spatial information in a multispectral texture. The first step is to use a global color space transform to condense the spatial information in a sample texture into a principal luminance channel. Then, a monochrome texture synthesis step generates the corresponding principal band in the synthetic texture. This spatial information is then used to condition the generation of spectral information. A number of variants of this general approach are introduced. The first uses a multiresolution transform to decompose the spatial information in the principal band into an equivalent scale/space representation. This information is encapsulated into a set of low-order statistical constraints that are used to iteratively coerce white noise into the desired texture. The residual spectral information is then generated using a non-parametric Markov Random Field (MRF) model. The remaining variants use a non-parametric MRF to generate the spatial and spectral components simultaneously. In this approach, multispectral texture is grown from a seed region by sampling from the set of nearest neighbors in the sample texture, as identified by a template matching procedure in the principal band. The effectiveness of both algorithms is demonstrated on a number of texture examples ranging from grayscale and RGB textures to 16-, 22-, 32-, and 63-band spectral images. In addition to the standard visual test that predominates in the literature, effort is made to quantify the accuracy of the synthesis using informative and effective metrics, including first- and second-order statistical comparisons as well as statistical divergence tests.
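    A minimal sketch of the first step described above: a global transform condenses a multispectral sample into a single principal band carrying most of the spatial structure. PCA across bands is used here as a plausible stand-in for the "global color space transform" the abstract mentions; the thesis's exact transform may differ.

    ```python
    # Illustrative sketch: project a multispectral texture onto its first
    # principal component to obtain a single "principal band".
    import numpy as np

    def principal_band(texture: np.ndarray) -> np.ndarray:
        """texture: H x W x B multispectral sample; returns the H x W first PC."""
        h, w, b = texture.shape
        pixels = texture.reshape(-1, b).astype(float)
        pixels -= pixels.mean(axis=0)              # center each band
        cov = np.cov(pixels, rowvar=False)         # B x B band covariance
        eigvals, eigvecs = np.linalg.eigh(cov)     # ascending eigenvalues
        first_pc = eigvecs[:, np.argmax(eigvals)]  # direction of maximum variance
        return (pixels @ first_pc).reshape(h, w)
    ```

    A monochrome synthesis step run on this band then supplies the spatial layout that conditions the later generation of the remaining spectral information.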