
    Fast object detection in compressed JPEG Images

    Object detection in still images has drawn a lot of attention over the past few years, and with the advent of deep learning, impressive performance has been achieved, with numerous industrial applications. Most of these deep learning models rely on RGB images to localize and identify objects in the image. However, in some application scenarios, images are compressed either for storage savings or for fast transmission. A time-consuming image decompression step is therefore compulsory before the aforementioned deep models can be applied. To alleviate this drawback, we propose a fast deep architecture for object detection in JPEG images, one of the most widespread compression formats. We train a neural network to detect objects based on the blockwise DCT (discrete cosine transform) coefficients produced by the JPEG compression algorithm. We modify the well-known Single Shot multibox Detector (SSD) by replacing its first layers with one convolutional layer dedicated to processing the DCT inputs. Experimental evaluations on PASCAL VOC and an industrial dataset comprising images of road traffic surveillance show that the model is about 2× faster than regular SSD, with promising detection performance. To the best of our knowledge, this paper is the first to address detection in compressed JPEG images.
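    As a rough illustration of the "blockwise DCT coefficients" this abstract refers to, the sketch below splits a grayscale image into 8×8 blocks, applies the JPEG level shift of 128, and computes an orthonormal 2D DCT-II per block. It is a minimal pure-Python sketch, not the authors' pipeline: the function names are hypothetical, quantization and entropy coding are omitted, and the image dimensions are assumed to be multiples of 8.

```python
import math

N = 8  # JPEG block size


def dct_matrix(n=N):
    """Orthonormal DCT-II basis matrix C, so that coeffs = C * block * C^T."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def block_dct(block):
    """2D DCT of one square block (rows transformed, then columns)."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def blockwise_dct(image):
    """Return a grid of 8x8 coefficient blocks for a grayscale image.

    Assumes height and width are multiples of 8 and pixel values are 0..255.
    """
    height, width = len(image), len(image[0])
    grid = []
    for by in range(0, height, N):
        row = []
        for bx in range(0, width, N):
            # Level shift by 128, as JPEG does before the forward DCT.
            block = [[image[y][x] - 128 for x in range(bx, bx + N)]
                     for y in range(by, by + N)]
            row.append(block_dct(block))
        grid.append(row)
    return grid
```

    For a flat block, only the DC coefficient (index `[0][0]`) is non-zero. A detector working in the compressed domain would consume such coefficient grids directly instead of decoded RGB pixels.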

    An approach to summarize video data in compressed domain

    Thesis (Master)--Izmir Institute of Technology, Electronics and Communication Engineering, Izmir, 2007. Includes bibliographical references (leaves: 54-56). Text in English; abstract in Turkish and English. x, 59 leaves. The need to represent digital video and images efficiently and feasibly has attracted considerable research, development and standardization effort over the past 20 years. These efforts have targeted a vast range of applications such as video on demand, digital TV/HDTV broadcasting, multimedia video databases and surveillance. Moreover, these applications demand more efficient algorithms to enable lower bit rates with acceptable quality, depending on application requirements. Today, most video content, whether stored or transmitted, is in compressed form. The increase in the amount of video data being shared has drawn researchers' interest to the interrelated problems of video summarization, indexing and abstraction. In this study, scene cut detection in the emerging ISO/ITU H264/AVC coded bit stream is realized by extracting spatio-temporal prediction information directly in the compressed domain. The syntax and semantics, and the parsing and decoding processes, of the ISO/ITU H264/AVC bit stream are analyzed to detect scene information. Various video test data are constructed using the Joint Video Team's test model JM encoder, and implementations are made on the JM decoder. The output of the study is the scene information to address video summarization, skimming, indexing applications that use the new generation ISO/ITU H264/AVC video

    Segmentation for Image Indexing and Retrieval on Discrete Cosines Domain

    This paper uses a region growing segmentation technique to segment the Discrete Cosines (DC) image. The classic problem of content-based image retrieval (CBIR) is the lack of accuracy in matching a query image against the images in the database. Applying region growing to the DC image reduces the number of image regions indexed. Recursive region growing is not a new technique, but its application to DC images to build indexing keys is quite new and has not yet been presented by many authors. The experimental results show that the proposed method on segmented images achieves good precision, higher than 0.60 on all classes. It can therefore be concluded that CBIR based on region growing segmentation is more efficient than CBIR on DC images in terms of precision, 0.59 and 0.75, respectively. Moreover, DC-based CBIR can save time and simplify the algorithm compared to using DCT images. The most significant finding of this work is that, instead of using all 64 DCT coefficients, this research used only 1 of the 64, namely the DC coefficient.
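    To make the "1 of 64 coefficients" idea concrete, the sketch below builds a DC image (one value per 8×8 block) and then labels connected regions with a simple region growing pass. It is a hedged illustration only: the abstract does not state the paper's similarity criterion, so the seed-based threshold, the 4-connectivity, and the function names `dc_image` and `region_grow` are assumptions made here.

```python
from collections import deque


def dc_image(image, n=8):
    """One DC value per n x n block.

    For an orthonormal 2D DCT the DC coefficient equals the block sum
    divided by n, so no full transform is needed. Assumes the image
    height and width are multiples of n.
    """
    height, width = len(image), len(image[0])
    return [[sum(image[y][x]
                 for y in range(by, by + n)
                 for x in range(bx, bx + n)) / n
             for bx in range(0, width, n)]
            for by in range(0, height, n)]


def region_grow(dc, threshold):
    """Label 4-connected regions whose DC values stay within
    `threshold` of the region's seed value (an assumed criterion).

    Returns (labels, region_count).
    """
    height, width = len(dc), len(dc[0])
    labels = [[-1] * width for _ in range(height)]
    next_label = 0
    for sy in range(height):
        for sx in range(width):
            if labels[sy][sx] != -1:
                continue  # already absorbed into an earlier region
            seed = dc[sy][sx]
            labels[sy][sx] = next_label
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and labels[ny][nx] == -1
                            and abs(dc[ny][nx] - seed) <= threshold):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label
```

    The region labels (or statistics computed per region) could then serve as indexing keys. Because the DC image has 1/64 as many samples as the pixel image, the growing pass touches far fewer elements.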

    WAVELET BASED DATA HIDING OF DEM IN THE CONTEXT OF REALTIME 3D VISUALIZATION (Visualisation 3D Temps-Réel à Distance de MNT par Insertion de Données Cachées Basée Ondelettes)

    The use of aerial photographs, satellite images, scanned maps and digital elevation models necessitates strategies for the storage and visualization of these data. To obtain a three-dimensional visualization, it is necessary to drape the images, called textures, onto the terrain geometry, called the Digital Elevation Model (DEM). In practice, all this information is stored in three different files: DEM, texture, and position/projection of the data in a geo-referential system. In this paper we propose to store all this information in a single file for the purpose of synchronization. For this we have developed a wavelet-based embedding method for hiding the data in a color image. The texture images containing hidden DEM data can then be sent from the server to a client in order to effect 3D visualization of terrains. The embedding method can be integrated with the JPEG2000 coder to accommodate compression and multi-resolution visualization.

    The contour tree image encoding technique and file format

    The process of contourization is presented which converts a raster image into a discrete set of plateaux or contours. These contours can be grouped into a hierarchical structure, defining total spatial inclusion, called a contour tree. A contour coder has been developed which fully describes these contours in a compact and efficient manner and is the basis for an image compression method. Simplification of the contour tree has been undertaken by merging contour tree nodes thus lowering the contour tree's entropy. This can be exploited by the contour coder to increase the image compression ratio. By applying general and simple rules derived from physiological experiments on the human vision system, lossy image compression can be achieved which minimises noticeable artifacts in the simplified image. The contour merging technique offers a complementary lossy compression system to the QDCT (Quantised Discrete Cosine Transform). The artifacts introduced by the two methods are very different; QDCT produces a general blurring and adds extra highlights in the form of overshoots, whereas contour merging sharpens edges, reduces highlights and introduces a degree of false contouring. A format based on the contourization technique which caters for most image types is defined, called the contour tree image format. Image operations directly on this compressed format have been studied which for certain manipulations can offer significant operational speed increases over using a standard raster image format. A couple of examples of operations specific to the contour tree format are presented showing some of the features of the new format. Science and Engineering Research Counci

    LIDAR data classification and compression

    Airborne Laser Detection and Ranging (LIDAR) data has a wide range of applications in agriculture, archaeology, biology, geology, meteorology, military and transportation, etc. LIDAR data consumes hundreds of gigabytes in a typical day of acquisition, and the amount of data collected will continue to grow as sensors improve in resolution and functionality. LIDAR data classification and compression are therefore very important for managing, visualizing, analyzing and using this huge amount of data. Among the existing LIDAR data classification schemes, supervised learning has been used and can obtain up to 96% accuracy. However, some of the features used are not readily available, and the training data is also not always available in practice. In existing LIDAR data compression schemes, the compressed size can be 5%-23% of the original size, but can still be on the order of gigabytes, which is impractical for many applications. The objectives of this dissertation are (1) to develop LIDAR classification schemes that can classify airborne LIDAR data more accurately without some of the features or training data that existing work requires; (2) to explore lossy compression schemes that can compress LIDAR data at a much higher compression rate than is currently available. We first investigate two independent ways to classify LIDAR data depending on the availability of training data: when training data is available, we use supervised machine learning techniques such as support vector machines (SVM); when training data is not readily available, we develop an unsupervised classification method that can classify LIDAR data as well as supervised classification methods. Experimental results show that the accuracy of our classification results is over 99%. We then present two new lossy LIDAR data compression methods and compare their performance. The first is a wavelet-based compression scheme, while the second is geometry-based.
    Our new geometry-based compression is a geometry- and statistics-driven LIDAR point-cloud compression method which combines both application knowledge and scene content to enable fast transmission from the sensor platform while preserving the geometric properties of objects within a scene. The new algorithm is based on the idea of compression by classification. It utilizes the unique height function simplicity as well as the local spatial coherence and linearity of the aerial LIDAR data, and can automatically compress the data to the desired level of detail defined by the user. Either of the two developed classification methods can be used to automatically detect regions that are not locally linear, such as vegetation or trees. In those regions, local statistical descriptions, such as mean, variance, expectation, etc., are stored to efficiently represent the region and restore the geometry in the decompression phase. The new geometry-based compression schemes for building and ground data can compress efficiently and significantly reduce the file size, while retaining a good fit for the scalable "zoom in" requirements. Experimental results show that, compared with existing LIDAR lossy compression work, our proposed approach achieves a bit rate two orders of magnitude lower at the same quality, making it feasible for applications that were not practical before. The proposed highly efficient compression scheme makes it possible to store the information in a database and query it efficiently. Includes bibliographical references (pages 106-116).