Compressing Proteomes: The Relevance of Medium Range Correlations
We study the nonrandomness of proteome sequences by analysing the correlations that arise between amino acids at short and medium range, specifically between amino acids located 10 or 100 residues apart, respectively. We show that statistical models that consider these two types of correlation are better able to capture the information contained in protein sequences and thus achieve good compression rates. Finally, we propose that the cause of this redundancy is related to the evolutionary origin of proteomes and protein sequences.
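One way to see whether residues 10 or 100 positions apart carry exploitable statistical structure is to measure the empirical mutual information between symbols at a given lag; the abstract does not specify its statistical machinery, so the following is only an illustrative sketch with a hypothetical helper name:

```python
from collections import Counter
from math import log2

def lag_mutual_information(seq, lag):
    """Empirical mutual information (bits) between symbols `lag` positions
    apart. Higher values mean a model conditioning on that range has more
    redundancy to exploit for compression."""
    pairs = [(seq[i], seq[i + lag]) for i in range(len(seq) - lag)]
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    return sum(c / n * log2((c / n) / ((left[a] / n) * (right[b] / n)))
               for (a, b), c in joint.items())
```

On a perfectly alternating two-symbol sequence the lag-1 mutual information is close to its 1-bit maximum, while on an i.i.d. random sequence it approaches zero.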
A joint motion & disparity motion estimation technique for 3D integral video compression using evolutionary strategy
3D imaging techniques have the potential to establish a future mass market in the fields of entertainment and communications. Integral imaging, which can capture true 3D color images with only one camera, has been seen as the right technology to offer stress-free viewing to audiences of more than one person. Just like any digital video, 3D video sequences must also be compressed in order to make them suitable for consumer-domain applications. However, ordinary compression techniques found in state-of-the-art video coding standards such as H.264, MPEG-4 and MPEG-2 are not capable of producing enough compression while preserving the 3D cues. Fortunately, a huge amount of redundancy can be found in an integral video sequence in terms of motion and disparity. This paper discusses a novel approach that uses both motion and disparity information to compress 3D integral video sequences. We propose to decompose the integral video sequence into viewpoint video sequences and jointly exploit motion and disparity redundancies to maximize the compression. We further propose an optimization technique based on evolutionary strategies to minimize the computational complexity of the joint motion-disparity estimation. Experimental results demonstrate that joint motion and disparity estimation can achieve over 1 dB objective quality gain over normal motion estimation. When combined with the evolutionary strategy, it can achieve up to 94% computational cost saving.
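The abstract does not detail the evolutionary strategy it uses, but the core idea of replacing an exhaustive block search with a mutate-and-select loop can be sketched as a generic (1+1)-ES over a displacement vector; `es_search` and the quadratic stand-in cost are hypothetical names, not the paper's method:

```python
import random

def es_search(cost, start, steps=200, sigma=2.0, seed=0):
    """(1+1) evolution strategy over integer 2D displacement vectors
    (standing in for a joint motion/disparity offset), minimising `cost`
    with far fewer evaluations than a full search window."""
    rng = random.Random(seed)
    best, best_c = start, cost(start)
    for _ in range(steps):
        # mutate the parent with Gaussian noise, rounded to pixel offsets
        cand = (best[0] + round(rng.gauss(0, sigma)),
                best[1] + round(rng.gauss(0, sigma)))
        c = cost(cand)
        if c <= best_c:          # keep the better of parent and child
            best, best_c = cand, c
    return best, best_c

# a quadratic stand-in for a real block-matching SAD cost surface
best, best_cost = es_search(lambda v: (v[0] - 3) ** 2 + (v[1] + 2) ** 2, (0, 0))
```

In a real codec the cost function would be the sum of absolute differences between the current block and the candidate block in the reference viewpoint or frame.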
Using semantic knowledge to improve compression on log files
With the move towards global and multi-national companies, information technology infrastructure requirements are increasing. As the size of these computer networks increases, it becomes more and more difficult to monitor, control, and secure them. Networks consist of a number of diverse devices, sensors, and gateways which are often spread over large geographical areas. Each of these devices produces log files which need to be analysed and monitored to provide network security and satisfy regulations. Data compression programs such as gzip and bzip2 are commonly used to reduce the quantity of data for archival purposes after the log files have been rotated. However, many other compression programs exist, each with its own advantages and disadvantages. These programs use different amounts of memory and take different compression and decompression times to achieve different compression ratios. System log files also contain redundancy which is not necessarily exploited by standard compression programs. Log messages usually use a similar format with a defined syntax. In the log files, not all ASCII characters are used, and the messages contain certain "phrases" which are often repeated. This thesis investigates the use of compression as a means of data reduction and how the use of semantic knowledge can improve data compression (also applying the results to different scenarios that can occur in a distributed computing environment). It presents the results of a series of tests performed on different log files. It also examines the semantic knowledge which exists in maillog files and how it can be exploited to improve the compression results. The results from a series of text preprocessors which exploit this knowledge are presented and evaluated. These preprocessors include one which replaces the timestamps and IP addresses with their binary equivalents and one which replaces words from a dictionary with unused ASCII characters.
In this thesis, data compression is shown to be an effective method of data reduction, producing up to 98 percent reduction in file size on a corpus of log files. The use of preprocessors which exploit semantic knowledge results in up to 56 percent improvement in overall compression time and up to 32 percent reduction in compressed size.
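The binary-equivalent preprocessing idea can be illustrated for IP addresses: a dotted-quad address occupies 7 to 15 text bytes but packs into 4. The sketch below (hypothetical names, not the thesis code) marks each packed address with an escape byte and compares the general-purpose compressor's output with and without the preprocessing:

```python
import gzip
import re
import socket

IP = re.compile(rb"\b\d{1,3}(?:\.\d{1,3}){3}\b")

def pack_ips(line: bytes) -> bytes:
    """Replace each dotted-quad IP with an escape byte plus its 4-byte
    packed form, shrinking 7-15 text bytes to 5."""
    return IP.sub(lambda m: b"\x1b" + socket.inet_aton(m.group().decode()), line)

# a synthetic maillog-like corpus with a repeated message template
log = b"\n".join(
    b"Oct 11 22:14:%02d host sshd[412]: Failed password for root from 10.0.%d.%d"
    % (i % 60, i % 256, (i * 7) % 256)
    for i in range(500)
)
raw, pre = gzip.compress(log), gzip.compress(pack_ips(log))
print(len(log), len(raw), len(pre))
```

A production preprocessor would also need the inverse transform for lossless round-tripping, and a template-aware pass for timestamps and dictionary words.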
Adaptive edge-based prediction for lossless image compression
Many lossless image compression methods have been suggested, with established results that are hard to surpass. However, some aspects can still be considered to improve performance further. This research focuses on the two-phase prediction-encoding method, studying each phase separately and suggesting new techniques.
In the prediction module, the proposed Edge-Based Predictor (EBP) and Least-Squares Edge-Based Predictor (LS-EBP) emphasize image edges and make predictions accordingly. EBP is a gradient-based nonlinear adaptive predictor that switches between prediction rules based on a few threshold parameters, automatically determined by a pre-analysis procedure which makes a first pass. LS-EBP also uses these parameters, but optimizes the prediction for each edge location assigned during pre-analysis, thus applying the least-squares approach only at the edge points.
For the encoding module, a novel method inspired by the Burrows-Wheeler Transform (BWT) is suggested, which performs better than applying the BWT directly to the images. We also present a context-based adaptive error modeling and encoding scheme. When coupled with the above prediction schemes, the result is the best-known compression performance among compression schemes of the same time and space complexity.
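The abstract does not give EBP's exact switching rules, but the general idea of a predictor that chooses between neighbours based on local gradients is well illustrated by the median edge detector from JPEG-LS, a simpler relative of gradient-switched prediction:

```python
def med_predict(w, n, nw):
    """Median edge detector (as in JPEG-LS): predicts the current pixel
    from its west (w), north (n) and north-west (nw) neighbours,
    switching rules according to the implied local edge direction."""
    if nw >= max(w, n):
        return min(w, n)      # edge detected: take the neighbour across it
    if nw <= min(w, n):
        return max(w, n)
    return w + n - nw         # smooth region: planar (gradient) prediction
```

EBP differs in that its thresholds come from a data-driven pre-analysis pass, and LS-EBP further refits the predictor by least squares at the detected edge points; the sketch above only shows the shared gradient-switching principle.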
Distributed Source Coding Techniques for Lossless Compression of Hyperspectral Images
This paper deals with the application of distributed source coding (DSC) theory to remote sensing image compression. Although DSC exhibits significant potential in many application fields, the results obtained so far on real signals fall short of the theoretical bounds and often impose additional system-level constraints. The objective of this paper is to assess the potential of DSC for lossless image compression carried out onboard a remote platform. We first provide a brief overview of DSC of correlated information sources. We then focus on onboard lossless image compression, and apply DSC techniques in order to reduce the complexity of the onboard encoder, at the expense of the decoder's, by exploiting the correlation of different bands of a hyperspectral dataset. Specifically, we propose two different compression schemes, one based on powerful binary error-correcting codes employed as source codes, and one based on simpler multilevel coset codes. The performance of both schemes is evaluated on a few AVIRIS scenes and compared with other state-of-the-art 2D and 3D coders. Both schemes turn out to achieve competitive compression performance, and one of them also has reduced complexity. Based on these results, we highlight the main issues that still need to be solved to further improve the performance of DSC-based remote sensing systems.
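The multilevel coset idea can be shown in miniature: the encoder transmits only a coset index (a modulo residue) of each pixel value, and the decoder resolves the ambiguity using a correlated band as side information. This toy sketch (hypothetical function names, scalar values instead of the paper's actual codes) assumes the inter-band difference stays below half the coset spacing:

```python
def coset_encode(x, m):
    """Encoder sends only the coset index of x: log2(m) bits
    instead of a full-precision sample."""
    return x % m

def coset_decode(syndrome, side_info, m):
    """Decoder picks the member of the coset {syndrome + k*m} closest to
    its side information (a correlated band); decoding is exact whenever
    |x - side_info| < m / 2."""
    k = round((side_info - syndrome) / m)
    return syndrome + k * m
```

This is why the scheme moves complexity to the decoder: the encoder does almost no work, while the decoder must model the inter-band correlation well enough for the coset spacing to be safe.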
Compressão e análise de dados genómicos (Compression and Analysis of Genomic Data)
Doctoral thesis in Informatics. Genomic sequences are large coded messages describing most of the structure of all known living organisms. Since the presentation of the first genomic sequence, a huge amount of genomic data has been generated, with diversified characteristics, rendering the data-deluge phenomenon a serious problem in most genomics centers. As such, most of the data are discarded (when possible), while the rest are compressed using general-purpose algorithms, often attaining modest data-reduction results.
Several specific algorithms have been proposed for the compression of genomic data, but unfortunately only a few of them have been made available as usable and reliable compression tools, and most of those were developed for some specific purpose. In this thesis, we propose a compressor for genomic sequences of multiple natures, able to function in a reference or reference-free mode. Besides, it is very flexible and can cope with diverse hardware specifications. It uses a mixture of finite-context models (FCMs) and eXtended FCMs. The results show improvements over state-of-the-art compressors.
Since the compressor can be seen as an unsupervised, alignment-free method to estimate the algorithmic complexity of genomic sequences, it is the ideal candidate to perform analysis of and between sequences. Accordingly, we define a way to approximate directly the Normalized Information Distance, aiming to identify evolutionary similarities within and between species. Moreover, we introduce a new concept, the Normalized Relative Compression, that is able to quantify and infer new characteristics of the data, previously undetected by other methods. We also investigate local measures, able to locate specific events using complexity profiles. Furthermore, we present and explore a method based on complexity profiles to detect and visualize genomic rearrangements between sequences, yielding several insights into the genomic evolution of humans.
Finally, we introduce the concept of relative uniqueness and apply it to the Ebolavirus, identifying three regions that appear in all the outbreak virus sequences but nowhere in the human genome. In fact, we show that these sequences are sufficient to classify different sub-species. Also, we identify regions in human chromosomes that are absent from the DNA of close primates, specifying novel traits of human uniqueness.
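A minimal sketch of the finite-context-model idea underlying this line of work, assuming a simple order-k model with add-one smoothing (the thesis compressor mixes several such models and is far more elaborate):

```python
from collections import defaultdict
from math import log2

class FCM:
    """Order-k finite-context model. Summing per-symbol code lengths
    approximates the compressed size C(x), the quantity behind measures
    such as the Normalized Relative Compression,
    NRC(x||y) = C(x||y) / (|x| * log2 |alphabet|)."""

    def __init__(self, k, alphabet):
        self.k, self.a = k, len(alphabet)
        self.counts = defaultdict(lambda: defaultdict(int))

    def bits(self, seq, update=True):
        """Total code length of seq in bits. Pass update=False to measure
        seq against a model trained on a reference sequence, i.e. the
        relative-compression setting."""
        total = 0.0
        for i in range(self.k, len(seq)):
            ctx, sym = seq[i - self.k:i], seq[i]
            c = self.counts[ctx]
            total += -log2((c[sym] + 1) / (sum(c.values()) + self.a))
            if update:
                c[sym] += 1
        return total
```

A highly repetitive DNA string costs far fewer bits per symbol than the 2-bit ceiling of a four-letter alphabet, which is exactly the signal the complexity profiles exploit.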
A new multistage lattice vector quantization with adaptive subband thresholding for image compression
Lattice vector quantization (LVQ) reduces coding complexity and computation due to its regular structure. A new multistage LVQ (MLVQ) using an adaptive subband thresholding technique is presented and applied to image compression. The technique concentrates on reducing the quantization error of the quantized vectors by "blowing out" the residual quantization errors with an LVQ scale factor. The significant coefficients of each subband are identified using an optimum adaptive thresholding scheme for each subband. A variable-length coding procedure using Golomb codes is used to compress the codebook index, which produces a very efficient and fast technique for entropy coding. Experimental results show the MLVQ to be significantly better than JPEG 2000 and recent VQ techniques for various test images.
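Golomb coding, used here for the codebook indices, is simple enough to sketch in full; this is a textbook encoder (unary quotient plus truncated-binary remainder), not the paper's implementation:

```python
def golomb_encode(n, m):
    """Golomb code of a non-negative integer n with parameter m:
    a unary quotient terminated by '0', then a truncated-binary remainder.
    Short codes go to small n, which suits geometrically distributed
    codebook indices."""
    if m == 1:                        # degenerate case: pure unary
        return "1" * n + "0"
    q, r = divmod(n, m)
    out = "1" * q + "0"
    b = (m - 1).bit_length()          # ceil(log2 m)
    cutoff = (1 << b) - m
    if r < cutoff:
        out += format(r, "b").zfill(b - 1)
    else:
        out += format(r + cutoff, "b").zfill(b)
    return out
```

When m is a power of two this reduces to a Rice code, whose remainder is a plain fixed-width field and is particularly cheap to implement.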