On Compressing Collections of Substring Samples
Authors
Golnaz Badkobeh
Sara Giuliani
Zsuzsanna Lipták
Simon J. Puglisi
Publication date
1 January 2022
Abstract
Publisher Copyright: © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Given a string X = X[1..n] of length n, and integers m and s, such that n > m ≥ 2s > 0, we consider the problem of compressing the string S formed by concatenating the substrings of X of length m starting at positions i ≡ 1 (mod s). In particular, we provide an upper bound of (2n − m)/s + 2z + (m − s) on the size of the Lempel-Ziv (LZ77) parsing of S, where z is the size of the parsing of X. We also show that a related bound holds regardless of the order in which the substrings are concatenated in the formation of S. If X is viewed as a genome sequence, the above substring sampling process corresponds to an idealized model of short read DNA sequencing.
Peer reviewed
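The sampling process described in the abstract can be illustrated with a minimal Python sketch. The function names (sample_substrings, lz77_phrase_count, abstract_upper_bound) are hypothetical and do not come from the paper. Two assumptions are made: only full-length substrings are sampled (start positions 1, 1 + s, 1 + 2s, ... as long as the substring fits in X), and the LZ77 parse is a simple greedy variant in which each phrase is either a single previously unseen character or the longest match starting at an earlier position, with overlap allowed. The paper's exact LZ77 definition may differ, so the phrase counts here are illustrative and are not a verification of the stated bound.

```python
def sample_substrings(x, m, s):
    """Length-m substrings of x starting at 1-based positions i ≡ 1 (mod s).

    Assumption: only substrings that fit entirely inside x are sampled.
    """
    n = len(x)
    if not (n > m >= 2 * s > 0):
        raise ValueError("expected n > m >= 2s > 0")
    # 1-based positions 1, 1+s, 1+2s, ... are 0-based offsets 0, s, 2s, ...
    return [x[i:i + m] for i in range(0, n - m + 1, s)]


def lz77_phrase_count(text):
    """Number of phrases in a naive greedy LZ77-style parse (assumed variant).

    Each phrase is the longest prefix of the unparsed suffix that also occurs
    starting at an earlier position (overlap allowed), or a single character
    if no such match exists. Cubic time in the worst case; small inputs only.
    """
    n = len(text)
    i = 0
    phrases = 0
    while i < n:
        longest = 0
        for j in range(i):  # candidate earlier start positions
            k = 0
            while i + k < n and text[j + k] == text[i + k]:
                k += 1
            longest = max(longest, k)
        i += max(longest, 1)  # a fresh character forms a phrase of length 1
        phrases += 1
    return phrases


def abstract_upper_bound(n, m, s, z):
    """The expression (2n - m)/s + 2z + (m - s) quoted in the abstract."""
    return (2 * n - m) / s + 2 * z + (m - s)


# Toy example: n = 8, m = 4, s = 2 satisfies n > m >= 2s > 0.
x = "abcdefgh"
samples = sample_substrings(x, m=4, s=2)
big_s = "".join(samples)  # the string S, concatenated in left-to-right order
z = lz77_phrase_count(x)
print(samples, big_s)
print(lz77_phrase_count(big_s), abstract_upper_bound(len(x), 4, 2, z))
```

Concatenating the samples in a different order (for example, after shuffling the list) gives a different S; the abstract states that a related bound still holds in that case.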
Available Versions
Helsingin yliopiston digitaalinen arkisto
oai:helda.helsinki.fi:10138/35...
Last time updated on 12/03/2023
Goldsmiths Research Online
oai:eprints.gold.ac.uk:32777
Last time updated on 17/12/2022
Catalogo dei prodotti della ricerca
oai:iris.univr.it:11562/109182...
Last time updated on 19/04/2023