2 research outputs found

Representation-Oblivious Error Correction by Natural Redundancy

Author: Jiang Anxiao
Upadhyaya Pulakesh
Publication date: 09/11/2018

Storage systems need substantially stronger error correction capabilities, especially for long-term storage, where accumulated errors can exceed the decoding threshold of error-correcting codes (ECCs). In this work, a new scheme is presented that uses deep learning to perform soft decoding of noisy files based on their natural redundancy. The soft decoding result is then combined with ECCs for substantially better error correction performance. The scheme is representation-oblivious: it requires no prior knowledge of how data are represented (e.g., mapped from symbols to bits, compressed, and combined with metadata) in different types of files, which makes the solution more convenient to use in storage systems. Experimental results confirm that the scheme can substantially improve the ability to recover data for different types of files, even when the bit error rates in the files significantly exceed the decoding threshold of the ECC.

Comment: 7 pages, 5 figures, submitted to IEEE International Conference on Communications-201

arXiv.org e-Print Archive

Machine Learning for Error Correction with Natural Redundancy

Author: Jiang Anxiao
Upadhyaya Pulakesh
Publication date: 15/10/2019

The persistent storage of big data requires advanced error correction schemes. The classical approach is to use error-correcting codes (ECCs). This work studies an alternative approach, which uses the redundancy inherent in the data itself for error correction. This type of redundancy, called Natural Redundancy (NR), is abundant in many types of uncompressed or even compressed files. The complex structures of Natural Redundancy, however, require machine learning techniques. In this paper, we study two fundamental approaches to using Natural Redundancy for error correction. The first approach, called Representation-Oblivious, requires no prior knowledge of how data are represented or compressed in files. It uses deep learning to detect file types accurately, and then mines Natural Redundancy for soft decoding. The second approach, called Representation-Aware, assumes that such knowledge is available and uses it for error correction. Furthermore, both approaches combine NR-based decoding with ECCs. Both experimental results and analysis show that such an integrated scheme can substantially improve error correction performance.

Comment: 35 pages. arXiv admin note: text overlap with arXiv:1811.0403

arXiv.org e-Print Archive