Sliding Window Update Using Suffix Arrays
- Ferreira, A.; A.M.L. Oliveira; Figueiredo, M. A. T.;
"Sliding Window Update Using Suffix Arrays
IEEE Data Compression Conf. - DCC
Snowbird, UT, USA
The sliding window dictionary-based algorithms of the Lempel-Ziv (LZ) 77 family are widely used for universal lossless data compression. The encoding component of these algorithms performs repeated substring search. Data structures, such as hash-tables, binary search trees, and suffix trees have been used to speedup these searches, at the expense of memory usage. Previous work has shown how suffix arrays (SA) can be used for dictionary representation and LZ77 decomposition. In this paper, we improve over that work by proposing a new efficient algorithm to update the sliding window each time a token is outputted. Our algorithm toggles between two SA on consecutive tokens. The resulting SA-based encoder requires less memory than tree-based encoders. We compare our technique against tree-based encoders, on a large set of benchmark files. In some cases, our encoder is also faster than tree-based encoders.
Full text (PDF 339KB)