Sliding Window Update Using Suffix Arrays
Ferreira, A.
; Oliveira, A. L.
;
Figueiredo, M. A. T.
Sliding Window Update Using Suffix Arrays, Proc IEEE Data Compression Conference - DCC, Snowbird, UT, USA, United States, Vol. --, pp. 456 - 456, March, 2011.
Digital Object Identifier:
Download Full text PDF ( 339 KBs)
Abstract
The sliding window dictionary-based algorithms of the Lempel-Ziv (LZ) 77 family are widely used for universal lossless data compression. The encoding component of these algorithms performs repeated substring search. Data structures, such as hash-tables, binary search trees, and suffix trees have been used to speedup these searches, at the expense of memory usage. Previous work has shown how suffix arrays (SA) can be used for dictionary representation and LZ77 decomposition. In this paper, we improve over that work by proposing a new efficient algorithm to update the sliding window each time a token is outputted. Our algorithm toggles between two SA on consecutive tokens. The resulting SA-based encoder requires less memory than tree-based encoders. We compare our technique against tree-based encoders, on a large set of benchmark files. In some cases, our encoder is also faster than tree-based encoders.