ALGORITHMS - third_party/bzip2 - Git at Google


 Bzip2 is not research work, in the sense that it doesn't present any
 new ideas.  Rather, it's an engineering exercise based on existing
 ideas.

 Four documents describe essentially all the ideas behind bzip2:

    Michael Burrows and D. J. Wheeler:
      "A block-sorting lossless data compression algorithm"
       10th May 1994.
       Digital SRC Research Report 124.
       ftp://ftp.digital.com/pub/DEC/SRC/research-reports/SRC-124.ps.gz

    Daniel S. Hirschberg and Debra A. LeLewer
      "Efficient Decoding of Prefix Codes"
       Communications of the ACM, April 1990, Vol 33, Number 4.
       You might be able to get an electronic copy of this
          from the ACM Digital Library.

    David J. Wheeler
       Program bred3.c and accompanying document bred3.ps.
       This contains the idea behind the multi-table Huffman
       coding scheme.
       ftp://ftp.cl.cam.ac.uk/pub/user/djw3/

    Jon L. Bentley and Robert Sedgewick
      "Fast Algorithms for Sorting and Searching Strings"
       Available from Sedgewick's web page,
       www.cs.princeton.edu/~rs

 The following paper gives valuable additional insights into the
 algorithm, but is not immediately the basis of any code
 used in bzip2.

    Peter Fenwick:
       Block Sorting Text Compression
       Proceedings of the 19th Australasian Computer Science Conference,
         Melbourne, Australia.  Jan 31 - Feb 2, 1996.
       ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps

 All three are well written, and make fascinating reading.  If you want
 to modify bzip2 in any non-trivial way, I strongly suggest you obtain,
 read and understand these papers.

 I am much indebted to the various authors for their help, support and
 advice.

	Bzip2 is not research work, in the sense that it doesn't present any
	new ideas. Rather, it's an engineering exercise based on existing
	ideas.

	Four documents describe essentially all the ideas behind bzip2:

	Michael Burrows and D. J. Wheeler:
	"A block-sorting lossless data compression algorithm"
	10th May 1994.
	Digital SRC Research Report 124.
	ftp://ftp.digital.com/pub/DEC/SRC/research-reports/SRC-124.ps.gz

	Daniel S. Hirschberg and Debra A. LeLewer
	"Efficient Decoding of Prefix Codes"
	Communications of the ACM, April 1990, Vol 33, Number 4.
	You might be able to get an electronic copy of this
	from the ACM Digital Library.

	David J. Wheeler
	Program bred3.c and accompanying document bred3.ps.
	This contains the idea behind the multi-table Huffman
	coding scheme.
	ftp://ftp.cl.cam.ac.uk/pub/user/djw3/

	Jon L. Bentley and Robert Sedgewick
	"Fast Algorithms for Sorting and Searching Strings"
	Available from Sedgewick's web page,
	www.cs.princeton.edu/~rs

	The following paper gives valuable additional insights into the
	algorithm, but is not immediately the basis of any code
	used in bzip2.

	Peter Fenwick:
	Block Sorting Text Compression
	Proceedings of the 19th Australasian Computer Science Conference,
	Melbourne, Australia. Jan 31 - Feb 2, 1996.
	ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps

	All three are well written, and make fascinating reading. If you want
	to modify bzip2 in any non-trivial way, I strongly suggest you obtain,
	read and understand these papers.

	I am much indebted to the various authors for their help, support and
	advice.