bzip2-0.1

commit: 33d134030248633ffa7d60c0a35a783c46da034b [log] [tgz]
author: Julian Seward <jseward@acm.org> Thu Aug 07 22:13:13 1997 +0200
committer: Julian Seward <jseward@acm.org> Thu Aug 07 22:13:13 1997 +0200
tree: b760dc34185dccc7054989c1472478574223cc31
diff --git a/ALGORITHMS b/ALGORITHMS
new file mode 100644
index 0000000..7c7d2ca
--- /dev/null
+++ b/ALGORITHMS

@@ -0,0 +1,47 @@
+
+Bzip2 is not research work, in the sense that it doesn't present any
+new ideas.  Rather, it's an engineering exercise based on existing
+ideas.
+
+Four documents describe essentially all the ideas behind bzip2:
+ 
+   Michael Burrows and D. J. Wheeler:
+     "A block-sorting lossless data compression algorithm"
+      10th May 1994. 
+      Digital SRC Research Report 124.
+      ftp://ftp.digital.com/pub/DEC/SRC/research-reports/SRC-124.ps.gz
+
+   Daniel S. Hirschberg and Debra A. LeLewer
+     "Efficient Decoding of Prefix Codes"
+      Communications of the ACM, April 1990, Vol 33, Number 4.
+      You might be able to get an electronic copy of this
+         from the ACM Digital Library.
+
+   David J. Wheeler
+      Program bred3.c and accompanying document bred3.ps.
+      This contains the idea behind the multi-table Huffman
+      coding scheme.
+      ftp://ftp.cl.cam.ac.uk/pub/user/djw3/
+
+   Jon L. Bentley and Robert Sedgewick
+     "Fast Algorithms for Sorting and Searching Strings"
+      Available from Sedgewick's web page,
+      www.cs.princeton.edu/~rs
+
+The following paper gives valuable additional insights into the
+algorithm, but is not immediately the basis of any code
+used in bzip2.
+
+   Peter Fenwick:
+      Block Sorting Text Compression
+      Proceedings of the 19th Australasian Computer Science Conference,
+        Melbourne, Australia.  Jan 31 - Feb 2, 1996.
+      ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps
+      
+All three are well written, and make fascinating reading.  If you want
+to modify bzip2 in any non-trivial way, I strongly suggest you obtain,
+read and understand these papers.
+
+I am much indebted to the various authors for their help, support and
+advice.
+

diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000..a43ea21
--- /dev/null
+++ b/LICENSE

@@ -0,0 +1,339 @@
+		    GNU GENERAL PUBLIC LICENSE
+		       Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.
+                          675 Mass Ave, Cambridge, MA 02139, USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+			    Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Library General Public License instead.)  You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+
+  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+		    GNU GENERAL PUBLIC LICENSE
+   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+  0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License.  The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language.  (Hereinafter, translation is included without limitation in
+the term "modification".)  Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope.  The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+  1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+  2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+    a) You must cause the modified files to carry prominent notices
+    stating that you changed the files and the date of any change.
+
+    b) You must cause any work that you distribute or publish, that in
+    whole or in part contains or is derived from the Program or any
+    part thereof, to be licensed as a whole at no charge to all third
+    parties under the terms of this License.
+
+    c) If the modified program normally reads commands interactively
+    when run, you must cause it, when started running for such
+    interactive use in the most ordinary way, to print or display an
+    announcement including an appropriate copyright notice and a
+    notice that there is no warranty (or else, saying that you provide
+    a warranty) and that users may redistribute the program under
+    these conditions, and telling the user how to view a copy of this
+    License.  (Exception: if the Program itself is interactive but
+    does not normally print such an announcement, your work based on
+    the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole.  If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works.  But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+  3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+    a) Accompany it with the complete corresponding machine-readable
+    source code, which must be distributed under the terms of Sections
+    1 and 2 above on a medium customarily used for software interchange; or,
+
+    b) Accompany it with a written offer, valid for at least three
+    years, to give any third party, for a charge no more than your
+    cost of physically performing source distribution, a complete
+    machine-readable copy of the corresponding source code, to be
+    distributed under the terms of Sections 1 and 2 above on a medium
+    customarily used for software interchange; or,
+
+    c) Accompany it with the information you received as to the offer
+    to distribute corresponding source code.  (This alternative is
+    allowed only for noncommercial distribution and only if you
+    received the program in object code or executable form with such
+    an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it.  For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable.  However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+  4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License.  Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+  5. You are not required to accept this License, since you have not
+signed it.  However, nothing else grants you permission to modify or
+distribute the Program or its derivative works.  These actions are
+prohibited by law if you do not accept this License.  Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+  6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions.  You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+  7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all.  For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices.  Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+  8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded.  In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+  9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time.  Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number.  If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation.  If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+  10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission.  For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this.  Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+			    NO WARRANTY
+
+  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+		     END OF TERMS AND CONDITIONS
+
+	Appendix: How to Apply These Terms to Your New Programs
+
+  If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+  To do so, attach the following notices to the program.  It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+    <one line to give the program's name and a brief idea of what it does.>
+    Copyright (C) 19yy  <name of author>
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+    Gnomovision version 69, Copyright (C) 19yy name of author
+    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+    This is free software, and you are welcome to redistribute it
+    under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License.  Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary.  Here is a sample; alter the names:
+
+  Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+  `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+  <signature of Ty Coon>, 1 April 1989
+  Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs.  If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library.  If this is what you want to do, use the GNU Library General
+Public License instead of this License.

diff --git a/Makefile b/Makefile
new file mode 100644
index 0000000..d124743
--- /dev/null
+++ b/Makefile

@@ -0,0 +1,30 @@
+
+CC = gcc
+SH = /bin/sh
+
+CFLAGS = -O3 -fomit-frame-pointer -funroll-loops -Wall -Winline -W
+
+
+
+all:
+	cat words0
+	$(CC) $(CFLAGS) -o bzip2 bzip2.c
+	$(CC) $(CFLAGS) -o bzip2recover bzip2recover.c
+	rm -f bunzip2
+	ln -s ./bzip2 ./bunzip2
+	cat words1
+	./bzip2 -1 < sample1.ref > sample1.rb2
+	./bzip2 -2 < sample2.ref > sample2.rb2
+	./bunzip2 < sample1.bz2 > sample1.tst
+	./bunzip2 < sample2.bz2 > sample2.tst
+	cat words2
+	cmp sample1.bz2 sample1.rb2 
+	cmp sample2.bz2 sample2.rb2
+	cmp sample1.tst sample1.ref
+	cmp sample2.tst sample2.ref
+	cat words3
+
+
+clean:
+	rm -f bzip2 bunzip2 bzip2recover sample*.tst sample*.rb2
+

diff --git a/README b/README
new file mode 100644
index 0000000..d77830f
--- /dev/null
+++ b/README

@@ -0,0 +1,243 @@
+
+GREETINGS!
+
+   This is the README for bzip2, my block-sorting file compressor,
+   version 0.1.  
+
+   bzip2 is distributed under the GNU General Public License version 2;
+   for details, see the file LICENSE.  Pointers to the algorithms used
+   are in ALGORITHMS.  Instructions for use are in bzip2.1.preformatted.
+
+   Please read this file carefully.
+
+
+
+HOW TO BUILD
+
+   -- for UNIX:
+
+        Type `make'.     (tough, huh? :-)
+
+        This creates binaries "bzip2", and "bunzip2",
+        which is a symbolic link to "bzip2".
+
+        It also runs four compress-decompress tests to make sure
+        things are working properly.  If all goes well, you should be up &
+        running.  Please be sure to read the output from `make'
+        just to be sure that the tests went ok.
+
+        To install bzip2 properly:
+
+           -- Copy the binary "bzip2" to a publically visible place,
+              possibly /usr/bin, /usr/common/bin or /usr/local/bin.
+
+           -- In that directory, make "bunzip2" be a symbolic link
+              to "bzip2".
+
+           -- Copy the manual page, bzip2.1, to the relevant place.
+              Probably the right place is /usr/man/man1/.
+   
+   -- for Windows 95 and NT: 
+
+        For a start, do you *really* want to recompile bzip2?  
+        The standard distribution includes a pre-compiled version
+        for Windows 95 and NT, `bzip2.exe'.
+
+        This executable was created with Jacob Navia's excellent
+        port to Win32 of Chris Fraser & David Hanson's excellent
+        ANSI C compiler, "lcc".  You can get to it at the pages
+        of the CS department of Princeton University, 
+        www.cs.princeton.edu.  
+        I have not tried to compile this version of bzip2 with
+        a commercial C compiler such as MS Visual C, as I don't
+        have one available.
+
+        Note that lcc is designed primarily to be portable and
+        fast.  Code quality is a secondary aim, so bzip2.exe
+        runs perhaps 40% slower than it could if compiled with
+        a good optimising compiler.
+
+        I compiled a previous version of bzip (0.21) with Borland
+        C 5.0, which worked fine, and with MS VC++ 2.0, which
+        didn't.  Here is an comment from the README for bzip-0.21.
+
+           MS VC++ 2.0's optimising compiler has a bug which, at 
+           maximum optimisation, gives an executable which produces 
+           garbage compressed files.  Proceed with caution. 
+           I do not know whether or not this happens with later 
+           versions of VC++.
+
+           Edit the defines starting at line 86 of bzip.c to 
+           select your platform/compiler combination, and then compile.
+           Then check that the resulting executable (assumed to be 
+           called bzip.exe) works correctly, using the SELFTEST.BAT file.  
+           Bearing in mind the previous paragraph, the self-test is
+           important.
+
+        Note that the defines which bzip-0.21 had, to support 
+        compilation with VC 2.0 and BC 5.0, are gone.  Windows
+        is not my preferred operating system, and I am, for the
+        moment, content with the modestly fast executable created
+        by lcc-win32.
+
+   A manual page is supplied, unformatted (bzip2.1),
+   preformatted (bzip2.1.preformatted), and preformatted
+   and sanitised for MS-DOS (bzip2.txt).
+
+   
+
+COMPILATION NOTES
+
+   bzip2 should work on any 32 or 64-bit machine.  It is known to work
+   [meaning: it has compiled and passed self-tests] on the 
+   following platform-os combinations:
+
+      Intel i386/i486        running Linux 2.0.21
+      Sun Sparcs (various)   running SunOS 4.1.4 and Solaris 2.5
+      Intel i386/i486        running Windows 95 and NT
+      DEC Alpha              running Digital Unix 4.0
+
+   Following the release of bzip-0.21, many people mailed me
+   from around the world to say they had made it work on all sorts
+   of weird and wonderful machines.  Chances are, if you have
+   a reasonable ANSI C compiler and a 32-bit machine, you can
+   get it to work.
+
+   The #defines starting at around line 82 of bzip2.c supply some
+   degree of platform-independance.  If you configure bzip2 for some
+   new far-out platform which is not covered by the existing definitions,
+   please send me the relevant definitions.
+
+   I recommend GNU C for compilation.  The code is standard ANSI C,
+   except for the Unix-specific file handling, so any ANSI C compiler
+   should work.  Note however that the many routines marked INLINE
+   should be inlined by your compiler, else performance will be very
+   poor.  Asking your compiler to unroll loops gives some
+   small improvement too; for gcc, the relevant flag is
+   -funroll-loops.
+
+   On a 386/486 machines, I'd recommend giving gcc the
+   -fomit-frame-pointer flag; this liberates another register for
+   allocation, which measurably improves performance.
+
+   I used the abovementioned lcc compiler to develop bzip2.
+   I would highly recommend this compiler for day-to-day development;
+   it is fast, reliable, lightweight, has an excellent profiler,
+   and is generally excellent.  And it's fun to retarget, if you're
+   into that kind of thing.
+
+   If you compile bzip2 on a new platform or with a new compiler,
+   please be sure to run the four compress-decompress tests, either
+   using the Makefile, or with the test.bat (MSDOS) or test.cmd (OS/2)
+   files.  Some compilers have been seen to introduce subtle bugs
+   when optimising, so this check is important.  Ideally you should
+   then go on to test bzip2 on a file several megabytes or even
+   tens of megabytes long, just to be 110% sure.  ``Professional
+   programmers are paranoid programmers.'' (anon).
+
+
+
+VALIDATION
+
+   Correct operation, in the sense that a compressed file can always be
+   decompressed to reproduce the original, is obviously of paramount
+   importance.  To validate bzip2, I used a modified version of 
+   Mark Nelson's churn program.  Churn is an automated test driver
+   which recursively traverses a directory structure, using bzip2 to
+   compress and then decompress each file it encounters, and checking
+   that the decompressed data is the same as the original.  As test 
+   material, I used several runs over several filesystems of differing
+   sizes.
+
+   One set of tests was done on my base Linux filesystem,
+   410 megabytes in 23,000 files.  There were several runs over
+   this filesystem, in various configurations designed to break bzip2.
+   That filesystem also contained some specially constructed test
+   files designed to exercise boundary cases in the code.
+   This included files of zero length, various long, highly repetitive 
+   files, and some files which generate blocks with all values the same.
+
+   The other set of tests was done just with the "normal" configuration,
+   but on a much larger quantity of data.
+
+      Tests are:
+
+         Linux FS, 410M, 23000 files
+
+         As above, with --repetitive-fast
+
+         As above, with -1
+
+         Low level disk image of a disk containing
+            Windows NT4.0; 420M in a single huge file
+
+         Linux distribution, incl Slackware, 
+            all GNU sources.   1900M in 2300 files.
+
+         Approx ~100M compiler sources and related
+            programming tools, running under Purify.
+
+         About 500M of data in 120 files of around
+            4 M each.  This is raw data from a 
+            biomagnetometer (SQUID-based thing).
+
+      Overall, total volume of test data is about
+         3300 megabytes in 25000 files.
+
+   The distribution does four tests after building bzip.  These tests
+   include test decompressions of pre-supplied compressed files, so
+   they not only test that bzip works correctly on the machine it was
+   built on, but can also decompress files compressed on a different
+   machine.  This guards against unforseen interoperability problems.
+
+
+Please read and be aware of the following:
+
+WARNING:
+
+   This program (attempts to) compress data by performing several
+   non-trivial transformations on it.  Unless you are 100% familiar
+   with *all* the algorithms contained herein, and with the
+   consequences of modifying them, you should NOT meddle with the
+   compression or decompression machinery.  Incorrect changes can and
+   very likely *will* lead to disastrous loss of data.
+
+
+DISCLAIMER:
+
+   I TAKE NO RESPONSIBILITY FOR ANY LOSS OF DATA ARISING FROM THE
+   USE OF THIS PROGRAM, HOWSOEVER CAUSED.
+
+   Every compression of a file implies an assumption that the
+   compressed file can be decompressed to reproduce the original.
+   Great efforts in design, coding and testing have been made to
+   ensure that this program works correctly.  However, the complexity
+   of the algorithms, and, in particular, the presence of various
+   special cases in the code which occur with very low but non-zero
+   probability make it impossible to rule out the possibility of bugs
+   remaining in the program.  DO NOT COMPRESS ANY DATA WITH THIS
+   PROGRAM UNLESS YOU ARE PREPARED TO ACCEPT THE POSSIBILITY, HOWEVER
+   SMALL, THAT THE DATA WILL NOT BE RECOVERABLE.
+
+   That is not to say this program is inherently unreliable.  Indeed,
+   I very much hope the opposite is true.  bzip2 has been carefully
+   constructed and extensively tested.
+
+End of nasty legalities.
+
+
+I hope you find bzip2 useful.  Feel free to contact me at
+   jseward@acm.org
+if you have any suggestions or queries.  Many people mailed me with
+comments, suggestions and patches after the releases of 0.15 and 0.21, 
+and the changes in bzip2 are largely a result of this feedback.
+I thank you for your comments.
+
+Julian Seward
+
+Manchester, UK
+18 July 1996 (version 0.15)
+25 August 1996 (version 0.21)
+
+Guildford, Surrey, UK
+7 August 1997 (bzip2, version 0.0)
\ No newline at end of file

diff --git a/README.DOS b/README.DOS
new file mode 100644
index 0000000..d522b81
--- /dev/null
+++ b/README.DOS

@@ -0,0 +1,20 @@
+

+Windows 95 & Windows NT users:

+

+1.  There's a pre-built executable, bzip2.exe, which

+    should work.  You don't need to compile anything.

+    You can run the `test.bat' batch file to check

+    the executable is working ok, if you want.

+

+2.  The control-C signal catcher seems pretty dodgy

+    under Windows, at least for the executable supplied.

+    When it catches a control-C, bzip2 tries to delete

+    its output file, so you don't get left with a half-

+    baked file.  But this sometimes seems to fail

+    under Windows.  Caveat Emptor!  I think I am doing

+    something not-quite-right in the signal catching.

+    Windows-&-C gurus got any suggestions?

+

+    Control-C handling all seems to work fine under Unix.

+

+7 Aug 97


diff --git a/bzip2.1 b/bzip2.1
new file mode 100644
index 0000000..9094c7c
--- /dev/null
+++ b/bzip2.1

@@ -0,0 +1,441 @@
+.PU
+.TH bzip2 1
+.SH NAME
+bzip2, bunzip2 \- a block-sorting file compressor, v0.1
+.br
+bzip2recover \- recovers data from damaged bzip2 files
+
+.SH SYNOPSIS
+.ll +8
+.B bzip2
+.RB [ " \-cdfkstvVL123456789 " ]
+[
+.I "filenames \&..."
+]
+.ll -8
+.br
+.B bunzip2
+.RB [ " \-kvsVL " ]
+[
+.I "filenames \&..."
+]
+.br
+.B bzip2recover
+.I "filename"
+
+.SH DESCRIPTION
+.I Bzip2
+compresses files using the Burrows-Wheeler block-sorting 
+text compression algorithm, and Huffman coding.
+Compression is generally considerably
+better than that 
+achieved by more conventional LZ77/LZ78-based compressors,
+and approaches the performance of the PPM family of statistical
+compressors.
+
+The command-line options are deliberately very similar to 
+those of 
+.I GNU Gzip,
+but they are not identical.
+
+.I Bzip2 
+expects a list of file names to accompany the command-line flags.  
+Each file is replaced by a compressed version of itself,
+with the name "original_name.bz2".
+Each compressed file has the same modification date and permissions
+as the corresponding original, so that these properties can be 
+correctly restored at decompression time.  File name handling is
+naive in the sense that there is no mechanism for preserving
+original file names, permissions and dates in filesystems 
+which lack these concepts, or have serious file name length
+restrictions, such as MS-DOS.
+
+.I Bzip2
+and
+.I bunzip2
+will not overwrite existing files; if you want this to happen,
+you should delete them first.
+
+If no file names are specified,
+.I bzip2
+compresses from standard input to standard output.
+In this case,
+.I bzip2
+will decline to write compressed output to a terminal, as
+this would be entirely incomprehensible and therefore pointless.
+
+.I Bunzip2
+(or
+.I bzip2 \-d
+) decompresses and restores all specified files whose names
+end in ".bz2".
+Files without this suffix are ignored.  
+Again, supplying no filenames
+causes decompression from standard input to standard output.
+
+You can also compress or decompress files to
+the standard output by giving the \-c flag.
+You can decompress multiple files like this, but you may
+only compress a single file this way, since it would otherwise
+be difficult to separate out the compressed representations of
+the original files.
+
+Compression is always performed, even if the compressed file is
+slightly larger than the original.  Files of less than about
+one hundred bytes tend to get larger, since the compression 
+mechanism has a constant overhead in the region of 50 bytes.
+Random data (including the output of most file compressors)
+is coded at about 8.05 bits per byte, giving an expansion of 
+around 0.5%.
+
+As a self-check for your protection,
+.I bzip2
+uses 32-bit CRCs to make sure that the decompressed
+version of a file is identical to the original.  
+This guards against corruption of the compressed data,
+and against undetected bugs in
+.I bzip2
+(hopefully very unlikely).
+The chances of data corruption going undetected is 
+microscopic, about one chance in four billion
+for each file processed.  Be aware, though, that the check
+occurs upon decompression, so it can only tell you that
+that something is wrong.  It can't help you recover the
+original uncompressed data.
+You can use
+.I bzip2recover
+to try to recover data from damaged files.
+
+Return values: 
+0 for a normal exit, 
+1 for environmental
+problems (file not found, invalid flags, I/O errors, &c),
+2 to indicate a corrupt compressed file,
+3 for an internal consistency error (eg, bug) which caused
+.I bzip2 
+to panic.
+
+.SH MEMORY MANAGEMENT
+.I Bzip2
+compresses large files in blocks.  The block size affects both the 
+compression ratio achieved, and the amount of memory needed both for
+compression and decompression.  The flags \-1 through \-9
+specify the block size to be 100,000 bytes through 900,000 bytes
+(the default) respectively.  At decompression-time, the block size used for
+compression is read from the header of the compressed file, and
+.I bunzip2
+then allocates itself just enough memory to decompress the file.
+Since block sizes are stored in compressed files, it follows that the flags
+\-1 to \-9
+are irrelevant to and so ignored during decompression.
+Compression and decompression requirements, in bytes, can be estimated as:
+
+      Compression:   400k + ( 7 x block size )
+
+      Decompression: 100k + ( 5 x block size ), or
+.br
+                     100k + ( 2.5 x block size )
+
+Larger block sizes give rapidly diminishing marginal returns; most
+of the 
+compression comes from the first two or three hundred k of block size,
+a fact worth bearing in mind when using 
+.I bzip2
+on small machines.  It is also important to appreciate that the
+decompression memory requirement is set at compression-time by the
+choice of block size.
+
+For files compressed with the default 900k block size, 
+.I bunzip2
+will require about 4600 kbytes to decompress.
+To support decompression of any file on a 4 megabyte machine,
+.I bunzip2
+has an option to decompress using approximately half this
+amount of memory, about 2300 kbytes.  Decompression speed is
+also halved, so you should use this option only where necessary.
+The relevant flag is \-s.
+
+In general, try and use the largest block size
+memory constraints allow, since that maximises the compression
+achieved.  Compression and decompression
+speed are virtually unaffected by block size.
+
+Another significant point applies to files which fit in a single
+block -- that means most files you'd encounter using a large 
+block size.  The amount of real memory touched is proportional
+to the size of the file, since the file is smaller than a block.
+For example, compressing a file 20,000 bytes long with the flag
+\-9
+will cause the compressor to allocate around
+6700k of memory, but only touch 400k + 20000 * 7 = 540
+kbytes of it.  Similarly, the decompressor will allocate 4600k but
+only touch 100k + 20000 * 5 = 200 kbytes.
+
+Here is a table which summarises the maximum memory usage for 
+different block sizes.  Also recorded is the total compressed
+size for 14 files of the Calgary Text Compression Corpus
+totalling 3,141,622 bytes.  This column gives some feel for how
+compression varies with block size.  These figures tend to understate
+the advantage of larger block sizes for larger files, since the
+Corpus is dominated by smaller files.
+
+           Compress   Decompress   Decompress   Corpus
+    Flag     usage      usage       -s usage     Size
+
+     -1      1100k       600k         350k      914704
+     -2      1800k      1100k         600k      877703
+     -3      2500k      1600k         850k      860338
+     -4      3200k      2100k        1100k      846899
+     -5      3900k      2600k        1350k      845160
+     -6      4600k      3100k        1600k      838626
+     -7      5400k      3600k        1850k      834096
+     -8      6000k      4100k        2100k      828642
+     -9      6700k      4600k        2350k      828642
+
+.SH OPTIONS
+.TP
+.B \-c  --stdout
+Compress or decompress to standard output.  \-c will decompress
+multiple files to stdout, but will only compress a single file to
+stdout.
+.TP
+.B \-d --decompress
+Force decompression.
+.I Bzip2
+and
+.I bunzip2
+are really the same program, and the decision about whether to
+compress or decompress is done on the basis of which name is
+used.  This flag overrides that mechanism, and forces
+.I bzip2
+to decompress.
+.TP 
+.B \-f --compress
+The complement to \-d: forces compression, regardless of the invokation
+name.
+.TP
+.B \-t --test
+Check integrity of the specified file(s), but don't decompress them.
+This really performs a trial decompression and throws away the result,
+using the low-memory decompression algorithm (see \-s).
+.TP
+.B \-k --keep
+Keep (don't delete) input files during compression or decompression.
+.TP
+.B \-s --small
+Reduce memory usage, both for compression and decompression.
+Files are decompressed using a modified algorithm which only
+requires 2.5 bytes per block byte.  This means any file can be
+decompressed in 2300k of memory, albeit somewhat more slowly than
+usual.
+
+During compression, -s selects a block size of 200k, which limits
+memory use to around the same figure, at the expense of your
+compression ratio.  In short, if your machine is low on memory
+(8 megabytes or less), use -s for everything.  See
+MEMORY MANAGEMENT above.
+
+.TP
+.B \-v --verbose
+Verbose mode -- show the compression ratio for each file processed.
+Further \-v's increase the verbosity level, spewing out lots of
+information which is primarily of interest for diagnostic purposes.
+.TP
+.B \-L --license
+Display the software version, license terms and conditions.
+.TP
+.B \-V --version
+Same as \-L.
+.TP
+.B \-1 to \-9 
+Set the block size to 100 k, 200 k .. 900 k when
+compressing.  Has no effect when decompressing.
+See MEMORY MANAGEMENT above.
+.TP
+.B \--repetitive-fast
+.I bzip2
+injects some small pseudo-random variations
+into very repetitive blocks to limit
+worst-case performance during compression.
+If sorting runs into difficulties, the block
+is randomised, and sorting is restarted.  
+Very roughly, 
+.I bzip2
+persists for three times as long as a well-behaved input
+would take before resorting to randomisation.
+This flag makes it give up much sooner.
+
+.TP
+.B \--repetitive-best
+Opposite of \--repetitive-fast; try a lot harder before 
+resorting to randomisation.
+
+.SH RECOVERING DATA FROM DAMAGED FILES
+.I bzip2
+compresses files in blocks, usually 900kbytes long.
+Each block is handled independently.  If a media or
+transmission error causes a multi-block .bz2 
+file to become damaged,
+it may be possible to recover data from the undamaged blocks
+in the file.  
+
+The compressed representation of each block is delimited by
+a 48-bit pattern, which makes it possible to find the block
+boundaries with reasonable certainty.  Each block also carries
+its own 32-bit CRC, so damaged blocks can be
+distinguished from undamaged ones.
+
+.I bzip2recover
+is a simple program whose purpose is to search for 
+blocks in .bz2 files, and write each block out into
+its own .bz2 file.  You can then use
+.I bzip2 -t
+to test the integrity of the resulting files, 
+and decompress those which are undamaged.
+
+.I bzip2recover
+takes a single argument, the name of the damaged file,
+and writes a number of files "rec0001file.bz2", "rec0002file.bz2",
+etc, containing the extracted blocks.  The output filenames
+are designed so that the use of wildcards in subsequent processing
+-- for example, "bzip2 -dc rec*file.bz2 > recovered_data" --
+lists the files in the "right" order.
+
+.I bzip2recover
+should be of most use dealing with large .bz2 files, as
+these will contain many blocks.  It is clearly futile to
+use it on damaged single-block files, since a damaged
+block cannot be recovered.  If you wish to minimise 
+any potential data loss through media or transmission
+errors, you might consider compressing with a smaller
+block size.
+
+.SH PERFORMANCE NOTES
+The sorting phase of compression gathers together similar strings
+in the file.  Because of this, files containing very long 
+runs of repeated symbols, like "aabaabaabaab ..." (repeated
+several hundred times) may compress extraordinarily slowly.
+You can use the
+\-vvvvv 
+option to monitor progress in great detail, if you want.
+Decompression speed is unaffected.
+
+Such pathological cases
+seem rare in practice, appearing mostly in artificially-constructed
+test files, and in low-level disk images.  It may be inadvisable to
+use 
+.I bzip2
+to compress the latter.  
+If you do get a file which causes severe slowness in compression,
+try making the block size as small as possible, with flag \-1.
+
+Incompressible or virtually-incompressible data may decompress
+rather more slowly than one would hope.  This is due to 
+a naive implementation of the move-to-front coder.
+
+.I bzip2
+usually allocates several megabytes of memory to operate in,
+and then charges all over it in a fairly random fashion.  This
+means that performance, both for compressing and decompressing,
+is largely determined by the speed
+at which your machine can service cache misses.  
+Because of this, small changes
+to the code to reduce the miss rate have been observed to give
+disproportionately large performance improvements.
+I imagine 
+.I bzip2
+will perform best on machines with very large caches.
+
+Test mode (\-t) uses the low-memory decompression algorithm
+(\-s).  This means test mode does not run as fast as it could;
+it could run as fast as the normal decompression machinery.
+This could easily be fixed at the cost of some code bloat.
+
+.SH CAVEATS
+I/O error messages are not as helpful as they could be.
+.I Bzip2
+tries hard to detect I/O errors and exit cleanly, but the
+details of what the problem is sometimes seem rather misleading.
+
+This manual page pertains to version 0.1 of 
+.I bzip2.  
+It may well happen that some future version will
+use a different compressed file format.  If you try to 
+decompress, using 0.1, a .bz2 file created with some
+future version which uses a different compressed file format,
+0.1 will complain that your file "is not a bzip2 file".
+If that happens, you should obtain a more recent version
+of 
+.I bzip2
+and use that to decompress the file.
+
+Wildcard expansion for Windows 95 and NT 
+is flaky.
+
+.I bzip2recover
+uses 32-bit integers to represent bit positions in
+compressed files, so it cannot handle compressed files
+more than 512 megabytes long.  This could easily be fixed.
+
+.I bzip2recover
+sometimes reports a very small, incomplete final block.
+This is spurious and can be safely ignored.
+
+.SH RELATIONSHIP TO bzip-0.21
+This program is a descendant of the 
+.I bzip
+program, version 0.21, which I released in August 1996.  
+The primary difference of
+.I bzip2
+is its avoidance of the possibly patented algorithms
+which were used in 0.21.  
+.I bzip2
+also brings various useful refinements (\-s, \-t),
+uses less memory, decompresses significantly faster, and
+has support for recovering data from damaged files.
+
+Because
+.I bzip2
+uses Huffman coding to construct the compressed bitstream,
+rather than the arithmetic coding used in 0.21,
+the compressed representations generated by the two programs
+are incompatible, and they will not interoperate.  The change
+in suffix from .bz to .bz2 reflects this.  It would have been
+helpful to at least allow
+.I bzip2
+to decompress files created by 0.21, but this would
+defeat the primary aim of having a patent-free compressor.
+
+Huffman coding necessarily involves some coding inefficiency
+compared to arithmetic coding.  This means that
+.I bzip2
+compresses about 1% worse than 0.21, an unfortunate but
+unavoidable fact-of-life.  On the other hand, decompression
+is approximately 50% faster for the same reason, and the
+change in file format gave an opportunity to add data-recovery
+features.  So it is not all bad.
+
+.SH AUTHOR
+Julian Seward, jseward@acm.org.
+
+The ideas embodied in 
+.I bzip
+and
+.I bzip2
+are due to (at least) the following people:
+Michael Burrows and David Wheeler (for the block sorting
+transformation), David Wheeler (again, for the Huffman coder),
+Peter Fenwick (for the structured coding model in 0.21, 
+and many refinements),
+and
+Alistair Moffat, Radford Neal and Ian Witten (for the arithmetic
+coder in 0.21).  I am much indebted for their help, support and advice.
+See the file ALGORITHMS in the source distribution for pointers to
+sources of documentation.
+Christian von Roques encouraged me to look for faster
+sorting algorithms, so as to speed up compression.
+Bela Lubkin encouraged me to improve the worst-case
+compression performance.
+Many people sent patches, helped with portability problems,
+lent machines, gave advice and were generally helpful.
+

diff --git a/bzip2.1.preformatted b/bzip2.1.preformatted
new file mode 100644
index 0000000..947dc97
--- /dev/null
+++ b/bzip2.1.preformatted

@@ -0,0 +1,462 @@
+
+
+
+bzip2(1)                                                 bzip2(1)
+
+
+NNAAMMEE
+       bzip2, bunzip2 - a block-sorting file compressor, v0.1
+       bzip2recover - recovers data from damaged bzip2 files
+
+
+SSYYNNOOPPSSIISS
+       bbzziipp22 [ --ccddffkkssttvvVVLL112233445566778899 ] [ _f_i_l_e_n_a_m_e_s _._._.  ]
+       bbuunnzziipp22 [ --kkvvssVVLL ] [ _f_i_l_e_n_a_m_e_s _._._.  ]
+       bbzziipp22rreeccoovveerr _f_i_l_e_n_a_m_e
+
+
+DDEESSCCRRIIPPTTIIOONN
+       _B_z_i_p_2  compresses  files  using the Burrows-Wheeler block-
+       sorting text compression algorithm,  and  Huffman  coding.
+       Compression  is  generally  considerably  better than that
+       achieved by more conventional LZ77/LZ78-based compressors,
+       and  approaches  the performance of the PPM family of sta-
+       tistical compressors.
+
+       The command-line options are deliberately very similar  to
+       those of _G_N_U _G_z_i_p_, but they are not identical.
+
+       _B_z_i_p_2  expects  a list of file names to accompany the com-
+       mand-line flags.  Each file is replaced  by  a  compressed
+       version  of  itself,  with  the  name "original_name.bz2".
+       Each compressed file has the same  modification  date  and
+       permissions  as  the corresponding original, so that these
+       properties can  be  correctly  restored  at  decompression
+       time.  File name handling is naive in the sense that there
+       is no mechanism for preserving original file  names,  per-
+       missions  and  dates  in filesystems which lack these con-
+       cepts, or have serious file name length restrictions, such
+       as MS-DOS.
+
+       _B_z_i_p_2  and  _b_u_n_z_i_p_2  will not overwrite existing files; if
+       you want this to happen, you should delete them first.
+
+       If no file names  are  specified,  _b_z_i_p_2  compresses  from
+       standard  input  to  standard output.  In this case, _b_z_i_p_2
+       will decline to write compressed output to a terminal,  as
+       this  would  be  entirely  incomprehensible  and therefore
+       pointless.
+
+       _B_u_n_z_i_p_2 (or _b_z_i_p_2 _-_d ) decompresses and restores all spec-
+       ified files whose names end in ".bz2".  Files without this
+       suffix are ignored.  Again, supplying no filenames  causes
+       decompression from standard input to standard output.
+
+       You  can also compress or decompress files to the standard
+       output by giving the -c flag.  You can decompress multiple
+       files  like  this, but you may only compress a single file
+       this way, since it would otherwise be difficult  to  sepa-
+       rate  out  the  compressed representations of the original
+       files.
+
+
+
+                                                                1
+
+
+
+
+
+bzip2(1)                                                 bzip2(1)
+
+
+       Compression is always performed, even  if  the  compressed
+       file  is slightly larger than the original.  Files of less
+       than about one hundred bytes tend to get larger, since the
+       compression  mechanism  has  a  constant  overhead  in the
+       region of 50 bytes.  Random data (including the output  of
+       most  file  compressors)  is  coded at about 8.05 bits per
+       byte, giving an expansion of around 0.5%.
+
+       As a self-check for your  protection,  _b_z_i_p_2  uses  32-bit
+       CRCs  to make sure that the decompressed version of a file
+       is identical to the original.  This guards against corrup-
+       tion  of  the compressed data, and against undetected bugs
+       in _b_z_i_p_2 (hopefully very unlikely).  The chances  of  data
+       corruption  going  undetected  is  microscopic,  about one
+       chance in four billion for each file processed.  Be aware,
+       though,  that  the  check occurs upon decompression, so it
+       can only tell you that that something is wrong.  It  can't
+       help  you recover the original uncompressed data.  You can
+       use _b_z_i_p_2_r_e_c_o_v_e_r to  try  to  recover  data  from  damaged
+       files.
+
+       Return  values:  0  for a normal exit, 1 for environmental
+       problems (file not found, invalid flags, I/O errors,  &c),
+       2 to indicate a corrupt compressed file, 3 for an internal
+       consistency error (eg, bug) which caused _b_z_i_p_2 to panic.
+
+
+MMEEMMOORRYY MMAANNAAGGEEMMEENNTT
+       _B_z_i_p_2 compresses large files in blocks.   The  block  size
+       affects  both  the  compression  ratio  achieved,  and the
+       amount of memory needed both for  compression  and  decom-
+       pression.   The flags -1 through -9 specify the block size
+       to be 100,000 bytes through 900,000  bytes  (the  default)
+       respectively.   At decompression-time, the block size used
+       for compression is read from the header of the  compressed
+       file, and _b_u_n_z_i_p_2 then allocates itself just enough memory
+       to decompress the file.  Since block sizes are  stored  in
+       compressed  files,  it follows that the flags -1 to -9 are
+       irrelevant to and so ignored during  decompression.   Com-
+       pression  and decompression requirements, in bytes, can be
+       estimated as:
+
+             Compression:   400k + ( 7 x block size )
+
+             Decompression: 100k + ( 5 x block size ), or
+                            100k + ( 2.5 x block size )
+
+       Larger  block  sizes  give  rapidly  diminishing  marginal
+       returns;  most of the compression comes from the first two
+       or three hundred k of block size, a fact worth bearing  in
+       mind  when  using  _b_z_i_p_2  on  small  machines.  It is also
+       important to  appreciate  that  the  decompression  memory
+       requirement  is  set  at compression-time by the choice of
+       block size.
+
+
+
+                                                                2
+
+
+
+
+
+bzip2(1)                                                 bzip2(1)
+
+
+       For files compressed with the  default  900k  block  size,
+       _b_u_n_z_i_p_2  will require about 4600 kbytes to decompress.  To
+       support decompression of any file on a 4 megabyte machine,
+       _b_u_n_z_i_p_2  has  an  option to decompress using approximately
+       half this amount of memory, about 2300 kbytes.  Decompres-
+       sion  speed  is also halved, so you should use this option
+       only where necessary.  The relevant flag is -s.
+
+       In general, try and use the largest block size memory con-
+       straints  allow,  since  that  maximises  the  compression
+       achieved.  Compression and decompression speed are  virtu-
+       ally unaffected by block size.
+
+       Another  significant point applies to files which fit in a
+       single block -- that  means  most  files  you'd  encounter
+       using  a  large  block  size.   The  amount of real memory
+       touched is proportional to the size of the file, since the
+       file  is smaller than a block.  For example, compressing a
+       file 20,000 bytes long with the flag  -9  will  cause  the
+       compressor  to  allocate  around 6700k of memory, but only
+       touch 400k + 20000 * 7 = 540 kbytes of it.  Similarly, the
+       decompressor  will  allocate  4600k  but only touch 100k +
+       20000 * 5 = 200 kbytes.
+
+       Here is a table which summarises the maximum memory  usage
+       for  different  block  sizes.   Also recorded is the total
+       compressed size for 14 files of the Calgary Text  Compres-
+       sion  Corpus totalling 3,141,622 bytes.  This column gives
+       some feel for how  compression  varies  with  block  size.
+       These  figures  tend to understate the advantage of larger
+       block sizes for larger files, since the  Corpus  is  domi-
+       nated by smaller files.
+
+                  Compress   Decompress   Decompress   Corpus
+           Flag     usage      usage       -s usage     Size
+
+            -1      1100k       600k         350k      914704
+            -2      1800k      1100k         600k      877703
+            -3      2500k      1600k         850k      860338
+            -4      3200k      2100k        1100k      846899
+            -5      3900k      2600k        1350k      845160
+            -6      4600k      3100k        1600k      838626
+            -7      5400k      3600k        1850k      834096
+            -8      6000k      4100k        2100k      828642
+            -9      6700k      4600k        2350k      828642
+
+
+OOPPTTIIOONNSS
+       --cc ----ssttddoouutt
+              Compress or decompress to standard output.  -c will
+              decompress multiple files to stdout, but will  only
+              compress a single file to stdout.
+
+
+
+
+
+                                                                3
+
+
+
+
+
+bzip2(1)                                                 bzip2(1)
+
+
+       --dd ----ddeeccoommpprreessss
+              Force  decompression.  _B_z_i_p_2 and _b_u_n_z_i_p_2 are really
+              the same program, and the decision about whether to
+              compress  or  decompress  is  done  on the basis of
+              which name is used.  This flag overrides that mech-
+              anism, and forces _b_z_i_p_2 to decompress.
+
+       --ff ----ccoommpprreessss
+              The  complement  to -d: forces compression, regard-
+              less of the invokation name.
+
+       --tt ----tteesstt
+              Check integrity of the specified file(s), but don't
+              decompress  them.   This  really  performs  a trial
+              decompression and throws away the result, using the
+              low-memory decompression algorithm (see -s).
+
+       --kk ----kkeeeepp
+              Keep  (don't delete) input files during compression
+              or decompression.
+
+       --ss ----ssmmaallll
+              Reduce  memory  usage,  both  for  compression  and
+              decompression.  Files are decompressed using a mod-
+              ified algorithm which only requires 2.5  bytes  per
+              block  byte.   This  means  any  file can be decom-
+              pressed in 2300k of memory,  albeit  somewhat  more
+              slowly than usual.
+
+              During  compression,  -s  selects  a  block size of
+              200k, which limits memory use to  around  the  same
+              figure,  at  the expense of your compression ratio.
+              In short, if your  machine  is  low  on  memory  (8
+              megabytes  or  less),  use  -s for everything.  See
+              MEMORY MANAGEMENT above.
+
+
+       --vv ----vveerrbboossee
+              Verbose mode -- show the compression ratio for each
+              file  processed.   Further  -v's  increase the ver-
+              bosity level, spewing out lots of information which
+              is primarily of interest for diagnostic purposes.
+
+       --LL ----lliicceennssee
+              Display  the  software  version,  license terms and
+              conditions.
+
+       --VV ----vveerrssiioonn
+              Same as -L.
+
+       --11 ttoo --99
+              Set the block size to 100 k, 200 k ..  900  k  when
+              compressing.   Has  no  effect  when decompressing.
+              See MEMORY MANAGEMENT above.
+
+
+
+                                                                4
+
+
+
+
+
+bzip2(1)                                                 bzip2(1)
+
+
+       ----rreeppeettiittiivvee--ffaasstt
+              _b_z_i_p_2 injects some small  pseudo-random  variations
+              into  very  repetitive  blocks  to limit worst-case
+              performance during compression.   If  sorting  runs
+              into  difficulties,  the  block  is randomised, and
+              sorting is restarted.  Very roughly, _b_z_i_p_2 persists
+              for  three  times  as  long as a well-behaved input
+              would take before resorting to randomisation.  This
+              flag makes it give up much sooner.
+
+
+       ----rreeppeettiittiivvee--bbeesstt
+              Opposite  of  --repetitive-fast;  try  a lot harder
+              before resorting to randomisation.
+
+
+RREECCOOVVEERRIINNGG DDAATTAA FFRROOMM DDAAMMAAGGEEDD FFIILLEESS
+       _b_z_i_p_2 compresses files in blocks, usually 900kbytes  long.
+       Each block is handled independently.  If a media or trans-
+       mission error causes a multi-block  .bz2  file  to  become
+       damaged,  it  may  be  possible  to  recover data from the
+       undamaged blocks in the file.
+
+       The compressed representation of each block  is  delimited
+       by  a  48-bit pattern, which makes it possible to find the
+       block boundaries with reasonable  certainty.   Each  block
+       also  carries its own 32-bit CRC, so damaged blocks can be
+       distinguished from undamaged ones.
+
+       _b_z_i_p_2_r_e_c_o_v_e_r is a  simple  program  whose  purpose  is  to
+       search  for blocks in .bz2 files, and write each block out
+       into its own .bz2 file.  You can then use _b_z_i_p_2 _-_t to test
+       the integrity of the resulting files, and decompress those
+       which are undamaged.
+
+       _b_z_i_p_2_r_e_c_o_v_e_r takes a single argument, the name of the dam-
+       aged file, and writes a number of files "rec0001file.bz2",
+       "rec0002file.bz2", etc, containing the  extracted  blocks.
+       The output filenames are designed so that the use of wild-
+       cards in subsequent processing -- for example, "bzip2  -dc
+       rec*file.bz2  >  recovered_data" -- lists the files in the
+       "right" order.
+
+       _b_z_i_p_2_r_e_c_o_v_e_r should be of most use dealing with large .bz2
+       files,  as  these will contain many blocks.  It is clearly
+       futile to use it on damaged single-block  files,  since  a
+       damaged  block  cannot  be recovered.  If you wish to min-
+       imise any potential data loss through media  or  transmis-
+       sion errors, you might consider compressing with a smaller
+       block size.
+
+
+PPEERRFFOORRMMAANNCCEE NNOOTTEESS
+       The sorting phase of compression gathers together  similar
+
+
+
+                                                                5
+
+
+
+
+
+bzip2(1)                                                 bzip2(1)
+
+
+       strings  in  the  file.  Because of this, files containing
+       very long runs of  repeated  symbols,  like  "aabaabaabaab
+       ..."   (repeated   several  hundred  times)  may  compress
+       extraordinarily slowly.  You can use the -vvvvv option  to
+       monitor progress in great detail, if you want.  Decompres-
+       sion speed is unaffected.
+
+       Such pathological cases seem rare in  practice,  appearing
+       mostly in artificially-constructed test files, and in low-
+       level disk images.  It may be inadvisable to use _b_z_i_p_2  to
+       compress  the  latter.   If you do get a file which causes
+       severe slowness in compression, try making the block  size
+       as small as possible, with flag -1.
+
+       Incompressible or virtually-incompressible data may decom-
+       press rather more slowly than one would hope.  This is due
+       to a naive implementation of the move-to-front coder.
+
+       _b_z_i_p_2  usually  allocates  several  megabytes of memory to
+       operate in, and then charges all over it in a fairly  ran-
+       dom  fashion.   This means that performance, both for com-
+       pressing and decompressing, is largely determined  by  the
+       speed  at  which  your  machine  can service cache misses.
+       Because of this, small changes to the code to  reduce  the
+       miss  rate  have  been observed to give disproportionately
+       large performance improvements.  I imagine _b_z_i_p_2 will per-
+       form best on machines with very large caches.
+
+       Test mode (-t) uses the low-memory decompression algorithm
+       (-s).  This means test mode does not run  as  fast  as  it
+       could;  it  could  run as fast as the normal decompression
+       machinery.  This could easily be fixed at the cost of some
+       code bloat.
+
+
+CCAAVVEEAATTSS
+       I/O  error  messages  are not as helpful as they could be.
+       _B_z_i_p_2 tries hard to detect I/O errors  and  exit  cleanly,
+       but  the  details  of  what  the problem is sometimes seem
+       rather misleading.
+
+       This manual page pertains to version 0.1 of _b_z_i_p_2_.  It may
+       well  happen that some future version will use a different
+       compressed file format.  If you try to  decompress,  using
+       0.1,  a  .bz2  file created with some future version which
+       uses a different compressed file format, 0.1 will complain
+       that  your  file  "is not a bzip2 file".  If that happens,
+       you should obtain a more recent version of _b_z_i_p_2  and  use
+       that to decompress the file.
+
+       Wildcard expansion for Windows 95 and NT is flaky.
+
+       _b_z_i_p_2_r_e_c_o_v_e_r  uses  32-bit integers to represent bit posi-
+       tions in compressed files, so it cannot handle  compressed
+
+
+
+                                                                6
+
+
+
+
+
+bzip2(1)                                                 bzip2(1)
+
+
+       files  more than 512 megabytes long.  This could easily be
+       fixed.
+
+       _b_z_i_p_2_r_e_c_o_v_e_r sometimes reports a  very  small,  incomplete
+       final  block.  This is spurious and can be safely ignored.
+
+
+RREELLAATTIIOONNSSHHIIPP TTOO bbzziipp--00..2211
+       This program is a descendant of the _b_z_i_p program,  version
+       0.21,  which  I released in August 1996.  The primary dif-
+       ference of _b_z_i_p_2 is its avoidance of the possibly patented
+       algorithms  which  were  used  in 0.21.  _b_z_i_p_2 also brings
+       various useful refinements (-s,  -t),  uses  less  memory,
+       decompresses  significantly  faster,  and  has support for
+       recovering data from damaged files.
+
+       Because _b_z_i_p_2 uses Huffman coding to  construct  the  com-
+       pressed  bitstream, rather than the arithmetic coding used
+       in 0.21, the compressed representations generated  by  the
+       two  programs are incompatible, and they will not interop-
+       erate.  The change in suffix from  .bz  to  .bz2  reflects
+       this.   It would have been helpful to at least allow _b_z_i_p_2
+       to decompress files created by 0.21, but this would defeat
+       the primary aim of having a patent-free compressor.
+
+       Huffman  coding  necessarily  involves some coding ineffi-
+       ciency compared to arithmetic  coding.   This  means  that
+       _b_z_i_p_2  compresses about 1% worse than 0.21, an unfortunate
+       but unavoidable fact-of-life.  On the other  hand,  decom-
+       pression  is approximately 50% faster for the same reason,
+       and the change in file format gave an opportunity  to  add
+       data-recovery features.  So it is not all bad.
+
+
+AAUUTTHHOORR
+       Julian Seward, jseward@acm.org.
+
+       The ideas embodied in _b_z_i_p and _b_z_i_p_2 are due to (at least)
+       the following people: Michael Burrows  and  David  Wheeler
+       (for  the  block  sorting  transformation),  David Wheeler
+       (again, for the Huffman coder),  Peter  Fenwick  (for  the
+       structured  coding  model  in 0.21, and many refinements),
+       and Alistair Moffat, Radford Neal and Ian Witten (for  the
+       arithmetic  coder  in 0.21).  I am much indebted for their
+       help, support and advice.  See the file ALGORITHMS in  the
+       source  distribution for pointers to sources of documenta-
+       tion.  Christian von Roques  encouraged  me  to  look  for
+       faster  sorting algorithms, so as to speed up compression.
+       Bela Lubkin encouraged me to improve the  worst-case  com-
+       pression  performance.   Many  people sent patches, helped
+       with portability problems, lent machines, gave advice  and
+       were generally helpful.
+
+
+
+
+
+                                                                7
+
+

diff --git a/bzip2.c b/bzip2.c
new file mode 100644
index 0000000..0fb45fb
--- /dev/null
+++ b/bzip2.c

@@ -0,0 +1,4036 @@
+
+/*-----------------------------------------------------------*/
+/*--- A block-sorting, lossless compressor        bzip2.c ---*/
+/*-----------------------------------------------------------*/
+
+/*--
+  This program is bzip2, a lossless, block-sorting data compressor,
+  version 0.1pl0, dated 17-Aug-1997.
+
+  Copyright (C) 1996, 1997 by Julian Seward.
+     Guildford, Surrey, UK
+     email: jseward@acm.org
+
+  This program is free software; you can redistribute it and/or modify
+  it under the terms of the GNU General Public License as published by
+  the Free Software Foundation; either version 2 of the License, or
+  (at your option) any later version.
+
+  This program is distributed in the hope that it will be useful,
+  but WITHOUT ANY WARRANTY; without even the implied warranty of
+  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+  GNU General Public License for more details.
+
+  You should have received a copy of the GNU General Public License
+  along with this program; if not, write to the Free Software
+  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+
+  The GNU General Public License is contained in the file LICENSE.
+
+  This program is based on (at least) the work of:
+     Mike Burrows
+     David Wheeler
+     Peter Fenwick
+     Alistair Moffat
+     Radford Neal
+     Ian H. Witten
+     Robert Sedgewick
+     Jon L. Bentley
+
+  For more information on these sources, see the file ALGORITHMS.
+--*/
+
+/*----------------------------------------------------*/
+/*--- IMPORTANT                                    ---*/
+/*----------------------------------------------------*/
+
+/*--
+   WARNING:
+      This program (attempts to) compress data by performing several
+      non-trivial transformations on it.  Unless you are 100% familiar
+      with *all* the algorithms contained herein, and with the
+      consequences of modifying them, you should NOT meddle with the
+      compression or decompression machinery.  Incorrect changes can
+      and very likely *will* lead to disasterous loss of data.
+
+   DISCLAIMER:
+      I TAKE NO RESPONSIBILITY FOR ANY LOSS OF DATA ARISING FROM THE
+      USE OF THIS PROGRAM, HOWSOEVER CAUSED.
+
+      Every compression of a file implies an assumption that the
+      compressed file can be decompressed to reproduce the original.
+      Great efforts in design, coding and testing have been made to
+      ensure that this program works correctly.  However, the
+      complexity of the algorithms, and, in particular, the presence
+      of various special cases in the code which occur with very low
+      but non-zero probability make it impossible to rule out the
+      possibility of bugs remaining in the program.  DO NOT COMPRESS
+      ANY DATA WITH THIS PROGRAM UNLESS YOU ARE PREPARED TO ACCEPT THE
+      POSSIBILITY, HOWEVER SMALL, THAT THE DATA WILL NOT BE RECOVERABLE.
+
+      That is not to say this program is inherently unreliable.
+      Indeed, I very much hope the opposite is true.  bzip2 has been
+      carefully constructed and extensively tested.
+--*/
+
+
+
+/*----------------------------------------------------*/
+/*--- and now for something much more pleasant :-) ---*/
+/*----------------------------------------------------*/
+
+/*---------------------------------------------*/
+/*--
+  Place a 1 beside your platform, and 0 elsewhere.
+--*/
+
+/*--
+  Generic 32-bit Unix.
+  Also works on 64-bit Unix boxes.
+--*/
+#define BZ_UNIX      1
+
+/*--
+  Win32, as seen by Jacob Navia's excellent
+  port of (Chris Fraser & David Hanson)'s excellent
+  lcc compiler.
+--*/
+#define BZ_LCCWIN32  0
+
+
+
+/*---------------------------------------------*/
+/*--
+  Some stuff for all platforms.
+--*/
+
+#include <stdio.h>
+#include <stdlib.h>
+#if DEBUG
+  #include <assert.h>
+#endif
+#include <string.h>
+#include <signal.h>
+#include <errno.h>
+#include <math.h>
+
+#define ERROR_IF_EOF(i)       { if ((i) == EOF)  ioError(); }
+#define ERROR_IF_NOT_ZERO(i)  { if ((i) != 0)    ioError(); }
+#define ERROR_IF_MINUS_ONE(i) { if ((i) == (-1)) ioError(); }
+
+
+/*---------------------------------------------*/
+/*--
+   Platform-specific stuff.
+--*/
+
+#if BZ_UNIX
+   #include <utime.h>
+   #include <unistd.h>
+   #include <malloc.h>
+   #include <sys/stat.h>
+   #include <sys/times.h>
+
+   #define Int32   int
+   #define UInt32  unsigned int
+   #define Char    char
+   #define UChar   unsigned char
+   #define Int16   short
+   #define UInt16  unsigned short
+
+   #define PATH_SEP    '/'
+   #define MY_LSTAT    lstat
+   #define MY_S_IFREG  S_ISREG
+   #define MY_STAT     stat
+
+   #define APPEND_FILESPEC(root, name) \
+      root=snocString((root), (name))
+
+   #define SET_BINARY_MODE(fd) /**/
+
+   /*--
+      You should try very hard to persuade your C compiler
+      to inline the bits marked INLINE.  Otherwise bzip2 will
+      run rather slowly.  gcc version 2.x is recommended.
+   --*/
+   #ifdef __GNUC__
+      #define INLINE   inline
+      #define NORETURN __attribute__ ((noreturn))
+   #else
+      #define INLINE   /**/
+      #define NORETURN /**/
+   #endif
+#endif
+
+
+
+#if BZ_LCCWIN32
+   #include <io.h>
+   #include <fcntl.h>
+   #include <sys\stat.h>
+
+   #define Int32   int
+   #define UInt32  unsigned int
+   #define Int16   short
+   #define UInt16  unsigned short
+   #define Char    char
+   #define UChar   unsigned char
+
+   #define INLINE         /**/
+   #define NORETURN       /**/
+   #define PATH_SEP       '\\'
+   #define MY_LSTAT       _stat
+   #define MY_STAT        _stat
+   #define MY_S_IFREG(x)  ((x) & _S_IFREG)
+
+   #if 0
+   /*-- lcc-win32 seems to expand wildcards itself --*/
+   #define APPEND_FILESPEC(root, spec)                \
+      do {                                            \
+         if ((spec)[0] == '-') {                      \
+            root = snocString((root), (spec));        \
+         } else {                                     \
+            struct _finddata_t c_file;                \
+            long hFile;                               \
+            hFile = _findfirst((spec), &c_file);      \
+            if ( hFile == -1L ) {                     \
+               root = snocString ((root), (spec));    \
+            } else {                                  \
+               int anInt = 0;                         \
+               while ( anInt == 0 ) {                 \
+                  root = snocString((root),           \
+                            &c_file.name[0]);         \
+                  anInt = _findnext(hFile, &c_file);  \
+               }                                      \
+            }                                         \
+         }                                            \
+      } while ( 0 )
+   #else
+   #define APPEND_FILESPEC(root, name)                \
+      root = snocString ((root), (name))
+   #endif
+
+   #define SET_BINARY_MODE(fd)                        \
+      do {                                            \
+         int retVal = setmode ( fileno ( fd ),        \
+                               O_BINARY );            \
+         ERROR_IF_MINUS_ONE ( retVal );               \
+      } while ( 0 )
+
+#endif
+
+
+/*---------------------------------------------*/
+/*--
+  Some more stuff for all platforms :-)
+--*/
+
+#define Bool   unsigned char
+#define True   1
+#define False  0
+
+/*--
+  IntNative is your platform's `native' int size.
+  Only here to avoid probs with 64-bit platforms.
+--*/
+#define IntNative int
+
+
+/*--
+   change to 1, or compile with -DDEBUG=1 to debug
+--*/
+#ifndef DEBUG
+#define DEBUG 0
+#endif
+
+
+/*---------------------------------------------------*/
+/*---                                             ---*/
+/*---------------------------------------------------*/
+
+/*--
+   Implementation notes, July 1997
+   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+   Memory allocation
+   ~~~~~~~~~~~~~~~~~
+   All large data structures are allocated on the C heap,
+   for better or for worse.  That includes the various
+   arrays of pointers, striped words, bytes, frequency
+   tables and buffers for compression and decompression.
+
+   bzip2 can operate at various block-sizes, ranging from
+   100k to 900k in 100k steps, and it allocates only as
+   much as it needs to.  When compressing, we know from the
+   command-line options what the block-size is going to be,
+   so all allocation can be done at start-up; if that
+   succeeds, there can be no further allocation problems.
+
+   Decompression is more complicated.  Each compressed file
+   contains, in its header, a byte indicating the block
+   size used for compression.  This means bzip2 potentially
+   needs to reallocate memory for each file it deals with,
+   which in turn opens the possibility for a memory allocation
+   failure part way through a run of files, by encountering
+   a file requiring a much larger block size than all the
+   ones preceding it.
+
+   The policy is to simply give up if a memory allocation
+   failure occurs.  During decompression, it would be
+   possible to move on to subsequent files in the hope that
+   some might ask for a smaller block size, but the
+   complications for doing this seem more trouble than they
+   are worth.
+
+
+   Compressed file formats
+   ~~~~~~~~~~~~~~~~~~~~~~~
+   [This is now entirely different from both 0.21, and from
+    any previous Huffman-coded variant of bzip.
+    See the associated file bzip2.txt for details.]
+
+
+   Error conditions
+   ~~~~~~~~~~~~~~~~
+   Dealing with error conditions is the least satisfactory
+   aspect of bzip2.  The policy is to try and leave the
+   filesystem in a consistent state, then quit, even if it
+   means not processing some of the files mentioned in the
+   command line.  `A consistent state' means that a file
+   exists either in its compressed or uncompressed form,
+   but not both.  This boils down to the rule `delete the
+   output file if an error condition occurs, leaving the
+   input intact'.  Input files are only deleted when we can
+   be pretty sure the output file has been written and
+   closed successfully.
+
+   Errors are a dog because there's so many things to
+   deal with.  The following can happen mid-file, and
+   require cleaning up.
+
+     internal `panics' -- indicating a bug
+     corrupted or inconsistent compressed file
+     can't allocate enough memory to decompress this file
+     I/O error reading/writing/opening/closing
+     signal catches -- Control-C, SIGTERM, SIGHUP.
+
+   Other conditions, primarily pertaining to file names,
+   can be checked in-between files, which makes dealing
+   with them easier.
+--*/
+
+
+
+/*---------------------------------------------------*/
+/*--- Misc (file handling) data decls             ---*/
+/*---------------------------------------------------*/
+
+UInt32  bytesIn, bytesOut;
+Int32   verbosity;
+Bool    keepInputFiles, smallMode, testFailsExist;
+UInt32  globalCrc;
+Int32   numFileNames, numFilesProcessed;
+
+
+/*-- source modes; F==file, I==stdin, O==stdout --*/
+#define SM_I2O           1
+#define SM_F2O           2
+#define SM_F2F           3
+
+/*-- operation modes --*/
+#define OM_Z             1
+#define OM_UNZ           2
+#define OM_TEST          3
+
+Int32   opMode;
+Int32   srcMode;
+
+
+Int32   longestFileName;
+Char    inName[1024];
+Char    outName[1024];
+Char    *progName;
+Char    progNameReally[1024];
+FILE    *outputHandleJustInCase;
+
+void    panic                 ( Char* )          NORETURN;
+void    ioError               ( void )           NORETURN;
+void    compressOutOfMemory   ( Int32, Int32 )   NORETURN;
+void    uncompressOutOfMemory ( Int32, Int32 )   NORETURN;
+void    blockOverrun          ( void )           NORETURN;
+void    badBlockHeader        ( void )           NORETURN;
+void    badBGLengths          ( void )           NORETURN;
+void    crcError              ( UInt32, UInt32 ) NORETURN;
+void    bitStreamEOF          ( void )           NORETURN;
+void    cleanUpAndFail        ( Int32 )          NORETURN;
+void    compressedStreamEOF   ( void )           NORETURN;
+
+void*   myMalloc ( Int32 );
+
+
+
+/*---------------------------------------------------*/
+/*--- Data decls for the front end                ---*/
+/*---------------------------------------------------*/
+
+/*--
+   The overshoot bytes allow us to avoid most of
+   the cost of pointer renormalisation during
+   comparison of rotations in sorting.
+   The figure of 20 is derived as follows:
+      qSort3 allows an overshoot of up to 10.
+      It then calls simpleSort, which calls
+      fullGtU, also with max overshoot 10.
+      fullGtU does up to 10 comparisons without
+      renormalising, giving 10+10 == 20.
+--*/
+#define NUM_OVERSHOOT_BYTES 20
+
+/*--
+  These are the main data structures for
+  the Burrows-Wheeler transform.
+--*/
+
+/*--
+  Pointers to compression and decompression
+  structures.  Set by
+     allocateCompressStructures   and
+     setDecompressStructureSizes
+
+  The structures are always set to be suitable
+  for a block of size 100000 * blockSize100k.
+--*/
+UChar    *block;    /*-- compress   --*/
+UInt16   *quadrant; /*-- compress   --*/
+Int32    *zptr;     /*-- compress   --*/ 
+UInt16   *szptr;    /*-- overlays zptr ---*/
+Int32    *ftab;     /*-- compress   --*/
+
+UInt16   *ll16;     /*-- small decompress --*/
+UChar    *ll4;      /*-- small decompress --*/
+
+Int32    *tt;       /*-- fast decompress  --*/
+UChar    *ll8;      /*-- fast decompress  --*/
+
+
+/*--
+  freq table collected to save a pass over the data
+  during decompression.
+--*/
+Int32   unzftab[256];
+
+
+/*--
+   index of the last char in the block, so
+   the block size == last + 1.
+--*/
+Int32  last;
+
+
+/*--
+  index in zptr[] of original string after sorting.
+--*/
+Int32  origPtr;
+
+
+/*--
+  always: in the range 0 .. 9.
+  The current block size is 100000 * this number.
+--*/
+Int32  blockSize100k;
+
+
+/*--
+  Used when sorting.  If too many long comparisons
+  happen, we stop sorting, randomise the block 
+  slightly, and try again.
+--*/
+
+Int32  workFactor;
+Int32  workDone;
+Int32  workLimit;
+Bool   blockRandomised;
+Bool   firstAttempt;
+Int32  nBlocksRandomised;
+
+
+
+/*---------------------------------------------------*/
+/*--- Data decls for the back end                 ---*/
+/*---------------------------------------------------*/
+
+#define MAX_ALPHA_SIZE 258
+#define MAX_CODE_LEN    23
+
+#define RUNA 0
+#define RUNB 1
+
+#define N_GROUPS 6
+#define G_SIZE   50
+#define N_ITERS  4
+
+#define MAX_SELECTORS (2 + (900000 / G_SIZE))
+
+Bool  inUse[256];
+Int32 nInUse;
+
+UChar seqToUnseq[256];
+UChar unseqToSeq[256];
+
+UChar selector   [MAX_SELECTORS];
+UChar selectorMtf[MAX_SELECTORS];
+
+Int32 nMTF;
+
+Int32 mtfFreq[MAX_ALPHA_SIZE];
+
+UChar len  [N_GROUPS][MAX_ALPHA_SIZE];
+
+/*-- decompress only --*/
+Int32 limit  [N_GROUPS][MAX_ALPHA_SIZE];
+Int32 base   [N_GROUPS][MAX_ALPHA_SIZE];
+Int32 perm   [N_GROUPS][MAX_ALPHA_SIZE];
+Int32 minLens[N_GROUPS];
+
+/*-- compress only --*/
+Int32  code [N_GROUPS][MAX_ALPHA_SIZE];
+Int32  rfreq[N_GROUPS][MAX_ALPHA_SIZE];
+
+
+/*---------------------------------------------------*/
+/*--- 32-bit CRC grunge                           ---*/
+/*---------------------------------------------------*/
+
+/*--
+  I think this is an implementation of the AUTODIN-II,
+  Ethernet & FDDI 32-bit CRC standard.  Vaguely derived
+  from code by Rob Warnock, in Section 51 of the
+  comp.compression FAQ.
+--*/
+
+UInt32 crc32Table[256] = {
+
+   /*-- Ugly, innit? --*/
+
+   0x00000000L, 0x04c11db7L, 0x09823b6eL, 0x0d4326d9L,
+   0x130476dcL, 0x17c56b6bL, 0x1a864db2L, 0x1e475005L,
+   0x2608edb8L, 0x22c9f00fL, 0x2f8ad6d6L, 0x2b4bcb61L,
+   0x350c9b64L, 0x31cd86d3L, 0x3c8ea00aL, 0x384fbdbdL,
+   0x4c11db70L, 0x48d0c6c7L, 0x4593e01eL, 0x4152fda9L,
+   0x5f15adacL, 0x5bd4b01bL, 0x569796c2L, 0x52568b75L,
+   0x6a1936c8L, 0x6ed82b7fL, 0x639b0da6L, 0x675a1011L,
+   0x791d4014L, 0x7ddc5da3L, 0x709f7b7aL, 0x745e66cdL,
+   0x9823b6e0L, 0x9ce2ab57L, 0x91a18d8eL, 0x95609039L,
+   0x8b27c03cL, 0x8fe6dd8bL, 0x82a5fb52L, 0x8664e6e5L,
+   0xbe2b5b58L, 0xbaea46efL, 0xb7a96036L, 0xb3687d81L,
+   0xad2f2d84L, 0xa9ee3033L, 0xa4ad16eaL, 0xa06c0b5dL,
+   0xd4326d90L, 0xd0f37027L, 0xddb056feL, 0xd9714b49L,
+   0xc7361b4cL, 0xc3f706fbL, 0xceb42022L, 0xca753d95L,
+   0xf23a8028L, 0xf6fb9d9fL, 0xfbb8bb46L, 0xff79a6f1L,
+   0xe13ef6f4L, 0xe5ffeb43L, 0xe8bccd9aL, 0xec7dd02dL,
+   0x34867077L, 0x30476dc0L, 0x3d044b19L, 0x39c556aeL,
+   0x278206abL, 0x23431b1cL, 0x2e003dc5L, 0x2ac12072L,
+   0x128e9dcfL, 0x164f8078L, 0x1b0ca6a1L, 0x1fcdbb16L,
+   0x018aeb13L, 0x054bf6a4L, 0x0808d07dL, 0x0cc9cdcaL,
+   0x7897ab07L, 0x7c56b6b0L, 0x71159069L, 0x75d48ddeL,
+   0x6b93dddbL, 0x6f52c06cL, 0x6211e6b5L, 0x66d0fb02L,
+   0x5e9f46bfL, 0x5a5e5b08L, 0x571d7dd1L, 0x53dc6066L,
+   0x4d9b3063L, 0x495a2dd4L, 0x44190b0dL, 0x40d816baL,
+   0xaca5c697L, 0xa864db20L, 0xa527fdf9L, 0xa1e6e04eL,
+   0xbfa1b04bL, 0xbb60adfcL, 0xb6238b25L, 0xb2e29692L,
+   0x8aad2b2fL, 0x8e6c3698L, 0x832f1041L, 0x87ee0df6L,
+   0x99a95df3L, 0x9d684044L, 0x902b669dL, 0x94ea7b2aL,
+   0xe0b41de7L, 0xe4750050L, 0xe9362689L, 0xedf73b3eL,
+   0xf3b06b3bL, 0xf771768cL, 0xfa325055L, 0xfef34de2L,
+   0xc6bcf05fL, 0xc27dede8L, 0xcf3ecb31L, 0xcbffd686L,
+   0xd5b88683L, 0xd1799b34L, 0xdc3abdedL, 0xd8fba05aL,
+   0x690ce0eeL, 0x6dcdfd59L, 0x608edb80L, 0x644fc637L,
+   0x7a089632L, 0x7ec98b85L, 0x738aad5cL, 0x774bb0ebL,
+   0x4f040d56L, 0x4bc510e1L, 0x46863638L, 0x42472b8fL,
+   0x5c007b8aL, 0x58c1663dL, 0x558240e4L, 0x51435d53L,
+   0x251d3b9eL, 0x21dc2629L, 0x2c9f00f0L, 0x285e1d47L,
+   0x36194d42L, 0x32d850f5L, 0x3f9b762cL, 0x3b5a6b9bL,
+   0x0315d626L, 0x07d4cb91L, 0x0a97ed48L, 0x0e56f0ffL,
+   0x1011a0faL, 0x14d0bd4dL, 0x19939b94L, 0x1d528623L,
+   0xf12f560eL, 0xf5ee4bb9L, 0xf8ad6d60L, 0xfc6c70d7L,
+   0xe22b20d2L, 0xe6ea3d65L, 0xeba91bbcL, 0xef68060bL,
+   0xd727bbb6L, 0xd3e6a601L, 0xdea580d8L, 0xda649d6fL,
+   0xc423cd6aL, 0xc0e2d0ddL, 0xcda1f604L, 0xc960ebb3L,
+   0xbd3e8d7eL, 0xb9ff90c9L, 0xb4bcb610L, 0xb07daba7L,
+   0xae3afba2L, 0xaafbe615L, 0xa7b8c0ccL, 0xa379dd7bL,
+   0x9b3660c6L, 0x9ff77d71L, 0x92b45ba8L, 0x9675461fL,
+   0x8832161aL, 0x8cf30badL, 0x81b02d74L, 0x857130c3L,
+   0x5d8a9099L, 0x594b8d2eL, 0x5408abf7L, 0x50c9b640L,
+   0x4e8ee645L, 0x4a4ffbf2L, 0x470cdd2bL, 0x43cdc09cL,
+   0x7b827d21L, 0x7f436096L, 0x7200464fL, 0x76c15bf8L,
+   0x68860bfdL, 0x6c47164aL, 0x61043093L, 0x65c52d24L,
+   0x119b4be9L, 0x155a565eL, 0x18197087L, 0x1cd86d30L,
+   0x029f3d35L, 0x065e2082L, 0x0b1d065bL, 0x0fdc1becL,
+   0x3793a651L, 0x3352bbe6L, 0x3e119d3fL, 0x3ad08088L,
+   0x2497d08dL, 0x2056cd3aL, 0x2d15ebe3L, 0x29d4f654L,
+   0xc5a92679L, 0xc1683bceL, 0xcc2b1d17L, 0xc8ea00a0L,
+   0xd6ad50a5L, 0xd26c4d12L, 0xdf2f6bcbL, 0xdbee767cL,
+   0xe3a1cbc1L, 0xe760d676L, 0xea23f0afL, 0xeee2ed18L,
+   0xf0a5bd1dL, 0xf464a0aaL, 0xf9278673L, 0xfde69bc4L,
+   0x89b8fd09L, 0x8d79e0beL, 0x803ac667L, 0x84fbdbd0L,
+   0x9abc8bd5L, 0x9e7d9662L, 0x933eb0bbL, 0x97ffad0cL,
+   0xafb010b1L, 0xab710d06L, 0xa6322bdfL, 0xa2f33668L,
+   0xbcb4666dL, 0xb8757bdaL, 0xb5365d03L, 0xb1f740b4L
+};
+
+
+/*---------------------------------------------*/
+void initialiseCRC ( void )
+{
+   globalCrc = 0xffffffffL;
+}
+
+
+/*---------------------------------------------*/
+UInt32 getFinalCRC ( void )
+{
+   return ~globalCrc;
+}
+
+
+/*---------------------------------------------*/
+UInt32 getGlobalCRC ( void )
+{
+   return globalCrc;
+}
+
+
+/*---------------------------------------------*/
+void setGlobalCRC ( UInt32 newCrc )
+{
+   globalCrc = newCrc;
+}
+
+
+/*---------------------------------------------*/
+#define UPDATE_CRC(crcVar,cha)              \
+{                                           \
+   crcVar = (crcVar << 8) ^                 \
+            crc32Table[(crcVar >> 24) ^     \
+                       ((UChar)cha)];       \
+}
+
+
+/*---------------------------------------------------*/
+/*--- Bit stream I/O                              ---*/
+/*---------------------------------------------------*/
+
+
+UInt32 bsBuff;
+Int32  bsLive;
+FILE*  bsStream;
+Bool   bsWriting;
+
+
+/*---------------------------------------------*/
+void bsSetStream ( FILE* f, Bool wr )
+{
+   if (bsStream != NULL) panic ( "bsSetStream" );
+   bsStream = f;
+   bsLive = 0;
+   bsBuff = 0;
+   bytesOut = 0;
+   bytesIn = 0;
+   bsWriting = wr;
+}
+
+
+/*---------------------------------------------*/
+void bsFinishedWithStream ( void )
+{
+   if (bsWriting)
+      while (bsLive > 0) {
+         fputc ( (UChar)(bsBuff >> 24), bsStream );
+         bsBuff <<= 8;
+         bsLive -= 8;
+         bytesOut++;
+      }
+   bsStream = NULL;
+}
+
+
+/*---------------------------------------------*/
+#define bsNEEDR(nz)                           \
+{                                             \
+   while (bsLive < nz) {                      \
+      Int32 zzi = fgetc ( bsStream );         \
+      if (zzi == EOF) compressedStreamEOF();  \
+      bsBuff = (bsBuff << 8) | (zzi & 0xffL); \
+      bsLive += 8;                            \
+   }                                          \
+}
+
+
+/*---------------------------------------------*/
+#define bsNEEDW(nz)                           \
+{                                             \
+   while (bsLive >= 8) {                      \
+      fputc ( (UChar)(bsBuff >> 24),          \
+               bsStream );                    \
+      bsBuff <<= 8;                           \
+      bsLive -= 8;                            \
+      bytesOut++;                             \
+   }                                          \
+}
+
+
+/*---------------------------------------------*/
+#define bsR1(vz)                              \
+{                                             \
+   bsNEEDR(1);                                \
+   vz = (bsBuff >> (bsLive-1)) & 1;           \
+   bsLive--;                                  \
+}
+
+
+/*---------------------------------------------*/
+INLINE UInt32 bsR ( Int32 n )
+{
+   UInt32 v;
+   bsNEEDR ( n );
+   v = (bsBuff >> (bsLive-n)) & ((1 << n)-1);
+   bsLive -= n;
+   return v;
+}
+
+
+/*---------------------------------------------*/
+INLINE void bsW ( Int32 n, UInt32 v )
+{
+   bsNEEDW ( n );
+   bsBuff |= (v << (32 - bsLive - n));
+   bsLive += n;
+}
+
+
+/*---------------------------------------------*/
+UChar bsGetUChar ( void )
+{
+   return (UChar)bsR(8);
+}
+
+
+/*---------------------------------------------*/
+void bsPutUChar ( UChar c )
+{
+   bsW(8, (UInt32)c );
+}
+
+
+/*---------------------------------------------*/
+Int32 bsGetUInt32 ( void )
+{
+   UInt32 u;
+   u = 0;
+   u = (u << 8) | bsR(8);
+   u = (u << 8) | bsR(8);
+   u = (u << 8) | bsR(8);
+   u = (u << 8) | bsR(8);
+   return u;
+}
+
+
+/*---------------------------------------------*/
+UInt32 bsGetIntVS ( UInt32 numBits )
+{
+   return (UInt32)bsR(numBits);
+}
+
+
+/*---------------------------------------------*/
+UInt32 bsGetInt32 ( void )
+{
+   return (Int32)bsGetUInt32();
+}
+
+
+/*---------------------------------------------*/
+void bsPutUInt32 ( UInt32 u )
+{
+   bsW ( 8, (u >> 24) & 0xffL );
+   bsW ( 8, (u >> 16) & 0xffL );
+   bsW ( 8, (u >>  8) & 0xffL );
+   bsW ( 8,  u        & 0xffL );
+}
+
+
+/*---------------------------------------------*/
+void bsPutInt32 ( Int32 c )
+{
+   bsPutUInt32 ( (UInt32)c );
+}
+
+
+/*---------------------------------------------*/
+void bsPutIntVS ( Int32 numBits, UInt32 c )
+{
+   bsW ( numBits, c );
+}
+
+
+/*---------------------------------------------------*/
+/*--- Huffman coding low-level stuff              ---*/
+/*---------------------------------------------------*/
+
+#define WEIGHTOF(zz0)  ((zz0) & 0xffffff00)
+#define DEPTHOF(zz1)   ((zz1) & 0x000000ff)
+#define MYMAX(zz2,zz3) ((zz2) > (zz3) ? (zz2) : (zz3))
+
+#define ADDWEIGHTS(zw1,zw2)                           \
+   (WEIGHTOF(zw1)+WEIGHTOF(zw2)) |                    \
+   (1 + MYMAX(DEPTHOF(zw1),DEPTHOF(zw2)))
+
+#define UPHEAP(z)                                     \
+{                                                     \
+   Int32 zz, tmp;                                     \
+   zz = z; tmp = heap[zz];                            \
+   while (weight[tmp] < weight[heap[zz >> 1]]) {      \
+      heap[zz] = heap[zz >> 1];                       \
+      zz >>= 1;                                       \
+   }                                                  \
+   heap[zz] = tmp;                                    \
+}
+
+#define DOWNHEAP(z)                                   \
+{                                                     \
+   Int32 zz, yy, tmp;                                 \
+   zz = z; tmp = heap[zz];                            \
+   while (True) {                                     \
+      yy = zz << 1;                                   \
+      if (yy > nHeap) break;                          \
+      if (yy < nHeap &&                               \
+          weight[heap[yy+1]] < weight[heap[yy]])      \
+         yy++;                                        \
+      if (weight[tmp] < weight[heap[yy]]) break;      \
+      heap[zz] = heap[yy];                            \
+      zz = yy;                                        \
+   }                                                  \
+   heap[zz] = tmp;                                    \
+}
+
+
+/*---------------------------------------------*/
+void hbMakeCodeLengths ( UChar *len, 
+                         Int32 *freq,
+                         Int32 alphaSize,
+                         Int32 maxLen )
+{
+   /*--
+      Nodes and heap entries run from 1.  Entry 0
+      for both the heap and nodes is a sentinel.
+   --*/
+   Int32 nNodes, nHeap, n1, n2, i, j, k;
+   Bool  tooLong;
+
+   Int32 heap   [ MAX_ALPHA_SIZE + 2 ];
+   Int32 weight [ MAX_ALPHA_SIZE * 2 ];
+   Int32 parent [ MAX_ALPHA_SIZE * 2 ]; 
+
+   for (i = 0; i < alphaSize; i++)
+      weight[i+1] = (freq[i] == 0 ? 1 : freq[i]) << 8;
+
+   while (True) {
+
+      nNodes = alphaSize;
+      nHeap = 0;
+
+      heap[0] = 0;
+      weight[0] = 0;
+      parent[0] = -2;
+
+      for (i = 1; i <= alphaSize; i++) {
+         parent[i] = -1;
+         nHeap++;
+         heap[nHeap] = i;
+         UPHEAP(nHeap);
+      }
+      if (!(nHeap < (MAX_ALPHA_SIZE+2))) 
+         panic ( "hbMakeCodeLengths(1)" );
+   
+      while (nHeap > 1) {
+         n1 = heap[1]; heap[1] = heap[nHeap]; nHeap--; DOWNHEAP(1);
+         n2 = heap[1]; heap[1] = heap[nHeap]; nHeap--; DOWNHEAP(1);
+         nNodes++;
+         parent[n1] = parent[n2] = nNodes;
+         weight[nNodes] = ADDWEIGHTS(weight[n1], weight[n2]);
+         parent[nNodes] = -1;
+         nHeap++;
+         heap[nHeap] = nNodes;
+         UPHEAP(nHeap);
+      }
+      if (!(nNodes < (MAX_ALPHA_SIZE * 2)))
+         panic ( "hbMakeCodeLengths(2)" );
+
+      tooLong = False;
+      for (i = 1; i <= alphaSize; i++) {
+         j = 0;
+         k = i;
+         while (parent[k] >= 0) { k = parent[k]; j++; }
+         len[i-1] = j;
+         if (j > maxLen) tooLong = True;
+      }
+      
+      if (! tooLong) break;
+
+      for (i = 1; i < alphaSize; i++) {
+         j = weight[i] >> 8;
+         j = 1 + (j / 2);
+         weight[i] = j << 8;
+      }
+   }
+}
+
+
+/*---------------------------------------------*/
+void hbAssignCodes ( Int32 *code,
+                     UChar *length,
+                     Int32 minLen,
+                     Int32 maxLen,
+                     Int32 alphaSize )
+{
+   Int32 n, vec, i;
+
+   vec = 0;
+   for (n = minLen; n <= maxLen; n++) {
+      for (i = 0; i < alphaSize; i++)
+         if (length[i] == n) { code[i] = vec; vec++; };
+      vec <<= 1;
+   }
+}
+
+
+/*---------------------------------------------*/
+void hbCreateDecodeTables ( Int32 *limit,
+                            Int32 *base,
+                            Int32 *perm,
+                            UChar *length,
+                            Int32 minLen,
+                            Int32 maxLen,
+                            Int32 alphaSize )
+{
+   Int32 pp, i, j, vec;
+
+   pp = 0;
+   for (i = minLen; i <= maxLen; i++)
+      for (j = 0; j < alphaSize; j++)
+         if (length[j] == i) { perm[pp] = j; pp++; };
+
+   for (i = 0; i < MAX_CODE_LEN; i++) base[i] = 0;
+   for (i = 0; i < alphaSize; i++) base[length[i]+1]++;
+
+   for (i = 1; i < MAX_CODE_LEN; i++) base[i] += base[i-1];
+
+   for (i = 0; i < MAX_CODE_LEN; i++) limit[i] = 0;
+   vec = 0;
+
+   for (i = minLen; i <= maxLen; i++) {
+      vec += (base[i+1] - base[i]);
+      limit[i] = vec-1;
+      vec <<= 1;
+   }
+   for (i = minLen + 1; i <= maxLen; i++)
+      base[i] = ((limit[i-1] + 1) << 1) - base[i];
+}
+
+
+
+/*---------------------------------------------------*/
+/*--- Undoing the reversible transformation       ---*/
+/*---------------------------------------------------*/
+
+/*---------------------------------------------*/
+#define SET_LL4(i,n)                                          \
+   { if (((i) & 0x1) == 0)                                    \
+        ll4[(i) >> 1] = (ll4[(i) >> 1] & 0xf0) | (n); else    \
+        ll4[(i) >> 1] = (ll4[(i) >> 1] & 0x0f) | ((n) << 4);  \
+   }
+
+#define GET_LL4(i)                             \
+    (((UInt32)(ll4[(i) >> 1])) >> (((i) << 2) & 0x4) & 0xF)
+
+#define SET_LL(i,n)                       \
+   { ll16[i] = (UInt16)(n & 0x0000ffff);  \
+     SET_LL4(i, n >> 16);                 \
+   }
+
+#define GET_LL(i) \
+   (((UInt32)ll16[i]) | (GET_LL4(i) << 16))
+
+
+/*---------------------------------------------*/
+/*--
+  Manage memory for compression/decompression.
+  When compressing, a single block size applies to
+  all files processed, and that's set when the
+  program starts.  But when decompressing, each file
+  processed could have been compressed with a
+  different block size, so we may have to free
+  and reallocate on a per-file basis.
+
+  A call with argument of zero means
+  `free up everything.'  And a value of zero for
+  blockSize100k means no memory is currently allocated.
+--*/
+
+
+/*---------------------------------------------*/
+void allocateCompressStructures ( void )
+{
+   Int32 n  = 100000 * blockSize100k;
+   block    = malloc ( (n + 1 + NUM_OVERSHOOT_BYTES) * sizeof(UChar) );
+   quadrant = malloc ( (n     + NUM_OVERSHOOT_BYTES) * sizeof(Int16) );
+   zptr     = malloc ( n                             * sizeof(Int32) );
+   ftab     = malloc ( 65537                         * sizeof(Int32) );
+
+   if (block == NULL || quadrant == NULL ||
+       zptr == NULL  || ftab == NULL) {
+      Int32 totalDraw
+         = (n + 1 + NUM_OVERSHOOT_BYTES) * sizeof(UChar) +
+           (n     + NUM_OVERSHOOT_BYTES) * sizeof(Int16) +
+           n                             * sizeof(Int32) +
+           65537                         * sizeof(Int32);
+
+      compressOutOfMemory ( totalDraw, n );
+   }
+
+   /*--
+      Since we want valid indexes for block of
+      -1 to n + NUM_OVERSHOOT_BYTES - 1
+      inclusive.
+   --*/
+   block++;
+
+   /*--
+      The back end needs a place to store the MTF values
+      whilst it calculates the coding tables.  We could
+      put them in the zptr array.  However, these values
+      will fit in a short, so we overlay szptr at the 
+      start of zptr, in the hope of reducing the number
+      of cache misses induced by the multiple traversals
+      of the MTF values when calculating coding tables.
+      Seems to improve compression speed by about 1%.
+   --*/
+   szptr = (UInt16*)zptr;
+}
+
+
+/*---------------------------------------------*/
+void setDecompressStructureSizes ( Int32 newSize100k )
+{
+   if (! (0 <= newSize100k   && newSize100k   <= 9 &&
+          0 <= blockSize100k && blockSize100k <= 9))
+      panic ( "setDecompressStructureSizes" );
+
+   if (newSize100k == blockSize100k) return;
+
+   blockSize100k = newSize100k;
+
+   if (ll16  != NULL) free ( ll16  );
+   if (ll4   != NULL) free ( ll4   );
+   if (ll8   != NULL) free ( ll8   );
+   if (tt    != NULL) free ( tt    );
+
+   if (newSize100k == 0) return;
+
+   if (smallMode) {
+
+      Int32 n = 100000 * newSize100k;
+      ll16    = malloc ( n * sizeof(UInt16) );
+      ll4     = malloc ( ((n+1) >> 1) * sizeof(UChar) );
+
+      if (ll4 == NULL || ll16 == NULL) {
+         Int32 totalDraw
+            = n * sizeof(Int16) + ((n+1) >> 1) * sizeof(UChar);
+         uncompressOutOfMemory ( totalDraw, n );
+      }
+
+   } else {
+
+      Int32 n = 100000 * newSize100k;
+      ll8     = malloc ( n * sizeof(UChar) );
+      tt      = malloc ( n * sizeof(Int32) );
+
+      if (ll8 == NULL || tt == NULL) {
+         Int32 totalDraw
+            = n * sizeof(UChar) + n * sizeof(UInt32);
+         uncompressOutOfMemory ( totalDraw, n );
+      }
+
+   }
+}
+
+
+
+/*---------------------------------------------------*/
+/*--- The new back end                            ---*/
+/*---------------------------------------------------*/
+
+/*---------------------------------------------*/
+void makeMaps ( void )
+{
+   Int32 i;
+   nInUse = 0;
+   for (i = 0; i < 256; i++)
+      if (inUse[i]) {
+         seqToUnseq[nInUse] = i;
+         unseqToSeq[i] = nInUse;
+         nInUse++;
+      }
+}
+
+
+/*---------------------------------------------*/
+void generateMTFValues ( void )
+{
+   UChar  yy[256];
+   Int32  i, j;
+   UChar  tmp;
+   UChar  tmp2;
+   Int32  zPend;
+   Int32  wr;
+   Int32  EOB;
+
+   makeMaps();
+   EOB = nInUse+1;
+
+   for (i = 0; i <= EOB; i++) mtfFreq[i] = 0;
+
+   wr = 0;
+   zPend = 0;
+   for (i = 0; i < nInUse; i++) yy[i] = (UChar) i;
+   
+
+   for (i = 0; i <= last; i++) {
+      UChar ll_i;
+
+      #if DEBUG
+         assert (wr <= i);
+      #endif
+
+      ll_i = unseqToSeq[block[zptr[i] - 1]];
+      #if DEBUG
+         assert (ll_i < nInUse);
+      #endif
+
+      j = 0;
+      tmp = yy[j];
+      while ( ll_i != tmp ) {
+         j++;
+         tmp2 = tmp;
+         tmp = yy[j];
+         yy[j] = tmp2;
+      };
+      yy[0] = tmp;
+
+      if (j == 0) {
+         zPend++;
+      } else {
+         if (zPend > 0) {
+            zPend--;
+            while (True) {
+               switch (zPend % 2) {
+                  case 0: szptr[wr] = RUNA; wr++; mtfFreq[RUNA]++; break;
+                  case 1: szptr[wr] = RUNB; wr++; mtfFreq[RUNB]++; break;
+               };
+               if (zPend < 2) break;
+               zPend = (zPend - 2) / 2;
+            };
+            zPend = 0;
+         }
+         szptr[wr] = j+1; wr++; mtfFreq[j+1]++;
+      }
+   }
+
+   if (zPend > 0) {
+      zPend--;
+      while (True) {
+         switch (zPend % 2) {
+            case 0:  szptr[wr] = RUNA; wr++; mtfFreq[RUNA]++; break;
+            case 1:  szptr[wr] = RUNB; wr++; mtfFreq[RUNB]++; break;
+         };
+         if (zPend < 2) break;
+         zPend = (zPend - 2) / 2;
+      };
+   }
+
+   szptr[wr] = EOB; wr++; mtfFreq[EOB]++;
+
+   nMTF = wr;
+}
+
+
+/*---------------------------------------------*/
+#define LESSER_ICOST  0
+#define GREATER_ICOST 15
+
+void sendMTFValues ( void )
+{
+   Int32 v, t, i, j, gs, ge, totc, bt, bc, iter;
+   Int32 nSelectors, alphaSize, minLen, maxLen, selCtr;
+   Int32 nGroups, nBytes;
+
+   /*--
+   UChar  len [N_GROUPS][MAX_ALPHA_SIZE];
+   is a global since the decoder also needs it.
+
+   Int32  code[N_GROUPS][MAX_ALPHA_SIZE];
+   Int32  rfreq[N_GROUPS][MAX_ALPHA_SIZE];
+   are also globals only used in this proc.
+   Made global to keep stack frame size small.
+   --*/
+
+
+   UInt16 cost[N_GROUPS];
+   Int32  fave[N_GROUPS];
+
+   if (verbosity >= 3)
+      fprintf ( stderr, 
+                "      %d in block, %d after MTF & 1-2 coding, %d+2 syms in use\n", 
+                last+1, nMTF, nInUse );
+
+   alphaSize = nInUse+2;
+   for (t = 0; t < N_GROUPS; t++)
+      for (v = 0; v < alphaSize; v++)
+         len[t][v] = GREATER_ICOST;
+
+   /*--- Decide how many coding tables to use ---*/
+   if (nMTF <= 0) panic ( "sendMTFValues(0)" );
+   if (nMTF < 200) nGroups = 2; else
+   if (nMTF < 800) nGroups = 4; else
+                   nGroups = 6;
+
+   /*--- Generate an initial set of coding tables ---*/
+   { 
+      Int32 nPart, remF, tFreq, aFreq;
+
+      nPart = nGroups;
+      remF  = nMTF;
+      gs = 0;
+      while (nPart > 0) {
+         tFreq = remF / nPart;
+         ge = gs-1;
+         aFreq = 0;
+         while (aFreq < tFreq && ge < alphaSize-1) {
+            ge++;
+            aFreq += mtfFreq[ge];
+         }
+
+         if (ge > gs 
+             && nPart != nGroups && nPart != 1 
+             && ((nGroups-nPart) % 2 == 1)) {
+            aFreq -= mtfFreq[ge];
+            ge--;
+         }
+
+         if (verbosity >= 3)
+            fprintf ( stderr, 
+                      "      initial group %d, [%d .. %d], has %d syms (%4.1f%%)\n",
+                              nPart, gs, ge, aFreq, 
+                              (100.0 * (float)aFreq) / (float)nMTF );
+ 
+         for (v = 0; v < alphaSize; v++)
+            if (v >= gs && v <= ge) 
+               len[nPart-1][v] = LESSER_ICOST; else
+               len[nPart-1][v] = GREATER_ICOST;
+ 
+         nPart--;
+         gs = ge+1;
+         remF -= aFreq;
+      }
+   }
+
+   /*--- 
+      Iterate up to N_ITERS times to improve the tables.
+   ---*/
+   for (iter = 0; iter < N_ITERS; iter++) {
+
+      for (t = 0; t < nGroups; t++) fave[t] = 0;
+
+      for (t = 0; t < nGroups; t++)
+         for (v = 0; v < alphaSize; v++)
+            rfreq[t][v] = 0;
+
+      nSelectors = 0;
+      totc = 0;
+      gs = 0;
+      while (True) {
+
+         /*--- Set group start & end marks. --*/
+         if (gs >= nMTF) break;
+         ge = gs + G_SIZE - 1; 
+         if (ge >= nMTF) ge = nMTF-1;
+
+         /*-- 
+            Calculate the cost of this group as coded
+            by each of the coding tables.
+         --*/
+         for (t = 0; t < nGroups; t++) cost[t] = 0;
+
+         if (nGroups == 6) {
+            register UInt16 cost0, cost1, cost2, cost3, cost4, cost5;
+            cost0 = cost1 = cost2 = cost3 = cost4 = cost5 = 0;
+            for (i = gs; i <= ge; i++) { 
+               UInt16 icv = szptr[i];
+               cost0 += len[0][icv];
+               cost1 += len[1][icv];
+               cost2 += len[2][icv];
+               cost3 += len[3][icv];
+               cost4 += len[4][icv];
+               cost5 += len[5][icv];
+            }
+            cost[0] = cost0; cost[1] = cost1; cost[2] = cost2;
+            cost[3] = cost3; cost[4] = cost4; cost[5] = cost5;
+         } else {
+            for (i = gs; i <= ge; i++) { 
+               UInt16 icv = szptr[i];
+               for (t = 0; t < nGroups; t++) cost[t] += len[t][icv];
+            }
+         }
+ 
+         /*-- 
+            Find the coding table which is best for this group,
+            and record its identity in the selector table.
+         --*/
+         bc = 999999999; bt = -1;
+         for (t = 0; t < nGroups; t++)
+            if (cost[t] < bc) { bc = cost[t]; bt = t; };
+         totc += bc;
+         fave[bt]++;
+         selector[nSelectors] = bt;
+         nSelectors++;
+
+         /*-- 
+            Increment the symbol frequencies for the selected table.
+          --*/
+         for (i = gs; i <= ge; i++)
+            rfreq[bt][ szptr[i] ]++;
+
+         gs = ge+1;
+      }
+      if (verbosity >= 3) {
+         fprintf ( stderr, 
+                   "      pass %d: size is %d, grp uses are ", 
+                   iter+1, totc/8 );
+         for (t = 0; t < nGroups; t++)
+            fprintf ( stderr, "%d ", fave[t] );
+         fprintf ( stderr, "\n" );
+      }
+
+      /*--
+        Recompute the tables based on the accumulated frequencies.
+      --*/
+      for (t = 0; t < nGroups; t++)
+         hbMakeCodeLengths ( &len[t][0], &rfreq[t][0], alphaSize, 20 );
+   }
+
+
+   if (!(nGroups < 8)) panic ( "sendMTFValues(1)" );
+   if (!(nSelectors < 32768 &&
+         nSelectors <= (2 + (900000 / G_SIZE))))
+                       panic ( "sendMTFValues(2)" );
+
+
+   /*--- Compute MTF values for the selectors. ---*/
+   {
+      UChar pos[N_GROUPS], ll_i, tmp2, tmp;
+      for (i = 0; i < nGroups; i++) pos[i] = i;
+      for (i = 0; i < nSelectors; i++) {
+         ll_i = selector[i];
+         j = 0;
+         tmp = pos[j];
+         while ( ll_i != tmp ) {
+            j++;
+            tmp2 = tmp;
+            tmp = pos[j];
+            pos[j] = tmp2;
+         };
+         pos[0] = tmp;
+         selectorMtf[i] = j;
+      }
+   };
+
+   /*--- Assign actual codes for the tables. --*/
+   for (t = 0; t < nGroups; t++) {
+      minLen = 32;
+      maxLen = 0;
+      for (i = 0; i < alphaSize; i++) {
+         if (len[t][i] > maxLen) maxLen = len[t][i];
+         if (len[t][i] < minLen) minLen = len[t][i];
+      }
+      if (maxLen > 20) panic ( "sendMTFValues(3)" );
+      if (minLen < 1)  panic ( "sendMTFValues(4)" );
+      hbAssignCodes ( &code[t][0], &len[t][0], 
+                      minLen, maxLen, alphaSize );
+   }
+
+   /*--- Transmit the mapping table. ---*/
+   { 
+      Bool inUse16[16];
+      for (i = 0; i < 16; i++) {
+          inUse16[i] = False;
+          for (j = 0; j < 16; j++)
+             if (inUse[i * 16 + j]) inUse16[i] = True;
+      }
+     
+      nBytes = bytesOut;
+      for (i = 0; i < 16; i++)
+         if (inUse16[i]) bsW(1,1); else bsW(1,0);
+
+      for (i = 0; i < 16; i++)
+         if (inUse16[i])
+            for (j = 0; j < 16; j++)
+               if (inUse[i * 16 + j]) bsW(1,1); else bsW(1,0);
+
+      if (verbosity >= 3) 
+         fprintf ( stderr, "      bytes: mapping %d, ", bytesOut-nBytes );
+   }
+
+   /*--- Now the selectors. ---*/
+   nBytes = bytesOut;
+   bsW ( 3, nGroups );
+   bsW ( 15, nSelectors );
+   for (i = 0; i < nSelectors; i++) { 
+      for (j = 0; j < selectorMtf[i]; j++) bsW(1,1);
+      bsW(1,0);
+   }
+   if (verbosity >= 3)
+      fprintf ( stderr, "selectors %d, ", bytesOut-nBytes );
+
+   /*--- Now the coding tables. ---*/
+   nBytes = bytesOut;
+
+   for (t = 0; t < nGroups; t++) {
+      Int32 curr = len[t][0];
+      bsW ( 5, curr );
+      for (i = 0; i < alphaSize; i++) {
+         while (curr < len[t][i]) { bsW(2,2); curr++; /* 10 */ };
+         while (curr > len[t][i]) { bsW(2,3); curr--; /* 11 */ };
+         bsW ( 1, 0 );
+      }
+   }
+
+   if (verbosity >= 3)
+      fprintf ( stderr, "code lengths %d, ", bytesOut-nBytes );
+
+   /*--- And finally, the block data proper ---*/
+   nBytes = bytesOut;
+   selCtr = 0;
+   gs = 0;
+   while (True) {
+      if (gs >= nMTF) break;
+      ge = gs + G_SIZE - 1; 
+      if (ge >= nMTF) ge = nMTF-1;
+      for (i = gs; i <= ge; i++) { 
+         #if DEBUG
+            assert (selector[selCtr] < nGroups);
+         #endif
+         bsW ( len  [selector[selCtr]] [szptr[i]],
+               code [selector[selCtr]] [szptr[i]] );
+      }
+
+      gs = ge+1;
+      selCtr++;
+   }
+   if (!(selCtr == nSelectors)) panic ( "sendMTFValues(5)" );
+
+   if (verbosity >= 3)
+      fprintf ( stderr, "codes %d\n", bytesOut-nBytes );
+}
+
+
+/*---------------------------------------------*/
+void moveToFrontCodeAndSend ( void )
+{
+   bsPutIntVS ( 24, origPtr );
+   generateMTFValues();
+   sendMTFValues();
+}
+
+
+/*---------------------------------------------*/
+void recvDecodingTables ( void )
+{
+   Int32 i, j, t, nGroups, nSelectors, alphaSize;
+   Int32 minLen, maxLen;
+   Bool inUse16[16];
+
+   /*--- Receive the mapping table ---*/
+   for (i = 0; i < 16; i++)
+      if (bsR(1) == 1) 
+         inUse16[i] = True; else 
+         inUse16[i] = False;
+
+   for (i = 0; i < 256; i++) inUse[i] = False;
+
+   for (i = 0; i < 16; i++)
+      if (inUse16[i])
+         for (j = 0; j < 16; j++)
+            if (bsR(1) == 1) inUse[i * 16 + j] = True;
+
+   makeMaps();
+   alphaSize = nInUse+2;
+
+   /*--- Now the selectors ---*/
+   nGroups = bsR ( 3 );
+   nSelectors = bsR ( 15 );
+   for (i = 0; i < nSelectors; i++) {
+      j = 0;
+      while (bsR(1) == 1) j++;
+      selectorMtf[i] = j;
+   }
+
+   /*--- Undo the MTF values for the selectors. ---*/
+   {
+      UChar pos[N_GROUPS], tmp, v;
+      for (v = 0; v < nGroups; v++) pos[v] = v;
+   
+      for (i = 0; i < nSelectors; i++) {
+         v = selectorMtf[i];
+         tmp = pos[v];
+         while (v > 0) { pos[v] = pos[v-1]; v--; }
+         pos[0] = tmp;
+         selector[i] = tmp;
+      }
+   }
+
+   /*--- Now the coding tables ---*/
+   for (t = 0; t < nGroups; t++) {
+      Int32 curr = bsR ( 5 );
+      for (i = 0; i < alphaSize; i++) {
+         while (bsR(1) == 1) {
+            if (bsR(1) == 0) curr++; else curr--;
+         }
+         len[t][i] = curr;
+      }
+   }
+
+   /*--- Create the Huffman decoding tables ---*/
+   for (t = 0; t < nGroups; t++) {
+      minLen = 32;
+      maxLen = 0;
+      for (i = 0; i < alphaSize; i++) {
+         if (len[t][i] > maxLen) maxLen = len[t][i];
+         if (len[t][i] < minLen) minLen = len[t][i];
+      }
+      hbCreateDecodeTables ( 
+         &limit[t][0], &base[t][0], &perm[t][0], &len[t][0],
+         minLen, maxLen, alphaSize
+      );
+      minLens[t] = minLen;
+   }
+}
+
+
+/*---------------------------------------------*/
+#define GET_MTF_VAL(lval)                 \
+{                                         \
+   Int32 zt, zn, zvec, zj;                \
+   if (groupPos == 0) {                   \
+      groupNo++;                          \
+      groupPos = G_SIZE;                  \
+   }                                      \
+   groupPos--;                            \
+   zt = selector[groupNo];                \
+   zn = minLens[zt];                      \
+   zvec = bsR ( zn );                     \
+   while (zvec > limit[zt][zn]) {         \
+      zn++; bsR1(zj);                     \
+      zvec = (zvec << 1) | zj;            \
+   };                                     \
+   lval = perm[zt][zvec - base[zt][zn]];  \
+}
+
+
+/*---------------------------------------------*/
+void getAndMoveToFrontDecode ( void )
+{
+   UChar  yy[256];
+   Int32  i, j, nextSym, limitLast;
+   Int32  EOB, groupNo, groupPos;
+
+   limitLast = 100000 * blockSize100k;
+   origPtr   = bsGetIntVS ( 24 );
+
+   recvDecodingTables();
+   EOB      = nInUse+1;
+   groupNo  = -1;
+   groupPos = 0;
+
+   /*--
+      Setting up the unzftab entries here is not strictly
+      necessary, but it does save having to do it later
+      in a separate pass, and so saves a block's worth of
+      cache misses.
+   --*/
+   for (i = 0; i <= 255; i++) unzftab[i] = 0;
+
+   for (i = 0; i <= 255; i++) yy[i] = (UChar) i;
+
+   last = -1;
+
+   GET_MTF_VAL(nextSym);
+
+   while (True) {
+
+      if (nextSym == EOB) break;
+
+      if (nextSym == RUNA || nextSym == RUNB) {
+         UChar ch;
+         Int32 s = -1;
+         Int32 N = 1;
+         do {
+            if (nextSym == RUNA) s = s + (0+1) * N; else
+            if (nextSym == RUNB) s = s + (1+1) * N;
+            N = N * 2;
+            GET_MTF_VAL(nextSym);
+         }
+            while (nextSym == RUNA || nextSym == RUNB);
+
+         s++;
+         ch = seqToUnseq[yy[0]];
+         unzftab[ch] += s;
+
+         if (smallMode)
+            while (s > 0) {
+               last++; 
+               ll16[last] = ch;
+               s--;
+            }
+         else
+            while (s > 0) {
+               last++;
+               ll8[last] = ch;
+               s--;
+            };
+
+         if (last >= limitLast) blockOverrun();
+         continue;
+
+      } else {
+
+         UChar tmp;
+         last++; if (last >= limitLast) blockOverrun();
+
+         tmp = yy[nextSym-1];
+         unzftab[seqToUnseq[tmp]]++;
+         if (smallMode)
+            ll16[last] = seqToUnseq[tmp]; else
+            ll8[last]  = seqToUnseq[tmp];
+
+         /*--
+            This loop is hammered during decompression,
+            hence the unrolling.
+
+            for (j = nextSym-1; j > 0; j--) yy[j] = yy[j-1];
+         --*/
+
+         j = nextSym-1;
+         for (; j > 3; j -= 4) {
+            yy[j]   = yy[j-1];
+            yy[j-1] = yy[j-2];
+            yy[j-2] = yy[j-3];
+            yy[j-3] = yy[j-4];
+         }
+         for (; j > 0; j--) yy[j] = yy[j-1];
+
+         yy[0] = tmp;
+         GET_MTF_VAL(nextSym);
+         continue;
+      }
+   }
+}
+
+
+/*---------------------------------------------------*/
+/*--- Block-sorting machinery                     ---*/
+/*---------------------------------------------------*/
+
+/*---------------------------------------------*/
+/*--
+  Compare two strings in block.  We assume (see
+  discussion above) that i1 and i2 have a max
+  offset of 10 on entry, and that the first
+  bytes of both block and quadrant have been
+  copied into the "overshoot area", ie
+  into the subscript range
+  [last+1 .. last+NUM_OVERSHOOT_BYTES].
+--*/
+INLINE Bool fullGtU ( Int32 i1, Int32 i2 )
+{
+   Int32 k;
+   UChar c1, c2;
+   UInt16 s1, s2;
+
+   #if DEBUG
+      /*--
+        shellsort shouldn't ask to compare
+        something with itself.
+      --*/
+      assert (i1 != i2);
+   #endif
+
+   c1 = block[i1];
+   c2 = block[i2];
+   if (c1 != c2) return (c1 > c2);
+   i1++; i2++;
+
+   c1 = block[i1];
+   c2 = block[i2];
+   if (c1 != c2) return (c1 > c2);
+   i1++; i2++;
+
+   c1 = block[i1];
+   c2 = block[i2];
+   if (c1 != c2) return (c1 > c2);
+   i1++; i2++;
+
+   c1 = block[i1];
+   c2 = block[i2];
+   if (c1 != c2) return (c1 > c2);
+   i1++; i2++;
+
+   c1 = block[i1];
+   c2 = block[i2];
+   if (c1 != c2) return (c1 > c2);
+   i1++; i2++;
+
+   c1 = block[i1];
+   c2 = block[i2];
+   if (c1 != c2) return (c1 > c2);
+   i1++; i2++;
+
+   k = last + 1;
+
+   do {
+
+      c1 = block[i1];
+      c2 = block[i2];
+      if (c1 != c2) return (c1 > c2);
+      s1 = quadrant[i1];
+      s2 = quadrant[i2];
+      if (s1 != s2) return (s1 > s2);
+      i1++; i2++;
+
+      c1 = block[i1];
+      c2 = block[i2];
+      if (c1 != c2) return (c1 > c2);
+      s1 = quadrant[i1];
+      s2 = quadrant[i2];
+      if (s1 != s2) return (s1 > s2);
+      i1++; i2++;
+
+      c1 = block[i1];
+      c2 = block[i2];
+      if (c1 != c2) return (c1 > c2);
+      s1 = quadrant[i1];
+      s2 = quadrant[i2];
+      if (s1 != s2) return (s1 > s2);
+      i1++; i2++;
+
+      c1 = block[i1];
+      c2 = block[i2];
+      if (c1 != c2) return (c1 > c2);
+      s1 = quadrant[i1];
+      s2 = quadrant[i2];
+      if (s1 != s2) return (s1 > s2);
+      i1++; i2++;
+
+      if (i1 > last) { i1 -= last; i1--; };
+      if (i2 > last) { i2 -= last; i2--; };
+
+      k -= 4;
+      workDone++;
+   }
+      while (k >= 0);
+
+   return False;
+}
+
+/*---------------------------------------------*/
+/*--
+   Knuth's increments seem to work better
+   than Incerpi-Sedgewick here.  Possibly
+   because the number of elems to sort is
+   usually small, typically <= 20.
+--*/
+Int32 incs[14] = { 1, 4, 13, 40, 121, 364, 1093, 3280,
+                   9841, 29524, 88573, 265720,
+                   797161, 2391484 };
+
+void simpleSort ( Int32 lo, Int32 hi, Int32 d )
+{
+   Int32 i, j, h, bigN, hp;
+   Int32 v;
+
+   bigN = hi - lo + 1;
+   if (bigN < 2) return;
+
+   hp = 0;
+   while (incs[hp] < bigN) hp++;
+   hp--;
+
+   for (; hp >= 0; hp--) {
+      h = incs[hp];
+      if (verbosity >= 5) 
+         fprintf ( stderr, "          shell increment %d\n", h );
+
+      i = lo + h;
+      while (True) {
+
+         /*-- copy 1 --*/
+         if (i > hi) break;
+         v = zptr[i];
+         j = i;
+         while ( fullGtU ( zptr[j-h]+d, v+d ) ) {
+            zptr[j] = zptr[j-h];
+            j = j - h;
+            if (j <= (lo + h - 1)) break;
+         }
+         zptr[j] = v;
+         i++;
+
+         /*-- copy 2 --*/
+         if (i > hi) break;
+         v = zptr[i];
+         j = i;
+         while ( fullGtU ( zptr[j-h]+d, v+d ) ) {
+            zptr[j] = zptr[j-h];
+            j = j - h;
+            if (j <= (lo + h - 1)) break;
+         }
+         zptr[j] = v;
+         i++;
+
+         /*-- copy 3 --*/
+         if (i > hi) break;
+         v = zptr[i];
+         j = i;
+         while ( fullGtU ( zptr[j-h]+d, v+d ) ) {
+            zptr[j] = zptr[j-h];
+            j = j - h;
+            if (j <= (lo + h - 1)) break;
+         }
+         zptr[j] = v;
+         i++;
+
+         if (workDone > workLimit && firstAttempt) return;
+      }
+   }
+}
+
+
+/*---------------------------------------------*/
+/*--
+   The following is an implementation of
+   an elegant 3-way quicksort for strings,
+   described in a paper "Fast Algorithms for
+   Sorting and Searching Strings", by Robert
+   Sedgewick and Jon L. Bentley.
+--*/
+
+#define swap(lv1, lv2) \
+   { Int32 tmp = lv1; lv1 = lv2; lv2 = tmp; }
+
+INLINE void vswap ( Int32 p1, Int32 p2, Int32 n )
+{
+   while (n > 0) {
+      swap(zptr[p1], zptr[p2]);
+      p1++; p2++; n--;
+   }
+}
+
+INLINE UChar med3 ( UChar a, UChar b, UChar c )
+{
+   UChar t;
+   if (a > b) { t = a; a = b; b = t; };
+   if (b > c) { t = b; b = c; c = t; };
+   if (a > b)          b = a;
+   return b;
+}
+
+
+#define min(a,b) ((a) < (b)) ? (a) : (b)
+
+typedef
+   struct { Int32 ll; Int32 hh; Int32 dd; }
+   StackElem;
+
+#define push(lz,hz,dz) { stack[sp].ll = lz; \
+                         stack[sp].hh = hz; \
+                         stack[sp].dd = dz; \
+                         sp++; }
+
+#define pop(lz,hz,dz) { sp--;               \
+                        lz = stack[sp].ll;  \
+                        hz = stack[sp].hh;  \
+                        dz = stack[sp].dd; }
+
+#define SMALL_THRESH 20
+#define DEPTH_THRESH 10
+
+/*--
+   If you are ever unlucky/improbable enough
+   to get a stack overflow whilst sorting,
+   increase the following constant and try
+   again.  In practice I have never seen the
+   stack go above 27 elems, so the following
+   limit seems very generous.
+--*/
+#define QSORT_STACK_SIZE 1000
+
+
+void qSort3 ( Int32 loSt, Int32 hiSt, Int32 dSt )
+{
+   Int32 unLo, unHi, ltLo, gtHi, med, n, m;
+   Int32 sp, lo, hi, d;
+   StackElem stack[QSORT_STACK_SIZE];
+
+   sp = 0;
+   push ( loSt, hiSt, dSt );
+
+   while (sp > 0) {
+
+      if (sp >= QSORT_STACK_SIZE) panic ( "stack overflow in qSort3" );
+
+      pop ( lo, hi, d );
+
+      if (hi - lo < SMALL_THRESH || d > DEPTH_THRESH) {
+         simpleSort ( lo, hi, d );
+         if (workDone > workLimit && firstAttempt) return;
+         continue;
+      }
+
+      med = med3 ( block[zptr[ lo         ]+d],
+                   block[zptr[ hi         ]+d],
+                   block[zptr[ (lo+hi)>>1 ]+d] );
+
+      unLo = ltLo = lo;
+      unHi = gtHi = hi;
+
+      while (True) {
+         while (True) {
+            if (unLo > unHi) break;
+            n = ((Int32)block[zptr[unLo]+d]) - med;
+            if (n == 0) { swap(zptr[unLo], zptr[ltLo]); ltLo++; unLo++; continue; };
+            if (n >  0) break;
+            unLo++;
+         }
+         while (True) {
+            if (unLo > unHi) break;
+            n = ((Int32)block[zptr[unHi]+d]) - med;
+            if (n == 0) { swap(zptr[unHi], zptr[gtHi]); gtHi--; unHi--; continue; };
+            if (n <  0) break;
+            unHi--;
+         }
+         if (unLo > unHi) break;
+         swap(zptr[unLo], zptr[unHi]); unLo++; unHi--;
+      }
+      #if DEBUG
+         assert (unHi == unLo-1);
+      #endif
+
+      if (gtHi < ltLo) {
+         push(lo, hi, d+1 );
+         continue;
+      }
+
+      n = min(ltLo-lo, unLo-ltLo); vswap(lo, unLo-n, n);
+      m = min(hi-gtHi, gtHi-unHi); vswap(unLo, hi-m+1, m);
+
+      n = lo + unLo - ltLo - 1;
+      m = hi - (gtHi - unHi) + 1;
+
+      push ( lo, n, d );
+      push ( n+1, m-1, d+1 );
+      push ( m, hi, d );
+   }
+}
+
+
+/*---------------------------------------------*/
+
+#define BIGFREQ(b) (ftab[((b)+1) << 8] - ftab[(b) << 8])
+
+#define SETMASK (1 << 21)
+#define CLEARMASK (~(SETMASK))
+
+void sortIt ( void )
+{
+   Int32 i, j, ss, sb;
+   Int32 runningOrder[256];
+   Int32 copy[256];
+   Bool bigDone[256];
+   UChar c1, c2;
+   Int32 numQSorted;
+
+   /*--
+      In the various block-sized structures, live data runs
+      from 0 to last+NUM_OVERSHOOT_BYTES inclusive.  First,
+      set up the overshoot area for block.
+   --*/
+
+   if (verbosity >= 4) fprintf ( stderr, "        sort initialise ...\n" );
+   for (i = 0; i < NUM_OVERSHOOT_BYTES; i++)
+       block[last+i+1] = block[i % (last+1)];
+   for (i = 0; i <= last+NUM_OVERSHOOT_BYTES; i++)
+       quadrant[i] = 0;
+
+   block[-1] = block[last];
+
+   if (last < 4000) {
+
+      /*--
+         Use simpleSort(), since the full sorting mechanism
+         has quite a large constant overhead.
+      --*/
+      if (verbosity >= 4) fprintf ( stderr, "        simpleSort ...\n" );
+      for (i = 0; i <= last; i++) zptr[i] = i;
+      firstAttempt = False;
+      workDone = workLimit = 0;
+      simpleSort ( 0, last, 0 );
+      if (verbosity >= 4) fprintf ( stderr, "        simpleSort done.\n" );
+
+   } else {
+
+      numQSorted = 0;
+      for (i = 0; i <= 255; i++) bigDone[i] = False;
+
+      if (verbosity >= 4) fprintf ( stderr, "        bucket sorting ...\n" );
+
+      for (i = 0; i <= 65536; i++) ftab[i] = 0;
+
+      c1 = block[-1];
+      for (i = 0; i <= last; i++) {
+         c2 = block[i];
+         ftab[(c1 << 8) + c2]++;
+         c1 = c2;
+      }
+
+      for (i = 1; i <= 65536; i++) ftab[i] += ftab[i-1];
+
+      c1 = block[0];
+      for (i = 0; i < last; i++) {
+         c2 = block[i+1];
+         j = (c1 << 8) + c2;
+         c1 = c2;
+         ftab[j]--;
+         zptr[ftab[j]] = i;
+      }
+      j = (block[last] << 8) + block[0];
+      ftab[j]--;
+      zptr[ftab[j]] = last;
+
+      /*--
+         Now ftab contains the first loc of every small bucket.
+         Calculate the running order, from smallest to largest
+         big bucket.
+      --*/
+
+      for (i = 0; i <= 255; i++) runningOrder[i] = i;
+
+      {
+         Int32 vv;
+         Int32 h = 1;
+         do h = 3 * h + 1; while (h <= 256);
+         do {
+            h = h / 3;
+            for (i = h; i <= 255; i++) {
+               vv = runningOrder[i];
+               j = i;
+               while ( BIGFREQ(runningOrder[j-h]) > BIGFREQ(vv) ) {
+                  runningOrder[j] = runningOrder[j-h];
+                  j = j - h;
+                  if (j <= (h - 1)) goto zero;
+               }
+               zero:
+               runningOrder[j] = vv;
+            }
+         } while (h != 1);
+      }
+
+      /*--
+         The main sorting loop.
+      --*/
+
+      for (i = 0; i <= 255; i++) {
+
+         /*--
+            Process big buckets, starting with the least full.
+         --*/
+         ss = runningOrder[i];
+
+         /*--
+            Complete the big bucket [ss] by quicksorting
+            any unsorted small buckets [ss, j].  Hopefully
+            previous pointer-scanning phases have already
+            completed many of the small buckets [ss, j], so
+            we don't have to sort them at all.
+         --*/
+         for (j = 0; j <= 255; j++) {
+            sb = (ss << 8) + j;
+            if ( ! (ftab[sb] & SETMASK) ) {
+               Int32 lo = ftab[sb]   & CLEARMASK;
+               Int32 hi = (ftab[sb+1] & CLEARMASK) - 1;
+               if (hi > lo) {
+                  if (verbosity >= 4)
+                     fprintf ( stderr,
+                               "        qsort [0x%x, 0x%x]   done %d   this %d\n",
+                               ss, j, numQSorted, hi - lo + 1 );
+                  qSort3 ( lo, hi, 2 );
+                  numQSorted += ( hi - lo + 1 );
+                  if (workDone > workLimit && firstAttempt) return;
+               }
+               ftab[sb] |= SETMASK;
+            }
+         }
+
+         /*--
+            The ss big bucket is now done.  Record this fact,
+            and update the quadrant descriptors.  Remember to
+            update quadrants in the overshoot area too, if
+            necessary.  The "if (i < 255)" test merely skips
+            this updating for the last bucket processed, since
+            updating for the last bucket is pointless.
+         --*/
+         bigDone[ss] = True;
+
+         if (i < 255) {
+            Int32 bbStart  = ftab[ss << 8] & CLEARMASK;
+            Int32 bbSize   = (ftab[(ss+1) << 8] & CLEARMASK) - bbStart;
+            Int32 shifts   = 0;
+
+            while ((bbSize >> shifts) > 65534) shifts++;
+
+            for (j = 0; j < bbSize; j++) {
+               Int32 a2update     = zptr[bbStart + j];
+               UInt16 qVal        = (UInt16)(j >> shifts);
+               quadrant[a2update] = qVal;
+               if (a2update < NUM_OVERSHOOT_BYTES)
+                  quadrant[a2update + last + 1] = qVal;
+            }
+
+            if (! ( ((bbSize-1) >> shifts) <= 65535 )) panic ( "sortIt" );
+         }
+
+         /*--
+            Now scan this big bucket so as to synthesise the
+            sorted order for small buckets [t, ss] for all t != ss.
+         --*/
+         for (j = 0; j <= 255; j++)
+            copy[j] = ftab[(j << 8) + ss] & CLEARMASK;
+
+         for (j = ftab[ss << 8] & CLEARMASK;
+              j < (ftab[(ss+1) << 8] & CLEARMASK);
+              j++) {
+            c1 = block[zptr[j]-1];
+            if ( ! bigDone[c1] ) {
+               zptr[copy[c1]] = zptr[j] == 0 ? last : zptr[j] - 1;
+               copy[c1] ++;
+            }
+         }
+
+         for (j = 0; j <= 255; j++) ftab[(j << 8) + ss] |= SETMASK;
+      }
+      if (verbosity >= 4)
+         fprintf ( stderr, "        %d pointers, %d sorted, %d scanned\n",
+                           last+1, numQSorted, (last+1) - numQSorted );
+   }
+}
+
+
+/*---------------------------------------------------*/
+/*--- Stuff for randomising repetitive blocks     ---*/
+/*---------------------------------------------------*/
+
+/*---------------------------------------------*/
+Int32 rNums[512] = { 
+   619, 720, 127, 481, 931, 816, 813, 233, 566, 247, 
+   985, 724, 205, 454, 863, 491, 741, 242, 949, 214, 
+   733, 859, 335, 708, 621, 574, 73, 654, 730, 472, 
+   419, 436, 278, 496, 867, 210, 399, 680, 480, 51, 
+   878, 465, 811, 169, 869, 675, 611, 697, 867, 561, 
+   862, 687, 507, 283, 482, 129, 807, 591, 733, 623, 
+   150, 238, 59, 379, 684, 877, 625, 169, 643, 105, 
+   170, 607, 520, 932, 727, 476, 693, 425, 174, 647, 
+   73, 122, 335, 530, 442, 853, 695, 249, 445, 515, 
+   909, 545, 703, 919, 874, 474, 882, 500, 594, 612, 
+   641, 801, 220, 162, 819, 984, 589, 513, 495, 799, 
+   161, 604, 958, 533, 221, 400, 386, 867, 600, 782, 
+   382, 596, 414, 171, 516, 375, 682, 485, 911, 276, 
+   98, 553, 163, 354, 666, 933, 424, 341, 533, 870, 
+   227, 730, 475, 186, 263, 647, 537, 686, 600, 224, 
+   469, 68, 770, 919, 190, 373, 294, 822, 808, 206, 
+   184, 943, 795, 384, 383, 461, 404, 758, 839, 887, 
+   715, 67, 618, 276, 204, 918, 873, 777, 604, 560, 
+   951, 160, 578, 722, 79, 804, 96, 409, 713, 940, 
+   652, 934, 970, 447, 318, 353, 859, 672, 112, 785, 
+   645, 863, 803, 350, 139, 93, 354, 99, 820, 908, 
+   609, 772, 154, 274, 580, 184, 79, 626, 630, 742, 
+   653, 282, 762, 623, 680, 81, 927, 626, 789, 125, 
+   411, 521, 938, 300, 821, 78, 343, 175, 128, 250, 
+   170, 774, 972, 275, 999, 639, 495, 78, 352, 126, 
+   857, 956, 358, 619, 580, 124, 737, 594, 701, 612, 
+   669, 112, 134, 694, 363, 992, 809, 743, 168, 974, 
+   944, 375, 748, 52, 600, 747, 642, 182, 862, 81, 
+   344, 805, 988, 739, 511, 655, 814, 334, 249, 515, 
+   897, 955, 664, 981, 649, 113, 974, 459, 893, 228, 
+   433, 837, 553, 268, 926, 240, 102, 654, 459, 51, 
+   686, 754, 806, 760, 493, 403, 415, 394, 687, 700, 
+   946, 670, 656, 610, 738, 392, 760, 799, 887, 653, 
+   978, 321, 576, 617, 626, 502, 894, 679, 243, 440, 
+   680, 879, 194, 572, 640, 724, 926, 56, 204, 700, 
+   707, 151, 457, 449, 797, 195, 791, 558, 945, 679, 
+   297, 59, 87, 824, 713, 663, 412, 693, 342, 606, 
+   134, 108, 571, 364, 631, 212, 174, 643, 304, 329, 
+   343, 97, 430, 751, 497, 314, 983, 374, 822, 928, 
+   140, 206, 73, 263, 980, 736, 876, 478, 430, 305, 
+   170, 514, 364, 692, 829, 82, 855, 953, 676, 246, 
+   369, 970, 294, 750, 807, 827, 150, 790, 288, 923, 
+   804, 378, 215, 828, 592, 281, 565, 555, 710, 82, 
+   896, 831, 547, 261, 524, 462, 293, 465, 502, 56, 
+   661, 821, 976, 991, 658, 869, 905, 758, 745, 193, 
+   768, 550, 608, 933, 378, 286, 215, 979, 792, 961, 
+   61, 688, 793, 644, 986, 403, 106, 366, 905, 644, 
+   372, 567, 466, 434, 645, 210, 389, 550, 919, 135, 
+   780, 773, 635, 389, 707, 100, 626, 958, 165, 504, 
+   920, 176, 193, 713, 857, 265, 203, 50, 668, 108, 
+   645, 990, 626, 197, 510, 357, 358, 850, 858, 364, 
+   936, 638
+};
+
+
+#define RAND_DECLS                                \
+   Int32 rNToGo = 0;                              \
+   Int32 rTPos  = 0;                              \
+
+#define RAND_MASK ((rNToGo == 1) ? 1 : 0)
+
+#define RAND_UPD_MASK                             \
+   if (rNToGo == 0) {                             \
+      rNToGo = rNums[rTPos];                      \
+      rTPos++; if (rTPos == 512) rTPos = 0;       \
+   }                                              \
+   rNToGo--;
+
+
+
+/*---------------------------------------------------*/
+/*--- The Reversible Transformation (tm)          ---*/
+/*---------------------------------------------------*/
+
+/*---------------------------------------------*/
+void randomiseBlock ( void )
+{
+   Int32 i;
+   RAND_DECLS;
+   for (i = 0; i < 256; i++) inUse[i] = False;
+
+   for (i = 0; i <= last; i++) {
+      RAND_UPD_MASK;
+      block[i] ^= RAND_MASK;
+      inUse[block[i]] = True;
+   }
+}
+
+
+/*---------------------------------------------*/
+void doReversibleTransformation ( void )
+{
+   Int32 i;
+
+   if (verbosity >= 2) fprintf ( stderr, "\n" );
+
+   workLimit       = workFactor * last;
+   workDone        = 0;
+   blockRandomised = False;
+   firstAttempt    = True;
+
+   sortIt ();
+
+   if (verbosity >= 3)
+      fprintf ( stderr, "      %d work, %d block, ratio %5.2f\n",
+                        workDone, last, (float)workDone / (float)(last) );
+
+   if (workDone > workLimit && firstAttempt) {
+      if (verbosity >= 2)
+         fprintf ( stderr, "    sorting aborted; randomising block\n" );
+      randomiseBlock ();
+      workLimit = workDone = 0;
+      blockRandomised = True;
+      firstAttempt = False;
+      sortIt();
+      if (verbosity >= 3)
+         fprintf ( stderr, "      %d work, %d block, ratio %f\n",
+                           workDone, last, (float)workDone / (float)(last) );
+   }
+
+   origPtr = -1;
+   for (i = 0; i <= last; i++)
+       if (zptr[i] == 0)
+          { origPtr = i; break; };
+
+   if (origPtr == -1) panic ( "doReversibleTransformation" );
+}
+
+
+/*---------------------------------------------*/
+
+INLINE Int32 indexIntoF ( Int32 indx, Int32 *cftab )
+{
+   Int32 nb, na, mid;
+   nb = 0;
+   na = 256;
+   do {
+      mid = (nb + na) >> 1;
+      if (indx >= cftab[mid]) nb = mid; else na = mid;
+   }
+   while (na - nb != 1);
+   return nb;
+}
+
+
+#define GET_SMALL(cccc)                     \
+                                            \
+      cccc = indexIntoF ( tPos, cftab );    \
+      tPos = GET_LL(tPos);
+
+
+void undoReversibleTransformation_small ( FILE* dst )
+{
+   Int32  cftab[257], cftabAlso[257];
+   Int32  i, j, tmp, tPos;
+   UChar  ch;
+
+   /*--
+      We assume here that the global array unzftab will
+      already be holding the frequency counts for
+      ll8[0 .. last].
+   --*/
+
+   /*-- Set up cftab to facilitate generation of indexIntoF --*/
+   cftab[0] = 0;
+   for (i = 1; i <= 256; i++) cftab[i] = unzftab[i-1];
+   for (i = 1; i <= 256; i++) cftab[i] += cftab[i-1];
+
+   /*-- Make a copy of it, used in generation of T --*/
+   for (i = 0; i <= 256; i++) cftabAlso[i] = cftab[i];
+
+   /*-- compute the T vector --*/
+   for (i = 0; i <= last; i++) {
+      ch = (UChar)ll16[i];
+      SET_LL(i, cftabAlso[ch]);
+      cftabAlso[ch]++;
+   }
+
+   /*--
+      Compute T^(-1) by pointer reversal on T.  This is rather
+      subtle, in that, if the original block was two or more
+      (in general, N) concatenated copies of the same thing,
+      the T vector will consist of N cycles, each of length
+      blocksize / N, and decoding will involve traversing one
+      of these cycles N times.  Which particular cycle doesn't
+      matter -- they are all equivalent.  The tricky part is to
+      make sure that the pointer reversal creates a correct
+      reversed cycle for us to traverse.  So, the code below
+      simply reverses whatever cycle origPtr happens to fall into,
+      without regard to the cycle length.  That gives one reversed
+      cycle, which for normal blocks, is the entire block-size long.
+      For repeated blocks, it will be interspersed with the other
+      N-1 non-reversed cycles.  Providing that the F-subscripting
+      phase which follows starts at origPtr, all then works ok.
+   --*/
+   i = origPtr;
+   j = GET_LL(i);
+   do {
+      tmp = GET_LL(j);
+      SET_LL(j, i);
+      i = j;
+      j = tmp;
+   }
+      while (i != origPtr);
+
+   /*--
+      We recreate the original by subscripting F through T^(-1).
+      The run-length-decoder below requires characters incrementally,
+      so tPos is set to a starting value, and is updated by
+      the GET_SMALL macro.
+   --*/
+   tPos   = origPtr;
+
+   /*-------------------------------------------------*/
+   /*--
+      This is pretty much a verbatim copy of the
+      run-length decoder present in the distribution
+      bzip-0.21; it has to be here to avoid creating
+      block[] as an intermediary structure.  As in 0.21,
+      this code derives from some sent to me by
+      Christian von Roques.
+
+      It allows dst==NULL, so as to support the test (-t)
+      option without slowing down the fast decompression
+      code.
+   --*/
+   {
+      IntNative retVal;
+      Int32     i2, count, chPrev, ch2;
+      UInt32    localCrc;
+
+      count    = 0;
+      i2       = 0;
+      ch2      = 256;   /*-- not a char and not EOF --*/
+      localCrc = getGlobalCRC();
+
+      {
+         RAND_DECLS;
+         while ( i2 <= last ) {
+            chPrev = ch2;
+            GET_SMALL(ch2);
+            if (blockRandomised) {
+               RAND_UPD_MASK;
+               ch2 ^= (UInt32)RAND_MASK;
+            }
+            i2++;
+   
+            if (dst)
+               retVal = putc ( ch2, dst );
+   
+            UPDATE_CRC ( localCrc, (UChar)ch2 );
+   
+            if (ch2 != chPrev) {
+               count = 1;
+            } else {
+               count++;
+               if (count >= 4) {
+                  Int32 j2;
+                  UChar z;
+                  GET_SMALL(z);
+                  if (blockRandomised) {
+                     RAND_UPD_MASK;
+                     z ^= RAND_MASK;
+                  }
+                  for (j2 = 0;  j2 < (Int32)z;  j2++) {
+                     if (dst) retVal = putc (ch2, dst);
+                     UPDATE_CRC ( localCrc, (UChar)ch2 );
+                  }
+                  i2++;
+                  count = 0;
+               }
+            }
+         }
+      }
+
+      setGlobalCRC ( localCrc );
+   }
+   /*-- end of the in-line run-length-decoder. --*/
+}
+#undef GET_SMALL
+
+
+/*---------------------------------------------*/
+
+#define GET_FAST(cccc)                       \
+                                             \
+      cccc = ll8[tPos];                      \
+      tPos = tt[tPos];
+
+
+void undoReversibleTransformation_fast ( FILE* dst )
+{
+   Int32  cftab[257];
+   Int32  i, tPos;
+   UChar  ch;
+
+   /*--
+      We assume here that the global array unzftab will
+      already be holding the frequency counts for
+      ll8[0 .. last].
+   --*/
+
+   /*-- Set up cftab to facilitate generation of T^(-1) --*/
+   cftab[0] = 0;
+   for (i = 1; i <= 256; i++) cftab[i] = unzftab[i-1];
+   for (i = 1; i <= 256; i++) cftab[i] += cftab[i-1];
+
+   /*-- compute the T^(-1) vector --*/
+   for (i = 0; i <= last; i++) {
+      ch = (UChar)ll8[i];
+      tt[cftab[ch]] = i;
+      cftab[ch]++;
+   }
+
+   /*--
+      We recreate the original by subscripting L through T^(-1).
+      The run-length-decoder below requires characters incrementally,
+      so tPos is set to a starting value, and is updated by
+      the GET_FAST macro.
+   --*/
+   tPos   = tt[origPtr];
+
+   /*-------------------------------------------------*/
+   /*--
+      This is pretty much a verbatim copy of the
+      run-length decoder present in the distribution
+      bzip-0.21; it has to be here to avoid creating
+      block[] as an intermediary structure.  As in 0.21,
+      this code derives from some sent to me by
+      Christian von Roques.
+   --*/
+   {
+      IntNative retVal;
+      Int32     i2, count, chPrev, ch2;
+      UInt32    localCrc;
+
+      count    = 0;
+      i2       = 0;
+      ch2      = 256;   /*-- not a char and not EOF --*/
+      localCrc = getGlobalCRC();
+
+      if (blockRandomised) {
+         RAND_DECLS;
+         while ( i2 <= last ) {
+            chPrev = ch2;
+            GET_FAST(ch2);
+            RAND_UPD_MASK;
+            ch2 ^= (UInt32)RAND_MASK;
+            i2++;
+   
+            retVal = putc ( ch2, dst );
+            UPDATE_CRC ( localCrc, (UChar)ch2 );
+   
+            if (ch2 != chPrev) {
+               count = 1;
+            } else {
+               count++;
+               if (count >= 4) {
+                  Int32 j2;
+                  UChar z;
+                  GET_FAST(z);
+                  RAND_UPD_MASK;
+                  z ^= RAND_MASK;
+                  for (j2 = 0;  j2 < (Int32)z;  j2++) {
+                     retVal = putc (ch2, dst);
+                     UPDATE_CRC ( localCrc, (UChar)ch2 );
+                  }
+                  i2++;
+                  count = 0;
+               }
+            }
+         }
+
+      } else {
+
+         while ( i2 <= last ) {
+            chPrev = ch2;
+            GET_FAST(ch2);
+            i2++;
+   
+            retVal = putc ( ch2, dst );
+            UPDATE_CRC ( localCrc, (UChar)ch2 );
+   
+            if (ch2 != chPrev) {
+               count = 1;
+            } else {
+               count++;
+               if (count >= 4) {
+                  Int32 j2;
+                  UChar z;
+                  GET_FAST(z);
+                  for (j2 = 0;  j2 < (Int32)z;  j2++) {
+                     retVal = putc (ch2, dst);
+                     UPDATE_CRC ( localCrc, (UChar)ch2 );
+                  }
+                  i2++;
+                  count = 0;
+               }
+            }
+         }
+
+      }   /*-- if (blockRandomised) --*/
+
+      setGlobalCRC ( localCrc );
+   }
+   /*-- end of the in-line run-length-decoder. --*/
+}
+#undef GET_FAST
+
+
+/*---------------------------------------------------*/
+/*--- The block loader and RLEr                   ---*/
+/*---------------------------------------------------*/
+
+/*---------------------------------------------*/
+/*  Top 16:   run length, 1 to 255.
+*   Lower 16: the char, or MY_EOF for EOF.
+*/
+
+#define MY_EOF 257
+
+INLINE Int32 getRLEpair ( FILE* src )
+{
+   Int32     runLength;
+   IntNative ch, chLatest;
+
+   ch = getc ( src );
+
+   /*--- Because I have no idea what kind of a value EOF is. ---*/
+   if (ch == EOF) {
+      ERROR_IF_NOT_ZERO ( errno );
+      return (1 << 16) | MY_EOF;
+   }
+
+   runLength = 0;
+   do {
+      chLatest = getc ( src );
+      runLength++;
+      bytesIn++;
+   }
+      while (ch == chLatest && runLength < 255);
+
+   if ( chLatest != EOF ) {
+      if ( ungetc ( chLatest, src ) == EOF )
+         panic ( "getRLEpair: ungetc failed" );
+   } else {
+      ERROR_IF_NOT_ZERO ( errno );
+   }
+
+   /*--- Conditional is just a speedup hack. ---*/
+   if (runLength == 1) {
+      UPDATE_CRC ( globalCrc, (UChar)ch );
+      return (1 << 16) | ch;
+   } else {
+      Int32 i;
+      for (i = 1; i <= runLength; i++)
+         UPDATE_CRC ( globalCrc, (UChar)ch );
+      return (runLength << 16) | ch;
+   }
+}
+
+
+/*---------------------------------------------*/
+void loadAndRLEsource ( FILE* src )
+{
+   Int32 ch, allowableBlockSize, i;
+
+   last = -1;
+   ch   = 0;
+
+   for (i = 0; i < 256; i++) inUse[i] = False;
+
+   /*--- 20 is just a paranoia constant ---*/
+   allowableBlockSize = 100000 * blockSize100k - 20;
+
+   while (last < allowableBlockSize && ch != MY_EOF) {
+      Int32 rlePair, runLen;
+      rlePair = getRLEpair ( src );
+      ch      = rlePair & 0xFFFF;
+      runLen  = (UInt32)rlePair >> 16;
+
+      #if DEBUG
+         assert (runLen >= 1 && runLen <= 255);
+      #endif
+
+      if (ch != MY_EOF) {
+         inUse[ch] = True;
+         switch (runLen) {
+            case 1:
+               last++; block[last] = (UChar)ch; break;
+            case 2:
+               last++; block[last] = (UChar)ch;
+               last++; block[last] = (UChar)ch; break;
+            case 3:
+               last++; block[last] = (UChar)ch;
+               last++; block[last] = (UChar)ch;
+               last++; block[last] = (UChar)ch; break;
+            default:
+               inUse[runLen-4] = True;
+               last++; block[last] = (UChar)ch;
+               last++; block[last] = (UChar)ch;
+               last++; block[last] = (UChar)ch;
+               last++; block[last] = (UChar)ch;
+               last++; block[last] = (UChar)(runLen-4); break;
+         }
+      }
+   }
+}
+
+
+/*---------------------------------------------------*/
+/*--- Processing of complete files and streams    ---*/
+/*---------------------------------------------------*/
+
+/*---------------------------------------------*/
+void compressStream ( FILE *stream, FILE *zStream )
+{
+   IntNative  retVal;
+   UInt32     blockCRC, combinedCRC;
+   Int32      blockNo;
+
+   blockNo  = 0;
+   bytesIn  = 0;
+   bytesOut = 0;
+   nBlocksRandomised = 0;
+
+   SET_BINARY_MODE(stream);
+   SET_BINARY_MODE(zStream);
+
+   ERROR_IF_NOT_ZERO ( ferror(stream) );
+   ERROR_IF_NOT_ZERO ( ferror(zStream) );
+
+   bsSetStream ( zStream, True );
+
+   /*--- Write `magic' bytes B and Z,
+         then h indicating file-format == huffmanised,
+         followed by a digit indicating blockSize100k.
+   ---*/
+   bsPutUChar ( 'B' );
+   bsPutUChar ( 'Z' );
+   bsPutUChar ( 'h' );
+   bsPutUChar ( '0' + blockSize100k );
+
+   combinedCRC = 0;
+
+   if (verbosity >= 2) fprintf ( stderr, "\n" );
+
+   while (True) {
+
+      blockNo++;
+      initialiseCRC ();
+      loadAndRLEsource ( stream );
+      ERROR_IF_NOT_ZERO ( ferror(stream) );
+      if (last == -1) break;
+
+      blockCRC = getFinalCRC ();
+      combinedCRC = (combinedCRC << 1) | (combinedCRC >> 31);
+      combinedCRC ^= blockCRC;
+
+      if (verbosity >= 2)
+         fprintf ( stderr, "    block %d: crc = 0x%8x, combined CRC = 0x%8x, size = %d",
+                           blockNo, blockCRC, combinedCRC, last+1 );
+
+      /*-- sort the block and establish posn of original string --*/
+      doReversibleTransformation ();
+
+      /*--
+        A 6-byte block header, the value chosen arbitrarily
+        as 0x314159265359 :-).  A 32 bit value does not really
+        give a strong enough guarantee that the value will not
+        appear by chance in the compressed datastream.  Worst-case
+        probability of this event, for a 900k block, is about
+        2.0e-3 for 32 bits, 1.0e-5 for 40 bits and 4.0e-8 for 48 bits.
+        For a compressed file of size 100Gb -- about 100000 blocks --
+        only a 48-bit marker will do.  NB: normal compression/
+        decompression do *not* rely on these statistical properties.
+        They are only important when trying to recover blocks from
+        damaged files.
+      --*/
+      bsPutUChar ( 0x31 ); bsPutUChar ( 0x41 );
+      bsPutUChar ( 0x59 ); bsPutUChar ( 0x26 );
+      bsPutUChar ( 0x53 ); bsPutUChar ( 0x59 );
+
+      /*-- Now the block's CRC, so it is in a known place. --*/
+      bsPutUInt32 ( blockCRC );
+
+      /*-- Now a single bit indicating randomisation. --*/
+      if (blockRandomised) {
+         bsW(1,1); nBlocksRandomised++;
+      } else
+         bsW(1,0);
+
+      /*-- Finally, block's contents proper. --*/
+      moveToFrontCodeAndSend ();
+
+      ERROR_IF_NOT_ZERO ( ferror(zStream) );
+   }
+
+   if (verbosity >= 2 && nBlocksRandomised > 0)
+      fprintf ( stderr, "    %d block%s needed randomisation\n", 
+                        nBlocksRandomised,
+                        nBlocksRandomised == 1 ? "" : "s" );
+
+   /*--
+      Now another magic 48-bit number, 0x177245385090, to
+      indicate the end of the last block.  (sqrt(pi), if
+      you want to know.  I did want to use e, but it contains
+      too much repetition -- 27 18 28 18 28 46 -- for me
+      to feel statistically comfortable.  Call me paranoid.)
+   --*/
+
+   bsPutUChar ( 0x17 ); bsPutUChar ( 0x72 );
+   bsPutUChar ( 0x45 ); bsPutUChar ( 0x38 );
+   bsPutUChar ( 0x50 ); bsPutUChar ( 0x90 );
+
+   bsPutUInt32 ( combinedCRC );
+   if (verbosity >= 2)
+      fprintf ( stderr, "    final combined CRC = 0x%x\n   ", combinedCRC );
+
+   /*-- Close the files in an utterly paranoid way. --*/
+   bsFinishedWithStream ();
+
+   ERROR_IF_NOT_ZERO ( ferror(zStream) );
+   retVal = fflush ( zStream );
+   ERROR_IF_EOF ( retVal );
+   retVal = fclose ( zStream );
+   ERROR_IF_EOF ( retVal );
+
+   ERROR_IF_NOT_ZERO ( ferror(stream) );
+   retVal = fclose ( stream );
+   ERROR_IF_EOF ( retVal );
+
+   if (bytesIn == 0) bytesIn = 1;
+   if (bytesOut == 0) bytesOut = 1;
+
+   if (verbosity >= 1)
+      fprintf ( stderr, "%6.3f:1, %6.3f bits/byte, "
+                        "%5.2f%% saved, %d in, %d out.\n",
+                (float)bytesIn / (float)bytesOut,
+                (8.0 * (float)bytesOut) / (float)bytesIn,
+                100.0 * (1.0 - (float)bytesOut / (float)bytesIn),
+                bytesIn,
+                bytesOut
+              );
+}
+
+
+/*---------------------------------------------*/
+Bool uncompressStream ( FILE *zStream, FILE *stream )
+{
+   UChar      magic1, magic2, magic3, magic4;
+   UChar      magic5, magic6;
+   UInt32     storedBlockCRC, storedCombinedCRC;
+   UInt32     computedBlockCRC, computedCombinedCRC;
+   Int32      currBlockNo;
+   IntNative  retVal;
+
+   SET_BINARY_MODE(stream);
+   SET_BINARY_MODE(zStream);
+
+   ERROR_IF_NOT_ZERO ( ferror(stream) );
+   ERROR_IF_NOT_ZERO ( ferror(zStream) );
+
+   bsSetStream ( zStream, False );
+
+   /*--
+      A bad magic number is `recoverable from';
+      return with False so the caller skips the file.
+   --*/
+   magic1 = bsGetUChar ();
+   magic2 = bsGetUChar ();
+   magic3 = bsGetUChar ();
+   magic4 = bsGetUChar ();
+   if (magic1 != 'B' ||
+       magic2 != 'Z' ||
+       magic3 != 'h' ||
+       magic4 < '1'  ||
+       magic4 > '9') {
+     bsFinishedWithStream();
+     retVal = fclose ( stream );
+     ERROR_IF_EOF ( retVal );
+     return False;
+   }
+
+   setDecompressStructureSizes ( magic4 - '0' );
+   computedCombinedCRC = 0;
+
+   if (verbosity >= 2) fprintf ( stderr, "\n    " );
+   currBlockNo = 0;
+
+   while (True) {
+      magic1 = bsGetUChar ();
+      magic2 = bsGetUChar ();
+      magic3 = bsGetUChar ();
+      magic4 = bsGetUChar ();
+      magic5 = bsGetUChar ();
+      magic6 = bsGetUChar ();
+      if (magic1 == 0x17 && magic2 == 0x72 &&
+          magic3 == 0x45 && magic4 == 0x38 &&
+          magic5 == 0x50 && magic6 == 0x90) break;
+
+      if (magic1 != 0x31 || magic2 != 0x41 ||
+          magic3 != 0x59 || magic4 != 0x26 ||
+          magic5 != 0x53 || magic6 != 0x59) badBlockHeader();
+
+      storedBlockCRC = bsGetUInt32 ();
+
+      if (bsR(1) == 1)
+         blockRandomised = True; else
+         blockRandomised = False;
+
+      currBlockNo++;
+      if (verbosity >= 2)
+         fprintf ( stderr, "[%d: huff+mtf ", currBlockNo );
+      getAndMoveToFrontDecode ();
+      ERROR_IF_NOT_ZERO ( ferror(zStream) );
+
+      initialiseCRC();
+      if (verbosity >= 2) fprintf ( stderr, "rt+rld" );
+      if (smallMode)
+         undoReversibleTransformation_small ( stream );
+         else
+         undoReversibleTransformation_fast  ( stream );
+
+      ERROR_IF_NOT_ZERO ( ferror(stream) );
+
+      computedBlockCRC = getFinalCRC();
+      if (verbosity >= 3)
+         fprintf ( stderr, " {0x%x, 0x%x}", storedBlockCRC, computedBlockCRC );
+      if (verbosity >= 2) fprintf ( stderr, "] " );
+
+      /*-- A bad CRC is considered a fatal error. --*/
+      if (storedBlockCRC != computedBlockCRC)
+         crcError ( storedBlockCRC, computedBlockCRC );
+
+      computedCombinedCRC = (computedCombinedCRC << 1) | (computedCombinedCRC >> 31);
+      computedCombinedCRC ^= computedBlockCRC;
+   };
+
+   if (verbosity >= 2) fprintf ( stderr, "\n    " );
+
+   storedCombinedCRC  = bsGetUInt32 ();
+   if (verbosity >= 2)
+      fprintf ( stderr,
+                "combined CRCs: stored = 0x%x, computed = 0x%x\n    ",
+                storedCombinedCRC, computedCombinedCRC );
+   if (storedCombinedCRC != computedCombinedCRC)
+      crcError ( storedCombinedCRC, computedCombinedCRC );
+
+
+   bsFinishedWithStream ();
+   ERROR_IF_NOT_ZERO ( ferror(zStream) );
+   retVal = fclose ( zStream );
+   ERROR_IF_EOF ( retVal );
+
+   ERROR_IF_NOT_ZERO ( ferror(stream) );
+   retVal = fflush ( stream );
+   ERROR_IF_NOT_ZERO ( retVal );
+   if (stream != stdout) {
+      retVal = fclose ( stream );
+      ERROR_IF_EOF ( retVal );
+   }
+   return True;
+}
+
+
+/*---------------------------------------------*/
+Bool testStream ( FILE *zStream )
+{
+   UChar      magic1, magic2, magic3, magic4;
+   UChar      magic5, magic6;
+   UInt32     storedBlockCRC, storedCombinedCRC;
+   UInt32     computedBlockCRC, computedCombinedCRC;
+   Int32      currBlockNo;
+   IntNative  retVal;
+
+   SET_BINARY_MODE(zStream);
+   ERROR_IF_NOT_ZERO ( ferror(zStream) );
+
+   bsSetStream ( zStream, False );
+
+   magic1 = bsGetUChar ();
+   magic2 = bsGetUChar ();
+   magic3 = bsGetUChar ();
+   magic4 = bsGetUChar ();
+   if (magic1 != 'B' ||
+       magic2 != 'Z' ||
+       magic3 != 'h' ||
+       magic4 < '1'  ||
+       magic4 > '9') {
+     bsFinishedWithStream();
+     fclose ( zStream );
+     fprintf ( stderr, "\n%s: bad magic number (ie, not created by bzip2)\n",
+                       inName );
+     return False;
+   }
+
+   smallMode = True;
+   setDecompressStructureSizes ( magic4 - '0' );
+   computedCombinedCRC = 0;
+
+   if (verbosity >= 2) fprintf ( stderr, "\n" );
+   currBlockNo = 0;
+
+   while (True) {
+      magic1 = bsGetUChar ();
+      magic2 = bsGetUChar ();
+      magic3 = bsGetUChar ();
+      magic4 = bsGetUChar ();
+      magic5 = bsGetUChar ();
+      magic6 = bsGetUChar ();
+      if (magic1 == 0x17 && magic2 == 0x72 &&
+          magic3 == 0x45 && magic4 == 0x38 &&
+          magic5 == 0x50 && magic6 == 0x90) break;
+
+      currBlockNo++;
+      if (magic1 != 0x31 || magic2 != 0x41 ||
+          magic3 != 0x59 || magic4 != 0x26 ||
+          magic5 != 0x53 || magic6 != 0x59) {
+         bsFinishedWithStream();
+         fclose ( zStream );
+         fprintf ( stderr,
+                   "\n%s, block %d: bad header (not == 0x314159265359)\n",
+                   inName, currBlockNo );
+         return False;
+      }
+      storedBlockCRC = bsGetUInt32 ();
+
+      if (bsR(1) == 1)
+         blockRandomised = True; else
+         blockRandomised = False;
+
+      if (verbosity >= 2)
+         fprintf ( stderr, "    block [%d: huff+mtf ", currBlockNo );
+      getAndMoveToFrontDecode ();
+      ERROR_IF_NOT_ZERO ( ferror(zStream) );
+
+      initialiseCRC();
+      if (verbosity >= 2) fprintf ( stderr, "rt+rld" );
+      undoReversibleTransformation_small ( NULL );
+
+      computedBlockCRC = getFinalCRC();
+      if (verbosity >= 3)
+         fprintf ( stderr, " {0x%x, 0x%x}", storedBlockCRC, computedBlockCRC );
+      if (verbosity >= 2) fprintf ( stderr, "] " );
+
+      if (storedBlockCRC != computedBlockCRC) {
+         bsFinishedWithStream();
+         fclose ( zStream );
+         fprintf ( stderr, "\n%s, block %d: computed CRC does not match stored one\n",
+                           inName, currBlockNo );
+         return False;
+      }
+
+      if (verbosity >= 2) fprintf ( stderr, "ok\n" );
+      computedCombinedCRC = (computedCombinedCRC << 1) | (computedCombinedCRC >> 31);
+      computedCombinedCRC ^= computedBlockCRC;
+   };
+
+   storedCombinedCRC  = bsGetUInt32 ();
+   if (verbosity >= 2)
+      fprintf ( stderr,
+                "    combined CRCs: stored = 0x%x, computed = 0x%x\n    ",
+                storedCombinedCRC, computedCombinedCRC );
+   if (storedCombinedCRC != computedCombinedCRC) {
+      bsFinishedWithStream();
+      fclose ( zStream );
+      fprintf ( stderr, "\n%s: computed CRC does not match stored one\n",
+                        inName );
+      return False;
+   }
+
+   bsFinishedWithStream ();
+   ERROR_IF_NOT_ZERO ( ferror(zStream) );
+   retVal = fclose ( zStream );
+   ERROR_IF_EOF ( retVal );
+   return True;
+}
+
+
+
+/*---------------------------------------------------*/
+/*--- Error [non-] handling grunge                ---*/
+/*---------------------------------------------------*/
+
+/*---------------------------------------------*/
+void cadvise ( void )
+{
+   fprintf (
+      stderr,
+      "\nIt is possible that the compressed file(s) have become corrupted.\n"
+        "You can use the -tvv option to test integrity of such files.\n\n"
+        "You can use the `bzip2recover' program to *attempt* to recover\n"
+        "data from undamaged sections of corrupted files.\n\n"
+    );
+}
+
+
+/*---------------------------------------------*/
+void showFileNames ( void )
+{
+   fprintf (
+      stderr,
+      "\tInput file = %s, output file = %s\n",
+      inName==NULL  ? "(null)" : inName,
+      outName==NULL ? "(null)" : outName
+   );
+}
+
+
+/*---------------------------------------------*/
+void cleanUpAndFail ( Int32 ec )
+{
+   IntNative retVal;
+
+   if ( srcMode == SM_F2F && opMode != OM_TEST ) {
+      fprintf ( stderr, "%s: Deleting output file %s, if it exists.\n",
+                progName,
+                outName==NULL ? "(null)" : outName );
+      if (outputHandleJustInCase != NULL)
+         fclose ( outputHandleJustInCase );
+      retVal = remove ( outName );
+      if (retVal != 0)
+         fprintf ( stderr,
+                   "%s: WARNING: deletion of output file (apparently) failed.\n",
+                   progName );
+   }
+   if (numFileNames > 0 && numFilesProcessed < numFileNames) {
+      fprintf ( stderr, 
+                "%s: WARNING: some files have not been processed:\n"
+                "\t%d specified on command line, %d not processed yet.\n\n",
+                progName, numFileNames, 
+                          numFileNames - numFilesProcessed );
+   }
+   exit ( ec );
+}
+
+
+/*---------------------------------------------*/
+void panic ( Char* s )
+{
+   fprintf ( stderr,
+             "\n%s: PANIC -- internal consistency error:\n"
+             "\t%s\n"
+             "\tThis is a BUG.  Please report it to me at:\n"
+             "\tjseward@acm.org\n",
+             progName, s );
+   showFileNames();
+   cleanUpAndFail( 3 );
+}
+
+
+/*---------------------------------------------*/
+void badBGLengths ( void )
+{
+   fprintf ( stderr,
+             "\n%s: error when reading background model code lengths,\n"
+             "\twhich probably means the compressed file is corrupted.\n",
+             progName );
+   showFileNames();
+   cadvise();
+   cleanUpAndFail( 2 );
+}
+
+
+/*---------------------------------------------*/
+void crcError ( UInt32 crcStored, UInt32 crcComputed )
+{
+   fprintf ( stderr,
+             "\n%s: Data integrity error when decompressing.\n"
+             "\tStored CRC = 0x%x, computed CRC = 0x%x\n",
+             progName, crcStored, crcComputed );
+   showFileNames();
+   cadvise();
+   cleanUpAndFail( 2 );
+}
+
+
+/*---------------------------------------------*/
+void compressedStreamEOF ( void )
+{
+   fprintf ( stderr,
+             "\n%s: Compressed file ends unexpectedly;\n\t"
+             "perhaps it is corrupted?  *Possible* reason follows.\n",
+             progName );
+   perror ( progName );
+   showFileNames();
+   cadvise();
+   cleanUpAndFail( 2 );
+}
+
+
+/*---------------------------------------------*/
+void ioError ( )
+{
+   fprintf ( stderr,
+             "\n%s: I/O or other error, bailing out.  Possible reason follows.\n",
+             progName );
+   perror ( progName );
+   showFileNames();
+   cleanUpAndFail( 1 );
+}
+
+
+/*---------------------------------------------*/
+void blockOverrun ()
+{
+   fprintf ( stderr,
+             "\n%s: block overrun during decompression,\n"
+             "\twhich probably means the compressed file\n"
+             "\tis corrupted.\n",
+             progName );
+   showFileNames();
+   cadvise();
+   cleanUpAndFail( 2 );
+}
+
+
+/*---------------------------------------------*/
+void badBlockHeader ()
+{
+   fprintf ( stderr,
+             "\n%s: bad block header in the compressed file,\n"
+             "\twhich probably means it is corrupted.\n",
+             progName );
+   showFileNames();
+   cadvise();
+   cleanUpAndFail( 2 );
+}
+
+
+/*---------------------------------------------*/
+void bitStreamEOF ()
+{
+   fprintf ( stderr,
+             "\n%s: read past the end of compressed data,\n"
+             "\twhich probably means it is corrupted.\n",
+             progName );
+   showFileNames();
+   cadvise();
+   cleanUpAndFail( 2 );
+}
+
+
+/*---------------------------------------------*/
+void mySignalCatcher ( IntNative n )
+{
+   fprintf ( stderr,
+             "\n%s: Control-C (or similar) caught, quitting.\n",
+             progName );
+   cleanUpAndFail(1);
+}
+
+
+/*---------------------------------------------*/
+void mySIGSEGVorSIGBUScatcher ( IntNative n )
+{
+   if (opMode == OM_Z)
+      fprintf ( stderr,
+                "\n%s: Caught a SIGSEGV or SIGBUS whilst compressing,\n"
+                "\twhich probably indicates a bug in bzip2.  Please\n"
+                "\treport it to me at: jseward@acm.org\n",
+                progName );
+      else
+      fprintf ( stderr,
+                "\n%s: Caught a SIGSEGV or SIGBUS whilst decompressing,\n"
+                "\twhich probably indicates that the compressed data\n"
+                "\tis corrupted.\n",
+                progName );
+
+   showFileNames();
+   if (opMode == OM_Z)
+      cleanUpAndFail( 3 ); else
+      { cadvise(); cleanUpAndFail( 2 ); }
+}
+
+
+/*---------------------------------------------*/
+void uncompressOutOfMemory ( Int32 draw, Int32 blockSize )
+{
+   fprintf ( stderr,
+             "\n%s: Can't allocate enough memory for decompression.\n"
+             "\tRequested %d bytes for a block size of %d.\n"
+             "\tTry selecting space-economic decompress (with flag -s)\n"
+             "\tand failing that, find a machine with more memory.\n",
+             progName, draw, blockSize );
+   showFileNames();
+   cleanUpAndFail(1);
+}
+
+
+/*---------------------------------------------*/
+void compressOutOfMemory ( Int32 draw, Int32 blockSize )
+{
+   fprintf ( stderr,
+             "\n%s: Can't allocate enough memory for compression.\n"
+             "\tRequested %d bytes for a block size of %d.\n"
+             "\tTry selecting a small block size (with flag -s).\n",
+             progName, draw, blockSize );
+   showFileNames();
+   cleanUpAndFail(1);
+}
+
+
+/*---------------------------------------------------*/
+/*--- The main driver machinery                   ---*/
+/*---------------------------------------------------*/
+
+/*---------------------------------------------*/
+void pad ( Char *s )
+{
+   Int32 i;
+   if ( (Int32)strlen(s) >= longestFileName ) return;
+   for (i = 1; i <= longestFileName - (Int32)strlen(s); i++)
+      fprintf ( stderr, " " );
+}
+
+
+/*---------------------------------------------*/
+Bool fileExists ( Char* name )
+{
+   FILE *tmp   = fopen ( name, "rb" );
+   Bool exists = (tmp != NULL);
+   if (tmp != NULL) fclose ( tmp );
+   return exists;
+}
+
+
+/*---------------------------------------------*/
+/*--
+  if in doubt, return True
+--*/
+Bool notABogStandardFile ( Char* name )
+{
+   IntNative      i;
+   struct MY_STAT statBuf;
+
+   i = MY_LSTAT ( name, &statBuf );
+   if (i != 0) return True;
+   if (MY_S_IFREG(statBuf.st_mode)) return False;
+   return True;
+}
+
+
+/*---------------------------------------------*/
+void copyDateAndPermissions ( Char *srcName, Char *dstName )
+{
+   #if BZ_UNIX
+   IntNative      retVal;
+   struct MY_STAT statBuf;
+   struct utimbuf uTimBuf;
+
+   retVal = MY_LSTAT ( srcName, &statBuf );
+   ERROR_IF_NOT_ZERO ( retVal );
+   uTimBuf.actime = statBuf.st_atime;
+   uTimBuf.modtime = statBuf.st_mtime;
+
+   retVal = chmod ( dstName, statBuf.st_mode );
+   ERROR_IF_NOT_ZERO ( retVal );
+   retVal = utime ( dstName, &uTimBuf );
+   ERROR_IF_NOT_ZERO ( retVal );
+   #endif
+}
+
+
+/*---------------------------------------------*/
+Bool endsInBz2 ( Char* name )
+{
+   Int32 n = strlen ( name );
+   if (n <= 4) return False;
+   return
+      (name[n-4] == '.' &&
+       name[n-3] == 'b' &&
+       name[n-2] == 'z' &&
+       name[n-1] == '2');
+}
+
+
+/*---------------------------------------------*/
+Bool containsDubiousChars ( Char* name )
+{
+   Bool cdc = False;
+   for (; *name != '\0'; name++)
+      if (*name == '?' || *name == '*') cdc = True;
+   return cdc;
+}
+
+
+/*---------------------------------------------*/
+void compress ( Char *name )
+{
+   FILE *inStr;
+   FILE *outStr;
+
+   if (name == NULL && srcMode != SM_I2O)
+      panic ( "compress: bad modes\n" );
+
+   switch (srcMode) {
+      case SM_I2O: strcpy ( inName, "(stdin)" );
+                   strcpy ( outName, "(stdout)" ); break;
+      case SM_F2F: strcpy ( inName, name );
+                   strcpy ( outName, name );
+                   strcat ( outName, ".bz2" ); break;
+      case SM_F2O: strcpy ( inName, name );
+                   strcpy ( outName, "(stdout)" ); break;
+   }
+
+   if ( srcMode != SM_I2O && containsDubiousChars ( inName ) ) {
+      fprintf ( stderr, "%s: There are no files matching `%s'.\n",
+      progName, inName );
+      return;
+   }
+   if ( srcMode != SM_I2O && !fileExists ( inName ) ) {
+      fprintf ( stderr, "%s: Input file %s doesn't exist, skipping.\n",
+                progName, inName );
+      return;
+   }
+   if ( srcMode != SM_I2O && endsInBz2 ( inName )) {
+      fprintf ( stderr, "%s: Input file name %s ends in `.bz2', skipping.\n",
+                progName, inName );
+      return;
+   }
+   if ( srcMode != SM_I2O && notABogStandardFile ( inName )) {
+      fprintf ( stderr, "%s: Input file %s is not a normal file, skipping.\n",
+                progName, inName );
+      return;
+   }
+   if ( srcMode == SM_F2F && fileExists ( outName ) ) {
+      fprintf ( stderr, "%s: Output file %s already exists, skipping.\n",
+                progName, outName );
+      return;
+   }
+
+   switch ( srcMode ) {
+
+      case SM_I2O:
+         inStr = stdin;
+         outStr = stdout;
+         if ( isatty ( fileno ( stdout ) ) ) {
+            fprintf ( stderr,
+                      "%s: I won't write compressed data to a terminal.\n",
+                      progName );
+            fprintf ( stderr, "%s: For help, type: `%s --help'.\n",
+                              progName, progName );
+            return;
+         };
+         break;
+
+      case SM_F2O:
+         inStr = fopen ( inName, "rb" );
+         outStr = stdout;
+         if ( isatty ( fileno ( stdout ) ) ) {
+            fprintf ( stderr,
+                      "%s: I won't write compressed data to a terminal.\n",
+                      progName );
+            fprintf ( stderr, "%s: For help, type: `%s --help'.\n",
+                              progName, progName );
+            return;
+         };
+         if ( inStr == NULL ) {
+            fprintf ( stderr, "%s: Can't open input file %s, skipping.\n",
+                      progName, inName );
+            return;
+         };
+         break;
+
+      case SM_F2F:
+         inStr = fopen ( inName, "rb" );
+         outStr = fopen ( outName, "wb" );
+         if ( outStr == NULL) {
+            fprintf ( stderr, "%s: Can't create output file %s, skipping.\n",
+                      progName, outName );
+            return;
+         }
+         if ( inStr == NULL ) {
+            fprintf ( stderr, "%s: Can't open input file %s, skipping.\n",
+                      progName, inName );
+            return;
+         };
+         break;
+
+      default:
+         panic ( "compress: bad srcMode" );
+         break;
+   }
+
+   if (verbosity >= 1) {
+      fprintf ( stderr,  "  %s: ", inName );
+      pad ( inName );
+      fflush ( stderr );
+   }
+
+   /*--- Now the input and output handles are sane.  Do the Biz. ---*/
+   errno = 0;
+   outputHandleJustInCase = outStr;
+   compressStream ( inStr, outStr );
+   outputHandleJustInCase = NULL;
+
+   /*--- If there was an I/O error, we won't get here. ---*/
+   if ( srcMode == SM_F2F ) {
+      copyDateAndPermissions ( inName, outName );
+      if ( !keepInputFiles ) {
+         IntNative retVal = remove ( inName );
+         ERROR_IF_NOT_ZERO ( retVal );
+      }
+   }
+}
+
+
+/*---------------------------------------------*/
+void uncompress ( Char *name )
+{
+   FILE *inStr;
+   FILE *outStr;
+   Bool magicNumberOK;
+
+   if (name == NULL && srcMode != SM_I2O)
+      panic ( "uncompress: bad modes\n" );
+
+   switch (srcMode) {
+      case SM_I2O: strcpy ( inName, "(stdin)" );
+                   strcpy ( outName, "(stdout)" ); break;
+      case SM_F2F: strcpy ( inName, name );
+                   strcpy ( outName, name );
+                   if (endsInBz2 ( outName ))
+                      outName [ strlen ( outName ) - 4 ] = '\0';
+                   break;
+      case SM_F2O: strcpy ( inName, name );
+                   strcpy ( outName, "(stdout)" ); break;
+   }
+
+   if ( srcMode != SM_I2O && containsDubiousChars ( inName ) ) {
+      fprintf ( stderr, "%s: There are no files matching `%s'.\n",
+                progName, inName );
+      return;
+   }
+   if ( srcMode != SM_I2O && !fileExists ( inName ) ) {
+      fprintf ( stderr, "%s: Input file %s doesn't exist, skipping.\n",
+                progName, inName );
+      return;
+   }
+   if ( srcMode != SM_I2O && !endsInBz2 ( inName )) {
+      fprintf ( stderr,
+                "%s: Input file name %s doesn't end in `.bz2', skipping.\n",
+                progName, inName );
+      return;
+   }
+   if ( srcMode != SM_I2O && notABogStandardFile ( inName )) {
+      fprintf ( stderr, "%s: Input file %s is not a normal file, skipping.\n",
+                progName, inName );
+      return;
+   }
+   if ( srcMode == SM_F2F && fileExists ( outName ) ) {
+      fprintf ( stderr, "%s: Output file %s already exists, skipping.\n",
+                progName, outName );
+      return;
+   }
+
+   switch ( srcMode ) {
+
+      case SM_I2O:
+         inStr = stdin;
+         outStr = stdout;
+         if ( isatty ( fileno ( stdin ) ) ) {
+            fprintf ( stderr,
+                      "%s: I won't read compressed data from a terminal.\n",
+                      progName );
+            fprintf ( stderr, "%s: For help, type: `%s --help'.\n",
+                              progName, progName );
+            return;
+         };
+         break;
+
+      case SM_F2O:
+         inStr = fopen ( inName, "rb" );
+         outStr = stdout;
+         if ( inStr == NULL ) {
+            fprintf ( stderr, "%s: Can't open input file %s, skipping.\n",
+                      progName, inName );
+            return;
+         };
+         break;
+
+      case SM_F2F:
+         inStr = fopen ( inName, "rb" );
+         outStr = fopen ( outName, "wb" );
+         if ( outStr == NULL) {
+            fprintf ( stderr, "%s: Can't create output file %s, skipping.\n",
+                      progName, outName );
+            return;
+         }
+         if ( inStr == NULL ) {
+            fprintf ( stderr, "%s: Can't open input file %s, skipping.\n",
+                      progName, inName );
+            return;
+         };
+         break;
+
+      default:
+         panic ( "uncompress: bad srcMode" );
+         break;
+   }
+
+   if (verbosity >= 1) {
+      fprintf ( stderr, "  %s: ", inName );
+      pad ( inName );
+      fflush ( stderr );
+   }
+
+   /*--- Now the input and output handles are sane.  Do the Biz. ---*/
+   errno = 0;
+   outputHandleJustInCase = outStr;
+   magicNumberOK = uncompressStream ( inStr, outStr );
+   outputHandleJustInCase = NULL;
+
+   /*--- If there was an I/O error, we won't get here. ---*/
+   if ( magicNumberOK ) {
+      if ( srcMode == SM_F2F ) {
+         copyDateAndPermissions ( inName, outName );
+         if ( !keepInputFiles ) {
+            IntNative retVal = remove ( inName );
+            ERROR_IF_NOT_ZERO ( retVal );
+         }
+      }
+   } else {
+      if ( srcMode == SM_F2F ) {
+         IntNative retVal = remove ( outName );
+         ERROR_IF_NOT_ZERO ( retVal );
+      }
+   }
+
+   if ( magicNumberOK ) {
+      if (verbosity >= 1)
+         fprintf ( stderr, "done\n" );
+   } else {
+      if (verbosity >= 1)
+         fprintf ( stderr, "not a bzip2 file, skipping.\n" ); else
+         fprintf ( stderr,
+                   "%s: %s is not a bzip2 file, skipping.\n",
+                   progName, inName );
+   }
+
+}
+
+
+/*---------------------------------------------*/
+void testf ( Char *name )
+{
+   FILE *inStr;
+   Bool allOK;
+
+   if (name == NULL && srcMode != SM_I2O)
+      panic ( "testf: bad modes\n" );
+
+   strcpy ( outName, "(none)" );
+   switch (srcMode) {
+      case SM_I2O: strcpy ( inName, "(stdin)" ); break;
+      case SM_F2F: strcpy ( inName, name ); break;
+      case SM_F2O: strcpy ( inName, name ); break;
+   }
+
+   if ( srcMode != SM_I2O && containsDubiousChars ( inName ) ) {
+      fprintf ( stderr, "%s: There are no files matching `%s'.\n",
+                progName, inName );
+      return;
+   }
+   if ( srcMode != SM_I2O && !fileExists ( inName ) ) {
+      fprintf ( stderr, "%s: Input file %s doesn't exist, skipping.\n",
+                progName, inName );
+      return;
+   }
+   if ( srcMode != SM_I2O && !endsInBz2 ( inName )) {
+      fprintf ( stderr,
+                "%s: Input file name %s doesn't end in `.bz2', skipping.\n",
+                progName, inName );
+      return;
+   }
+   if ( srcMode != SM_I2O && notABogStandardFile ( inName )) {
+      fprintf ( stderr, "%s: Input file %s is not a normal file, skipping.\n",
+                progName, inName );
+      return;
+   }
+
+   switch ( srcMode ) {
+
+      case SM_I2O:
+         if ( isatty ( fileno ( stdin ) ) ) {
+            fprintf ( stderr,
+                      "%s: I won't read compressed data from a terminal.\n",
+                      progName );
+            fprintf ( stderr, "%s: For help, type: `%s --help'.\n",
+                              progName, progName );
+            return;
+         };
+         inStr = stdin;
+         break;
+
+      case SM_F2O: case SM_F2F:
+         inStr = fopen ( inName, "rb" );
+         if ( inStr == NULL ) {
+            fprintf ( stderr, "%s: Can't open input file %s, skipping.\n",
+                      progName, inName );
+            return;
+         };
+         break;
+
+      default:
+         panic ( "testf: bad srcMode" );
+         break;
+   }
+
+   if (verbosity >= 1) {
+      fprintf ( stderr, "  %s: ", inName );
+      pad ( inName );
+      fflush ( stderr );
+   }
+
+   /*--- Now the input handle is sane.  Do the Biz. ---*/
+   errno = 0;
+   allOK = testStream ( inStr );
+
+   if (allOK && verbosity >= 1) fprintf ( stderr, "ok\n" );
+   if (!allOK) testFailsExist = True;
+}
+
+
+/*---------------------------------------------*/
+void license ( void )
+{
+   fprintf ( stderr,
+
+    "bzip2, a block-sorting file compressor.  "
+    "Version 0.1pl0, 17-Aug-97.\n"
+    "   \n"
+    "   Copyright (C) 1996, 1997 by Julian Seward.\n"
+    "   \n"
+    "   This program is free software; you can redistribute it and/or modify\n"
+    "   it under the terms of the GNU General Public License as published by\n"
+    "   the Free Software Foundation; either version 2 of the License, or\n"
+    "   (at your option) any later version.\n"
+    "   \n"
+    "   This program is distributed in the hope that it will be useful,\n"
+    "   but WITHOUT ANY WARRANTY; without even the implied warranty of\n"
+    "   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the\n"
+    "   GNU General Public License for more details.\n"
+    "   \n"
+    "   You should have received a copy of the GNU General Public License\n"
+    "   along with this program; if not, write to the Free Software\n"
+    "   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.\n"
+    "   \n"
+    "   The GNU General Public License is contained in the file LICENSE.\n"
+    "   \n"
+   );
+}
+
+
+/*---------------------------------------------*/
+void usage ( Char *fullProgName )
+{
+   fprintf (
+      stderr,
+      "bzip2, a block-sorting file compressor.  "
+      "Version 0.1pl0, 17-Aug-97.\n"
+      "\n   usage: %s [flags and input files in any order]\n"
+      "\n"
+      "   -h --help           print this message\n"
+      "   -d --decompress     force decompression\n"
+      "   -f --compress       force compression\n"
+      "   -t --test           test compressed file integrity\n"
+      "   -c --stdout         output to standard out\n"
+      "   -v --verbose        be verbose (a 2nd -v gives more)\n"
+      "   -k --keep           keep (don't delete) input files\n"
+      "   -L --license        display software version & license\n"
+      "   -V --version        display software version & license\n"
+      "   -s --small          use less memory (at most 2500k)\n"
+      "   -1 .. -9            set block size to 100k .. 900k\n"
+      "   --repetitive-fast   compress repetitive blocks faster\n"
+      "   --repetitive-best   compress repetitive blocks better\n"
+      "\n"
+      "   If invoked as `bzip2', the default action is to compress.\n"
+      "              as `bunzip2', the default action is to decompress.\n"
+      "\n"
+      "   If no file names are given, bzip2 compresses or decompresses\n"
+      "   from standard input to standard output.  You can combine\n"
+      "   flags, so `-v -4' means the same as -v4 or -4v, &c.\n"
+      #if BZ_UNIX
+      "\n"
+      #endif
+      ,
+
+      fullProgName
+   );
+}
+
+
+/*---------------------------------------------*/
+/*--
+  All the garbage from here to main() is purely to
+  implement a linked list of command-line arguments,
+  into which main() copies argv[1 .. argc-1].
+
+  The purpose of this ridiculous exercise is to
+  facilitate the expansion of wildcard characters
+  * and ? in filenames for halfwitted OSs like
+  MSDOS, Windows 95 and NT.
+
+  The actual Dirty Work is done by the platform-specific
+  macro APPEND_FILESPEC.
+--*/
+
+typedef
+   struct zzzz {
+      Char        *name;
+      struct zzzz *link;
+   }
+   Cell;
+
+
+/*---------------------------------------------*/
+void *myMalloc ( Int32 n )
+{
+   void* p;
+
+   p = malloc ( (size_t)n );
+   if (p == NULL) {
+      fprintf (
+         stderr,
+         "%s: `malloc' failed on request for %d bytes.\n",
+         progName, n
+      );
+      exit ( 1 );
+   }
+   return p;
+}
+
+
+/*---------------------------------------------*/
+Cell *mkCell ( void )
+{
+   Cell *c;
+
+   c = (Cell*) myMalloc ( sizeof ( Cell ) );
+   c->name = NULL;
+   c->link = NULL;
+   return c;
+}
+
+
+/*---------------------------------------------*/
+Cell *snocString ( Cell *root, Char *name )
+{
+   if (root == NULL) {
+      Cell *tmp = mkCell();
+      tmp->name = (Char*) myMalloc ( 5 + strlen(name) );
+      strcpy ( tmp->name, name );
+      return tmp;
+   } else {
+      Cell *tmp = root;
+      while (tmp->link != NULL) tmp = tmp->link;
+      tmp->link = snocString ( tmp->link, name );
+      return root;
+   }
+}
+
+
+
+/*---------------------------------------------*/
+#define ISFLAG(s) (strcmp(aa->name, (s))==0)
+
+
+IntNative main ( IntNative argc, Char *argv[] )
+{
+   Int32  i, j;
+   Char   *tmp;
+   Cell   *argList;
+   Cell   *aa;
+
+
+   #if DEBUG
+      fprintf ( stderr, "bzip2: *** compiled with debugging ON ***\n" );
+   #endif
+
+   /*-- Be really really really paranoid :-) --*/
+   if (sizeof(Int32) != 4 || sizeof(UInt32) != 4  ||
+       sizeof(Int16) != 2 || sizeof(UInt16) != 2  ||
+       sizeof(Char)  != 1 || sizeof(UChar)  != 1) {
+      fprintf ( stderr,
+                "bzip2: I'm not configured correctly for this platform!\n"
+                "\tI require Int32, Int16 and Char to have sizes\n"
+                "\tof 4, 2 and 1 bytes to run properly, and they don't.\n"
+                "\tProbably you can fix this by defining them correctly,\n"
+                "\tand recompiling.  Bye!\n" );
+      exit(1);
+   }
+
+
+   /*-- Set up signal handlers --*/
+   signal (SIGINT,  mySignalCatcher);
+   signal (SIGTERM, mySignalCatcher);
+   signal (SIGSEGV, mySIGSEGVorSIGBUScatcher);
+   #if BZ_UNIX
+   signal (SIGHUP,  mySignalCatcher);
+   signal (SIGBUS,  mySIGSEGVorSIGBUScatcher);
+   #endif
+
+
+   /*-- Initialise --*/
+   outputHandleJustInCase  = NULL;
+   ftab                    = NULL;
+   ll4                     = NULL;
+   ll16                    = NULL;
+   ll8                     = NULL;
+   tt                      = NULL;
+   block                   = NULL;
+   zptr                    = NULL;
+   errno                   = 0;
+   smallMode               = False;
+   keepInputFiles          = False;
+   verbosity               = 0;
+   blockSize100k           = 9;
+   testFailsExist          = False;
+   bsStream                = NULL;
+   numFileNames            = 0;
+   numFilesProcessed       = 0;
+   workFactor              = 30;
+
+   strcpy ( inName,  "(none)" );
+   strcpy ( outName, "(none)" );
+
+   strcpy ( progNameReally, argv[0] );
+   progName = &progNameReally[0];
+   for (tmp = &progNameReally[0]; *tmp != '\0'; tmp++)
+      if (*tmp == PATH_SEP) progName = tmp + 1;
+
+
+   /*-- Expand filename wildcards in arg list --*/
+   argList = NULL;
+   for (i = 1; i <= argc-1; i++)
+      APPEND_FILESPEC(argList, argv[i]);
+
+
+   /*-- Find the length of the longest filename --*/
+   longestFileName = 7;
+   numFileNames    = 0;
+   for (aa = argList; aa != NULL; aa = aa->link)
+      if (aa->name[0] != '-') {
+         numFileNames++;
+         if (longestFileName < (Int32)strlen(aa->name) )
+            longestFileName = (Int32)strlen(aa->name);
+      }
+
+
+   /*-- Determine what to do (compress/uncompress/test). --*/
+   /*-- Note that subsequent flag handling may change this. --*/
+   opMode = OM_Z;
+
+   if ( (strcmp ( "bunzip2",     progName ) == 0) ||
+        (strcmp ( "BUNZIP2",     progName ) == 0) ||
+        (strcmp ( "bunzip2.exe", progName ) == 0) ||
+        (strcmp ( "BUNZIP2.EXE", progName ) == 0) )
+      opMode = OM_UNZ;
+
+
+   /*-- Determine source modes; flag handling may change this too. --*/
+   if (numFileNames == 0)
+      srcMode = SM_I2O; else srcMode = SM_F2F;
+
+
+   /*-- Look at the flags. --*/
+   for (aa = argList; aa != NULL; aa = aa->link)
+      if (aa->name[0] == '-' && aa->name[1] != '-')
+         for (j = 1; aa->name[j] != '\0'; j++)
+            switch (aa->name[j]) {
+               case 'c': srcMode          = SM_F2O; break;
+               case 'd': opMode           = OM_UNZ; break;
+               case 'f': opMode           = OM_Z; break;
+               case 't': opMode           = OM_TEST; break;
+               case 'k': keepInputFiles   = True; break;
+               case 's': smallMode        = True; break;
+               case '1': blockSize100k    = 1; break;
+               case '2': blockSize100k    = 2; break;
+               case '3': blockSize100k    = 3; break;
+               case '4': blockSize100k    = 4; break;
+               case '5': blockSize100k    = 5; break;
+               case '6': blockSize100k    = 6; break;
+               case '7': blockSize100k    = 7; break;
+               case '8': blockSize100k    = 8; break;
+               case '9': blockSize100k    = 9; break;
+               case 'V':
+               case 'L': license();            break;
+               case 'v': verbosity++; break;
+               case 'h': usage ( progName );
+                         exit ( 1 );
+                         break;
+               default:  fprintf ( stderr, "%s: Bad flag `%s'\n",
+                                   progName, aa->name );
+                         usage ( progName );
+                         exit ( 1 );
+                         break;
+         }
+
+   /*-- And again ... --*/
+   for (aa = argList; aa != NULL; aa = aa->link) {
+      if (ISFLAG("--stdout"))            srcMode          = SM_F2O;  else
+      if (ISFLAG("--decompress"))        opMode           = OM_UNZ;  else
+      if (ISFLAG("--compress"))          opMode           = OM_Z;    else
+      if (ISFLAG("--test"))              opMode           = OM_TEST; else
+      if (ISFLAG("--keep"))              keepInputFiles   = True;    else
+      if (ISFLAG("--small"))             smallMode        = True;    else
+      if (ISFLAG("--version"))           license();                  else
+      if (ISFLAG("--license"))           license();                  else
+      if (ISFLAG("--repetitive-fast"))   workFactor = 5;             else
+      if (ISFLAG("--repetitive-best"))   workFactor = 150;           else
+      if (ISFLAG("--verbose"))           verbosity++;                else
+      if (ISFLAG("--help"))              { usage ( progName ); exit ( 1 ); }
+         else
+         if (strncmp ( aa->name, "--", 2) == 0) {
+            fprintf ( stderr, "%s: Bad flag `%s'\n", progName, aa->name );
+            usage ( progName );
+            exit ( 1 );
+         }
+   }
+
+   if (opMode == OM_Z && smallMode) blockSize100k = 2;
+
+   if (opMode == OM_Z && srcMode == SM_F2O && numFileNames > 1) {
+      fprintf ( stderr, "%s: I won't compress multiple files to stdout.\n",
+                progName );
+      exit ( 1 );
+   }
+
+   if (opMode == OM_TEST && srcMode == SM_F2O) {
+      fprintf ( stderr, "%s: -c and -t cannot be used together.\n",
+                progName );
+      exit ( 1 );
+   }
+
+   if (opMode != OM_Z) blockSize100k = 0;
+
+   if (opMode == OM_Z) {
+      allocateCompressStructures();
+      if (srcMode == SM_I2O)
+         compress ( NULL );
+         else
+         for (aa = argList; aa != NULL; aa = aa->link)
+            if (aa->name[0] != '-') {
+               numFilesProcessed++;
+               compress ( aa->name );
+            }
+   } else
+   if (opMode == OM_UNZ) {
+      if (srcMode == SM_I2O)
+         uncompress ( NULL );
+         else
+         for (aa = argList; aa != NULL; aa = aa->link)
+            if (aa->name[0] != '-') {
+               numFilesProcessed++;
+               uncompress ( aa->name );
+            }
+   } else {
+      testFailsExist = False;
+      if (srcMode == SM_I2O)
+         testf ( NULL );
+         else
+         for (aa = argList; aa != NULL; aa = aa->link)
+            if (aa->name[0] != '-') {
+               numFilesProcessed++;
+               testf ( aa->name );
+            }
+      if (testFailsExist) {
+         fprintf ( stderr,
+           "\n"
+           "You can use the `bzip2recover' program to *attempt* to recover\n"
+           "data from undamaged sections of corrupted files.\n\n"
+         );
+         exit(2);
+      }
+   }
+   return 0;
+}
+
+
+/*-----------------------------------------------------------*/
+/*--- end                                         bzip2.c ---*/
+/*-----------------------------------------------------------*/

diff --git a/bzip2.exe b/bzip2.exe
new file mode 100644
index 0000000..4b3c4c1
--- /dev/null
+++ b/bzip2.exe
Binary files differ

diff --git a/bzip2.txt b/bzip2.txt
new file mode 100644
index 0000000..83366bc
--- /dev/null
+++ b/bzip2.txt

@@ -0,0 +1,462 @@
+
+
+
+bzip2(1)                                                 bzip2(1)
+
+
+NAME
+       bzip2, bunzip2 - a block-sorting file compressor, v0.1
+       bzip2recover - recovers data from damaged bzip2 files
+
+
+SYNOPSIS
+       bzip2 [ -cdfkstvVL123456789 ] [ filenames ...  ]
+       bunzip2 [ -kvsVL ] [ filenames ...  ]
+       bzip2recover filename
+
+
+DESCRIPTION
+       Bzip2  compresses  files  using the Burrows-Wheeler block-
+       sorting text compression algorithm,  and  Huffman  coding.
+       Compression  is  generally  considerably  better than that
+       achieved by more conventional LZ77/LZ78-based compressors,
+       and  approaches  the performance of the PPM family of sta-
+       tistical compressors.
+
+       The command-line options are deliberately very similar  to
+       those of GNU Gzip, but they are not identical.
+
+       Bzip2  expects  a list of file names to accompany the com-
+       mand-line flags.  Each file is replaced  by  a  compressed
+       version  of  itself,  with  the  name "original_name.bz2".
+       Each compressed file has the same  modification  date  and
+       permissions  as  the corresponding original, so that these
+       properties can  be  correctly  restored  at  decompression
+       time.  File name handling is naive in the sense that there
+       is no mechanism for preserving original file  names,  per-
+       missions  and  dates  in filesystems which lack these con-
+       cepts, or have serious file name length restrictions, such
+       as MS-DOS.
+
+       Bzip2  and  bunzip2  will not overwrite existing files; if
+       you want this to happen, you should delete them first.
+
+       If no file names  are  specified,  bzip2  compresses  from
+       standard  input  to  standard output.  In this case, bzip2
+       will decline to write compressed output to a terminal,  as
+       this  would  be  entirely  incomprehensible  and therefore
+       pointless.
+
+       Bunzip2 (or bzip2 -d ) decompresses and restores all spec-
+       ified files whose names end in ".bz2".  Files without this
+       suffix are ignored.  Again, supplying no filenames  causes
+       decompression from standard input to standard output.
+
+       You  can also compress or decompress files to the standard
+       output by giving the -c flag.  You can decompress multiple
+       files  like  this, but you may only compress a single file
+       this way, since it would otherwise be difficult  to  sepa-
+       rate  out  the  compressed representations of the original
+       files.
+
+
+
+                                                                1
+
+
+
+
+
+bzip2(1)                                                 bzip2(1)
+
+
+       Compression is always performed, even  if  the  compressed
+       file  is slightly larger than the original.  Files of less
+       than about one hundred bytes tend to get larger, since the
+       compression  mechanism  has  a  constant  overhead  in the
+       region of 50 bytes.  Random data (including the output  of
+       most  file  compressors)  is  coded at about 8.05 bits per
+       byte, giving an expansion of around 0.5%.
+
+       As a self-check for your  protection,  bzip2  uses  32-bit
+       CRCs  to make sure that the decompressed version of a file
+       is identical to the original.  This guards against corrup-
+       tion  of  the compressed data, and against undetected bugs
+       in bzip2 (hopefully very unlikely).  The chances  of  data
+       corruption  going  undetected  is  microscopic,  about one
+       chance in four billion for each file processed.  Be aware,
+       though,  that  the  check occurs upon decompression, so it
+       can only tell you that that something is wrong.  It  can't
+       help  you recover the original uncompressed data.  You can
+       use bzip2recover to  try  to  recover  data  from  damaged
+       files.
+
+       Return  values:  0  for a normal exit, 1 for environmental
+       problems (file not found, invalid flags, I/O errors,  &c),
+       2 to indicate a corrupt compressed file, 3 for an internal
+       consistency error (eg, bug) which caused bzip2 to panic.
+
+
+MEMORY MANAGEMENT
+       Bzip2 compresses large files in blocks.   The  block  size
+       affects  both  the  compression  ratio  achieved,  and the
+       amount of memory needed both for  compression  and  decom-
+       pression.   The flags -1 through -9 specify the block size
+       to be 100,000 bytes through 900,000  bytes  (the  default)
+       respectively.   At decompression-time, the block size used
+       for compression is read from the header of the  compressed
+       file, and bunzip2 then allocates itself just enough memory
+       to decompress the file.  Since block sizes are  stored  in
+       compressed  files,  it follows that the flags -1 to -9 are
+       irrelevant to and so ignored during  decompression.   Com-
+       pression  and decompression requirements, in bytes, can be
+       estimated as:
+
+             Compression:   400k + ( 7 x block size )
+
+             Decompression: 100k + ( 5 x block size ), or
+                            100k + ( 2.5 x block size )
+
+       Larger  block  sizes  give  rapidly  diminishing  marginal
+       returns;  most of the compression comes from the first two
+       or three hundred k of block size, a fact worth bearing  in
+       mind  when  using  bzip2  on  small  machines.  It is also
+       important to  appreciate  that  the  decompression  memory
+       requirement  is  set  at compression-time by the choice of
+       block size.
+
+
+
+                                                                2
+
+
+
+
+
+bzip2(1)                                                 bzip2(1)
+
+
+       For files compressed with the  default  900k  block  size,
+       bunzip2  will require about 4600 kbytes to decompress.  To
+       support decompression of any file on a 4 megabyte machine,
+       bunzip2  has  an  option to decompress using approximately
+       half this amount of memory, about 2300 kbytes.  Decompres-
+       sion  speed  is also halved, so you should use this option
+       only where necessary.  The relevant flag is -s.
+
+       In general, try and use the largest block size memory con-
+       straints  allow,  since  that  maximises  the  compression
+       achieved.  Compression and decompression speed are  virtu-
+       ally unaffected by block size.
+
+       Another  significant point applies to files which fit in a
+       single block -- that  means  most  files  you'd  encounter
+       using  a  large  block  size.   The  amount of real memory
+       touched is proportional to the size of the file, since the
+       file  is smaller than a block.  For example, compressing a
+       file 20,000 bytes long with the flag  -9  will  cause  the
+       compressor  to  allocate  around 6700k of memory, but only
+       touch 400k + 20000 * 7 = 540 kbytes of it.  Similarly, the
+       decompressor  will  allocate  4600k  but only touch 100k +
+       20000 * 5 = 200 kbytes.
+
+       Here is a table which summarises the maximum memory  usage
+       for  different  block  sizes.   Also recorded is the total
+       compressed size for 14 files of the Calgary Text  Compres-
+       sion  Corpus totalling 3,141,622 bytes.  This column gives
+       some feel for how  compression  varies  with  block  size.
+       These  figures  tend to understate the advantage of larger
+       block sizes for larger files, since the  Corpus  is  domi-
+       nated by smaller files.
+
+                  Compress   Decompress   Decompress   Corpus
+           Flag     usage      usage       -s usage     Size
+
+            -1      1100k       600k         350k      914704
+            -2      1800k      1100k         600k      877703
+            -3      2500k      1600k         850k      860338
+            -4      3200k      2100k        1100k      846899
+            -5      3900k      2600k        1350k      845160
+            -6      4600k      3100k        1600k      838626
+            -7      5400k      3600k        1850k      834096
+            -8      6000k      4100k        2100k      828642
+            -9      6700k      4600k        2350k      828642
+
+
+OPTIONS
+       -c --stdout
+              Compress or decompress to standard output.  -c will
+              decompress multiple files to stdout, but will  only
+              compress a single file to stdout.
+
+
+
+
+
+                                                                3
+
+
+
+
+
+bzip2(1)                                                 bzip2(1)
+
+
+       -d --decompress
+              Force  decompression.  Bzip2 and bunzip2 are really
+              the same program, and the decision about whether to
+              compress  or  decompress  is  done  on the basis of
+              which name is used.  This flag overrides that mech-
+              anism, and forces bzip2 to decompress.
+
+       -f --compress
+              The  complement  to -d: forces compression, regard-
+              less of the invokation name.
+
+       -t --test
+              Check integrity of the specified file(s), but don't
+              decompress  them.   This  really  performs  a trial
+              decompression and throws away the result, using the
+              low-memory decompression algorithm (see -s).
+
+       -k --keep
+              Keep  (don't delete) input files during compression
+              or decompression.
+
+       -s --small
+              Reduce  memory  usage,  both  for  compression  and
+              decompression.  Files are decompressed using a mod-
+              ified algorithm which only requires 2.5  bytes  per
+              block  byte.   This  means  any  file can be decom-
+              pressed in 2300k of memory,  albeit  somewhat  more
+              slowly than usual.
+
+              During  compression,  -s  selects  a  block size of
+              200k, which limits memory use to  around  the  same
+              figure,  at  the expense of your compression ratio.
+              In short, if your  machine  is  low  on  memory  (8
+              megabytes  or  less),  use  -s for everything.  See
+              MEMORY MANAGEMENT above.
+
+
+       -v --verbose
+              Verbose mode -- show the compression ratio for each
+              file  processed.   Further  -v's  increase the ver-
+              bosity level, spewing out lots of information which
+              is primarily of interest for diagnostic purposes.
+
+       -L --license
+              Display  the  software  version,  license terms and
+              conditions.
+
+       -V --version
+              Same as -L.
+
+       -1 to -9
+              Set the block size to 100 k, 200 k ..  900  k  when
+              compressing.   Has  no  effect  when decompressing.
+              See MEMORY MANAGEMENT above.
+
+
+
+                                                                4
+
+
+
+
+
+bzip2(1)                                                 bzip2(1)
+
+
+       --repetitive-fast
+              bzip2 injects some small  pseudo-random  variations
+              into  very  repetitive  blocks  to limit worst-case
+              performance during compression.   If  sorting  runs
+              into  difficulties,  the  block  is randomised, and
+              sorting is restarted.  Very roughly, bzip2 persists
+              for  three  times  as  long as a well-behaved input
+              would take before resorting to randomisation.  This
+              flag makes it give up much sooner.
+
+
+       --repetitive-best
+              Opposite  of  --repetitive-fast;  try  a lot harder
+              before resorting to randomisation.
+
+
+RECOVERING DATA FROM DAMAGED FILES
+       bzip2 compresses files in blocks, usually 900kbytes  long.
+       Each block is handled independently.  If a media or trans-
+       mission error causes a multi-block  .bz2  file  to  become
+       damaged,  it  may  be  possible  to  recover data from the
+       undamaged blocks in the file.
+
+       The compressed representation of each block  is  delimited
+       by  a  48-bit pattern, which makes it possible to find the
+       block boundaries with reasonable  certainty.   Each  block
+       also  carries its own 32-bit CRC, so damaged blocks can be
+       distinguished from undamaged ones.
+
+       bzip2recover is a  simple  program  whose  purpose  is  to
+       search  for blocks in .bz2 files, and write each block out
+       into its own .bz2 file.  You can then use bzip2 -t to test
+       the integrity of the resulting files, and decompress those
+       which are undamaged.
+
+       bzip2recover takes a single argument, the name of the dam-
+       aged file, and writes a number of files "rec0001file.bz2",
+       "rec0002file.bz2", etc, containing the  extracted  blocks.
+       The output filenames are designed so that the use of wild-
+       cards in subsequent processing -- for example, "bzip2  -dc
+       rec*file.bz2  >  recovered_data" -- lists the files in the
+       "right" order.
+
+       bzip2recover should be of most use dealing with large .bz2
+       files,  as  these will contain many blocks.  It is clearly
+       futile to use it on damaged single-block  files,  since  a
+       damaged  block  cannot  be recovered.  If you wish to min-
+       imise any potential data loss through media  or  transmis-
+       sion errors, you might consider compressing with a smaller
+       block size.
+
+
+PERFORMANCE NOTES
+       The sorting phase of compression gathers together  similar
+
+
+
+                                                                5
+
+
+
+
+
+bzip2(1)                                                 bzip2(1)
+
+
+       strings  in  the  file.  Because of this, files containing
+       very long runs of  repeated  symbols,  like  "aabaabaabaab
+       ..."   (repeated   several  hundred  times)  may  compress
+       extraordinarily slowly.  You can use the -vvvvv option  to
+       monitor progress in great detail, if you want.  Decompres-
+       sion speed is unaffected.
+
+       Such pathological cases seem rare in  practice,  appearing
+       mostly in artificially-constructed test files, and in low-
+       level disk images.  It may be inadvisable to use bzip2  to
+       compress  the  latter.   If you do get a file which causes
+       severe slowness in compression, try making the block  size
+       as small as possible, with flag -1.
+
+       Incompressible or virtually-incompressible data may decom-
+       press rather more slowly than one would hope.  This is due
+       to a naive implementation of the move-to-front coder.
+
+       bzip2  usually  allocates  several  megabytes of memory to
+       operate in, and then charges all over it in a fairly  ran-
+       dom  fashion.   This means that performance, both for com-
+       pressing and decompressing, is largely determined  by  the
+       speed  at  which  your  machine  can service cache misses.
+       Because of this, small changes to the code to  reduce  the
+       miss  rate  have  been observed to give disproportionately
+       large performance improvements.  I imagine bzip2 will per-
+       form best on machines with very large caches.
+
+       Test mode (-t) uses the low-memory decompression algorithm
+       (-s).  This means test mode does not run  as  fast  as  it
+       could;  it  could  run as fast as the normal decompression
+       machinery.  This could easily be fixed at the cost of some
+       code bloat.
+
+
+CAVEATS
+       I/O  error  messages  are not as helpful as they could be.
+       Bzip2 tries hard to detect I/O errors  and  exit  cleanly,
+       but  the  details  of  what  the problem is sometimes seem
+       rather misleading.
+
+       This manual page pertains to version 0.1 of bzip2.  It may
+       well  happen that some future version will use a different
+       compressed file format.  If you try to  decompress,  using
+       0.1,  a  .bz2  file created with some future version which
+       uses a different compressed file format, 0.1 will complain
+       that  your  file  "is not a bzip2 file".  If that happens,
+       you should obtain a more recent version of bzip2  and  use
+       that to decompress the file.
+
+       Wildcard expansion for Windows 95 and NT is flaky.
+
+       bzip2recover  uses  32-bit integers to represent bit posi-
+       tions in compressed files, so it cannot handle  compressed
+
+
+
+                                                                6
+
+
+
+
+
+bzip2(1)                                                 bzip2(1)
+
+
+       files  more than 512 megabytes long.  This could easily be
+       fixed.
+
+       bzip2recover sometimes reports a  very  small,  incomplete
+       final  block.  This is spurious and can be safely ignored.
+
+
+RELATIONSHIP TO bzip-0.21
+       This program is a descendant of the bzip program,  version
+       0.21,  which  I released in August 1996.  The primary dif-
+       ference of bzip2 is its avoidance of the possibly patented
+       algorithms  which  were  used  in 0.21.  bzip2 also brings
+       various useful refinements (-s,  -t),  uses  less  memory,
+       decompresses  significantly  faster,  and  has support for
+       recovering data from damaged files.
+
+       Because bzip2 uses Huffman coding to  construct  the  com-
+       pressed  bitstream, rather than the arithmetic coding used
+       in 0.21, the compressed representations generated  by  the
+       two  programs are incompatible, and they will not interop-
+       erate.  The change in suffix from  .bz  to  .bz2  reflects
+       this.   It would have been helpful to at least allow bzip2
+       to decompress files created by 0.21, but this would defeat
+       the primary aim of having a patent-free compressor.
+
+       Huffman  coding  necessarily  involves some coding ineffi-
+       ciency compared to arithmetic  coding.   This  means  that
+       bzip2  compresses about 1% worse than 0.21, an unfortunate
+       but unavoidable fact-of-life.  On the other  hand,  decom-
+       pression  is approximately 50% faster for the same reason,
+       and the change in file format gave an opportunity  to  add
+       data-recovery features.  So it is not all bad.
+
+
+AUTHOR
+       Julian Seward, jseward@acm.org.
+
+       The ideas embodied in bzip and bzip2 are due to (at least)
+       the following people: Michael Burrows  and  David  Wheeler
+       (for  the  block  sorting  transformation),  David Wheeler
+       (again, for the Huffman coder),  Peter  Fenwick  (for  the
+       structured  coding  model  in 0.21, and many refinements),
+       and Alistair Moffat, Radford Neal and Ian Witten (for  the
+       arithmetic  coder  in 0.21).  I am much indebted for their
+       help, support and advice.  See the file ALGORITHMS in  the
+       source  distribution for pointers to sources of documenta-
+       tion.  Christian von Roques  encouraged  me  to  look  for
+       faster  sorting algorithms, so as to speed up compression.
+       Bela Lubkin encouraged me to improve the  worst-case  com-
+       pression  performance.   Many  people sent patches, helped
+       with portability problems, lent machines, gave advice  and
+       were generally helpful.
+
+
+
+
+
+                                                                7
+
+

diff --git a/bzip2recover.c b/bzip2recover.c
new file mode 100644
index 0000000..efdfb3c
--- /dev/null
+++ b/bzip2recover.c

@@ -0,0 +1,399 @@
+
+/*-----------------------------------------------------------*/
+/*--- Block recoverer program for bzip2                   ---*/
+/*---                                      bzip2recover.c ---*/
+/*-----------------------------------------------------------*/
+
+/*--
+  This program is bzip2recover, a program to attempt data 
+  salvage from damaged files created by the accompanying
+  bzip2 program.
+
+  Copyright (C) 1996, 1997 by Julian Seward.
+     Guildford, Surrey, UK
+     email: jseward@acm.org
+
+  This program is free software; you can redistribute it and/or modify
+  it under the terms of the GNU General Public License as published by
+  the Free Software Foundation; either version 2 of the License, or
+  (at your option) any later version.
+
+  This program is distributed in the hope that it will be useful,
+  but WITHOUT ANY WARRANTY; without even the implied warranty of
+  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+  GNU General Public License for more details.
+
+  You should have received a copy of the GNU General Public License
+  along with this program; if not, write to the Free Software
+  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+
+  The GNU General Public License is contained in the file LICENSE.
+--*/
+
+
+#include <stdio.h>
+#include <errno.h>
+#include <malloc.h>
+#include <stdlib.h>
+#include <strings.h>  /*-- or try string.h --*/
+
+#define UInt32  unsigned int
+#define Int32   int
+#define UChar   unsigned char
+#define Char    char
+#define Bool    unsigned char
+#define True    1
+#define False   0
+
+
+Char inFileName[2000];
+Char outFileName[2000];
+Char progName[2000];
+
+UInt32 bytesOut = 0;
+UInt32 bytesIn  = 0;
+
+
+/*---------------------------------------------------*/
+/*--- I/O errors                                  ---*/
+/*---------------------------------------------------*/
+
+/*---------------------------------------------*/
+void readError ( void )
+{
+   fprintf ( stderr,
+             "%s: I/O error reading `%s', possible reason follows.\n",
+            progName, inFileName );
+   perror ( progName );
+   fprintf ( stderr, "%s: warning: output file(s) may be incomplete.\n",
+             progName );
+   exit ( 1 );
+}
+
+
+/*---------------------------------------------*/
+void writeError ( void )
+{
+   fprintf ( stderr,
+             "%s: I/O error reading `%s', possible reason follows.\n",
+            progName, inFileName );
+   perror ( progName );
+   fprintf ( stderr, "%s: warning: output file(s) may be incomplete.\n",
+             progName );
+   exit ( 1 );
+}
+
+
+/*---------------------------------------------*/
+void mallocFail ( Int32 n )
+{
+   fprintf ( stderr,
+             "%s: malloc failed on request for %d bytes.\n",
+            progName, n );
+   fprintf ( stderr, "%s: warning: output file(s) may be incomplete.\n",
+             progName );
+   exit ( 1 );
+}
+
+
+/*---------------------------------------------------*/
+/*--- Bit stream I/O                              ---*/
+/*---------------------------------------------------*/
+
+typedef
+   struct {
+      FILE*  handle;
+      Int32  buffer;
+      Int32  buffLive;
+      Char   mode;
+   }
+   BitStream;
+
+
+/*---------------------------------------------*/
+BitStream* bsOpenReadStream ( FILE* stream )
+{
+   BitStream *bs = malloc ( sizeof(BitStream) );
+   if (bs == NULL) mallocFail ( sizeof(BitStream) );
+   bs->handle = stream;
+   bs->buffer = 0;
+   bs->buffLive = 0;
+   bs->mode = 'r';
+   return bs;
+}
+
+
+/*---------------------------------------------*/
+BitStream* bsOpenWriteStream ( FILE* stream )
+{
+   BitStream *bs = malloc ( sizeof(BitStream) );
+   if (bs == NULL) mallocFail ( sizeof(BitStream) );
+   bs->handle = stream;
+   bs->buffer = 0;
+   bs->buffLive = 0;
+   bs->mode = 'w';
+   return bs;
+}
+
+
+/*---------------------------------------------*/
+void bsPutBit ( BitStream* bs, Int32 bit )
+{
+   if (bs->buffLive == 8) {
+      Int32 retVal = putc ( (UChar) bs->buffer, bs->handle );
+      if (retVal == EOF) writeError();
+      bytesOut++;
+      bs->buffLive = 1;
+      bs->buffer = bit & 0x1;
+   } else {
+      bs->buffer = ( (bs->buffer << 1) | (bit & 0x1) );
+      bs->buffLive++;
+   };
+}
+
+
+/*---------------------------------------------*/
+/*--
+   Returns 0 or 1, or 2 to indicate EOF.
+--*/
+Int32 bsGetBit ( BitStream* bs )
+{
+   if (bs->buffLive > 0) {
+      bs->buffLive --;
+      return ( ((bs->buffer) >> (bs->buffLive)) & 0x1 );
+   } else {
+      Int32 retVal = getc ( bs->handle );
+      if ( retVal == EOF ) {
+         if (errno != 0) readError();
+         return 2;
+      }
+      bs->buffLive = 7;
+      bs->buffer = retVal;
+      return ( ((bs->buffer) >> 7) & 0x1 );
+   }
+}
+
+
+/*---------------------------------------------*/
+void bsClose ( BitStream* bs )
+{
+   Int32 retVal;
+
+   if ( bs->mode == 'w' ) {
+      while ( bs->buffLive < 8 ) {
+         bs->buffLive++;
+         bs->buffer <<= 1;
+      };
+      retVal = putc ( (UChar) (bs->buffer), bs->handle );
+      if (retVal == EOF) writeError();
+      bytesOut++;
+      retVal = fflush ( bs->handle );
+      if (retVal == EOF) writeError();
+   }
+   retVal = fclose ( bs->handle );
+   if (retVal == EOF)
+      if (bs->mode == 'w') writeError(); else readError();
+   free ( bs );
+}
+
+
+/*---------------------------------------------*/
+void bsPutUChar ( BitStream* bs, UChar c )
+{
+   Int32 i;
+   for (i = 7; i >= 0; i--)
+      bsPutBit ( bs, (((UInt32) c) >> i) & 0x1 );
+}
+
+
+/*---------------------------------------------*/
+void bsPutUInt32 ( BitStream* bs, UInt32 c )
+{
+   Int32 i;
+
+   for (i = 31; i >= 0; i--)
+      bsPutBit ( bs, (c >> i) & 0x1 );
+}
+
+
+/*---------------------------------------------*/
+Bool endsInBz2 ( Char* name )
+{
+   Int32 n = strlen ( name );
+   if (n <= 4) return False;
+   return
+      (name[n-4] == '.' &&
+       name[n-3] == 'b' &&
+       name[n-2] == 'z' &&
+       name[n-1] == '2');
+}
+
+
+/*---------------------------------------------------*/
+/*---                                             ---*/
+/*---------------------------------------------------*/
+
+#define BLOCK_HEADER_HI  0x00003141UL
+#define BLOCK_HEADER_LO  0x59265359UL
+
+#define BLOCK_ENDMARK_HI 0x00001772UL
+#define BLOCK_ENDMARK_LO 0x45385090UL
+
+Int32 main ( Int32 argc, Char** argv )
+{
+   FILE*       inFile;
+   FILE*       outFile;
+   BitStream*  bsIn, *bsWr;
+   Int32       currBlock, b, wrBlock;
+   UInt32      bitsRead;
+   UInt32      bStart[20000];
+   UInt32      bEnd[20000];
+   UInt32      buffHi, buffLo, blockCRC;
+   Char*       p;
+
+   strcpy ( progName, argv[0] );
+   inFileName[0] = outFileName[0] = 0;
+
+   fprintf ( stderr, "bzip2recover: extracts blocks from damaged .bz2 files.\n" );
+
+   if (argc != 2) {
+      fprintf ( stderr, "%s: usage is `%s damaged_file_name'.\n",
+                        progName, progName );
+      exit(1);
+   }
+
+   strcpy ( inFileName, argv[1] );
+
+   inFile = fopen ( inFileName, "rb" );
+   if (inFile == NULL) {
+      fprintf ( stderr, "%s: can't read `%s'\n", progName, inFileName );
+      exit(1);
+   }
+
+   bsIn = bsOpenReadStream ( inFile );
+   fprintf ( stderr, "%s: searching for block boundaries ...\n", progName );
+
+   bitsRead = 0;
+   buffHi = buffLo = 0;
+   currBlock = 0;
+   bStart[currBlock] = 0;
+
+   while (True) {
+      b = bsGetBit ( bsIn );
+      bitsRead++;
+      if (b == 2) {
+         if (bitsRead >= bStart[currBlock] &&
+            (bitsRead - bStart[currBlock]) >= 40) {
+            bEnd[currBlock] = bitsRead-1;
+            if (currBlock > 0)
+               fprintf ( stderr, "   block %d runs from %d to %d (incomplete)\n",
+                         currBlock,  bStart[currBlock], bEnd[currBlock] );
+         } else
+            currBlock--;
+         break;
+      }
+      buffHi = (buffHi << 1) | (buffLo >> 31);
+      buffLo = (buffLo << 1) | (b & 1);
+      if ( ( (buffHi & 0x0000ffff) == BLOCK_HEADER_HI 
+             && buffLo == BLOCK_HEADER_LO)
+           || 
+           ( (buffHi & 0x0000ffff) == BLOCK_ENDMARK_HI 
+             && buffLo == BLOCK_ENDMARK_LO)
+         ) {
+         if (bitsRead > 49)
+            bEnd[currBlock] = bitsRead-49; else
+            bEnd[currBlock] = 0;
+         if (currBlock > 0)
+            fprintf ( stderr, "   block %d runs from %d to %d\n",
+                      currBlock,  bStart[currBlock], bEnd[currBlock] );
+         currBlock++;
+         bStart[currBlock] = bitsRead;
+      }
+   }
+
+   bsClose ( bsIn );
+
+   /*-- identified blocks run from 1 to currBlock inclusive. --*/
+
+   if (currBlock < 1) {
+      fprintf ( stderr,
+                "%s: sorry, I couldn't find any block boundaries.\n",
+                progName );
+      exit(1);
+   };
+
+   fprintf ( stderr, "%s: splitting into blocks\n", progName );
+
+   inFile = fopen ( inFileName, "rb" );
+   if (inFile == NULL) {
+      fprintf ( stderr, "%s: can't open `%s'\n", progName, inFileName );
+      exit(1);
+   }
+   bsIn = bsOpenReadStream ( inFile );
+
+   /*-- placate gcc's dataflow analyser --*/
+   blockCRC = 0; bsWr = 0;
+
+   bitsRead = 0;
+   outFile = NULL;
+   wrBlock = 1;
+   while (True) {
+      b = bsGetBit(bsIn);
+      if (b == 2) break;
+      buffHi = (buffHi << 1) | (buffLo >> 31);
+      buffLo = (buffLo << 1) | (b & 1);
+      if (bitsRead == 47+bStart[wrBlock]) 
+         blockCRC = (buffHi << 16) | (buffLo >> 16);
+
+      if (outFile != NULL && bitsRead >= bStart[wrBlock]
+                          && bitsRead <= bEnd[wrBlock]) {
+         bsPutBit ( bsWr, b );
+      }
+
+      bitsRead++;
+
+      if (bitsRead == bEnd[wrBlock]+1) {
+         if (outFile != NULL) {
+            bsPutUChar ( bsWr, 0x17 ); bsPutUChar ( bsWr, 0x72 );
+            bsPutUChar ( bsWr, 0x45 ); bsPutUChar ( bsWr, 0x38 );
+            bsPutUChar ( bsWr, 0x50 ); bsPutUChar ( bsWr, 0x90 );
+            bsPutUInt32 ( bsWr, blockCRC );
+            bsClose ( bsWr );
+         }
+         if (wrBlock >= currBlock) break;
+         wrBlock++;
+      } else
+      if (bitsRead == bStart[wrBlock]) {
+         outFileName[0] = 0;
+         sprintf ( outFileName, "rec%4d", wrBlock );
+         for (p = outFileName; *p != 0; p++) if (*p == ' ') *p = '0';
+         strcat ( outFileName, inFileName );
+         if ( !endsInBz2(outFileName)) strcat ( outFileName, ".bz2" );
+
+         fprintf ( stderr, "   writing block %d to `%s' ...\n",
+                           wrBlock, outFileName );
+
+         outFile = fopen ( outFileName, "wb" );
+         if (outFile == NULL) {
+            fprintf ( stderr, "%s: can't write `%s'\n",
+                      progName, outFileName );
+            exit(1);
+         }
+         bsWr = bsOpenWriteStream ( outFile );
+         bsPutUChar ( bsWr, 'B' ); bsPutUChar ( bsWr, 'Z' );
+         bsPutUChar ( bsWr, 'h' ); bsPutUChar ( bsWr, '9' );
+         bsPutUChar ( bsWr, 0x31 ); bsPutUChar ( bsWr, 0x41 );
+         bsPutUChar ( bsWr, 0x59 ); bsPutUChar ( bsWr, 0x26 );
+         bsPutUChar ( bsWr, 0x53 ); bsPutUChar ( bsWr, 0x59 );
+      }
+   }
+
+   fprintf ( stderr, "%s: finished\n", progName );
+   return 0;
+}
+
+
+
+/*-----------------------------------------------------------*/
+/*--- end                                  bzip2recover.c ---*/
+/*-----------------------------------------------------------*/

diff --git a/sample1.bz2 b/sample1.bz2
new file mode 100644
index 0000000..18dea60
--- /dev/null
+++ b/sample1.bz2
Binary files differ

diff --git a/sample1.ref b/sample1.ref
new file mode 100644
index 0000000..a56e52b
--- /dev/null
+++ b/sample1.ref
Binary files differ

diff --git a/sample2.bz2 b/sample2.bz2
new file mode 100644
index 0000000..d5a6160
--- /dev/null
+++ b/sample2.bz2
Binary files differ

diff --git a/sample2.ref b/sample2.ref
new file mode 100644
index 0000000..34af958
--- /dev/null
+++ b/sample2.ref
Binary files differ

diff --git a/test.bat b/test.bat
new file mode 100644
index 0000000..30b747d
--- /dev/null
+++ b/test.bat

@@ -0,0 +1,9 @@
+@rem

+@rem MSDOS test driver for bzip2

+@rem

+type words1

+.\bzip2 -1 < sample1.ref > sample1.rbz

+.\bzip2 -2 < sample2.ref > sample2.rbz

+.\bzip2 -dvv < sample1.bz2 > sample1.tst

+.\bzip2 -dvv < sample2.bz2 > sample2.tst

+type words3sh
\ No newline at end of file

diff --git a/test.cmd b/test.cmd
new file mode 100644
index 0000000..f7bc866
--- /dev/null
+++ b/test.cmd

@@ -0,0 +1,9 @@
+@rem

+@rem OS/2 test driver for bzip2

+@rem

+type words1

+.\bzip2 -1 < sample1.ref > sample1.rbz

+.\bzip2 -2 < sample2.ref > sample2.rbz

+.\bzip2 -dvv < sample1.bz2 > sample1.tst

+.\bzip2 -dvv < sample2.bz2 > sample2.tst

+type words3sh
\ No newline at end of file

diff --git a/words0 b/words0
new file mode 100644
index 0000000..527fb43
--- /dev/null
+++ b/words0

@@ -0,0 +1,7 @@
+***-------------------------------------------------***
+***--------- IMPORTANT: READ WHAT FOLLOWS! ---------***
+***---------     viz: pay attention :-)    ---------***
+***-------------------------------------------------***
+
+Compiling bzip2 ...
+

diff --git a/words1 b/words1
new file mode 100644
index 0000000..c75293b
--- /dev/null
+++ b/words1

@@ -0,0 +1,5 @@
+
+
+Doing 4 tests (2 compress, 2 uncompress) ...
+If there's a problem, things might stop at this point.
+ 

diff --git a/words2 b/words2
new file mode 100644
index 0000000..d3cafb9
--- /dev/null
+++ b/words2

@@ -0,0 +1,6 @@
+
+
+Checking test results.  If any of the four "cmp"s which follow
+report any differences, something is wrong.  If you can't easily
+figure out what, please let me know (jseward@acm.org).
+

diff --git a/words3 b/words3
new file mode 100644
index 0000000..5739d18
--- /dev/null
+++ b/words3

@@ -0,0 +1,23 @@
+
+
+If you got this far and the "cmp"s didn't find anything amiss, looks
+like you're in business.  You should install bzip2 and bunzip2:
+
+   copy bzip2 to a public place, maybe /usr/bin.
+   In that public place, make bunzip2 a symbolic link
+      to the bzip2 you just copied there.
+   Put the manual page, bzip2.1, somewhere appropriate;
+      perhaps in /usr/man/man1.
+
+Complete instructions for use are in the preformatted
+manual page, in the file bzip2.1.preformatted.
+
+You can also do "bzip2 --help" to see some helpful information. 
+
+"bzip2 -L" displays the software license.
+
+Please read the README file carefully.  
+Finally, note that bzip2 comes with ABSOLUTELY NO WARRANTY.
+
+Happy compressing!
+

diff --git a/words3sh b/words3sh
new file mode 100644
index 0000000..1139177
--- /dev/null
+++ b/words3sh

@@ -0,0 +1,12 @@
+If you got this far and the "bzip2 -dvv"s give identical

+stored vs computed CRCs, you're probably in business.

+Complete instructions for use are in the preformatted manual page, 

+in the file bzip2.txt.

+

+You can also do "bzip2 --help" to see some helpful information. 

+"bzip2 -L" displays the software license.

+

+Please read the README file carefully.  

+Finally, note that bzip2 comes with ABSOLUTELY NO WARRANTY.

+

+Happy compressing!
\ No newline at end of file
commit	33d134030248633ffa7d60c0a35a783c46da034b	[log] [tgz]
author	Julian Seward <jseward@acm.org>	Thu Aug 07 22:13:13 1997 +0200
committer	Julian Seward <jseward@acm.org>	Thu Aug 07 22:13:13 1997 +0200
tree	b760dc34185dccc7054989c1472478574223cc31