PROJECTS - third_party/libarchive - Git at Google

 The following is a list of things I would like to see
 done with libarchive.  It's sorted roughly in priority;
 more feasible and more often-requested items are higher
 on the list.  If you think you have time to work on any
 of these, please let me know.

 * More compression options:  Recent improvements to the
   read bidding system and external program support should
   make it very simple to add support for lzo, lzf, and
   many other command-line decompression programs.
   I've even written up a Wiki page describing how to
   do this.

 * cpio front-end.  The basic bsdcpio front-end is now
   working.  I'm looking for feedback about what additional
   features are necessary.

 * pax front-end.  Once cpio is a little more stable, I plan
   to fork it as the basis of a pax front-end.  The only really
   tricky part will be implementing the header-editing features
   from POSIX 2001 'pax', which will require some changes to
   the libarchive API.

 * libarchive on Windows.  libarchive mostly builds cleanly
   on Windows and Visual Studio.  Making this really clean
   requires reworking the public API to not use dev/ino; I
   think I know how to do this but could use advice from
   someone knowledgable about Windows file-management APIs.

 * Linux large-file/small-file dance.  libarchive always
   builds with 64-bit off_t and stat structures; client programs
   don't always do this.  Supporting client programs built
   with 32-bit off_t requires a little trickery.  I know how
   to do this but haven't had time to work through it.

 * bsdtar on Windows.  After libarchive is working on Windows,
   this should be much simpler.  At worst, you can just disable
   features.

 * Writing tar sparse entries.  The GNU "1.0" sparse format
   sucks a lot less than the old GNU sparse format, so I'm finally
   dropping my objections to sparse file writing.  This requires
   extending archive_entry to support a block list, and will
   require extensive changes to bsdtar to generate block lists.
   The sparse read code will also need to put block lists into
   the entry so that archive-to-archive copies preserve sparseness.

 * ISO9660 writing.  Writing ISO9660 requires two passes, and
   libarchive is a single-pass API.  For ISO9660, you can hide
   that behind a temp file, though.  Collect metadata in memory,
   append file bodies (properly padded to 2k sector boundaries)
   to a temp file, then format the directory section and copy
   the file data through at format close.

 * archive_read_disk:  Currently, libarchive can generate a stream
   of entries from an archive file and can feed entries to an
   archive file or a directory.  The missing corner is pulling
   entries from a directory.  With that, libarchive can provide
   efficient bulk copy services for dir-to-dir, dir-to-archive,
   archive-to-dir, and archive-to-archive.  Right now, the
   read-from-disk capabilities are handled in the client.

 * ISO9660 Level 3.  ISO9660 Level 3 supports files over 4GB.

 * --split=<limit> option to bsdtar.  This would watch the total output
   size and begin a new archive file whenever <next file size> +
   <total archive size> exceeded <limit>.  Not as robust as
   GNU tar's ability to split an entry across archives, but still
   useful in many situations.

 * Filename matching extensions:  ^ to anchor a pattern to the
   beginning of the file, [!...] negated character classes, etc.
	The following is a list of things I would like to see
	done with libarchive. It's sorted roughly in priority;
	more feasible and more often-requested items are higher
	on the list. If you think you have time to work on any
	of these, please let me know.

	* More compression options: Recent improvements to the
	read bidding system and external program support should
	make it very simple to add support for lzo, lzf, and
	many other command-line decompression programs.
	I've even written up a Wiki page describing how to
	do this.

	* cpio front-end. The basic bsdcpio front-end is now
	working. I'm looking for feedback about what additional
	features are necessary.

	* pax front-end. Once cpio is a little more stable, I plan
	to fork it as the basis of a pax front-end. The only really
	tricky part will be implementing the header-editing features
	from POSIX 2001 'pax', which will require some changes to
	the libarchive API.

	* libarchive on Windows. libarchive mostly builds cleanly
	on Windows and Visual Studio. Making this really clean
	requires reworking the public API to not use dev/ino; I
	think I know how to do this but could use advice from
	someone knowledgable about Windows file-management APIs.

	* Linux large-file/small-file dance. libarchive always
	builds with 64-bit off_t and stat structures; client programs
	don't always do this. Supporting client programs built
	with 32-bit off_t requires a little trickery. I know how
	to do this but haven't had time to work through it.

	* bsdtar on Windows. After libarchive is working on Windows,
	this should be much simpler. At worst, you can just disable
	features.

	* Writing tar sparse entries. The GNU "1.0" sparse format
	sucks a lot less than the old GNU sparse format, so I'm finally
	dropping my objections to sparse file writing. This requires
	extending archive_entry to support a block list, and will
	require extensive changes to bsdtar to generate block lists.
	The sparse read code will also need to put block lists into
	the entry so that archive-to-archive copies preserve sparseness.

	* ISO9660 writing. Writing ISO9660 requires two passes, and
	libarchive is a single-pass API. For ISO9660, you can hide
	that behind a temp file, though. Collect metadata in memory,
	append file bodies (properly padded to 2k sector boundaries)
	to a temp file, then format the directory section and copy
	the file data through at format close.

	* archive_read_disk: Currently, libarchive can generate a stream
	of entries from an archive file and can feed entries to an
	archive file or a directory. The missing corner is pulling
	entries from a directory. With that, libarchive can provide
	efficient bulk copy services for dir-to-dir, dir-to-archive,
	archive-to-dir, and archive-to-archive. Right now, the
	read-from-disk capabilities are handled in the client.

	* ISO9660 Level 3. ISO9660 Level 3 supports files over 4GB.

	* --split=<limit> option to bsdtar. This would watch the total output
	size and begin a new archive file whenever <next file size> +
	<total archive size> exceeded <limit>. Not as robust as
	GNU tar's ability to split an entry across archives, but still
	useful in many situations.

	* Filename matching extensions: ^ to anchor a pattern to the
	beginning of the file, [!...] negated character classes, etc.