| '\" t |
| .\" |
| .\" Author: Lasse Collin |
| .\" |
| .\" This file has been put into the public domain. |
| .\" You can do whatever you want with this file. |
| .\" |
| .TH XZ 1 "2014-12-16" "Tukaani" "XZ Utils" |
| . |
| .SH NAME |
| xz, unxz, xzcat, lzma, unlzma, lzcat \- Compress or decompress .xz and .lzma files |
| . |
| .SH SYNOPSIS |
| .B xz |
| .RI [ option... ] |
| .RI [ file... ] |
| . |
| .SH COMMAND ALIASES |
| .B unxz |
| is equivalent to |
| .BR "xz \-\-decompress" . |
| .br |
| .B xzcat |
| is equivalent to |
| .BR "xz \-\-decompress \-\-stdout" . |
| .br |
| .B lzma |
| is equivalent to |
| .BR "xz \-\-format=lzma" . |
| .br |
| .B unlzma |
| is equivalent to |
| .BR "xz \-\-format=lzma \-\-decompress" . |
| .br |
| .B lzcat |
| is equivalent to |
| .BR "xz \-\-format=lzma \-\-decompress \-\-stdout" . |
| .PP |
| When writing scripts that need to decompress files, |
| it is recommended to always use the name |
| .B xz |
| with appropriate arguments |
| .RB ( "xz \-d" |
| or |
| .BR "xz \-dc" ) |
| instead of the names |
| .B unxz |
| and |
| .BR xzcat . |
| . |
| .SH DESCRIPTION |
| .B xz |
| is a general-purpose data compression tool with |
| command line syntax similar to |
| .BR gzip (1) |
| and |
| .BR bzip2 (1). |
| The native file format is the |
| .B .xz |
| format, but the legacy |
| .B .lzma |
| format used by LZMA Utils and |
| raw compressed streams with no container format headers |
| are also supported. |
| .PP |
| .B xz |
| compresses or decompresses each |
| .I file |
| according to the selected operation mode. |
| If no |
| .I files |
| are given or |
| .I file |
| is |
| .BR \- , |
| .B xz |
| reads from standard input and writes the processed data |
| to standard output. |
| .B xz |
| will refuse (display an error and skip the |
| .IR file ) |
| to write compressed data to standard output if it is a terminal. |
| Similarly, |
| .B xz |
| will refuse to read compressed data |
| from standard input if it is a terminal. |
| .PP |
| Unless |
| .B \-\-stdout |
| is specified, |
| .I files |
| other than |
| .B \- |
| are written to a new file whose name is derived from the source |
| .I file |
| name: |
| .IP \(bu 3 |
| When compressing, the suffix of the target file format |
| .RB ( .xz |
| or |
| .BR .lzma ) |
| is appended to the source filename to get the target filename. |
| .IP \(bu 3 |
| When decompressing, the |
| .B .xz |
| or |
| .B .lzma |
| suffix is removed from the filename to get the target filename. |
| .B xz |
| also recognizes the suffixes |
| .B .txz |
| and |
| .BR .tlz , |
| and replaces them with the |
| .B .tar |
| suffix. |
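| .PP |
| The naming rules above can be sketched in Python. |
| This is only an illustration and not the code used by |
| .BR xz ; |
| the helper name |
| .B target_name |
| and its parameters are hypothetical, and the conditions under which |
| .B xz |
| skips a file are not modeled. |

```python
def target_name(source, decompress=False, fmt="xz"):
    """Derive the output filename from the input filename."""
    if not decompress:
        # Compressing: append the suffix of the target format.
        return source + "." + fmt
    # Decompressing: strip .xz/.lzma, or turn .txz/.tlz into .tar.
    for suffix, replacement in ((".xz", ""), (".lzma", ""),
                                (".txz", ".tar"), (".tlz", ".tar")):
        if source.endswith(suffix):
            return source[:-len(suffix)] + replacement
    raise ValueError(f"{source}: filename has an unknown suffix")


print(target_name("notes.txt"))                    # notes.txt.xz
print(target_name("backup.txz", decompress=True))  # backup.tar
```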
| .PP |
| If the target file already exists, an error is displayed and the |
| .I file |
| is skipped. |
| .PP |
| Unless writing to standard output, |
| .B xz |
| will display a warning and skip the |
| .I file |
| if any of the following applies: |
| .IP \(bu 3 |
| .I File |
| is not a regular file. |
| Symbolic links are not followed, |
| and thus they are not considered to be regular files. |
| .IP \(bu 3 |
| .I File |
| has more than one hard link. |
| .IP \(bu 3 |
| .I File |
| has setuid, setgid, or sticky bit set. |
| .IP \(bu 3 |
| The operation mode is set to compress and the |
| .I file |
| already has a suffix of the target file format |
| .RB ( .xz |
| or |
| .B .txz |
| when compressing to the |
| .B .xz |
| format, and |
| .B .lzma |
| or |
| .B .tlz |
| when compressing to the |
| .B .lzma |
| format). |
| .IP \(bu 3 |
| The operation mode is set to decompress and the |
| .I file |
| doesn't have a suffix of any of the supported file formats |
| .RB ( .xz , |
| .BR .txz , |
| .BR .lzma , |
| or |
| .BR .tlz ). |
| .PP |
| After successfully compressing or decompressing the |
| .IR file , |
| .B xz |
| copies the owner, group, permissions, access time, |
| and modification time from the source |
| .I file |
| to the target file. |
| If copying the group fails, the permissions are modified |
| so that the target file doesn't become accessible to users |
| who didn't have permission to access the source |
| .IR file . |
| .B xz |
| doesn't support copying other metadata like access control lists |
| or extended attributes yet. |
| .PP |
| Once the target file has been successfully closed, the source |
| .I file |
| is removed unless |
| .B \-\-keep |
| was specified. |
| The source |
| .I file |
| is never removed if the output is written to standard output. |
| .PP |
| Sending |
| .B SIGINFO |
| or |
| .B SIGUSR1 |
| to the |
| .B xz |
| process makes it print progress information to standard error. |
| This has only limited use since when standard error |
| is a terminal, using |
| .B \-\-verbose |
| will display an automatically updating progress indicator. |
| . |
| .SS "Memory usage" |
| The memory usage of |
| .B xz |
| varies from a few hundred kilobytes to several gigabytes |
| depending on the compression settings. |
| The settings used when compressing a file determine |
| the memory requirements of the decompressor. |
| Typically the decompressor needs 5\ % to 20\ % of |
| the amount of memory that the compressor needed when |
| creating the file. |
| For example, decompressing a file created with |
| .B xz \-9 |
| currently requires 65\ MiB of memory. |
| Still, it is possible to have |
| .B .xz |
| files that require several gigabytes of memory to decompress. |
| .PP |
| Users of older systems in particular may find |
| the possibility of very large memory usage annoying. |
| To prevent uncomfortable surprises, |
| .B xz |
| has a built-in memory usage limiter, which is disabled by default. |
| While some operating systems provide ways to limit |
| the memory usage of processes, relying on them |
| wasn't deemed to be flexible enough (e.g. using |
| .BR ulimit (1) |
| to limit virtual memory tends to cripple |
| .BR mmap (2)). |
| .PP |
| The memory usage limiter can be enabled with |
| the command line option \fB\-\-memlimit=\fIlimit\fR. |
| Often it is more convenient to enable the limiter |
| by default by setting the environment variable |
| .BR XZ_DEFAULTS , |
| e.g.\& |
| .BR XZ_DEFAULTS=\-\-memlimit=150MiB . |
| It is possible to set the limits separately |
| for compression and decompression |
| by using \fB\-\-memlimit\-compress=\fIlimit\fR and |
| \fB\-\-memlimit\-decompress=\fIlimit\fR. |
| Using these two options outside |
| .B XZ_DEFAULTS |
| is rarely useful because a single run of |
| .B xz |
| cannot do both compression and decompression and |
| .BI \-\-memlimit= limit |
| (or \fB\-M\fR \fIlimit\fR) |
| is shorter to type on the command line. |
| .PP |
| If the specified memory usage limit is exceeded when decompressing, |
| .B xz |
| will display an error and decompressing the file will fail. |
| If the limit is exceeded when compressing, |
| .B xz |
| will try to scale the settings down so that the limit |
| is no longer exceeded (except when using \fB\-\-format=raw\fR |
| or \fB\-\-no\-adjust\fR). |
| This way the operation won't fail unless the limit is very small. |
| The scaling of the settings is done in steps that don't |
| match the compression level presets, e.g. if the limit is |
| only slightly less than the amount required for |
| .BR "xz \-9" , |
| the settings will be scaled down only a little, |
| not all the way down to |
| .BR "xz \-8" . |
| . |
| .SS "Concatenation and padding with .xz files" |
| It is possible to concatenate |
| .B .xz |
| files as is. |
| .B xz |
| will decompress such files as if they were a single |
| .B .xz |
| file. |
| .PP |
| It is possible to insert padding between the concatenated parts |
| or after the last part. |
| The padding must consist of null bytes and the size |
| of the padding must be a multiple of four bytes. |
| This can be useful e.g. if the |
| .B .xz |
| file is stored on a medium that measures file sizes |
| in 512-byte blocks. |
| .PP |
| Concatenation and padding are not allowed with |
| .B .lzma |
| files or raw streams. |
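| .PP |
| Concatenation can be tried out with the |
| .B lzma |
| module of the Python standard library, which is built on liblzma. |
| The sketch below is an illustration under that assumption and does not invoke |
| .B xz |
| itself; padding between the parts is not exercised. |

```python
import lzma

part1 = lzma.compress(b"first part\n")
part2 = lzma.compress(b"second part\n")

# Two complete .xz streams joined back to back.
combined = part1 + part2

# lzma.decompress() decodes every stream in the input and joins the results.
restored = lzma.decompress(combined)
print(restored)  # b'first part\nsecond part\n'
```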
| . |
| .SH OPTIONS |
| . |
| .SS "Integer suffixes and special values" |
| In most places where an integer argument is expected, |
| an optional suffix is supported to easily indicate large integers. |
| There must be no space between the integer and the suffix. |
| .TP |
| .B KiB |
| Multiply the integer by 1,024 (2^10). |
| .BR Ki , |
| .BR k , |
| .BR kB , |
| .BR K , |
| and |
| .B KB |
| are accepted as synonyms for |
| .BR KiB . |
| .TP |
| .B MiB |
| Multiply the integer by 1,048,576 (2^20). |
| .BR Mi , |
| .BR m , |
| .BR M , |
| and |
| .B MB |
| are accepted as synonyms for |
| .BR MiB . |
| .TP |
| .B GiB |
| Multiply the integer by 1,073,741,824 (2^30). |
| .BR Gi , |
| .BR g , |
| .BR G , |
| and |
| .B GB |
| are accepted as synonyms for |
| .BR GiB . |
| .PP |
| The special value |
| .B max |
| can be used to indicate the maximum integer value |
| supported by the option. |
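| .PP |
| The suffix rules can be summarized with a small Python parser. |
| It is a sketch only: the function name |
| .B parse_size |
| and its |
| .B maximum |
| parameter are hypothetical, and the real option parser in |
| .B xz |
| may differ in details such as error handling. |

```python
import re

# Suffix spellings and multipliers as listed above.
MULTIPLIERS = {}
for names, factor in (
    (("KiB", "Ki", "k", "kB", "K", "KB"), 2**10),
    (("MiB", "Mi", "m", "M", "MB"), 2**20),
    (("GiB", "Gi", "g", "G", "GB"), 2**30),
):
    for name in names:
        MULTIPLIERS[name] = factor


def parse_size(text, maximum):
    """Parse an integer with an optional suffix.

    `maximum` is the value that the special word "max" stands for.
    """
    if text == "max":
        return maximum
    # No space is allowed between the digits and the suffix.
    match = re.fullmatch(r"([0-9]+)([A-Za-z]*)", text)
    if match is None:
        raise ValueError(f"invalid integer: {text!r}")
    digits, suffix = match.groups()
    if suffix and suffix not in MULTIPLIERS:
        raise ValueError(f"unknown suffix: {suffix!r}")
    return int(digits) * MULTIPLIERS.get(suffix, 1)


print(parse_size("150MiB", maximum=2**64 - 1))  # 157286400
print(parse_size("4k", maximum=2**64 - 1))      # 4096
```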
| . |
| .SS "Operation mode" |
| If multiple operation mode options are given, |
| the last one takes effect. |
| .TP |
| .BR \-z ", " \-\-compress |
| Compress. |
| This is the default operation mode when no operation mode option |
| is specified and no other operation mode is implied from |
| the command name (for example, |
| .B unxz |
| implies |
| .BR \-\-decompress ). |
| .TP |
| .BR \-d ", " \-\-decompress ", " \-\-uncompress |
| Decompress. |
| .TP |
| .BR \-t ", " \-\-test |
| Test the integrity of compressed |
| .IR files . |
| This option is equivalent to |
| .B "\-\-decompress \-\-stdout" |
| except that the decompressed data is discarded instead of being |
| written to standard output. |
| No files are created or removed. |
| .TP |
| .BR \-l ", " \-\-list |
| Print information about compressed |
| .IR files . |
| No uncompressed output is produced, |
| and no files are created or removed. |
| In list mode, the program cannot read |
| the compressed data from standard |
| input or from other unseekable sources. |
| .IP "" |
| The default listing shows basic information about |
| .IR files , |
| one file per line. |
| To get more detailed information, use also the |
| .B \-\-verbose |
| option. |
| For even more information, use |
| .B \-\-verbose |
| twice, but note that this may be slow, because getting all the extra |
| information requires many seeks. |
| The width of verbose output exceeds |
| 80 characters, so piping the output to e.g.\& |
| .B "less\ \-S" |
| may be convenient if the terminal isn't wide enough. |
| .IP "" |
| The exact output may vary between |
| .B xz |
| versions and different locales. |
| For machine-readable output, |
| .B \-\-robot \-\-list |
| should be used. |
| . |
| .SS "Operation modifiers" |
| .TP |
| .BR \-k ", " \-\-keep |
| Don't delete the input files. |
| .TP |
| .BR \-f ", " \-\-force |
| This option has several effects: |
| .RS |
| .IP \(bu 3 |
| If the target file already exists, |
| delete it before compressing or decompressing. |
| .IP \(bu 3 |
| Compress or decompress even if the input is |
| a symbolic link to a regular file, |
| has more than one hard link, |
| or has the setuid, setgid, or sticky bit set. |
| The setuid, setgid, and sticky bits are not copied |
| to the target file. |
| .IP \(bu 3 |
| When used with |
| .B \-\-decompress |
| .BR \-\-stdout |
| and |
| .B xz |
| cannot recognize the type of the source file, |
| copy the source file as is to standard output. |
| This allows |
| .B xzcat |
| .B \-\-force |
| to be used like |
| .BR cat (1) |
| for files that have not been compressed with |
| .BR xz . |
| Note that in the future, |
| .B xz |
| might support new compressed file formats, which may make |
| .B xz |
| decompress more types of files instead of copying them as is to |
| standard output. |
| .BI \-\-format= format |
| can be used to restrict |
| .B xz |
| to decompress only a single file format. |
| .RE |
| .TP |
| .BR \-c ", " \-\-stdout ", " \-\-to\-stdout |
| Write the compressed or decompressed data to |
| standard output instead of a file. |
| This implies |
| .BR \-\-keep . |
| .TP |
| .B \-\-single\-stream |
| Decompress only the first |
| .B .xz |
| stream, and |
| silently ignore possible remaining input data following the stream. |
| Normally such trailing garbage makes |
| .B xz |
| display an error. |
| .IP "" |
| .B xz |
| never decompresses more than one stream from |
| .B .lzma |
| files or raw streams, but this option still makes |
| .B xz |
| ignore the possible trailing data after the |
| .B .lzma |
| file or raw stream. |
| .IP "" |
| This option has no effect if the operation mode is not |
| .B \-\-decompress |
| or |
| .BR \-\-test . |
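| .IP "" |
| The idea of stopping after the first stream can be illustrated with |
| the Python standard library |
| .B lzma |
| module. |
| This is an analogy rather than a description of how |
| .B xz |
| is implemented: the Python decoder hands the trailing bytes back |
| to the caller, whereas |
| .B \-\-single\-stream |
| silently ignores them. |

```python
import lzma

stream1 = lzma.compress(b"first")
stream2 = lzma.compress(b"second")

# A single LZMADecompressor object decodes exactly one stream.
decoder = lzma.LZMADecompressor()
output = decoder.decompress(stream1 + stream2)

print(output)                          # b'first'
print(decoder.eof)                     # True: the first stream has ended
print(decoder.unused_data == stream2)  # True: the rest was not decoded
```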
| .TP |
| .B \-\-no\-sparse |
| Disable creation of sparse files. |
| By default, if decompressing into a regular file, |
| .B xz |
| tries to make the file sparse if the decompressed data contains |
| long sequences of binary zeros. |
| It also works when writing to standard output |
| as long as standard output is connected to a regular file |
| and certain additional conditions are met to make it safe. |
| Creating sparse files may save disk space and speed up |
| the decompression by reducing the amount of disk I/O. |
| .TP |
| \fB\-S\fR \fI.suf\fR, \fB\-\-suffix=\fI.suf |
| When compressing, use |
| .I .suf |
| as the suffix for the target file instead of |
| .B .xz |
| or |
| .BR .lzma . |
| If not writing to standard output and |
| the source file already has the suffix |
| .IR .suf , |
| a warning is displayed and the file is skipped. |
| .IP "" |
| When decompressing, recognize files with the suffix |
| .I .suf |
| in addition to files with the |
| .BR .xz , |
| .BR .txz , |
| .BR .lzma , |
| or |
| .B .tlz |
| suffix. |
| If the source file has the suffix |
| .IR .suf , |
| the suffix is removed to get the target filename. |
| .IP "" |
| When compressing or decompressing raw streams |
| .RB ( \-\-format=raw ), |
| the suffix must always be specified unless |
| writing to standard output, |
| because there is no default suffix for raw streams. |
| .TP |
| \fB\-\-files\fR[\fB=\fIfile\fR] |
| Read the filenames to process from |
| .IR file ; |
| if |
| .I file |
| is omitted, filenames are read from standard input. |
| Filenames must be terminated with the newline character. |
| A dash |
| .RB ( \- ) |
| is taken as a regular filename; it doesn't mean standard input. |
| If filenames are given also as command line arguments, they are |
| processed before the filenames read from |
| .IR file . |
| .TP |
| \fB\-\-files0\fR[\fB=\fIfile\fR] |
| This is identical to \fB\-\-files\fR[\fB=\fIfile\fR] except |
| that each filename must be terminated with the null character. |
| . |
| .SS "Basic file format and compression options" |
| .TP |
| \fB\-F\fR \fIformat\fR, \fB\-\-format=\fIformat |
| Specify the file |
| .I format |
| to compress or decompress: |
| .RS |
| .TP |
| .B auto |
| This is the default. |
| When compressing, |
| .B auto |
| is equivalent to |
| .BR xz . |
| When decompressing, |
| the format of the input file is automatically detected. |
| Note that raw streams (created with |
| .BR \-\-format=raw ) |
| cannot be auto-detected. |
| .TP |
| .B xz |
| Compress to the |
| .B .xz |
| file format, or accept only |
| .B .xz |
| files when decompressing. |
| .TP |
| .BR lzma ", " alone |
| Compress to the legacy |
| .B .lzma |
| file format, or accept only |
| .B .lzma |
| files when decompressing. |
| The alternative name |
| .B alone |
| is provided for backwards compatibility with LZMA Utils. |
| .TP |
| .B raw |
| Compress or uncompress a raw stream (no headers). |
| This is meant for advanced users only. |
| To decode raw streams, you need to use |
| .B \-\-format=raw |
| and explicitly specify the filter chain, |
| which normally would have been stored in the container headers. |
| .RE |
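| .IP "" |
| The difference between the container formats and raw streams |
| can be seen with the Python standard library |
| .B lzma |
| module, used here as a stand-in for the command line tool. |
| The payload and the single-filter chain are arbitrary choices |
| made for this sketch. |

```python
import lzma

data = b"example payload " * 64

xz_file = lzma.compress(data, format=lzma.FORMAT_XZ)
lzma_file = lzma.compress(data, format=lzma.FORMAT_ALONE)

# FORMAT_AUTO (the default) recognizes both container formats.
assert lzma.decompress(xz_file) == data
assert lzma.decompress(lzma_file) == data

# A raw stream has no headers, so the same filter chain must be
# given explicitly for both compression and decompression.
chain = [{"id": lzma.FILTER_LZMA2, "preset": 6}]
raw = lzma.compress(data, format=lzma.FORMAT_RAW, filters=chain)
assert lzma.decompress(raw, format=lzma.FORMAT_RAW, filters=chain) == data
```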
| .TP |
| \fB\-C\fR \fIcheck\fR, \fB\-\-check=\fIcheck |
| Specify the type of the integrity check. |
| The check is calculated from the uncompressed data and |
| stored in the |
| .B .xz |
| file. |
| This option has an effect only when compressing into the |
| .B .xz |
| format; the |
| .B .lzma |
| format doesn't support integrity checks. |
| The integrity check (if any) is verified when the |
| .B .xz |
| file is decompressed. |
| .IP "" |
| Supported |
| .I check |
| types: |
| .RS |
| .TP |
| .B none |
| Don't calculate an integrity check at all. |
| This is usually a bad idea, |
| but it can be useful when integrity of the data is verified |
| by other means anyway. |
| .TP |
| .B crc32 |
| Calculate CRC32 using the polynomial from IEEE-802.3 (Ethernet). |
| .TP |
| .B crc64 |
| Calculate CRC64 using the polynomial from ECMA-182. |
| This is the default, since it is slightly better than CRC32 |
| at detecting damaged files and the speed difference is negligible. |
| .TP |
| .B sha256 |
| Calculate SHA-256. |
| This is somewhat slower than CRC32 and CRC64. |
| .RE |
| .IP "" |
| Integrity of the |
| .B .xz |
| headers is always verified with CRC32. |
| It is not possible to change or disable it. |
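| .IP "" |
| As an illustration, the Python standard library |
| .B lzma |
| module exposes the same check types. |
| The sketch below compresses the same data with each supported check |
| and asks the decoder which check the stream carried; |
| it is not the code path used by |
| .BR xz , |
| and the sample data is arbitrary. |

```python
import lzma

data = b"integrity check demo\n" * 32
checks = {
    "none": lzma.CHECK_NONE,
    "crc32": lzma.CHECK_CRC32,
    "crc64": lzma.CHECK_CRC64,
    "sha256": lzma.CHECK_SHA256,
}

reported = {}
for name, check in checks.items():
    # Some liblzma builds omit certain check types.
    if not lzma.is_check_supported(check):
        continue
    blob = lzma.compress(data, format=lzma.FORMAT_XZ, check=check)
    decoder = lzma.LZMADecompressor()
    # Decompression verifies the check (if any) as a side effect.
    assert decoder.decompress(blob) == data
    # The decoder reports which check type the stream carried.
    reported[name] = decoder.check

print(reported)
```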
| .TP |
| .B \-\-ignore\-check |
| Don't verify the integrity check of the compressed data when decompressing. |
| The CRC32 values in the |
| .B .xz |
| headers will still be verified normally. |
| .IP "" |
| .B "Do not use this option unless you know what you are doing." |
| Possible reasons to use this option: |
| .RS |
| .IP \(bu 3 |
| Trying to recover data from a corrupt .xz file. |
| .IP \(bu 3 |
| Speeding up decompression. |
| This matters mostly with SHA-256 or |
| with files that have compressed extremely well. |
| It's recommended to not use this option for this purpose |
| unless the file integrity is verified externally in some other way. |
| .RE |
| .TP |
| .BR \-0 " ... " \-9 |
| Select a compression preset level. |
| The default is |
| .BR \-6 . |
| If multiple preset levels are specified, |
| the last one takes effect. |
| If a custom filter chain was already specified, setting |
| a compression preset level clears the custom filter chain. |
| .IP "" |
| The differences between the presets are more significant than with |
| .BR gzip (1) |
| and |
| .BR bzip2 (1). |
| The selected compression settings determine |
| the memory requirements of the decompressor, |
| so using too high a preset level might make it painful |
| to decompress the file on an old system with little RAM. |
| Specifically, |
| .B "it's not a good idea to blindly use \-9 for everything" |
| like it often is with |
| .BR gzip (1) |
| and |
| .BR bzip2 (1). |
| .RS |
| .TP |
| .BR "\-0" " ... " "\-3" |
| These are somewhat fast presets. |
| .B \-0 |
| is sometimes faster than |
| .B "gzip \-9" |
| while compressing much better. |
| The higher ones often have speed comparable to |
| .BR bzip2 (1) |
| with comparable or better compression ratio, |
| although the results |
| depend a lot on the type of data being compressed. |
| .TP |
| .BR "\-4" " ... " "\-6" |
| Good to very good compression while keeping |
| decompressor memory usage reasonable even for old systems. |
| .B \-6 |
| is the default, which is usually a good choice |
| e.g. for distributing files that need to be decompressible |
| even on systems with only 16\ MiB RAM. |
| .RB ( \-5e |
| or |
| .B \-6e |
| may be worth considering too. |
| See |
| .BR \-\-extreme .) |
| .TP |
| .BR "\-7" " ... " "\-9" |
| These are like |
| .B \-6 |
| but with higher compressor and decompressor memory requirements. |
| These are useful only when compressing files bigger than |
| 8\ MiB, 16\ MiB, and 32\ MiB, respectively. |
| .RE |
| .IP "" |
| On the same hardware, the decompression speed is approximately |
| a constant number of bytes of compressed data per second. |
| In other words, the better the compression, |
| the faster the decompression will usually be. |
| This also means that the amount of uncompressed output |
| produced per second can vary a lot. |
| .IP "" |
| The following table summarises the features of the presets: |
| .RS |
| .RS |
| .PP |
| .TS |
| tab(;); |
| c c c c c |
| n n n n n. |
| Preset;DictSize;CompCPU;CompMem;DecMem |
| \-0;256 KiB;0;3 MiB;1 MiB |
| \-1;1 MiB;1;9 MiB;2 MiB |
| \-2;2 MiB;2;17 MiB;3 MiB |
| \-3;4 MiB;3;32 MiB;5 MiB |
| \-4;4 MiB;4;48 MiB;5 MiB |
| \-5;8 MiB;5;94 MiB;9 MiB |
| \-6;8 MiB;6;94 MiB;9 MiB |
| \-7;16 MiB;6;186 MiB;17 MiB |
| \-8;32 MiB;6;370 MiB;33 MiB |
| \-9;64 MiB;6;674 MiB;65 MiB |
| .TE |
| .RE |
| .RE |
| .IP "" |
| Column descriptions: |
| .RS |
| .IP \(bu 3 |
| DictSize is the LZMA2 dictionary size. |
| It is a waste of memory to use a dictionary bigger than |
| the size of the uncompressed file. |
| This is why it is good to avoid using the presets |
| .BR \-7 " ... " \-9 |
| when there's no real need for them. |
| At |
| .B \-6 |
| and lower, the amount of memory wasted is |
| usually low enough to not matter. |
| .IP \(bu 3 |
| CompCPU is a simplified representation of the LZMA2 settings |
| that affect compression speed. |
| The dictionary size affects speed too, |
| so while CompCPU is the same for levels |
| .BR \-6 " ... " \-9 , |
| higher levels still tend to be a little slower. |
| To get even slower and thus possibly better compression, see |
| .BR \-\-extreme . |
| .IP \(bu 3 |
| CompMem contains the compressor memory requirements |
| in the single-threaded mode. |
| It may vary slightly between |
| .B xz |
| versions. |
| Memory requirements of some of the future multithreaded modes may |
| be dramatically higher than that of the single-threaded mode. |
| .IP \(bu 3 |
| DecMem contains the decompressor memory requirements. |
| That is, the compression settings determine |
| the memory requirements of the decompressor. |
| The exact decompressor memory usage is slightly more than |
| the LZMA2 dictionary size, but the values in the table |
| have been rounded up to the next full MiB. |
| .RE |
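| .IP "" |
| The advice about dictionary sizes can be turned into a small helper. |
| The Python sketch below copies the DictSize column from the first table |
| and picks a preset above |
| .B \-6 |
| only when the file is big enough to benefit; |
| the function name |
| .B highest_useful_preset |
| is hypothetical and |
| .B xz |
| makes no such choice automatically. |

```python
MIB = 2**20

# Preset -> LZMA2 dictionary size in bytes, copied from the preset table.
DICT_SIZE = {0: MIB // 4, 1: 1 * MIB, 2: 2 * MIB, 3: 4 * MIB, 4: 4 * MIB,
             5: 8 * MIB, 6: 8 * MIB, 7: 16 * MIB, 8: 32 * MIB, 9: 64 * MIB}


def highest_useful_preset(uncompressed_size, default=6):
    """Return `default`, or a higher preset if the file can use it."""
    best = default
    for preset in range(default + 1, 10):
        # -7, -8, and -9 help only with files bigger than the dictionary
        # of the preset just below them (8, 16, and 32 MiB).
        if uncompressed_size > DICT_SIZE[preset - 1]:
            best = preset
    return best


print(highest_useful_preset(5 * MIB))    # 6
print(highest_useful_preset(20 * MIB))   # 8
print(highest_useful_preset(100 * MIB))  # 9
```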
| .TP |
| .BR \-e ", " \-\-extreme |
| Use a slower variant of the selected compression preset level |
| .RB ( \-0 " ... " \-9 ) |
| to hopefully get a little bit better compression ratio, |
| but with bad luck this can also make it worse. |
| Decompressor memory usage is not affected, |
| but compressor memory usage increases a little at preset levels |
| .BR \-0 " ... " \-3 . |
| .IP "" |
| Since there are two presets with dictionary sizes |
| 4\ MiB and 8\ MiB, the presets |
| .B \-3e |
| and |
| .B \-5e |
| use slightly faster settings (lower CompCPU) than |
| .B \-4e |
| and |
| .BR \-6e , |
| respectively. |
| That way no two presets are identical. |
| .RS |
| .RS |
| .PP |
| .TS |
| tab(;); |
| c c c c c |
| n n n n n. |
| Preset;DictSize;CompCPU;CompMem;DecMem |
| \-0e;256 KiB;8;4 MiB;1 MiB |
| \-1e;1 MiB;8;13 MiB;2 MiB |
| \-2e;2 MiB;8;25 MiB;3 MiB |
| \-3e;4 MiB;7;48 MiB;5 MiB |
| \-4e;4 MiB;8;48 MiB;5 MiB |
| \-5e;8 MiB;7;94 MiB;9 MiB |
| \-6e;8 MiB;8;94 MiB;9 MiB |
| \-7e;16 MiB;8;186 MiB;17 MiB |
| \-8e;32 MiB;8;370 MiB;33 MiB |
| \-9e;64 MiB;8;674 MiB;65 MiB |
| .TE |
| .RE |
| .RE |
| .IP "" |
| For example, there are a total of four presets that use |
| 8\ MiB dictionary, whose order from the fastest to the slowest is |
| .BR \-5 , |
| .BR \-6 , |
| .BR \-5e , |
| and |
| .BR \-6e . |
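| .IP "" |
| In the Python standard library |
| .B lzma |
| module the counterpart of |
| .B \-\-extreme |
| is the |
| .B PRESET_EXTREME |
| flag. |
| The sketch below is an illustration with arbitrary sample data; |
| the resulting sizes depend on the input, |
| so no particular numbers are claimed. |

```python
import lzma

data = bytes(range(256)) * 512

normal = lzma.compress(data, preset=6)
extreme = lzma.compress(data, preset=6 | lzma.PRESET_EXTREME)

# Both variants decode to the same data; only the encoder effort differs.
assert lzma.decompress(normal) == data
assert lzma.decompress(extreme) == data

# The extreme variant is usually, but not always, the smaller one.
print(len(normal), len(extreme))
```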
| .TP |
| .B \-\-fast |
| .PD 0 |
| .TP |
| .B \-\-best |
| .PD |
| These are somewhat misleading aliases for |
| .B \-0 |
| and |
| .BR \-9 , |
| respectively. |
| These are provided only for backwards compatibility |
| with LZMA Utils. |
| Avoid using these options. |
| .TP |
| .BI \-\-block\-size= size |
| When compressing to the |
| .B .xz |
| format, split the input data into blocks of |
| .I size |
| bytes. |
| The blocks are compressed independently from each other, |
| which helps with multi-threading and |
| makes limited random-access decompression possible. |
| This option is typically used to override the default |
| block size in multi-threaded mode, |
| but this option can be used in single-threaded mode too. |
| .IP "" |
| In multi-threaded mode about three times |
| .I size |
| bytes will be allocated in each thread for buffering input and output. |
| The default |
| .I size |
| is three times the LZMA2 dictionary size or 1 MiB, |
| whichever is more. |
| Typically a good value is 2\-4 times |
| the size of the LZMA2 dictionary or at least 1 MiB. |
| Using |
| .I size |
| less than the LZMA2 dictionary size is a waste of RAM |
| because then the LZMA2 dictionary buffer will never get fully used. |
| The sizes of the blocks are stored in the block headers, |
| which a future version of |
| .B xz |
| will use for multi-threaded decompression. |
| .IP "" |
| In single-threaded mode no block splitting is done by default. |
| Setting this option doesn't affect memory usage. |
| No size information is stored in block headers, |
| thus files created in single-threaded mode |
| won't be identical to files created in multi-threaded mode. |
| The lack of size information also means that a future version of |
| .B xz |
| won't be able to decompress the files in multi-threaded mode. |
| .TP |
| .BI \-\-block\-list= sizes |
| When compressing to the |
| .B .xz |
| format, start a new block after |
| the given intervals of uncompressed data. |
| .IP "" |
| The uncompressed |
| .I sizes |
| of the blocks are specified as a comma-separated list. |
| Omitting a size (two or more consecutive commas) is a shorthand |
| to use the size of the previous block. |
| .IP "" |
| If the input file is bigger than the sum of |
| .IR sizes , |
| the last value in |
| .I sizes |
| is repeated until the end of the file. |
| A special value of |
| .B 0 |
| may be used as the last value to indicate that |
| the rest of the file should be encoded as a single block. |
| .IP "" |
| If one specifies |
| .I sizes |
| that exceed the encoder's block size |
| (either the default value in threaded mode or |
| the value specified with \fB\-\-block\-size=\fIsize\fR), |
| the encoder will create additional blocks while |
| keeping the boundaries specified in |
| .IR sizes . |
| For example, if one specifies |
| .B \-\-block\-size=10MiB |
| .B \-\-block\-list=5MiB,10MiB,8MiB,12MiB,24MiB |
| and the input file is 80 MiB, |
| one will get 11 blocks: |
| 5, 10, 8, 10, 2, 10, 10, 4, 10, 10, and 1 MiB. |
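| .IP "" |
| The block boundaries in this example can be reproduced with |
| a short Python sketch. |
| The function |
| .B split_blocks |
| is hypothetical and works in whole MiB; |
| it models only the rules described here and |
| leaves out omitted sizes and the special value |
| .BR 0 . |

```python
def split_blocks(total, block_list, block_size):
    """Return the sizes of the blocks produced for `total` units of input."""
    blocks = []
    remaining = total
    index = 0
    while remaining > 0:
        # Past the end of the list, the last value is repeated.
        span = block_list[min(index, len(block_list) - 1)]
        span = min(span, remaining)
        index += 1
        remaining -= span
        # A span larger than the block size is cut into extra blocks,
        # but the boundary at the end of the span is kept.
        while span > 0:
            piece = min(span, block_size)
            blocks.append(piece)
            span -= piece
    return blocks


print(split_blocks(80, [5, 10, 8, 12, 24], 10))
# [5, 10, 8, 10, 2, 10, 10, 4, 10, 10, 1]
```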
| .IP "" |
| In multi-threaded mode the sizes of the blocks |
| are stored in the block headers. |
| This isn't done in single-threaded mode, |
| so the encoded output won't be |
| identical to that of the multi-threaded mode. |
| .TP |
| .BI \-\-flush\-timeout= timeout |
| When compressing, if more than |
| .I timeout |
| milliseconds (a positive integer) have passed since the previous flush and |
| reading more input would block, |
| all the pending input data is flushed from the encoder and |
| made available in the output stream. |
| This can be useful if |
| .B xz |
| is used to compress data that is streamed over a network. |
| Small |
| .I timeout |
| values make the data available at the receiving end |
| with a small delay, but large |
| .I timeout |
| values give better compression ratio. |
| .IP "" |
| This feature is disabled by default. |
| If this option is specified more than once, the last one takes effect. |
| The special |
| .I timeout |
| value of |
| .B 0 |
| can be used to explicitly disable this feature. |
| .IP "" |
| This feature is not available on non-POSIX systems. |
| .IP "" |
| .\" FIXME |
| .B "This feature is still experimental." |
| Currently |
| .B xz |
| is unsuitable for decompressing the stream in real time due to how |
| .B xz |
| does buffering. |
| .TP |
| .BI \-\-memlimit\-compress= limit |
| Set a memory usage limit for compression. |
| If this option is specified multiple times, |
| the last one takes effect. |
| .IP "" |
| If the compression settings exceed the |
| .IR limit , |
| .B xz |
| will adjust the settings downwards so that |
| the limit is no longer exceeded and display a notice that |
| automatic adjustment was done. |
| Such adjustments are not made when compressing with |
| .B \-\-format=raw |
| or if |
| .B \-\-no\-adjust |
| has been specified. |
| In those cases, an error is displayed and |
| .B xz |
| will exit with exit status 1. |
| .IP "" |
| The |
| .I limit |
| can be specified in multiple ways: |
| .RS |
| .IP \(bu 3 |
| The |
| .I limit |
| can be an absolute value in bytes. |
| Using an integer suffix like |
| .B MiB |
| can be useful. |
| Example: |
| .B "\-\-memlimit\-compress=80MiB" |
| .IP \(bu 3 |
| The |
| .I limit |
| can be specified as a percentage of total physical memory (RAM). |
| This can be useful especially when setting the |
| .B XZ_DEFAULTS |
| environment variable in a shell initialization script |
| that is shared between different computers. |
| That way the limit is automatically bigger |
| on systems with more memory. |
| Example: |
| .B "\-\-memlimit\-compress=70%" |
| .IP \(bu 3 |
| The |
| .I limit |
| can be reset back to its default value by setting it to |
| .BR 0 . |
| This is currently equivalent to setting the |
| .I limit |
| to |
| .B max |
| (no memory usage limit). |
| Once multithreading support has been implemented, |
| there may be a difference between |
| .B 0 |
| and |
| .B max |
| for the multithreaded case, so it is recommended to use |
| .B 0 |
| instead of |
| .B max |
| until the details have been decided. |
| .RE |
| .IP "" |
| See also the section |
| .BR "Memory usage" . |
| .TP |
| .BI \-\-memlimit\-decompress= limit |
| Set a memory usage limit for decompression. |
| This also affects the |
| .B \-\-list |
| mode. |
| If the operation is not possible without exceeding the |
| .IR limit , |
| .B xz |
| will display an error and decompressing the file will fail. |
| See |
| .BI \-\-memlimit\-compress= limit |
| for possible ways to specify the |
| .IR limit . |
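| .IP "" |
| The same behavior can be observed through the |
| .B memlimit |
| argument of the Python standard library |
| .B lzma |
| module, which passes the limit on to liblzma. |
| The sketch is an illustration only; the 1\ MiB and 64\ MiB limits |
| are arbitrary values chosen to fall below and above |
| the roughly 9\ MiB that the table lists for preset 6. |

```python
import lzma

blob = lzma.compress(b"memory limit demo\n" * 100, preset=6)

# Preset 6 uses an 8 MiB dictionary, so 1 MiB is not enough to decode it.
try:
    lzma.decompress(blob, memlimit=1 * 2**20)
    too_small_failed = False
except lzma.LZMAError as err:
    too_small_failed = True
    print("decompression refused:", err)

# A generous limit lets the same data through.
restored = lzma.decompress(blob, memlimit=64 * 2**20)
print(len(restored))  # 1800
```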
| .TP |
| \fB\-M\fR \fIlimit\fR, \fB\-\-memlimit=\fIlimit\fR, \fB\-\-memory=\fIlimit |
| This is equivalent to specifying \fB\-\-memlimit\-compress=\fIlimit |
| \fB\-\-memlimit\-decompress=\fIlimit\fR. |
| .TP |
| .B \-\-no\-adjust |
| Display an error and exit if the compression settings exceed |
| the memory usage limit. |
| The default is to adjust the settings downwards so |
| that the memory usage limit is not exceeded. |
| Automatic adjusting is always disabled when creating raw streams |
| .RB ( \-\-format=raw ). |
| .TP |
| \fB\-T\fR \fIthreads\fR, \fB\-\-threads=\fIthreads |
| Specify the number of worker threads to use. |
| Setting |
| .I threads |
| to a special value |
| .B 0 |
| makes |
| .B xz |
| use as many threads as there are CPU cores on the system. |
| The actual number of threads can be less than |
| .I threads |
| if the input file is not big enough |
| for threading with the given settings or |
| if using more threads would exceed the memory usage limit. |
| .IP "" |
| Currently the only threading method is to split the input into |
| blocks and compress them independently from each other. |
| The default block size depends on the compression level and |
| can be overridden with the |
| .BI \-\-block\-size= size |
| option. |
| . |
| .SS "Custom compressor filter chains" |
| A custom filter chain allows specifying |
| the compression settings in detail instead of relying on |
| the settings associated with the presets. |
| When a custom filter chain is specified, |
| preset options (\fB\-0\fR ... \fB\-9\fR and \fB\-\-extreme\fR) |
| earlier on the command line are forgotten. |
| If a preset option is specified |
| after one or more custom filter chain options, |
| the new preset takes effect and |
| the custom filter chain options specified earlier are forgotten. |
| .PP |
| A filter chain is comparable to piping on the command line. |
| When compressing, the uncompressed input goes to the first filter, |
| whose output goes to the next filter (if any). |
| The output of the last filter gets written to the compressed file. |
| The maximum number of filters in the chain is four, |
| but typically a filter chain has only one or two filters. |
| .PP |
| Many filters have limitations on where they can be |
| in the filter chain: |
| some filters can work only as the last filter in the chain, |
| some only as a non-last filter, and some work in any position |
| in the chain. |
| Depending on the filter, this limitation is either inherent to |
| the filter design or exists to prevent security issues. |
| .PP |
| A custom filter chain is specified by using one or more |
| filter options in the order they are wanted in the filter chain. |
| That is, the order of filter options is significant! |
| When decoding raw streams |
| .RB ( \-\-format=raw ), |
| the filter chain is specified in the same order as |
| it was specified when compressing. |
| .PP |
| Filters take filter-specific |
| .I options |
| as a comma-separated list. |
| Extra commas in |
| .I options |
| are ignored. |
| Every option has a default value, so you need to |
| specify only those you want to change. |
| .PP |
| To see the whole filter chain and |
| .IR options , |
| use |
| .B "xz \-vv" |
| (that is, use |
| .B \-\-verbose |
| twice). |
| This also works for viewing the filter chain options used by presets. |
| .TP |
| \fB\-\-lzma1\fR[\fB=\fIoptions\fR] |
| .PD 0 |
| .TP |
| \fB\-\-lzma2\fR[\fB=\fIoptions\fR] |
| .PD |
| Add LZMA1 or LZMA2 filter to the filter chain. |
| These filters can be used only as the last filter in the chain. |
| .IP "" |
| LZMA1 is a legacy filter, |
| which is supported almost solely due to the legacy |
| .B .lzma |
| file format, which supports only LZMA1. |
| LZMA2 is an updated |
| version of LZMA1 to fix some practical issues of LZMA1. |
| The |
| .B .xz |
| format uses LZMA2 and doesn't support LZMA1 at all. |
| Compression speed and ratios of LZMA1 and LZMA2 |
| are practically the same. |
| .IP "" |
| LZMA1 and LZMA2 share the same set of |
| .IR options : |
| .RS |
| .TP |
| .BI preset= preset |
| Reset all LZMA1 or LZMA2 |
| .I options |
| to |
| .IR preset . |
| .I Preset |
| consists of an integer, which may be followed by single-letter |
| preset modifiers. |
| The integer can be from |
| .B 0 |
| to |
| .BR 9 , |
| matching the command line options \fB\-0\fR ... \fB\-9\fR. |
| The only supported modifier is currently |
| .BR e , |
| which matches |
| .BR \-\-extreme . |
| If no |
| .B preset |
| is specified, the default values of LZMA1 or LZMA2 |
| .I options |
| are taken from the preset |
| .BR 6 . |
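| .IP "" |
| As an illustration only, the |
| .I preset |
| syntax described above can be checked with a short Python sketch. |
| The function name |
| .B parse_preset |
| is hypothetical and is not part of |
| .B xz |
| or liblzma: |
| .nf |
| .ft CW |
```python
def parse_preset(preset):
    """Split a preset string such as "6" or "9e" into (level, extreme)."""
    if not preset or preset[0] not in "0123456789":
        raise ValueError("preset must start with an integer from 0 to 9")
    level = int(preset[0])
    modifiers = preset[1:]
    # The text names "e" as the only supported modifier at present.
    for modifier in modifiers:
        if modifier != "e":
            raise ValueError("unsupported preset modifier: " + modifier)
    return level, "e" in modifiers
```
| .ft R |
| .fi |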
| .TP |
| .BI dict= size |
| Dictionary (history buffer) |
| .I size |
| indicates how many bytes of the recently processed |
| uncompressed data are kept in memory. |
| The algorithm tries to find repeating byte sequences (matches) in |
| the uncompressed data, and replace them with references |
| to the data currently in the dictionary. |
| The bigger the dictionary, the higher is the chance |
| to find a match. |
| Thus, increasing dictionary |
| .I size |
| usually improves compression ratio, but |
| a dictionary bigger than the uncompressed file is a waste of memory. |
| .IP "" |
| Typical dictionary |
| .I size |
| is from 64\ KiB to 64\ MiB. |
| The minimum is 4\ KiB. |
| The maximum for compression is currently 1.5\ GiB (1536\ MiB). |
| The decompressor already supports dictionaries up to |
| one byte less than 4\ GiB, which is the maximum for |
| the LZMA1 and LZMA2 stream formats. |
| .IP "" |
| Dictionary |
| .I size |
| and match finder |
| .RI ( mf ) |
| together determine the memory usage of the LZMA1 or LZMA2 encoder. |
| Decompressing requires a dictionary at least as big as the |
| .I size |
| that was used when compressing, |
| thus the memory usage of the decoder is determined |
| by the dictionary size used when compressing. |
| The |
| .B .xz |
| headers store the dictionary |
| .I size |
| either as |
| .RI "2^" n |
| or |
| .RI "2^" n " + 2^(" n "\-1)," |
| so these |
| .I sizes |
| are somewhat preferred for compression. |
| Other |
| .I sizes |
| will get rounded up when stored in the |
| .B .xz |
| headers. |
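| .IP "" |
| The rounding described above can be sketched in Python. |
| This is only an illustration of the rule as stated here; |
| the function name |
| .B stored_dict_size |
| is hypothetical and the actual header encoding in liblzma |
| is not shown: |
| .nf |
| .ft CW |
```python
def stored_dict_size(size):
    """Round size up to the nearest 2^n or 2^n + 2^(n-1)."""
    if size < 1:
        raise ValueError("size must be positive")
    power = 1
    while True:
        # Candidate of the form 2^n.
        if size <= power:
            return power
        # Candidate of the form 2^n + 2^(n-1).
        if size <= power + power // 2:
            return power + power // 2
        power *= 2

MIB = 1024 * 1024
print(stored_dict_size(10 * MIB) // MIB)  # 12
```
| .ft R |
| .fi |
| .IP "" |
| With this rule, a 10\ MiB dictionary would be stored as 12\ MiB, |
| while 8\ MiB and 12\ MiB would be stored unchanged. |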
| .TP |
| .BI lc= lc |
| Specify the number of literal context bits. |
| The minimum is 0 and the maximum is 4; the default is 3. |
| In addition, the sum of |
| .I lc |
| and |
| .I lp |
| must not exceed 4. |
| .IP "" |
| All bytes that cannot be encoded as matches |
| are encoded as literals. |
| That is, literals are simply 8-bit bytes |
| that are encoded one at a time. |
| .IP "" |
| The literal coding makes an assumption that the highest |
| .I lc |
| bits of the previous uncompressed byte correlate |
| with the next byte. |
| E.g. in typical English text, an upper-case letter is |
| often followed by a lower-case letter, and a lower-case |
| letter is usually followed by another lower-case letter. |
| In the US-ASCII character set, the highest three bits are 010 |
| for upper-case letters and 011 for lower-case letters. |
| When |
| .I lc |
| is at least 3, the literal coding can take advantage of |
| this property in the uncompressed data. |
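| .IP "" |
| The US-ASCII property mentioned above can be verified with |
| a small Python sketch. |
| The helper name |
| .B literal_context |
| is hypothetical; it only extracts the highest |
| .I lc |
| bits of a byte and does not reproduce the encoder itself: |
| .nf |
| .ft CW |
```python
def literal_context(previous_byte, lc=3):
    """Return the highest lc bits of the previous uncompressed byte."""
    if not 0 <= lc <= 4:
        raise ValueError("lc must be from 0 to 4")
    return previous_byte >> (8 - lc)

# Upper-case letters share the bits 010 and lower-case letters 011.
upper = {literal_context(ord(c)) for c in "ABCXYZ"}
lower = {literal_context(ord(c)) for c in "abcxyz"}
print(upper, lower)  # {2} {3}
```
| .ft R |
| .fi |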
| .IP "" |
| The default value (3) is usually good. |
| If you want maximum compression, test |
| .BR lc=4 . |
| Sometimes it helps a little, and |
| sometimes it makes compression worse. |
| If it makes it worse, test e.g.\& |
| .B lc=2 |
| too. |
| .TP |
| .BI lp= lp |
| Specify the number of literal position bits. |
| The minimum is 0 and the maximum is 4; the default is 0. |
| .IP "" |
| .I Lp |
| affects what kind of alignment in the uncompressed data is |
| assumed when encoding literals. |
| See |
| .I pb |
| below for more information about alignment. |
| .TP |
| .BI pb= pb |
| Specify the number of position bits. |
| The minimum is 0 and the maximum is 4; the default is 2. |
| .IP "" |
| .I Pb |
| affects what kind of alignment in the uncompressed data is |
| assumed in general. |
| The default means four-byte alignment |
| .RI (2^ pb =2^2=4), |
| which is often a good choice when there's no better guess. |
| .IP "" |
| When the alignment is known, setting |
| .I pb |
| accordingly may reduce the file size a little. |
| E.g. with text files having one-byte |
| alignment (US-ASCII, ISO-8859-*, UTF-8), setting |
| .B pb=0 |
| can improve compression slightly. |
| For UTF-16 text, |
| .B pb=1 |
| is a good choice. |
| If the alignment is an odd number like 3 bytes, |
| .B pb=0 |
| might be the best choice. |
| .IP "" |
| Even though the assumed alignment can be adjusted with |
| .I pb |
| and |
| .IR lp , |
| LZMA1 and LZMA2 still slightly favor 16-byte alignment. |
| It might be worth taking into account when designing file formats |
| that are likely to be often compressed with LZMA1 or LZMA2. |
| .TP |
| .BI mf= mf |
| Match finder has a major effect on encoder speed, |
| memory usage, and compression ratio. |
| Usually Hash Chain match finders are faster than Binary Tree |
| match finders. |
| The default depends on the |
| .IR preset : |
| 0 uses |
| .BR hc3 , |
| 1\-3 |
| use |
| .BR hc4 , |
| and the rest use |
| .BR bt4 . |
| .IP "" |
| The following match finders are supported. |
| The memory usage formulas below are rough approximations, |
| which are closest to the reality when |
| .I dict |
| is a power of two. |
| .RS |
| .TP |
| .B hc3 |
| Hash Chain with 2- and 3-byte hashing |
| .br |
| Minimum value for |
| .IR nice : |
| 3 |
| .br |
| Memory usage: |
| .br |
| .I dict |
| * 7.5 (if |
| .I dict |
| <= 16 MiB); |
| .br |
| .I dict |
| * 5.5 + 64 MiB (if |
| .I dict |
| > 16 MiB) |
| .TP |
| .B hc4 |
| Hash Chain with 2-, 3-, and 4-byte hashing |
| .br |
| Minimum value for |
| .IR nice : |
| 4 |
| .br |
| Memory usage: |
| .br |
| .I dict |
| * 7.5 (if |
| .I dict |
| <= 32 MiB); |
| .br |
| .I dict |
| * 6.5 (if |
| .I dict |
| > 32 MiB) |
| .TP |
| .B bt2 |
| Binary Tree with 2-byte hashing |
| .br |
| Minimum value for |
| .IR nice : |
| 2 |
| .br |
| Memory usage: |
| .I dict |
| * 9.5 |
| .TP |
| .B bt3 |
| Binary Tree with 2- and 3-byte hashing |
| .br |
| Minimum value for |
| .IR nice : |
| 3 |
| .br |
| Memory usage: |
| .br |
| .I dict |
| * 11.5 (if |
| .I dict |
| <= 16 MiB); |
| .br |
| .I dict |
| * 9.5 + 64 MiB (if |
| .I dict |
| > 16 MiB) |
| .TP |
| .B bt4 |
| Binary Tree with 2-, 3-, and 4-byte hashing |
| .br |
| Minimum value for |
| .IR nice : |
| 4 |
| .br |
| Memory usage: |
| .br |
| .I dict |
| * 11.5 (if |
| .I dict |
| <= 32 MiB); |
| .br |
| .I dict |
| * 10.5 (if |
| .I dict |
| > 32 MiB) |
| .RE |
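| .IP "" |
| The formulas above can be collected into a small Python helper |
| for quick estimates. |
| The name |
| .B encoder_memory_estimate |
| is hypothetical, and the results are only as accurate as |
| the rough approximations listed above: |
| .nf |
| .ft CW |
```python
MIB = 1024 * 1024

def encoder_memory_estimate(mf, dict_size):
    """Approximate encoder memory usage in bytes for a match finder."""
    if mf == "hc3":
        if dict_size <= 16 * MIB:
            return dict_size * 7.5
        return dict_size * 5.5 + 64 * MIB
    if mf == "hc4":
        return dict_size * (7.5 if dict_size <= 32 * MIB else 6.5)
    if mf == "bt2":
        return dict_size * 9.5
    if mf == "bt3":
        if dict_size <= 16 * MIB:
            return dict_size * 11.5
        return dict_size * 9.5 + 64 * MIB
    if mf == "bt4":
        return dict_size * (11.5 if dict_size <= 32 * MIB else 10.5)
    raise ValueError("unknown match finder: " + mf)

print(encoder_memory_estimate("bt4", 8 * MIB) / MIB)  # 92.0
```
| .ft R |
| .fi |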
| .TP |
| .BI mode= mode |
| Compression |
| .I mode |
| specifies the method to analyze |
| the data produced by the match finder. |
| Supported |
| .I modes |
| are |
| .B fast |
| and |
| .BR normal . |
| The default is |
| .B fast |
| for |
| .I presets |
| 0\-3 and |
| .B normal |
| for |
| .I presets |
| 4\-9. |
| .IP "" |
| Usually |
| .B fast |
| is used with Hash Chain match finders and |
| .B normal |
| with Binary Tree match finders. |
| This is also what the |
| .I presets |
| do. |
| .TP |
| .BI nice= nice |
| Specify what is considered to be a nice length for a match. |
| Once a match of at least |
| .I nice |
| bytes is found, the algorithm stops |
| looking for possibly better matches. |
| .IP "" |
| .I Nice |
| can be 2\-273 bytes. |
| Higher values tend to give better compression ratio |
| at the expense of speed. |
| The default depends on the |
| .IR preset . |
| .TP |
| .BI depth= depth |
| Specify the maximum search depth in the match finder. |
| The default is the special value of 0, |
| which makes the compressor determine a reasonable |
| .I depth |
| from |
| .I mf |
| and |
| .IR nice . |
| .IP "" |
| Reasonable |
| .I depth |
| for Hash Chains is 4\-100 and 16\-1000 for Binary Trees. |
| Using very high values for |
| .I depth |
| can make the encoder extremely slow with some files. |
| Avoid setting the |
| .I depth |
| over 1000 unless you are prepared to interrupt |
| the compression in case it is taking far too long. |
| .RE |
| .IP "" |
| When decoding raw streams |
| .RB ( \-\-format=raw ), |
| LZMA2 needs only the dictionary |
| .IR size . |
| LZMA1 needs also |
| .IR lc , |
| .IR lp , |
| and |
| .IR pb . |
| .TP |
| \fB\-\-x86\fR[\fB=\fIoptions\fR] |
| .PD 0 |
| .TP |
| \fB\-\-powerpc\fR[\fB=\fIoptions\fR] |
| .TP |
| \fB\-\-ia64\fR[\fB=\fIoptions\fR] |
| .TP |
| \fB\-\-arm\fR[\fB=\fIoptions\fR] |
| .TP |
| \fB\-\-armthumb\fR[\fB=\fIoptions\fR] |
| .TP |
| \fB\-\-sparc\fR[\fB=\fIoptions\fR] |
| .PD |
| Add a branch/call/jump (BCJ) filter to the filter chain. |
| These filters can be used only as a non-last filter |
| in the filter chain. |
| .IP "" |
| A BCJ filter converts relative addresses in |
| the machine code to their absolute counterparts. |
| This doesn't change the size of the data, |
| but it increases redundancy, |
| which can help LZMA2 to produce a 0\-15\ % smaller |
| .B .xz |
| file. |
| The BCJ filters are always reversible, |
| so using a BCJ filter for the wrong type of data |
| doesn't cause any data loss, although it may make |
| the compression ratio slightly worse. |
| .IP "" |
| It is fine to apply a BCJ filter on a whole executable; |
| there's no need to apply it only on the executable section. |
| Applying a BCJ filter on an archive that contains both executable |
| and non-executable files may or may not give good results, |
| so it generally isn't good to blindly apply a BCJ filter when |
| compressing binary packages for distribution. |
| .IP "" |
| These BCJ filters are very fast and |
| use an insignificant amount of memory. |
| If a BCJ filter improves compression ratio of a file, |
| it can improve decompression speed at the same time. |
| This is because, on the same hardware, |
| the decompression speed of LZMA2 is roughly |
| a fixed number of bytes of compressed data per second. |
| .IP "" |
| These BCJ filters have known problems related to |
| the compression ratio: |
| .RS |
| .IP \(bu 3 |
| Some types of files containing executable code |
| (e.g. object files, static libraries, and Linux kernel modules) |
| have the addresses in the instructions filled with filler values. |
| These BCJ filters will still do the address conversion, |
| which will make the compression worse with these files. |
| .IP \(bu 3 |
| Applying a BCJ filter on an archive containing multiple similar |
| executables can make the compression ratio worse than not using |
| a BCJ filter. |
| This is because the BCJ filter doesn't detect the boundaries |
| of the executable files, and doesn't reset |
| the address conversion counter for each executable. |
| .RE |
| .IP "" |
| Both of the above problems will be fixed |
| in the future in a new filter. |
| The old BCJ filters will still be useful in embedded systems, |
| because the decoder of the new filter will be bigger |
| and use more memory. |
| .IP "" |
| Different instruction sets have different alignment: |
| .RS |
| .RS |
| .PP |
| .TS |
| tab(;); |
| l n l |
| l n l. |
| Filter;Alignment;Notes |
| x86;1;32-bit or 64-bit x86 |
| PowerPC;4;Big endian only |
| ARM;4;Little endian only |
| ARM-Thumb;2;Little endian only |
| IA-64;16;Big or little endian |
| SPARC;4;Big or little endian |
| .TE |
| .RE |
| .RE |
| .IP "" |
| Since the BCJ-filtered data is usually compressed with LZMA2, |
| the compression ratio may be improved slightly if |
| the LZMA2 options are set to match the |
| alignment of the selected BCJ filter. |
| For example, with the IA-64 filter, it's good to set |
| .B pb=4 |
| with LZMA2 (2^4=16). |
| The x86 filter is an exception; |
| it's usually good to stick to LZMA2's default |
| four-byte alignment when compressing x86 executables. |
| .IP "" |
| All BCJ filters support the same |
| .IR options : |
| .RS |
| .TP |
| .BI start= offset |
| Specify the start |
| .I offset |
| that is used when converting between relative |
| and absolute addresses. |
| The |
| .I offset |
| must be a multiple of the alignment of the filter |
| (see the table above). |
| The default is zero. |
| In practice, the default is good; specifying a custom |
| .I offset |
| is almost never useful. |
| .RE |
| .TP |
| \fB\-\-delta\fR[\fB=\fIoptions\fR] |
| Add the Delta filter to the filter chain. |
| The Delta filter can be used only as a non-last filter |
| in the filter chain. |
| .IP "" |
| Currently only simple byte-wise delta calculation is supported. |
| It can be useful when compressing e.g. uncompressed bitmap images |
| or uncompressed PCM audio. |
| However, special purpose algorithms may give significantly better |
| results than Delta + LZMA2. |
| This is true especially with audio, |
| which compresses faster and better e.g. with |
| .BR flac (1). |
| .IP "" |
| Supported |
| .IR options : |
| .RS |
| .TP |
| .BI dist= distance |
| Specify the |
| .I distance |
| of the delta calculation in bytes. |
| .I distance |
| must be 1\-256. |
| The default is 1. |
| .IP "" |
| For example, with |
| .B dist=2 |
| and eight-byte input A1 B1 A2 B3 A3 B5 A4 B7, the output will be |
| A1 B1 01 02 01 02 01 02. |
| .RE |
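| .IP "" |
| The example above can be reproduced with a minimal Python sketch |
| of a byte-wise delta calculation. |
| It assumes that the bytes before the start of the input are |
| treated as zeros, which is consistent with the first two bytes |
| being unchanged in the example; |
| the function names are hypothetical and this is not the liblzma code: |
| .nf |
| .ft CW |
```python
def delta_encode(data, dist=1):
    """Subtract from each byte the byte dist positions earlier."""
    if not 1 <= dist <= 256:
        raise ValueError("dist must be from 1 to 256")
    out = bytearray()
    for i, byte in enumerate(data):
        # Assumption: history before the input start is all zeros.
        previous = data[i - dist] if i >= dist else 0
        out.append((byte - previous) & 0xFF)
    return bytes(out)

def delta_decode(data, dist=1):
    """Reverse delta_encode by adding back the earlier decoded byte."""
    if not 1 <= dist <= 256:
        raise ValueError("dist must be from 1 to 256")
    out = bytearray()
    for i, byte in enumerate(data):
        previous = out[i - dist] if i >= dist else 0
        out.append((byte + previous) & 0xFF)
    return bytes(out)

encoded = delta_encode(bytes.fromhex("A1B1A2B3A3B5A4B7"), dist=2)
print(encoded.hex().upper())  # A1B1010201020102
```
| .ft R |
| .fi |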
| . |
| .SS "Other options" |
| .TP |
| .BR \-q ", " \-\-quiet |
| Suppress warnings and notices. |
| Specify this twice to suppress errors too. |
| This option has no effect on the exit status. |
| That is, even if a warning was suppressed, |
| the exit status to indicate a warning is still used. |
| .TP |
| .BR \-v ", " \-\-verbose |
| Be verbose. |
| If standard error is connected to a terminal, |
| .B xz |
| will display a progress indicator. |
| Specifying |
| .B \-\-verbose |
| twice will give even more verbose output. |
| .IP "" |
| The progress indicator shows the following information: |
| .RS |
| .IP \(bu 3 |
| Completion percentage is shown |
| if the size of the input file is known. |
| That is, the percentage cannot be shown in pipes. |
| .IP \(bu 3 |
| Amount of compressed data produced (compressing) |
| or consumed (decompressing). |
| .IP \(bu 3 |
| Amount of uncompressed data consumed (compressing) |
| or produced (decompressing). |
| .IP \(bu 3 |
| Compression ratio, which is calculated by dividing |
| the amount of compressed data processed so far by |
| the amount of uncompressed data processed so far. |
| .IP \(bu 3 |
| Compression or decompression speed. |
| This is measured as the amount of uncompressed data consumed |
| (compression) or produced (decompression) per second. |
| It is shown after a few seconds have passed since |
| .B xz |
| started processing the file. |
| .IP \(bu 3 |
| Elapsed time in the format M:SS or H:MM:SS. |
| .IP \(bu 3 |
| Estimated remaining time is shown |
| only when the size of the input file is |
| known and a couple of seconds have already passed since |
| .B xz |
| started processing the file. |
| The time is shown in a less precise format which |
| never has any colons, e.g. 2 min 30 s. |
| .RE |
| .IP "" |
| When standard error is not a terminal, |
| .B \-\-verbose |
| will make |
| .B xz |
| print the filename, compressed size, uncompressed size, |
| compression ratio, and possibly also the speed and elapsed time |
| on a single line to standard error after compressing or |
| decompressing the file. |
| The speed and elapsed time are included only when |
| the operation took at least a few seconds. |
| If the operation didn't finish, e.g. due to user interruption, |
| also the completion percentage is printed |
| if the size of the input file is known. |
| .TP |
| .BR \-Q ", " \-\-no\-warn |
| Don't set the exit status to 2 |
| even if a condition worth a warning was detected. |
| This option doesn't affect the verbosity level, thus both |
| .B \-\-quiet |
| and |
| .B \-\-no\-warn |
| have to be used to not display warnings and |
| to not alter the exit status. |
| .TP |
| .B \-\-robot |
| Print messages in a machine-parsable format. |
| This is intended to ease writing frontends that want to use |
| .B xz |
| instead of liblzma, which may be the case with various scripts. |
| The output with this option enabled is meant to be stable across |
| .B xz |
| releases. |
| See the section |
| .B "ROBOT MODE" |
| for details. |
| .TP |
| .BR \-\-info\-memory |
| Display, in human-readable format, how much physical memory (RAM) |
| .B xz |
| thinks the system has and the memory usage limits for compression |
| and decompression, and exit successfully. |
| .TP |
| .BR \-h ", " \-\-help |
| Display a help message describing the most commonly used options, |
| and exit successfully. |
| .TP |
| .BR \-H ", " \-\-long\-help |
| Display a help message describing all features of |
| .BR xz , |
| and exit successfully. |
| .TP |
| .BR \-V ", " \-\-version |
| Display the version number of |
| .B xz |
| and liblzma in human-readable format. |
| To get machine-parsable output, specify |
| .B \-\-robot |
| before |
| .BR \-\-version . |
| . |
| .SH "ROBOT MODE" |
| The robot mode is activated with the |
| .B \-\-robot |
| option. |
| It makes the output of |
| .B xz |
| easier to parse by other programs. |
| Currently |
| .B \-\-robot |
| is supported only together with |
| .BR \-\-version , |
| .BR \-\-info\-memory , |
| and |
| .BR \-\-list . |
| It will be supported for compression and |
| decompression in the future. |
| . |
| .SS Version |
| .B "xz \-\-robot \-\-version" |
| will print the version number of |
| .B xz |
| and liblzma in the following format: |
| .PP |
| .BI XZ_VERSION= XYYYZZZS |
| .br |
| .BI LIBLZMA_VERSION= XYYYZZZS |
| .TP |
| .I X |
| Major version. |
| .TP |
| .I YYY |
| Minor version. |
| Even numbers are stable. |
| Odd numbers are alpha or beta versions. |
| .TP |
| .I ZZZ |
| Patch level for stable releases or |
| just a counter for development releases. |
| .TP |
| .I S |
| Stability. |
| 0 is alpha, 1 is beta, and 2 is stable. |
| .I S |
| should always be 2 when |
| .I YYY |
| is even. |
| .PP |
| .I XYYYZZZS |
| are the same on both lines if |
| .B xz |
| and liblzma are from the same XZ Utils release. |
| .PP |
| Examples: 4.999.9beta is |
| .B 49990091 |
| and |
| 5.0.0 is |
| .BR 50000002 . |
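| .PP |
| A frontend could decode the |
| .I XYYYZZZS |
| value along the following lines. |
| This Python sketch is only an illustration; |
| the function names are hypothetical, and it assumes that the value |
| is always exactly eight decimal digits as described above: |
| .nf |
| .ft CW |
```python
STABILITY = {"0": "alpha", "1": "beta", "2": "stable"}

def parse_robot_version(value):
    """Split XYYYZZZS into (major, minor, patch, stability)."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        raise ValueError("expected eight decimal digits (XYYYZZZS)")
    if value[7] not in STABILITY:
        raise ValueError("unknown stability digit: " + value[7])
    major = int(value[0])
    minor = int(value[1:4])
    patch = int(value[4:7])
    return major, minor, patch, STABILITY[value[7]]

def format_version(value):
    """Render XYYYZZZS as a dotted version string."""
    major, minor, patch, stability = parse_robot_version(value)
    suffix = "" if stability == "stable" else stability
    return "{}.{}.{}{}".format(major, minor, patch, suffix)

print(format_version("49990091"))  # 4.999.9beta
print(format_version("50000002"))  # 5.0.0
```
| .ft R |
| .fi |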
| . |
| .SS "Memory limit information" |
| .B "xz \-\-robot \-\-info\-memory" |
| prints a single line with three tab-separated columns: |
| .IP 1. 4 |
| Total amount of physical memory (RAM) in bytes |
| .IP 2. 4 |
| Memory usage limit for compression in bytes. |
| A special value of zero indicates the default setting, |
| which for single-threaded mode is the same as no limit. |
| .IP 3. 4 |
| Memory usage limit for decompression in bytes. |
| A special value of zero indicates the default setting, |
| which for single-threaded mode is the same as no limit. |
| .PP |
| In the future, the output of |
| .B "xz \-\-robot \-\-info\-memory" |
| may have more columns, but never more than a single line. |
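| .PP |
| Because more columns may be added later, |
| a parser should read only the columns it knows about. |
| The following Python sketch shows one way to do that; |
| the function name and the dictionary keys are hypothetical: |
| .nf |
| .ft CW |
```python
def parse_info_memory(line):
    """Parse the single output line into a dict of byte counts."""
    # All columns are numbers, so splitting on any whitespace
    # (which includes tabs) is sufficient here.
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected at least three columns")
    # Ignore any extra columns that future versions may add.
    total, compress_limit, decompress_limit = (int(f) for f in fields[:3])
    return {
        "total_ram": total,
        # Zero indicates the default setting for both limits.
        "compress_limit": compress_limit,
        "decompress_limit": decompress_limit,
    }
```
| .ft R |
| .fi |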
| . |
| .SS "List mode" |
| .B "xz \-\-robot \-\-list" |
| uses tab-separated output. |
| The first column of every line has a string |
| that indicates the type of the information found on that line: |
| .TP |
| .B name |
| This is always the first line when starting to list a file. |
| The second column on the line is the filename. |
| .TP |
| .B file |
| This line contains overall information about the |
| .B .xz |
| file. |
| This line is always printed after the |
| .B name |
| line. |
| .TP |
| .B stream |
| This line type is used only when |
| .B \-\-verbose |
| was specified. |
| There are as many |
| .B stream |
| lines as there are streams in the |
| .B .xz |
| file. |
| .TP |
| .B block |
| This line type is used only when |
| .B \-\-verbose |
| was specified. |
| There are as many |
| .B block |
| lines as there are blocks in the |
| .B .xz |
| file. |
| The |
| .B block |
| lines are shown after all the |
| .B stream |
| lines; different line types are not interleaved. |
| .TP |
| .B summary |
| This line type is used only when |
| .B \-\-verbose |
| was specified twice. |
| This line is printed after all |
| .B block |
| lines. |
| Like the |
| .B file |
| line, the |
| .B summary |
| line contains overall information about the |
| .B .xz |
| file. |
| .TP |
| .B totals |
| This line is always the very last line of the list output. |
| It shows the total counts and sizes. |
| .PP |
| The columns of the |
| .B file |
| lines: |
| .PD 0 |
| .RS |
| .IP 2. 4 |
| Number of streams in the file |
| .IP 3. 4 |
| Total number of blocks in the stream(s) |
| .IP 4. 4 |
| Compressed size of the file |
| .IP 5. 4 |
| Uncompressed size of the file |
| .IP 6. 4 |
| Compression ratio, for example |
| .BR 0.123 . |
| If the ratio is over 9.999, three dashes |
| .RB ( \-\-\- ) |
| are displayed instead of the ratio. |
| .IP 7. 4 |
| Comma-separated list of integrity check names. |
| The following strings are used for the known check types: |
| .BR None , |
| .BR CRC32 , |
| .BR CRC64 , |
| and |
| .BR SHA\-256 . |
| For unknown check types, |
| .BI Unknown\- N |
| is used, where |
| .I N |
| is the Check ID as a decimal number (one or two digits). |
| .IP 8. 4 |
| Total size of stream padding in the file |
| .RE |
| .PD |
| .PP |
| The columns of the |
| .B stream |
| lines: |
| .PD 0 |
| .RS |
| .IP 2. 4 |
| Stream number (the first stream is 1) |
| .IP 3. 4 |
| Number of blocks in the stream |
| .IP 4. 4 |
| Compressed start offset |
| .IP 5. 4 |
| Uncompressed start offset |
| .IP 6. 4 |
| Compressed size (does not include stream padding) |
| .IP 7. 4 |
| Uncompressed size |
| .IP 8. 4 |
| Compression ratio |
| .IP 9. 4 |
| Name of the integrity check |
| .IP 10. 4 |
| Size of stream padding |
| .RE |
| .PD |
| .PP |
| The columns of the |
| .B block |
| lines: |
| .PD 0 |
| .RS |
| .IP 2. 4 |
| Number of the stream containing this block |
| .IP 3. 4 |
| Block number relative to the beginning of the stream |
| (the first block is 1) |
| .IP 4. 4 |
| Block number relative to the beginning of the file |
| .IP 5. 4 |
| Compressed start offset relative to the beginning of the file |
| .IP 6. 4 |
| Uncompressed start offset relative to the beginning of the file |
| .IP 7. 4 |
| Total compressed size of the block (includes headers) |
| .IP 8. 4 |
| Uncompressed size |
| .IP 9. 4 |
| Compression ratio |
| .IP 10. 4 |
| Name of the integrity check |
| .RE |
| .PD |
| .PP |
| If |
| .B \-\-verbose |
| was specified twice, additional columns are included on the |
| .B block |
| lines. |
| These are not displayed with a single |
| .BR \-\-verbose , |
| because getting this information requires many seeks |
| and can thus be slow: |
| .PD 0 |
| .RS |
| .IP 11. 4 |
| Value of the integrity check in hexadecimal |
| .IP 12. 4 |
| Block header size |
| .IP 13. 4 |
| Block flags: |
| .B c |
| indicates that compressed size is present, and |
| .B u |
| indicates that uncompressed size is present. |
| If the flag is not set, a dash |
| .RB ( \- ) |
| is shown instead to keep the string length fixed. |
| New flags may be added to the end of the string in the future. |
| .IP 14. 4 |
| Size of the actual compressed data in the block (this excludes |
| the block header, block padding, and check fields) |
| .IP 15. 4 |
| Amount of memory (in bytes) required to decompress |
| this block with this |
| .B xz |
| version |
| .IP 16. 4 |
| Filter chain. |
| Note that most of the options used at compression time |
| cannot be known, because only the options |
| that are needed for decompression are stored in the |
| .B .xz |
| headers. |
| .RE |
| .PD |
| .PP |
| The columns of the |
| .B summary |
| lines: |
| .PD 0 |
| .RS |
| .IP 2. 4 |
| Amount of memory (in bytes) required to decompress |
| this file with this |
| .B xz |
| version |
| .IP 3. 4 |
| .B yes |
| or |
| .B no |
| indicating if all block headers have both compressed size and |
| uncompressed size stored in them |
| .PP |
| .I Since |
| .B xz |
| .I 5.1.2alpha: |
| .IP 4. 4 |
| Minimum |
| .B xz |
| version required to decompress the file |
| .RE |
| .PD |
| .PP |
| The columns of the |
| .B totals |
| line: |
| .PD 0 |
| .RS |
| .IP 2. 4 |
| Number of streams |
| .IP 3. 4 |
| Number of blocks |
| .IP 4. 4 |
| Compressed size |
| .IP 5. 4 |
| Uncompressed size |
| .IP 6. 4 |
| Average compression ratio |
| .IP 7. 4 |
| Comma-separated list of integrity check names |
| that were present in the files |
| .IP 8. 4 |
| Stream padding size |
| .IP 9. 4 |
| Number of files. |
| This is here to |
| keep the order of the earlier columns the same as on |
| .B file |
| lines. |
| .PD |
| .RE |
| .PP |
| If |
| .B \-\-verbose |
| was specified twice, additional columns are included on the |
| .B totals |
| line: |
| .PD 0 |
| .RS |
| .IP 10. 4 |
| Maximum amount of memory (in bytes) required to decompress |
| the files with this |
| .B xz |
| version |
| .IP 11. 4 |
| .B yes |
| or |
| .B no |
| indicating if all block headers have both compressed size and |
| uncompressed size stored in them |
| .PP |
| .I Since |
| .B xz |
| .I 5.1.2alpha: |
| .IP 12. 4 |
| Minimum |
| .B xz |
| version required to decompress the file |
| .RE |
| .PD |
| .PP |
| Future versions may add new line types and |
| new columns can be added to the existing line types, |
| but the existing columns won't be changed. |
| . |
| .SH "EXIT STATUS" |
| .TP |
| .B 0 |
| All is good. |
| .TP |
| .B 1 |
| An error occurred. |
| .TP |
| .B 2 |
| Something worth a warning occurred, |
| but no actual errors occurred. |
| .PP |
| Notices (not warnings or errors) printed on standard error |
| don't affect the exit status. |
| . |
| .SH ENVIRONMENT |
| .B xz |
| parses space-separated lists of options |
| from the environment variables |
| .B XZ_DEFAULTS |
| and |
| .BR XZ_OPT , |
| in this order, before parsing the options from the command line. |
| Note that only options are parsed from the environment variables; |
| all non-options are silently ignored. |
| Parsing is done with |
| .BR getopt_long (3) |
| which is used also for the command line arguments. |
| .TP |
| .B XZ_DEFAULTS |
| User-specific or system-wide default options. |
| Typically this is set in a shell initialization script to enable |
| .BR xz 's |
| memory usage limiter by default. |
| Excluding shell initialization scripts |
| and similar special cases, scripts must never set or unset |
| .BR XZ_DEFAULTS . |
| .TP |
| .B XZ_OPT |
| This is for passing options to |
| .B xz |
| when it is not possible to set the options directly on the |
| .B xz |
| command line. |
| This is the case e.g. when |
| .B xz |
| is run by a script or tool, e.g. GNU |
| .BR tar (1): |
| .RS |
| .RS |
| .PP |
| .nf |
| .ft CW |
| XZ_OPT=\-2v tar caf foo.tar.xz foo |
| .ft R |
| .fi |
| .RE |
| .RE |
| .IP "" |
| Scripts may use |
| .B XZ_OPT |
| e.g. to set script-specific default compression options. |
| It is still recommended to allow users to override |
| .B XZ_OPT |
| if that is reasonable, e.g. in |
| .BR sh (1) |
| scripts one may use something like this: |
| .RS |
| .RS |
| .PP |
| .nf |
| .ft CW |
| XZ_OPT=${XZ_OPT\-"\-7e"} |
| export XZ_OPT |
| .ft R |
| .fi |
| .RE |
| .RE |
| . |
| .SH "LZMA UTILS COMPATIBILITY" |
| The command line syntax of |
| .B xz |
| is practically a superset of |
| .BR lzma , |
| .BR unlzma , |
| and |
| .BR lzcat |
| as found in LZMA Utils 4.32.x. |
| In most cases, it is possible to replace |
| LZMA Utils with XZ Utils without breaking existing scripts. |
| There are some incompatibilities though, |
| which may sometimes cause problems. |
| . |
| .SS "Compression preset levels" |
| The numbering of the compression level presets is not identical in |
| .B xz |
| and LZMA Utils. |
| The most important difference is how dictionary sizes |
| are mapped to different presets. |
| Dictionary size is roughly equal to the decompressor memory usage. |
| .RS |
| .PP |
| .TS |
| tab(;); |
| c c c |
| c n n. |
| Level;xz;LZMA Utils |
| \-0;256 KiB;N/A |
| \-1;1 MiB;64 KiB |
| \-2;2 MiB;1 MiB |
| \-3;4 MiB;512 KiB |
| \-4;4 MiB;1 MiB |
| \-5;8 MiB;2 MiB |
| \-6;8 MiB;4 MiB |
| \-7;16 MiB;8 MiB |
| \-8;32 MiB;16 MiB |
| \-9;64 MiB;32 MiB |
| .TE |
| .RE |
| .PP |
| The dictionary size differences affect |
| the compressor memory usage too, |
| but there are some other differences between |
| LZMA Utils and XZ Utils, which |
| make the difference even bigger: |
| .RS |
| .PP |
| .TS |
| tab(;); |
| c c c |
| c n n. |
| Level;xz;LZMA Utils 4.32.x |
| \-0;3 MiB;N/A |
| \-1;9 MiB;2 MiB |
| \-2;17 MiB;12 MiB |
| \-3;32 MiB;12 MiB |
| \-4;48 MiB;16 MiB |
| \-5;94 MiB;26 MiB |
| \-6;94 MiB;45 MiB |
| \-7;186 MiB;83 MiB |
| \-8;370 MiB;159 MiB |
| \-9;674 MiB;311 MiB |
| .TE |
| .RE |
| .PP |
| The default preset level in LZMA Utils is |
| .B \-7 |
| while in XZ Utils it is |
| .BR \-6 , |
| so both use an 8 MiB dictionary by default. |
| . |
| .SS "Streamed vs. non-streamed .lzma files" |
| The uncompressed size of the file can be stored in the |
| .B .lzma |
| header. |
| LZMA Utils does that when compressing regular files. |
| The alternative is to mark that uncompressed size is unknown |
| and use end-of-payload marker to indicate |
| where the decompressor should stop. |
| LZMA Utils uses this method when uncompressed size isn't known, |
| which is the case for example in pipes. |
| .PP |
| .B xz |
| supports decompressing |
| .B .lzma |
| files with or without end-of-payload marker, but all |
| .B .lzma |
| files created by |
| .B xz |
| will use end-of-payload marker and have uncompressed size |
| marked as unknown in the |
| .B .lzma |
| header. |
| This may be a problem in some uncommon situations. |
| For example, a |
| .B .lzma |
| decompressor in an embedded device might work |
| only with files that have known uncompressed size. |
| If you hit this problem, you need to use LZMA Utils |
| or LZMA SDK to create |
| .B .lzma |
| files with known uncompressed size. |
| . |
| .SS "Unsupported .lzma files" |
| The |
| .B .lzma |
| format allows |
| .I lc |
| values up to 8, and |
| .I lp |
| values up to 4. |
| LZMA Utils can decompress files with any |
| .I lc |
| and |
| .IR lp , |
| but always creates files with |
| .B lc=3 |
| and |
| .BR lp=0 . |
| Creating files with other |
| .I lc |
| and |
| .I lp |
| is possible with |
| .B xz |
| and with LZMA SDK. |
| .PP |
| The implementation of the LZMA1 filter in liblzma |
| requires that the sum of |
| .I lc |
| and |
| .I lp |
| must not exceed 4. |
| Thus, |
| .B .lzma |
| files that exceed this limitation cannot be decompressed with |
| .BR xz . |
| .PP |
| LZMA Utils creates only |
| .B .lzma |
| files which have a dictionary size of |
| .RI "2^" n |
| (a power of 2) but accepts files with any dictionary size. |
| liblzma accepts only |
| .B .lzma |
| files which have a dictionary size of |
| .RI "2^" n |
| or |
| .RI "2^" n " + 2^(" n "\-1)." |
| This is to decrease false positives when detecting |
| .B .lzma |
| files. |
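| .PP |
| The dictionary size rule can be expressed with integer arithmetic. |
| The following |
| .BR sh (1) |
| sketch implements the rule exactly as stated above; |
| it is not taken from liblzma, |
| and the function names are hypothetical: |
| .RS |
| .PP |
| .nf |
| .ft CW |
```shell
# Sketch only: these helpers are hypothetical and not part of xz.
# They mirror the rule in the text, not the actual liblzma code.
is_pow2() {
    # A positive integer is a power of 2 if it has exactly one bit set.
    [ "$1" -gt 0 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

dict_size_accepted() {
    d=$1
    # Form 1: 2^n
    is_pow2 "$d" && return 0
    # Form 2: 2^n + 2^(n-1), which equals 3 * 2^(n-1)
    [ $((d % 3)) -eq 0 ] && is_pow2 $((d / 3))
}
```
| .ft R |
| .fi |
| .RE |
| .PP |
| With these definitions, |
| 8\ MiB (8388608) and 12\ MiB (12582912) are accepted, |
| while 10000000 is not. |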
| .PP |
| These limitations shouldn't be a problem in practice, |
| since practically all |
| .B .lzma |
| files have been compressed with settings that liblzma will accept. |
| . |
| .SS "Trailing garbage" |
| When decompressing, |
| LZMA Utils silently ignores everything after the first |
| .B .lzma |
| stream. |
| In most situations, this is a bug. |
| This also means that LZMA Utils |
| doesn't support decompressing concatenated |
| .B .lzma |
| files. |
| .PP |
| If there is data left after the first |
| .B .lzma |
| stream, |
| .B xz |
| considers the file to be corrupt unless |
| .B \-\-single\-stream |
| was used. |
| This may break obscure scripts which have |
| assumed that trailing garbage is ignored. |
| . |
| .SH NOTES |
| . |
| .SS "Compressed output may vary" |
| The exact compressed output produced from |
| the same uncompressed input file |
| may vary between XZ Utils versions even if |
| compression options are identical. |
| This is because the encoder can be improved |
| (faster or better compression) |
| without affecting the file format. |
| The output can vary even between different |
| builds of the same XZ Utils version, |
| if different build options are used. |
| .PP |
| The above means that once |
| .B \-\-rsyncable |
| has been implemented, |
| the resulting files won't necessarily be rsyncable |
| unless both old and new files have been compressed |
| with the same xz version. |
| This problem can be fixed if a part of the encoder |
| implementation is frozen to keep rsyncable output |
| stable across xz versions. |
| . |
| .SS "Embedded .xz decompressors" |
| Embedded |
| .B .xz |
| decompressor implementations like XZ Embedded don't necessarily |
| support files created with integrity |
| .I check |
| types other than |
| .B none |
| and |
| .BR crc32 . |
| Since the default is |
| .BR \-\-check=crc64 , |
| you must use |
| .B \-\-check=none |
| or |
| .B \-\-check=crc32 |
| when creating files for embedded systems. |
| .PP |
| Outside embedded systems, all |
| .B .xz |
| format decompressors support all the |
| .I check |
| types, or at least are able to decompress |
| the file without verifying the |
| integrity check if the particular |
| .I check |
| is not supported. |
| .PP |
| XZ Embedded supports BCJ filters, |
| but only with the default start offset. |
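| .PP |
| A script can verify that an existing file uses only |
| .I check |
| types suitable for embedded decompressors. |
| The following |
| .BR sh (1) |
| sketch assumes that the seventh column of each |
| .B file |
| line printed by |
| .B "xz \-\-robot \-\-list" |
| holds the comma-separated list of check names |
| and that the names are spelled |
| .BR None , |
| .BR CRC32 , |
| .BR CRC64 , |
| and |
| .BR SHA\-256 . |
| The function name |
| .B embedded_check_ok |
| is hypothetical: |
| .RS |
| .PP |
| .nf |
| .ft CW |
```shell
# Sketch only: embedded_check_ok is a hypothetical helper, not part of xz.
# Reads "xz --robot --list" output on standard input.  Assumes that the
# seventh column of each "file" line is the comma-separated list of
# integrity check names, spelled None, CRC32, CRC64, or SHA-256.
embedded_check_ok() {
    awk '
        $1 == "file" {
            seen = 1
            n = split($7, checks, ",")
            for (i = 1; i <= n; i++)
                if (checks[i] != "None" && checks[i] != "CRC32")
                    bad = 1
        }
        # Exit status 0 only if a file line was seen and all checks fit.
        END { exit !(seen && !bad) }
    '
}
```
| .ft R |
| .fi |
| .RE |
| .PP |
| It would be used by piping the output of |
| .B "xz \-\-robot \-\-list foo.xz" |
| into the function and testing its exit status. |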
| . |
| .SH EXAMPLES |
| . |
| .SS Basics |
| Compress the file |
| .I foo |
| into |
| .I foo.xz |
| using the default compression level |
| .RB ( \-6 ), |
| and remove |
| .I foo |
| if compression is successful: |
| .RS |
| .PP |
| .nf |
| .ft CW |
| xz foo |
| .ft R |
| .fi |
| .RE |
| .PP |
| Decompress |
| .I bar.xz |
| into |
| .I bar |
| and don't remove |
| .I bar.xz |
| even if decompression is successful: |
| .RS |
| .PP |
| .nf |
| .ft CW |
| xz \-dk bar.xz |
| .ft R |
| .fi |
| .RE |
| .PP |
| Create |
| .I baz.tar.xz |
| with the preset |
| .B \-4e |
| .RB ( "\-4 \-\-extreme" ), |
| which is slower than e.g. the default |
| .BR \-6 , |
| but needs less memory for compression and decompression (48\ MiB |
| and 5\ MiB, respectively): |
| .RS |
| .PP |
| .nf |
| .ft CW |
| tar cf \- baz | xz \-4e > baz.tar.xz |
| .ft R |
| .fi |
| .RE |
| .PP |
| A mix of compressed and uncompressed files can be decompressed |
| to standard output with a single command: |
| .RS |
| .PP |
| .nf |
| .ft CW |
| xz \-dcf a.txt b.txt.xz c.txt d.txt.lzma > abcd.txt |
| .ft R |
| .fi |
| .RE |
| . |
| .SS "Parallel compression of many files" |
| On GNU and *BSD, |
| .BR find (1) |
| and |
| .BR xargs (1) |
| can be used to parallelize compression of many files: |
| .RS |
| .PP |
| .nf |
| .ft CW |
| find . \-type f \e! \-name '*.xz' \-print0 \e |
| | xargs \-0r \-P4 \-n16 xz \-T1 |
| .ft R |
| .fi |
| .RE |
| .PP |
| The |
| .B \-P |
| option to |
| .BR xargs (1) |
| sets the number of parallel |
| .B xz |
| processes. |
| The best value for the |
| .B \-n |
| option depends on how many files there are to be compressed. |
| If there are only a couple of files, |
| the value should probably be 1; |
| with tens of thousands of files, |
| 100 or even more may be appropriate to reduce the number of |
| .B xz |
| processes that |
| .BR xargs (1) |
| will eventually create. |
| .PP |
| The option |
| .B \-T1 |
| for |
| .B xz |
| is there to force it to single-threaded mode, because |
| .BR xargs (1) |
| is used to control the amount of parallelization. |
| . |
| .SS "Robot mode" |
| Calculate how many bytes have been saved in total |
| after compressing multiple files: |
| .RS |
| .PP |
| .nf |
| .ft CW |
| xz \-\-robot \-\-list *.xz | awk '/^totals/{print $5\-$4}' |
| .ft R |
| .fi |
| .RE |
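| .PP |
| The same columns can be used to report the saving as a percentage. |
| The following |
| .BR sh (1) |
| sketch reads the output of |
| .B "xz \-\-robot \-\-list" |
| from standard input; |
| the function name |
| .B print_savings |
| is hypothetical: |
| .RS |
| .PP |
| .nf |
| .ft CW |
```shell
# Sketch only: print_savings is a hypothetical helper, not part of xz.
# Reads "xz --robot --list" output on standard input and uses the same
# columns as the example above: $4 = compressed size and
# $5 = uncompressed size on the "totals" line.
print_savings() {
    awk '
        $1 == "totals" && $5 > 0 {
            saved = $5 - $4
            print sprintf("%.0f bytes saved (%.1f%%)", saved, 100 * saved / $5)
        }
    '
}
```
| .ft R |
| .fi |
| .RE |
| .PP |
| Nothing is printed if the total uncompressed size is zero, |
| which avoids a division by zero. |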
| .PP |
| A script may want to know that it is using a new enough |
| .BR xz . |
| The following |
| .BR sh (1) |
| script checks that the version number of the |
| .B xz |
| tool is at least 5.0.0. |
| This method is compatible with old beta versions, |
| which didn't support the |
| .B \-\-robot |
| option: |
| .RS |
| .PP |
| .nf |
| .ft CW |
| if ! eval "$(xz \-\-robot \-\-version 2> /dev/null)" || |
|         [ "$XZ_VERSION" \-lt 50000002 ]; then |
|     echo "Your xz is too old." |
| fi |
| unset XZ_VERSION LIBLZMA_VERSION |
| .ft R |
| .fi |
| .RE |
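| .PP |
| The number 50000002 in the above script stands for version 5.0.0. |
| The following |
| .BR sh (1) |
| sketch builds such a number from its parts, |
| assuming the layout XYYYZZZS where X is the major version, |
| YYY the minor version, ZZZ the patch level, |
| and S the stability (0 is alpha, 1 is beta, and 2 is stable). |
| The function name |
| .B xz_version_number |
| is hypothetical: |
| .RS |
| .PP |
| .nf |
| .ft CW |
```shell
# Sketch only: xz_version_number is a hypothetical helper, not part of xz.
# Assumed layout of XZ_VERSION: XYYYZZZS with X = major, YYY = minor,
# ZZZ = patch and S = stability (0 = alpha, 1 = beta, 2 = stable).
xz_version_number() {
    # Arguments: major minor patch [stability]; stability defaults to 2.
    # Pass the parts without leading zeros so they are read as decimal.
    echo $(( $1 * 10000000 + $2 * 10000 + $3 * 10 + ${4:-2} ))
}
```
| .ft R |
| .fi |
| .RE |
| .PP |
| The result can then be compared with |
| .B XZ_VERSION |
| using the |
| .B \-lt |
| operator, as in the script above. |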
| .PP |
| Set a memory usage limit for decompression using |
| .BR XZ_OPT , |
| but if a limit has already been set, don't increase it: |
| .RS |
| .PP |
| .nf |
| .ft CW |
| NEWLIM=$((123 << 20)) # 123 MiB |
| OLDLIM=$(xz \-\-robot \-\-info\-memory | cut \-f3) |
| if [ "$OLDLIM" \-eq 0 ] || [ "$OLDLIM" \-gt "$NEWLIM" ]; then |
|     XZ_OPT="$XZ_OPT \-\-memlimit\-decompress=$NEWLIM" |
|     export XZ_OPT |
| fi |
| .ft R |
| .fi |
| .RE |
| . |
| .SS "Custom compressor filter chains" |
| The simplest use for custom filter chains is |
| customizing an LZMA2 preset. |
| This can be useful, |
| because the presets cover only a subset of the |
| potentially useful combinations of compression settings. |
| .PP |
| The CompCPU columns of the tables |
| from the descriptions of the options |
| .BR "\-0" " ... " "\-9" |
| and |
| .B \-\-extreme |
| are useful when customizing LZMA2 presets. |
| Here are the relevant parts collected from those two tables: |
| .RS |
| .PP |
| .TS |
| tab(;); |
| c c |
| n n. |
| Preset;CompCPU |
| \-0;0 |
| \-1;1 |
| \-2;2 |
| \-3;3 |
| \-4;4 |
| \-5;5 |
| \-6;6 |
| \-5e;7 |
| \-6e;8 |
| .TE |
| .RE |
| .PP |
| If you know that a file requires |
| a somewhat big dictionary (e.g. 32 MiB) to compress well, |
| but you want to compress it quicker than |
| .B "xz \-8" |
| would do, a preset with a low CompCPU value (e.g. 1) |
| can be modified to use a bigger dictionary: |
| .RS |
| .PP |
| .nf |
| .ft CW |
| xz \-\-lzma2=preset=1,dict=32MiB foo.tar |
| .ft R |
| .fi |
| .RE |
| .PP |
| With certain files, the above command may be faster than |
| .B "xz \-6" |
| while compressing significantly better. |
| However, it must be emphasized that only some files benefit from |
| a big dictionary while keeping the CompCPU value low. |
| The most obvious situation, |
| where a big dictionary can help a lot, |
| is an archive containing very similar files |
| of at least a few megabytes each. |
| The dictionary size has to be significantly bigger |
| than any individual file to allow LZMA2 to take |
| full advantage of the similarities between consecutive files. |
| .PP |
| If very high compressor and decompressor memory usage is fine, |
| and the file being compressed is |
| at least several hundred megabytes, it may be useful |
| to use an even bigger dictionary than the 64 MiB that |
| .B "xz \-9" |
| would use: |
| .RS |
| .PP |
| .nf |
| .ft CW |
| xz \-vv \-\-lzma2=dict=192MiB big_foo.tar |
| .ft R |
| .fi |
| .RE |
| .PP |
| Using |
| .B \-vv |
| .RB ( "\-\-verbose \-\-verbose" ) |
| like in the above example can be useful |
| to see the memory requirements |
| of the compressor and decompressor. |
| Remember that using a dictionary bigger than |
| the size of the uncompressed file is a waste of memory, |
| so the above command isn't useful for small files. |
| .PP |
| Sometimes the compression time doesn't matter, |
| but the decompressor memory usage has to be kept low |
| e.g. to make it possible to decompress the file on |
| an embedded system. |
| The following command uses |
| .B \-6e |
| .RB ( "\-6 \-\-extreme" ) |
| as a base and sets the dictionary to only 64\ KiB. |
| The resulting file can be decompressed with XZ Embedded |
| (that's why there is |
| .BR \-\-check=crc32 ) |
| using about 100\ KiB of memory. |
| .RS |
| .PP |
| .nf |
| .ft CW |
| xz \-\-check=crc32 \-\-lzma2=preset=6e,dict=64KiB foo |
| .ft R |
| .fi |
| .RE |
| .PP |
| If you want to squeeze out as many bytes as possible, |
| adjusting the number of literal context bits |
| .RI ( lc ) |
| and number of position bits |
| .RI ( pb ) |
| can sometimes help. |
| Adjusting the number of literal position bits |
| .RI ( lp ) |
| might help too, but usually |
| .I lc |
| and |
| .I pb |
| are more important. |
| E.g. a source code archive contains mostly US-ASCII text, |
| so something like the following might give |
| a slightly (about 0.1\ %) smaller file than |
| .B "xz \-6e" |
| (try also without |
| .BR lc=4 ): |
| .RS |
| .PP |
| .nf |
| .ft CW |
| xz \-\-lzma2=preset=6e,pb=0,lc=4 source_code.tar |
| .ft R |
| .fi |
| .RE |
| .PP |
| Using another filter together with LZMA2 can improve |
| compression with certain file types. |
| E.g. to compress an x86-32 or x86-64 shared library |
| using the x86 BCJ filter: |
| .RS |
| .PP |
| .nf |
| .ft CW |
| xz \-\-x86 \-\-lzma2 libfoo.so |
| .ft R |
| .fi |
| .RE |
| .PP |
| Note that the order of the filter options is significant. |
| If |
| .B \-\-x86 |
| is specified after |
| .BR \-\-lzma2 , |
| .B xz |
| will give an error, |
| because there cannot be any filter after LZMA2, |
| and also because the x86 BCJ filter cannot be used |
| as the last filter in the chain. |
| .PP |
| The Delta filter together with LZMA2 |
| can give good results with bitmap images. |
| It should usually beat PNG, |
| which has a few more advanced filters than simple |
| delta but uses Deflate for the actual compression. |
| .PP |
| The image has to be saved in uncompressed format, |
| e.g. as uncompressed TIFF. |
| The distance parameter of the Delta filter is set |
| to match the number of bytes per pixel in the image. |
| E.g. a 24-bit RGB bitmap needs |
| .BR dist=3 , |
| and it is also good to pass |
| .B pb=0 |
| to LZMA2 to accommodate the three-byte alignment: |
| .RS |
| .PP |
| .nf |
| .ft CW |
| xz \-\-delta=dist=3 \-\-lzma2=pb=0 foo.tiff |
| .ft R |
| .fi |
| .RE |
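| .PP |
| The distance can be derived from the pixel depth in a script. |
| The following |
| .BR sh (1) |
| sketch assumes that the pixel depth is a whole number of bytes |
| and that |
| .I dist |
| must be in the range 1\-256. |
| The function name |
| .B delta_filter_args |
| is hypothetical, |
| and the LZMA2 options are left to the caller: |
| .RS |
| .PP |
| .nf |
| .ft CW |
```shell
# Sketch only: delta_filter_args is a hypothetical helper, not part of xz.
# Prints the Delta filter option for a pixel depth given in bits.
delta_filter_args() {
    bits=$1
    # Only whole-byte pixels can be matched; dist must be within 1-256.
    if [ "$bits" -le 0 ] || [ $((bits % 8)) -ne 0 ] ||
       [ $((bits / 8)) -gt 256 ]; then
        echo "unsupported pixel depth: $bits" >&2
        return 1
    fi
    echo "--delta=dist=$((bits / 8))"
}
```
| .ft R |
| .fi |
| .RE |
| .PP |
| For a 24-bit image the function prints |
| .BR \-\-delta=dist=3 , |
| which has to be placed before the |
| .B \-\-lzma2 |
| option on the command line. |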
| .PP |
| If multiple images have been put into a single archive (e.g.\& |
| .BR .tar ), |
| the Delta filter will work on that too as long as all images |
| have the same number of bytes per pixel. |
| . |
| .SH "SEE ALSO" |
| .BR xzdec (1), |
| .BR xzdiff (1), |
| .BR xzgrep (1), |
| .BR xzless (1), |
| .BR xzmore (1), |
| .BR gzip (1), |
| .BR bzip2 (1), |
| .BR 7z (1) |
| .PP |
| XZ Utils: <http://tukaani.org/xz/> |
| .br |
| XZ Embedded: <http://tukaani.org/xz/embedded.html> |
| .br |
| LZMA SDK: <http://7-zip.org/sdk.html> |