| .\" Copyright (c) 2003-2009 Tim Kientzle |
| .\" All rights reserved. |
| .\" |
| .\" Redistribution and use in source and binary forms, with or without |
| .\" modification, are permitted provided that the following conditions |
| .\" are met: |
| .\" 1. Redistributions of source code must retain the above copyright |
| .\" notice, this list of conditions and the following disclaimer. |
| .\" 2. Redistributions in binary form must reproduce the above copyright |
| .\" notice, this list of conditions and the following disclaimer in the |
| .\" documentation and/or other materials provided with the distribution. |
| .\" |
| .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND |
| .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE |
| .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE |
| .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE |
| .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL |
| .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS |
| .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) |
| .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT |
| .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY |
| .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF |
| .\" SUCH DAMAGE. |
| .\" |
| .\" $FreeBSD: head/lib/libarchive/tar.5 201077 2009-12-28 01:50:23Z kientzle $ |
| .\" |
| .Dd December 27, 2009 |
| .Dt tar 5 |
| .Os |
| .Sh NAME |
| .Nm tar |
| .Nd format of tape archive files |
| .Sh DESCRIPTION |
| The |
| .Nm |
| archive format collects any number of files, directories, and other |
| file system objects (symbolic links, device nodes, etc.) into a single |
| stream of bytes. |
| The format was originally designed to be used with |
| tape drives that operate with fixed-size blocks, but is widely used as |
| a general packaging mechanism. |
| .Ss General Format |
| A |
| .Nm |
| archive consists of a series of 512-byte records. |
| Each file system object requires a header record which stores basic metadata |
| (pathname, owner, permissions, etc.) and zero or more records containing any |
| file data. |
| The end of the archive is indicated by two records consisting |
| entirely of zero bytes. |
| .Pp |
| For compatibility with tape drives that use fixed block sizes, |
| programs that read or write tar files always read or write a fixed |
| number of records with each I/O operation. |
| These |
| .Dq blocks |
| are always a multiple of the record size. |
| The maximum block size supported by early |
| implementations was 10240 bytes or 20 records. |
| This is still the default for most implementations |
| although block sizes of 1MiB (2048 records) or larger are |
| commonly used with modern high-speed tape drives. |
| (Note: the terms |
| .Dq block |
| and |
| .Dq record |
| here are not entirely standard; this document follows the |
| convention established by John Gilmore in documenting |
| .Nm pdtar . ) |
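.Pp
The record and block arithmetic above can be sketched as follows.
This is a minimal illustration, not taken from any particular implementation; the function names are hypothetical, and it assumes a writer pads the final block with zero records.

```python
RECORD_SIZE = 512        # bytes per record, as described above
DEFAULT_BLOCKING = 20    # records per block (10240 bytes), the historic default


def data_records(file_size: int) -> int:
    """Number of 512-byte records needed to hold file_size bytes of data."""
    return (file_size + RECORD_SIZE - 1) // RECORD_SIZE


def archive_length(file_sizes, blocking: int = DEFAULT_BLOCKING) -> int:
    """Total archive length in bytes, padded out to a whole block."""
    # Each entry costs one header record plus its data records.
    records = sum(1 + data_records(size) for size in file_sizes)
    records += 2                         # two all-zero end-of-archive records
    block_bytes = blocking * RECORD_SIZE
    total = records * RECORD_SIZE
    # Assumption: the last block is padded so every write is a full block.
    return (total + block_bytes - 1) // block_bytes * block_bytes
```

For instance, a single empty file needs three records (header plus end marker), which this sketch rounds up to one 10240-byte block.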
| .Ss Old-Style Archive Format |
| The original tar archive format has been extended many times to |
| include additional information that various implementors found |
| necessary. |
| This section describes the variant implemented by the tar command |
| included in |
| .At v7 , |
| which seems to be the earliest widely-used version of the tar program. |
| .Pp |
| The header record for an old-style |
| .Nm |
| archive consists of the following: |
| .Bd -literal -offset indent |
| struct header_old_tar { |
| char name[100]; |
| char mode[8]; |
| char uid[8]; |
| char gid[8]; |
| char size[12]; |
| char mtime[12]; |
| char checksum[8]; |
| char linkflag[1]; |
| char linkname[100]; |
| char pad[255]; |
| }; |
| .Ed |
| All unused bytes in the header record are filled with nulls. |
| .Bl -tag -width indent |
| .It Va name |
| Pathname, stored as a null-terminated string. |
| Early tar implementations only stored regular files (including |
| hardlinks to those files). |
| One common early convention used a trailing "/" character to indicate |
| a directory name, allowing directory permissions and owner information |
| to be archived and restored. |
| .It Va mode |
| File mode, stored as an octal number in ASCII. |
| .It Va uid , Va gid |
| User id and group id of owner, as octal numbers in ASCII. |
| .It Va size |
| Size of file, as octal number in ASCII. |
| For regular files only, this indicates the amount of data |
| that follows the header. |
| In particular, this field was ignored by early tar implementations |
| when extracting hardlinks. |
| Modern writers should always store a zero length for hardlink entries. |
| .It Va mtime |
| Modification time of file, as an octal number in ASCII. |
| This indicates the number of seconds since the start of the epoch, |
| 00:00:00 UTC January 1, 1970. |
| Note that negative values should be avoided |
| here, as they are handled inconsistently. |
| .It Va checksum |
| Header checksum, stored as an octal number in ASCII. |
| To compute the checksum, set the checksum field to all spaces, |
| then sum all bytes in the header using unsigned arithmetic. |
| This field should be stored as six octal digits followed by a null and a space |
| character. |
| Note that many early implementations of tar used signed arithmetic |
| for the checksum field, which can cause interoperability problems |
| when transferring archives between systems. |
| Modern robust readers compute the checksum both ways and accept the |
| header if either computation matches. |
| .It Va linkflag , Va linkname |
| In order to preserve hardlinks and conserve tape, a file |
| with multiple links is only written to the archive the first |
| time it is encountered. |
| The next time it is encountered, the |
| .Va linkflag |
| is set to an ASCII |
| .Sq 1 |
| and the |
| .Va linkname |
| field holds the first name under which this file appears. |
| (Note that regular files have a null value in the |
| .Va linkflag |
| field.) |
| .El |
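.Pp
The checksum rules above can be sketched as follows.
The function names are hypothetical; the byte range 148-155 for the checksum field follows from the field widths in the structure shown earlier.

```python
def header_checksums(header: bytes):
    """Return (unsigned_sum, signed_sum) for a 512-byte header record."""
    if len(header) != 512:
        raise ValueError("a header record is exactly 512 bytes")
    # The checksum field occupies bytes 148-155; treat it as eight spaces.
    blanked = header[:148] + b" " * 8 + header[156:]
    unsigned_sum = sum(blanked)
    # Early implementations summed the bytes as signed chars.
    signed_sum = sum(b - 256 if b > 127 else b for b in blanked)
    return unsigned_sum, signed_sum


def checksum_matches(header: bytes) -> bool:
    """Accept the header if either the unsigned or the signed sum matches."""
    field = header[148:156].strip(b" \0")
    if not field:
        return False
    try:
        stored = int(field, 8)
    except ValueError:
        return False
    return stored in header_checksums(header)


def store_checksum(header: bytes) -> bytes:
    """Return a copy of header with the checksum field filled in."""
    unsigned_sum, _ = header_checksums(header)
    # Six octal digits followed by a null and a space.
    field = format(unsigned_sum, "o").rjust(6, "0").encode("ascii") + b"\0 "
    return header[:148] + field + header[156:]
```

The two sums differ only when the header contains bytes above 127, which is why the mismatch mostly shows up with non-ASCII pathnames.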
| .Pp |
| Early tar implementations varied in how they terminated these fields. |
| The tar command in |
| .At v7 |
| used the following conventions (this is also documented in early BSD manpages): |
| the pathname must be null-terminated; |
| the mode, uid, and gid fields must end in a space and a null byte; |
| the size and mtime fields must end in a space; |
| the checksum is terminated by a null and a space. |
| Early implementations filled the numeric fields with leading spaces. |
| This seems to have been common practice until the |
| .St -p1003.1-88 |
| standard was released. |
| For best portability, modern implementations should fill the numeric |
| fields with leading zeros. |
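.Pp
Because of these variations, a tolerant reader usually accepts any mix of leading spaces, leading zeros, and trailing space or null terminators.
A minimal sketch, with hypothetical function names, of such a reader and of a writer that follows the leading-zero recommendation:

```python
def parse_octal(field: bytes) -> int:
    """Decode an ASCII octal field padded with spaces, zeros, or nulls."""
    # Cut at the first null, then drop any space padding on either side.
    text = field.split(b"\0", 1)[0].strip(b" ")
    return int(text, 8) if text else 0


def format_octal(value: int, width: int) -> bytes:
    """Encode value with leading zeros and a trailing null, filling width bytes."""
    if value < 0:
        raise ValueError("negative values are not portable")
    text = format(value, "o").rjust(width - 1, "0").encode("ascii")
    if len(text) > width - 1:
        raise ValueError("value does not fit in the field")
    return text + b"\0"
```

This sketch treats an empty or all-null field as zero, which is an assumption rather than something the historic formats specify.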
| .Ss Pre-POSIX Archives |
| An early draft of |
| .St -p1003.1-88 |
| served as the basis for John Gilmore's |
| .Nm pdtar |
| program and many system implementations from the late 1980s |
| and early 1990s. |
| These archives generally follow the POSIX ustar |
| format described below with the following variations: |
| .Bl -bullet -compact -width indent |
| .It |
| The magic value is |
| .Dq ustar\ \& |
| (note the following space). |
| The version field contains a space character followed by a null. |
| .It |
| The numeric fields are generally filled with leading spaces |
| (not leading zeros as recommended in the final standard). |
| .It |
| The prefix field is often not used, limiting pathnames to |
| the 100 characters of old-style archives. |
| .El |
| .Ss POSIX ustar Archives |
| .St -p1003.1-88 |
| defined a standard tar file format to be read and written |
| by compliant implementations of |
| .Xr tar 1 . |
| This format is often called the |
| .Dq ustar |
| format, after the magic value used |
| in the header. |
| (The name is an acronym for |
| .Dq Unix Standard TAR . ) |
| It extends the historic format with new fields: |
| .Bd -literal -offset indent |
| struct header_posix_ustar { |
| char name[100]; |
| char mode[8]; |
| char uid[8]; |
| char gid[8]; |
| char size[12]; |
| char mtime[12]; |
| char checksum[8]; |
| char typeflag[1]; |
| char linkname[100]; |
| char magic[6]; |
| char version[2]; |
| char uname[32]; |
| char gname[32]; |
| char devmajor[8]; |
| char devminor[8]; |
| char prefix[155]; |
| char pad[12]; |
| }; |
| .Ed |
| .Bl -tag -width indent |
| .It Va typeflag |
| Type of entry. |
| POSIX extended the earlier |
| .Va linkflag |
| field with several new type values: |
| .Bl -tag -width indent -compact |
| .It Dq 0 |
| Regular file. |
| NUL should be treated as a synonym, for compatibility purposes. |
| .It Dq 1 |
| Hard link. |
| .It Dq 2 |
| Symbolic link. |
| .It Dq 3 |
| Character device node. |
| .It Dq 4 |
| Block device node. |
| .It Dq 5 |
| Directory. |
| .It Dq 6 |
| FIFO node. |
| .It Dq 7 |
| Reserved. |
| .It Other |
| A POSIX-compliant implementation must treat any unrecognized typeflag value |
| as a regular file. |
| In particular, writers should ensure that all entries |
| have a valid filename so that they can be restored by readers that do not |
| support the corresponding extension. |
| Uppercase letters "A" through "Z" are reserved for custom extensions. |
| Note that sockets and whiteout entries are not archivable. |
| .El |
| The |
| .Va size |
| field, in particular, has different meanings depending on the type. |
| For regular files, it indicates the amount of data |
| following the header. |
| For directories, it may be used to indicate the total size of all |
| files in the directory, for use by operating systems that pre-allocate |
| directory space. |
| For all other types, it should be set to zero by writers and ignored |
| by readers. |
| .It Va magic |
| Contains the magic value |
| .Dq ustar |
| followed by a NUL byte to indicate that this is a POSIX standard archive. |
| Full compliance requires the uname and gname fields be properly set. |
| .It Va version |
| Version. |
| This should be |
| .Dq 00 |
| (two copies of the ASCII digit zero) for POSIX standard archives. |
| .It Va uname , Va gname |
| User and group names, as null-terminated ASCII strings. |
| These should be used in preference to the uid/gid values |
| when they are set and the corresponding names exist on |
| the system. |
| .It Va devmajor , Va devminor |
| Major and minor numbers for character device or block device entry. |
| .It Va name , Va prefix |
| If the pathname is too long to fit in the 100 bytes provided by the standard |
| format, it can be split at any |
| .Pa / |
| character with the first portion going into the prefix field. |
| If the prefix field is not empty, the reader will prepend |
| the prefix value and a |
| .Pa / |
| character to the regular name field to obtain the full pathname. |
| The standard does not require a trailing |
| .Pa / |
| character on directory names, though most implementations still |
| include this for compatibility reasons. |
| .El |
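.Pp
As an illustration of the layout above, the following sketch splits a 512-byte header record into decoded fields.
The field widths come from the structure shown earlier; the helper names and the returned dictionary are hypothetical choices made for this example.

```python
import struct

# Field widths follow struct header_posix_ustar; they add up to 512 bytes.
USTAR = struct.Struct("<100s8s8s8s12s12s8s1s100s6s2s32s32s8s8s155s12s")
FIELDS = ("name", "mode", "uid", "gid", "size", "mtime", "checksum",
          "typeflag", "linkname", "magic", "version", "uname", "gname",
          "devmajor", "devminor", "prefix", "pad")


def text(field: bytes) -> str:
    """Decode a string field that ends at the first null, if there is one."""
    return field.split(b"\0", 1)[0].decode("ascii", "replace")


def number(field: bytes) -> int:
    """Decode an ASCII octal field, tolerating space or null padding."""
    digits = field.split(b"\0", 1)[0].strip(b" ")
    return int(digits, 8) if digits else 0


def parse_ustar_header(record: bytes) -> dict:
    """Split a 512-byte ustar header record into decoded fields."""
    raw = dict(zip(FIELDS, USTAR.unpack(record)))
    name, prefix = text(raw["name"]), text(raw["prefix"])
    typeflag = raw["typeflag"]
    return {
        # A non-empty prefix is joined to the name with a "/".
        "path": prefix + "/" + name if prefix else name,
        "mode": number(raw["mode"]),
        "uid": number(raw["uid"]),
        "gid": number(raw["gid"]),
        "size": number(raw["size"]),
        "mtime": number(raw["mtime"]),
        # A null typeflag is treated as a synonym for "0" (regular file).
        "typeflag": "0" if typeflag == b"\0" else typeflag.decode("latin-1"),
        "linkname": text(raw["linkname"]),
        "uname": text(raw["uname"]),
        "gname": text(raw["gname"]),
    }
```

The sketch does not verify the checksum or the magic value; a real reader would do both before trusting the other fields.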
| .Pp |
| Note that all unused bytes must be set to |
| .Dv NUL . |
| .Pp |
| Field termination is specified slightly differently by POSIX |
| than by previous implementations. |
| The |
| .Va magic , |
| .Va uname , |
| and |
| .Va gname |
| fields must have a trailing |
| .Dv NUL . |
| The |
| .Va name , |
| .Va linkname , |
| and |
| .Va prefix |
| fields must have a trailing |
| .Dv NUL |
| unless they fill the entire field. |
| (In particular, it is possible to store a 256-character pathname if it |
| happens to have a |
| .Pa / |
| as the 156th character.) |
| POSIX requires numeric fields to be zero-padded in the front, and requires |
| them to be terminated with either space or |
| .Dv NUL |
| characters. |
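.Pp
The pathname splitting rule can be sketched as follows.
The function name is hypothetical, and taking the first usable
.Pa /
is an arbitrary choice; any split that satisfies both length limits is equally valid.

```python
def split_ustar_path(path: bytes):
    """Return (prefix, name) for a ustar header, or None if path cannot fit."""
    if len(path) <= 100:
        return b"", path
    # Try every "/" as a split point; the slash itself is not stored.
    for i, byte in enumerate(path):
        if byte == ord("/"):
            prefix, name = path[:i], path[i + 1:]
            if 0 < len(prefix) <= 155 and 0 < len(name) <= 100:
                return prefix, name
    return None
```

A 256-byte pathname fits only when its 156th byte is a slash, since the prefix and name fields are then both filled completely and carry no trailing NUL.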
| .Pp |
| Currently, most tar implementations comply with the ustar |
| format, occasionally extending it by adding new fields to the |
| blank area at the end of the header record. |
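.Pp
A reader can use the magic and version fields to guess which of the header layouts described so far it is looking at.
The sketch below is one hypothetical way to do this; the byte offsets 257 and 263 follow from the field widths in the structure above, and the returned labels are invented for this example.

```python
def classify_header(record: bytes) -> str:
    """Guess the header dialect from the magic and version fields."""
    magic, version = record[257:263], record[263:265]
    if magic == b"ustar\0" and version == b"00":
        return "posix-ustar"
    if magic == b"ustar " and version == b" \0":
        # "ustar" plus a space, then a space and a null: the pre-POSIX
        # convention, which GNU tar archives also use.
        return "pre-posix"
    return "old-style"
```

Headers that match neither magic value are treated as old-style here, so a robust reader would still validate the checksum before accepting them.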
| .Ss Pax Interchange Format |
| There are many attributes that cannot be portably stored in a |
| POSIX ustar archive. |
| .St -p1003.1-2001 |
| defined a |
| .Dq pax interchange format |
| that uses two new types of entries to hold text-formatted |
| metadata that applies to following entries. |
| Note that a pax interchange format archive is a ustar archive in every |
| respect. |
| The new data is stored in ustar-compatible archive entries that use the |
| .Dq x |
| or |
| .Dq g |
| typeflag. |
| In particular, older implementations that do not fully support these |
| extensions will extract the metadata into regular files, where the |
| metadata can be examined as necessary. |
| .Pp |
| An entry in a pax interchange format archive consists of one or |
| two standard ustar entries, each with its own header and data. |
| The first optional entry stores the extended attributes |
| for the following entry. |
| This optional first entry has an "x" typeflag and a size field that |
| indicates the total size of the extended attributes. |
| The extended attributes themselves are stored as a series of text-format |
| lines encoded in the portable UTF-8 encoding. |
| Each line consists of a decimal number, a space, a key string, an equals |
| sign, a value string, and a newline. |
| The decimal number indicates the length of the entire line, including the |
| initial length field and the trailing newline. |
| An example of such a field is: |
| .Dl 25 ctime=1084839148.1212\en |
| Keys in all lowercase are standard keys. |
| Vendors can add their own keys by prefixing them with an all uppercase |
| vendor name and a period. |
| Note that, unlike the historic header, numeric values are stored using |
| decimal, not octal. |
| A description of some common keys follows: |
| .Bl -tag -width indent |
| .It Cm atime , Cm ctime , Cm mtime |
| File access, inode change, and modification times. |
| These fields can be negative or include a decimal point and a fractional value. |
| .It Cm uname , Cm uid , Cm gname , Cm gid |
| User name, group name, and numeric UID and GID values. |
| The user name and group name stored here are encoded in UTF-8 |
| and can thus include non-ASCII characters. |
| The UID and GID fields can be of arbitrary length. |
| .It Cm linkpath |
| The full path of the linked-to file. |
| Note that this is encoded in UTF-8 and can thus include non-ASCII characters. |
| .It Cm path |
| The full pathname of the entry. |
| Note that this is encoded in UTF-8 and can thus include non-ASCII characters. |
| .It Cm realtime.* , Cm security.* |
| These keys are reserved and may be used for future standardization. |
| .It Cm size |
| The size of the file. |
| Note that there is no length limit on this field, allowing conforming |
| archives to store files much larger than the historic 8GB limit. |
| .It Cm SCHILY.* |
| Vendor-specific attributes used by Joerg Schilling's |
| .Nm star |
| implementation. |
| .It Cm SCHILY.acl.access , Cm SCHILY.acl.default |
| Stores the access and default ACLs as textual strings in a format |
| that is an extension of the format specified by POSIX.1e draft 17. |
| In particular, each user or group access specification can include a fourth |
| colon-separated field with the numeric UID or GID. |
| This allows ACLs to be restored on systems that may not have complete |
| user or group information available (such as when NIS/YP or LDAP services |
| are temporarily unavailable). |
| .It Cm SCHILY.devminor , Cm SCHILY.devmajor |
| The full minor and major numbers for device nodes. |
| .It Cm SCHILY.fflags |
| The file flags. |
| .It Cm SCHILY.realsize |
| The full size of the file on disk. |
| XXX explain? XXX |
| .It Cm SCHILY.dev , Cm SCHILY.ino , Cm SCHILY.nlinks |
| The device number, inode number, and link count for the entry. |
| In particular, note that a pax interchange format archive using Joerg |
| Schilling's |
| .Cm SCHILY.* |
| extensions can store all of the data from |
| .Va struct stat . |
| .It Cm LIBARCHIVE.xattr. Ns Ar namespace Ns . Ns Ar key |
| Libarchive stores POSIX.1e-style extended attributes using |
| keys of this form. |
| The |
| .Ar key |
| value is URL-encoded: |
| All non-ASCII characters and the two special characters |
| .Dq = |
| and |
| .Dq % |
| are encoded as |
| .Dq % |
| followed by two uppercase hexadecimal digits. |
| The value of this key is the extended attribute value |
| encoded in base 64. |
| XXX Detail the base-64 format here XXX |
| .It Cm VENDOR.* |
| XXX document other vendor-specific extensions XXX |
| .El |
| .Pp |
| Any values stored in an extended attribute override the corresponding |
| values in the regular tar header. |
| Note that compliant readers should ignore the regular fields when they |
| are overridden. |
| This is important, as existing archivers are known to store non-compliant |
| values in the standard header fields in this situation. |
| There are no limits on length for any of these fields. |
| In particular, numeric fields can be arbitrarily large. |
| All text fields are encoded in UTF-8. |
| Compliant writers should store only portable 7-bit ASCII characters in |
| the standard ustar header and use extended |
| attributes whenever a text value contains non-ASCII characters. |
| .Pp |
| In addition to the |
| .Cm x |
| entry described above, the pax interchange format |
| also supports a |
| .Cm g |
| entry. |
| The |
| .Cm g |
| entry is identical in format, but specifies attributes that serve as |
| defaults for all subsequent archive entries. |
| The |
| .Cm g |
| entry is not widely used. |
| .Pp |
| Besides the new |
| .Cm x |
| and |
| .Cm g |
| entries, the pax interchange format has a few other minor variations |
| from the earlier ustar format. |
| The most troubling one is that hardlinks are permitted to have |
| data following them. |
| This allows readers to restore any hardlink to a file without |
| having to rewind the archive to find an earlier entry. |
| However, it creates complications for robust readers, as it is no longer |
| clear whether or not they should ignore the size field for hardlink entries. |
| .Ss GNU Tar Archives |
| The GNU tar program started with a pre-POSIX format similar to that |
| described earlier and has extended it using several different mechanisms: |
| It added new fields to the empty space in the header (some of which was later |
| used by POSIX for conflicting purposes); |
| it allowed the header to be continued over multiple records; |
| and it defined new entries that modify following entries |
| (similar in principle to the |
| .Cm x |
| entry described above, but each GNU special entry is single-purpose, |
| unlike the general-purpose |
| .Cm x |
| entry). |
| As a result, GNU tar archives are not POSIX compatible, although |
| more lenient POSIX-compliant readers can successfully extract most |
| GNU tar archives. |
| .Bd -literal -offset indent |
| struct header_gnu_tar { |
| char name[100]; |
| char mode[8]; |
| char uid[8]; |
| char gid[8]; |
| char size[12]; |
| char mtime[12]; |
| char checksum[8]; |
| char typeflag[1]; |
| char linkname[100]; |
| char magic[6]; |
| char version[2]; |
| char uname[32]; |
| char gname[32]; |
| char devmajor[8]; |
| char devminor[8]; |
| char atime[12]; |
| char ctime[12]; |
| char offset[12]; |
| char longnames[4]; |
| char unused[1]; |
| struct { |
| char offset[12]; |
| char numbytes[12]; |
| } sparse[4]; |
| char isextended[1]; |
| char realsize[12]; |
| char pad[17]; |
| }; |
| .Ed |
| .Bl -tag -width indent |
| .It Va typeflag |
| GNU tar uses the following special entry types, in addition to |
| those defined by POSIX: |
| .Bl -tag -width indent |
| .It "7" |
| GNU tar treats type "7" records identically to type "0" records, |
| except on one obscure RTOS where they are used to indicate the |
| pre-allocation of a contiguous file on disk. |
| .It "D" |
| This indicates a directory entry. |
| Unlike the POSIX-standard "5" |
| typeflag, the header is followed by data records listing the names |
| of files in this directory. |
| Each name is preceded by an ASCII "Y" |
| if the file is stored in this archive or "N" if the file is not |
| stored in this archive. |
| Each name is terminated with a null, and |
| an extra null marks the end of the name list. |
| The purpose of this |
| entry is to support incremental backups; a program restoring from |
| such an archive may wish to delete files on disk that did not exist |
| in the directory when the archive was made. |
| .Pp |
| Note that the "D" typeflag specifically violates POSIX, which requires |
| that unrecognized typeflags be restored as normal files. |
| In this case, restoring the "D" entry as a file could interfere |
| with subsequent creation of the like-named directory. |
| .It "K" |
| The data for this entry is a long linkname for the following regular entry. |
| .It "L" |
| The data for this entry is a long pathname for the following regular entry. |
| .It "M" |
| This is a continuation of the last file on the previous volume. |
| GNU multi-volume archives guarantee that each volume begins with a valid |
| entry header. |
| To ensure this, a file may be split, with part stored at the end of one volume, |
| and part stored at the beginning of the next volume. |
| The "M" typeflag indicates that this entry continues an existing file. |
| Such entries can only occur as the first or second entry |
| in an archive (the latter only if the first entry is a volume label). |
| The |
| .Va size |
| field specifies the size of this entry. |
| The |
| .Va offset |
| field at bytes 369-380 specifies the offset where this file fragment |
| begins. |
| The |
| .Va realsize |
| field specifies the total size of the file (which must equal |
| .Va size |
| plus |
| .Va offset ) . |
| When extracting, GNU tar checks that the header file name is the one it is |
| expecting, that the header offset is in the correct sequence, and that |
| the sum of offset and size is equal to realsize. |
| .It "N" |
| Type "N" records are no longer generated by GNU tar. |
| They contained a |
| list of files to be renamed or symlinked after extraction; this was |
| originally used to support long names. |
| The contents of this record |
| are a text description of the operations to be done, in the form |
| .Dq Rename %s to %s\en |
| or |
| .Dq Symlink %s to %s\en ; |
| in either case, both |
| filenames are escaped using K&R C syntax. |
| Due to security concerns, "N" records are now generally ignored |
| when reading archives. |
| .It "S" |
| This is a |
| .Dq sparse |
| regular file. |
| Sparse files are stored as a series of fragments. |
| The header contains a list of fragment offset/length pairs. |
| If more than four such entries are required, the header is |
| extended as necessary with |
| .Dq extra |
| header extensions (an older format that is no longer used), or |
| .Dq sparse |
| extensions. |
| .It "V" |
| The |
| .Va name |
| field should be interpreted as a tape/volume header name. |
| This entry should generally be ignored on extraction. |
| .El |
| .It Va magic |
| The magic field holds the five characters |
| .Dq ustar |
| followed by a space. |
| Note that POSIX ustar archives have a trailing null. |
| .It Va version |
| The version field holds a space character followed by a null. |
| Note that POSIX ustar archives use two copies of the ASCII digit |
| .Dq 0 . |
| .It Va atime , Va ctime |
| The time the file was last accessed and the time of |
| last change of file information, stored in octal as with |
| .Va mtime . |
| .It Va longnames |
| This field is apparently no longer used. |
| .It Sparse Va offset / Va numbytes |
| Each such structure specifies a single fragment of a sparse |
| file. |
| The two fields store values as octal numbers. |
| The fragments are each padded to a multiple of 512 bytes |
| in the archive. |
| On extraction, the list of fragments is collected from the |
| header (including any extension headers), and the data |
| is then read and written to the file at appropriate offsets. |
| .It Va isextended |
| If this is set to non-zero, the header will be followed by additional |
| .Dq sparse header |
| records. |
| Each such record contains information about as many as 21 additional |
| sparse blocks as shown here: |
| .Bd -literal -offset indent |
| struct gnu_sparse_header { |
| struct { |
| char offset[12]; |
| char numbytes[12]; |
| } sparse[21]; |
| char isextended[1]; |
| char padding[7]; |
| }; |
| .Ed |
| .It Va realsize |
| A binary representation of the file's complete size, with a much larger range |
| than the POSIX file size. |
| In particular, with |
| .Cm M |
| type files, the current entry is only a portion of the file. |
| In that case, the POSIX size field will indicate the size of this |
| entry; the |
| .Va realsize |
| field will indicate the total size of the file. |
| .El |
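.Pp
Collecting the sparse fragment list from an
.Dq S
header and its extension records can be sketched as follows.
The offsets 386 and 482 follow from the field widths in the GNU header structure above, and 504 from the extension structure; the function names are hypothetical, and the sketch assumes that an all-null slot marks the end of the used entries and that any non-zero byte in
.Va isextended
means another extension record follows.

```python
def octal(field: bytes) -> int:
    """Decode an ASCII octal field, tolerating space or null padding."""
    digits = field.split(b"\0", 1)[0].strip(b" ")
    return int(digits, 8) if digits else 0


def sparse_entries(record: bytes, start: int, count: int):
    """Decode up to count offset/numbytes pairs beginning at byte start."""
    entries = []
    for slot in range(count):
        raw = record[start + 24 * slot:start + 24 * (slot + 1)]
        if not raw.strip(b"\0"):       # assumed: an all-null slot is unused
            break
        entries.append((octal(raw[:12]), octal(raw[12:])))
    return entries


def read_sparse_map(records):
    """Collect fragments from an 'S' header and any extension records.

    records is an iterable of 512-byte records: the header first, then
    whatever follows it in the archive.
    """
    it = iter(records)
    header = next(it)
    fragments = sparse_entries(header, 386, 4)     # four slots in the header
    extended = header[482] != 0
    while extended:
        ext = next(it)
        fragments += sparse_entries(ext, 0, 21)    # 21 slots per extension
        extended = ext[504] != 0
    return fragments
```

The file data begins only after the last extension record, so a reader has to consume these records before it starts copying fragments.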
| .Ss GNU tar pax archives |
| GNU tar 1.14 (XXX check this XXX) and later will write |
| pax interchange format archives when you specify the |
| .Fl -posix |
| flag. |
| This format uses custom keywords to store sparse file information. |
| There have been three iterations of this support, referred to |
| as |
| .Dq 0.0 , |
| .Dq 0.1 , |
| and |
| .Dq 1.0 . |
| .Bl -tag -width indent |
| .It Cm GNU.sparse.numblocks , Cm GNU.sparse.offset , Cm GNU.sparse.numbytes , Cm GNU.sparse.size |
| The |
| .Dq 0.0 |
| format used an initial |
| .Cm GNU.sparse.numblocks |
| attribute to indicate the number of blocks in the file, a pair of |
| .Cm GNU.sparse.offset |
| and |
| .Cm GNU.sparse.numbytes |
| to indicate the offset and size of each block, |
| and a single |
| .Cm GNU.sparse.size |
| to indicate the full size of the file. |
| This is not the same as the size in the tar header because the |
| latter value does not include the size of any holes. |
| This format required that the order of attributes be preserved and |
| relied on readers accepting multiple appearances of the same attribute |
| names, which is not officially permitted by the standards. |
| .It Cm GNU.sparse.map |
| The |
| .Dq 0.1 |
| format used a single attribute that stored a comma-separated |
| list of decimal numbers. |
| Each pair of numbers indicated the offset and size, respectively, |
| of a block of data. |
| This does not work well if the archive is extracted by an archiver |
| that does not recognize this extension, since many pax implementations |
| simply discard unrecognized attributes. |
| .It Cm GNU.sparse.major , Cm GNU.sparse.minor , Cm GNU.sparse.name , Cm GNU.sparse.realsize |
| The |
| .Dq 1.0 |
| format stores the sparse block map in one or more 512-byte blocks |
| prepended to the file data in the entry body. |
| The pax attributes indicate the existence of this map |
| (via the |
| .Cm GNU.sparse.major |
| and |
| .Cm GNU.sparse.minor |
| fields) |
| and the full size of the file. |
| The |
| .Cm GNU.sparse.name |
| holds the true name of the file. |
| To avoid confusion, the name stored in the regular tar header |
| is a modified name so that extraction errors will be apparent |
| to users. |
| .El |
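.Pp
As an illustration of the
.Dq 0.1
map format, the sketch below turns a
.Cm GNU.sparse.map
value into offset and size pairs and rebuilds the full file contents.
The function names are hypothetical, and the second function assumes the fragment data is stored back to back in the entry body, which this document does not state explicitly.

```python
def parse_sparse_map(value: str):
    """Turn 'offset,size,offset,size,...' into a list of (offset, size) pairs."""
    numbers = [int(item) for item in value.split(",")] if value else []
    if len(numbers) % 2:
        raise ValueError("GNU.sparse.map needs an even count of numbers")
    return list(zip(numbers[0::2], numbers[1::2]))


def expand_sparse(data: bytes, fragments, full_size: int) -> bytes:
    """Rebuild the full file contents; holes read back as zero bytes."""
    out = bytearray(full_size)
    pos = 0
    for offset, size in fragments:
        chunk = data[pos:pos + size]
        if len(chunk) != size or offset + size > full_size:
            raise ValueError("fragment does not match the stored data")
        out[offset:offset + size] = chunk
        pos += size
    return bytes(out)
```

An extractor that works on real files would seek past the holes instead of building the contents in memory.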
| .Ss Solaris Tar |
| XXX More Details Needed XXX |
| .Pp |
| Solaris tar (beginning with SunOS XXX 5.7 ?? XXX) supports an |
| .Dq extended |
| format that is fundamentally similar to pax interchange format, |
| with the following differences: |
| .Bl -bullet -compact -width indent |
| .It |
| Extended attributes are stored in an entry whose type is |
| .Cm X , |
| not |
| .Cm x , |
| as used by pax interchange format. |
| The detailed format of this entry appears to be the same |
| as detailed above for the |
| .Cm x |
| entry. |
| .It |
| An additional |
| .Cm A |
| entry is used to store an ACL for the following regular entry. |
| The body of this entry contains a seven-digit octal number |
| followed by a zero byte, followed by the |
| textual ACL description. |
| The octal value is the number of ACL entries |
| plus a constant that indicates the ACL type: 01000000 |
| for POSIX.1e ACLs and 03000000 for NFSv4 ACLs. |
| .El |
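.Pp
Decoding the leading octal value of an
.Cm A
entry can be sketched as follows.
The function name and the returned labels are hypothetical, and the sketch assumes the entry count always fits below the type constant, that is, in the low 18 bits.

```python
POSIX1E_ACL = 0o1000000
NFS4_ACL = 0o3000000


def parse_solaris_acl_body(body: bytes):
    """Return (acl_type, entry_count, acl_text) from an 'A' entry body."""
    digits, sep, rest = body.partition(b"\0")
    if not sep or len(digits) != 7:
        raise ValueError("expected seven octal digits and a zero byte")
    value = int(digits, 8)
    type_bits = value & ~0o777777      # assumed: type sits above the count
    count = value & 0o777777           # assumed: count fits in 18 bits
    kinds = {POSIX1E_ACL: "posix1e", NFS4_ACL: "nfs4"}
    # The textual ACL is returned as raw bytes, cut at any trailing null.
    return kinds.get(type_bits, "unknown"), count, rest.split(b"\0", 1)[0]
```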
| .Ss AIX Tar |
| XXX More details needed XXX |
| .Ss Mac OS X Tar |
| The tar distributed with Apple's Mac OS X stores most regular files |
| as two separate entries in the tar archive. |
| The two entries have the same name except that the first |
| one has |
| .Dq ._ |
| added to the beginning of the name. |
| This first entry stores the |
| .Dq resource fork |
| with additional attributes for the file. |
| The Mac OS X |
| .Fn CopyFile |
| API is used to separate a file on disk into separate |
| resource and data streams and to reassemble those separate |
| streams when the file is restored to disk. |
| .Ss Other Extensions |
| One obvious extension to increase the size of files is to |
| eliminate the terminating characters from the various |
| numeric fields. |
| For example, the standard only allows the size field to contain |
| 11 octal digits, reserving the twelfth byte for a trailing |
| NUL character. |
| Using all 12 bytes for octal digits permits file sizes up to 64 GB. |
| .Pp |
| Another extension, utilized by GNU tar, star, and other newer |
| .Nm |
| implementations, permits binary numbers in the standard numeric fields. |
| This is flagged by setting the high bit of the first byte. |
| This permits 95-bit values for the length and time fields |
| and 63-bit values for the uid, gid, and device numbers. |
| GNU tar supports this extension for the |
| length, mtime, ctime, and atime fields. |
| Joerg Schilling's star program supports this extension for |
| all numeric fields. |
| Note that this extension is largely obsoleted by the extended attribute |
| record provided by the pax interchange format. |
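.Pp
A reader that accepts both the octal and the binary form of a numeric field can be sketched as follows.
The function names are hypothetical; the sketch assumes the binary value is stored big-endian and handles only non-negative numbers, which matches the 95-bit and 63-bit ranges given above.

```python
def parse_numeric(field: bytes) -> int:
    """Decode a numeric header field that is either octal text or binary."""
    if field and field[0] & 0x80:
        # Binary form: clear the flag bit, read the rest as a big-endian integer.
        return int.from_bytes(bytes([field[0] & 0x7F]) + field[1:], "big")
    digits = field.split(b"\0", 1)[0].strip(b" ")
    return int(digits, 8) if digits else 0


def format_numeric(value: int, width: int) -> bytes:
    """Use octal with a trailing null when it fits, else the binary form."""
    if value < 0:
        raise ValueError("this sketch handles only non-negative values")
    if value < 8 ** (width - 1):
        return format(value, "o").rjust(width - 1, "0").encode("ascii") + b"\0"
    if value >= 1 << (8 * width - 1):
        raise ValueError("value does not fit in the field")
    raw = value.to_bytes(width, "big")
    return bytes([raw[0] | 0x80]) + raw[1:]
```

Falling back to the binary form only when octal does not fit keeps small values readable by implementations that do not know this extension.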
| .Pp |
| Another early GNU extension allowed base-64 values rather than octal. |
| This extension was short-lived and is no longer supported by any |
| implementation. |
| .Sh SEE ALSO |
| .Xr ar 1 , |
| .Xr pax 1 , |
| .Xr tar 1 |
| .Sh STANDARDS |
| The |
| .Nm tar |
| utility is no longer a part of POSIX or the Single UNIX Specification. |
| It last appeared in |
| .St -susv2 . |
| It has been supplanted in subsequent standards by |
| .Xr pax 1 . |
| The ustar format is currently part of the specification for the |
| .Xr pax 1 |
| utility. |
| The pax interchange file format is new with |
| .St -p1003.1-2001 . |
| .Sh HISTORY |
| A |
| .Nm tar |
| command appeared in Seventh Edition Unix, which was released in January 1979. |
| It replaced the |
| .Nm tp |
| program from Fourth Edition Unix, which in turn replaced the |
| .Nm tap |
| program from First Edition Unix. |
| John Gilmore's |
| .Nm pdtar |
| public-domain implementation (circa 1987) was highly influential |
| and formed the basis of |
| .Nm GNU tar |
| (circa 1988). |
| Joerg Schilling's |
| .Nm star |
| archiver is another open-source (GPL) archiver (originally developed |
| circa 1985) which features complete support for pax interchange |
| format. |
| .Pp |
| This documentation was written as part of the |
| .Nm libarchive |
| and |
| .Nm bsdtar |
| project by |
| .An Tim Kientzle Aq kientzle@FreeBSD.org . |