| .\" Copyright (c) 2007 Tim Kientzle |
| .\" All rights reserved. |
| .\" |
| .\" Redistribution and use in source and binary forms, with or without |
| .\" modification, are permitted provided that the following conditions |
| .\" are met: |
| .\" 1. Redistributions of source code must retain the above copyright |
| .\" notice, this list of conditions and the following disclaimer. |
| .\" 2. Redistributions in binary form must reproduce the above copyright |
| .\" notice, this list of conditions and the following disclaimer in the |
| .\" documentation and/or other materials provided with the distribution. |
| .\" |
| .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND |
| .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE |
| .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE |
| .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE |
| .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL |
| .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS |
| .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) |
| .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT |
| .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY |
| .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF |
| .\" SUCH DAMAGE. |
| .\" |
| .\" $FreeBSD: src/lib/libarchive/cpio.5,v 1.2 2008/05/26 17:00:23 kientzle Exp $ |
| .\" |
| .Dd October 5, 2007 |
| .Dt CPIO 5 |
| .Os |
| .Sh NAME |
| .Nm cpio |
| .Nd format of cpio archive files |
| .Sh DESCRIPTION |
| The |
| .Nm |
| archive format collects any number of files, directories, and other |
| file system objects (symbolic links, device nodes, etc.) into a single |
| stream of bytes. |
| .Ss General Format |
| Each file system object in a |
| .Nm |
| archive comprises a header record with basic numeric metadata |
| followed by the full pathname of the entry and the file data. |
| The header record stores a series of integer values that generally |
| follow the fields in |
| .Va struct stat . |
| (See |
| .Xr stat 2 |
| for details.) |
| The variants differ primarily in how they store those integers |
| (binary, octal, or hexadecimal). |
| The header is followed by the pathname of the |
| entry (the length of the pathname is stored in the header) |
| and any file data. |
| The end of the archive is indicated by a special record with |
| the pathname |
| .Dq TRAILER!!! . |
| .Ss PWB format |
| XXX Any documentation of the original PWB/UNIX 1.0 format? XXX |
| .Ss Old Binary Format |
| The old binary |
| .Nm |
| format stores numbers as 2-byte and 4-byte binary values. |
| Each entry begins with a header in the following format: |
| .Bd -literal -offset indent |
| struct header_old_cpio { |
| unsigned short c_magic; |
| unsigned short c_dev; |
| unsigned short c_ino; |
| unsigned short c_mode; |
| unsigned short c_uid; |
| unsigned short c_gid; |
| unsigned short c_nlink; |
| unsigned short c_rdev; |
| unsigned short c_mtime[2]; |
| unsigned short c_namesize; |
| unsigned short c_filesize[2]; |
| }; |
| .Ed |
| .Pp |
| The |
| .Va unsigned short |
| fields here are 16-bit integer values; the |
| .Va unsigned int |
| fields are 32-bit integer values. |
| The fields are as follows |
| .Bl -tag -width indent |
| .It Va magic |
| The integer value octal 070707. |
| This value can be used to determine whether this archive is |
| written with little-endian or big-endian integers. |
| .It Va dev , Va ino |
| The device and inode numbers from the disk. |
| These are used by programs that read |
| .Nm |
| archives to determine when two entries refer to the same file. |
| Programs that synthesize |
| .Nm |
| archives should be careful to set these to distinct values for each entry. |
| .It Va mode |
| The mode specifies both the regular permissions and the file type. |
| It consists of several bit fields as follows: |
| .Bl -tag -width "MMMMMMM" -compact |
| .It 0170000 |
| This masks the file type bits. |
| .It 0140000 |
| File type value for sockets. |
| .It 0120000 |
| File type value for symbolic links. |
| For symbolic links, the link body is stored as file data. |
| .It 0100000 |
| File type value for regular files. |
| .It 0060000 |
| File type value for block special devices. |
| .It 0040000 |
| File type value for directories. |
| .It 0020000 |
| File type value for character special devices. |
| .It 0010000 |
| File type value for named pipes or FIFOs. |
| .It 0004000 |
| SUID bit. |
| .It 0002000 |
| SGID bit. |
| .It 0001000 |
| Sticky bit. |
| On some systems, this modifies the behavior of executables and/or directories. |
| .It 0000777 |
| The lower 9 bits specify read/write/execute permissions |
| for world, group, and user following standard POSIX conventions. |
| .El |
| .It Va uid , Va gid |
| The numeric user id and group id of the owner. |
| .It Va nlink |
| The number of links to this file. |
| Directories always have a value of at least two here. |
| Note that hardlinked files include file data with every copy in the archive. |
| .It Va rdev |
| For block special and character special entries, |
| this field contains the associated device number. |
| For all other entry types, it should be set to zero by writers |
| and ignored by readers. |
| .It Va mtime |
| Modification time of the file, indicated as the number |
| of seconds since the start of the epoch, |
| 00:00:00 UTC January 1, 1970. |
| The four-byte integer is stored with the most-significant 16 bits first |
| followed by the least-significant 16 bits. |
| Each of the two 16 bit values are stored in machine-native byte order. |
| .It Va namesize |
| The number of bytes in the pathname that follows the header. |
| This count includes the trailing NUL byte. |
| .It Va filesize |
| The size of the file. |
| Note that this archive format is limited to |
| four gigabyte file sizes. |
| See |
| .Va mtime |
| above for a description of the storage of four-byte integers. |
| .El |
| .Pp |
| The pathname immediately follows the fixed header. |
| If the |
| .Cm namesize |
| is odd, an additional NUL byte is added after the pathname. |
| The file data is then appended, padded with NUL |
| bytes to an even length. |
| .Pp |
| Hardlinked files are not given special treatment; |
| the full file contents are included with each copy of the |
| file. |
| .Ss Portable ASCII Format |
| .St -susv2 |
| standardized an ASCII variant that is portable across all |
| platforms. |
| It is commonly known as the |
| .Dq old character |
| format or as the |
| .Dq odc |
| format. |
| It stores the same numeric fields as the old binary format, but |
| represents them as 6-character or 11-character octal values. |
| .Bd -literal -offset indent |
| struct cpio_odc_header { |
| char c_magic[6]; |
| char c_dev[6]; |
| char c_ino[6]; |
| char c_mode[6]; |
| char c_uid[6]; |
| char c_gid[6]; |
| char c_nlink[6]; |
| char c_rdev[6]; |
| char c_mtime[11]; |
| char c_namesize[6]; |
| char c_filesize[11]; |
| }; |
| .Ed |
| .Pp |
| The fields are identical to those in the old binary format. |
| The name and file body follow the fixed header. |
| Unlike the old binary format, there is no additional padding |
| after the pathname or file contents. |
| If the files being archived are themselves entirely ASCII, then |
| the resulting archive will be entirely ASCII, except for the |
| NUL byte that terminates the name field. |
| .Ss New ASCII Format |
| The "new" ASCII format uses 8-byte hexadecimal fields for |
| all numbers and separates device numbers into separate fields |
| for major and minor numbers. |
| .Bd -literal -offset indent |
| struct cpio_newc_header { |
| char c_magic[6]; |
| char c_ino[8]; |
| char c_mode[8]; |
| char c_uid[8]; |
| char c_gid[8]; |
| char c_nlink[8]; |
| char c_mtime[8]; |
| char c_filesize[8]; |
| char c_devmajor[8]; |
| char c_devminor[8]; |
| char c_rdevmajor[8]; |
| char c_rdevminor[8]; |
| char c_namesize[8]; |
| char c_check[8]; |
| }; |
| .Ed |
| .Pp |
| Except as specified below, the fields here match those specified |
| for the old binary format above. |
| .Bl -tag -width indent |
| .It Va magic |
| The string |
| .Dq 070701 . |
| .It Va check |
| This field is always set to zero by writers and ignored by readers. |
| See the next section for more details. |
| .El |
| .Pp |
| The pathname is followed by NUL bytes so that the total size |
| of the fixed header plus pathname is a multiple of four. |
| Likewise, the file data is padded to a multiple of four bytes. |
| Note that this format supports only 4 gigabyte files (unlike the |
| older ASCII format, which supports 8 gigabyte files). |
| .Pp |
| In this format, hardlinked files are handled by setting the |
| filesize to zero for each entry except the last one that |
| appears in the archive. |
| .Ss New CRC Format |
| The CRC format is identical to the new ASCII format described |
| in the previous section except that the magic field is set |
| to |
| .Dq 070702 |
| and the |
| .Va check |
| field is set to the sum of all bytes in the file data. |
| This sum is computed treating all bytes as unsigned values |
| and using unsigned arithmetic. |
| Only the least-significant 32 bits of the sum are stored. |
| .Ss HP variants |
| The |
| .Nm cpio |
| implementation distributed with HPUX used XXXX but stored |
| device numbers differently XXX. |
| .Ss Other Extensions and Variants |
| Sun Solaris uses additional file types to store extended file |
| data, including ACLs and extended attributes, as special |
| entries in cpio archives. |
| .Pp |
| XXX Others? XXX |
| .Sh BUGS |
| The |
| .Dq CRC |
| format is mis-named, as it uses a simple checksum and |
| not a cyclic redundancy check. |
| .Pp |
| The old binary format is limited to 16 bits for user id, |
| group id, device, and inode numbers. |
| It is limited to 4 gigabyte file sizes. |
| .Pp |
| The old ASCII format is limited to 18 bits for |
| the user id, group id, device, and inode numbers. |
| It is limited to 8 gigabyte file sizes. |
| .Pp |
| The new ASCII format is limited to 4 gigabyte file sizes. |
| .Pp |
| None of the cpio formats store user or group names, |
| which are essential when moving files between systems with |
| dissimilar user or group numbering. |
| .Pp |
| Especially when writing older cpio variants, it may be necessary |
| to map actual device/inode values to synthesized values that |
| fit the available fields. |
| With very large filesystems, this may be necessary even for |
| the newer formats. |
| .Sh SEE ALSO |
| .Xr cpio 1 , |
| .Xr tar 5 |
| .Sh STANDARDS |
| The |
| .Nm cpio |
| utility is no longer a part of POSIX or the Single Unix Standard. |
| It last appeared in |
| .St -susv2 . |
| It has been supplanted in subsequent standards by |
| .Xr pax 1 . |
| The portable ASCII format is currently part of the specification for the |
| .Xr pax 1 |
| utility. |
| .Sh HISTORY |
| The original cpio utility was written by Dick Haight |
| while working in AT&T's Unix Support Group. |
| It appeared in 1977 as part of PWB/UNIX 1.0, the |
| .Dq Programmer's Work Bench |
| derived from |
| .At v6 |
| that was used internally at AT&T. |
| Both the old binary and old character formats were in use |
| by 1980, according to the System III source released |
| by SCO under their |
| .Dq Ancient Unix |
| license. |
| The character format was adopted as part of |
| .St -p1003.1-88 . |
| XXX when did "newc" appear? Who invented it? When did HP come out with their variant? When did Sun introduce ACLs and extended attributes? XXX |