@c This file is distributed under GFDL 1.1 or any later version
@c published by the Free Software Foundation.
+@cindex sparse formats
+@cindex sparse versions
The notion of a sparse file, and the ways of handling it from the point
of view of a @GNUTAR{} user, have been described in detail in
@ref{sparse}. This chapter describes the internal format @GNUTAR{}
@node Old GNU Format
@appendixsubsec Old GNU Format
+@cindex sparse formats, Old GNU
+@cindex Old GNU sparse format
This format was introduced some time around 1990 (v. 1.09). It was
designed on top of standard @code{ustar} headers in such an
unfortunate way that some of its fields overwrote fields required by
@node PAX 0
@appendixsubsec PAX Format, Versions 0.0 and 0.1
-@UNREVISED{}
+@cindex sparse formats, v.0.0
There are two formats available in this branch. The version @code{0.0}
is the initial version of sparse format used by @command{tar}
versions 1.14--1.15.1. The sparse file map is kept in extended
(@code{x}) PAX header variables:
@table @code
+@vrindex GNU.sparse.size, extended header variable
@item GNU.sparse.size
Real size of the stored file
@item GNU.sparse.numblocks
+@vrindex GNU.sparse.numblocks, extended header variable
Number of blocks in the sparse map
@item GNU.sparse.offset
+@vrindex GNU.sparse.offset, extended header variable
Offset of the data block
@item GNU.sparse.numbytes
+@vrindex GNU.sparse.numbytes, extended header variable
Size of the data block
@end table
format, it will also extract a file containing extension header
attributes. This file can be used to expand the file to its original
state. However, posix-aware @command{tar}s will usually ignore the
-unknown variables, which makes restoring the file much more
-difficult@FIXME-xref{how to extract sparse file using third-party @command{tar}s}.
+unknown variables, which makes restoring the file more
+difficult. @xref{extracting sparse v.0.x, Extraction of sparse
+members in v.0.0 format}, for a detailed description of how to
+restore such members using non-GNU @command{tar}s.
@end enumerate
+@cindex sparse formats, v.0.1
@GNUTAR{} 1.15.2 introduced sparse format version @code{0.1}, which
attempted to solve these problems. Like its predecessor, this format
stores the sparse map in the extended POSIX header. It retains
@table @code
@item GNU.sparse.map
+@vrindex GNU.sparse.map, extended header variable
Map of non-null data chunks. It is a string consisting of
comma-separated values "@var{offset},@var{size}[,@var{offset-1},@var{size-1}...]"
@end table
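
The layout of the @code{GNU.sparse.map} value can be illustrated with a
minimal Python sketch. It is not taken from @GNUTAR{}; the function name
@code{parse_sparse_map} is hypothetical, and the sketch assumes that
every field is a decimal integer.

```python
def parse_sparse_map(value):
    """Split 'offset,size[,offset,size...]' into (offset, size) pairs.

    Assumption: every field is a decimal integer.
    """
    fields = value.split(",")
    if len(fields) % 2 != 0:
        raise ValueError("sparse map must hold an even number of fields")
    numbers = [int(field) for field in fields]
    # Even positions are offsets, odd positions are chunk sizes.
    return list(zip(numbers[0::2], numbers[1::2]))


# Two data chunks: 512 bytes at offset 0 and 1024 bytes at offset 4096.
chunks = parse_sparse_map("0,512,4096,1024")
```

Each resulting pair gives the position of one data chunk in the original
file and the number of bytes stored for it.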
%d/GNUSparseFile.%p/%f
@end smallexample
+@vrindex GNU.sparse.name, extended header variable
The real name of the sparse file is stored in the variable
@code{GNU.sparse.name}. Thus, those @command{tar} implementations
that are not aware of GNU extensions will at least extract the files
@node PAX 1
@appendixsubsec PAX Format, Version 1.0
-@UNREVISED{}
+@cindex sparse formats, v.1.0
The version @code{1.0} of sparse format was introduced with @GNUTAR{}
1.15.92. Its main objective was to make the resulting file
extractable with little effort even by non-POSIX-aware @command{tar}
@table @code
@item GNU.sparse.major
+@vrindex GNU.sparse.major, extended header variable
Major version
@item GNU.sparse.minor
+@vrindex GNU.sparse.minor, extended header variable
Minor version
@end table
%d/GNUSparseFile.%p/%f
@end smallexample
+@vrindex GNU.sparse.name, extended header variable, in v.1.0
+@vrindex GNU.sparse.realsize, extended header variable
The real name of the sparse file is stored in the variable
@code{GNU.sparse.name}. The real size of the file is stored in the
variable @code{GNU.sparse.realsize}.
supporting @code{GNU.sparse.*} keywords will extract each sparse file
in its condensed form with the file map prepended and will place it
into a separate directory. Then, using a simple program it would be
-possible to expand the file to its original form even without GNU tar.
-@FIXME-xref{how to extract sparse file using third-party
-@command{tar}s}. @FIXME{Write the program and give its URL here}.
+possible to expand the file to its original form even without @GNUTAR{}.
+@xref{Sparse Recovery}, for detailed information on how to extract
+sparse members without @GNUTAR{}.
@c Maintenance notes:
@c 1. Pay attention to @FIXME{}s and @UNREVISED{}s
@c 2. Before creating final variant:
-@c 1.1. Run `make check-options' to make sure all options are properly
+@c 2.1. Run `make check-options' to make sure all options are properly
@c documented;
-@c 2.1. Run `make master-menu' (see comment before the master menu).
+@c 2.2. Run `make master-menu' (see comment before the master menu).
@include rendition.texi
@include value.texi
You can use @command{tar} archives in many ways. We want to stress a few
of them: storage, backup, and transportation.
-@FIXME{the following table entries need a bit of work..}
+@FIXME{the following table entries need a bit of work.}
@table @asis
@item Storage
Often, @command{tar} archives are used to store related files for
@node Split Recovery
@subsubsection Extracting Members Split Between Volumes
+@cindex Multi-volume archives, extracting using non-GNU tars
If a member is split between several volumes of an old GNU format archive
most third party @command{tar} implementations will fail to extract
it. To extract it, use the @command{tarcat} program (@pxref{Tarcat}).
$ @kbd{tarcat vol-1.tar vol-2.tar vol-3.tar | tar xf -}
@end smallexample
+@cindex Multi-volume archives in PAX format, extracting using non-GNU tars
You could use this approach for many (although not all) PAX
format archives as well. However, extracting split members from a PAX
archive is a much easier task, because PAX volumes are constructed in
@node Sparse Recovery
@subsubsection Extracting Sparse Members
+@cindex sparse files, extracting with non-GNU tars
Any @command{tar} implementation will be able to extract sparse members from a
PAX archive. However, the extracted files will be @dfn{condensed},
i.e. any zero blocks will be removed from them. When we restore such
@dfn{holes}) back to their original locations, we call this process
@dfn{expanding} a condensed sparse file.
+@pindex xsparse
To expand a file, you will need a simple auxiliary program called
@command{xsparse}. It is available in source form from
@uref{http://www.gnu.org/@/software/@/tar/@/utils/@/xsparse.html, @GNUTAR{}
home page}.
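
To make the idea of expanding more concrete, here is a minimal Python
sketch of the operation. It is not the @command{xsparse} program and
does not read any of the on-disk map layouts. The name @code{expand} is
hypothetical, and the sketch assumes that the sparse map is already
available as a list of (offset, size) pairs and that the real file size
is known.

```python
import io
import tempfile


def expand(condensed, chunks, real_size, out):
    """Copy each data chunk from `condensed` to its original offset in `out`.

    Assumptions: `chunks` is a list of (offset, size) pairs, and `out` is
    a real file opened in binary mode for writing.
    """
    for offset, size in chunks:
        out.seek(offset)                 # skip over the hole, if any
        out.write(condensed.read(size))  # data chunks are stored back to back
    out.truncate(real_size)              # restore a trailing hole, if any


# Usage: six condensed bytes spread over a 12-byte file with three holes.
with tempfile.TemporaryFile() as out:
    expand(io.BytesIO(b"abcdef"), [(2, 3), (8, 3)], 12, out)
    out.seek(0)
    expanded = out.read()
```

Seeking past the end of the output before writing is what recreates the
holes. Whether they occupy disk space afterwards depends on the file
system.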
+@cindex sparse files v.1.0, extracting with non-GNU tars
Let's begin with archive members in @dfn{sparse format
version 1.0}@footnote{@xref{PAX 1}.}, which are the easiest to expand.
The condensed file will contain both file map and file data, so no
@end group
@end smallexample
+@anchor{extracting sparse v.0.x}
+@cindex sparse files v.0.1, extracting with non-GNU tars
+@cindex sparse files v.0.0, extracting with non-GNU tars
An @dfn{extended header} is a special @command{tar} archive header
that precedes an archive member and contains a set of
@dfn{variables}, describing the member properties that cannot be
@option{--label=@var{archive-label}} again in conjunction with the
@option{--append}, @option{--update} or @option{--concatenate} operation.
-@FIXME{This is no longer true: Multivolume archives in @samp{POSIX}
-format can be extracted using any posix-compliant tar
-implementation. The split members can then be recreated from parts
-using a simple shell script. Provide more information about it:}
-Beware that there is @emph{no} real standard about the proper way, for
-a @command{tar} archive, to span volume boundaries. If you have a
-multi-volume created by some vendor's @command{tar}, there is almost
-no chance you could read all the volumes with @GNUTAR{}.
-The converse is also true: you may not expect
-multi-volume archives created by @GNUTAR{} to be
-fully recovered by vendor's @command{tar}. Since there is little
-chance that, in mixed system configurations, some vendor's
-@command{tar} will work on another vendor's machine, and there is a
-great chance that @GNUTAR{} will work on most of
-them, your best bet is to install @GNUTAR{} on all
-machines between which you know exchange of files is possible.
+Note that multi-volume support is a GNU extension, and archives
+created in this mode should be read using only @GNUTAR{}. If you
+absolutely have to process such archives using a third-party @command{tar}
+implementation, see @ref{Split Recovery}.
@node Tape Files
@subsection Tape Files