X-Git-Url: https://git.brokenzipper.com/gitweb?a=blobdiff_plain;f=doc%2Ftar.texi;h=1bbb5b7a09a97498d70200a486fefda029257d34;hb=e2dbba2f07a403fbdda06efea93b79be910a1402;hp=74f585b3d225c6e456be44eef8098916dc8033da;hpb=e6d15fc7af298d284c3c41f731fc5af7dd7d4245;p=chaz%2Ftar diff --git a/doc/tar.texi b/doc/tar.texi index 74f585b..1bbb5b7 100644 --- a/doc/tar.texi +++ b/doc/tar.texi @@ -260,7 +260,9 @@ Choosing Files and Names for @command{tar} * Selecting Archive Members:: * files:: Reading Names from a File * exclude:: Excluding Some Files -* Wildcards:: Wildcards Patterns and Matching +* wildcards:: Wildcards Patterns and Matching +* quoting styles:: Ways of Quoting Special Characters in Names +* transform:: Modifying File and Member Names * after:: Operating Only on New Files * recurse:: Descending into Directories * one:: Crossing File System Boundaries @@ -1335,7 +1337,7 @@ $ @kbd{tar --list --file=bfiles.tar --wildcards '*b*'} @end smallexample @noindent -will list all members whose name contains @samp{b}. @xref{Wildcards}, +will list all members whose name contains @samp{b}. @xref{wildcards}, for a detailed discussion of globbing patterns and related @command{tar} command line options. @@ -1480,7 +1482,7 @@ Here, @option{--wildcards} instructs @command{tar} to treat command line arguments as globbing patterns and @option{--no-anchored} informs it that the patterns apply to member names after any @samp{/} delimiter. The use of globbing patterns is discussed in detail in -@xref{Wildcards}. +@xref{wildcards}. You can extract a file to standard output by combining the above options with the @option{--to-stdout} (@option{-O}) option (@pxref{Writing to Standard @@ -1710,7 +1712,7 @@ the files in the file system to @command{tar}. The distinction between file names and archive member names is especially important when shell globbing is used, and sometimes a source of confusion -for newcomers. @xref{Wildcards}, for more information about globbing. +for newcomers. 
@xref{wildcards}, for more information about globbing. The problem is that shells may only glob using existing files in the file system. Only @command{tar} itself may glob on archive members, so when needed, you must ensure that wildcard characters reach @command{tar} without @@ -2592,8 +2594,9 @@ code. @xref{Writing to an External Program}. @opindex no-quote-chars, summary @item --no-quote-chars=@var{string} -Do not quote characters from @var{string}, even if the selected -quoting style implies they should be quoted (@FIXME-pxref{Quoting Styles}). +Remove characters listed in @var{string} from the list of quoted +characters set by the previous @option{--quote-chars} option +(@pxref{quoting styles}). @opindex no-recursion, summary @item --no-recursion @@ -2712,15 +2715,34 @@ anonymous anyway, so that might as well be the owner of anonymous archives. This option does not affect extraction from archives. +@opindex transform, summary +@item --transform=@var{sed-expr} + +Transform file or member names using the @command{sed} replacement expression +@var{sed-expr}. For example, + +@smallexample +$ @kbd{tar cf archive.tar --transform 's,^\./,usr/,' .} +@end smallexample + +@noindent +will add files from the current working directory to @file{archive.tar}, +replacing the initial @samp{./} prefix with @samp{usr/}. For a detailed +discussion, see @ref{transform}. + +To see transformed member names in verbose listings, use the +@option{--show-transformed-names} option +(@pxref{show-transformed-names}). + @opindex quote-chars, summary @item --quote-chars=@var{string} Always quote characters from @var{string}, even if the selected -quoting style would not quote them (@FIXME-pxref{Quoting Styles}). +quoting style would not quote them (@pxref{quoting styles}). @opindex quoting-style, summary @item --quoting-style=@var{style} Set quoting style to use when printing member and file names -(@FIXME-pxref{Quoting Styles}). +(@pxref{quoting styles}). 
Valid @var{style} values are: @code{literal}, @code{shell}, @code{shell-always}, @code{c}, @code{escape}, @code{locale}, and @code{clocale}. Default quoting style is @code{escape}, unless overridden while configuring the @@ -2962,11 +2984,14 @@ $ tar --show-defaults Instructs @command{tar} to mention directories its skipping over when operating on a @command{tar} archive. @xref{show-omitted-dirs}. +@opindex show-transformed-names, summary @opindex show-stored-names, summary -@item --show-stored-names +@item --show-transformed-names +@itemx --show-stored-names -This option has effect only when used in conjunction with one of -archive creation operations. It instructs tar to list the member names +Display file or member names after applying any transformations +(@FIXME-pxref{}). In particular, when used in conjunction with one of the +archive creation operations, it instructs @command{tar} to list the member names stored in the archive, as opposed to the actual file names. @xref{listing member and file names}. @@ -2997,7 +3022,7 @@ tar --extract --file archive.tar --strip-components=2 @end smallexample @noindent -would extracted this file to file @file{name}. +would extract this file to file @file{name}. @opindex suffix, summary @item --suffix=@var{suffix} @@ -4171,9 +4196,10 @@ blues tar: funk not found in archive @end smallexample -The spirit behind the @option{--compare} (@option{--diff}, @option{-d}) option is to check whether the -archive represents the current state of files on disk, more than validating -the integrity of the archive media. For this later goal, @xref{verify}. +The spirit behind the @option{--compare} (@option{--diff}, +@option{-d}) option is to check whether the archive represents the +current state of files on disk, more than validating the integrity of +the archive media. For the latter goal, see @ref{verify}. @node create options @section Options Used by @option{--create} @@ -5858,7 +5884,9 @@ This chapter discusses these options in detail. 
* Selecting Archive Members:: * files:: Reading Names from a File * exclude:: Excluding Some Files -* Wildcards:: Wildcards Patterns and Matching +* wildcards:: Wildcards Patterns and Matching +* quoting styles:: Ways of Quoting Special Characters in Names +* transform:: Modifying File and Member Names * after:: Operating Only on New Files * recurse:: Descending into Directories * one:: Crossing File System Boundaries @@ -6331,7 +6359,7 @@ file. @end itemize -@node Wildcards +@node wildcards @section Wildcards Patterns and Matching @dfn{Globbing} is the operation by which @dfn{wildcard} characters, @@ -6527,6 +6555,486 @@ The following table summarizes pattern-matching default values: @item Exclusion @tab @option{--wildcards --no-anchored --wildcards-match-slash} @end multitable +@node quoting styles +@section Quoting Member Names + +When displaying member names, @command{tar} takes care to avoid +ambiguities caused by certain characters. This is called @dfn{name +quoting}. The characters in question are: + +@itemize @bullet +@item Non-printable control characters: + +@multitable @columnfractions 0.20 0.10 0.60 +@headitem Character @tab ASCII @tab Character name +@item \a @tab 7 @tab Audible bell +@item \b @tab 8 @tab Backspace +@item \f @tab 12 @tab Form feed +@item \n @tab 10 @tab New line +@item \r @tab 13 @tab Carriage return +@item \t @tab 9 @tab Horizontal tabulation +@item \v @tab 11 @tab Vertical tabulation +@end multitable + +@item Space (ASCII 32) + +@item Single and double quotes (@samp{'} and @samp{"}) + +@item Backslash (@samp{\}) +@end itemize + +The exact way @command{tar} uses to quote these characters depends on +the @dfn{quoting style}. The default quoting style, called +@dfn{escape} (see below), uses backslash notation to represent control +characters, space and backslash. 
Using this quoting style, control +characters are represented as listed in column @samp{Character} in the +above table, a space is printed as @samp{\ } and a backslash as @samp{\\}. + +@GNUTAR{} offers seven distinct quoting styles, which can be selected +using @option{--quoting-style} option: + +@table @option +@item --quoting-style=@var{style} +@opindex quoting-style + +Sets quoting style. Valid values for @var{style} argument are: +literal, shell, shell-always, c, escape, locale, clocale. +@end table + +These styles are described in detail below. To illustrate their +effect, we will use an imaginary tar archive @file{arch.tar} +containing the following members: + +@smallexample +@group +# 1. Contains horizontal tabulation character. +a tab +# 2. Contains newline character +a +newline +# 3. Contains a space +a space +# 4. Contains double quotes +a"double"quote +# 5. Contains single quotes +a'single'quote +# 6. Contains a backslash character: +a\backslash +@end group +@end smallexample + +Here is how usual @command{ls} command would have listed them, if they +had existed in the current working directory: + +@smallexample +@group +$ @kbd{ls} +a\ttab +a\nnewline +a\ space +a"double"quote +a'single'quote +a\\backslash +@end group +@end smallexample + +Quoting styles: + +@table @samp +@item literal +No quoting, display each character as is: + +@smallexample +@group +$ @kbd{tar tf arch.tar --quoting-style=literal} +./ +./a space +./a'single'quote +./a"double"quote +./a\backslash +./a tab +./a +newline +@end group +@end smallexample + +@item shell +Display characters the same way Bourne shell does: +control characters, except @samp{\t} and @samp{\n}, are printed using +backslash escapes, @samp{\t} and @samp{\n} are printed as is, and a +single quote is printed as @samp{\'}. If a name contains any quoted +characters, it is enclosed in single quotes. 
In particular, if a name +contains single quotes, it is printed as several single-quoted strings: + +@smallexample +@group +$ @kbd{tar tf arch.tar --quoting-style=shell} +./ +'./a space' +'./a'\''single'\''quote' +'./a"double"quote' +'./a\backslash' +'./a tab' +'./a +newline' +@end group +@end smallexample + +@item shell-always +Same as @samp{shell}, but the names are always enclosed in single +quotes: + +@smallexample +@group +$ @kbd{tar tf arch.tar --quoting-style=shell-always} +'./' +'./a space' +'./a'\''single'\''quote' +'./a"double"quote' +'./a\backslash' +'./a tab' +'./a +newline' +@end group +@end smallexample + +@item c +Use the notation of the C programming language. All names are +enclosed in double quotes. Control characters are quoted using +backslash notation, double quotes are represented as @samp{\"}, and +backslash characters are represented as @samp{\\}. Single quotes and +spaces are not quoted: + +@smallexample +@group +$ @kbd{tar tf arch.tar --quoting-style=c} +"./" +"./a space" +"./a'single'quote" +"./a\"double\"quote" +"./a\\backslash" +"./a\ttab" +"./a\nnewline" +@end group +@end smallexample + +@item escape +Control characters are printed using backslash notation, a space is +printed as @samp{\ } and a backslash as @samp{\\}. This is the +default quoting style, unless it was changed when the package was +configured. + +@smallexample +@group +$ @kbd{tar tf arch.tar --quoting-style=escape} +./ +./a space +./a'single'quote +./a"double"quote +./a\\backslash +./a\ttab +./a\nnewline +@end group +@end smallexample + +@item locale +Control characters, single quote and backslash are printed using +backslash notation. All names are quoted using left and right +quotation marks, appropriate to the current locale. If it does not +define quotation marks, use @samp{`} as left and @samp{'} as right +quotation marks. 
Any occurrences of the right quotation mark in a +name are escaped with @samp{\}. + +For example: + +@smallexample +@group +$ @kbd{tar tf arch.tar --quoting-style=locale} +`./' +`./a space' +`./a\'single\'quote' +`./a"double"quote' +`./a\\backslash' +`./a\ttab' +`./a\nnewline' +@end group +@end smallexample + +@item clocale +Same as @samp{locale}, but @samp{"} is used for both left and right +quotation marks, if not provided by the currently selected locale: + +@smallexample +@group +$ @kbd{tar tf arch.tar --quoting-style=clocale} +"./" +"./a space" +"./a'single'quote" +"./a\"double\"quote" +"./a\\backslash" +"./a\ttab" +"./a\nnewline" +@end group +@end smallexample +@end table + +You can specify which characters should be quoted in addition to those +implied by the current quoting style: + +@table @option +@item --quote-chars=@var{string} +Always quote characters from @var{string}, even if the selected +quoting style would not quote them. +@end table + +For example, using @samp{escape} quoting (compare with the usual +escape listing above): + +@smallexample +@group +$ @kbd{tar tf arch.tar --quoting-style=escape --quote-chars=' "'} +./ +./a\ space +./a'single'quote +./a\"double\"quote +./a\\backslash +./a\ttab +./a\nnewline +@end group +@end smallexample + +To disable quoting of such additional characters, use the following +option: + +@table @option +@item --no-quote-chars=@var{string} +Remove characters listed in @var{string} from the list of quoted +characters set by the previous @option{--quote-chars} option. +@end table + +This option is particularly useful if you have added +@option{--quote-chars} to your @env{TAR_OPTIONS} (@pxref{TAR_OPTIONS}) +and wish to disable it for the current invocation. + +Note that @option{--no-quote-chars} does @emph{not} disable quoting of +characters that are quoted by default in the selected quoting style. 
+ +@node transform +@section Modifying File and Member Names + +@command{Tar} archives contain detailed information about files stored +in them and full file names are part of that information. When +storing a file to an archive, its file name is recorded in the archive +along with the actual file contents. When restoring from an archive, +a file is created on disk with exactly the same name as that stored +in the archive. In the majority of cases this is the desired behavior +of a file archiver. However, there are some cases when it is not. + +First of all, it is often unsafe to extract archive members with +absolute file names or those that begin with @file{../}. @GNUTAR{} +takes special precautions when extracting such names and provides a +special option for handling them, which is described in +@ref{absolute}. + +Secondly, you may wish to extract files without some leading +directory components, or with otherwise modified names. In other +cases it is desirable to store files under differing names in the +archive. + +@GNUTAR{} provides two options for these needs. + +@table @option +@opindex strip-components +@item --strip-components=@var{number} +Strip the given @var{number} of leading components from file names before +extraction. +@end table + +For example, suppose you have archived the whole @file{/usr} hierarchy to +a tar archive named @file{usr.tar}. Among other files, this archive +contains @file{usr/include/stdlib.h}, which you wish to extract to +the current working directory. To do so, you type: + +@smallexample +$ @kbd{tar -xf usr.tar --strip=2 usr/include/stdlib.h} +@end smallexample + +The option @option{--strip=2} instructs @command{tar} to strip the +two leading components (@file{usr/} and @file{include/}) off the file +name. + +If you add the @option{--verbose} (@option{-v}) option to the above +invocation, you will note that the verbose listing still contains the +full file name, with the two removed components still in place. 
This +can be inconvenient, so @command{tar} provides a special option for +altering this behavior: + +@anchor{show-transformed-names} +@table @option +@opindex show-transformed-names +@item --show-transformed-names +Display file or member names with all requested transformations +applied. +@end table + +For example: + +@smallexample +@group +$ @kbd{tar -xf usr.tar -v --strip=2 usr/include/stdlib.h} +usr/include/stdlib.h +$ @kbd{tar -xf usr.tar -v --strip=2 --show-transformed usr/include/stdlib.h} +stdlib.h +@end group +@end smallexample + +Notice that in both cases the file @file{stdlib.h} is extracted to the +current working directory; @option{--show-transformed-names} affects +only the way its name is displayed. + +This option is especially useful for verifying whether the invocation +will have the desired effect. Thus, before running + +@smallexample +$ @kbd{tar -x --strip=@var{n}} +@end smallexample + +@noindent +it is often advisable to run + +@smallexample +$ @kbd{tar -t -v --show-transformed --strip=@var{n}} +@end smallexample + +@noindent +to make sure the command will produce the intended results. + +In case you need to apply more complex modifications to the file name, +@GNUTAR{} provides a general-purpose transformation option: + +@table @option +@opindex transform +@item --transform=@var{expression} +Modify file names using the supplied @var{expression}. +@end table + +@noindent +The @var{expression} is a @command{sed}-like replace expression of the +form: + +@smallexample +s/@var{regexp}/@var{replace}/[@var{flags}] +@end smallexample + +@noindent +where @var{regexp} is a @dfn{regular expression} and @var{replace} is a +replacement for each file name part that matches @var{regexp}. Both +@var{regexp} and @var{replace} are described in detail in +@ref{The "s" Command, The "s" Command, The `s' Command, sed, GNU sed}. + +Supported @var{flags} are: + +@table @samp +@item g +Apply the replacement to @emph{all} matches to the @var{regexp}, not +just the first. 
+ +@item i +Use case-insensitive matching. + +@item x +@var{regexp} is an @dfn{extended regular expression} (@pxref{Extended +regexps, Extended regular expressions, Extended regular expressions, +sed, GNU sed}). + +@item @var{number} +Only replace the @var{number}th match of the @var{regexp}. + +Note: the @acronym{POSIX} standard does not specify what should happen +when you mix the @samp{g} and @var{number} modifiers. @GNUTAR{} +follows the GNU @command{sed} implementation in this regard, so +the interaction is defined to be: ignore matches before the +@var{number}th, and then match and replace all matches from the +@var{number}th on. + +@end table + +Any delimiter can be used in lieu of @samp{/}, the only requirement being +that it be used consistently throughout the expression. For example, +the following two expressions are equivalent: + +@smallexample +@group +s/one/two/ +s,one,two, +@end group +@end smallexample + +Changing the delimiter is often useful when the @var{regexp} contains +slashes. For example, it is more convenient to write: + +@smallexample +s,/,-, +@end smallexample + +@noindent +instead of + +@smallexample +s/\//-/ +@end smallexample + +Here are several examples of @option{--transform} usage: + +@enumerate +@item Extract @file{usr/} hierarchy into @file{usr/local/}: + +@smallexample +$ @kbd{tar --transform='s,usr/,usr/local/,' -x -f arch.tar} +@end smallexample + +@item Strip two leading directory components (equivalent to +@option{--strip-components=2}): + +@smallexample +$ @kbd{tar --transform='s,/*[^/]*/[^/]*/,,' -x -f arch.tar} +@end smallexample + +@item Prepend @file{/prefix/} to each file name: + +@smallexample +$ @kbd{tar --transform 's,^,/prefix/,' -x -f arch.tar} +@end smallexample + +@item Convert each file name to lower case: + +@smallexample +$ @kbd{tar --transform 's/.*/\L&/' -x -f arch.tar} +@end smallexample + +@end enumerate + +Unlike @option{--strip-components}, @option{--transform} can be used +in any @GNUTAR{} operation mode. 
For example, the following command +adds files to the archive while replacing the leading @file{usr/} +component with @file{var/}: + +@smallexample +$ @kbd{tar -cf arch.tar --transform='s,^usr/,var/,' /} +@end smallexample + +To test the effect of @option{--transform}, we suggest using +@option{--show-transformed-names}: + +@smallexample +$ @kbd{tar -cf arch.tar --transform='s,^usr/,var/,' \ + --verbose --show-transformed-names /} +@end smallexample + +If both @option{--strip-components} and @option{--transform} are used +together, then @option{--transform} is applied first, and the required +number of components is then stripped from its result. + @node after @section Operating Only on New Files @UNREVISED @@ -9631,7 +10139,7 @@ To treat member names as globbing patterns, use --wildcards option. If you want to tar to mimic the behavior of versions prior to 1.15.91, add this option to your @env{TAR_OPTIONS} variable. -@xref{Wildcards}, for the detailed discussion of the use of globbing +@xref{wildcards}, for the detailed discussion of the use of globbing patterns by @GNUTAR{}. @item Use of short option @option{-o}.