From: Paul Eggert Date: Fri, 24 Sep 2010 02:41:05 +0000 (-0700) Subject: tar: --dereference consistency X-Git-Url: https://git.brokenzipper.com/gitweb?a=commitdiff_plain;h=14efeb9f956e38d7beaf3fbedb04d3f3bb9ece3a;p=chaz%2Ftar tar: --dereference consistency This closes another race condition, that occurs when overwriting a symlink with a regular file. * NEWS (--dereference consistency): New section. * doc/tar.texi (Option Summary): Describe new --deference behavior. (dereference): Likewise. Remove discussion that I didn't follow, even before --dereference was changed. * src/common.h (deref_stat, set_file_atime): Adjust signatures. * src/compare.c (diff_file, diff_multivol): Respect open_read_flags instead of rolling our own flags. This implements the new behavior for --dereference. (diff_file, diff_dumpdir): Likewise, for fstatat_flags. * src/create.c: Adjust to set_file_atime signature change. * src/extract.c (mark_after_links, file_newer_p, extract_dir): Likewise. * src/incremen.c (try_purge_directory): Likewise. * src/misc.c (maybe_backup_file): Likewise. * src/extract.c (file_newer_p): New arg STP. All callers changed. (maybe_recoverable): New arg REGULAR. All callers changed. Handle the case of overwriting a symlink with a regular file, when --overwrite is specified but --dereference is not. (open_output_file): Add O_CLOEXEC, O_NOCTTY, O_NONBLOCK for consistency with file creation. Add O_NOFOLLOW if overwriting_old_files && ! dereference_option. * src/incremen.c (update_parent_directory): Use fstat, not fstatat; there's less to go wrong. * src/misc.c (deref_stat): Remove DEREF arg. All callers changed. Instead, use fstatat_flags. (set_file_atime): Remove ATFLAG arg. All callers changed. Instead, use fstatat_flags. * src/names.c, src/update.c: Adjust to deref_stat signature change. * src/tar.c (get_date_or_file): Use stat, not deref_stat, as this is not a file to be archived. * tests/Makefile.am (TESTSUITE_AT): Add extrac13.at. * tests/extrac13.at: New file. * tests/testsuite.at: Include it. --- diff --git a/NEWS b/NEWS index 58fd2ff..1484c55 100644 --- a/NEWS +++ b/NEWS @@ -1,4 +1,4 @@ -GNU tar NEWS - User visible changes. 2010-09-17 +GNU tar NEWS - User visible changes. 2010-09-23 Please send GNU tar bug reports to @@ -33,6 +33,22 @@ supports this. For example, recent versions of the Linux kernel support setting times on symlinks, and some BSD kernels also support symlink permissions. +** --dereference consistency + +The --dereference (-h) option now applies to files that are copied +into or out of archives, independently of other options. For example, +if F is a symbolic link and archive.tar contains a regular-file member +also named F, "tar --overwrite -x -f archive.tar F" now overwrites F +itself, rather than the file that F points to. (To overwrite the file +that F points to, add the --dereference (-h) option.) Formerly, +--dereference was intended to apply only when using the -c option, but +the implementation was not consistent. + +Also, the --dereference option no longer affects accesses to other +files, such as archives and time stamp files. Symbolic links to these +files are always followed. Previously, the links were usually but not +always followed. + ** Spurious error diagnostics on broken pipe. When receiving SIGPIPE, tar would exit with error status and diff --git a/doc/tar.texi b/doc/tar.texi index 983e47a..23a5e76 100644 --- a/doc/tar.texi +++ b/doc/tar.texi @@ -2559,9 +2559,9 @@ directories until the end of extraction. @xref{Directory Modification Times and @item --dereference @itemx -h -When creating a @command{tar} archive, @command{tar} will archive the -file that a symbolic link points to, rather than archiving the -symlink. @xref{dereference}. +When reading or writing a file to be archived, @command{tar} accesses +the file that a symbolic link points to, rather than the symlink +itself. @xref{dereference}. @opsummary{directory} @item --directory=@var{dir} @@ -9319,30 +9319,25 @@ than System V's. Normally, when @command{tar} archives a symbolic link, it writes a block to the archive naming the target of the link. In that way, the @command{tar} archive is a faithful record of the file system contents. -@option{--dereference} (@option{-h}) is used with @option{--create} (@option{-c}), and causes -@command{tar} to archive the files symbolic links point to, instead of -the links themselves. When this option is used, when @command{tar} -encounters a symbolic link, it will archive the linked-to file, -instead of simply recording the presence of a symbolic link. - -The name under which the file is stored in the file system is not -recorded in the archive. To record both the symbolic link name and -the file name in the system, archive the file under both names. If -all links were recorded automatically by @command{tar}, an extracted file -might be linked to a file name that no longer exists in the file -system. - -If a linked-to file is encountered again by @command{tar} while creating -the same archive, an entire second copy of it will be stored. (This -@emph{might} be considered a bug.) +When @option{--dereference} (@option{-h}) is used with +@option{--create} (@option{-c}), @command{tar} archives the files +symbolic links point to, instead of +the links themselves. -So, for portable archives, do not archive symbolic links as such, -and use @option{--dereference} (@option{-h}): many systems do not support +When creating portable archives, use @option{--dereference} +(@option{-h}): some systems do not support symbolic links, and moreover, your distribution might be unusable if it contains unresolved symbolic links. -The @option{--dereference} option is not secure if an untrusted user -can modify files during creation or extraction. @xref{Security}. +When reading from an archive, the @option{--dereference} (@option{-h}) +option causes @command{tar} to follow an already-existing symbolic +link when @command{tar} writes or reads a file named in the archive. +Ordinarily, @command{tar} does not follow such a link, though it may +remove the link before writing a new file. @xref{Dealing with Old +Files}. + +The @option{--dereference} option is unsafe if an untrusted user can +modify directories while @command{tar} is running. @xref{Security}. @node hard links @subsection Hard Links diff --git a/src/common.h b/src/common.h index 12a4889..192cf9e 100644 --- a/src/common.h +++ b/src/common.h @@ -611,7 +611,7 @@ int remove_any_file (const char *file_name, enum remove_option option); bool maybe_backup_file (const char *file_name, bool this_is_the_archive); void undo_last_backup (void); -int deref_stat (bool deref, char const *name, struct stat *buf); +int deref_stat (char const *name, struct stat *buf); extern int chdir_current; extern int chdir_fd; @@ -640,7 +640,7 @@ void xpipe (int fd[2]); void *page_aligned_alloc (void **ptr, size_t size); int set_file_atime (int fd, int parentfd, char const *file, - struct timespec atime, int atflag); + struct timespec atime); /* Module names.c. */ diff --git a/src/compare.c b/src/compare.c index 1ee9bcb..6b7e6d8 100644 --- a/src/compare.c +++ b/src/compare.c @@ -151,7 +151,7 @@ read_and_process (struct tar_stat_info *st, int (*processor) (size_t, char *)) static int get_stat_data (char const *file_name, struct stat *stat_data) { - int status = deref_stat (dereference_option, file_name, stat_data); + int status = deref_stat (file_name, stat_data); if (status != 0) { @@ -217,14 +217,7 @@ diff_file (void) } else { - int atime_flag = - (atime_preserve_option == system_atime_preserve - ? O_NOATIME - : 0); - - diff_handle = openat (chdir_fd, file_name, - (O_RDONLY | O_BINARY | O_CLOEXEC | O_NOCTTY - | O_NONBLOCK | atime_flag)); + diff_handle = openat (chdir_fd, file_name, open_read_flags); if (diff_handle < 0) { @@ -244,8 +237,7 @@ diff_file (void) if (atime_preserve_option == replace_atime_preserve) { struct timespec atime = get_stat_atime (&stat_data); - if (set_file_atime (diff_handle, chdir_fd, file_name, - atime, 0) + if (set_file_atime (diff_handle, chdir_fd, file_name, atime) != 0) utime_error (file_name); } @@ -372,7 +364,7 @@ diff_dumpdir (void) dev_t dev = 0; struct stat stat_data; - if (deref_stat (true, current_stat_info.file_name, &stat_data)) + if (deref_stat (current_stat_info.file_name, &stat_data) != 0) { if (errno == ENOENT) stat_warn (current_stat_info.file_name); @@ -399,10 +391,6 @@ diff_multivol (void) struct stat stat_data; int fd, status; off_t offset; - int atime_flag = - (atime_preserve_option == system_atime_preserve - ? O_NOATIME - : 0); if (current_stat_info.had_trailing_slash) { @@ -429,9 +417,7 @@ diff_multivol (void) } - fd = openat (chdir_fd, current_stat_info.file_name, - (O_RDONLY | O_BINARY | O_CLOEXEC | O_NOCTTY | O_NONBLOCK - | atime_flag)); + fd = openat (chdir_fd, current_stat_info.file_name, open_read_flags); if (fd < 0) { diff --git a/src/create.c b/src/create.c index 26993c2..05af0d9 100644 --- a/src/create.c +++ b/src/create.c @@ -1795,9 +1795,7 @@ dump_file0 (struct tar_stat_info *st, char const *name, char const *p) set_exit_status (TAREXIT_DIFFERS); } else if (atime_preserve_option == replace_atime_preserve - && (set_file_atime (fd, parentfd, name, - st->atime, fstatat_flags) - != 0)) + && set_file_atime (fd, parentfd, name, st->atime) != 0) utime_error (p); } diff --git a/src/extract.c b/src/extract.c index 2730a03..0a1a032 100644 --- a/src/extract.c +++ b/src/extract.c @@ -375,7 +375,7 @@ mark_after_links (struct delayed_set_stat *head) struct stat st; h->after_links = 1; - if (fstatat (chdir_fd, h->file_name, &st, 0) != 0) + if (deref_stat (h->file_name, &st) != 0) stat_error (h->file_name); else { @@ -548,27 +548,32 @@ make_directories (char *file_name) return did_something; /* tell them to retry if we made one */ } +/* Return true if FILE_NAME (with status *STP, if STP) is not a + directory, and has a time stamp newer than (or equal to) that of + TAR_STAT. */ static bool -file_newer_p (const char *file_name, struct tar_stat_info *tar_stat) +file_newer_p (const char *file_name, struct stat const *stp, + struct tar_stat_info *tar_stat) { struct stat st; - if (fstatat (chdir_fd, file_name, &st, 0)) + if (!stp) { - if (errno != ENOENT) + if (deref_stat (file_name, &st) != 0) { - stat_warn (file_name); - /* Be on the safe side: if the file does exist assume it is newer */ - return true; + if (errno != ENOENT) + { + stat_warn (file_name); + /* Be safer: if the file exists, assume it is newer. */ + return true; + } + return false; } - return false; - } - if (!S_ISDIR (st.st_mode) - && tar_timespec_cmp (tar_stat->mtime, get_stat_mtime (&st)) <= 0) - { - return true; + stp = &st; } - return false; + + return (! S_ISDIR (stp->st_mode) + && tar_timespec_cmp (tar_stat->mtime, get_stat_mtime (stp)) <= 0); } #define RECOVER_NO 0 @@ -580,18 +585,39 @@ file_newer_p (const char *file_name, struct tar_stat_info *tar_stat) Return RECOVER_OK if we somewhat increased our chances at a successful extraction, RECOVER_NO if there are no chances, and RECOVER_SKIP if the caller should skip extraction of that member. The value of errno is - properly restored on returning RECOVER_NO. */ + properly restored on returning RECOVER_NO. + + If REGULAR, the caller was trying to extract onto a regular file. + + Set *INTERDIR_MADE if an intermediate directory is made as part of + the recovery process. */ static int -maybe_recoverable (char *file_name, bool *interdir_made) +maybe_recoverable (char *file_name, bool regular, bool *interdir_made) { int e = errno; + struct stat st; + struct stat const *stp = 0; if (*interdir_made) return RECOVER_NO; - switch (errno) + switch (e) { + case ELOOP: + if (! regular + || old_files_option != OVERWRITE_OLD_FILES || dereference_option) + break; + if (strchr (file_name, '/')) + { + if (deref_stat (file_name, &st) != 0) + break; + stp = &st; + } + + /* The caller tried to open a symbolic link with O_NOFOLLOW. + Fall through, treating it as an already-existing file. */ + case EEXIST: /* Remove an old file, if the options allow this. */ @@ -601,21 +627,16 @@ maybe_recoverable (char *file_name, bool *interdir_made) return RECOVER_SKIP; case KEEP_NEWER_FILES: - if (file_newer_p (file_name, ¤t_stat_info)) - { - errno = e; - return RECOVER_NO; - } + if (file_newer_p (file_name, stp, ¤t_stat_info)) + break; /* FALL THROUGH */ case DEFAULT_OLD_FILES: case NO_OVERWRITE_DIR_OLD_FILES: case OVERWRITE_OLD_FILES: - { - int r = remove_any_file (file_name, ORDINARY_REMOVE_OPTION); - errno = EEXIST; - return r > 0 ? RECOVER_OK : RECOVER_NO; - } + if (0 < remove_any_file (file_name, ORDINARY_REMOVE_OPTION)) + return RECOVER_OK; + break; case UNLINK_FIRST_OLD_FILES: break; @@ -623,19 +644,20 @@ maybe_recoverable (char *file_name, bool *interdir_made) case ENOENT: /* Attempt creating missing intermediate directories. */ - if (! make_directories (file_name)) + if (make_directories (file_name)) { - errno = ENOENT; - return RECOVER_NO; + *interdir_made = true; + return RECOVER_OK; } - *interdir_made = true; - return RECOVER_OK; + break; default: /* Just say we can't do anything about it... */ - - return RECOVER_NO; + break; } + + errno = e; + return RECOVER_NO; } /* Fix the statuses of all directories whose statuses need fixing, and @@ -769,7 +791,7 @@ extract_dir (char *file_name, int typeflag) || old_files_option == OVERWRITE_OLD_FILES)) { struct stat st; - if (fstatat (chdir_fd, file_name, &st, 0) == 0) + if (deref_stat (file_name, &st) == 0) { current_mode = st.st_mode; current_mode_mask = ALL_MODE_BITS; @@ -787,7 +809,7 @@ extract_dir (char *file_name, int typeflag) errno = EEXIST; } - switch (maybe_recoverable (file_name, &interdir_made)) + switch (maybe_recoverable (file_name, false, &interdir_made)) { case RECOVER_OK: continue; @@ -823,8 +845,11 @@ open_output_file (char const *file_name, int typeflag, mode_t mode, { int fd; bool overwriting_old_files = old_files_option == OVERWRITE_OLD_FILES; - int openflag = (O_WRONLY | O_BINARY | O_CREAT - | (overwriting_old_files ? O_TRUNC : O_EXCL)); + int openflag = (O_WRONLY | O_BINARY | O_CLOEXEC | O_NOCTTY | O_NONBLOCK + | O_CREAT + | (overwriting_old_files + ? O_TRUNC | (dereference_option ? 0 : O_NOFOLLOW) + : O_EXCL)); if (typeflag == CONTTYPE) { @@ -898,21 +923,19 @@ extract_file (char *file_name, int typeflag) } else { - int recover = RECOVER_NO; - do - fd = open_output_file (file_name, typeflag, mode, - ¤t_mode, ¤t_mode_mask); - while (fd < 0 - && (recover = maybe_recoverable (file_name, &interdir_made)) - == RECOVER_OK); - - if (fd < 0) + while ((fd = open_output_file (file_name, typeflag, mode, + ¤t_mode, ¤t_mode_mask)) + < 0) { - skip_member (); - if (recover == RECOVER_SKIP) - return 0; - open_error (file_name); - return 1; + int recover = maybe_recoverable (file_name, true, &interdir_made); + if (recover != RECOVER_OK) + { + skip_member (); + if (recover == RECOVER_SKIP) + return 0; + open_error (file_name); + return 1; + } } } @@ -994,7 +1017,7 @@ create_placeholder_file (char *file_name, bool is_symlink, bool *interdir_made) while ((fd = openat (chdir_fd, file_name, O_WRONLY | O_CREAT | O_EXCL, 0)) < 0) { - switch (maybe_recoverable (file_name, interdir_made)) + switch (maybe_recoverable (file_name, false, interdir_made)) { case RECOVER_OK: continue; @@ -1106,7 +1129,8 @@ extract_link (char *file_name, int typeflag) errno = e; } - while ((rc = maybe_recoverable (file_name, &interdir_made)) == RECOVER_OK); + while ((rc = maybe_recoverable (file_name, false, &interdir_made)) + == RECOVER_OK); if (rc == RECOVER_SKIP) return 0; @@ -1130,7 +1154,7 @@ extract_symlink (char *file_name, int typeflag) return create_placeholder_file (file_name, true, &interdir_made); while (symlinkat (current_stat_info.link_name, chdir_fd, file_name) != 0) - switch (maybe_recoverable (file_name, &interdir_made)) + switch (maybe_recoverable (file_name, false, &interdir_made)) { case RECOVER_OK: continue; @@ -1171,7 +1195,7 @@ extract_node (char *file_name, int typeflag) while (mknodat (chdir_fd, file_name, mode, current_stat_info.stat.st_rdev) != 0) - switch (maybe_recoverable (file_name, &interdir_made)) + switch (maybe_recoverable (file_name, false, &interdir_made)) { case RECOVER_OK: continue; @@ -1200,7 +1224,7 @@ extract_fifo (char *file_name, int typeflag) & ~ (0 < same_owner_option ? S_IRWXG | S_IRWXO : 0)); while (mkfifoat (chdir_fd, file_name, mode) != 0) - switch (maybe_recoverable (file_name, &interdir_made)) + switch (maybe_recoverable (file_name, false, &interdir_made)) { case RECOVER_OK: continue; @@ -1348,7 +1372,7 @@ prepare_to_extract (char const *file_name, int typeflag, tar_extractor_t *fun) break; case KEEP_NEWER_FILES: - if (file_newer_p (file_name, ¤t_stat_info)) + if (file_newer_p (file_name, 0, ¤t_stat_info)) { WARNOPT (WARN_IGNORE_NEWER, (0, 0, _("Current %s is newer or same age"), diff --git a/src/incremen.c b/src/incremen.c index 4d86ed7..628ff29 100644 --- a/src/incremen.c +++ b/src/incremen.c @@ -408,7 +408,7 @@ update_parent_directory (struct tar_stat_info *parent) if (directory) { struct stat st; - if (fstatat (parent->fd, ".", &st, fstatat_flags) != 0) + if (fstat (parent->fd, &st) != 0) stat_diag (directory->name); else directory->mtime = get_stat_mtime (&st); @@ -1668,7 +1668,7 @@ try_purge_directory (char const *directory_name) free (p); p = new_name (directory_name, cur); - if (deref_stat (false, p, &st)) + if (deref_stat (p, &st) != 0) { if (errno != ENOENT) /* FIXME: Maybe keep a list of renamed dirs and check it here? */ diff --git a/src/misc.c b/src/misc.c index d94791f..763c661 100644 --- a/src/misc.c +++ b/src/misc.c @@ -544,7 +544,7 @@ maybe_backup_file (const char *file_name, bool this_is_the_archive) if (this_is_the_archive && _remdev (file_name)) return true; - if (fstatat (chdir_fd, file_name, &file_stat, 0)) + if (deref_stat (file_name, &file_stat) != 0) { if (errno == ENOENT) return true; @@ -608,24 +608,24 @@ undo_last_backup (void) } } -/* Depending on DEREF, apply either stat or lstat to (NAME, BUF). */ +/* Apply either stat or lstat to (NAME, BUF), depending on the + presence of the --dereference option. NAME is relative to the + most-recent argument to chdir_do. */ int -deref_stat (bool deref, char const *name, struct stat *buf) +deref_stat (char const *name, struct stat *buf) { - return fstatat (chdir_fd, name, buf, deref ? 0 : AT_SYMLINK_NOFOLLOW); + return fstatat (chdir_fd, name, buf, fstatat_flags); } /* Set FD's (i.e., assuming the working directory is PARENTFD, FILE's) - access time to ATIME. ATFLAG controls symbolic-link following, in - the style of openat. */ + access time to ATIME. */ int -set_file_atime (int fd, int parentfd, char const *file, struct timespec atime, - int atflag) +set_file_atime (int fd, int parentfd, char const *file, struct timespec atime) { struct timespec ts[2]; ts[0] = atime; ts[1].tv_nsec = UTIME_OMIT; - return fdutimensat (fd, parentfd, file, ts, atflag); + return fdutimensat (fd, parentfd, file, ts, fstatat_flags); } /* A description of a working directory. */ diff --git a/src/names.c b/src/names.c index 88d8be2..6e214bf 100644 --- a/src/names.c +++ b/src/names.c @@ -981,7 +981,7 @@ collect_and_sort_names (void) tar_stat_init (&st); - if (deref_stat (dereference_option, name->name, &st.stat) != 0) + if (deref_stat (name->name, &st.stat) != 0) { stat_diag (name->name); continue; @@ -1159,7 +1159,7 @@ register_individual_file (char const *name) { struct stat st; - if (deref_stat (dereference_option, name, &st) != 0) + if (deref_stat (name, &st) != 0) return; /* Will be complained about later */ if (S_ISDIR (st.st_mode)) return; diff --git a/src/tar.c b/src/tar.c index 6fd117c..d80a10f 100644 --- a/src/tar.c +++ b/src/tar.c @@ -1014,7 +1014,7 @@ get_date_or_file (struct tar_args *args, const char *option, || *str == '.') { struct stat st; - if (deref_stat (dereference_option, str, &st) != 0) + if (stat (str, &st) != 0) { stat_error (str); USAGE_ERROR ((0, 0, _("Date sample file not found"))); diff --git a/src/update.c b/src/update.c index de2786d..69fa592 100644 --- a/src/update.c +++ b/src/update.c @@ -138,8 +138,7 @@ update_archive (void) struct stat s; chdir_do (name->change_dir); - if (deref_stat (dereference_option, - current_stat_info.file_name, &s) == 0) + if (deref_stat (current_stat_info.file_name, &s) == 0) { if (S_ISDIR (s.st_mode)) { diff --git a/tests/Makefile.am b/tests/Makefile.am index d378a37..34be617 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -79,6 +79,7 @@ TESTSUITE_AT = \ extrac10.at\ extrac11.at\ extrac12.at\ + extrac13.at\ filerem01.at\ filerem02.at\ gzip.at\ diff --git a/tests/extrac13.at b/tests/extrac13.at new file mode 100644 index 0000000..2f950cb --- /dev/null +++ b/tests/extrac13.at @@ -0,0 +1,53 @@ +# Process this file with autom4te to create testsuite. -*- Autotest -*- + +# Test suite for GNU tar. +# Copyright (C) 2010 Free Software Foundation, Inc. + +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3, or (at your option) +# any later version. + +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. + +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +# written by Paul Eggert + +# Check that 'tar' normally does follow symbolic links when extracting, +# unless --dereference is specified. + +AT_SETUP([extract over symlinks]) +AT_KEYWORDS([extract extrac13]) + +AT_TAR_CHECK([ +mkdir src dst1 dst2 dst3 +echo file1 >src/file1 +ln -s target1 dst1/file1 +echo target1 >dst1/target1 +echo target1 >target1 + +tar -cf archive.tar -C src . && +tar -xf archive.tar -C dst1 && +diff -c src/file1 dst1/file1 && +diff -c target1 dst1/target1 + +ln -s target1 dst2/file1 +echo target1 >dst2/target1 +tar --overwrite -xf archive.tar -C dst2 && +diff -c src/file1 dst2/file1 && +diff -c target1 dst2/target1 + +ln -s target1 dst3/file1 +echo target1 >dst3/target1 +tar --overwrite -xhf archive.tar -C dst3 && +diff -c src/file1 dst3/file1 && +diff -c src/file1 dst3/target1 +], +[0],[],[],[],[],[gnu]) + +AT_CLEANUP diff --git a/tests/testsuite.at b/tests/testsuite.at index dc7a603..8d76887 100644 --- a/tests/testsuite.at +++ b/tests/testsuite.at @@ -151,6 +151,7 @@ m4_include([extrac09.at]) m4_include([extrac10.at]) m4_include([extrac11.at]) m4_include([extrac12.at]) +m4_include([extrac13.at]) m4_include([label01.at]) m4_include([label02.at])