kernel-hacking-2024-linux-s.../fs/btrfs
Filipe Manana f1cd1f0b7d Btrfs: fix race when listing an inode's xattrs
When listing a inode's xattrs we have a time window where we race against
a concurrent operation for adding a new hard link for our inode that makes
us not return any xattr to user space. In order for this to happen, the
first xattr of our inode needs to be at slot 0 of a leaf and the previous
leaf must still have room for an inode ref (or extref) item, and this can
happen because an inode's listxattrs callback does not lock the inode's
i_mutex (nor does the VFS does it for us), but adding a hard link to an
inode makes the VFS lock the inode's i_mutex before calling the inode's
link callback.

If we have the following leafs:

               Leaf X (has N items)                    Leaf Y

 [ ... (257 INODE_ITEM 0) (257 INODE_REF 256) ]  [ (257 XATTR_ITEM 12345), ... ]
           slot N - 2         slot N - 1              slot 0

The race illustrated by the following sequence diagram is possible:

       CPU 1                                               CPU 2

  btrfs_listxattr()

    searches for key (257 XATTR_ITEM 0)

    gets path with path->nodes[0] == leaf X
    and path->slots[0] == N

    because path->slots[0] is >=
    btrfs_header_nritems(leaf X), it calls
    btrfs_next_leaf()

    btrfs_next_leaf()
      releases the path

                                                   adds key (257 INODE_REF 666)
                                                   to the end of leaf X (slot N),
                                                   and leaf X now has N + 1 items

      searches for the key (257 INODE_REF 256),
      with path->keep_locks == 1, because that
      is the last key it saw in leaf X before
      releasing the path

      ends up at leaf X again and it verifies
      that the key (257 INODE_REF 256) is no
      longer the last key in leaf X, so it
      returns with path->nodes[0] == leaf X
      and path->slots[0] == N, pointing to
      the new item with key (257 INODE_REF 666)

    btrfs_listxattr's loop iteration sees that
    the type of the key pointed by the path is
    different from the type BTRFS_XATTR_ITEM_KEY
    and so it breaks the loop and stops looking
    for more xattr items
      --> the application doesn't get any xattr
          listed for our inode

So fix this by breaking the loop only if the key's type is greater than
BTRFS_XATTR_ITEM_KEY and skip the current key if its type is smaller.

Cc: stable@vger.kernel.org
Signed-off-by: Filipe Manana <fdmanana@suse.com>
2015-11-09 18:34:40 +00:00
..
tests Btrfs: add fragment=* debug mount option 2015-10-21 18:51:43 -07:00
acl.c
async-thread.c btrfs: async_thread: Fix workqueue 'max_active' value when initializing 2015-08-31 11:46:40 -07:00
async-thread.h btrfs: async_thread: Fix workqueue 'max_active' value when initializing 2015-08-31 11:46:40 -07:00
backref.c btrfs: fix use after free iterating extrefs 2015-10-26 19:38:28 -07:00
backref.h
btrfs_inode.h Btrfs: Direct I/O: Fix space accounting 2015-09-21 13:47:55 -07:00
check-integrity.c Merge branch 'cleanups/for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.4 2015-10-21 18:21:40 -07:00
check-integrity.h
compression.c Merge branch 'cleanups/for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.4 2015-10-21 18:21:40 -07:00
compression.h
ctree.c Merge branch 'cleanups/for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.4 2015-10-21 18:21:40 -07:00
ctree.h btrfs: qgroup: Fix a race in delayed_ref which leads to abort trans 2015-10-26 19:44:39 -07:00
delayed-inode.c btrfs: comment the rest of implicit barriers before waitqueue_active 2015-10-10 18:42:00 +02:00
delayed-inode.h
delayed-ref.c btrfs: qgroup: Fix a race in delayed_ref which leads to abort trans 2015-10-26 19:44:39 -07:00
delayed-ref.h btrfs: qgroup: Fix a race in delayed_ref which leads to abort trans 2015-10-26 19:44:39 -07:00
dev-replace.c Merge branch 'fix/waitqueue-barriers' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.4 2015-10-12 16:24:40 -07:00
dev-replace.h
dir-item.c
disk-io.c btrfs: qgroup: exit the rescan worker during umount 2015-11-05 10:32:20 +00:00
disk-io.h Btrfs: add btrfs_read_dev_one_super() to read one specific SB 2015-10-01 17:29:38 +02:00
export.c BTRFS: support NFSv2 export 2015-10-06 06:55:23 -07:00
export.h
extent-tree.c Btrfs: find_free_extent: Do not erroneously skip LOOP_CACHING_WAIT state 2015-11-03 07:44:20 -08:00
extent-tree.h
extent_io.c btrfs: extent_io: Introduce new function clear_record_extent_bits() 2015-10-21 18:37:44 -07:00
extent_io.h btrfs: qgroup: Introduce btrfs_qgroup_reserve_data function 2015-10-21 18:37:45 -07:00
extent_map.c
extent_map.h
file-item.c
file.c Btrfs: fix race leading to incorrect item deletion when dropping extents 2015-11-08 21:51:28 +00:00
free-space-cache.c Btrfs: don't do extra bitmap search in one bit case 2015-10-21 18:55:41 -07:00
free-space-cache.h Btrfs: keep track of largest extent in bitmaps 2015-10-21 18:55:40 -07:00
hash.c
hash.h
inode-item.c Btrfs: consolidate btrfs_error() to btrfs_std_error() 2015-09-29 16:30:00 +02:00
inode-map.c btrfs: qgroup: Cleanup old inaccurate facilities 2015-10-21 18:41:06 -07:00
inode-map.h
inode.c Btrfs: fix race leading to BUG_ON when running delalloc for nodatacow 2015-11-09 11:29:14 +00:00
ioctl.c btrfs: check unsupported filters in balance arguments 2015-10-26 19:38:26 -07:00
Kconfig
locking.c btrfs: comment the rest of implicit barriers before waitqueue_active 2015-10-10 18:42:00 +02:00
locking.h
lzo.c
Makefile
math.h
ordered-data.c Btrfs: change how we wait for pending ordered extents 2015-10-21 18:51:40 -07:00
ordered-data.h Btrfs: change how we wait for pending ordered extents 2015-10-21 18:51:40 -07:00
orphan.c
print-tree.c
print-tree.h
props.c btrfs: cleanup iterating over prop_handlers array 2015-10-21 18:28:48 +02:00
props.h
qgroup.c Btrfs: fix sleeping inside atomic context in qgroup rescan worker 2015-11-05 11:02:22 +00:00
qgroup.h btrfs: qgroup: Check if qgroup reserved space leaked 2015-10-21 18:41:10 -07:00
raid56.c btrfs: comment waitqueue_active implied by locks 2015-10-10 18:35:10 +02:00
raid56.h Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation 2015-08-09 07:34:26 -07:00
rcu-string.h
reada.c btrfs: reada: Fix returned errno code 2015-10-21 18:29:50 +02:00
relocation.c Btrfs: fix regression running delayed references when using qgroups 2015-10-25 19:53:26 +00:00
root-tree.c Merge branch 'cleanups/for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.4 2015-10-21 18:21:40 -07:00
scrub.c btrfs: switch message printers to ratelimited variants 2015-10-08 13:04:06 +02:00
send.c btrfs: fix resending received snapshot with parent 2015-10-13 20:04:10 +01:00
send.h
struct-funcs.c
super.c Btrfs: add fragment=* debug mount option 2015-10-21 18:51:43 -07:00
sysfs.c Btrfs: rename super_kobj to fsid_kobj 2015-09-29 16:29:59 +02:00
sysfs.h Btrfs: rename btrfs_kobj_rm_device to btrfs_sysfs_rm_device_link 2015-09-29 16:29:59 +02:00
transaction.c Merge branch 'allocator-fixes' into for-linus-4.4 2015-10-21 19:00:38 -07:00
transaction.h Merge branch 'allocator-fixes' into for-linus-4.4 2015-10-21 19:00:38 -07:00
tree-defrag.c Btrfs: cleanup: remove unnecessary check before btrfs_free_path is called 2015-08-31 11:46:41 -07:00
tree-log.c Btrfs: fix regression running delayed references when using qgroups 2015-10-25 19:53:26 +00:00
tree-log.h
ulist.c
ulist.h
uuid-tree.c
volumes.c btrfs: extend balance filter usage to take minimum and maximum 2015-10-26 19:38:30 -07:00
volumes.h btrfs: add balance filters limits, stripes and usage to supported mask 2015-10-26 19:38:30 -07:00
xattr.c Btrfs: fix race when listing an inode's xattrs 2015-11-09 18:34:40 +00:00
xattr.h
zlib.c