From a5e6ea18e3d132be4716eb5fdd520c2c234e3003 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana@suse.com>
Date: Mon, 16 Dec 2019 18:26:55 +0000
Subject: [PATCH 1/2] fs: allow deduplication of eof block into the end of the
 destination file

We always round down, to a multiple of the filesystem's block size, the
length to deduplicate at generic_remap_check_len().  However this is only
needed if an attempt to deduplicate the last block into the middle of the
destination file is requested, since that leads into a corruption if the
length of the source file is not block size aligned.  When an attempt to
deduplicate the last block into the end of the destination file is
requested, we should allow it because it is safe to do it - there's no
stale data exposure and we are prepared to compare the data ranges for
a length not aligned to the block (or page) size - in fact we even do
the data compare before adjusting the deduplication length.

After btrfs was updated to use the generic helpers from VFS (by commit
34a28e3d77535e ("Btrfs: use generic_remap_file_range_prep() for cloning
and deduplication")) we started to have user reports of deduplication
not reflinking the last block anymore, and whence users getting lower
deduplication scores.  The main use case is deduplication of entire
files that have a size not aligned to the block size of the filesystem.

We already allow cloning the last block to the end (and beyond) of the
destination file, so allow for deduplication as well.

Link: https://lore.kernel.org/linux-btrfs/2019-1576167349.500456@svIo.N5dq.dFFD/
CC: stable@vger.kernel.org # 5.1+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
---
 fs/read_write.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 5bbf587f5bc1..7458fccc59e1 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1777,10 +1777,9 @@ static int remap_verify_area(struct file *file, loff_t pos, loff_t len,
  * else.  Assume that the offsets have already been checked for block
  * alignment.
  *
- * For deduplication we always scale down to the previous block because we
- * can't meaningfully compare post-EOF contents.
- *
- * For clone we only link a partial EOF block above the destination file's EOF.
+ * For clone we only link a partial EOF block above or at the destination file's
+ * EOF.  For deduplication we accept a partial EOF block only if it ends at the
+ * destination file's EOF (can not link it into the middle of a file).
  *
  * Shorten the request if possible.
  */
@@ -1796,8 +1795,7 @@ static int generic_remap_check_len(struct inode *inode_in,
 	if ((*len & blkmask) == 0)
 		return 0;
 
-	if ((remap_flags & REMAP_FILE_DEDUP) ||
-	    pos_out + *len < i_size_read(inode_out))
+	if (pos_out + *len < i_size_read(inode_out))
 		new_len &= ~blkmask;
 
 	if (new_len == *len)

From 831d2fa25ab8e27592b1b0268dae6f2dfaf7cc43 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana@suse.com>
Date: Mon, 16 Dec 2019 18:26:56 +0000
Subject: [PATCH 2/2] Btrfs: make deduplication with range including the last
 block work

Since btrfs was migrated to use the generic VFS helpers for clone and
deduplication, it stopped allowing for the last block of a file to be
deduplicated when the source file size is not sector size aligned (when
eof is somewhere in the middle of the last block). There are two reasons
for that:

1) The generic code always rounds down, to a multiple of the block size,
   the range's length for deduplications. This means we end up never
   deduplicating the last block when the eof is not block size aligned,
   even for the safe case where the destination range's end offset matches
   the destination file's size. That rounding down operation is done at
   generic_remap_check_len();

2) Because of that, the btrfs specific code does not expect anymore any
   non-aligned range length's for deduplication and therefore does not
   work if such nona-aligned length is given.

This patch addresses that second part, and it depends on a patch that
fixes generic_remap_check_len(), in the VFS, which was submitted ealier
and has the following subject:

  "fs: allow deduplication of eof block into the end of the destination file"

These two patches address reports from users that started seeing lower
deduplication rates due to the last block never being deduplicated when
the file size is not aligned to the filesystem's block size.

Link: https://lore.kernel.org/linux-btrfs/2019-1576167349.500456@svIo.N5dq.dFFD/
CC: stable@vger.kernel.org # 5.1+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
---
 fs/btrfs/ioctl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 12ae31e1813e..173758d86feb 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3243,6 +3243,7 @@ static void btrfs_double_extent_lock(struct inode *inode1, u64 loff1,
 static int btrfs_extent_same_range(struct inode *src, u64 loff, u64 len,
 				   struct inode *dst, u64 dst_loff)
 {
+	const u64 bs = BTRFS_I(src)->root->fs_info->sb->s_blocksize;
 	int ret;
 
 	/*
@@ -3250,7 +3251,7 @@ static int btrfs_extent_same_range(struct inode *src, u64 loff, u64 len,
 	 * source range to serialize with relocation.
 	 */
 	btrfs_double_extent_lock(src, loff, dst, dst_loff, len);
-	ret = btrfs_clone(src, dst, loff, len, len, dst_loff, 1);
+	ret = btrfs_clone(src, dst, loff, len, ALIGN(len, bs), dst_loff, 1);
 	btrfs_double_extent_unlock(src, loff, dst, dst_loff, len);
 
 	return ret;