@patrickwolf patrickwolf commented Apr 15, 2025

Linux provides three relevant ioctl operations:

- FICLONE: makes the destination file share the source file's storage (a whole-file clone), without comparing contents.
- FICLONERANGE: clones a specific byte range from one file into another, also without comparing contents.
- FIDEDUPERANGE: compares the specified byte ranges and shares storage only when the contents are identical; ranges that differ are left untouched.

Of these, FIDEDUPERANGE is the only one that verifies block contents before sharing them, which makes it the only one that is 100% safe for deduplication.
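For readers unfamiliar with these ioctls, here is a minimal sketch of how the three request numbers are derived from the structures in `<linux/fs.h>`. The Rust struct and function names are illustrative only and are not taken from this PR; the encoding shown is the generic Linux one (x86, ARM and similar), and some architectures such as PowerPC or MIPS lay out the direction bits differently.

```rust
use std::mem::size_of;

/// Mirrors `struct file_clone_range` (argument of FICLONERANGE).
#[allow(dead_code)]
#[repr(C)]
struct FileCloneRange {
    src_fd: i64,
    src_offset: u64,
    src_length: u64,
    dest_offset: u64,
}

/// Mirrors the fixed header of `struct file_dedupe_range` (argument of FIDEDUPERANGE).
/// In the kernel it is followed by `dest_count` per-destination info records.
#[allow(dead_code)]
#[repr(C)]
struct FileDedupeRange {
    src_offset: u64,
    src_length: u64,
    dest_count: u16,
    reserved1: u16,
    reserved2: u32,
}

const IOC_WRITE: u64 = 1;
const IOC_READ: u64 = 2;

/// Generic Linux `_IOC` encoding: direction, argument size, type, number.
const fn ioc(dir: u64, ty: u64, nr: u64, size: usize) -> u64 {
    (dir << 30) | ((size as u64) << 16) | (ty << 8) | nr
}

// _IOW(0x94, 9, int)
const FICLONE: u64 = ioc(IOC_WRITE, 0x94, 9, size_of::<i32>());
// _IOW(0x94, 13, struct file_clone_range)
const FICLONERANGE: u64 = ioc(IOC_WRITE, 0x94, 13, size_of::<FileCloneRange>());
// _IOWR(0x94, 54, struct file_dedupe_range)
const FIDEDUPERANGE: u64 = ioc(IOC_READ | IOC_WRITE, 0x94, 54, size_of::<FileDedupeRange>());

fn main() {
    println!("FICLONE       = {:#x}", FICLONE);
    println!("FICLONERANGE  = {:#x}", FICLONERANGE);
    println!("FIDEDUPERANGE = {:#x}", FIDEDUPERANGE);
}
```

Only FIDEDUPERANGE is a read/write ioctl: the kernel writes a per-destination status and byte count back into the argument, which is how the caller learns whether the ranges matched.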

Dedupe via ioctl calls currently uses FICLONE. This works, but it is not 100% safe: FICLONE is meant for cloning, not deduplication, so it does not check file contents and clones blocks even if they differ.

Hence this PR, which fixes this and closes #293.

Original request from @patrickwolf
#293 (comment)

Suggestion on implementation from @Zygo
#293 (comment)

PS: For full transparency, my background is Python/C#, not Rust, so I let Sonnet 3.7 help me with the PR.

FILE_DEDUPE_RANGE_SAME: i32 = 0 wasn't used
as the tests were failing since some systems don't support FIDEDUPERANGE
The error returned is ENOTTY (25), not the EOPNOTSUPP (95) that we were checking for. Different kernels and filesystems can return different error codes for unsupported ioctls.
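A minimal sketch of treating both error codes as "ioctl not supported", assuming Linux errno values. The function name `is_unsupported_ioctl` is hypothetical and not taken from this PR, and a real implementation may want to consider further codes depending on the filesystem.

```rust
use std::io;

// Linux errno values, hard-coded here to keep the sketch dependency-free.
const ENOTTY: i32 = 25; // "Inappropriate ioctl for device"
const EOPNOTSUPP: i32 = 95; // "Operation not supported"

/// Returns true when the error indicates that the ioctl itself is not
/// supported, as opposed to a genuine I/O or permission failure.
fn is_unsupported_ioctl(err: &io::Error) -> bool {
    matches!(err.raw_os_error(), Some(ENOTTY) | Some(EOPNOTSUPP))
}

fn main() {
    let enotty = io::Error::from_raw_os_error(ENOTTY);
    let eio = io::Error::from_raw_os_error(5); // EIO: a real I/O error
    println!("ENOTTY unsupported? {}", is_unsupported_ioctl(&enotty));
    println!("EIO unsupported?    {}", is_unsupported_ioctl(&eio));
}
```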
- Different content deduplication (tests the replacement functionality)
- Identical content verification (tests verification without changes)
- Large file handling (tests chunking support)
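The chunking mentioned above can be sketched as splitting a file into (offset, length) ranges that are submitted one dedupe request at a time. The 16 MiB chunk size and the name `chunk_ranges` are assumptions for illustration; the value actually used in this PR is not stated in the thread.

```rust
/// Splits `total_len` bytes into consecutive `(offset, length)` ranges,
/// each at most `max_chunk` bytes long.
fn chunk_ranges(total_len: u64, max_chunk: u64) -> Vec<(u64, u64)> {
    assert!(max_chunk > 0, "chunk size must be non-zero");
    let mut ranges = Vec::new();
    let mut offset = 0u64;
    while offset < total_len {
        // The last chunk may be shorter than max_chunk.
        let len = max_chunk.min(total_len - offset);
        ranges.push((offset, len));
        offset += len;
    }
    ranges
}

fn main() {
    const MIB: u64 = 1024 * 1024;
    // A 21 MiB file with an assumed 16 MiB chunk size yields two requests.
    for (offset, len) in chunk_ranges(21 * MIB, 16 * MIB) {
        println!("dedupe offset={} len={}", offset, len);
    }
}
```

Since the kernel may dedupe fewer bytes than requested in a single call, a caller would typically also advance by the reported byte count rather than by the requested length.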
@patrickwolf
Author

wow how much effort the formatting was :)

kakra commented Apr 16, 2025

I think the tests don't check the case which this addition is actually trying to fix: Check if the new dedupe doesn't accidentally overwrite contents that the previous implementation didn't catch...

test_reflink_overwrite_with_different_content: Verify FICLONE overwrites target without content verification
test_reflink_overwrite_dedupe_rejects_different_content: Ensure FIDEDUPERANGE rejects files with differing content
test_reflink_overwrite_dedupe_with_identical_content: Confirm FIDEDUPERANGE succeeds with identical content
test_reflink_overwrite_dedupe_large_file_chunking: Validate 21MB files are processed correctly via chunking
test_reflink_overwrite_dedupe_large_file_one_byte_different: Prove FIDEDUPERANGE detects single-byte differences in large files
test_linux_reflink_fallback_behavior: Test fallback from FIDEDUPERANGE to FICLONE preserves data integrity
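For context on what these tests exercise: FIDEDUPERANGE reports a per-destination `status` and `bytes_deduped`. A minimal sketch of interpreting them follows, using the status constants from `<linux/fs.h>`; the enum and function names are hypothetical and not taken from this PR.

```rust
// Status values written by the kernel into `file_dedupe_range_info.status`.
const FILE_DEDUPE_RANGE_SAME: i32 = 0;
const FILE_DEDUPE_RANGE_DIFFERS: i32 = 1;

#[derive(Debug, PartialEq, Eq)]
enum DedupeOutcome {
    /// Contents matched; `bytes` were shared (may be fewer than requested).
    Shared { bytes: u64 },
    /// Contents differ; the destination was left untouched.
    ContentDiffers,
    /// The kernel reported a negative errno for this destination.
    Failed { errno: i32 },
    /// Any other status value, kept verbatim.
    Unknown(i32),
}

fn interpret_dedupe_status(status: i32, bytes_deduped: u64) -> DedupeOutcome {
    match status {
        FILE_DEDUPE_RANGE_SAME => DedupeOutcome::Shared { bytes: bytes_deduped },
        FILE_DEDUPE_RANGE_DIFFERS => DedupeOutcome::ContentDiffers,
        s if s < 0 => DedupeOutcome::Failed { errno: s.saturating_neg() },
        s => DedupeOutcome::Unknown(s),
    }
}

fn main() {
    println!("{:?}", interpret_dedupe_status(0, 4096));
    println!("{:?}", interpret_dedupe_status(1, 0));
    println!("{:?}", interpret_dedupe_status(-95, 0));
}
```

A one-byte difference in a large file should surface as `ContentDiffers` for the chunk containing that byte, which is what the single-byte test above is checking for.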
Fix the mutex poisoning by removing the use of CrossTest in the tests, since they don't actually need to run sequentially (each test uses its own directory)
… test environment

Add tests for reflink content verification with FIDEDUPERANGE

- Add helper function to detect FIDEDUPERANGE support in test environment
- Test that reflink_overwrite (FICLONE) overwrites without content verification
- Test that reflink_overwrite_dedupe (FIDEDUPERANGE) rejects files with different content
- Test large file chunking and single-byte difference detection with FIDEDUPERANGE
- Skip FIDEDUPERANGE-specific tests on systems with older kernels (<4.5)
- Verify linux_reflink fallback behavior works correctly
…function

- Update match arm style to follow Rust formatting conventions
- Use block-style with curly braces for multi-line condition
- Add trailing comma to last match arm

patrickwolf commented Apr 17, 2025

Check if the new dedupe doesn't accidentally overwrite contents that the previous implementation didn't catch...

You are right, the testing was incomplete.

I have now added tests for reflink content verification with FIDEDUPERANGE.

patrickwolf commented May 10, 2025

I'm still waiting to run fclones on my 28TB of data and would appreciate it if you could have a look at my PR, @pkolaczk @th1000s. Thank you!

@patrickwolf
Author

Have you had a chance to look at my PR, @pkolaczk @th1000s? Thank you!

Development

Successfully merging this pull request may close these issues.

FICLONE can cause deduplication errors - FIDEDUPERANGE would make dedupe CoW 100% safe
