
Race in tests? loop device count before and after mismatch #67

@panlinux

Hi,

We are getting a test failure on arm64 in Ubuntu Mantic with rust-loopdev 0.4.0-3 [1]:

864s ---- detach_a_backing_file_default stdout ----
864s thread 'detach_a_backing_file_default' panicked at 'assertion failed: `(left == right)`
864s left: `0`,
864s right: `1`: there should be no loopback devices mounted', tests/integration_test.rs:176:5

I added some print statements and ran the test in an arm64 VM (which normally has 3 loop devices already in use), and it looks like the loop device count is changing under our feet:

---- detach_a_backing_file_with_offset stdout ----
XXX num_devices_at_start=4
XXX list_device(None).len()=4
XXX start list_device(None): [LoopDeviceOutput { name: "/dev/loop1", size_limit: Some(0), offset: Some(0), back_file: Some("/var/lib/snapd/snaps/lxd_25116.snap") }, LoopDeviceOutput { name: "/dev/loop2", size_limit: Some(0), offset: Some(0), back_file: Some("/var/lib/snapd/snaps/snapd_19459.snap") }, LoopDeviceOutput { name: "/dev/loop0", size_limit: Some(0), offset: Some(0), back_file: Some("/var/lib/snapd/snaps/core22_821.snap") }], len=3, var=4
XXX end list_device(None): [LoopDeviceOutput { name: "/dev/loop1", size_limit: Some(0), offset: Some(0), back_file: Some("/var/lib/snapd/snaps/lxd_25116.snap") }, LoopDeviceOutput { name: "/dev/loop2", size_limit: Some(0), offset: Some(0), back_file: Some("/var/lib/snapd/snaps/snapd_19459.snap") }, LoopDeviceOutput { name: "/dev/loop0", size_limit: Some(0), offset: Some(0), back_file: Some("/var/lib/snapd/snaps/core22_821.snap") }], num_devices_at_start=4
thread 'detach_a_backing_file_with_offset' panicked at 'assertion failed: `(left == right)`
  left: `3`,
 right: `4`: there should be no loopback devices mounted', tests/integration_test.rs:180:5

This is the changed function:

fn detach_a_backing_file(offset: u64, sizelimit: u64, file_size: i64) {
    let num_devices_at_start = list_device(None).len();
    println!("XXX num_devices_at_start={}", num_devices_at_start);
    println!("XXX list_device(None).len()={}", list_device(None).len());
    let _lock = setup();

    println!("XXX start list_device(None): {:?}, len={}, var={}", list_device(None), list_device(None).len(), num_devices_at_start);
    {
        let file = create_backing_file(file_size);
        attach_file(
            "/dev/loop3",
            file.to_path_buf().to_str().unwrap(),
            offset,
            sizelimit,
        );

        let ld0 = LoopDevice::open("/dev/loop3")
            .expect("should be able to open the created loopback device");

        ld0.detach()
            .expect("should not error detaching the backing file from the loopdev");

        file.close().expect("should delete the temp backing file");
    };

    std::thread::sleep(std::time::Duration::from_millis(500));

    println!("XXX end list_device(None): {:?}, num_devices_at_start={}", list_device(None), num_devices_at_start);
    assert_eq!(
        list_device(None).len(),
        num_devices_at_start,
        "there should be no loopback devices mounted"
    );
    detach_all();
}

It looks like list_device(None) already reports one fewer loop device right after the setup() call. Which one disappeared? I don't know, because if I print list_device(None) right at the start of the test, the failure no longer happens ;)
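
One possible explanation, offered as a sketch and not as a confirmed diagnosis: num_devices_at_start is read before setup() returns, so the baseline can include a device that some other test still has attached at that moment (or one whose detach has not finished). The model below replays one such interleaving with plain std primitives. It assumes that setup() serializes tests by handing out a lock, which the `let _lock = setup();` line suggests but the report does not confirm. Host, test_lock, and replay_race are hypothetical names for this illustration and are not part of rust-loopdev.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the host: a count of attached loop devices and
/// the lock that serializes tests (ASSUMED to be what `setup()` hands out).
pub struct Host {
    attached: AtomicUsize,
    test_lock: Mutex<()>,
}

impl Host {
    pub fn new(preexisting: usize) -> Self {
        Host {
            attached: AtomicUsize::new(preexisting),
            test_lock: Mutex::new(()),
        }
    }

    fn count(&self) -> usize {
        self.attached.load(Ordering::SeqCst)
    }
}

/// Replays the suspected interleaving and returns
/// (baseline read before taking the lock, count seen while holding the lock).
pub fn replay_race(preexisting: usize) -> (usize, usize) {
    let host = Arc::new(Host::new(preexisting));
    let (attached_tx, attached_rx) = mpsc::channel::<()>();
    let (go_tx, go_rx) = mpsc::channel::<()>();

    // "Other test": holds the lock, attaches one device, detaches it later.
    let other = {
        let host = Arc::clone(&host);
        thread::spawn(move || {
            let _guard = host.test_lock.lock().unwrap();
            host.attached.fetch_add(1, Ordering::SeqCst); // attach
            attached_tx.send(()).unwrap();
            go_rx.recv().unwrap(); // wait until the baseline has been read
            host.attached.fetch_sub(1, Ordering::SeqCst); // detach
            // `_guard` is dropped here, which releases the lock.
        })
    };

    // "Our test": the baseline is read BEFORE the lock is taken, so it
    // includes the device that the other test still has attached.
    attached_rx.recv().unwrap();
    let baseline_before_lock = host.count();
    go_tx.send(()).unwrap();

    // Taking the lock blocks until the other test has detached and finished.
    let guard = host.test_lock.lock().unwrap();
    let count_under_lock = host.count();
    drop(guard);

    other.join().unwrap();
    (baseline_before_lock, count_under_lock)
}
```

With three pre-existing devices, replay_race(3) returns (4, 3), the same pair of numbers as in the output above. If this model matches what happens in the real tests, reading the baseline only after setup() has returned would compare 3 with 3. It would also fit the observation that an extra list_device(None) call at the start of the test hides the failure, since that call shifts the timing of the baseline read.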

I know this is not the latest version of rust-loopdev. I checked the git log to see whether any later commit might fix this, but my Rust knowledge is close to zero.

Here is a full test run showing the failure[2].

In my test VM, I have to run the tests several times (but not many) before I see the failure. It doesn't matter whether the VM has one or two CPUs.

  1. https://code.launchpad.net/~git-ubuntu-import/ubuntu/+source/rust-loopdev/+git/rust-loopdev/+ref/applied/ubuntu/devel:
  2. https://autopkgtest.ubuntu.com/results/autopkgtest-mantic/mantic/arm64/r/rust-loopdev/20230814_100137_214a4@/log.gz
