Skip to content

Conversation

@jrebelo
Copy link

@jrebelo jrebelo commented Jun 11, 2025

This PR is a simplification which removes the Rav1dSequenceHeader type, which is basically a duplicate representation of Dav1dSequenceHeader. While Rav1dSequenceHeader uses some more typical Rust types, the Dav1dSequenceHeader belongs to the public API which needs to be kept so it the one that remains. Since both types represent the same concept it is not needed to have both of them.

The reason why I propose this simplification and would like to follow it up by removing the types leading up to removing DRav1d is because I think this mix of types gets in the way of making optimizations to the way memory is allocated and used.

Remove Rav1dSequenceHeader which is basically a duplicate representation of the Dav1dSequenceHeader type. While the Rav1dSequenceHeader uses some more typical Rust types, theDav1dSequenceHeader belongs to the public API which needs to be kept so it the one that remains. Since both types represent the same concept it is not needed to have both of them.
Copy link
Collaborator

@kkysen kkysen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rav1dSequenceHeader uses some more typical Rust types

This is exactly what lets us do optimizations on the Rav1d* types. For example, a bunch of these are made into enums and checked once per header parsing to avoid later bounds checks in inner loops, but this undoes that.

this mix of types gets in the way of making optimizations to the way memory is allocated and used

What optimizations are those? If it indeed helps with significant optimizations, then we can do that, but I'm not sure what exactly those might be, so I'd need to evaluate them first.

How does this PR alone affect performance?

Use the Rav1d Rust enums with a repr of u32 inside the Dav1dSequenceHeader such that the comparisons in the inner loops are more efficient.
@jrebelo
Copy link
Author

jrebelo commented Jun 12, 2025

Hi @kkysen and thank you for the quick reaction.

Rav1dSequenceHeader uses some more typical Rust types

This is exactly what lets us do optimizations on the Rav1d* types. For example, a bunch of these are made into enums and checked once per header parsing to avoid later bounds checks in inner loops, but this undoes that.

This is a very good point so I have made a new commit which basically makes the Rust enums be part of the Dav1dSequenceHeader. For the C representation of the struct to remain unchanged I have made these enums #[repr(u32)]. Like this there are no changes in the behavior of the internal loops, while the usage of the struct in C remains identical.

How does this PR alone affect performance?

This PR has no measurable impact on execution time performance on my machine as far as I could observe. What it does is reducing memory usage. For a small video file it saves about 60kb of heap allocations. Here are the measurement results

Original code:

$ valgrind --tool=dhat ./target/release/dav1d -q -i ../Chimera-AV1-8bit-480x270-552kbps.ivf -o /dev/null --threads 1==2732== DHAT, a dynamic heap analysis tool
==2732== Copyright (C) 2010-2018, and GNU GPL'd, by Mozilla Foundation
==2732== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==2732== Command: ./target/release/dav1d -q -i ../Chimera-AV1-8bit-480x270-552kbps.ivf -o /dev/null --threads 1
==2732== 
==2732== 
==2732== Total:     351,912,041 bytes in 227,488 blocks

Modified code:

$ valgrind --tool=dhat ./target/release/dav1d -q -i ../Chimera-AV1-8bit-480x270-552kbps.ivf -o /dev/null --threads 1
==9913== DHAT, a dynamic heap analysis tool
==9913== Copyright (C) 2010-2018, and GNU GPL'd, by Mozilla Foundation
==9913== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==9913== Command: ./target/release/dav1d -q -i ../Chimera-AV1-8bit-480x270-552kbps.ivf -o /dev/null --threads 1
==9913== 
==9913== 
==9913== Total:     351,851,673 bytes in 227,488 blocks

@kkysen kkysen changed the title Use Dav1dSequenceHeader instead of Rav1dSequenceHeader Use Dav1dSequenceHeader instead of Rav1dSequenceHeader Jun 13, 2025
Copy link
Collaborator

@kkysen kkysen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very good point so I have made a new commit which basically makes the Rust enums be part of the Dav1dSequenceHeader. For the C representation of the struct to remain unchanged I have made these enums #[repr(u32)]. Like this there are no changes in the behavior of the internal loops, while the usage of the struct in C remains identical.

This does help somewhat with bounds checks and such, but the smaller representation of a u8 over a u32 is a large part of that optimization.

This PR has no measurable impact on execution time performance on my machine as far as I could observe. What it does is reducing memory usage. For a small video file it saves about 60kb of heap allocations. Here are the measurement results

Original code:

$ valgrind --tool=dhat ./target/release/dav1d -q -i ../Chimera-AV1-8bit-480x270-552kbps.ivf -o /dev/null --threads 1==2732== DHAT, a dynamic heap analysis tool
==2732== Copyright (C) 2010-2018, and GNU GPL'd, by Mozilla Foundation
==2732== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==2732== Command: ./target/release/dav1d -q -i ../Chimera-AV1-8bit-480x270-552kbps.ivf -o /dev/null --threads 1
==2732== 
==2732== 
==2732== Total:     351,912,041 bytes in 227,488 blocks

Modified code:

$ valgrind --tool=dhat ./target/release/dav1d -q -i ../Chimera-AV1-8bit-480x270-552kbps.ivf -o /dev/null --threads 1
==9913== DHAT, a dynamic heap analysis tool
==9913== Copyright (C) 2010-2018, and GNU GPL'd, by Mozilla Foundation
==9913== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==9913== Command: ./target/release/dav1d -q -i ../Chimera-AV1-8bit-480x270-552kbps.ivf -o /dev/null --threads 1
==9913== 
==9913== 
==9913== Total:     351,851,673 bytes in 227,488 blocks

This is a 0.017% reduction in memory usage. That's incredibly tiny, and not really something I think is worth trying to optimize. Very few *SequenceHeaders are created, which is why I originally added these conversions, because they are indeed rare.

Moreover, the eventual goal of rav1d is to provide a safe Rust API (#1252), and it would use these Rav1d* types in doing so. And then the Dav1d* types would go behind a feature (DRav1d would become a no-op in that case), so you would only have this overhead if you still want to use the C API. That said, the overhead is small as we can see by the tiny difference in memory usage.

If we were to pursue this (and forego potential optimizations in Rav1d* type layout), then I'd like to see what other downstream optimizations this enables. If those help performance, then this could very well be a good idea to do, but by itself, I don't really think it is.

@kkysen kkysen added refactoring Cleaning up transpiled code memory usage labels Jun 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

memory usage refactoring Cleaning up transpiled code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants