-
Notifications
You must be signed in to change notification settings - Fork 5
Optimized .count() for the iterators. #87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Could you update the benchmarks so that they all use the same units please? My pea brain misreads comparisons when the units change! |
Added a private count_set_bits() method that counts set bits in O(number of words) Fixed the deprecated rng methods in the benchmark Added tests and benchmarks for the new .count() functionality (softdevteam#87)
Added a private count_set_bits() method that counts set bits in O(number of words) Fixed the deprecated rng methods in the benchmark Added tests and benchmarks for the new .count() functionality (softdevteam#87)
src/lib.rs
Outdated
| if range.start >= range.end { | ||
| return 0; | ||
| } | ||
| let start_word = block_offset::<T>(range.start); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In vob we use off not word. So this would be start_off (etc).
src/lib.rs
Outdated
|
|
||
| if start_word == end_word { | ||
| // Range entirely within one word | ||
| let word = self.vec[start_word]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vob calls this b.
src/lib.rs
Outdated
| /// Convenience function that calls T::count_ones() and converts the result to usize | ||
| /// (The conversion is always safe even if T is u128 and usize is u16) | ||
| #[inline(always)] | ||
| fn count_ones_usize<T: PrimInt>(value: T) -> usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to call the count_ones std lib method directly rather than use this wrapper. The try_from is mildly ugly in source code, but it'll be optimised away, so it's not worth worrying about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but it'll be optimised away, so it's not worth worrying about.
I was sceptical about that, so i tested it out. It's actually true (on non-embedded platforms anyways)
|
I like the direction of this! My comments are mostly shallow and making sure this code feels in keeping with the rest of Vob's source, but the general idea clearly works as-is.
This would be a semver breakage. It might be worth doing after the next release, but it would be nice to get the optimisation out to people first before we think about breaking the API.
I don't have a better idea!
rustc will almost certainly optimise the unwrap away entirely on the platforms we know and care about (certainly, in all similar cases I've seen, and checked, it has done so, though obviously one can not 100% guarantee this), so the |
Added a private count_set_bits() method that counts set bits in O(number of words) Fixed the deprecated rng methods in the benchmark Added tests and benchmarks for the new .count() functionality (softdevteam#87)
Yes, you are right about that. But it is unlikely that anybody stores an
Not only that, the |
I quite often store iterator types, so I know one user who'll be affected ;) |
src/lib.rs
Outdated
| /// Counts the number of set bits. | ||
| /// This method assumes the range is processed with process_range() | ||
| fn count_set_bits(&self, range: Range<usize>) -> usize { | ||
| use std::convert::TryFrom; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can go at the top of the file alongside the other imports.
|
One minor comment then I think we're ready to merge. |
And the other issues? i.e remove |
Added a private count_set_bits() method that counts set bits in O(number of words) Fixed the deprecated rng methods in the benchmark Added tests and benchmarks for the new .count() functionality (softdevteam#87)
Added a private count_set_bits() method that counts set bits in O(number of words) Fixed the deprecated rng methods in the benchmark Added tests and benchmarks for the new .count() functionality
|
I think we're ready. Please undraft? |
Added a private count_set_bits() method that counts set bits in O(number of words) Fixed the deprecated rng methods in the benchmark Added tests and benchmarks for the new .count() functionality
|
Ok, I'll remove the benches i mentioned earlier and won't not add a modulus helper method. |
Added a private count_set_bits() method that counts set bits in O(number of words) Fixed the deprecated rng methods in the benchmark Added tests and benchmarks for the new .count() functionality
Added a private count_set_bits() method that counts set bits in O(number of words) Fixed the deprecated rng methods in the benchmark Added tests and benchmarks for the new .count() functionality
|
Thanks! |
Added some modest optimizations for
.count()of all the iteratorsResults
With a benchmark of 100_000 random bits, on my M1 Mac:
Vob<usize>::iter().count()took 31_472 ps, now it takes 314 ps (100 times faster)Vob<usize>::iter_set_bits(..).count()took 178_860 ns, now it takes 198 ns (900 times faster)Vob::iter_storage().count()has been updated too (now it just does math instead of counting).Changes
Added a private
count_set_bits()method that counts set bits in O(number of words)Fixed the deprecated
rand::Rngmethods in the benchmark codeAdded tests and benchmarks for the new
iter*.count()functionalityIssues that needs resolve:
✓
### move(delay to next semver update)count_set_bits()We could move
count_set_bits()into the iterator ifIterSetBitsandIterUnsetBitswere merged into oneBitsIterator. This could be done without any runtime penalties, and we'd skip some duplicated boilerplate:pub struct BitsIterator<'a, T: 'a, const BIT_STATE: bool> {...}We could do this is stages, for now we could simply copy&paste like this (to reduce risk of new bugs):
Remove benchmark
Once the gain is confirmed, the new
count_bits()benchmark can be removed because it simply testsrange.count(), and that's like benchmarking std::Range.Possibly add a new helper method
for
n % bits_per_block::<T>(), just likeblock_offset()doesn / bits_per_block::<T>()✓
### Tidy up the.filter(|_|true)I use
iterator.filter(|_| true).count()to trigger the old.count()behavior in the tests, maybe there are better ways to do that.✓
###count_ones_usize()count_ones_usize()could safely useunwrap_unchecked()(or it could even safely useas)(link #86)