Commit d18d705

changed memory barrier to subgroup execution barrier in pseudocode
1 parent 516505e commit d18d705

1 file changed: +2 -3 lines changed
blog/2025/2025-06-19-subgroup-shuffle-reconvergence-on-nvidia/index.md

Lines changed: 2 additions & 3 deletions
@@ -181,16 +181,15 @@ Strangely enough, just a memory barrier also fixed it, which it shouldn't have a
 ```cpp
 T inclusive_scan(T value)
 {
-    memory_barrier()
-
+    subgroup_execution_barrier()
     rhs = shuffleUp(value, 1)
     value = value + (firstInvocation ? identity : rhs)
 
     [unroll]
     for (i = 1; i < SubgroupSizeLog2; i++)
     {
         nextLevelStep = 1 << i
-        memory_barrier()
+        subgroup_execution_barrier()
         rhs = shuffleUp(value, nextLevelStep)
         value = value + (nextLevelStep out of bounds ? identity : rhs)
     }
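
For readers checking the arithmetic of the pseudocode in this hunk, here is a minimal CPU-side sketch of the same shuffle-up scan, not the blog's actual shader code. The function name `inclusive_scan_model`, the `int` element type, and the use of a `std::vector` to stand in for the lanes of one subgroup are all assumptions made for illustration. Each step reads from a snapshot of the previous values, which models every lane having reached the shuffle before any lane's value is read; it says nothing about how reconvergence actually behaves on NVIDIA hardware.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU model of the scan: lane k of the subgroup owns values[k].
// `identity` is the identity element of the addition (0 for integers).
std::vector<int> inclusive_scan_model(std::vector<int> values, int identity = 0)
{
    const std::size_t n = values.size();
    // Steps 1, 2, 4, ... correspond to the initial shuffleUp(value, 1)
    // followed by the unrolled loop over nextLevelStep in the pseudocode.
    for (std::size_t step = 1; step < n; step <<= 1)
    {
        // The snapshot stands in for all lanes arriving at the shuffle together.
        const std::vector<int> snapshot = values;
        for (std::size_t lane = 0; lane < n; ++lane)
        {
            // shuffleUp(value, step): read from lane - step when that lane exists,
            // otherwise fall back to the identity.
            const int rhs = (lane >= step) ? snapshot[lane - step] : identity;
            values[lane] = snapshot[lane] + rhs;
        }
    }
    return values;
}
```

Calling `inclusive_scan_model({1, 2, 3, 4, 5, 6, 7, 8})` walks through steps 1, 2, and 4, which matches a subgroup of 8 lanes with `SubgroupSizeLog2 = 3`.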
