Skip to content

Conversation

@DougLea
Copy link
Contributor

@DougLea DougLea commented Dec 12, 2025

Changes signal filtering to avoid possible starvation


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8373118: Test java/lang/Thread/virtual/Starvation.java timed out (Bug - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/28797/head:pull/28797
$ git checkout pull/28797

Update a local copy of the PR:
$ git checkout pull/28797
$ git pull https://git.openjdk.org/jdk.git pull/28797/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 28797

View PR using the GUI difftool:
$ git pr show -t 28797

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/28797.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Dec 12, 2025

👋 Welcome back dl! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Dec 12, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot added the core-libs core-libs-dev@openjdk.org label Dec 12, 2025
@openjdk
Copy link

openjdk bot commented Dec 12, 2025

@DougLea The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 12, 2025
@mlbridge
Copy link

mlbridge bot commented Dec 12, 2025

task = (fifo != 0) ? localPoll() : localPop();
final int topLevelExec(ForkJoinTask<?> task, WorkQueue q,
int fifo, int qbase) {
int stolen = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DougLea It does seem a bit weird that stolen starts as 1 even if "task == null" (i.e. this method would return 1 for an invocation where task or q is null, which doesn't sounds right?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, t is never null, and q is never null, so I'll just suggest a small comment below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or less oddly, now prefaced with
if (task == null || q == null)
return 0; // currently impossible
I also added some javadoc and comments and cosmetics

if (q.base != qbase || qa == null || (qcap = qa.length) <= 0 ||
(t = (ForkJoinTask<?>)U.getReferenceAcquire(
qa, qk = slotOffset((qcap - 1) & qbase))) == null ||
q.base != qbase ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this (repeated) check intended? (i.e. are we worried about q.base drift between polling the array?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, needed (everywhere) to avoid bad CASes in rare cases (like array wraparound).

if (a != null && k < a.length && k >= 0 && a[k] == null)
break;
if (c == (c = ctl) && c == (c = compareAndExchangeCtl(c, nc))) {
if (c == (c = compareAndExchangeCtl(c, nc))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the risk here is that this introduces more cache-coordination contention, esp. on multisocket aarch64.

Comment on lines 1882 to 1892
if (v == null)
createWorker();
else {
v.phase = sp;
if (v.parking != 0)
U.unpark(v.owner);
}
break;
}
if (q != null && q.base - qbase > 0)
break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Crazy idea: Only create new workers on successful CAS if no queue-drift has occurred, but always prioritize to wake up sleeping workers regardless?

Suggested change
if (v != null) {
v.phase = sp;
if (v.parking != 0)
U.unpark(v.owner);
} else if (q == null || q.base - qbase == 0)
createWorker();
break;
}
if (q != null && q.base - qbase > 0)
break;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Narrator: Turns out it was crazy, as it requires a back-out of nc. Disregard the comment above :)

Comment on lines 1860 to 1865
int pc = parallelism;
for (long c = ctl;;) {
WorkQueue[] qs = queues;
long ac = (c + RC_UNIT) & RC_MASK, nc;
int sp = (int)c, i = sp & SMASK;
if ((short)(c >>> RC_SHIFT) >= pc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if it might make sense to re-read parallelism on each new loop, since if we have a parallelism resize during this loop we want to exit early?

Suggested change
for (long c = ctl;;) {
WorkQueue[] qs = queues;
long ac = (c + RC_UNIT) & RC_MASK, nc;
int sp = (int)c, i = sp & SMASK, pc;
if ((short)(c >>> RC_SHIFT) >= (pc = parallelism))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe but it should also be rare to adjust parallelism, and setParallelism changed recently to signal.

Comment on lines 1259 to 1260
int size = s - base + 1, m;
if (((a != null && a.length > size) || (a = growArray(a, s)) != null) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instruction ordering reason to assign the size ahead of the if? 🤔

            int m;
            if (((a != null && a.length > (s - base + 1)) || (a = growArray(a, s)) != null) &&

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Now reworked a little...) Well, mainly trying to simply the conditional while avoiding possibly-uninitialized errors

}
return newArray;
top = s; // back out
if ((phase & 1) == 0) // unlock if external
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume the read of phase is very much intentional here (otherwise passing in boolean internal as a param would make sense), so it might be worth adding the reason for reading phase on this line? 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The main reason is that I had needed growArray in bulk-steal exploration (but not committed) and needed version without argument). But then noticed that there's no reason to have arg for a case that almost never happens.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, thanks for confirming 👍 :)

@AlanBateman
Copy link
Contributor

The change to relax orderings in push means that java/lang/Thread/virtual/Starvation.java now failing very frequently, on both x64 and aarch64. So this may tell us something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core-libs core-libs-dev@openjdk.org rfr Pull request is ready for review

Development

Successfully merging this pull request may close these issues.

3 participants