@opennj

@opennj opennj commented Nov 28, 2025

https://issues.apache.org/jira/browse/SOLR-18001

Description

LBHttp2SolrClientIntegrationTest > testSimple failed with java.util.concurrent.TimeoutException. The test was failing intermittently due to Jetty shutdown timeouts and timing issues with the load balancer's server detection.

Solution

  • Exception handling: added try-catch blocks around Jetty stop operations.
  • Timing fixes: added 100ms delays so the LB client can detect server state changes.
  • Timeout control: added a 5-second timeout wrapper for Jetty shutdown.
  • Robust cleanup: enhanced tearDown methods to log exceptions without failing tests.
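
For reviewers who want to see the shape of the timeout wrapper without opening the diff, here is a minimal sketch of the idea. It is not the code in this patch: `StopWithTimeout`, `Stoppable`, and `stopQuietly` are hypothetical names, and `Stoppable` merely stands in for whatever object exposes the Jetty stop call. It uses only `java.util.concurrent`.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class StopWithTimeout {

  /** Hypothetical stand-in for a server's stop operation (assumption, not a Solr/Jetty type). */
  interface Stoppable {
    void stop() throws Exception;
  }

  /**
   * Runs stop() on a helper thread and waits at most the given timeout.
   * Returns true if stop() completed normally, false on timeout, failure, or interruption.
   * Problems are logged instead of being rethrown, so cleanup cannot fail the test.
   */
  static boolean stopQuietly(Stoppable server, long timeout, TimeUnit unit) {
    // Daemon thread: a stop() call that never returns must not keep the JVM alive.
    ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, "stop-with-timeout");
      t.setDaemon(true);
      return t;
    });
    Future<?> future = executor.submit(() -> {
      server.stop();
      return null;
    });
    try {
      future.get(timeout, unit);
      return true;
    } catch (TimeoutException e) {
      future.cancel(true); // interrupt the helper thread
      System.err.println("stop() did not finish within " + timeout + " " + unit);
      return false;
    } catch (ExecutionException e) {
      System.err.println("stop() failed: " + e.getCause());
      return false;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      return false;
    } finally {
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) {
    // A stop that returns immediately, then one that exceeds the timeout.
    System.out.println(stopQuietly(() -> {}, 5, TimeUnit.SECONDS));
    System.out.println(stopQuietly(() -> Thread.sleep(2_000), 100, TimeUnit.MILLISECONDS));
  }
}
```

Note that a wrapper like this only bounds how long the test waits; it does not explain why the stop is slow.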

Tests

Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide
  • I have added a changelog entry for my change

@dsmiley
Contributor

dsmiley commented Dec 1, 2025

The changes look good overall in terms of robustness... though I suspect we'll still have a failure on our hands. I'd prefer to catch the stop timeout and take a thread dump, to understand what activity may be blocking the shutdown.
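
To make the thread dump suggestion concrete, here is a rough sketch of capturing one with the JDK's `ThreadMXBean`. The class and method names (`ThreadDumps`, `dumpAllThreads`) are made up for illustration; this is not existing Solr test code, and where the call would be hooked into the test is left open.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class ThreadDumps {

  /** Returns the name, state, and full stack trace of every live thread as text. */
  static String dumpAllThreads() {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean();
    // Only request lock details when the JVM supports them; otherwise the call would throw.
    ThreadInfo[] infos = bean.dumpAllThreads(
        bean.isObjectMonitorUsageSupported(), bean.isSynchronizerUsageSupported());
    StringBuilder sb = new StringBuilder();
    for (ThreadInfo info : infos) {
      sb.append('"').append(info.getThreadName()).append("\" ")
          .append(info.getThreadState()).append('\n');
      // Print frames manually: ThreadInfo.toString() truncates long stack traces.
      for (StackTraceElement frame : info.getStackTrace()) {
        sb.append("    at ").append(frame).append('\n');
      }
      sb.append('\n');
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.err.println(dumpAllThreads());
  }
}
```

Logging this string from the `TimeoutException` handler around the stop call would show which threads are still busy when the shutdown stalls.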

FYI, on the main branch this test is now known as LB2SolrClientTest.

I looked at the test failure history and the failures started on November 14th. I looked at the changes made in the days leading up to it, and it's not evident that any of them would lead to this outcome, especially given how intermittent it is.

@opennj
Author

opennj commented Dec 4, 2025

Let me create another PR that takes a thread dump when the error happens, before we introduce the timeout.

@dsmiley
Contributor

dsmiley commented Dec 19, 2025

I created a general test improvement for thread dumps on timeout: #3967
