Skip to content

Conversation

@tcheeric
Copy link
Owner

Summary

Improved WebSocket handling and integration test coverage. Enhanced configuration flexibility and reduced timeout crashes. Updated documentation for troubleshooting and test setup.

Related issue: #____

What changed?

  • Refactored WebSocket idle timeout configuration for better reliability.
    • Configured WebSocketContainer with a 1-hour default max idle timeout.
    • Added debug/error logging for verifying timeout settings.
  • Replaced nostr-rs-relay with dockurr/strfry in tests for reduced timeout issues and added better relay control.
  • Introduced new StandardWebSocketClient constructor for directly configuring timeouts without reflection.
  • Updated integration test setup for dynamic relay handling and simplified WebSocket client initialization.
  • Enhanced retry logic in WebSocket connections.
  • Added and updated documentation:
    • Troubleshooting section for test reliability.
    • Integration tests documentation highlights resolved issues and solutions.
    • Added changelog maintenance instructions.

BREAKING

None identified.

Review focus

  • WebSocket idle timeout configuration.
  • New StandardWebSocketClient constructor API.
  • Integration test reliability improvements.

Checklist

  • Scope ≤ 300 lines (or split/stack)
  • Title is verb + object (e.g., “Refactor auth middleware to async”)
  • Description links the issue and answers “why now?”
  • BREAKING flagged if needed
  • Tests/docs updated (if relevant)

tcheeric and others added 9 commits November 17, 2025 11:27
…rs to StandardWebSocketClient

Added a new public constructor to StandardWebSocketClient that accepts timeout
parameters directly, eliminating the need for reflection-based configuration.

New constructor:
  public StandardWebSocketClient(String relayUri, long awaitTimeoutMs, long pollIntervalMs)

This allows callers to:
- Configure timeouts programmatically without Spring @value annotations
- Create StandardWebSocketClient outside of Spring DI context
- Avoid reflection hacks for timeout configuration

Version bumped from 1.0.1 to 1.1.0 (minor version) per semver as this adds
new public API while maintaining backward compatibility.

Benefits:
- Cleaner API for programmatic configuration
- No reflection required
- Better testability
- Aligns with constructor injection pattern

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Added debug and error logging to StandardWebSocketClient to help diagnose
timeout configuration issues:

- Log actual timeout values when StandardWebSocketClient is created
- Log timeout values being used when waiting for relay response
- Include configured timeout values in error message when timeout occurs

This helps verify that:
1. The constructor is being called with correct timeout values
2. The timeout values are being used in await() calls
3. The actual timeout duration matches the configured value

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Add instructions for maintaining CHANGELOG.md following Keep a Changelog
format with semantic versioning best practices.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The StandardWebSocketClient now configures the underlying WebSocketContainer
with a default 1-hour max session idle timeout. This prevents premature
connection closures when Nostr relays have periods of inactivity between
messages (e.g., after EOSE when waiting for new events).

Configuration:
- nostr.websocket.max-idle-timeout-ms (default: 3600000, 1 hour)
- Set to 0 for no timeout

The fix uses ContainerProvider.getWebSocketContainer() to obtain the
container, configures setDefaultMaxSessionIdleTimeout(), then passes
the configured container to Spring's StandardWebSocketClient constructor.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Release 1.1.1 with fix for WebSocket idle timeout configuration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…obustness

Replaced nostr-rs-relay with dockurr/strfry for integration tests, addressing timeout and crash issues. Enhanced WebSocket client retry logic, added configurable relay port, and included additional wait for indexing. Updated test setup for dynamic relay configuration and removed unused response comparison logic.
…strfry fixes

Marked integration tests as resolved with all 24 passing using strfry relay. Documented issues with nostr-rs-relay and provided solutions, including custom strfry configuration, indexing delay, and relay-agnostic assertions. Enhanced troubleshooting section with detailed resolutions for common test issues.
Refactor WebSocket handling and improve test robustness
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +52 to +54
.withClasspathResourceMapping(
"strfry.conf", "/etc/strfry.conf", BindMode.READ_ONLY)
.withTmpFs(Map.of("/app/strfry-db", "rw"))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Add missing strfry.conf resource for relay container

The test container now mounts strfry.conf from the classpath, but no such file exists anywhere in nostr-java-api/src/test/resources (checked via ls and rg "strfry.conf", which only finds references in docs). Testcontainers will throw an IllegalArgumentException during startup because the classpath resource cannot be found, so every integration test extending this base class will fail before even connecting to the relay.

Useful? React with 👍 / 👎.

@tcheeric tcheeric merged commit f14cf03 into main Dec 25, 2025
13 checks passed
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors WebSocket handling to improve reliability and updates integration tests to use the strfry relay instead of nostr-rs-relay, which suffered from a known bug causing timeout crashes. The changes include a new public constructor for StandardWebSocketClient, WebSocket idle timeout configuration, and comprehensive test infrastructure improvements.

Key Changes

  • Added WebSocket idle timeout configuration (1-hour default) to prevent premature connection closures during relay inactivity
  • Introduced new StandardWebSocketClient(String, long, long) constructor for programmatic timeout configuration outside Spring DI
  • Replaced nostr-rs-relay with strfry relay in integration tests, resolving timeout issues caused by the quanta crate bug

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
pom.xml Bumped version from 1.0.1 to 1.1.1 and changed project name to hardcoded "nostr-java"
nostr-java-util/pom.xml Updated parent version to 1.1.1
nostr-java-id/pom.xml Updated parent version to 1.1.1
nostr-java-examples/pom.xml Updated parent version to 1.1.1
nostr-java-event/pom.xml Updated parent version to 1.1.1
nostr-java-encryption/pom.xml Updated parent version to 1.1.1
nostr-java-crypto/pom.xml Updated parent version to 1.1.1
nostr-java-client/pom.xml Updated parent version to 1.1.1
nostr-java-base/pom.xml Updated parent version to 1.1.1
nostr-java-api/pom.xml Updated parent version to 1.1.1
nostr-java-client/src/main/java/nostr/client/springwebsocket/StandardWebSocketClient.java Added new constructor with timeout parameters, WebSocket idle timeout configuration, enhanced logging, and refactored client creation into helper method
nostr-java-api/src/test/resources/relay-container.properties Changed relay image from nostr-rs-relay to strfry and added configurable port
nostr-java-api/src/test/java/nostr/api/integration/BaseRelayIntegrationTest.java Updated for strfry relay with ulimit configuration, tmpfs mount, custom config file mounting, and added helper methods for test relay access
nostr-java-api/src/test/java/nostr/api/integration/ApiNIP99EventIT.java Removed @beforeeach setup, added retry logic for client connections, updated assertions to be relay-agnostic, and added initialization delay
nostr-java-api/src/test/java/nostr/api/integration/ApiNIP52EventIT.java Removed @beforeeach setup, added retry logic for client connections, updated assertions to be relay-agnostic, and added initialization delay
nostr-java-api/src/test/java/nostr/api/integration/ApiEventTestUsingSpringWebSocketClientIT.java Removed Spring context dependencies, simplified to use dynamic relay URL, added retry logic, and updated assertions
nostr-java-api/src/test/java/nostr/api/integration/ApiEventIT.java Added @beforeeach to use dynamic relay URLs, added waitForRelayIndexing() helper for query timing
docs/operations/configuration.md Documented new nostr.websocket.max-idle-timeout-ms configuration property
docs/integration-test-bug-analysis.md New comprehensive document analyzing the nostr-rs-relay quanta bug and documenting the solution
docs/TROUBLESHOOTING.md Added extensive integration testing troubleshooting section covering common issues and solutions
CHANGELOG.md Added entries for versions 1.1.1, 1.1.0, and 1.0.1 documenting all changes
AGENTS.md Fixed reference from Cashu to Nostr specification and added versioning and changelog maintenance guidelines

Comment on lines +48 to +50
@Value("${nostr.websocket.max-idle-timeout-ms:3600000}")
private long maxIdleTimeoutMs;

Copy link

Copilot AI Dec 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field maxIdleTimeoutMs is declared with a configuration property annotation but is never used in the code. The actual max idle timeout is set using the constant DEFAULT_MAX_IDLE_TIMEOUT_MS in the createSpringClient() method at line 265. Either this field should be used to configure the timeout dynamically, or it should be removed if the timeout should only use the default value.

Suggested change
@Value("${nostr.websocket.max-idle-timeout-ms:3600000}")
private long maxIdleTimeoutMs;

Copilot uses AI. Check for mistakes.
Comment on lines +72 to 105
private SpringWebSocketClient createSpringWebSocketClient(String relayUri) {
ExecutionException lastException = null;

for (int attempt = 1; attempt <= MAX_CLIENT_CONNECTION_ATTEMPTS; attempt++) {
try {
return new SpringWebSocketClient(new StandardWebSocketClient(relayUri), relayUri);
} catch (ExecutionException e) {
lastException = e;
delayBeforeRetry(attempt);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
throw new IllegalStateException("Interrupted while connecting to " + relayUri, e);
}
}

throw new IllegalStateException(
"Failed to initialize WebSocket client for "
+ relayUri
+ " after "
+ MAX_CLIENT_CONNECTION_ATTEMPTS
+ " attempts",
lastException);
}

private void delayBeforeRetry(int attempt) {
if (attempt >= MAX_CLIENT_CONNECTION_ATTEMPTS) {
return;
}
try {
Thread.sleep(CONNECTION_RETRY_DELAY_MS);
} catch (InterruptedException ie) {
Thread.currentThread().interrupt();
}
}
Copy link

Copilot AI Dec 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The methods createSpringWebSocketClient() and delayBeforeRetry() are duplicated across ApiNIP99EventIT, ApiNIP52EventIT, and ApiEventTestUsingSpringWebSocketClientIT with identical implementations. This violates the DRY principle and makes maintenance harder. Consider extracting these methods to the base class BaseRelayIntegrationTest so they can be reused by all integration tests.

Copilot uses AI. Check for mistakes.
Comment on lines +59 to 61
private static final int MAX_CLIENT_CONNECTION_ATTEMPTS = 3;
private static final long CONNECTION_RETRY_DELAY_MS = 1_000L;

Copy link

Copilot AI Dec 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constants MAX_CLIENT_CONNECTION_ATTEMPTS and CONNECTION_RETRY_DELAY_MS are duplicated across multiple test files (ApiNIP99EventIT, ApiNIP52EventIT, and ApiEventTestUsingSpringWebSocketClientIT). Consider moving these constants to the base class BaseRelayIntegrationTest to ensure consistency and make them easier to maintain.

Suggested change
private static final int MAX_CLIENT_CONNECTION_ATTEMPTS = 3;
private static final long CONNECTION_RETRY_DELAY_MS = 1_000L;

Copilot uses AI. Check for mistakes.
Comment on lines +89 to 104
public StandardWebSocketClient(String relayUri, long awaitTimeoutMs, long pollIntervalMs)
throws java.util.concurrent.ExecutionException, InterruptedException {
if (awaitTimeoutMs <= 0) {
throw new IllegalArgumentException("awaitTimeoutMs must be positive");
}
if (pollIntervalMs <= 0) {
throw new IllegalArgumentException("pollIntervalMs must be positive");
}
this.awaitTimeoutMs = awaitTimeoutMs;
this.pollIntervalMs = pollIntervalMs;
log.info("StandardWebSocketClient created for {} with awaitTimeoutMs={}, pollIntervalMs={}",
relayUri, awaitTimeoutMs, pollIntervalMs);
this.clientSession = createSpringClient()
.execute(this, new WebSocketHttpHeaders(), URI.create(relayUri))
.get();
}
Copy link

Copilot AI Dec 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new constructor does not initialize the maxIdleTimeoutMs field, leaving it at its default value of 0. While the createSpringClient() method uses DEFAULT_MAX_IDLE_TIMEOUT_MS for the WebSocket container configuration (which is correct), the instance field remains uninitialized. This could lead to confusion or issues if the field is used elsewhere in the future. Consider either initializing this field in the constructor or documenting that it's intentionally left at the default.

Copilot uses AI. Check for mistakes.
Comment on lines +30 to +32
void doForEach() throws InterruptedException {
// Give the relay a moment to fully initialize after container startup
Thread.sleep(500);
Copy link

Copilot AI Dec 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using Thread.sleep(500) to wait for relay initialization is a fragile approach that can lead to flaky tests. The sleep duration is arbitrary and may be insufficient on slower systems or excessive on faster ones. Consider using a more robust approach like polling with a timeout or waiting for a specific relay state/signal that indicates readiness, similar to how Testcontainers already waits for the log message in BaseRelayIntegrationTest.

Suggested change
void doForEach() throws InterruptedException {
// Give the relay a moment to fully initialize after container startup
Thread.sleep(500);
void doForEach() {

Copilot uses AI. Check for mistakes.
Comment on lines +16 to +26
## [1.1.0] - 2025-12-23

### Added
- Public constructor `StandardWebSocketClient(String relayUri, long awaitTimeoutMs, long pollIntervalMs)` for programmatic timeout configuration outside Spring DI context.

### Changed
- Enhanced diagnostic logging for timeout configuration in StandardWebSocketClient.
- Simplified WebSocket client initialization and retry logic in tests.

### Fixed
- Updated `JsonDeserialize` builder reference in API module.
Copy link

Copilot AI Dec 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CHANGELOG documents version 1.1.0 with a release date of 2025-12-23, but this version is not present in any of the POM files. The POM files jump directly from 1.0.1 to 1.1.1. Either version 1.1.0 should be added to the POM files if it was actually released, or the CHANGELOG should be updated to remove the 1.1.0 entry and consolidate its changes into 1.1.1.

Copilot uses AI. Check for mistakes.
Comment on lines +63 to +65
void testNIP99ClassifiedListingEvent() throws IOException, InterruptedException {
// Give the relay a moment to fully initialize after container startup
Thread.sleep(500);
Copy link

Copilot AI Dec 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using Thread.sleep(500) to wait for relay initialization is a fragile approach that can lead to flaky tests. The sleep duration is arbitrary and may be insufficient on slower systems or excessive on faster ones. Consider using a more robust approach like polling with a timeout or waiting for a specific relay state/signal that indicates readiness, similar to how Testcontainers already waits for the log message in BaseRelayIntegrationTest.

Suggested change
void testNIP99ClassifiedListingEvent() throws IOException, InterruptedException {
// Give the relay a moment to fully initialize after container startup
Thread.sleep(500);
void testNIP99ClassifiedListingEvent() throws IOException {

Copilot uses AI. Check for mistakes.
Comment on lines +34 to +35
// Give the relay a moment to fully initialize after container startup
Thread.sleep(500);
Copy link

Copilot AI Dec 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using Thread.sleep(500) to wait for relay initialization is a fragile approach that can lead to flaky tests. The sleep duration is arbitrary and may be insufficient on slower systems or excessive on faster ones. Consider using a more robust approach like polling with a timeout or waiting for a specific relay state/signal that indicates readiness, similar to how Testcontainers already waits for the log message in BaseRelayIntegrationTest.

Suggested change
// Give the relay a moment to fully initialize after container startup
Thread.sleep(500);

Copilot uses AI. Check for mistakes.
<packaging>pom</packaging>

<name>${project.artifactId}</name>
<name>nostr-java</name>
Copy link

Copilot AI Dec 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The project name was changed from a dynamic reference using ${project.artifactId} to a hardcoded string "nostr-java". While this works, it removes the flexibility of the dynamic reference. If this change is intentional to standardize the name across all child modules, it should be documented. Otherwise, consider keeping the dynamic reference to maintain consistency with Maven best practices.

Suggested change
<name>nostr-java</name>
<name>${project.artifactId}</name>

Copilot uses AI. Check for mistakes.
ResourceBundle bundle = ResourceBundle.getBundle(RESOURCE_BUNDLE);
String image = bundle.getString(IMAGE_KEY);
relayPort = bundle.containsKey(PORT_KEY)
? Integer.parseInt(bundle.getString(PORT_KEY))
Copy link

Copilot AI Dec 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential uncaught 'java.lang.NumberFormatException'.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants