Owner

@enigbe enigbe commented Oct 21, 2025

What this PR does:

We introduce TierStore, a KVStore implementation that manages data across
three distinct storage layers.

The layers are:

  1. Primary: The main/remote data store.
  2. Ephemeral: A secondary store for non-critical, easily-rebuildable data
    (e.g., network graph). This tier aims to improve latency by leveraging a
    local KVStore designed for fast/local access.
  3. Backup: A tertiary store for disaster recovery. Backup operations are sent
    asynchronously/lazily to avoid blocking primary store operations.
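The routing between the three layers can be sketched roughly as follows. This is a simplified illustration, not the PR's implementation: `SimpleStore`, `MemStore`, `TierSketch`, and the `is_ephemeral` key check are hypothetical names, the store trait is synchronous and un-namespaced, and the backup write is done inline on a best-effort basis rather than asynchronously.

```rust
use std::collections::HashMap;
use std::io;
use std::sync::{Arc, Mutex};

/// Simplified, synchronous stand-in for a key-value store (hypothetical;
/// the real `KVStore` trait is namespaced and async).
pub trait SimpleStore: Send + Sync {
    fn read(&self, key: &str) -> io::Result<Vec<u8>>;
    fn write(&self, key: &str, value: Vec<u8>) -> io::Result<()>;
}

/// In-memory store used here to stand in for each tier.
#[derive(Default)]
pub struct MemStore {
    entries: Mutex<HashMap<String, Vec<u8>>>,
}

impl SimpleStore for MemStore {
    fn read(&self, key: &str) -> io::Result<Vec<u8>> {
        let entries = self.entries.lock().unwrap();
        match entries.get(key) {
            Some(value) => Ok(value.clone()),
            None => Err(io::Error::new(io::ErrorKind::NotFound, "key not found")),
        }
    }

    fn write(&self, key: &str, value: Vec<u8>) -> io::Result<()> {
        self.entries.lock().unwrap().insert(key.to_string(), value);
        Ok(())
    }
}

/// Hypothetical three-tier wrapper: primary is mandatory, the others optional.
pub struct TierSketch {
    pub primary: Arc<dyn SimpleStore>,
    pub ephemeral: Option<Arc<dyn SimpleStore>>,
    pub backup: Option<Arc<dyn SimpleStore>>,
}

impl TierSketch {
    /// Assumption: rebuildable data is recognised by its key.
    fn is_ephemeral(key: &str) -> bool {
        key == "network_graph"
    }

    pub fn write(&self, key: &str, value: Vec<u8>) -> io::Result<()> {
        if Self::is_ephemeral(key) {
            if let Some(store) = &self.ephemeral {
                return store.write(key, value);
            }
        }
        self.primary.write(key, value.clone())?;
        if let Some(backup) = &self.backup {
            // Best effort: a backup failure does not fail the primary write.
            let _ = backup.write(key, value);
        }
        Ok(())
    }

    pub fn read(&self, key: &str) -> io::Result<Vec<u8>> {
        if Self::is_ephemeral(key) {
            if let Some(store) = &self.ephemeral {
                return store.read(key);
            }
        }
        self.primary.read(key)
    }
}
```

In this sketch, rebuildable data never touches the primary or backup stores when an ephemeral store is configured, and falls back to the primary store when it is not.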

We also allow the Node to be configured with these stores: callers can set
exponential back-off parameters for retries, set the backup and ephemeral
stores, and build the Node with TierStore's primary store. These configuration
options also extend to our foreign interface, allowing binding targets to build the
Node with their own ffi::KVStore implementations.

A sample Python implementation is added and tested.

Additionally, we add comprehensive testing for TierStore by introducing

  1. Unit tests for TierStore core functionality.
  2. Integration tests for Node built with tiered storage.
  3. Python FFI tests for foreign ffi::KVStore implementations.

Concerns

It is worth considering how retry logic is handled, especially because of nested
retries. TierStore comes with basic retry logic by default, but some KVStore
implementations already have their own baked in (e.g., VssStore) and thus have no
need for the wrapper store's logic.
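To make the nested-retry concern concrete, here is a minimal sketch of retrying with exponential back-off. `RetrySketch` and `retry_with_backoff` are hypothetical names and do not reflect the PR's actual retry configuration; the helper is synchronous for brevity.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical retry parameters; field names are illustrative only.
#[derive(Clone, Copy)]
pub struct RetrySketch {
    pub max_attempts: u32,
    pub initial_delay: Duration,
    pub max_delay: Duration,
}

/// Runs `op` until it succeeds or `max_attempts` calls have been made,
/// doubling the delay after each failure and capping it at `max_delay`.
pub fn retry_with_backoff<T, E>(
    cfg: RetrySketch,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = cfg.initial_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= cfg.max_attempts => return Err(err),
            Err(_) => {
                sleep(delay);
                delay = (delay * 2).min(cfg.max_delay);
                attempt += 1;
            }
        }
    }
}
```

If a wrapper with three attempts calls into a store that itself retries three times, a persistently failing operation is attempted nine times, and the delays of both layers add up. That multiplication is the reason a store with built-in retries may want the wrapper's logic disabled.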

tnull and others added 5 commits October 29, 2025 09:24
We bump our rand dependency to the latest stable version.
The previously-used `thread_rng` should be fine, but `os_rng` is
guaranteed to block until there is sufficient entropy available (e.g.,
after startup), which might slightly improve security here.
To enable more realistic testing with sqlite as a backend.
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 5 times, most recently from 4cdc7dc to bbfde89 Compare November 1, 2025 22:28
joostjager and others added 3 commits November 3, 2025 11:40
Introduces a criterion-based benchmark that sends 1000 concurrent
payments between two LDK nodes to measure total duration. Also adds a
CI job to automatically run the benchmark.
…UserChannelId

Implement Display for UserChannelId
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 4 times, most recently from 29f47f3 to 264aa7f Compare November 4, 2025 22:07
tnull and others added 13 commits November 5, 2025 11:22
We previously added a bunch of commented-out `rust-lightning`
dependencies in our `Cargo.toml` to be able to easily switch between
`rust-lightning` locations. However, this is exactly what the `[patch]`
section is for, which in particular also allows patching a dependency
for the whole tree, not only this one project.

Therefore, we move the examples to a commented-out `patch` section.
.. which might be necessary for it to be able to run successfully.
Use `[patch]` instead of switching all the dependencies
…ll-request/patch

Automated nightly rustfmt (2025-11-05)
Upgrade `rand` dependency, use `os_rng` for seed generation
Recently, `rust-lightning` broke the (async) API of the `TestStore`,
making it ~impossible to use in regular tests.

Here, we un-DRY our `TestStore` implementation and simply copy over the
previous `TestStore` version, now named `InMemoryStore` to discern the
objects. We also switch all feasible instances over to use
`InMemoryStore` rather than LDK's `test_utils::TestStore`.
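For readers unfamiliar with this kind of test store, the idea can be sketched as a mutex-guarded map keyed by namespace and key. `InMemorySketch` is a hypothetical name, and this version is synchronous; it is not the `InMemoryStore` added by the commit.

```rust
use std::collections::HashMap;
use std::io;
use std::sync::Mutex;

type FullKey = (String, String, String);

/// Hypothetical test store keyed by (primary namespace, secondary namespace, key).
#[derive(Default)]
pub struct InMemorySketch {
    entries: Mutex<HashMap<FullKey, Vec<u8>>>,
}

impl InMemorySketch {
    fn full_key(primary: &str, secondary: &str, key: &str) -> FullKey {
        (primary.to_string(), secondary.to_string(), key.to_string())
    }

    pub fn write(&self, primary: &str, secondary: &str, key: &str, value: Vec<u8>) {
        let full_key = Self::full_key(primary, secondary, key);
        self.entries.lock().unwrap().insert(full_key, value);
    }

    pub fn read(&self, primary: &str, secondary: &str, key: &str) -> io::Result<Vec<u8>> {
        let full_key = Self::full_key(primary, secondary, key);
        let entries = self.entries.lock().unwrap();
        match entries.get(&full_key) {
            Some(value) => Ok(value.clone()),
            None => Err(io::Error::new(io::ErrorKind::NotFound, "key not found")),
        }
    }

    /// Removing a key that does not exist is treated as a no-op.
    pub fn remove(&self, primary: &str, secondary: &str, key: &str) {
        let full_key = Self::full_key(primary, secondary, key);
        self.entries.lock().unwrap().remove(&full_key);
    }

    /// Returns the keys stored under the given namespaces, sorted for stable output.
    pub fn list(&self, primary: &str, secondary: &str) -> Vec<String> {
        let entries = self.entries.lock().unwrap();
        let mut keys: Vec<String> = entries
            .keys()
            .filter(|full| full.0.as_str() == primary && full.1.as_str() == secondary)
            .map(|full| full.2.clone())
            .collect();
        keys.sort();
        keys
    }
}
```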
Previously, we'd still use `KVStoreSync` for persistence of our event
queue, which also meant calling the sync persistence through our
otherwise-async background processor/event handling flow.

Here we switch our `EventQueue` persistence to be async, which gets us
one step further towards async-everything.
Previously, LDK only allowed setting this for BOLT11 payments. Since we
now can, we allow specifying the `RouteParametersConfig` in BOLT12 and
`UnifiedQrPayment` APIs.
This change uses an alias (LdkChannelDetails) and an explicit Vec<LdkChannelDetails> type annotation for 'open_channels' in close_channel_internal and update_channel_config. This resolves type ambiguity caused by a name collision with the local ChannelDetails struct, which prevents rust-analyzer from correctly inferring the type as Vec, leading to an incorrect 'len() is private' error.
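The aliasing approach can be illustrated with a small self-contained sketch. The `ldk` module, its `list_channels` function, and `count_open_channels` are stand-ins invented for this example; only the `LdkChannelDetails` alias name and the `open_channels` annotation come from the commit message.

```rust
/// Stand-in for the upstream crate that exports its own `ChannelDetails`.
mod ldk {
    #[derive(Debug, Clone, PartialEq)]
    pub struct ChannelDetails {
        pub channel_id: u64,
    }

    pub fn list_channels() -> Vec<ChannelDetails> {
        vec![ChannelDetails { channel_id: 1 }, ChannelDetails { channel_id: 2 }]
    }
}

// Import the upstream type under an alias so it cannot be confused with the
// local struct of the same name.
use ldk::ChannelDetails as LdkChannelDetails;

/// Local type sharing the upstream name.
#[derive(Debug, Clone, PartialEq)]
pub struct ChannelDetails {
    pub user_channel_id: u128,
}

pub fn count_open_channels() -> usize {
    // The explicit annotation states which `ChannelDetails` is meant.
    let open_channels: Vec<LdkChannelDetails> = ldk::list_channels();
    open_channels.len()
}
```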
…-persistence

Make `EventQueue` persistence `async`
…g-with-rng

Replace deprecated thread_rng with rng
tnull and others added 20 commits November 19, 2025 12:01
Unfortunately, `doc_auto_cfg` was removed, breaking doc builds for
v0.7.0-rc.0. Here we replace it with the `doc_cfg` attribute.
…to-cfg-main

Replace docs.rs build `doc_auto_cfg` feature with `doc_cfg` (main)
…ll-request/patch

Automated nightly rustfmt (2025-11-23)
As we're about to expose more entropy-related things, we here introduce
a new module and start moving related types there.
Now that we don't use the `Runtime` in `VssStore` anymore, we can in
fact go back to reusing the public interface.
Previously, the `Builder` allowed setting different entropy sources via
its `set_entropy...` methods, defaulting to sourcing from an
auto-generated seed file in the storage path. While this allowed for
really easy setup, it spared the user from actually thinking about where
to store their node secret.

Here, we therefore introduce a mandatory `NodeEntropy` object that, as
before, allows the user to source entropy from BIP39 Mnemonic, seed
bytes, or a seed file. However, it doesn't implement any default and
hence intentionally requires manual setup by the user. Moreover, this
API refactor also allows reusing the same object outside of the
`Node`'s `Builder` in a future commit.
We previously ran into a quiet error that led to `docs.rs` not rendering
our docs properly, which unfortunately didn't surface until after we
pushed out a release (thankfully only an RC in this case).

Here, we add a CI job that tests our docs build with exactly the
settings `docs.rs` uses. Additionally, the change also has the benefit
that we now only build docs once rather than for every combination in
our workflow matrix, which was a bit overkill.
Previously, we could in tests potentially run into listening port
collisions resulting in `InvalidSocketAddress` errors. These errors
could surface if we rolled port numbers that either collided with other
concurrent tests *or* with other unrelated services running on
localhost.

Here, we simply let the OS assign us a free port number when setting up
the testing nodes, which avoids such collisions altogether (mod the
potential TOCTOU race here, which we ignore for now).
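Letting the OS pick the port usually means binding to port 0 and reading back the assigned address. The sketch below shows that technique with the standard library; the function name `pick_free_listening_address` is hypothetical and not taken from the commit.

```rust
use std::io;
use std::net::{SocketAddr, TcpListener};

/// Asks the OS for a currently free TCP port on localhost by binding to port 0.
///
/// The listener is dropped when this function returns, so another process
/// could claim the port before the caller binds it again. This is the TOCTOU
/// race mentioned above.
pub fn pick_free_listening_address() -> io::Result<SocketAddr> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    listener.local_addr()
}
```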
In rust-lightning#4220 the `check_closed_event` macro was replaced with
a method and is now also only re-exported via `functional_test_utils`.
…ci-job

Add a `docs.rs` CI job checking documentation builds
We add a simple test calling `read_or_generate_seed_file` twice,
asserting it returns the same value in both cases.
…losed-event

Account for `check_closed_event` being moved on LDK `main`
…collisions

Avoid `TcpListener` port collisions
This simply adopts the changes of rust-lightning#4250.
…round

Previously, we implemented `lazy` deletes in `VssStore` by batching them
with the next write call as part of the next `PutObjectRequest` sent.
However, we unfortunately overlooked that in this instance any
non-existent `delete_items` would yield a `ConflictError`. Rather than
batched `VssStore` lazy deletes, we therefore here opt to simply spawn
them into the background and ignore any errors.
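The spawn-and-ignore pattern might look roughly like this. It is a sketch under assumptions: `RemoteDelete`, `AlwaysConflict`, and `remove_sketch` are hypothetical names, and a plain OS thread stands in for whatever task spawning `VssStore` actually uses. The join handle is returned only so that a caller or test can wait for the background work.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a remote store's delete call.
pub trait RemoteDelete: Send + Sync + 'static {
    fn delete(&self, key: &str) -> Result<(), String>;
}

/// Test double whose deletes always fail, mimicking a conflict on a
/// non-existent item. It counts how often `delete` was attempted.
#[derive(Default)]
pub struct AlwaysConflict {
    pub attempts: AtomicUsize,
}

impl RemoteDelete for AlwaysConflict {
    fn delete(&self, _key: &str) -> Result<(), String> {
        self.attempts.fetch_add(1, Ordering::SeqCst);
        Err("conflict".to_string())
    }
}

/// Lazy removals are spawned into the background and their errors dropped;
/// non-lazy removals run inline and propagate errors to the caller.
pub fn remove_sketch<S: RemoteDelete>(
    store: &Arc<S>,
    key: &str,
    lazy: bool,
) -> Result<Option<JoinHandle<()>>, String> {
    if lazy {
        let store = Arc::clone(store);
        let key = key.to_string();
        let handle = thread::spawn(move || {
            // Errors from lazy deletes are intentionally ignored.
            let _ = store.delete(&key);
        });
        Ok(Some(handle))
    } else {
        store.delete(key)?;
        Ok(None)
    }
}
```

With a store that always reports a conflict, the lazy call still returns `Ok`, while the same delete issued non-lazily returns the error.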
…nel-details-docs

Update `ChannelDetails` docs for splicing
…deletes

Revert batched VSS `lazy` deletes, rather `spawn` them into the background
…andatory-node-entropy

Introduce new mandatory `NodeEntropy` object
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch from 264aa7f to 493dd9a Compare December 2, 2025 19:50
tnull and others added 6 commits December 3, 2025 10:38
…-post-v0.7.0-release

Update `main` post v0.7.0 release
Introduces TierStore, a KVStore implementation that manages data
across three storage layers:

- Primary: Main/remote data store
- Ephemeral: Secondary store for non-critical, easily-rebuildable
  data (e.g., network graph) with fast local access
- Backup: Tertiary store for disaster recovery with async/lazy
  operations to avoid blocking primary store

Adds four configuration methods to NodeBuilder:

- set_tier_store_backup: Configure backup data store
- set_tier_store_ephemeral: Configure ephemeral data store
- set_tier_store_retry_config: Configure retry parameters with
  exponential backoff
- build_with_tier_store: Build node with primary data store

These methods are exposed to the foreign interface via additions
in ffi/types.rs:

- ffi::SyncAndAsyncKVStore: Composed of KVStore and KVStoreSync
  methods to handle the types::SyncAndAsyncKVStore supertrait
  across FFI
- ffi::ForeignKVStoreAdapter and ffi::DynStore: Adapt/translate
  between foreign language store and native Rust store
- Conditional compilation for DynStore: ffi::DynStore with uniffi,
  types::DynStore without, with selection aided by the wrap_store!()
  macro
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch from 493dd9a to a30cbfb Compare December 4, 2025 23:10
This commit adds unit, integration, and FFI tests
for the TierStore implementation:
- Unit tests for TierStore core functionality
- Integration tests for nodes built with tiered storage
- Python FFI tests for foreign key-value store
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch from a30cbfb to 1e7bdbc Compare December 4, 2025 23:30
8 participants