Skip to content

Conversation

@spuckhafte
Copy link

@spuckhafte spuckhafte commented Dec 23, 2025

Fixes #XXXX.

Description


This PR has:

  • been tested to ensure log ingestion and log query works.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added documentation for new or modified features or behaviors.

Summary by CodeRabbit

  • Bug Fixes
    • Deleting a log stream now also attempts to remove any leftover filters tied to that stream; failures during this extra cleanup are logged but do not interrupt the deletion, reducing stray matches and improving consistency.
  • New Features
    • Added a public stream-existence check to support safer, explicit stream validations.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 23, 2025

Walkthrough

Adds a metastore-backed cleanup step to remove orphaned ("zombie") filters during log-stream deletion, extends the Metastore trait with delete_zombie_filters, and exposes a Parseable::check_stream_exists method; handler logs failures from the new metastore call but does not change deletion control flow on error.

Changes

Cohort / File(s) Summary
Log stream deletion handler
src/handlers/http/logstream.rs
After deleting stats, call PARSEABLE.metastore.delete_zombie_filters(&stream_name).await; log a warning on error and continue with storage deletion, local cleanup, in-memory removal, and 200 OK response.
Metastore trait
src/metastore/metastore_traits.rs
Add async fn delete_zombie_filters(&self, stream_name: &str) -> Result<bool, MetastoreError>; to the Metastore trait.
Object-store metastore implementation
src/metastore/metastores/object_store_metastore.rs
Import FILTERS and implement delete_zombie_filters to abort if the stream still exists, list filters, delete matching filters (calling delete_filter and removing entries from FILTERS when filter_id present), and return a boolean result.
Parseable public API
src/parseable/mod.rs
Add pub fn check_stream_exists(&self, stream_name: &str) -> bool to check the in-memory Streams map.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant Handler as LogStreamHandler
  participant Parseable
  participant Metastore
  participant Filters as FILTERS

  Client->>Handler: DELETE /streams/:name
  Handler->>Metastore: delete_stats(stream_name)
  Handler->>Parseable: check_stream_exists(stream_name)
  alt stream does not exist
    Handler->>Metastore: delete_zombie_filters(stream_name)
    alt zombie filters found
      loop per filter
        Metastore->>Metastore: delete_filter(filter)
        alt filter has filter_id
          Metastore->>Filters: remove filter_id entry
        end
      end
    end
  else stream still exists
    Handler->>Handler: skip zombie filter deletion (logs warning)
  end
  Handler->>Handler: delete storage objects & local cleanup
  Handler-->>Client: 200 OK
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • fix: delete date stats #1306 — Modifies stream deletion flow and stats removal; closely related to the handler changes that now invoke metastore filter cleanup.

Suggested labels

for next release

Suggested reviewers

  • nikhilsinhaparseable
  • parmesant

Poem

🐰 I hop the code, I sniff the streams,
I chase the filters from midnight dreams,
A tidy burrow, no ghosts remain,
One delete, one hop — cleaner terrain. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description check ⚠️ Warning The description contains only the template with no actual content filled in—all placeholder text remains unchecked, and the issue ID is not specified. Fill in the description with the goal, rationale for the solution chosen, and key changes made; specify the issue ID if applicable; and indicate checklist completion status.
Docstring Coverage ⚠️ Warning Docstring coverage is 71.43% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: adding deletion of associated filters when a stream is deleted, which aligns with the code changes across all modified files.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Contributor

CLA Assistant Lite bot:
Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by just posting a Pull Request Comment same as the below format.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting recheck in this Pull Request

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4e67fc6 and aca2671.

📒 Files selected for processing (1)
  • src/handlers/http/logstream.rs
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-02-14T09:49:25.818Z
Learnt from: de-sh
Repo: parseablehq/parseable PR: 1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern `!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)` ensures proper error handling in both modes.

Applied to files:

  • src/handlers/http/logstream.rs
📚 Learning: 2025-08-25T01:31:41.786Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metadata.rs:63-68
Timestamp: 2025-08-25T01:31:41.786Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metadata.rs and src/storage/object_storage.rs are designed to track total events across all streams, not per-stream. They use labels [origin, parsed_date] to aggregate by format and date, while per-stream metrics use [stream_name, origin, parsed_date] labels.

Applied to files:

  • src/handlers/http/logstream.rs
📚 Learning: 2025-08-25T01:32:25.980Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metrics/mod.rs:163-173
Timestamp: 2025-08-25T01:32:25.980Z
Learning: The TOTAL_EVENTS_INGESTED_DATE, TOTAL_EVENTS_INGESTED_SIZE_DATE, and TOTAL_EVENTS_STORAGE_SIZE_DATE metrics in src/metrics/mod.rs are intentionally designed to track global totals across all streams for a given date, using labels ["format", "date"] rather than per-stream labels. This is the correct design for global aggregation purposes.

Applied to files:

  • src/handlers/http/logstream.rs
📚 Learning: 2025-10-28T02:10:41.140Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1453
File: src/parseable/mod.rs:397-400
Timestamp: 2025-10-28T02:10:41.140Z
Learning: In Parseable enterprise deployments with multiple query nodes, hot tier configuration must be persisted in object storage so that newly started query nodes can fetch and synchronize the hot tier settings at startup (file: src/parseable/mod.rs, function: create_stream_and_schema_from_storage).

Applied to files:

  • src/handlers/http/logstream.rs
🔇 Additional comments (1)
src/handlers/http/logstream.rs (1)

32-32: LGTM!

The import of Filter is necessary for the new filter deletion logic and is correctly placed.

coderabbitai[bot]
coderabbitai bot previously approved these changes Dec 23, 2025
Copy link
Author

@spuckhafte spuckhafte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let PARSEABLE::metastore::get_filters be queryable using logstream name or id.

The current implementation of it fetches all the filters across existing logstreams at once.

Copy link
Contributor

@parmesant parmesant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move filter deletion so that it happens after stream deletion. Also, moving the filter deletion logic in a utility function would be better.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/parseable/mod.rs (1)

238-245: Simplify the boolean return expression.

The if-else construct is redundant when returning a boolean derived from a condition. This can be simplified to a single expression.

🔎 Proposed simplification
-    // check if a stream exists
-    pub fn check_stream_exists(&self, stream_name: &str) -> bool {
-        if self.streams.contains(stream_name) {
-            return true;
-        } else {
-            return false;
-        }
-    }
+    /// Checks if a stream exists in the in-memory streams map.
+    pub fn check_stream_exists(&self, stream_name: &str) -> bool {
+        self.streams.contains(stream_name)
+    }
src/metastore/metastores/object_store_metastore.rs (1)

549-579: Use self methods instead of global PARSEABLE.metastore calls.

The implementation calls PARSEABLE.metastore.get_filters() and PARSEABLE.metastore.delete_filter() instead of using self.get_filters() and self.delete_filter(). This is inconsistent with other methods in this trait implementation and creates unnecessary indirection through the global static. Using self is more idiomatic and avoids the implicit assumption that PARSEABLE.metastore is the same instance.

🔎 Proposed fix
     // clear filters associated to a deleted stream
     async fn delete_zombie_filters(&self, stream_name: &str) -> Result<bool, MetastoreError> {
         // stream should not exist in order to have zombie filters
         if PARSEABLE.check_stream_exists(stream_name) {
             warn!("no zombie filters cleared for [undeleted] stream {}", stream_name);
             return Ok(false);
         }

-        let all_filters = match PARSEABLE.metastore.get_filters().await {
-            Ok(all_f) => all_f,
-            Err(e) => {
-                return Err(e);
-            }
-        };
+        let all_filters = self.get_filters().await?;

         // collect filters associated with the logstream being deleted
         let filters_for_stream: Vec<Filter> = all_filters
             .into_iter()
             .filter(|filter| filter.stream_name == stream_name)
             .collect();

         for filter in filters_for_stream.iter() {
-            PARSEABLE.metastore.delete_filter(filter).await?;
-            
+            self.delete_filter(filter).await?;
+
             if let Some(filter_id) = filter.filter_id.as_ref() {
                 FILTERS.delete_filter(filter_id).await;
             }
         }

-        return Ok(true);
+        Ok(true)
     }
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 19704d9 and d812994.

📒 Files selected for processing (4)
  • src/handlers/http/logstream.rs
  • src/metastore/metastore_traits.rs
  • src/metastore/metastores/object_store_metastore.rs
  • src/parseable/mod.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/handlers/http/logstream.rs
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: de-sh
Repo: parseablehq/parseable PR: 1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern `!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)` ensures proper error handling in both modes.
📚 Learning: 2025-02-14T09:49:25.818Z
Learnt from: de-sh
Repo: parseablehq/parseable PR: 1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern `!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)` ensures proper error handling in both modes.

Applied to files:

  • src/parseable/mod.rs
  • src/metastore/metastores/object_store_metastore.rs
📚 Learning: 2025-10-21T02:22:24.403Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1448
File: src/parseable/mod.rs:419-432
Timestamp: 2025-10-21T02:22:24.403Z
Learning: In Parseable's internal stream creation (`create_internal_stream_if_not_exists` in `src/parseable/mod.rs`), errors should not propagate to fail server initialization. The function creates both pmeta and pbilling internal streams, and failures are logged but the function always returns `Ok(())` to ensure server startup resilience. Individual stream creation failures should not prevent syncing of successfully created streams.

Applied to files:

  • src/parseable/mod.rs
📚 Learning: 2025-07-28T17:10:39.448Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1392
File: src/migration/stream_metadata_migration.rs:303-322
Timestamp: 2025-07-28T17:10:39.448Z
Learning: In Parseable's migration system (src/migration/stream_metadata_migration.rs), each migration function updates the metadata to the current latest format using CURRENT_OBJECT_STORE_VERSION and CURRENT_SCHEMA_VERSION constants, rather than producing incremental versions. For example, v5_v6 function produces v7 format output when these constants are set to "v7", not v6 format.

Applied to files:

  • src/metastore/metastores/object_store_metastore.rs
📚 Learning: 2025-10-28T02:10:41.140Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1453
File: src/parseable/mod.rs:397-400
Timestamp: 2025-10-28T02:10:41.140Z
Learning: In Parseable enterprise deployments with multiple query nodes, hot tier configuration must be persisted in object storage so that newly started query nodes can fetch and synchronize the hot tier settings at startup (file: src/parseable/mod.rs, function: create_stream_and_schema_from_storage).

Applied to files:

  • src/metastore/metastores/object_store_metastore.rs
📚 Learning: 2025-09-18T09:52:07.554Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/storage/object_storage.rs:173-177
Timestamp: 2025-09-18T09:52:07.554Z
Learning: In Parseable's upload system (src/storage/object_storage.rs), the update_storage_metrics function can safely use path.metadata().map_err() to fail on local file metadata read failures because parquet validation (validate_uploaded_parquet_file) ensures file integrity before this step, and the system guarantees local staging files remain accessible throughout the upload flow.

Applied to files:

  • src/metastore/metastores/object_store_metastore.rs
📚 Learning: 2025-09-06T04:26:17.191Z
Learnt from: parmesant
Repo: parseablehq/parseable PR: 1424
File: src/enterprise/utils.rs:65-72
Timestamp: 2025-09-06T04:26:17.191Z
Learning: In Parseable's metastore implementation, MetastoreError::to_detail() returns a MetastoreErrorDetail struct (not a string), which contains structured error information including operation, message, stream_name, and other contextual fields. This struct is designed to be boxed in ObjectStorageError::MetastoreError(Box<MetastoreErrorDetail>).

Applied to files:

  • src/metastore/metastores/object_store_metastore.rs
🧬 Code graph analysis (2)
src/metastore/metastore_traits.rs (2)
src/metastore/metastores/object_store_metastore.rs (1)
  • delete_zombie_filters (550-579)
src/validator.rs (1)
  • stream_name (36-71)
src/metastore/metastores/object_store_metastore.rs (2)
src/users/filters.rs (1)
  • migrate_v1_v2 (181-193)
src/metastore/metastore_traits.rs (1)
  • delete_zombie_filters (111-111)
🔇 Additional comments (3)
src/metastore/metastore_traits.rs (1)

111-111: LGTM - trait method signature is appropriate.

The new delete_zombie_filters method follows the existing trait patterns and is logically grouped with other filter methods. Consider adding a doc comment to clarify the return semantics (true = filters deleted, false = stream still exists).

src/metastore/metastores/object_store_metastore.rs (2)

59-59: LGTM - import updated correctly.

The FILTERS import is necessary for in-memory cache cleanup during zombie filter deletion.


570-576: Consider partial failure handling.

If deletion fails mid-loop (e.g., network error on the 3rd of 5 filters), the method returns an error but some filters have already been deleted from storage. The in-memory FILTERS cache may become inconsistent with storage state. Consider whether to:

  1. Continue deleting remaining filters and collect errors
  2. Log which filters were successfully deleted before the error

This is a minor robustness concern since the operation can be retried.

coderabbitai[bot]
coderabbitai bot previously approved these changes Dec 24, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/metastore/metastores/object_store_metastore.rs (1)

549-579: LGTM! Zombie filter cleanup logic is sound.

The implementation correctly:

  • Guards against cleaning filters for active streams (line 552)
  • Fetches and filters to stream-specific filters
  • Removes from both storage and in-memory cache for consistency

The dual deletion (storage + in-memory) at lines 571-574 is essential to prevent stale filter references.

Optional: Simplify code style
-        let all_filters = match self.get_filters().await {
-            Ok(all_f) => all_f,
-            Err(e) => {
-                return Err(e);
-            }
-        };
+        let all_filters = self.get_filters().await?;
 
         // collect filters associated with the logstream being deleted
         let filters_for_stream: Vec<Filter> = all_filters
@@ -569,13 +564,11 @@
 
         for filter in filters_for_stream.iter() {
             self.delete_filter(filter).await?;
-            
             if let Some(filter_id) = filter.filter_id.as_ref() {
                 FILTERS.delete_filter(filter_id).await;
             }
         }
 
-        return Ok(true);
+        Ok(true)
     }
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d812994 and 6e96d09.

📒 Files selected for processing (2)
  • src/metastore/metastores/object_store_metastore.rs
  • src/parseable/mod.rs
🧰 Additional context used
🧠 Learnings (8)
📓 Common learnings
Learnt from: de-sh
Repo: parseablehq/parseable PR: 1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern `!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)` ensures proper error handling in both modes.
📚 Learning: 2025-10-28T02:10:41.140Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1453
File: src/parseable/mod.rs:397-400
Timestamp: 2025-10-28T02:10:41.140Z
Learning: In Parseable enterprise deployments with multiple query nodes, hot tier configuration must be persisted in object storage so that newly started query nodes can fetch and synchronize the hot tier settings at startup (file: src/parseable/mod.rs, function: create_stream_and_schema_from_storage).

Applied to files:

  • src/metastore/metastores/object_store_metastore.rs
📚 Learning: 2025-07-28T17:10:39.448Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1392
File: src/migration/stream_metadata_migration.rs:303-322
Timestamp: 2025-07-28T17:10:39.448Z
Learning: In Parseable's migration system (src/migration/stream_metadata_migration.rs), each migration function updates the metadata to the current latest format using CURRENT_OBJECT_STORE_VERSION and CURRENT_SCHEMA_VERSION constants, rather than producing incremental versions. For example, v5_v6 function produces v7 format output when these constants are set to "v7", not v6 format.

Applied to files:

  • src/metastore/metastores/object_store_metastore.rs
📚 Learning: 2025-09-18T09:52:07.554Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/storage/object_storage.rs:173-177
Timestamp: 2025-09-18T09:52:07.554Z
Learning: In Parseable's upload system (src/storage/object_storage.rs), the update_storage_metrics function can safely use path.metadata().map_err() to fail on local file metadata read failures because parquet validation (validate_uploaded_parquet_file) ensures file integrity before this step, and the system guarantees local staging files remain accessible throughout the upload flow.

Applied to files:

  • src/metastore/metastores/object_store_metastore.rs
📚 Learning: 2025-09-06T04:26:17.191Z
Learnt from: parmesant
Repo: parseablehq/parseable PR: 1424
File: src/enterprise/utils.rs:65-72
Timestamp: 2025-09-06T04:26:17.191Z
Learning: In Parseable's metastore implementation, MetastoreError::to_detail() returns a MetastoreErrorDetail struct (not a string), which contains structured error information including operation, message, stream_name, and other contextual fields. This struct is designed to be boxed in ObjectStorageError::MetastoreError(Box<MetastoreErrorDetail>).

Applied to files:

  • src/metastore/metastores/object_store_metastore.rs
📚 Learning: 2025-02-14T09:49:25.818Z
Learnt from: de-sh
Repo: parseablehq/parseable PR: 1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern `!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)` ensures proper error handling in both modes.

Applied to files:

  • src/metastore/metastores/object_store_metastore.rs
  • src/parseable/mod.rs
📚 Learning: 2025-10-21T02:22:24.403Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1448
File: src/parseable/mod.rs:419-432
Timestamp: 2025-10-21T02:22:24.403Z
Learning: In Parseable's internal stream creation (`create_internal_stream_if_not_exists` in `src/parseable/mod.rs`), errors should not propagate to fail server initialization. The function creates both pmeta and pbilling internal streams, and failures are logged but the function always returns `Ok(())` to ensure server startup resilience. Individual stream creation failures should not prevent syncing of successfully created streams.

Applied to files:

  • src/parseable/mod.rs
📚 Learning: 2025-09-05T09:18:44.813Z
Learnt from: parmesant
Repo: parseablehq/parseable PR: 1425
File: src/query/mod.rs:484-495
Timestamp: 2025-09-05T09:18:44.813Z
Learning: In the Parseable system, stream names and column names cannot contain quotes, which eliminates SQL injection concerns when interpolating these identifiers directly into SQL queries in src/query/mod.rs.

Applied to files:

  • src/parseable/mod.rs
🧬 Code graph analysis (2)
src/metastore/metastores/object_store_metastore.rs (2)
src/users/filters.rs (1)
  • migrate_v1_v2 (181-193)
src/metastore/metastore_traits.rs (1)
  • delete_zombie_filters (111-111)
src/parseable/mod.rs (1)
src/validator.rs (1)
  • stream_name (36-71)
🔇 Additional comments (2)
src/parseable/mod.rs (1)

238-241: LGTM! Clean helper method.

The synchronous in-memory stream existence check is straightforward and fits the need for the metastore filter cleanup operation. The naming clearly conveys its purpose.

src/metastore/metastores/object_store_metastore.rs (1)

59-59: LGTM! Import supports in-memory cache cleanup.

The addition of FILTERS to imports is necessary for the in-memory filter deletion at line 574.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants