-
Notifications
You must be signed in to change notification settings - Fork 4
Compute nearest filters via indexed filter ranges #25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Compute nearest filters via indexed filter ranges #25
Conversation
…rting options. Update README to reflect new options for maxDistance and filtering logic. Modify API definitions and implementations to support must/should filter conditions and sorting intervals. Add tests for new filtering functionality in closest point queries.
… and improve performance. Introduce a new buildStreamForCell method to manage filter conditions and optimize data retrieval. Update README to clarify maxDistance usage and filtering logic. Add tests for multiple filter scenarios in closest point queries.
…performance. Replace the buildStreamForCell method with direct database queries for point retrieval, enhancing clarity and efficiency. Introduce new helper methods for filter checks and sorting interval validation.
…etrieval and improving filter checks. Add methods for managing cell processing and fetching point entries in batches, optimizing performance and clarity in the query logic.
…precated filter methods. Introduce FILTER_SUBDIVIDE_THRESHOLD for enhanced performance in point retrieval, streamlining the query process.
…ove point retrieval logic. Introduce CellStreamState for better state handling during processing, and optimize filter checks and batch processing for improved performance.
…ia-indexed-filter-ranges
…necessary type assertions and directly using the context in query executions. This improves code clarity and maintains functionality.
commit: |
|
@ianmacartney just checking back in, how are we feeling about this? |
|
Sorry for the delay, Seth and I are heads down on improving Components
authoring, and haven't had time to reason through this and may not get to
it for a week or so.
…On Fri, Nov 21, 2025 at 8:09 AM MacKinley Smith ***@***.***> wrote:
*mackinleysmith* left a comment (get-convex/geospatial#25)
<#25 (comment)>
@ianmacartney <https://github.com/ianmacartney> just checking back in,
how are we feeling about this?
—
Reply to this email directly, view it on GitHub
<#25 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AACZQW4PPFDXS7C7CREFBMT3542LXAVCNFSM6AAAAACLZ3SXVGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZTKNRTG4YDANBXGM>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
|
@ianmacartney thanks for the update. I'm here for you if you need anything! |
sethconve
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm taking a look, will post more comments as I get through it!
src/client/index.ts
Outdated
| maxDistanceOrOptions?: | ||
| | number | ||
| | QueryNearestOptions<GeospatialDocument<Key, Filters>>, | ||
| maybeOptions?: QueryNearestOptions<GeospatialDocument<Key, Filters>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any concerns about removing maybeOptions, and forcing an options object if you want to specify maxDistance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sethconve the only concern i had and the reason i did it this way was to try not to make a breaking change for other users of this package. i think what you're suggesting is cleaner, and if convex isn't concerned about making a breaking change here, then i'm happy to tidy this up for you and go with your suggestion!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make a new method nearest(ctx, { point, limit, maxDistance, filter }) and mark queryNearest as deprecated that only takes distance and internally calls nearest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds reasonable to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ianmacartney i have updated the interface to be as you described. let me know if you think of anything else!
sethconve
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mackinleysmith looks great over all. One small question on simplifying the function signature. Do you have strong opinions there?
…and code to reflect changes, ensuring backward compatibility with `maxDistance` argument. Adjust related types and methods for improved clarity and consistency.
|
Important Review skippedReview was skipped due to path filters ⛔ Files ignored due to path filters (2)
CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including You can disable this status message by setting the WalkthroughDeprecates the legacy queryNearest and introduces a unified nearest(ctx, { point, limit, maxDistance, filter }) API; threads filter and sort interval parameters through the query pipeline and implements multi-condition filtering and sort-interval enforcement in ClosestPointQuery. Changes
Sequence DiagramsequenceDiagram
participant Client
participant GeospatialIndex
participant FilterBuilder
participant QueryPipeline
participant ClosestPointQuery
Client->>GeospatialIndex: nearest(ctx, {point, limit, maxDistance, filter})
GeospatialIndex->>FilterBuilder: build filter conditions (must/should)
FilterBuilder-->>GeospatialIndex: FilterCondition[] (must/should)
GeospatialIndex->>QueryPipeline: execute with filtering & sorting.interval
QueryPipeline->>ClosestPointQuery: new ClosestPointQuery(..., filtering, interval)
rect rgb(230,245,255)
Note over ClosestPointQuery: For each cell: create/get stream, apply stream-level filters\napply matchesFilters, enforce sort interval, accumulate results, stop if thresholds met
end
ClosestPointQuery-->>QueryPipeline: filtered & ordered results
QueryPipeline-->>GeospatialIndex: result set
GeospatialIndex-->>Client: nearest result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/component/lib/pointQuery.ts (1)
150-152: Double-checking filters on final results is defensive but adds overhead.The indexed streams (FilterKeyRange + Intersection) should already filter by
must/shouldclauses and sort interval. Re-applyingmatchesFiltershere acts as a safety net but may be redundant. If the stream filtering is correct, this is unnecessary work.Consider whether this is intentional (defense in depth) or can be removed for performance.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (3)
package-lock.jsonis excluded by!**/package-lock.jsonsrc/component/_generated/api.d.tsis excluded by!**/_generated/**src/component/_generated/component.tsis excluded by!**/_generated/**
📒 Files selected for processing (9)
README.md(2 hunks)example/convex/example.ts(1 hunks)package.json(1 hunks)src/client/index.ts(2 hunks)src/component/lib/pointQuery.ts(7 hunks)src/component/query.ts(3 hunks)src/component/streams/constants.ts(1 hunks)src/component/streams/intersection.ts(1 hunks)src/component/tests/pointQuery.test.ts(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
src/component/streams/constants.ts (1)
src/component/query.ts (1)
PREFETCH_SIZE(18-18)
src/client/index.ts (4)
example/convex/_generated/dataModel.d.ts (1)
Doc(30-33)src/component/types.ts (2)
Point(18-18)point(14-17)src/client/query.ts (1)
GeospatialQuery(7-20)example/convex/_generated/server.d.ts (1)
QueryCtx(107-107)
src/component/lib/pointQuery.ts (13)
src/client/index.ts (2)
Primitive(12-12)point(13-13)src/component/lib/primitive.ts (1)
Primitive(10-10)src/component/types.ts (2)
Primitive(3-3)point(14-17)src/component/lib/interval.ts (2)
Interval(7-7)interval(3-6)src/component/lib/s2Bindings.ts (1)
cellIDToken(39-48)src/component/lib/tupleKey.ts (1)
decodeTupleKey(38-74)src/component/streams/zigzag.ts (2)
Stats(30-35)PointSet(3-28)src/component/streams/cellRange.ts (1)
CellRange(8-99)src/component/query.ts (1)
PREFETCH_SIZE(18-18)src/component/streams/constants.ts (1)
PREFETCH_SIZE(1-1)src/component/streams/filterKeyRange.ts (1)
FilterKeyRange(9-106)src/component/streams/union.ts (1)
Union(10-119)src/component/streams/intersection.ts (1)
Intersection(5-113)
src/component/tests/pointQuery.test.ts (1)
src/component/lib/pointQuery.ts (1)
ClosestPointQuery(23-353)
🔇 Additional comments (24)
package.json (1)
1-109: Package configuration is sound; only EOF newline changed.The package.json maintains consistent dependency health with convex pinned at 1.29.0 in dev and peer-constrained to ^1.24.8, allowing flexibility for consumers. The version bump from 0.1.x to 0.2.0 appropriately reflects the API additions in this PR (new
nearestmethod and queryNearest enhancements). No dependency conflicts, export changes, or script modifications are present.README.md (2)
139-141: Documentation correctly reflects the deprecation.The deprecation notice for
queryNearestis clear and directs users to the newnearestmethod.
272-277: API example demonstrates the newnearestmethod well.The code example clearly shows the options-based API with
point,limit,maxDistance, andfilterparameters.src/component/streams/constants.ts (1)
1-2: Good refactor to centralize the prefetch constant.Extracting
PREFETCH_SIZEto a dedicated constants module improves maintainability and ensures consistency acrossintersection.ts,query.ts, andpointQuery.ts.src/component/streams/intersection.ts (1)
2-2: Import path correctly updated.The import now sources
PREFETCH_SIZEfrom the dedicated constants module, aligning with the refactoring in this PR.example/convex/example.ts (1)
29-33: Example correctly migrated to newnearestAPI.The migration from
queryNearesttonearestwith the options object{ point, limit, maxDistance }is correct and demonstrates the intended usage pattern.src/component/tests/pointQuery.test.ts (5)
30-42: Good test data setup with distinct filter categories.The test points now have meaningful
filterKeys("coffee"and"tea") that enable thorough testing of the must/should filter logic.
102-123: Must filter test correctly validates filtering behavior.The test verifies that a
mustfilter oncategory: "coffee"returns only the two matching points (point1andpoint3), excludingpoint2which hascategory: "tea".
125-145: Should filter test correctly validates OR semantics.When a single
shouldfilter is provided, it acts as a required filter, correctly returning onlypoint2withcategory: "tea".
147-162: Sort interval test validates boundary behavior.Using
startInclusive: 3correctly filters to onlypoint3(which hassortKey: 3), excludingpoint1andpoint2with lower sort keys.
164-191: Multiple should filters test validates OR combination.The test correctly verifies that multiple
shouldclauses are combined with OR semantics, returning all three points when both"tea"and"coffee"categories are specified.src/component/query.ts (3)
16-18: Clean import and re-export of PREFETCH_SIZE.The constant is now sourced from the dedicated constants module and re-exported for backward compatibility with any consumers that import from
query.ts.
278-281: New filtering and sorting parameters correctly added.The
nearestPointsquery now acceptsfilteringandsortingparameters, enabling filtered nearest-neighbor queries through the same filter builder used by rectangle queries.
300-301: Parameters correctly propagated to ClosestPointQuery.The
filteringarray andsorting.intervalare passed to theClosestPointQueryconstructor, wiring the new filter functionality through to the query execution.src/component/lib/pointQuery.ts (6)
17-21: FilterCondition type cleanly defines filter metadata.The type structure with
filterKey,filterValue, andoccur("must" | "should") aligns well with the filter builder pattern used in the client API.
31-36: Private fields properly encapsulate filter state and caching.The separation of
mustFiltersandshouldFilters, along with thecellStreamscache, supports efficient filtered query execution.
85-92: Verify the subdivide threshold logic.The condition
approxRows >= ClosestPointQuery.FILTER_SUBDIVIDE_THRESHOLD(where threshold is 8) seems inverted. Typically, you'd subdivide when the cell has more rows than the threshold, but hereFILTER_SUBDIVIDE_THRESHOLD = 8andshouldSubdivideis true whenapproxRows >= 1OR when filters are active andapproxRows >= 8. This means cells with ≥8 rows always subdivide when filters are active.Is this intentional? It seems like the condition should be
approxRows <= FILTER_SUBDIVIDE_THRESHOLDto avoid subdividing small cells when filtering can efficiently narrow results.
108-137: Streaming cell processing is well-structured.The loop correctly:
- Checks early termination via
shouldStopProcessingCell- Validates sort interval bounds before fetching documents
- Applies
matchesFiltersbefore adding results- Properly advances the stream and marks completion
190-251: Stream composition correctly implements filter semantics.The implementation properly:
- Creates a
CellRangefor spatial filtering- Adds individual
FilterKeyRangestreams for eachmustfilter- Wraps
shouldfilters in aUnion(OR semantics)- Combines all with
Intersection(AND semantics)This ensures
mustclauses are all satisfied while anyshouldclause suffices.
289-298: pointMatchesCondition correctly handles array-valued filter keys.The implementation properly checks if the filter value exists in an array or matches a scalar, supporting multi-valued filter keys.
src/client/index.ts (4)
40-47: NearestQueryOptions type is well-designed.The type cleanly encapsulates all nearest query parameters with appropriate optionality. Using
NonNullable<GeospatialQuery<Doc>["filter"]>for the filter ensures type consistency with the rectangle query API.
49-54: Deprecation is correctly documented.The
@deprecatedJSDoc annotation with clear migration guidance helps users transition to the new API.
229-257: Newnearestmethod correctly implements the agreed-upon API.The implementation:
- Uses
FilterBuilderImplconsistently with thequerymethod- Passes
filterConditionsandintervalto the backend- Maps
limittomaxResultsfor the backend callThis follows the pattern discussed in past review comments where @ianmacartney suggested creating a new
nearestmethod with options object.
259-275: DeprecatedqueryNearestcorrectly delegates tonearest.The wrapper preserves backward compatibility by forwarding to the new method with appropriate parameter mapping (
maxResults→limit).
…y function, detailing the options object and its properties, including `point`, `limit`, `maxDistance`, and `filter`. This enhances understanding of the query capabilities and maintains consistency with recent changes.
We currently don't rebuild them in CI, so it is useful to commit the _generrated files
This is a follow-on for #23 -- apologies if there is a better way to set the base branch here, but I couldn't find it, so this contains the content of both PRs.
Summary
queryNearestclient helper to the same filter builder used by rectangle queries, carryingfilteringandsorting.intervalthrough the Convex query layer.FilterKeyRangestreams with cell scans somust/shouldclauses are enforced before document fetches.Details
GeospatialIndex.queryNearestnow accepts either a distance or options object and builds filter/sort intervals that are passed tocomponent.query.nearestPoints.ClosestPointQueryfactors in filter conditions when choosing cell subdivisions and mergesFilterKeyRangestreams viaIntersection, usingPREFETCH_SIZEfrom the shared constants module.maxDistanceinteract, emphasizing the performance win from narrowing the search space.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Summary by CodeRabbit
New Features
Documentation
Deprecations
Tests
✏️ Tip: You can customize this high-level summary in your review settings.