@joanagmaia
Contributor

@joanagmaia joanagmaia commented Nov 25, 2025

This PR follows this discussion https://linuxfoundation.slack.com/archives/C06ULRCFJF8/p1763982626411879.
This PR updates the LLM prompt used to automatically merge member profiles, making it much stricter about merging profiles that have different identities on the same platform.

Even with the pro tip, we saw many merges between profiles that had different values for the same platform.

Tested against two pairs of profiles that had previously been merged:

  • Bot merged with normal profile: ['b18bb1c0-859c-11f0-83d5-5fc87b30e90a', 'a10f9ad5-e705-4acc-8378-53416788eb80']
  • Two profiles with two verified GitHub usernames: ['7fcf7950-b95b-11ee-8ae5-b5749935eef4', 'f249ae23-af7c-4b28-910c-3e736d65c671']

Note

Strengthens the LLM prompt to prevent merging when same-platform identities differ and to never merge bot with human profiles.

  • LLM Prompt Updates in services/apps/merge_suggestions_worker/src/workflows/mergeMembersWithLLM.ts:
    • Add critical rule to NEVER merge if both members share the same identities.platform but have different identities.value (checked first, regardless of other similarities).
    • Add bot checks to NEVER merge when attributes.isBot.default differs (bot vs human), evaluated before other similarities.
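
The two rules above are enforced through the prompt, but they can also be read as a deterministic predicate. The following is a minimal sketch under stated assumptions, not code from this PR: `Identity`, `MemberProfile`, and `violatesHardMergeRules` are hypothetical names, the member shape is simplified, values are compared with exact string equality, and a missing `isBot` attribute is treated as "not a bot". Only the field paths `identities.platform`, `identities.value`, and `attributes.isBot.default` come from the note above.

```typescript
// Hypothetical, simplified shapes; the real member type in the repo may differ.
interface Identity {
  platform: string
  value: string
}

interface MemberProfile {
  identities: Identity[]
  attributes?: { isBot?: { default?: boolean } }
}

// Returns a reason string when a hard rule forbids the merge, otherwise null.
function violatesHardMergeRules(a: MemberProfile, b: MemberProfile): string | null {
  // Bot rule: never merge when one profile is a bot and the other is not.
  // Assumption: an absent isBot attribute means "not a bot".
  const aIsBot = a.attributes?.isBot?.default === true
  const bIsBot = b.attributes?.isBot?.default === true
  if (aIsBot !== bIsBot) return 'bot-vs-human'

  // Identity rule: never merge when both profiles have an identity on the
  // same platform but with different values (literal reading of the rule).
  for (const ia of a.identities) {
    for (const ib of b.identities) {
      if (ia.platform === ib.platform && ia.value !== ib.value) {
        return `conflicting-identity:${ia.platform}`
      }
    }
  }
  return null
}
```

A check like this could run before or after the LLM call as a guard; whether the workflow does so is not stated in this PR.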

Written by Cursor Bugbot for commit 23d9fd1. This will update automatically on new commits. Configure here.

@joanagmaia joanagmaia requested a review from skwowet November 25, 2025 19:08
@joanagmaia joanagmaia self-assigned this Nov 25, 2025
@github-actions
Contributor

⚠️ Jira Issue Key Missing

Your PR title doesn't contain a Jira issue key. Consider adding it for better traceability.

Example:

  • feat: add user authentication (CM-123)
  • feat: add user authentication (IN-123)

Projects:

  • CM: Community Data Platform
  • IN: Insights

Please add a Jira issue key to your PR title.


@cursor cursor bot left a comment

This PR is being reviewed by Cursor Bugbot

Signed-off-by: Joana Maia <jmaia@contractor.linuxfoundation.org>
// Format: [primaryMemberId, secondaryMemberId]
const HARDCODED_SUGGESTIONS: string[][] = [
  ['b18bb1c0-859c-11f0-83d5-5fc87b30e90a', 'a10f9ad5-e705-4acc-8378-53416788eb80'],
]
Bug: Test code with hardcoded data committed to production workflow

The workflow has been replaced with test/debugging code. HARDCODED_SUGGESTIONS contains specific test UUIDs with a comment saying "Replace with your test member ID pairs". The original getRawMemberMergeSuggestions call that fetches real suggestions is removed, saveLLMVerdict is removed so verdicts aren't saved, suggestion table updates are removed, and continueAsNew is replaced with "Test run completed!". This breaks all production functionality - the workflow will only process one hardcoded member pair and then stop.
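
One way to keep fixed test pairs available without displacing the production path is to make them an explicit opt-in. The sketch below is an illustration only: `SuggestionPair`, `resolveSuggestions`, `loadProduction`, and `testPairs` are hypothetical names, and it does not reproduce the signature of `getRawMemberMergeSuggestions` or any other function in the workflow.

```typescript
// [primaryMemberId, secondaryMemberId], matching the format in the snippet above.
type SuggestionPair = [primaryMemberId: string, secondaryMemberId: string]

// Hypothetical helper: fixed pairs are used only when the caller passes them
// explicitly; with no test pairs, the production loader is always called.
function resolveSuggestions<R>(
  loadProduction: () => R,
  testPairs: SuggestionPair[] = [],
): R | SuggestionPair[] {
  return testPairs.length > 0 ? testPairs : loadProduction()
}
```

With a loader that returns a promise, the caller can simply `await` the result, since awaiting a plain array is harmless.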

@joanagmaia joanagmaia force-pushed the fix/merge-members-llm-prompt branch from a55e2f4 to 23d9fd1 Compare November 28, 2025 12:44
@joanagmaia
Contributor Author

@skwowet I've already deployed this, but I'll wait for your review before merging.
This week we detected many data quality issues, mainly multiple profiles being merged together when they clearly shouldn't have been. We saw this in profiles that already had one verified GitHub identity, and in bot profiles.
This PR adds much stricter rules to prevent this.

I also added two monitors to Metaplane to track whether the problem keeps growing or whether this fix mitigates it.

2 participants