@jlpainter

This patch adds code for additional kernel metrics to ctakes-ytex.

These include:

  • Intrinsic Resnik
  • Resnik
  • Intrinsic Faith
  • Faith
  • Dice
  • Simpson
  • Braun-Blanquet
  • Ochiai

The algorithms for most of these can be found either in the original Perl UMLS::Similarity package or in the paper by Sanchez and Batet: https://www.sciencedirect.com/science/article/pii/S1532046411000645
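For readers unfamiliar with the last four metrics in the list, here is a minimal sketch of the classical set-overlap forms of Dice, Simpson, Braun-Blanquet and Ochiai. It is not the code in this patch: the class name `SetOverlapSketch`, the method names, and the choice to represent each concept as a plain set of strings (for example its ancestor identifiers) are assumptions made for illustration, and the patch may feed these formulas with different quantities.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of ctakes-ytex: classical set-overlap coefficients.
public class SetOverlapSketch {

    // Number of elements shared by both sets.
    public static int intersectionSize(Set<String> a, Set<String> b) {
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size();
    }

    // Dice: 2 * |A ∩ B| / (|A| + |B|)
    public static double dice(Set<String> a, Set<String> b) {
        int total = a.size() + b.size();
        return total == 0 ? 0.0 : 2.0 * intersectionSize(a, b) / total;
    }

    // Simpson: |A ∩ B| / min(|A|, |B|)
    public static double simpson(Set<String> a, Set<String> b) {
        int min = Math.min(a.size(), b.size());
        return min == 0 ? 0.0 : (double) intersectionSize(a, b) / min;
    }

    // Braun-Blanquet: |A ∩ B| / max(|A|, |B|)
    public static double braunBlanquet(Set<String> a, Set<String> b) {
        int max = Math.max(a.size(), b.size());
        return max == 0 ? 0.0 : (double) intersectionSize(a, b) / max;
    }

    // Ochiai: |A ∩ B| / sqrt(|A| * |B|)
    public static double ochiai(Set<String> a, Set<String> b) {
        double product = (double) a.size() * b.size();
        return product == 0.0 ? 0.0 : intersectionSize(a, b) / Math.sqrt(product);
    }

    public static void main(String[] args) {
        // Made-up identifiers, for demonstration only.
        Set<String> a = new HashSet<>(Arrays.asList("A", "B", "C"));
        Set<String> b = new HashSet<>(Arrays.asList("B", "C", "D", "E"));
        System.out.println("dice=" + dice(a, b));
        System.out.println("simpson=" + simpson(a, b));
        System.out.println("braunBlanquet=" + braunBlanquet(a, b));
        System.out.println("ochiai=" + ochiai(a, b));
    }
}
```

All four share the same numerator and differ only in how the overlap is normalised, which is why Simpson is never smaller than Braun-Blanquet for the same pair of sets.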

Examples were computed, compared with the output of the Perl UMLS::Similarity package, and verified to be the same. However, when testing against the Perl package, you must specify --intrinsic sanchez, because the cTAKES YTEX implementation of the information content (IC) uses ONLY the Sanchez implementation. If you do not specify this option when calling the Perl scripts, they default to the corpus-based IC, which produces different numbers. Once you force the Sanchez IC, the distance metrics correspond exactly when both run against the same installed UMLS database.
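To illustrate why the choice of IC changes the numbers, here is a minimal sketch of an intrinsic (ontology-derived) IC and of the Resnik and Faith measures built on top of it. The leaf/subsumer formula below is the form commonly attributed to Sanchez et al. and is stated here as an assumption, as are the natural logarithm, the class name `IntrinsicIcSketch` and all method and parameter names; none of this is taken from the patch or from UMLS::Similarity.

```java
// Hypothetical helper, not part of ctakes-ytex or UMLS::Similarity.
public class IntrinsicIcSketch {

    // Assumed intrinsic IC: -log( (leaves / subsumers + 1) / (maxLeaves + 1) ),
    // where leaves    = number of leaf descendants of the concept,
    //       subsumers = number of ancestors of the concept, counting the concept itself,
    //       maxLeaves = number of leaves in the whole taxonomy.
    public static double sanchezIc(int leaves, int subsumers, int maxLeaves) {
        if (leaves < 0 || subsumers <= 0 || maxLeaves < 0) {
            throw new IllegalArgumentException("counts must be non-negative, subsumers positive");
        }
        return -Math.log(((double) leaves / subsumers + 1.0) / (maxLeaves + 1.0));
    }

    // Resnik: the IC of the least common subsumer (LCS) of the two concepts.
    public static double resnik(double icLcs) {
        return icLcs;
    }

    // Faith: IC(lcs) / (IC(a) + IC(b) - IC(lcs)).
    public static double faith(double icA, double icB, double icLcs) {
        double denominator = icA + icB - icLcs;
        return denominator == 0.0 ? 0.0 : icLcs / denominator;
    }

    public static void main(String[] args) {
        // Made-up counts for a toy taxonomy with 9 leaves.
        double icA = sanchezIc(0, 4, 9);   // a leaf concept
        double icB = sanchezIc(2, 3, 9);   // an inner concept
        double icLcs = sanchezIc(5, 2, 9); // their least common subsumer
        System.out.println("resnik=" + resnik(icLcs));
        System.out.println("faith=" + faith(icA, icB, icLcs));
    }
}
```

Because Resnik and Faith are pure functions of the IC values, swapping a corpus-based IC for an intrinsic one changes every result even though the similarity formulas themselves stay the same.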

@Johnsd11 mentioned this pull request on Apr 28, 2025
