-
Notifications
You must be signed in to change notification settings - Fork 75
Token merging algorithm #402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Hi @sdiazlor , I noticed the checks haven’t started yet — could someone with access trigger the CI run? Thanks! |
|
Thank you so much for your contribution, @oii-nasif! I have triggered the CI, and then we will review |
|
Hi, thanks for the nice work ! I have been through the code and it is already is good shape (I have some questions but I will do a proper review later). The vit_small model you use during the test is not defined (yields |
|
This PR has been inactive for 10 days and is now marked as stale. |
|
@oii-nasif Kind reminder! :) |
|
This PR has been inactive for 10 days and is now marked as stale. |
|
Hi @oii-nasif, I saw this was automatically closed and abandoned. Are you interested in picking this up again? |
|
@llcnt are you able to follow up on this. |
llcnt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will be happy to re-review once the following errors are fixed: #402 (comment) ;)
|
This PR has been inactive for 10 days and is now marked as stale. |
Fixes #399
Token Merging Algorithm Implementation
Hi @sdiazlor and Pruna team! 👋
I've implemented a Token Merging (ToMe) algorithm for Pruna as suggested in this issue. This is a cutting-edge optimization technique that accelerates Vision Transformers and similar models.
🎯 What's Implemented
Algorithm: Token Merging for Vision Transformers
Category: Pruner
Performance: 1.5-2x speedup, 20-30% memory reduction, <1% accuracy loss
📦 Files Added/Modified
Core Implementation
src/pruna/algorithms/pruning/token_merging.py- Main algorithm classsrc/pruna/algorithms/pruning/token_merging_utils.py- ToMe utilitiestests/algorithms/testers/pruning.py- Test casesDocumentation
TOKEN_MERGING.md- Comprehensive documentationexamples/token_merging_example.py- Usage examplesISSUE_364_IMPLEMENTATION.md- Implementation details🚀 Quick Example
✨ Key Features
📊 Performance
🔧 How It Works
Token Merging uses bipartite soft matching to progressively merge similar tokens:
This is particularly effective for Vision Transformers where many image patches contain redundant information.
💡 Why This Algorithm?
🎓 References
🧪 Testing
All syntax checks pass:
To run the full test suite:
🔮 Future Enhancements
Potential improvements for future versions:
📝 Next Steps
This implementation is ready for:
Looking forward to seeing this merged! 🎉
Note
Introduce a Token Merging (ToMe) pruner with PyTorch utilities and tests, enabling configurable token reduction in ViTs via forward-hook patching.
TokenMergingPruner(src/pruna/algorithms/pruning/token_merging.py):reduction_ratiohyperparameter and compatibility metadata._applyusingapply_tome_to_vit.src/pruna/algorithms/pruning/token_merging_utils.py):bipartite_soft_matching,do_nothing, andapply_tome_to_vitto patch transformer blocks and merge tokens during forward passes.TestTokenMergingintests/algorithms/testers/pruning.py(modelvit_small) asserting_tome_ris set and > 0; retains existing pruning tests.Written by Cursor Bugbot for commit 9d80630. This will update automatically on new commits. Configure here.