feat: Add BertForTokenClassification for Named Entity Recognition in Rust #3212
Summary
Implements BertForTokenClassification for pure-Rust Named Entity Recognition (NER) in Candle, following the existing DeBERTa V2 pattern.
This enables Rust applications to perform token classification tasks (NER, POS tagging, etc.) without Python
dependencies or heavyweight C++ libraries like libtorch.
Changes
BertForTokenClassification struct with dropout and linear classifier layers (~35 lines)
dslim/bert-base-NER from HuggingFace Hub (~300 lines)
Implementation Details
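To illustrate what a token-classification head computes, here is a minimal, framework-free sketch in plain Rust. It is not the code from this PR and does not use Candle's API: the TokenClassifierHead type, its fields, and the nested-Vec layout are hypothetical stand-ins chosen so the example compiles with the standard library alone. It assumes the encoder has already produced one hidden vector per token.

```rust
/// Hypothetical stand-in for a token-classification head (not Candle's API).
/// `weight` has shape [num_labels][hidden_size]; `bias` has shape [num_labels].
struct TokenClassifierHead {
    weight: Vec<Vec<f32>>,
    bias: Vec<f32>,
}

impl TokenClassifierHead {
    /// Maps encoder output [seq_len][hidden_size] to logits [seq_len][num_labels].
    /// Dropout is the identity at inference time, so it is omitted here.
    fn forward(&self, hidden_states: &[Vec<f32>]) -> Vec<Vec<f32>> {
        hidden_states
            .iter()
            .map(|h| {
                // One dot product plus bias per label, for this token.
                self.weight
                    .iter()
                    .zip(&self.bias)
                    .map(|(row, b)| row.iter().zip(h).map(|(w, x)| w * x).sum::<f32>() + b)
                    .collect()
            })
            .collect()
    }
}
```

Each token is classified independently from its own hidden vector, which is why the head is only a dropout layer followed by a single linear layer on top of the encoder.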
VarBuilder for loading model weights from safetensors
Testing
Tested with the dslim/bert-base-NER model: it successfully extracts named entities (PER, LOC, ORG, MISC) with confidence scores and character offsets.
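Turning per-token logits into entities with confidence scores and character offsets can be sketched as follows. This is a standard-library-only illustration under stated assumptions, not the PR's example code: the Entity type, best_label, and decode_entities are hypothetical names, labels are assumed to follow the BIO scheme ("O", "B-PER", "I-PER", ...), and the tokenizer is assumed to supply one (start, end) character offset per token, with zero-width offsets for special tokens.

```rust
/// One merged entity span. Field names are illustrative, not the PR's types.
#[derive(Debug, PartialEq)]
struct Entity {
    label: String, // entity type without the B-/I- prefix, e.g. "PER"
    start: usize,  // character offset, inclusive
    end: usize,    // character offset, exclusive
    score: f32,    // mean of the per-token confidences
}

/// Softmax followed by argmax for one token: returns (label index, probability).
/// Assumes `logits` is non-empty.
fn best_label(logits: &[f32]) -> (usize, f32) {
    // Subtract the maximum before exponentiating for numerical stability.
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|l| (l - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    let (idx, best) = exps
        .iter()
        .enumerate()
        .fold((0, 0.0f32), |acc, (i, &e)| if e > acc.1 { (i, e) } else { acc });
    (idx, best / sum)
}

/// Merges BIO-tagged tokens into entity spans.
fn decode_entities(
    logits: &[Vec<f32>],
    offsets: &[(usize, usize)],
    id2label: &[&str],
) -> Vec<Entity> {
    let mut entities: Vec<Entity> = Vec::new();
    let mut counts: Vec<usize> = Vec::new(); // tokens merged into each entity
    let mut open = false; // whether the last entity may still be extended

    for (tok_logits, &(start, end)) in logits.iter().zip(offsets) {
        if start == end {
            continue; // assumed marker for special tokens such as [CLS] and [SEP]
        }
        let (idx, prob) = best_label(tok_logits);
        let (prefix, kind) = match id2label[idx].split_once('-') {
            Some(parts) => parts,
            None => {
                open = false; // an "O" tag closes any open entity
                continue;
            }
        };
        let continues =
            open && prefix == "I" && entities.last().map_or(false, |e| e.label == kind);
        if continues {
            let e = entities.last_mut().unwrap();
            e.end = end;
            e.score += prob; // accumulate now, average after the loop
            *counts.last_mut().unwrap() += 1;
        } else {
            entities.push(Entity { label: kind.to_string(), start, end, score: prob });
            counts.push(1);
            open = true;
        }
    }
    for (e, n) in entities.iter_mut().zip(&counts) {
        e.score /= *n as f32;
    }
    entities
}
```

Averaging the token probabilities is only one possible way to score a span; taking the minimum or the first token's probability are equally plausible choices, and the source does not say which one the example uses.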
Example output:
Use Case
Enables pure-Rust NER for applications requiring lightweight ML inference without Python/libtorch runtime
dependencies. Perfect for:
Comparison to Alternatives
This fills a significant gap in the Rust ML ecosystem by providing production-ready NER without the complexity of
PyTorch bindings.