Is the last transformer layer of the model always kept by default? #3

Thank you for your work; it is much appreciated. I am not a native English speaker, so please excuse any mistakes. I have a question: does your work only consider the similarity among the first n-1 layers, and not the similarity between layer n-1 and layer n? Is this because the last layer is connected to the classification head, so the last transformer layer is always kept by default?
As I understand it, for a given threshold, whenever the similarity between two adjacent layers exceeds that threshold, the later layer is always the one that is deleted. For example, if the similarity between layer i and layer i+1 exceeds the threshold and a layer has to be deleted or replaced, the target must be layer i+1.
![image](https://github.com/user-attachments/assets/1e9ac6b7-935c-4fd4-8a0d-e902e773e4f6)
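
To make the question concrete, here is a minimal sketch of the rule as I understand it. It is not taken from the repository. The function names `cosine_similarity` and `layers_to_remove`, the `keep_last` flag, and the use of cosine similarity on one flat feature vector per layer are all my own assumptions for illustration; the actual similarity measure and removal logic may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def layers_to_remove(layer_outputs, threshold, keep_last=True):
    """Return 0-based indices of layers marked for removal.

    Hypothetical rule: if layer i and layer i+1 are more similar than
    `threshold`, the later layer (i+1) is the one marked.
    Pairs are compared in the original order; earlier removals are ignored.
    """
    n = len(layer_outputs)
    # With keep_last, the pair (n-2, n-1) is skipped, so the last layer
    # can never be marked for removal.
    num_pairs = n - 2 if keep_last else n - 1
    removed = []
    for i in range(num_pairs):
        sim = cosine_similarity(layer_outputs[i], layer_outputs[i + 1])
        if sim > threshold:
            removed.append(i + 1)  # always target the later layer
    return removed


# Toy per-layer feature vectors (made-up numbers, 4 layers).
outputs = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.0, 1.0]]
print(layers_to_remove(outputs, 0.99))                   # [1]
print(layers_to_remove(outputs, 0.99, keep_last=False))  # [1, 3]
```

In this toy example the last two layers are identical, but with `keep_last=True` that pair is never compared, so the last layer survives. My question is whether the method behaves like the `keep_last=True` case.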

