Commit fc673e4
[Bugfix] Fix cuda graph sizes when running with speculative decoding (vllm-project#30330)
Signed-off-by: Patryk Saffer <patryk.saffer99@gmail.com>
Signed-off-by: PatrykSaffer <patryk.saffer@mistral.ai>
Co-authored-by: Patryk Saffer <patryk.saffer99@gmail.com>
Signed-off-by: Nathan Price <nathan@abridge.com>
1 parent b435421 commit fc673e4
1 file changed: +7 -1 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 1047 | 1047 | |
| 1048 | 1048 | |
| 1049 | 1049 | |
| | 1050 | + |
| | 1051 | + |
| | 1052 | + |
| | 1053 | + |
| | 1054 | + |
| | 1055 | + |
| 1050 | 1056 | |
| 1051 | | - |
| | 1057 | + |
| 1052 | 1058 | |
| 1053 | 1059 | |
| 1054 | 1060 | |