# Cerebras Planning and Optimization (CePO)

## Results

### Comparison of CePO with default settings and base model

| Method | Math-L5 | MMLU-Pro (Math) | GPQA | CRUX |
| -------------------------- | ------- | --------------- | ---- | ---- |
| Llama 3.3 70B | 51.0 | 78.6 | 49.1 | 72.6 |
| Llama 3.1 405B | 49.8 | 79.2 | 50.7 | 73.0 |
| CePO (using Llama 3.3 70B) | 69.6 | 84.8 | 55.5 | 80.1 |

### Ablation studies

| bestofn_n | planning_n | planning_m | bestofn_rating_type | Math-L5 | MMLU-Pro (Math) | GPQA | CRUX | Comments |
| --------- | ---------- | ---------- | ------------------- | ------- | --------------- | ----- | ----- | -------------- |
| 3 | 3 | 6 | absolute | 69.6 | 84.8 | 55.5 | 80.1 | Default config |
| 3 | 3 | 6 | pairwise | 67.7 | 83.5 | 55.6 | 79.8 | |
| 3 | 2 | 5 | absolute | 67.1 | 85.1 | 55.1 | 79.0 | |
| 3 | 5 | 8 | absolute | 69.4 | 84.3 | 55.6 | 81.1 | |
| 5 | 3 | 6 | absolute | 68.7 | 85.4 | 54.8 | 79.9 | |
| 7 | 3 | 6 | absolute | 69.6 | 82.8 | 54.7 | 78.4 | |
| 9 | 3 | 6 | absolute | 68.9 | 83.4 | 55.7 | 80.6 | |

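The parameter names in the ablation table suggest a plan-generate-and-select pipeline. The sketch below is a rough reading inferred only from those names, not the actual CePO implementation: `llm` and `rate` are hypothetical callables standing in for the model and the rating step.

```python
# A rough sketch of how the ablation parameters above might interact,
# inferred only from their names; `llm` and `rate` are hypothetical
# callables, and this is NOT the actual CePO implementation.

def cepo(question, llm, rate, bestofn_n=3, planning_n=3, planning_m=6):
    """Return the best of `bestofn_n` plan-then-execute candidates."""
    candidates = []
    for _ in range(bestofn_n):
        executions = []
        # Draw up to planning_m plans, keeping the first planning_n
        # that generate successfully.
        for _ in range(planning_m):
            if len(executions) == planning_n:
                break
            plan = llm(f"Write a step-by-step plan for: {question}")
            if plan is not None:  # a failed generation is discarded
                executions.append(llm(f"Execute this plan: {plan}"))
        # Fuse the executed plans into a single candidate answer.
        candidates.append(
            llm("Combine into one answer: " + " | ".join(executions))
        )
    # bestofn_rating_type="absolute": score each candidate on its own
    # (vs. "pairwise", which would compare candidates head-to-head).
    return max(candidates, key=rate)
```

Under this reading, `planning_m` bounds the sampling budget per candidate, `planning_n` the number of plans actually used, and `bestofn_n` the number of final candidates rated.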
# optillm

optillm is an OpenAI API-compatible optimizing inference proxy that implements several state-of-the-art techniques to improve the accuracy and performance of LLMs. The current focus is on techniques that improve reasoning over coding, logical, and mathematical queries. By performing additional compute at inference time, these techniques can beat frontier models across diverse tasks.
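Because the proxy speaks the OpenAI chat-completions protocol, any standard client can talk to it by pointing at the proxy's base URL. The sketch below builds such a request with only the standard library; the local port and the `cepo-` model-name prefix (optillm's convention for selecting a technique) are assumptions, and the request is constructed but not sent, since it needs a running proxy.

```python
# Sketch: addressing an optillm proxy through its OpenAI-compatible
# chat-completions endpoint. The port and the "cepo-" model prefix are
# assumptions; sending the request requires a running proxy.
import json
import urllib.request

OPTILLM_URL = "http://localhost:8000/v1/chat/completions"  # assumed default

def build_request(question: str) -> urllib.request.Request:
    """Build (but do not send) a chat request routed through CePO."""
    payload = {
        # Prefixing the model name selects the optimization technique.
        "model": "cepo-llama-3.3-70b",
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        OPTILLM_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer your_secret_api_key",
        },
    )

req = build_request("What is 17 * 24?")
print(req.full_url)  # → http://localhost:8000/v1/chat/completions
```

The same payload works unchanged with the official OpenAI client by setting its `base_url` to the proxy address.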
### optillm with Patchwork (July 2024)

Since optillm is a drop-in replacement for the OpenAI API, you can easily integrate it with existing tools and frameworks using the OpenAI client. We used optillm with [patchwork](https://github.com/patched-codes/patchwork), an open-source framework that automates development gruntwork such as PR reviews, bug fixing, and security patching using workflows
|