
Commit cbebb2c: "Updated documentation of CePO" (1 parent: bef08cf)

File tree: 2 files changed (README.md, optillm/cepo.py), +38 −37 lines

README.md

Lines changed: 24 additions & 22 deletions

@@ -1,3 +1,27 @@
+# Cerebras Planning and Optimization (CePO)
+
+## Results
+
+### Comparison of CePO with default settings and base model
+
+| Method                     | Math-L5 | MMLU-Pro (Math) | GPQA | CRUX |
+| -------------------------- | ------- | --------------- | ---- | ---- |
+| Llama 3.3 70B              | 51.0    | 78.6            | 49.1 | 72.6 |
+| Llama 3.1 405B             | 49.8    | 79.2            | 50.7 | 73.0 |
+| CePO (using Llama 3.3 70B) | 69.6    | 84.8            | 55.5 | 80.1 |
+
+### Ablation studies
+
+| bestofn_n | planning_n | planning_m | bestofn_rating_type | Math-L5 | MMLU-Pro (Math) | GPQA | CRUX | Comments       |
+| --------- | ---------- | ---------- | ------------------- | ------- | --------------- | ---- | ---- | -------------- |
+| 3         | 3          | 6          | absolute            | 69.6    | 84.8            | 55.5 | 80.1 | Default config |
+| 3         | 3          | 6          | pairwise            | 67.7    | 83.5            | 55.6 | 79.8 |                |
+| 3         | 2          | 5          | absolute            | 67.1    | 85.1            | 55.1 | 79.0 |                |
+| 3         | 5          | 8          | absolute            | 69.4    | 84.3            | 55.6 | 81.1 |                |
+| 5         | 3          | 6          | absolute            | 68.7    | 85.4            | 54.8 | 79.9 |                |
+| 7         | 3          | 6          | absolute            | 69.6    | 82.8            | 54.7 | 78.4 |                |
+| 9         | 3          | 6          | absolute            | 68.9    | 83.4            | 55.7 | 80.6 |                |
+
 # optillm
 
 optillm is an OpenAI API compatible optimizing inference proxy which implements several state-of-the-art techniques that can improve the accuracy and performance of LLMs. The current focus is on implementing techniques that improve reasoning over coding, logical and mathematical queries. It is possible to beat the frontier models using these techniques across diverse tasks by doing additional compute at inference time.
@@ -362,28 +386,6 @@ Authorization: Bearer your_secret_api_key
 
 ![Results showing Mixture of Agents approach using gpt-4o-mini on Arena Hard Auto Benchmark](https://raw.githubusercontent.com/codelion/optillm/main/moa-results.png)
 
-## CePO Results
-
-### Comparison of CePO with default settings and base model
-
-| Method                     | Math-L5 | MMLU-Pro (Math) | GPQA | CRUX |
-| -------------------------- | ------- | --------------- | ---- | ---- |
-| Llama 3.3 70B              | 51.0    | 78.6            | 49.1 | 72.6 |
-| Llama 3.1 405B             | 49.8    | 79.2            | 50.7 | 73.0 |
-| CePO (using Llama 3.3 70B) | 69.6    | 84.8            | 55.5 | 80.1 |
-
-### Ablation studies
-
-| bestofn_n | planning_n | planning_m | bestofn_rating_type | Math-L5 | MMLU-Pro (Math) | GPQA | CRUX | Comments       |
-| --------- | ---------- | ---------- | ------------------- | ------- | --------------- | ---- | ---- | -------------- |
-| 3         | 3          | 6          | absolute            | 69.6    | 84.8            | 55.5 | 80.1 | Default config |
-| 3         | 3          | 6          | pairwise            | 67.7    | 83.5            | 55.6 | 79.8 |                |
-| 3         | 2          | 5          | absolute            | 67.1    | 85.1            | 55.1 | 79.0 |                |
-| 3         | 5          | 8          | absolute            | 69.4    | 84.3            | 55.6 | 81.1 |                |
-| 5         | 3          | 6          | absolute            | 68.7    | 85.4            | 54.8 | 79.9 |                |
-| 7         | 3          | 6          | absolute            | 69.6    | 82.8            | 54.7 | 78.4 |                |
-| 9         | 3          | 6          | absolute            | 68.9    | 83.4            | 55.7 | 80.6 |                |
-
 ### optillm with Patchwork (July 2024)
 
 Since optillm is a drop-in replacement for OpenAI API you can easily integrate it with existing tools and frameworks using the OpenAI client. We used optillm with [patchwork](https://github.com/patched-codes/patchwork) which is an open-source framework that automates development gruntwork like PR reviews, bug fixing, security patching using workflows
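Because the README describes optillm as an OpenAI API compatible, drop-in proxy, a minimal usage sketch may help connect the relocated CePO section to actual calls. Nothing below is part of this commit: the local port, the API key value, and the `cepo-llama-3.3-70b` model identifier (which assumes optillm's convention of prefixing an approach slug to the model name) are illustrative assumptions.

```python
from openai import OpenAI

# Sketch only: point the stock OpenAI client at a locally running optillm proxy.
# The base_url/port and the "cepo-" model-name prefix are assumptions about a
# typical setup; substitute the model id your provider actually exposes.
client = OpenAI(
    api_key="your_secret_api_key",        # matches the Bearer token shown in the README example
    base_url="http://localhost:8000/v1",  # assumed local optillm endpoint
)

response = client.chat.completions.create(
    model="cepo-llama-3.3-70b",  # illustrative approach slug + base model name
    messages=[{"role": "user", "content": "How many prime numbers are there below 100?"}],
)
print(response.choices[0].message.content)
```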

optillm/cepo.py

Lines changed: 14 additions & 15 deletions

@@ -9,21 +9,20 @@
 
 @dataclass
 class CepoConfig:
-    bestofn_n: int
-    bestofn_temperature: float
-    bestofn_max_tokens: int
-    bestofn_rating_type: Literal["absolute", "pairwise"]
-    planning_n: int
-    planning_m: int
-    planning_temperature_step1: float
-    planning_temperature_step2: float
-    planning_temperature_step3: float
-    planning_temperature_step4: float
-    planning_max_tokens_step1: int
-    planning_max_tokens_step2: int
-    planning_max_tokens_step3: int
-    planning_max_tokens_step4: int
-
+    bestofn_n: int  # number of responses to be generated in best of n stage
+    bestofn_temperature: float  # temperature for verifier in best of n stage
+    bestofn_max_tokens: int  # maximum number of tokens for verifier in best of n stage
+    bestofn_rating_type: Literal["absolute", "pairwise"]  # type of rating in best of n stage
+    planning_n: int  # number of plans generated in planning stage
+    planning_m: int  # number of attempts to generate n plans in planning stage
+    planning_temperature_step1: float  # temperature for generator in step 1 of planning stage
+    planning_temperature_step2: float  # temperature for generator in step 2 of planning stage
+    planning_temperature_step3: float  # temperature for generator in step 3 of planning stage
+    planning_temperature_step4: float  # temperature for generator in step 4 of planning stage
+    planning_max_tokens_step1: int  # maximum number of tokens in step 1 of planning stage
+    planning_max_tokens_step2: int  # maximum number of tokens in step 2 of planning stage
+    planning_max_tokens_step3: int  # maximum number of tokens in step 3 of planning stage
+    planning_max_tokens_step4: int  # maximum number of tokens in step 4 of planning stage
 
 # given command line arguments which includes a yaml file path, initialize a CePO configuration
 def init_cepo_config(cmd_line_args: dict) -> CepoConfig:
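The trailing context lines show that a `CepoConfig` is initialized from a YAML file whose path arrives via command line arguments. As a rough illustration of what such a loader could look like, here is a minimal sketch under assumptions: it is not the actual `init_cepo_config` implementation, and the `cepo_config_file` argument name and the `optillm.cepo` import path are assumed.

```python
from dataclasses import fields

import yaml

from optillm.cepo import CepoConfig  # assumes the package is importable as optillm.cepo


def load_cepo_config(cmd_line_args: dict) -> CepoConfig:
    """Hypothetical loader: read the YAML file named on the command line and
    map its keys onto the CepoConfig fields (all fields must be present)."""
    with open(cmd_line_args["cepo_config_file"]) as fh:  # assumed argument name
        raw = yaml.safe_load(fh)
    return CepoConfig(**{f.name: raw[f.name] for f in fields(CepoConfig)})
```

Only `bestofn_n = 3`, `planning_n = 3`, `planning_m = 6`, and `bestofn_rating_type = "absolute"` are documented as the default configuration in the ablation table above; the per-step temperature and max-token values would come from the YAML file and are not specified in this commit.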
