Commit 83dfb30

Add cepo to readme
1 parent 3ef183b commit 83dfb30

1 file changed (+48, -32 lines)

1 file changed

+48
-32
lines changed

README.md

Lines changed: 48 additions & 32 deletions
## Implemented techniques

| Approach                             | Slug               | Description                                                                                               |
| ------------------------------------ | ------------------ | --------------------------------------------------------------------------------------------------------- |
| Cerebras Planning and Optimization   | `cepo`             | Combines Best of N, Chain-of-Thought, Self-Reflection, Self-Improvement, and various prompting techniques |
| CoT with Reflection                  | `cot_reflection`   | Implements chain-of-thought reasoning with \<thinking\>, \<reflection\> and \<output\> sections           |
| PlanSearch                           | `plansearch`       | Implements a search algorithm over candidate plans for solving a problem in natural language              |
| ReRead                               | `re2`              | Implements rereading to improve reasoning by processing queries twice                                     |
| Self-Consistency                     | `self_consistency` | Implements an advanced self-consistency method                                                            |
| Z3 Solver                            | `z3`               | Utilizes the Z3 theorem prover for logical reasoning                                                      |
| R* Algorithm                         | `rstar`            | Implements the R* algorithm for problem-solving                                                           |
| LEAP                                 | `leap`             | Learns task-specific principles from few-shot examples                                                    |
| Round Trip Optimization              | `rto`              | Optimizes responses through a round-trip process                                                          |
| Best of N Sampling                   | `bon`              | Generates multiple responses and selects the best one                                                     |
| Mixture of Agents                    | `moa`              | Combines responses from multiple critiques                                                                |
| Monte Carlo Tree Search              | `mcts`             | Uses MCTS for decision-making in chat responses                                                           |
| PV Game                              | `pvg`              | Applies a prover-verifier game approach at inference time                                                 |
| CoT Decoding                         | N/A for proxy      | Implements chain-of-thought decoding to elicit reasoning without explicit prompting                       |
| Entropy Decoding                     | N/A for proxy      | Implements adaptive sampling based on the uncertainty of tokens during generation                         |
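A technique is selected per request by naming it in the call to the proxy. A minimal sketch, assuming the slug-prefix convention (e.g. `cepo-gpt-4o-mini`) and a local proxy endpoint; no request is sent here, the snippet only builds the payload an OpenAI-compatible client would POST to the proxy's `/v1/chat/completions` endpoint:

```python
# Sketch: choose an optillm technique for one request by prefixing the
# model name with its slug. The slug-prefix routing and the local proxy
# URL are illustrative assumptions; adjust to your deployment.

def approach_model(slug: str, base_model: str) -> str:
    # e.g. ("cepo", "gpt-4o-mini") -> "cepo-gpt-4o-mini"
    return f"{slug}-{base_model}"

# Payload an OpenAI-compatible client would send to
# http://localhost:8000/v1/chat/completions (hypothetical local proxy).
payload = {
    "model": approach_model("cepo", "gpt-4o-mini"),
    "messages": [
        {"role": "user", "content": "How many r's are in 'strawberry'?"}
    ],
}
```

The same payload works with the standard `openai` client by pointing its `base_url` at the proxy.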

232233
## Implemented plugins
233234

optillm supports various command-line arguments and environment variables for configuration.
| Parameter                           | Description                                                          | Default Value   |
|-------------------------------------|----------------------------------------------------------------------|-----------------|
| `--approach`                        | Inference approach to use                                            | `"auto"`        |
| `--simulations`                     | Number of MCTS simulations                                           | 2               |
| `--exploration`                     | Exploration weight for MCTS                                          | 0.2             |
| `--depth`                           | Simulation depth for MCTS                                            | 1               |
| `--best-of-n`                       | Number of samples for the best_of_n approach                         | 3               |
| `--model`                           | OpenAI model to use                                                  | `"gpt-4o-mini"` |
| `--base-url`                        | Base URL for an OpenAI-compatible endpoint                           | `""`            |
| `--rstar-max-depth`                 | Maximum depth for the rStar algorithm                                | 3               |
| `--rstar-num-rollouts`              | Number of rollouts for the rStar algorithm                           | 5               |
| `--rstar-c`                         | Exploration constant for the rStar algorithm                         | 1.4             |
| `--n`                               | Number of final responses to be returned                             | 1               |
| `--return-full-response`            | Return the full response including the CoT with `<thinking>` tags    | `False`         |
| `--port`                            | Specify the port to run the proxy                                    | 8000            |
| `--optillm-api-key`                 | Optional API key for client authentication to optillm                | `""`            |
| `--cepo_bestofn_n`                  | Number of responses generated in the best-of-n stage                 | 3               |
| `--cepo_bestofn_temperature`        | Temperature for the verifier in the best-of-n stage                  | 0.1             |
| `--cepo_bestofn_max_tokens`         | Maximum number of tokens for the verifier in the best-of-n stage     | 4096            |
| `--cepo_bestofn_rating_type`        | Type of rating in the best-of-n stage (`"absolute"` or `"pairwise"`) | `"absolute"`    |
| `--cepo_planning_n`                 | Number of plans generated in the planning stage                      | 3               |
| `--cepo_planning_m`                 | Number of attempts to generate n plans in the planning stage         | 6               |
| `--cepo_planning_temperature_step1` | Temperature for the generator in step 1 of the planning stage        | 0.55            |
| `--cepo_planning_temperature_step2` | Temperature for the generator in step 2 of the planning stage        | 0.25            |
| `--cepo_planning_temperature_step3` | Temperature for the generator in step 3 of the planning stage        | 0.1             |
| `--cepo_planning_temperature_step4` | Temperature for the generator in step 4 of the planning stage        | 0               |
| `--cepo_planning_max_tokens_step1`  | Maximum number of tokens in step 1 of the planning stage             | 4096            |
| `--cepo_planning_max_tokens_step2`  | Maximum number of tokens in step 2 of the planning stage             | 4096            |
| `--cepo_planning_max_tokens_step3`  | Maximum number of tokens in step 3 of the planning stage             | 4096            |
| `--cepo_planning_max_tokens_step4`  | Maximum number of tokens in step 4 of the planning stage             | 4096            |
| `--cepo_config_file`                | Path to CePO configuration file                                      | None            |
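The CePO flags compose with the generic ones at launch. A configuration sketch (the flag names come from the table above; the `optillm.py` entry-point name is an assumption, so adjust to how you run the proxy):

```shell
# Start the proxy on port 8000 with CePO selected, rating candidates
# pairwise and widening both the best-of-n and planning stages.
python optillm.py \
  --approach cepo \
  --port 8000 \
  --cepo_bestofn_n 5 \
  --cepo_bestofn_rating_type pairwise \
  --cepo_planning_n 4
```

Alternatively, `--cepo_config_file` can point at a file carrying the same settings.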
When using Docker, these can be set as environment variables prefixed with `OPTILLM_`.
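The exact flag-to-variable mapping beyond the `OPTILLM_` prefix is not spelled out here; a small sketch of the assumed convention (strip the leading dashes, uppercase, dashes become underscores), which you should verify against the project's Docker docs:

```python
# Assumed mapping from a CLI flag to its Docker environment variable:
# "--best-of-n" -> "OPTILLM_BEST_OF_N". This convention is an
# assumption for illustration, not confirmed by the README excerpt.

def flag_to_env(flag: str) -> str:
    # Strip leading dashes, replace remaining dashes with underscores,
    # uppercase, and add the documented OPTILLM_ prefix.
    return "OPTILLM_" + flag.lstrip("-").replace("-", "_").upper()
```

For example, `flag_to_env("--port")` yields `OPTILLM_PORT`, so `docker run -e OPTILLM_PORT=8000 ...` would correspond to `--port 8000`.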
