@@ -212,22 +212,23 @@ response = client.chat.completions.create(

## Implemented techniques

-| Approach                | Slug               | Description                                                                                     |
-| ----------------------- | ------------------ | ----------------------------------------------------------------------------------------------- |
-| CoT with Reflection     | `cot_reflection`   | Implements chain-of-thought reasoning with \<thinking\>, \<reflection\> and \<output\> sections  |
-| PlanSearch              | `plansearch`       | Implements a search algorithm over candidate plans for solving a problem in natural language     |
-| ReRead                  | `re2`              | Implements rereading to improve reasoning by processing queries twice                            |
-| Self-Consistency        | `self_consistency` | Implements an advanced self-consistency method                                                   |
-| Z3 Solver               | `z3`               | Utilizes the Z3 theorem prover for logical reasoning                                             |
-| R* Algorithm            | `rstar`            | Implements the R* algorithm for problem-solving                                                  |
-| LEAP                    | `leap`             | Learns task-specific principles from few-shot examples                                           |
-| Round Trip Optimization | `rto`              | Optimizes responses through a round-trip process                                                 |
-| Best of N Sampling      | `bon`              | Generates multiple responses and selects the best one                                            |
-| Mixture of Agents       | `moa`              | Combines responses from multiple critiques                                                       |
-| Monte Carlo Tree Search | `mcts`             | Uses MCTS for decision-making in chat responses                                                  |
-| PV Game                 | `pvg`              | Applies a prover-verifier game approach at inference time                                        |
-| CoT Decoding            | N/A for proxy      | Implements chain-of-thought decoding to elicit reasoning without explicit prompting              |
-| Entropy Decoding        | N/A for proxy      | Implements adaptive sampling based on the uncertainty of tokens during generation                |
+| Approach                           | Slug               | Description                                                                                               |
+| ---------------------------------- | ------------------ | --------------------------------------------------------------------------------------------------------- |
+| Cerebras Planning and Optimization | `cepo`             | Combines Best of N, Chain-of-Thought, Self-Reflection, Self-Improvement, and various prompting techniques  |
+| CoT with Reflection                | `cot_reflection`   | Implements chain-of-thought reasoning with \<thinking\>, \<reflection\> and \<output\> sections            |
+| PlanSearch                         | `plansearch`       | Implements a search algorithm over candidate plans for solving a problem in natural language               |
+| ReRead                             | `re2`              | Implements rereading to improve reasoning by processing queries twice                                      |
+| Self-Consistency                   | `self_consistency` | Implements an advanced self-consistency method                                                             |
+| Z3 Solver                          | `z3`               | Utilizes the Z3 theorem prover for logical reasoning                                                       |
+| R* Algorithm                       | `rstar`            | Implements the R* algorithm for problem-solving                                                            |
+| LEAP                               | `leap`             | Learns task-specific principles from few-shot examples                                                     |
+| Round Trip Optimization            | `rto`              | Optimizes responses through a round-trip process                                                           |
+| Best of N Sampling                 | `bon`              | Generates multiple responses and selects the best one                                                      |
+| Mixture of Agents                  | `moa`              | Combines responses from multiple critiques                                                                 |
+| Monte Carlo Tree Search            | `mcts`             | Uses MCTS for decision-making in chat responses                                                            |
+| PV Game                            | `pvg`              | Applies a prover-verifier game approach at inference time                                                  |
+| CoT Decoding                       | N/A for proxy      | Implements chain-of-thought decoding to elicit reasoning without explicit prompting                        |
+| Entropy Decoding                   | N/A for proxy      | Implements adaptive sampling based on the uncertainty of tokens during generation                          |

## Implemented plugins

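Each technique in the table above is selected at request time by prefixing its slug to the model name (e.g. `moa-gpt-4o-mini`). A minimal, hypothetical sketch of that routing convention follows; it is not optillm's actual parser, and only the slugs listed in the table are assumed:

```python
# Hypothetical sketch of slug-prefix routing; slugs are taken from the
# techniques table above. This is NOT optillm's actual implementation.
KNOWN_SLUGS = {
    "cepo", "cot_reflection", "plansearch", "re2", "self_consistency",
    "z3", "rstar", "leap", "rto", "bon", "moa", "mcts", "pvg",
}

def parse_model(model: str, default_approach: str = "auto") -> tuple[str, str]:
    """Split a prefixed model name like 'moa-gpt-4o-mini' into (approach, base model)."""
    slug, sep, rest = model.partition("-")
    if sep and slug in KNOWN_SLUGS:
        return slug, rest
    # No recognized prefix: fall back to the default approach.
    return default_approach, model

print(parse_model("moa-gpt-4o-mini"))  # ('moa', 'gpt-4o-mini')
print(parse_model("gpt-4o-mini"))      # ('auto', 'gpt-4o-mini')
```

Because the slugs themselves contain no hyphens, a single `partition("-")` is enough to separate the prefix from the underlying model name.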
@@ -244,22 +245,37 @@ response = client.chat.completions.create(

optillm supports various command-line arguments and environment variables for configuration.

-| Parameter                 | Description                                                        | Default Value    |
-| ------------------------- | ------------------------------------------------------------------ | ---------------- |
-| `--approach`              | Inference approach to use                                          | `"auto"`         |
-| `--simulations`           | Number of MCTS simulations                                         | 2                |
-| `--exploration`           | Exploration weight for MCTS                                        | 0.2              |
-| `--depth`                 | Simulation depth for MCTS                                          | 1                |
-| `--best-of-n`             | Number of samples for best_of_n approach                           | 3                |
-| `--model`                 | OpenAI model to use                                                | `"gpt-4o-mini"`  |
-| `--base-url`              | Base URL for OpenAI compatible endpoint                            | `""`             |
-| `--rstar-max-depth`       | Maximum depth for rStar algorithm                                  | 3                |
-| `--rstar-num-rollouts`    | Number of rollouts for rStar algorithm                             | 5                |
-| `--rstar-c`               | Exploration constant for rStar algorithm                           | 1.4              |
-| `--n`                     | Number of final responses to be returned                           | 1                |
-| `--return-full-response`  | Return the full response including the CoT with \<thinking\> tags  | `False`          |
-| `--port`                  | Specify the port to run the proxy                                  | 8000             |
-| `--optillm-api-key`       | Optional API key for client authentication to optillm              | `""`             |
+| Parameter                            | Description                                                        | Default Value    |
+| ------------------------------------ | ------------------------------------------------------------------ | ---------------- |
+| `--approach`                         | Inference approach to use                                          | `"auto"`         |
+| `--simulations`                      | Number of MCTS simulations                                         | 2                |
+| `--exploration`                      | Exploration weight for MCTS                                        | 0.2              |
+| `--depth`                            | Simulation depth for MCTS                                          | 1                |
+| `--best-of-n`                        | Number of samples for best_of_n approach                           | 3                |
+| `--model`                            | OpenAI model to use                                                | `"gpt-4o-mini"`  |
+| `--base-url`                         | Base URL for OpenAI compatible endpoint                            | `""`             |
+| `--rstar-max-depth`                  | Maximum depth for rStar algorithm                                  | 3                |
+| `--rstar-num-rollouts`               | Number of rollouts for rStar algorithm                             | 5                |
+| `--rstar-c`                          | Exploration constant for rStar algorithm                           | 1.4              |
+| `--n`                                | Number of final responses to be returned                           | 1                |
+| `--return-full-response`             | Return the full response including the CoT with \<thinking\> tags  | `False`          |
+| `--port`                             | Specify the port to run the proxy                                  | 8000             |
+| `--optillm-api-key`                  | Optional API key for client authentication to optillm              | `""`             |
+| `--cepo_bestofn_n`                   | Number of responses to be generated in best of n stage             | 3                |
+| `--cepo_bestofn_temperature`         | Temperature for verifier in best of n stage                        | 0.1              |
+| `--cepo_bestofn_max_tokens`          | Maximum number of tokens for verifier in best of n stage           | 4096             |
+| `--cepo_bestofn_rating_type`         | Type of rating in best of n stage ("absolute" or "pairwise")       | `"absolute"`     |
+| `--cepo_planning_n`                  | Number of plans generated in planning stage                        | 3                |
+| `--cepo_planning_m`                  | Number of attempts to generate n plans in planning stage           | 6                |
+| `--cepo_planning_temperature_step1`  | Temperature for generator in step 1 of planning stage              | 0.55             |
+| `--cepo_planning_temperature_step2`  | Temperature for generator in step 2 of planning stage              | 0.25             |
+| `--cepo_planning_temperature_step3`  | Temperature for generator in step 3 of planning stage              | 0.1              |
+| `--cepo_planning_temperature_step4`  | Temperature for generator in step 4 of planning stage              | 0                |
+| `--cepo_planning_max_tokens_step1`   | Maximum number of tokens in step 1 of planning stage               | 4096             |
+| `--cepo_planning_max_tokens_step2`   | Maximum number of tokens in step 2 of planning stage               | 4096             |
+| `--cepo_planning_max_tokens_step3`   | Maximum number of tokens in step 3 of planning stage               | 4096             |
+| `--cepo_planning_max_tokens_step4`   | Maximum number of tokens in step 4 of planning stage               | 4096             |
+| `--cepo_config_file`                 | Path to CePO configuration file                                    | None             |
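A few of the flags above can be declared with `argparse` as an illustrative sketch; this is not optillm's actual parser, and the defaults are copied from the table:

```python
import argparse

# Illustrative sketch only; flag names and defaults come from the table above.
parser = argparse.ArgumentParser(description="optillm-style proxy options")
parser.add_argument("--approach", default="auto",
                    help="Inference approach to use")
parser.add_argument("--simulations", type=int, default=2,
                    help="Number of MCTS simulations")
parser.add_argument("--best-of-n", type=int, default=3, dest="best_of_n",
                    help="Number of samples for the best_of_n approach")
parser.add_argument("--port", type=int, default=8000,
                    help="Port to run the proxy on")

args = parser.parse_args([])  # empty argv -> every flag takes its default
print(args.approach, args.simulations, args.best_of_n, args.port)
```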
When using Docker, these can be set as environment variables prefixed with `OPTILLM_`.
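The flag-to-environment-variable mapping can be sketched as follows, assuming the common convention of upper-casing the flag and replacing hyphens with underscores (the exact variable names optillm expects may differ):

```python
def flag_to_env(flag: str) -> str:
    """Map a CLI flag to its OPTILLM_-prefixed environment-variable name.

    Assumes the usual convention: strip leading dashes, upper-case,
    and replace hyphens with underscores.
    """
    return "OPTILLM_" + flag.lstrip("-").replace("-", "_").upper()

print(flag_to_env("--approach"))   # OPTILLM_APPROACH
print(flag_to_env("--best-of-n"))  # OPTILLM_BEST_OF_N
```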