src.baseline.metabbo.opro¶
Module Contents¶
Classes¶
The paper proposes “Optimization by PROmpting (OPRO)”, a method that uses an LLM as the optimizer: the optimization task is described in natural language, and the LLM iteratively generates new solutions from the problem description and previously found solutions.
API¶
- class src.baseline.metabbo.opro.LLMAgent(API_KEY)[source]¶
Initialization
- class src.baseline.metabbo.opro.OPRO(config)[source]¶
Bases: src.rl.basic_agent.Basic_Agent
Introduction¶
The paper proposes “Optimization by PROmpting (OPRO)”, a method that uses an LLM as the optimizer. The optimization task is described in natural language, and the LLM is prompted to iteratively generate new solutions based on the problem description and the previously found solutions.
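The iterative loop described above can be sketched as follows. This is a minimal illustration, not the implementation in this module: `objective`, `mock_llm_propose`, and `opro_loop` are hypothetical names, and the LLM call is replaced by a random perturbation of the best known solution so that the example runs offline.

```python
import random


def objective(theta):
    """Toy objective to minimize (sphere function); stands in for the real problem."""
    return sum(x * x for x in theta)


def mock_llm_propose(prompt, best_theta, rng, n=4, step=0.5):
    """Hypothetical stand-in for an LLM call.

    A real agent would send `prompt` to the model and parse its reply.
    Here the prompt is ignored and the best known solution is perturbed.
    """
    return [[x + rng.uniform(-step, step) for x in best_theta] for _ in range(n)]


def opro_loop(dim=2, steps=20, max_num_pairs=10, seed=0):
    rng = random.Random(seed)
    # Start from a few random (theta, value) pairs.
    pairs = []
    for _ in range(3):
        theta = [rng.uniform(-5, 5) for _ in range(dim)]
        pairs.append((theta, objective(theta)))
    history = [min(v for _, v in pairs)]

    for _ in range(steps):
        # Keep only the best pairs, ordered from worst to best.
        pairs = sorted(pairs, key=lambda p: p[1], reverse=True)[-max_num_pairs:]
        # Describe the task and the previous solutions in natural language.
        prompt = "Minimize the function value. Previous solutions:\n" + "\n".join(
            f"theta={[round(x, 3) for x in t]}, value={v:.3f}" for t, v in pairs
        )
        best_theta = pairs[-1][0]
        # Ask the (mock) LLM for new candidates and evaluate them.
        for theta in mock_llm_propose(prompt, best_theta, rng):
            pairs.append((theta, objective(theta)))
        history.append(min(v for _, v in pairs))
    return history


best_so_far = opro_loop()
```

Because the best pair is always retained when the history is trimmed, `best_so_far` never increases from one step to the next.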
Original Paper¶
“Large language models as optimizers.” arXiv preprint arXiv:2309.03409 (2023).
Official Implementation¶
Application Scenario¶
Single-objective optimization problems (SOOP)
Args:¶
`config`: Configuration object containing all parameters needed for the experiment. For details, see config.py.
Attributes:¶
- config (dict): Stores the configuration dictionary passed during initialization.
- llm_agent (LLMAgent): An instance of the LLMAgent class used to interact with the large language model.
- max_episodes (int): Maximum number of episodes for the optimization process.
- max_num_pairs (int): Maximum number of (theta, value) pairs to include in the meta-prompt.
Methods:¶
- __str__(): Returns the string representation of the class.
- train_episode(envs, seeds): Raises a NotImplementedError as this method is not supported by OPRO.
- train_epoch(): Raises a NotImplementedError as this method is not supported by OPRO.
- gen_meta_prompt_multi(old_value_pairs_set, num_input_decimals=5, num_output_decimals=5, max_num_pairs=100): Generates a meta-prompt for general d-dimensional optimization.
  Args:
  - old_value_pairs_set (set): Set of (theta, value) pairs, where theta is a list.
  - num_input_decimals (int): Number of decimal places for theta values.
  - num_output_decimals (int): Number of decimal places for function values.
  - max_num_pairs (int): Maximum number of examples to include in the meta-prompt.
  Returns:
  - str: The generated meta-prompt string.
- rollout_episode(env, seed=None, required_info={}): Executes the optimization process by interacting with the environment and the LLM agent.
  Args:
  - env (object): The environment to interact with.
  - seed (int, optional): Random seed for reproducibility.
  - required_info (dict, optional): Additional information required from the environment.
  Returns:
  - dict: Results of the optimization process, including cost, function evaluations, and metadata if enabled.
Returns:¶
- str: For __str__(), returns the string "OPRO".
- str: For gen_meta_prompt_multi(), returns the generated meta-prompt string.
- dict: For rollout_episode(), returns a dictionary containing optimization results.
Raises:¶
NotImplementedError: Raised by train_episode() and train_epoch() methods as they are not supported.
Initialization
Initialize the basic_agent with config.
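Since train_episode() and train_epoch() raise NotImplementedError, calling code that handles several agents may want to detect inference-only agents before entering a training loop. The sketch below shows one way to do that. `InferenceOnlyAgent` and `supports_training` are hypothetical names used for illustration only and are not part of this module.

```python
class InferenceOnlyAgent:
    """Hypothetical stand-in that mimics the documented behavior of the
    training methods; it is not the real OPRO class."""

    def __str__(self):
        return "InferenceOnlyAgent"

    def train_episode(self, envs, seeds):
        raise NotImplementedError("train_episode is not supported")

    def train_epoch(self):
        raise NotImplementedError("train_epoch is not supported")


def supports_training(agent):
    """Return False if the agent's train_epoch() raises NotImplementedError.

    Note: this probe actually calls train_epoch(), so on a trainable agent
    it would run one epoch. That is an assumption of this sketch.
    """
    try:
        agent.train_epoch()
    except NotImplementedError:
        return False
    return True


agent = InferenceOnlyAgent()
can_train = supports_training(agent)  # False for this stand-in
```

For an agent of this kind, rollout_episode() is the only supported entry point.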
- gen_meta_prompt_multi(old_value_pairs_set, num_input_decimals=5, num_output_decimals=5, max_num_pairs=100)[source]¶
Generate the meta-prompt for general d-dimensional optimization.
Args:
- old_value_pairs_set (set): set of (theta, value) pairs, where theta is a list.
- num_input_decimals (int): decimals for theta values.
- num_output_decimals (int): decimals for function value.
- max_num_pairs (int): number of examples to include.
Returns: meta_prompt (str)
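To illustrate how the four parameters interact, here is a minimal sketch of a meta-prompt builder with the same signature. It is not the source of gen_meta_prompt_multi(): the function name `build_meta_prompt`, the prompt wording, the ordering (worst to best, keeping the lowest values), and the use of tuples for theta (so that pairs are hashable inside a set) are all assumptions made for this example.

```python
def build_meta_prompt(old_value_pairs_set, num_input_decimals=5,
                      num_output_decimals=5, max_num_pairs=100):
    """Hypothetical sketch: format (theta, value) pairs into a prompt string."""
    # Sort from highest to lowest value (ties broken by theta for determinism),
    # then keep the last max_num_pairs entries, i.e. the lowest values.
    pairs = sorted(old_value_pairs_set, key=lambda p: (-p[1], p[0]))
    pairs = pairs[-max_num_pairs:]

    examples = []
    for theta, value in pairs:
        # Round inputs and outputs to the requested number of decimals.
        theta_str = ", ".join(f"{x:.{num_input_decimals}f}" for x in theta)
        examples.append(
            f"input: [{theta_str}]\nvalue: {value:.{num_output_decimals}f}"
        )

    # Illustrative wording only; the real prompt text may differ.
    header = ("Below are previous inputs and their function values, "
              "listed from highest to lowest value. Lower is better.")
    footer = "Propose a new input with a lower function value than any above."
    return header + "\n\n" + "\n\n".join(examples) + "\n\n" + footer


pairs = {((1.0, 2.0), 5.0), ((0.0, 0.0), 0.0), ((3.0, 0.0), 9.0)}
prompt = build_meta_prompt(pairs, num_input_decimals=2,
                           num_output_decimals=1, max_num_pairs=2)
```

With `max_num_pairs=2`, the pair with value 9.0 is dropped and the remaining two appear with the better solution last, which places the strongest examples closest to the final instruction.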