`src.baseline.metabbo.rldas`¶

Module Contents¶

Classes¶

`Actor`
`Critic`
`RLDAS`	RL-DAS: a dynamic algorithm selection method based on deep reinforcement learning, aimed at improving performance on real-parameter optimization problems.

API¶

class src.baseline.metabbo.rldas.Actor(dim, optimizer_num, feature_dim, device)[source]¶

Bases: src.baseline.metabbo.networks.nn.Module

Initialization

Introduction¶

Initializes the model with multiple embedders, a final embedder, and a main model for processing input features and producing optimizer selection probabilities.

Args:¶

dim (int): The input dimension for each embedder.
optimizer_num (int): The number of optimizers, determines how many embedders are created.
feature_dim (int): The dimension of the input features for the final embedder.
device (torch.device or str): The device on which to place the model components (e.g., 'cpu' or 'cuda').

Attributes:¶

device: Stores the computation device.
embedders (nn.ModuleList): Contains pairs of sequential neural network modules for each optimizer.
embedder_final (nn.Sequential): Processes concatenated features from all embedders and input features.
model (nn.Sequential): Produces a probability distribution over optimizers using a softmax layer.

forward(obs, fix_action=None, require_entropy=False)[source]¶
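
The signature above suggests that the actor samples an optimizer index from its softmax output, can evaluate a pre-chosen action instead, and can report the entropy of the distribution. A minimal sketch of that pattern in plain Python follows. It assumes that `fix_action` forces the action and that `require_entropy` adds the entropy to the return value; neither behavior is confirmed by this page, and `select_optimizer` with its `logits` argument is a hypothetical stand-in for the network output.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_optimizer(logits, fix_action=None, require_entropy=False, rng=random):
    """Hypothetical stand-in for the final step of Actor.forward.

    Assumption: `logits` holds one score per candidate optimizer.
    """
    probs = softmax(logits)
    if fix_action is None:
        # Sample an optimizer index according to the policy distribution.
        action = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    else:
        # Score a previously taken action, as a PPO update would need to.
        action = fix_action
    log_prob = math.log(probs[action])
    if require_entropy:
        entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
        return action, log_prob, entropy
    return action, log_prob


# Three equally scored optimizers, with the action fixed to index 1.
action, log_prob, entropy = select_optimizer(
    [0.0, 0.0, 0.0], fix_action=1, require_entropy=True
)
```

With uniform scores, the log-probability of any action is log(1/3) and the entropy is log(3), which is the maximum for three choices.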

class src.baseline.metabbo.rldas.Critic(dim, optimizer_num, feature_dim, device)[source]¶

Bases: src.baseline.metabbo.networks.nn.Module

Initialization

forward(obs)[source]¶

class src.baseline.metabbo.rldas.RLDAS(config)[source]¶

Bases: src.rl.ppo.PPO_Agent

Introduction¶

RL-DAS is a dynamic algorithm selection method based on deep reinforcement learning that aims to improve performance on real-parameter optimization problems. Evolutionary algorithms such as differential evolution perform well on these problems, but the algorithm that works best can differ from one problem instance to another, which makes algorithm selection difficult. To address this, the authors designed a deep reinforcement learning framework that adaptively selects among the candidate optimizers based on the characteristics of the problem instance. In experiments on a series of benchmark functions, the method performed better than traditional differential evolution algorithms.

Original Paper¶

“Deep Reinforcement Learning for Dynamic Algorithm Selection: A Proof-of-Principle Study on Differential Evolution.” IEEE Transactions on Systems, Man, and Cybernetics: Systems (2024)

Official Implementation¶

RL-DAS

Application Scenario¶

Single-objective optimization problems (SOOP)

Args:¶

`config`: Configuration object containing all necessary parameters for the experiment. For details, see config.py.

Attributes:¶

config (object): Stores the configuration object.
actor (Actor): The actor network used for policy generation.
critic (Critic): The critic network used for value estimation.
optimizer (torch.optim.Optimizer): Optimizer for training the actor and critic networks.
learning_time (int): Tracks the number of learning steps performed.
cur_checkpoint (int): Tracks the current checkpoint for saving the model.

Methods:¶

__str__():
    Returns the string representation of the class.
train_episode(envs, seeds, para_mode='dummy', compute_resource={}, tb_logger=None, required_info={}):
    Trains the agent for one episode using the PPO algorithm.
    Args:
        envs (list): List of environments for training.
        seeds (Optional[Union[int, List[int], np.ndarray]]): Seeds for environment initialization.
        para_mode (Literal['dummy', 'subproc', 'ray', 'ray-subproc']): Parallelization mode for environments.
        compute_resource (dict): Resources for computation (e.g., CPUs, GPUs).
        tb_logger (Optional): TensorBoard logger for logging training metrics.
        required_info (dict): Additional information required from the environment.
    Returns:
        Tuple[bool, dict]: A tuple containing a boolean indicating if training is complete and a dictionary with training information.
rollout_episode(env, seed=None, required_info={}):
    Executes a single rollout episode in a given environment.
    Args:
        env (object): The environment for the rollout.
        seed (Optional): Seed for environment initialization.
        required_info (dict): Additional information required from the environment.
    Returns:
        dict: A dictionary containing rollout results, including costs, function evaluations, and returns.
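
`train_episode` is described above as using the PPO algorithm. For orientation, the standard PPO clipped surrogate loss can be sketched in plain Python as below. This is the textbook formulation, not code taken from `RLDAS`; the function name `ppo_clip_loss` and the clip range of 0.2 are assumptions.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, eps_clip=0.2):
    """Mean clipped surrogate loss (to be minimized) over a batch of samples."""
    losses = []
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - eps_clip, min(ratio, 1.0 + eps_clip))
        # PPO maximizes min(ratio * A, clipped * A); negate it to get a loss.
        losses.append(-min(ratio * adv, clipped * adv))
    return sum(losses) / len(losses)


# The probability ratio is 1.5 and the advantage is positive,
# so the clipped term (1.2) is the one that counts.
loss = ppo_clip_loss([math.log(0.6)], [math.log(0.4)], [1.0])
```

Clipping removes the incentive to move the new policy far from the one that collected the data, which is why the same batch can be reused for several update steps.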

Returns:¶

None

Raises:¶

None

Initialization

Initializes the PPO agent with the given configuration, networks, and learning rates, and stores the initial agent in the checkpoint directory.

Args:¶

config: Configuration object containing all necessary parameters for the experiment.
networks (dict): A dictionary of neural networks used by the agent.
learning_rates (float): Learning rate for the optimizer.

__str__()[source]¶

train_episode(envs, seeds: Optional[Union[int, List[int], np.ndarray]], para_mode: Literal['dummy', 'subproc', 'ray', 'ray-subproc'] = 'dummy', compute_resource={}, tb_logger=None, required_info={})[source]¶

rollout_episode(env, seed=None, required_info={})[source]¶
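
Going by the return types documented for these two methods, a driver would call `train_episode` until it reports completion and then evaluate with `rollout_episode`. The sketch below shows only that control flow, with a hypothetical `StubAgent` standing in for `RLDAS`. The stub, the `max_episodes` limit, the placeholder environments, and the dictionary keys `'return'`, `'cost'`, and `'fes'` are illustrative assumptions, not the library's actual fields.

```python
class StubAgent:
    """Hypothetical stand-in that mimics the documented return shapes only."""

    def __init__(self, max_episodes):
        self.max_episodes = max_episodes
        self.episodes_done = 0

    def train_episode(self, envs, seeds, para_mode="dummy"):
        # Returns (is_done, info), matching the documented Tuple[bool, dict].
        self.episodes_done += 1
        is_done = self.episodes_done >= self.max_episodes
        info = {"return": [0.0 for _ in envs]}  # placeholder metrics
        return is_done, info

    def rollout_episode(self, env, seed=None):
        # Returns a dict of results; keys and values here are placeholders.
        return {"cost": [1.0, 0.5], "fes": 2, "return": 0.0}


agent = StubAgent(max_episodes=3)
envs = ["env_a", "env_b"]  # placeholders for real environment objects

is_done = False
history = []
while not is_done:
    # Each call receives the list of environments and one seed per environment.
    is_done, info = agent.train_episode(envs, seeds=[0, 1], para_mode="dummy")
    history.append(info)

# After training finishes, evaluate on a single environment.
result = agent.rollout_episode(envs[0], seed=42)
```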
