src.baseline.metabbo.rlpso

Module Contents

Classes

PolicyNetwork

RLPSO

Introduction

The paper “Employing reinforcement learning to enhance particle swarm optimization methods” proposes using reinforcement learning (RL) to improve the efficiency and adaptability of Particle Swarm Optimization (PSO), a population-based optimizer inspired by swarm intelligence. The RL agent adjusts key PSO parameters and strategies during the optimization process, which lets the algorithm balance exploration and exploitation and mitigates premature convergence and stagnation.
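
To make the idea of adjusting PSO parameters during a run concrete, here is a minimal sketch in plain Python. It is not the RLPSO implementation: the function names (`pso_step`, `run_swarm`, `linear_schedule`) are hypothetical, and the sketch assumes the controlled parameters are the inertia weight `w` and the acceleration coefficients `c1` and `c2`, which this page does not confirm. A fixed linear schedule stands in for the learned policy.

```python
import random


def pso_step(positions, velocities, pbest, gbest, w, c1, c2, rng):
    """One standard PSO update; w, c1, c2 come from the caller (e.g. a policy)."""
    new_positions, new_velocities = [], []
    for x, v, p in zip(positions, velocities, pbest):
        nv, nx = [], []
        for d in range(len(x)):
            r1, r2 = rng.random(), rng.random()
            # inertia term + cognitive pull to pbest + social pull to gbest
            vd = w * v[d] + c1 * r1 * (p[d] - x[d]) + c2 * r2 * (gbest[d] - x[d])
            nv.append(vd)
            nx.append(x[d] + vd)
        new_velocities.append(nv)
        new_positions.append(nx)
    return new_positions, new_velocities


def sphere(x):
    """Toy single-objective cost function used only for this illustration."""
    return sum(xi * xi for xi in x)


def run_swarm(choose_params, steps=50, n_particles=10, dim=2, seed=0):
    """Run PSO, asking choose_params(progress) for (w, c1, c2) at every step."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    gbest = min(pbest, key=sphere)[:]
    initial_best = sphere(gbest)
    for t in range(steps):
        w, c1, c2 = choose_params(t / steps)
        pos, vel = pso_step(pos, vel, pbest, gbest, w, c1, c2, rng)
        for i, x in enumerate(pos):
            if sphere(x) < sphere(pbest[i]):
                pbest[i] = x[:]
        gbest = min(pbest + [gbest], key=sphere)[:]
    return initial_best, sphere(gbest)


def linear_schedule(progress):
    """Hypothetical stand-in for a learned policy: decay inertia over time."""
    return 0.9 - 0.5 * progress, 2.0, 2.0


initial_best, final_best = run_swarm(linear_schedule)
```

In an RL setting, `choose_params` would be replaced by a policy that maps the swarm state to parameter values; the update rule itself stays the same.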

API

class src.baseline.metabbo.rlpso.PolicyNetwork(config)[source]

Bases: torch.nn.Module

Initialization

forward(x_in, require_entropy=False, require_musigma=False)[source]
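
The return structure of `forward` is not documented on this page. The flag names suggest a stochastic policy that can optionally report its entropy and its distribution parameters. The following sketch illustrates that pattern under explicit assumptions: a diagonal Gaussian policy, a hypothetical function `gaussian_policy_forward`, and a return order chosen for illustration only. It uses the standard library rather than `torch` so that it runs on its own.

```python
import math
import random


def gaussian_policy_forward(mu, sigma, rng,
                            require_entropy=False, require_musigma=False):
    """Sample an action from N(mu, sigma^2) and return (action, log_prob, ...).

    Assumed layout (not taken from the RLPSO source): entropy is appended
    when require_entropy is set, then mu and sigma when require_musigma is set.
    """
    action = [rng.gauss(m, s) for m, s in zip(mu, sigma)]
    # Log-density of a diagonal Gaussian, summed over action dimensions.
    log_prob = sum(
        -0.5 * ((a - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)
        for a, m, s in zip(action, mu, sigma)
    )
    out = (action, log_prob)
    if require_entropy:
        # Closed-form entropy of a diagonal Gaussian.
        entropy = sum(0.5 * math.log(2 * math.pi * math.e * s * s) for s in sigma)
        out = out + (entropy,)
    if require_musigma:
        out = out + (list(mu), list(sigma))
    return out


result = gaussian_policy_forward([0.0], [1.0], random.Random(0),
                                 require_entropy=True, require_musigma=True)
```

The log-probability is what a REINFORCE-style update needs, and the entropy is commonly used as an exploration bonus; whether `PolicyNetwork` uses either in this way should be checked against the source.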

class src.baseline.metabbo.rlpso.RLPSO(config)[source]

Bases: src.rl.reinforce.REINFORCE_Agent

Original Paper

“Employing reinforcement learning to enhance particle swarm optimization methods.” Engineering Optimization (2022)

Official Implementation

None

Application Scenario

Single-objective optimization problems (SOOP)

Args:

`config`: Configuration object containing all necessary parameters for the experiment. For details, see config.py.

Attributes:

config (object): Configuration object with updated attributes specific to RLPSO.
model (PolicyNetwork): The policy network used by the RLPSO agent.
optimizer (torch.optim.Optimizer): Optimizer for training the policy network.
learning_time (int): Counter for the number of learning steps taken.
cur_checkpoint (int): Counter for the current checkpoint during training.

Methods:

__str__():
    Returns the string representation of the RLPSO class.
train_episode(envs, seeds, para_mode='dummy', asynchronous=None, num_cpus=1, num_gpus=0, tb_logger=None, required_info={}):
    Trains the RLPSO agent for one episode.
    Args:
        envs (list): List of environments for training.
        seeds (Optional[Union[int, List[int], np.ndarray]]): Seed(s) for environment randomization.
        para_mode (Literal['dummy', 'subproc', 'ray', 'ray-subproc']): Parallelization mode for environments.
        asynchronous (Literal[None, 'idle', 'restart', 'continue']): Asynchronous mode for environment execution.
        num_cpus (Optional[Union[int, None]]): Number of CPUs to use.
        num_gpus (int): Number of GPUs to use.
        tb_logger (object): TensorBoard logger for logging training metrics.
        required_info (dict): Additional information to retrieve from the environment.
    Returns:
        is_train_ended (bool): Whether the training has reached the maximum learning steps.
        return_info (dict): Dictionary containing training metrics and environment attributes.
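
The documented contract of `train_episode` (it returns a flag plus an info dictionary) suggests an outer loop like the one sketched below. The sketch does not import RLPSO: `StubAgent`, `train`, the `max_learning_step` argument, and the `learn_steps` key are hypothetical stand-ins used only to keep the example self-contained.

```python
class StubAgent:
    """Hypothetical stand-in that mimics the documented train_episode contract."""

    def __init__(self, max_learning_step):
        self.learning_time = 0
        self.max_learning_step = max_learning_step

    def train_episode(self, envs, seeds, para_mode='dummy', asynchronous=None,
                      num_cpus=1, num_gpus=0, tb_logger=None, required_info=None):
        # A real agent would roll out the environments and update its policy;
        # the stub only counts one learning step per environment.
        self.learning_time += len(envs)
        is_train_ended = self.learning_time >= self.max_learning_step
        return_info = {'learn_steps': self.learning_time}  # assumed key
        return is_train_ended, return_info


def train(agent, env_batches, base_seed=0):
    """Call train_episode once per batch until the agent reports it is done."""
    history = []
    for epoch, envs in enumerate(env_batches):
        # One distinct seed per environment in the batch.
        seeds = [base_seed + epoch * len(envs) + i for i in range(len(envs))]
        is_train_ended, return_info = agent.train_episode(
            envs, seeds, para_mode='dummy')
        history.append(return_info)
        if is_train_ended:
            break
    return history


agent = StubAgent(max_learning_step=6)
batches = [["env_a", "env_b"]] * 10
history = train(agent, batches)
```

With the real class, `envs` would be a list of environment objects and `return_info` would carry the training metrics described above; the loop structure is the only part this sketch is meant to convey.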

Returns:

None

Raises:

None

Initialization

Initializes the REINFORCE agent with the given configuration, networks, and learning rates, and stores the initial agent in the checkpoint directory.

Args:

  • config: Configuration object containing all necessary parameters for the experiment.

  • networks (dict): A dictionary of neural networks used by the agent.

  • learning_rates (float): Learning rate for the optimizer.

__str__()[source]
train_episode(envs, seeds: Optional[Union[int, List[int], np.ndarray]], para_mode: Literal['dummy', 'subproc', 'ray', 'ray-subproc'] = 'dummy', asynchronous: Literal[None, 'idle', 'restart', 'continue'] = None, num_cpus: Optional[Union[int, None]] = 1, num_gpus: int = 0, tb_logger=None, required_info={})[source]