src.environment.optimizer.rlemmo_optimizer

Module Contents

Classes

RLEMMO_Optimizer

A reinforcement learning-based evolutionary multi-modal optimizer (RLEMMO) that extends Learnable_Optimizer. This optimizer is designed for multi-modal optimization problems and leverages a population-based approach with multiple mutation strategies, neighborhood structures, and reward mechanisms.

API

class src.environment.optimizer.rlemmo_optimizer.RLEMMO_Optimizer(config)[source]

Bases: src.environment.optimizer.learnable_optimizer.Learnable_Optimizer

RLEMMO_Optimizer

A reinforcement learning-based evolutionary multi-modal optimizer (RLEMMO) that extends Learnable_Optimizer. This optimizer is designed for multi-modal optimization problems and leverages a population-based approach with multiple mutation strategies, neighborhood structures, and reward mechanisms.

Introduction

RLEMMO_Optimizer maintains a population of candidate solutions and applies various evolutionary operators (actions) to explore and exploit the search space. It uses neighborhood information, clustering, and reinforcement learning-inspired mechanisms to adaptively guide the search process. The optimizer is suitable for problems with multiple global optima and supports meta-data collection for analysis.

Initialization

Introduction

Initializes the RLEMMO_Optimizer with the provided configuration and sets default hyperparameters and attributes.

Args:

  • config (dict): Configuration dictionary containing optimizer settings.

    • The attributes that RLEMMO sets during initialization are listed under Attributes.

Attributes:

  • ps (int): Population size, default is 100.

  • k_neighbors (int): Number of neighbors for local search, default is 4.

  • n_action (int): Number of possible actions, default is 5.

  • FF (float): Differential evolution scaling factor, default is 0.5.

  • CR (float): Crossover probability, default is 0.9.

  • eps (float): Epsilon value for exploration, default is 0.2.

  • min_samples (int): Minimum samples for clustering or selection, default is 3.

  • reward_scale (int): Scaling factor for rewards, default is 1000.

  • fes (Any): Function evaluation counter, initialized as None.

  • cost (Any): Cost value, initialized as None.

  • pr (Any): Peak ratio (PR) metric, computed by cal_pr_sr; initialized as None.

  • sr (Any): Success rate (SR) metric, computed by cal_pr_sr; initialized as None.

  • log_index (Any): Logging index, initialized as None.

  • log_interval (Any): Logging interval, initialized as None.

  • archive (Any): Archive for storing solutions, initialized as None.

  • archive_val (Any): Archive for storing solution values, initialized as None.

Raises:

  • None

__str__()[source]

Introduction

Returns a string representation of the RLEMMO_Optimizer object.

Returns:

  • str: The name of the optimizer, “RLEMMO_Optimizer”.

get_costs(position, problem)[source]

Introduction

Calculates the cost values for a given set of positions using the provided optimization problem.

Args:

  • position (np.ndarray): An array representing one or more candidate solutions (positions) to be evaluated.

  • problem (object): An optimization problem instance that must provide an eval method and an optimum attribute.

Built-in Attribute:

  • self.fes (int): Function evaluation counter; incremented by the number of positions evaluated.

Returns:

  • np.ndarray: The cost values for each position, computed as the difference between the evaluated value and the problem’s optimum.

Raises:

  • AttributeError: If problem does not have the required eval method or optimum attribute.
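The cost computation described above can be sketched as follows. This is a minimal illustration, not the library's implementation: SphereProblem and get_costs_sketch are hypothetical names, and the counter is passed in and returned rather than stored on self.

```python
import numpy as np


class SphereProblem:
    """Hypothetical stand-in problem; only `eval` and `optimum` are assumed."""

    optimum = 0.0

    def eval(self, x):
        # Evaluate each row of x as one candidate solution.
        x = np.atleast_2d(x)
        return np.sum(x ** 2, axis=1)


def get_costs_sketch(position, problem, fes=0):
    """Return (cost, updated_fes) for a batch of positions."""
    position = np.atleast_2d(position)
    # Cost is the gap between the evaluated value and the known optimum.
    cost = problem.eval(position) - problem.optimum
    # One function evaluation is counted per evaluated position.
    return cost, fes + position.shape[0]


cost, fes = get_costs_sketch(np.array([[1.0, 2.0], [0.0, 0.0]]), SphereProblem())
```

A problem object lacking eval or optimum would raise AttributeError at the attribute access, which matches the Raises entry above.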

find_nei(pop_dist)[source]

Introduction

Constructs a neighbor matrix for a population based on pairwise distances, marking the k-nearest neighbors for each individual.

Args:

  • pop_dist (np.ndarray): A 2D square numpy array of shape (ps, ps) representing pairwise distances between population members.

Returns:

  • np.ndarray: A binary matrix of shape (ps, ps) where entry (i, j) is 1 if individual j is among the k-nearest neighbors of individual i, otherwise 0.

Notes:

  • The diagonal of pop_dist is set to infinity to exclude self-neighbors.

  • The number of neighbors considered is determined by self.k_neighbors.

  • The population size is determined by self.ps.
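The notes above can be sketched with NumPy as follows. This is an illustrative reconstruction under stated assumptions: find_nei_sketch is a hypothetical name, k_neighbors is passed as an argument instead of being read from self, and the input is copied so the caller's matrix is not modified (the original may modify it in place).

```python
import numpy as np


def find_nei_sketch(pop_dist, k_neighbors):
    """Build a binary k-nearest-neighbor matrix from pairwise distances."""
    dist = np.array(pop_dist, dtype=float)  # copy, so the input stays intact
    np.fill_diagonal(dist, np.inf)          # exclude self-neighbors
    ps = dist.shape[0]
    # Column indices of the k smallest distances in each row.
    nearest = np.argsort(dist, axis=1)[:, :k_neighbors]
    neighbor_matrix = np.zeros((ps, ps), dtype=int)
    neighbor_matrix[np.arange(ps)[:, None], nearest] = 1
    return neighbor_matrix


# Four points on a line at 0, 1, 3 and 7, with one neighbor each.
points = np.array([0.0, 1.0, 3.0, 7.0])
pairwise = np.abs(points[:, None] - points[None, :])
nei = find_nei_sketch(pairwise, k_neighbors=1)
```

Note that the relation is not symmetric in general: here the point at 3 has the point at 1 as its nearest neighbor, while the point at 1 has the point at 0.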

act1(pop_choice)[source]
act2(pop_choice)[source]
act3(pop_choice)[source]
act4(pop_choice)[source]
act5(pop_choice)[source]
cal_pr_sr(problem)[source]
initialize_individuals(problem)[source]

Introduction

Initializes the population of individuals for the optimizer, setting up their positions, costs, neighborhood relationships, and best-known solutions.

Args:

  • problem (object): An object representing the optimization problem, which must have the attributes dim (int, dimensionality of the problem), ub (np.ndarray or float, upper bounds), and lb (np.ndarray or float, lower bounds).

Side Effects:

  • Initializes and stores the following attributes in self.individuals:

    • ‘current_position’: np.ndarray of shape (ps, dim), the positions of all individuals.

    • ‘c_cost’: np.ndarray of shape (ps,), the cost of each individual.

    • ‘pop_dist’: np.ndarray of shape (ps, ps), pairwise distances between individuals.

    • ‘neighbor_matrix’: np.ndarray of shape (ps, ps), neighborhood relationships.

    • ‘gbest_position’: np.ndarray of shape (dim,), the position of the global best individual.

    • ‘gbest_val’: float, the cost of the global best individual.

    • ‘no_improve’: int, counter for global no improvement.

    • ‘lbest_position’: list of np.ndarray, best positions in each individual’s neighborhood.

    • ‘lbest_val’: list of float, best costs in each individual’s neighborhood.

    • ‘local_no_improve’: np.ndarray of shape (ps,), counters for local no improvement.

    • ‘per_no_improve’: np.ndarray of shape (ps,), counters for personal no improvement.

  • Sets self.max_cost to the maximum cost in the initial population.

  • Sets self.gbest_val to the global best cost.

Returns:

  • None
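As a rough sketch of how part of this state could be assembled, the snippet below samples positions, evaluates costs, and derives the pairwise distances and global best. Uniform random sampling within [lb, ub] and Euclidean distance are assumptions, not confirmed by this page; init_individuals_sketch and cost_fn are hypothetical names, and the neighborhood-related entries are omitted.

```python
import numpy as np


def init_individuals_sketch(ps, dim, lb, ub, cost_fn, seed=0):
    """Assemble a subset of the `individuals` dictionary described above."""
    rng = np.random.default_rng(seed)
    # Assumption: positions are drawn uniformly within the bounds.
    current_position = lb + rng.random((ps, dim)) * (ub - lb)
    c_cost = cost_fn(current_position)
    # Assumption: Euclidean pairwise distances, shape (ps, ps).
    diff = current_position[:, None, :] - current_position[None, :, :]
    pop_dist = np.sqrt((diff ** 2).sum(axis=-1))
    best = int(np.argmin(c_cost))
    return {
        "current_position": current_position,
        "c_cost": c_cost,
        "pop_dist": pop_dist,
        "gbest_position": current_position[best].copy(),
        "gbest_val": float(c_cost[best]),
        "no_improve": 0,
        "local_no_improve": np.zeros(ps),
        "per_no_improve": np.zeros(ps),
    }


individuals = init_individuals_sketch(
    ps=6, dim=2, lb=-5.0, ub=5.0, cost_fn=lambda x: (x ** 2).sum(axis=1)
)
```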

init_population(problem)[source]

Introduction

Initializes the population and related state variables for the optimizer based on the provided problem instance.

Args:

  • problem (object): An object representing the optimization problem, expected to have attributes such as maxfes, dim, ub, and lb.

Returns:

  • np.ndarray: The initial state of the population, including population state, exploration state, and exploitation state.

Notes:

  • Sets up internal counters and logging intervals.

  • Initializes individuals and their costs.

  • Calculates and stores initial performance metrics (pr, sr).

  • Optionally collects meta-data if configured.

observe()[source]

Introduction

Computes and returns a set of state features for each individual in the population, capturing various statistics and relationships relevant to the optimizer’s environment. These features are used for monitoring or further decision-making in the optimization process.

Returns:

  • np.ndarray: A 2D array of shape (ps, 22), where each row corresponds to an individual and each column represents a specific feature describing the individual’s state in the population.

Built-in Attribute:

  • self.individuals (dict): Contains arrays for current positions, costs, distances, neighbor matrices, best positions, and improvement counters for all individuals.

  • self.ps (int): Population size.

  • self.max_fes (int): Maximum number of function evaluations.

  • self.fes (int): Current number of function evaluations.

  • self.max_dist (float): Maximum possible distance in the search space.

  • self.max_cost (float): Maximum possible cost value.

  • self.k_neighbors (int): Number of neighbors considered for each individual.

Raises:

  • AssertionError: If the neighbor count does not match self.k_neighbors or if any NaN values are present in the resulting state array.

mydbscan(problem)[source]

Introduction

Applies the DBSCAN clustering algorithm to the current population of individuals, normalized within the problem bounds.

Args:

  • problem (object): The optimization problem object, which has lb (lower bounds) and ub (upper bounds) attributes.

Returns:

  • np.ndarray: An array of cluster labels assigned to each individual in the population.

Raises:

  • AttributeError: If problem does not have lb or ub attributes.

  • ValueError: If the population shape is incompatible with the problem bounds.
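The normalization step that precedes clustering might look like the sketch below. Min-max scaling to the unit hypercube is an assumption, since this page only says the population is "normalized within the problem bounds"; normalize_population is a hypothetical helper, and the clustering call itself is left out.

```python
import numpy as np


def normalize_population(position, lb, ub):
    """Scale positions to [0, 1] per dimension using the problem bounds."""
    position = np.asarray(position, dtype=float)
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    # Broadcasting raises ValueError if the shapes are incompatible.
    return (position - lb) / (ub - lb)


pop = np.array([[-5.0, 0.0], [5.0, 10.0], [0.0, 5.0]])
normalized = normalize_population(
    pop, lb=np.array([-5.0, 0.0]), ub=np.array([5.0, 10.0])
)
```

Scaling to a common range keeps a single distance threshold meaningful across dimensions with different bounds, which matters for a density-based method such as DBSCAN.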

cal_reward(problem)[source]
update(action, problem)[source]

Introduction

Updates the optimizer’s population based on the provided actions and problem instance, applying evolutionary operators, updating global and local bests, and calculating rewards and termination conditions.

Args:

  • action (np.ndarray): An array of actions to apply to each individual in the population.

  • problem (object): The optimization problem instance, providing bounds and cost evaluation.

Returns:

  • next_state (np.ndarray): The observed state of the population after the update.

  • reward (float): The calculated reward for the current update step, scaled by reward_scale.

  • is_end (bool): Flag indicating whether the optimization process has reached its end condition.

  • info (dict): Additional information (currently empty).

Raises:

  • ValueError: If an invalid action is encountered in the action array.