src.environment.optimizer.rlhpsde_optimizer

Module Contents

Classes

Population

RLHPSDE_Optimizer

RLHPSDE_Optimizer

A reinforcement learning-based hyper-parameter self-adaptive differential evolution optimizer.
This optimizer dynamically adapts its mutation and crossover strategies using reinforcement learning, and employs random walk-based landscape analysis to guide its search process.

Functions

clipping

binomial

generate_random_int

Introduction

Generates a matrix of random integers for use in mutation operations, ensuring that each row contains unique values and no value matches its row index.

cur_to_best_1

Parameters:

  • x – The 2-D population matrix of shape [NP, dim].

  • best – An array of the best individual of shape [dim].

  • F – The mutation factor, which could be a float or a 1-D array of shape [NP].

cur_to_rand_1

API

class src.environment.optimizer.rlhpsde_optimizer.Population(config, rng, dim)

Initialization

initialize_group(lb, ub, size=-1)
initialize_costs(problem)
sort(size, reverse=False)
choose_F_Cr(F_dist)
mean_wL(df, s)
update_M_F_Cr(SF, SCr, df)
LPSR(fes, maxFEs)

class src.environment.optimizer.rlhpsde_optimizer.RLHPSDE_Optimizer(config)

Bases: src.environment.optimizer.learnable_optimizer.Learnable_Optimizer

RLHPSDE_Optimizer

A reinforcement learning-based hyper-parameter self-adaptive differential evolution optimizer.
This optimizer dynamically adapts its mutation and crossover strategies using reinforcement learning, and employs random walk-based landscape analysis to guide its search process.

Introduction

The RLHPSDE_Optimizer class extends Learnable_Optimizer and implements a self-adaptive differential evolution algorithm enhanced with reinforcement learning.
It utilizes random walk sampling and landscape analysis (fitness distance correlation and information entropy ruggedness) to determine the current state of the optimization landscape, which is then used to select appropriate mutation strategies.
The optimizer maintains a population of candidate solutions and iteratively updates them to minimize a given objective function.

Initialization

Introduction

Initializes the optimizer with the provided configuration, setting up algorithm-specific parameters and internal state variables.

Args:

  • config (object): Configuration object containing optimizer parameters such as mutation factor, crossover rate, minimum population size, memory factor, random walk steps, step size, maximum function evaluations, and logging interval.

Built-in Attribute:

  • self.__config: Stores the configuration object.

  • self.__population: Placeholder for the population, initialized as None.

  • self.__rw_steps: Number of random walk steps, taken from config.

  • self.__step_size: Step size for the optimizer, taken from config.

  • self.__maxFEs: Maximum number of function evaluations, taken from config.

  • self.fes: Counter for function evaluations, initialized as None.

  • self.cost: Placeholder for the cost value, initialized as None.

  • self.log_index: Index for logging, initialized as None.

  • self.log_interval: Interval for logging, taken from config.

Raises:

  • AttributeError: If required attributes are missing from the config object.

__str__()

Returns a string representation of the RLHPSDE optimizer.

Returns:

  • str: The name of the optimizer, "RLHPSDE".

init_population(problem)

Introduction

Initializes the population for the optimization process, sets up costs, sorts individuals, and prepares logging and metadata as required.

Args:

  • problem (object): An object representing the optimization problem, expected to have attributes lb (lower bounds), ub (upper bounds), and methods or properties for cost evaluation.

Returns:

  • object: The current state of the optimizer after population initialization, as returned by self.__get_state(problem).

Side Effects:

  • Initializes and modifies internal attributes such as self.__population, self.fes, self.log_index, self.meta_X, self.meta_Cost, and self.cost.

__simple_random_walk(lb, ub)

Introduction

Generates a sequence of samples using a simple random walk within specified lower and upper bounds.

Args:

  • lb (np.ndarray): Lower bounds for each dimension of the random walk (shape: [dim,]).

  • ub (np.ndarray): Upper bounds for each dimension of the random walk (shape: [dim,]).

Built-in Attribute:

  • self.__rw_steps (int): Number of random walk steps to perform.

  • self.__dim (int): Dimensionality of the search space.

  • self.__step_size (float): Maximum step size for each random walk move.

  • self.rng (np.random.Generator): Random number generator used for sampling.

Returns:

  • np.ndarray: Array of shape (self.__rw_steps + 1, self.__dim) containing the random walk samples.

Raises:

  • None
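
A minimal sketch of a bounded simple random walk, written as a free function rather than the private method. The function name, the uniform step distribution, and the clipping at the bounds are assumptions made for illustration; the actual method may handle steps and boundaries differently.

```python
import numpy as np

def simple_random_walk(lb, ub, steps, step_size, rng):
    """Hypothetical stand-in for the private method, not the library's code."""
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    dim = lb.shape[0]
    walk = np.empty((steps + 1, dim))
    walk[0] = rng.uniform(lb, ub)  # random starting point inside the box
    for t in range(1, steps + 1):
        # Assumed: each coordinate moves by at most step_size per step.
        offset = rng.uniform(-step_size, step_size, size=dim)
        # Assumed: positions that leave the box are clipped back onto it.
        walk[t] = np.clip(walk[t - 1] + offset, lb, ub)
    return walk

rng = np.random.default_rng(0)
walk = simple_random_walk(np.full(3, -5.0), np.full(3, 5.0),
                          steps=10, step_size=0.5, rng=rng)
```

The result has one row per visited point, including the start, which matches the (self.__rw_steps + 1, self.__dim) shape listed under Returns.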

__progressive_random_walk(lb, ub)

Introduction

Generates a sequence of samples using a progressive random walk within specified lower and upper bounds. The walk starts from a randomly initialized point and iteratively updates the position, reflecting off the boundaries when exceeded.

Args:

  • lb (float or np.ndarray): The lower bound(s) for each dimension of the random walk.

  • ub (float or np.ndarray): The upper bound(s) for each dimension of the random walk.

Returns:

  • np.ndarray: An array of shape (self.__rw_steps + 1, self.__dim) containing the sequence of sampled points during the random walk.

Raises:

  • None
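
A minimal sketch of a progressive random walk with boundary reflection, under stated assumptions: each dimension keeps a direction bias, a step in that direction is drawn uniformly from [0, step_size], and a coordinate that crosses a bound is mirrored back inside while its direction flips. The function name and these details are illustrative and may not match the private method.

```python
import numpy as np

def progressive_random_walk(lb, ub, steps, step_size, rng):
    """Hypothetical sketch; assumes step_size is smaller than ub - lb."""
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    dim = lb.shape[0]
    walk = np.empty((steps + 1, dim))
    walk[0] = rng.uniform(lb, ub)                   # random starting point
    direction = rng.choice([-1.0, 1.0], size=dim)   # per-dimension bias
    for t in range(1, steps + 1):
        pos = walk[t - 1] + direction * rng.uniform(0.0, step_size, size=dim)
        over, under = pos > ub, pos < lb
        pos[over] = 2.0 * ub[over] - pos[over]      # mirror at the upper bound
        pos[under] = 2.0 * lb[under] - pos[under]   # mirror at the lower bound
        direction[over | under] *= -1.0             # reverse the bias after a bounce
        walk[t] = pos
    return walk

rng = np.random.default_rng(1)
pwalk = progressive_random_walk(np.full(2, 0.0), np.full(2, 1.0),
                                steps=50, step_size=0.2, rng=rng)
```

Compared with clipping, reflection avoids piling samples up on the boundary, which is one reason a walk of this kind is often preferred for landscape sampling.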

__DFDC(sample, cost)

Introduction

Calculates the dynamic fitness distance correlation (DFDC) of a random-walk sample and uses it to classify the landscape as easy or difficult.

Args:

  • sample (np.ndarray): Array of candidate solutions, where the first element is excluded from analysis.

  • cost (np.ndarray): Array of cost values corresponding to each sample, with the first element excluded from analysis.

Returns:

  • bool: True if the sample is classified as “easy” (correlation coefficient between 0.15 and 1), False if “difficult” (between -1 and 0.15).

Raises:

  • ValueError: If the computed correlation coefficient is outside the expected range [-1, 1] or if standard deviations are zero.
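
A minimal sketch of a fitness distance correlation check, assuming distances are measured to the lowest-cost sample and that a coefficient at or above 0.15 counts as easy. The function name, the choice of reference point, and the handling of the exact 0.15 boundary are assumptions; only the threshold value and the exclusion of the first element come from the description above.

```python
import numpy as np

def fdc_is_easy(sample, cost, threshold=0.15):
    """Hypothetical sketch of a fitness-distance-correlation classifier."""
    sample = np.asarray(sample, dtype=float)[1:]   # first element excluded
    cost = np.asarray(cost, dtype=float)[1:]
    best = sample[np.argmin(cost)]                 # assumed reference point
    dist = np.linalg.norm(sample - best, axis=1)
    if cost.std() == 0 or dist.std() == 0:
        raise ValueError("zero standard deviation; correlation is undefined")
    # Pearson correlation between cost and distance to the best sample.
    r = np.mean((cost - cost.mean()) * (dist - dist.mean())) / (cost.std() * dist.std())
    return bool(r >= threshold)

# A 1-D quadratic: cost grows with distance from the best point.
sample = np.arange(11.0).reshape(-1, 1)
cost = sample[:, 0] ** 2
easy = fdc_is_easy(sample, cost)
```

On this quadratic, cost and distance rise together, so the sample is classified as easy.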

__DRIE(cost)

Introduction

Determines the difficulty of a cost sequence using the DRIE (dynamic ruggedness of information entropy) measure, which is based on the entropy of symbol transitions in the sequence.

Args:

  • cost (np.ndarray): 1D array of cost values representing a sequence of steps.

Built-in Attribute:

  • self.__rw_steps (int): Number of random walk steps used for windowing the cost sequence.

Returns:

  • bool:

    • True if the sequence is classified as “easy” (0.5 <= r <= 1).

    • False if the sequence is classified as “difficult” (0 <= r < 0.5).

Raises:

  • ValueError: If the computed DRIE value r falls outside the expected range [0, 1].
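
A minimal sketch of an entropy-of-symbol-transitions measure for a single sensitivity value eps. It assumes the common information-content formulation: consecutive cost differences are encoded as -1, 0, or 1, and the entropy is taken over pairs of differing neighbouring symbols with a base-6 logarithm so that the result lies in [0, 1]. The function name and the eps parameter are hypothetical, and the private method may combine several eps values or window the sequence differently.

```python
import numpy as np

def entropy_ruggedness(cost, eps=0.0):
    """Hypothetical sketch; returns a value in [0, 1]."""
    cost = np.asarray(cost, dtype=float)
    diff = np.diff(cost)
    # Encode each step as falling (-1), flat (0), or rising (1).
    symbols = np.where(diff > eps, 1, np.where(diff < -eps, -1, 0))
    pairs = list(zip(symbols[:-1], symbols[1:]))
    if not pairs:
        raise ValueError("need at least three cost values")
    h = 0.0
    for p in (-1, 0, 1):
        for q in (-1, 0, 1):
            if p == q:
                continue  # only transitions between different symbols count
            prob = sum(1 for a, b in pairs if a == p and b == q) / len(pairs)
            if prob > 0:
                h -= prob * np.log(prob) / np.log(6)  # six possible p != q pairs
    return h

r_smooth = entropy_ruggedness(np.arange(10.0))             # monotone sequence
r_rugged = entropy_ruggedness([0, 1, 0, 1, 0, 1, 0])       # alternating sequence
is_easy = r_rugged >= 0.5  # threshold taken from the Returns entry above
```

A monotone sequence has no transitions between different symbols, so its entropy is zero, while an alternating sequence yields a strictly positive value.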

__get_state(problem)

Introduction

Generates the current state representation for the optimizer by performing a random walk within the problem’s bounds, evaluating the sampled solutions, and combining feature extraction methods.

Args:

  • problem (object): An optimization problem instance that must have attributes lb (lower bounds), ub (upper bounds), and optimum (optional), as well as an eval method for evaluating solutions.

Returns:

  • np.ndarray or float: The computed state representation, which is a combination of features extracted from the sampled solutions and their costs.

Notes:

  • Increments the function evaluation counter (self.fes) by the number of samples evaluated.

  • Uses either a simple or progressive random walk to generate samples.

  • Applies feature extraction methods __DFDC and __DRIE to the sampled data.

update(action, problem)

Introduction

Updates the optimizer’s population based on the selected action and the given problem instance. This method performs mutation, crossover, selection, and updates the best solution found so far. It also manages logging, reward calculation, and meta-data collection for reinforcement learning-based hyper-parameter search.

Args:

  • action (int): The action index specifying which mutation and crossover strategy to use.

  • problem (object): The problem instance providing evaluation, lower/upper bounds, and optimum value.

Returns:

  • state (object): The updated state representation for the RL agent.

  • reward (float): The reward signal based on the proportion of improved solutions.

  • done (bool): Whether the optimization process has reached its termination condition.

  • info (dict): Additional information (currently empty).

Raises:

  • ValueError: If the provided action is not a valid index for the available strategies.
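
A minimal sketch of the selection step and the reward described above, assuming greedy one-to-one replacement and a reward equal to the fraction of individuals whose trial vector improved on its parent. The function name select_and_reward and its arguments are hypothetical; the actual method may compute the reward differently.

```python
import numpy as np

def select_and_reward(x, cost, trial, trial_cost):
    """Hypothetical sketch of DE selection plus a proportion-improved reward."""
    improved = trial_cost < cost                      # minimization
    new_x = np.where(improved[:, None], trial, x)     # keep the better vector
    new_cost = np.where(improved, trial_cost, cost)
    reward = float(improved.mean())                   # share of improved individuals
    return new_x, new_cost, reward

x = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
cost = np.array([4.0, 3.0, 2.0, 1.0])
trial = x + 10.0
trial_cost = np.array([1.0, 5.0, 1.0, 5.0])
new_x, new_cost, reward = select_and_reward(x, cost, trial, trial_cost)
```

Here two of the four trial vectors beat their parents, so the reward is 0.5 and only those two rows of the population are replaced.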

src.environment.optimizer.rlhpsde_optimizer.clipping(x: Union[numpy.ndarray, Iterable], lb: Union[numpy.ndarray, Iterable, int, float, None], ub: Union[numpy.ndarray, Iterable, int, float, None]) -> numpy.ndarray

src.environment.optimizer.rlhpsde_optimizer.binomial(x: numpy.ndarray, v: numpy.ndarray, Cr: Union[numpy.ndarray, float], rng) -> numpy.ndarray

src.environment.optimizer.rlhpsde_optimizer.generate_random_int(NP: int, cols: int, rng: numpy.random.RandomState = None) -> numpy.ndarray

Introduction

Generates a matrix of random integers for use in mutation operations, ensuring that each row contains unique values and no value matches its row index.

Args:

  • NP (int): Population size, determines the number of rows and the range of random integers [0, NP-1].

  • cols (int): Number of random integers to generate for each individual (number of columns).

  • rng (np.random.RandomState, optional): Random number generator instance. If None, a default RNG should be used.

Returns:

  • np.ndarray: A (NP, cols) shaped matrix of random integers, where each row contains unique values and no value equals its row index.

Raises:

  • ValueError: If NP or cols is not a positive integer.
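
A minimal sketch of one way to build such an index matrix, drawing each row without replacement from the indices other than the row's own. The function name random_distinct_indices is hypothetical, and the library function may use a vectorized approach instead of this row-by-row loop.

```python
import numpy as np

def random_distinct_indices(NP, cols, rng=None):
    """Hypothetical sketch: (NP, cols) indices, unique per row, never the row index."""
    if NP <= 0 or cols <= 0:
        raise ValueError("NP and cols must be positive integers")
    if cols > NP - 1:
        raise ValueError("cols must be at most NP - 1 to keep rows unique")
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty((NP, cols), dtype=int)
    for i in range(NP):
        candidates = np.delete(np.arange(NP), i)   # every index except i
        out[i] = rng.choice(candidates, size=cols, replace=False)
    return out

idx = random_distinct_indices(8, 3, np.random.default_rng(0))
```

Each row of idx can then serve as the donor indices for one individual in a mutation operator.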

src.environment.optimizer.rlhpsde_optimizer.cur_to_best_1(x: numpy.ndarray, best: numpy.ndarray, F: Union[numpy.ndarray, float], rng: numpy.random.RandomState = None) -> numpy.ndarray

Parameters:

  • x – The 2-D population matrix of shape [NP, dim].

  • best – An array of the best individual of shape [dim].

  • F – The mutation factor, which could be a float or a 1-D array of shape [NP].
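
A minimal sketch of the standard DE/current-to-best/1 mutation, v_i = x_i + F (best - x_i) + F (x_r1 - x_r2), using the parameter shapes listed above. The function name current_to_best_1 and the row-by-row index sampling are illustrative assumptions; whether the library function follows exactly this formula is not confirmed by this page.

```python
import numpy as np

def current_to_best_1(x, best, F, rng):
    """Hypothetical sketch of DE/current-to-best/1 for a whole population."""
    NP = x.shape[0]
    r = np.empty((NP, 2), dtype=int)
    for i in range(NP):
        # Two distinct donors, neither equal to the current individual.
        r[i] = rng.choice(np.delete(np.arange(NP), i), size=2, replace=False)
    F = np.asarray(F, dtype=float)
    if F.ndim == 1:
        F = F[:, None]          # per-individual F broadcasts across dimensions
    return x + F * (best - x) + F * (x[r[:, 0]] - x[r[:, 1]])

x = np.random.default_rng(42).uniform(-5.0, 5.0, size=(6, 3))
best = x[0]
v = current_to_best_1(x, best, 0.5, np.random.default_rng(0))
```

Passing F as a 1-D array of length NP gives every individual its own mutation factor, which is what a self-adaptive scheme needs.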

src.environment.optimizer.rlhpsde_optimizer.cur_to_rand_1(x: numpy.ndarray, F: Union[numpy.ndarray, float], rng: numpy.random.RandomState = None) -> numpy.ndarray