src.environment.optimizer.rlhpsde_optimizer¶
Module Contents¶
Classes¶
RLHPSDE_Optimizer | A reinforcement learning-based hyper-parameter self-adaptive differential evolution optimizer. |
Functions¶
generate_random_int | Generates a matrix of random integers for use in mutation operations, ensuring that each row contains unique values and no value matches its row index. |
API¶
- class src.environment.optimizer.rlhpsde_optimizer.Population(config, rng, dim)[source]¶
Initialization
- class src.environment.optimizer.rlhpsde_optimizer.RLHPSDE_Optimizer(config)[source]¶
Bases: src.environment.optimizer.learnable_optimizer.Learnable_Optimizer
RLHPSDE_Optimizer¶
A reinforcement learning-based hyper-parameter self-adaptive differential evolution optimizer.
This optimizer dynamically adapts its mutation and crossover strategies using reinforcement learning, and employs random walk-based landscape analysis to guide its search process.
Introduction¶
The RLHPSDE_Optimizer class extends Learnable_Optimizer and implements a self-adaptive differential evolution algorithm enhanced with reinforcement learning.
It uses random walk sampling and landscape analysis (fitness distance correlation and information entropy ruggedness) to determine the current state of the optimization landscape, which is then used to select appropriate mutation strategies.
The optimizer maintains a population of candidate solutions and iteratively updates them to minimize a given objective function.
Initialization
Introduction¶
Initializes the optimizer with the provided configuration, setting up algorithm-specific parameters and internal state variables.
Args:¶
config (object): Configuration object containing optimizer parameters such as mutation factor, crossover rate, minimum population size, memory factor, random walk steps, step size, maximum function evaluations, and logging interval.
The Attributes needed for the RLHPSDE_Optimizer:
Built-in Attribute:¶
self.__config: Stores the configuration object.
self.__population: Placeholder for the population, initialized as None.
self.__rw_steps: Number of random walk steps, taken from config.
self.__step_size: Step size for the optimizer, taken from config.
self.__maxFEs: Maximum number of function evaluations, taken from config.
self.fes: Counter for function evaluations, initialized as None.
self.cost: Placeholder for the cost value, initialized as None.
self.log_index: Index for logging, initialized as None.
self.log_interval: Interval for logging, taken from config.
Raises:¶
AttributeError: If required attributes are missing from the config object.
- __str__()[source]¶
Returns a string representation of the RLHPSDE optimizer.
Returns:¶
str: The name of the optimizer, "RLHPSDE".
- init_population(problem)[source]¶
Introduction¶
Initializes the population for the optimization process, sets up costs, sorts individuals, and prepares logging and metadata as required.
Args:¶
problem (object): An object representing the optimization problem, expected to have attributes lb (lower bounds), ub (upper bounds), and methods or properties for cost evaluation.
Returns:¶
object: The current state of the optimizer after population initialization, as returned by self.__get_state(problem).
Side Effects:¶
Initializes and modifies internal attributes such as self.__population, self.fes, self.log_index, self.meta_X, self.meta_Cost, and self.cost.
- __simple_random_walk(lb, ub)[source]¶
Introduction¶
Generates a sequence of samples using a simple random walk within specified lower and upper bounds.
Args:¶
lb (np.ndarray): Lower bounds for each dimension of the random walk (shape: [dim,]).
ub (np.ndarray): Upper bounds for each dimension of the random walk (shape: [dim,]).
Built-in Attribute:¶
self.__rw_steps (int): Number of random walk steps to perform.
self.__dim (int): Dimensionality of the search space.
self.__step_size (float): Maximum step size for each random walk move.
self.rng (np.random.Generator): Random number generator used for sampling.
Returns:¶
np.ndarray: Array of shape (self.__rw_steps + 1, self.__dim) containing the random walk samples.
Raises:¶
None
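The following is a minimal sketch of a bounded simple random walk, not the module's actual implementation. The function name `simple_random_walk`, the explicit `rw_steps`, `step_size`, and `rng` parameters (standing in for the private attributes listed above), and the choice to clip out-of-bounds positions are all assumptions made for illustration.

```python
import numpy as np

def simple_random_walk(lb, ub, rw_steps, step_size, rng):
    """Hypothetical stand-in: returns an array of shape (rw_steps + 1, dim)."""
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    dim = lb.shape[0]
    walk = np.empty((rw_steps + 1, dim))
    # Start from a uniformly random point inside the box [lb, ub].
    walk[0] = rng.uniform(lb, ub)
    for s in range(1, rw_steps + 1):
        # Each coordinate moves by at most step_size in either direction.
        step = rng.uniform(-step_size, step_size, size=dim)
        # Assumption: positions that leave the box are clipped back onto it.
        walk[s] = np.clip(walk[s - 1] + step, lb, ub)
    return walk

walk = simple_random_walk(np.zeros(3), np.ones(3), rw_steps=10,
                          step_size=0.1, rng=np.random.default_rng(0))
print(walk.shape)
```

The extra row in the result is the starting point, which is why the documented shape is (self.__rw_steps + 1, self.__dim).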
- __progressive_random_walk(lb, ub)[source]¶
Introduction¶
Generates a sequence of samples using a progressive random walk within specified lower and upper bounds. The walk starts from a randomly initialized point and iteratively updates the position, reflecting off the boundaries when exceeded.
Args:¶
lb (float or np.ndarray): The lower bound(s) for each dimension of the random walk.
ub (float or np.ndarray): The upper bound(s) for each dimension of the random walk.
Returns:¶
np.ndarray: An array of shape (self.__rw_steps + 1, self.__dim) containing the sequence of sampled points during the random walk.
Raises:¶
None
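The boundary reflection described above can be sketched as follows. This is an illustrative version only: the name `progressive_random_walk`, the explicit `dim`, `rw_steps`, `step_size`, and `rng` parameters, and the per-dimension direction vector that flips on reflection are assumptions, and the sketch further assumes `step_size` does not exceed the width of the box in any dimension.

```python
import numpy as np

def progressive_random_walk(lb, ub, dim, rw_steps, step_size, rng):
    """Hypothetical stand-in: returns an array of shape (rw_steps + 1, dim)."""
    # Accept scalar or per-dimension bounds, as the documented signature allows.
    lb = np.broadcast_to(np.asarray(lb, dtype=float), (dim,))
    ub = np.broadcast_to(np.asarray(ub, dtype=float), (dim,))
    # Assumption: each dimension keeps a travel direction of -1 or +1.
    direction = rng.choice([-1.0, 1.0], size=dim)
    walk = np.empty((rw_steps + 1, dim))
    walk[0] = rng.uniform(lb, ub)
    for s in range(1, rw_steps + 1):
        pos = walk[s - 1] + direction * rng.uniform(0.0, step_size, size=dim)
        over, under = pos > ub, pos < lb
        # Mirror the overshoot back into the box and reverse the direction.
        pos[over] = 2.0 * ub[over] - pos[over]
        pos[under] = 2.0 * lb[under] - pos[under]
        direction[over | under] *= -1.0
        walk[s] = pos
    return walk

walk = progressive_random_walk(-5.0, 5.0, dim=4, rw_steps=50,
                               step_size=1.0, rng=np.random.default_rng(0))
print(walk.shape)
```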
- __DFDC(sample, cost)[source]¶
Introduction¶
Calculate the Dynamic Fitness Distance Correlation.
Args:¶
sample (np.ndarray): Array of candidate solutions, where the first element is excluded from analysis.
cost (np.ndarray): Array of cost values corresponding to each sample, with the first element excluded from analysis.
Returns:¶
bool: True if the sample is classified as “easy” (correlation coefficient between 0.15 and 1), False if “difficult” (between -1 and 0.15).
Raises:¶
ValueError: If the computed correlation coefficient is outside the expected range [-1, 1] or if standard deviations are zero.
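One plausible reading of this classification is sketched below: compute the correlation between each sample's cost and its distance to the best sample, then compare it with the documented 0.15 threshold. The function name `fdc_is_easy` and the choice of the lowest-cost sample as the reference point are assumptions; the module's actual computation may differ.

```python
import numpy as np

def fdc_is_easy(sample, cost, threshold=0.15):
    """Hypothetical sketch of a fitness distance correlation test."""
    # The first element is excluded from the analysis, as documented above.
    sample = np.asarray(sample, dtype=float)[1:]
    cost = np.asarray(cost, dtype=float)[1:]
    # Assumption: distances are measured to the lowest-cost sample.
    best = sample[np.argmin(cost)]
    dist = np.linalg.norm(sample - best, axis=1)
    if cost.std() == 0 or dist.std() == 0:
        raise ValueError("correlation is undefined for zero standard deviation")
    cov = np.mean((cost - cost.mean()) * (dist - dist.mean()))
    r = cov / (cost.std() * dist.std())
    # "Easy" when cost grows with distance from the best point.
    return bool(r >= threshold)

x = np.arange(11, dtype=float).reshape(-1, 1)
print(fdc_is_easy(x, x[:, 0]))
```

In the usage line the cost rises linearly with distance from the best point, so the correlation is close to 1 and the landscape counts as easy.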
- __DRIE(cost)[source]¶
Introduction¶
Determines the difficulty of a cost sequence using the DRIE (Dynamic Ruggedness of Information Entropy) measure, which is based on the entropy of symbol transitions along the sequence.
Args:¶
cost (np.ndarray): 1D array of cost values representing a sequence of steps.
Built-in Attribute:¶
self.__rw_steps (int): Number of random walk steps used for windowing the cost sequence.
Returns:¶
bool:
True if the sequence is classified as “easy” (0.5 <= r <= 1).
False if the sequence is classified as “difficult” (0 <= r < 0.5).
Raises:¶
ValueError: If the computed DRIE value r falls outside the expected range [0, 1].
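To illustrate what "entropy of symbol transitions" can mean, here is a sketch of the static information-entropy ruggedness measure from the landscape-analysis literature. It is not the module's code: the function name `entropy_ruggedness`, the set of sensitivity values `eps`, and the omission of the windowing over self.__rw_steps are all assumptions.

```python
import numpy as np

def entropy_ruggedness(cost):
    """Hypothetical sketch: returns a ruggedness value r in [0, 1]."""
    diff = np.diff(np.asarray(cost, dtype=float))
    eps_max = np.abs(diff).max()
    # Assumption: probe eps = 0 and eps_max / 2**k for k = 7, ..., 0.
    eps_values = [0.0] + [eps_max / 2 ** k for k in range(7, -1, -1)]
    r = 0.0
    for eps in eps_values:
        # Encode each step as -1 (down), 0 (flat within eps) or +1 (up).
        sym = np.where(diff > eps, 1, np.where(diff < -eps, -1, 0))
        p, q = sym[:-1], sym[1:]
        h = 0.0
        for a in (-1, 0, 1):
            for b in (-1, 0, 1):
                if a == b:
                    continue  # only transitions between different symbols count
                prob = np.count_nonzero((p == a) & (q == b)) / len(p)
                if prob > 0:
                    # Log base 6: there are six possible unequal symbol pairs.
                    h -= prob * np.log(prob) / np.log(6)
        r = max(r, h)
    return r

r = entropy_ruggedness(np.arange(20.0))
print(r, r >= 0.5)  # value, and the "easy" test using the documented threshold
```

A monotone sequence produces no transitions between different symbols, so its value is 0; a sequence that alternates up and down scores higher.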
- __get_state(problem)[source]¶
Introduction¶
Generates the current state representation for the optimizer by performing a random walk within the problem’s bounds, evaluating the sampled solutions, and combining feature extraction methods.
Args:¶
problem (object): An optimization problem instance that must have attributes lb (lower bounds), ub (upper bounds), and optimum (optional), as well as an eval method for evaluating solutions.
Returns:¶
np.ndarray or float: The computed state representation, which is a combination of features extracted from the sampled solutions and their costs.
Notes:¶
Increments the function evaluation counter (self.fes) by the number of samples evaluated.
Uses either a simple or progressive random walk to generate samples.
Applies feature extraction methods __DFDC and __DRIE to the sampled data.
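The page does not say how the two boolean features are combined into the returned state. One simple possibility, shown purely as an assumption, is to treat them as two bits of a discrete state index; the name `encode_state` and the bit order are hypothetical.

```python
def encode_state(fdc_easy: bool, rie_easy: bool) -> int:
    """Hypothetical encoding: two easy/difficult flags -> state index 0..3."""
    # The first flag is the high bit and the second flag is the low bit.
    return 2 * int(fdc_easy) + int(rie_easy)

# Enumerate the four possible landscape states.
for fdc_easy in (False, True):
    for rie_easy in (False, True):
        print(fdc_easy, rie_easy, encode_state(fdc_easy, rie_easy))
```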
- update(action, problem)[source]¶
Introduction¶
Updates the optimizer’s population based on the selected action and the given problem instance. This method performs mutation, crossover, selection, and updates the best solution found so far. It also manages logging, reward calculation, and meta-data collection for reinforcement learning-based hyper-parameter search.
Args:¶
action (int): The action index specifying which mutation and crossover strategy to use.
problem (object): The problem instance providing evaluation, lower/upper bounds, and optimum value.
Returns:¶
state (object): The updated state representation for the RL agent.
reward (float): The reward signal based on the proportion of improved solutions.
done (bool): Whether the optimization process has reached its termination condition.
info (dict): Additional information (currently empty).
Raises:¶
ValueError: If the provided action is not a valid index for the available strategies.
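The selection step and the reward described above ("the proportion of improved solutions") can be sketched as follows. This is an assumption-laden illustration rather than the method's code: the helper name `select_and_reward` and its array arguments are hypothetical, and ties are treated as "not improved".

```python
import numpy as np

def select_and_reward(parent_x, parent_cost, trial_x, trial_cost):
    """Hypothetical sketch of greedy DE selection with a proportion reward."""
    # A trial vector replaces its parent only when it is strictly better.
    improved = trial_cost < parent_cost
    new_x = np.where(improved[:, None], trial_x, parent_x)
    new_cost = np.where(improved, trial_cost, parent_cost)
    # Reward: fraction of the population that improved in this generation.
    reward = float(improved.mean())
    return new_x, new_cost, reward

parents = np.zeros((4, 2))
trials = np.ones((4, 2))
_, cost, reward = select_and_reward(parents, np.array([3.0, 3.0, 3.0, 3.0]),
                                    trials, np.array([1.0, 5.0, 2.0, 3.0]))
print(cost, reward)
```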
- src.environment.optimizer.rlhpsde_optimizer.clipping(x: Union[numpy.ndarray, Iterable], lb: Union[numpy.ndarray, Iterable, int, float, None], ub: Union[numpy.ndarray, Iterable, int, float, None]) → numpy.ndarray[source]¶
- src.environment.optimizer.rlhpsde_optimizer.binomial(x: numpy.ndarray, v: numpy.ndarray, Cr: Union[numpy.ndarray, float], rng) → numpy.ndarray[source]¶
- src.environment.optimizer.rlhpsde_optimizer.generate_random_int(NP: int, cols: int, rng: numpy.random.RandomState = None) → numpy.ndarray[source]¶
Introduction¶
Generates a matrix of random integers for use in mutation operations, ensuring that each row contains unique values and no value matches its row index.
Args:¶
NP (int): Population size, determines the number of rows and the range of random integers [0, NP-1].
cols (int): Number of random integers to generate for each individual (number of columns).
rng (np.random.RandomState, optional): Random number generator instance. If None, a default RNG should be used.
Returns:¶
np.ndarray: A (NP, cols) shaped matrix of random integers, where each row contains unique values and no value equals its row index.
Raises:¶
ValueError: If NP or cols is not a positive integer.
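A straightforward way to satisfy both constraints (unique values per row, none equal to the row index) is sketched below. It is not the module's implementation: the name `random_int_matrix`, the use of `numpy.random.default_rng` as the fallback generator, and the extra check that `cols` does not exceed `NP - 1` are assumptions.

```python
import numpy as np

def random_int_matrix(NP, cols, rng=None):
    """Hypothetical sketch: (NP, cols) indices, unique per row, never the row index."""
    if NP <= 0 or cols <= 0:
        raise ValueError("NP and cols must be positive integers")
    if cols > NP - 1:
        raise ValueError("cols cannot exceed NP - 1 distinct candidates")
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty((NP, cols), dtype=int)
    for i in range(NP):
        # Candidate indices for row i: every individual except i itself.
        candidates = np.delete(np.arange(NP), i)
        # Sampling without replacement keeps the values in a row unique.
        out[i] = rng.choice(candidates, size=cols, replace=False)
    return out

r = random_int_matrix(6, 3, np.random.default_rng(1))
print(r.shape)
```

The per-row loop favors clarity over speed; a vectorized version would be preferable for large populations.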
- src.environment.optimizer.rlhpsde_optimizer.cur_to_best_1(x: numpy.ndarray, best: numpy.ndarray, F: Union[numpy.ndarray, float], rng: numpy.random.RandomState = None) → numpy.ndarray[source]¶
- Parameters:
x – The 2-D population matrix of shape [NP, dim].
best – An array of the best individual of shape [dim].
F – The mutation factor, which could be a float or a 1-D array of shape [NP].
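For orientation, the standard current-to-best/1 mutation from the differential evolution literature is v_i = x_i + F (best - x_i) + F (x_r1 - x_r2), with r1 and r2 distinct random indices different from i. The sketch below implements that textbook form under the parameter shapes listed above; the name `current_to_best_1` and the index sampling are assumptions, and the module's function may differ in detail.

```python
import numpy as np

def current_to_best_1(x, best, F, rng=None):
    """Hypothetical sketch of current-to-best/1 mutation; needs NP >= 3."""
    rng = np.random.default_rng() if rng is None else rng
    NP = x.shape[0]
    r = np.empty((NP, 2), dtype=int)
    for i in range(NP):
        # Two distinct donor indices, neither equal to the current index i.
        r[i] = rng.choice(np.delete(np.arange(NP), i), size=2, replace=False)
    F = np.asarray(F, dtype=float)
    if F.ndim == 1:
        # A per-individual factor of shape [NP] must broadcast across dim.
        F = F[:, None]
    return x + F * (best - x) + F * (x[r[:, 0]] - x[r[:, 1]])

rng = np.random.default_rng(0)
pop = rng.random((5, 3))
mutant = current_to_best_1(pop, pop[0], F=0.5, rng=rng)
print(mutant.shape)
```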