src.environment.optimizer.dedqn_optimizer¶
Module Contents¶
Classes¶
- DEDQN_Optimizer: DEDQN is a mixed mutation strategy Differential Evolution (DE) algorithm based on a deep Q-network (DQN), in which a deep reinforcement learning approach adaptively selects the mutation strategy during evolution.
Functions¶
- cal_fdc: Calculates the fitness distance correlation (FDC) for a given set of samples and their corresponding fitness values. FDC is a metric used to assess the relationship between the fitness of solutions and their distance to the best solution in the sample set.
- cal_rie: Calculates the ruggedness of information entropy (RIE) of a given fitness sequence, which quantifies the complexity or unpredictability of changes in the sequence at multiple scales.
- cal_acf: Calculates the autocorrelation coefficient (ACF) of a given fitness sequence.
- cal_nop: Calculates the number of optimal values (NOP) metric for a set of samples and their corresponding fitness values. The NOP metric quantifies how often the order of fitness values is preserved when samples are sorted by their distance to the best sample.
- random_walk_sampling: Generates a sequence of random walk samples within the bounds of a given population in a multi-dimensional space.
- cal_reward: Calculates a custom reward based on the survival status of elements and a specified pointer index.
- generate_random_int_single: Generates a random array of integers within a specified range, ensuring that a given pointer value is not included in the result.
- rand_1_single: Generates a new candidate vector using the DE/rand/1 mutation strategy for Differential Evolution (DE) optimization algorithms.
- best_2_single: Generates a new candidate solution vector using the “best/2” differential evolution strategy. This method perturbs the current best solution by adding the weighted difference of two pairs of randomly selected individuals from the population.
- cur_to_rand_1_single: Generates a new candidate vector using the “current-to-rand/1” mutation strategy, commonly used in Differential Evolution (DE) algorithms.
API¶
- src.environment.optimizer.dedqn_optimizer.cal_fdc(sample, fitness)[source]¶
Introduction¶
Calculates the fitness distance correlation (FDC) for a given set of samples and their corresponding fitness values. FDC is a metric used to assess the relationship between the fitness of solutions and their distance to the best solution in the sample set.
Args:¶
sample (np.ndarray): An array of candidate solutions, where each row represents a solution.
fitness (np.ndarray): A 1D array of fitness values corresponding to each solution in sample.
Returns:¶
float: The computed fitness distance correlation coefficient.
Notes:¶
The function normalizes the correlation by the product of the variances of distance and fitness, with a small epsilon added to avoid division by zero.
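The computation described above can be sketched as follows. This is a minimal illustration, not the module's source: the function name fdc_sketch, the use of Euclidean distance, the minimization convention (best = lowest fitness), and the epsilon value of 1e-6 are all assumptions.

```python
import numpy as np

def fdc_sketch(sample, fitness, eps=1e-6):
    """Hypothetical FDC sketch; assumes minimization and Euclidean distance."""
    sample = np.asarray(sample, dtype=float)
    fitness = np.asarray(fitness, dtype=float)
    best = np.argmin(fitness)                      # index of the best sample
    distance = np.linalg.norm(sample - sample[best], axis=-1)
    # Covariance between fitness and distance to the best sample
    cov = np.mean((fitness - fitness.mean()) * (distance - distance.mean()))
    # Normalized by the product of the variances, as the note above states
    return cov / (np.var(distance) * np.var(fitness) + eps)

# Samples on a 1-D sphere function: fitness grows with distance from the best
samples = np.array([[0.0], [1.0], [2.0], [3.0]])
values = np.array([0.0, 1.0, 4.0, 9.0])
fdc_value = fdc_sketch(samples, values)
```

A positive value here indicates that fitness worsens as samples move away from the best one.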
- src.environment.optimizer.dedqn_optimizer.cal_rie(fitness)[source]¶
Introduction¶
Calculates the ruggedness of information entropy (RIE) of a given fitness sequence, which quantifies the complexity or unpredictability of changes in the sequence at multiple scales.
Args:¶
fitness (list or np.ndarray): A sequence of numerical fitness values representing the progression of a process or optimization.
Returns:¶
float: The maximum entropy value computed across different epsilon scales, representing the RIE of the input sequence.
Notes:¶
The function uses a multi-scale approach by varying the epsilon threshold to detect significant changes in the fitness sequence.
Entropy is normalized by the logarithm of the number of possible state transitions (6 in this case).
Zero frequencies are replaced with the length of the fitness sequence to avoid log(0) issues.
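One way to picture the multi-scale entropy idea is the sketch below. It is an assumption-laden illustration rather than the module's code: the name rie_sketch, the halving epsilon schedule, and the choice to skip empty transition bins (instead of the zero-frequency replacement noted above) are all hypothetical.

```python
import numpy as np

def rie_sketch(fitness, n_scales=9):
    """Hypothetical RIE sketch: max entropy of symbol transitions over scales."""
    f = np.asarray(fitness, dtype=float)
    diff = np.diff(f)
    if diff.size < 2:
        return 0.0
    eps_max = np.max(np.abs(diff))
    # Assumed schedule: 0, eps_max/128, eps_max/64, ..., eps_max
    scales = [0.0] + [eps_max / 2 ** k for k in range(n_scales - 2, -1, -1)]
    best = 0.0
    for eps in scales:
        # Encode each step as -1 (down), 0 (flat within eps), or 1 (up)
        symbols = np.where(diff > eps, 1, np.where(diff < -eps, -1, 0))
        a, b = symbols[:-1], symbols[1:]
        entropy = 0.0
        for p in (-1, 0, 1):
            for q in (-1, 0, 1):
                if p == q:
                    continue  # only the 6 transitions between different symbols count
                prob = np.sum((a == p) & (b == q)) / len(a)
                if prob > 0:  # this sketch skips empty bins to avoid log(0)
                    entropy -= prob * np.log(prob) / np.log(6)
        best = max(best, entropy)
    return best

flat = rie_sketch([1.0, 1.0, 1.0, 1.0])
zigzag = rie_sketch([0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0])
```

A constant sequence has no symbol changes and scores zero, while an oscillating sequence scores strictly above zero.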
- src.environment.optimizer.dedqn_optimizer.cal_acf(fitness)[source]¶
Introduction¶
Calculates the autocorrelation coefficient (ACF) of a given fitness sequence.
Args:¶
fitness (np.ndarray or list of float): Sequence of fitness values for which the autocorrelation coefficient is to be computed.
Returns:¶
float: The computed autocorrelation coefficient of the input fitness sequence.
Notes:¶
A small constant (1e-6) is added to the denominator to prevent division by zero.
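As a rough illustration, an autocorrelation coefficient with the epsilon guard described above might look like this. The name acf_sketch and the choice of lag 1 are assumptions; the documentation does not state which lag is used.

```python
import numpy as np

def acf_sketch(fitness, eps=1e-6):
    """Hypothetical lag-1 autocorrelation sketch with an epsilon-guarded denominator."""
    f = np.asarray(fitness, dtype=float)
    centered = f - f.mean()
    numerator = np.sum(centered[:-1] * centered[1:])   # neighbouring products
    denominator = np.sum(centered ** 2) + eps          # eps avoids division by zero
    return numerator / denominator

alternating = acf_sketch([1.0, -1.0, 1.0, -1.0])
constant = acf_sketch([2.0, 2.0, 2.0])
```

An alternating sequence gives a negative coefficient, and a constant sequence gives zero instead of a division error.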
- src.environment.optimizer.dedqn_optimizer.cal_nop(sample, fitness)[source]¶
Introduction¶
Calculates the number of optimal values (NOP) metric for a set of samples and their corresponding fitness values. The NOP metric quantifies how often the order of fitness values is preserved when samples are sorted by their distance to the best sample.
Args:¶
sample (np.ndarray): An array of sample points, where each row corresponds to a sample.
fitness (np.ndarray): A 1D array of fitness values corresponding to each sample.
Returns:¶
float: The normalized count of order-preserving pairs, representing the proportion of times a better fitness value follows a worse one when samples are sorted by distance to the best sample.
Raises:¶
None
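The return value described above can be sketched as follows. The name nop_sketch, the Euclidean distance, the minimization convention, and the normalization by the number of samples are assumptions made for illustration.

```python
import numpy as np

def nop_sketch(sample, fitness):
    """Hypothetical NOP sketch: share of adjacent pairs where fitness improves."""
    sample = np.asarray(sample, dtype=float)
    fitness = np.asarray(fitness, dtype=float)
    best = np.argmin(fitness)
    distance = np.linalg.norm(sample - sample[best], axis=-1)
    sorted_fitness = fitness[np.argsort(distance)]     # order by distance to best
    # Count positions where a better (lower) value follows a worse one
    count = np.sum(sorted_fitness[1:] < sorted_fitness[:-1])
    return count / len(fitness)

points = np.array([[0.0], [1.0], [2.0], [3.0]])
smooth = nop_sketch(points, [0.0, 1.0, 4.0, 9.0])
bumpy = nop_sketch(points, [0.0, 5.0, 1.0, 9.0])
```

On the smooth landscape no improvement occurs while moving away from the best sample, whereas the bumpy one contains a single such step.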
- src.environment.optimizer.dedqn_optimizer.random_walk_sampling(population, dim, steps, rng)[source]¶
Introduction¶
Generates a sequence of random walk samples within the bounds of a given population in a multi-dimensional space.
Args:¶
population (np.ndarray): The current population of solutions, shape (n_individuals, dim).
dim (int): The dimensionality of the search space.
steps (int): The number of steps (samples) to generate in the random walk.
rng (np.random.Generator): A NumPy random number generator instance for reproducibility.
Returns:¶
np.ndarray: An array of shape (steps, dim) containing the random walk samples, scaled to the range defined by the population.
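A minimal sketch of such a walk is shown below. The name random_walk_sketch, the wrap-around step in the unit cube, and the use of per-dimension population minima and maxima as the bounds are assumptions, not confirmed details of the implementation.

```python
import numpy as np

def random_walk_sketch(population, dim, steps, rng):
    """Hypothetical random walk in the unit cube, rescaled to the population's range."""
    lo = population.min(axis=0)
    hi = population.max(axis=0)
    point = rng.random(dim)                 # random start in [0, 1)^dim
    walk = [point]
    for _ in range(steps - 1):
        # Assumed step rule: add a random move and wrap back into [0, 1)
        point = (point + rng.random(dim)) % 1.0
        walk.append(point)
    return lo + (hi - lo) * np.array(walk)

rng = np.random.default_rng(0)
pop = rng.uniform(-5.0, 5.0, size=(10, 3))
walk = random_walk_sketch(pop, dim=3, steps=20, rng=rng)
```

The result has one row per step and stays inside the bounding box of the population.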
- src.environment.optimizer.dedqn_optimizer.cal_reward(survival, pointer)[source]¶
Introduction¶
Calculates a custom reward based on the survival status of elements and a specified pointer index.
Args:¶
survival (list): A sequence where each element represents the survival status (typically 1 or another integer) of an entity.
pointer (int): The index in the survival list that is treated specially in the reward calculation.
Returns:¶
float: The computed reward value, normalized by the length of the survival list.
Notes:¶
If the element at the pointer index has a survival value of 1, the reward is incremented by 1.
For all other indices, the reward is incremented by the reciprocal of their survival value.
The final reward is averaged over the total number of elements in the survival list.
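The three notes above can be read as the following sketch. The name reward_sketch is hypothetical, and the sketch assumes that the pointer element contributes nothing when its survival value is not 1, which the notes do not state explicitly.

```python
def reward_sketch(survival, pointer):
    """Hypothetical reading of the reward rule described in the notes."""
    reward = 0.0
    for i, s in enumerate(survival):
        if i == pointer:
            if s == 1:
                reward += 1.0      # pointer element with survival value 1
            # assumed: otherwise the pointer element adds nothing
        else:
            reward += 1.0 / s      # reciprocal of the survival value
    return reward / len(survival)

r_hit = reward_sketch([1, 2, 4], pointer=0)    # (1 + 1/2 + 1/4) / 3
r_miss = reward_sketch([1, 2, 4], pointer=1)   # (1/1 + 1/4) / 3
```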
- class src.environment.optimizer.dedqn_optimizer.DEDQN_Optimizer(config)[source]¶
Bases: src.environment.optimizer.learnable_optimizer.Learnable_Optimizer
Introduction¶
DEDQN is a mixed mutation strategy Differential Evolution (DE) algorithm based on deep Q-network (DQN), in which a deep reinforcement learning approach realizes the adaptive selection of mutation strategy in the evolution process.
Original paper¶
“Differential evolution with mixed mutation strategy based on deep reinforcement learning.” Applied Soft Computing (2021).
Initialization
Introduction¶
Initializes the optimizer with the provided configuration, setting up key parameters and internal state variables for the optimization process.
Args:¶
config (object): Config object containing parameters for the optimizer.
Attributes needed for the DEDQN_Optimizer are the following:
log_interval (int): Interval at which logs are recorded. Default is config.maxFEs/config.n_logpoint.
n_logpoint (int): Number of log points to record. Default is 50.
full_meta_data (bool): Flag indicating whether to use full meta data. Default is False.
maxFEs (int): Maximum number of function evaluations.
Built-in Attribute:¶
self.__config: Stores the configuration object.
self.__NP: Population size for the optimizer. Default is 100.
self.__F: Mutation factor. Default is 0.5.
self.__Cr: Crossover rate. Default is 0.5.
self.__maxFEs: Maximum number of function evaluations.
self.__rwsteps: Number of random walk steps. Default is 100.
self.__solution_pointer: Index indicating which solution receives the action. Default is 0.
self.__population: Stores the current population of solutions. Default is None.
self.__cost: Stores the cost values for the population. Default is None.
self.__gbest: Stores the global best solution found. Default is None.
self.__gbest_cost: Stores the cost of the global best solution. Default is None.
self.__state: Stores the current state of the optimizer. Default is None.
self.__survival: Stores survival information for the population. Default is None.
self.fes: Tracks the number of function evaluations performed. Default is None.
self.cost: Tracks the cost values. Default is None.
self.log_index: Tracks the logging index. Default is None.
self.log_interval: Interval for logging progress.
Returns:¶
None
- __cal_feature(problem)[source]¶
Introduction¶
Calculates a set of feature descriptors for the current population in the optimization process, including FDC, RIE, ACF, and NOP, based on random walk sampling and cost evaluation.
Args:¶
problem (object): The optimization problem instance, which must provide an eval method for evaluating solutions and an optimum attribute for the known optimum value (if available).
Returns:¶
np.ndarray: A NumPy array containing the computed feature values [fdc, rie, acf, nop].
Raises:¶
AttributeError: If the problem object does not have the required eval method or optimum attribute.
- init_population(problem)[source]¶
Introduction¶
Initializes the population for the optimizer based on the provided problem definition. Sets up the initial population, evaluates their costs, determines the global best solution, and prepares meta-data if required.
Args:¶
problem (object): Problem object, which has the attributes dim, lb, ub, and optimum, and a method eval() for evaluating solutions.
Built-in Attribute:¶
self.__dim (int): Dimensionality of the problem.
self.__population (np.ndarray): The initialized population within the problem bounds.
self.__survival (np.ndarray): Survival status of each individual in the population.
self.__cost (np.ndarray): Cost values of the population.
self.__gbest (np.ndarray): The best solution found so far.
self.__gbest_cost (float): The cost of the best solution.
self.fes (int): Function evaluation count.
self.log_index (int): Logging index for tracking progress. Default is 1.
self.cost (list): History of global best costs.
self.__state (np.ndarray): Feature vector representing the current state.
self.meta_X (list, optional): History of populations (if full meta-data is enabled).
self.meta_Cost (list, optional): History of costs (if full meta-data is enabled).
Returns:¶
np.ndarray: The feature vector representing the current state of the optimizer.
Raises:¶
None explicitly, but may raise exceptions from problem.eval() or if problem attributes are missing.
- update(action, problem)[source]¶
Introduction¶
Updates the current solution in the population using a specified mutation and crossover strategy, evaluates the new solution, updates the best solution found, and manages logging and meta-data for the optimization process.
Args:¶
action (int): The index of the mutation strategy to use (0: rand/1, 1: current-to-rand/1, 2: best/2).
problem (object): The optimization problem instance, which must provide lower and upper bounds (lb, ub), an evaluation function (eval), and optionally an optimum value (optimum).
Returns:¶
state (np.ndarray): The current feature state of the optimizer.
reward (float): The reward calculated for the current update step.
is_done (bool): Whether the optimization process has reached its termination condition.
info (dict): Additional information (currently empty).
Raises:¶
ValueError: If the provided action is not one of the supported mutation strategies (0, 1, or 2).
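After the mutation chosen by action produces a mutant vector, a DE step typically clips it to the bounds, crosses it with the target, and keeps the better of the two. The sketch below illustrates that flow under stated assumptions: the name crossover_and_select, binomial crossover with one forced mutant coordinate, and greedy replacement are standard DE conventions assumed here, not details confirmed by this page.

```python
import numpy as np

def crossover_and_select(x, v, x_cost, objective, lb, ub, Cr, rng):
    """Hypothetical sketch: clip, binomial crossover, then greedy selection."""
    v = np.clip(v, lb, ub)                       # keep the mutant inside the bounds
    mask = rng.random(x.size) < Cr               # take mutant genes with probability Cr
    mask[rng.integers(x.size)] = True            # force at least one mutant gene
    trial = np.where(mask, v, x)
    trial_cost = objective(trial)
    if trial_cost <= x_cost:                     # assumed greedy replacement (minimization)
        return trial, trial_cost, True
    return x, x_cost, False

def sphere(z):
    return float(np.sum(z ** 2))

rng = np.random.default_rng(1)
target = np.array([2.0, 2.0])
better = crossover_and_select(target, np.array([0.0, 0.0]), sphere(target), sphere, -5.0, 5.0, 1.0, rng)
worse = crossover_and_select(target, np.array([10.0, 10.0]), sphere(target), sphere, -5.0, 5.0, 0.0, rng)
```

The third element of each result reports whether the trial replaced the target.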
- src.environment.optimizer.dedqn_optimizer.clipping(x: Union[numpy.ndarray, Iterable], lb: Union[numpy.ndarray, Iterable, int, float, None], ub: Union[numpy.ndarray, Iterable, int, float, None]) numpy.ndarray[source]¶
- src.environment.optimizer.dedqn_optimizer.binomial(x: numpy.ndarray, v: numpy.ndarray, Cr: Union[numpy.ndarray, float], rng) numpy.ndarray[source]¶
- src.environment.optimizer.dedqn_optimizer.generate_random_int_single(NP: int, cols: int, pointer: int, rng: numpy.random.RandomState = None) numpy.ndarray[source]¶
Introduction¶
Generates a random array of integers within a specified range, ensuring that a given pointer value is not included in the result.
Args:¶
NP (int): The upper bound (exclusive) for the random integers.
cols (int): The number of random integers to generate.
pointer (int): The integer value that must not appear in the generated array.
rng (np.random.RandomState, optional): A random number generator instance. Defaults to None.
Returns:¶
np.ndarray: An array of randomly generated integers of length cols, excluding the pointer value.
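A simple way to obtain such an array is rejection sampling, sketched below. The name random_indices_excluding and the use of np.random.Generator (rather than the documented RandomState) are assumptions. Like the documented behavior, this sketch only guarantees that pointer is absent; it does not enforce distinct indices.

```python
import numpy as np

def random_indices_excluding(NP, cols, pointer, rng=None):
    """Hypothetical sketch: redraw until the pointer value does not appear."""
    rng = np.random.default_rng() if rng is None else rng
    r = rng.integers(low=0, high=NP, size=cols)
    while pointer in r:                      # reject any draw containing pointer
        r = rng.integers(low=0, high=NP, size=cols)
    return r

indices = random_indices_excluding(NP=10, cols=3, pointer=4, rng=np.random.default_rng(0))
```

The loop assumes NP is greater than 1, otherwise no valid draw exists.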
- src.environment.optimizer.dedqn_optimizer.rand_1_single(x: numpy.ndarray, F: float, pointer: int, r: numpy.ndarray = None, rng: numpy.random.RandomState = None) numpy.ndarray[source]¶
Introduction¶
Generates a new candidate vector using the DE/rand/1 mutation strategy for Differential Evolution (DE) optimization algorithms.
Args:¶
x (np.ndarray): Population array of candidate solutions, where each row represents an individual.
F (float): Differential weight, a scaling factor for the mutation.
pointer (int): Index of the current target vector in the population.
r (np.ndarray, optional): Array of three unique indices for mutation. If None, random indices are generated.
rng (np.random.RandomState, optional): Random number generator for reproducibility.
Returns:¶
np.ndarray: The mutated vector generated by the DE/rand/1 strategy.
Raises:¶
ValueError: If the generated or provided indices in r are not unique or include the pointer index.
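For reference, the textbook DE/rand/1 rule is v = x[r1] + F * (x[r2] - x[r3]). The sketch below applies it with explicitly supplied indices; the name rand_1_sketch and the exact validation are assumptions and may differ from the module's function.

```python
import numpy as np

def rand_1_sketch(x, F, pointer, r):
    """Hypothetical DE/rand/1 sketch with explicit index validation."""
    r = np.asarray(r)
    if len(set(r.tolist())) != 3 or pointer in r:
        raise ValueError("r must hold three distinct indices that exclude pointer")
    # Base vector plus one scaled difference vector
    return x[r[0]] + F * (x[r[1]] - x[r[2]])

population = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [4.0, 4.0]])
mutant = rand_1_sketch(population, F=0.5, pointer=0, r=[1, 2, 3])
# [1, 1] + 0.5 * ([2, 2] - [4, 4]) = [0, 0]
```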
- src.environment.optimizer.dedqn_optimizer.best_2_single(x: numpy.ndarray, best: numpy.ndarray, F: float, pointer: int, r: numpy.ndarray = None, rng: numpy.random.RandomState = None) numpy.ndarray[source]¶
Introduction¶
Generates a new candidate solution vector using the “best/2” differential evolution strategy. This method perturbs the current best solution by adding the weighted difference of two pairs of randomly selected individuals from the population.
Args:¶
x (np.ndarray): The population array of candidate solutions.
best (np.ndarray): The current best solution vector.
F (float): The differential weight (scaling factor) applied to the difference vectors.
pointer (int): The index of the target individual in the population.
r (np.ndarray, optional): An array of 4 unique random indices for selecting individuals from the population. If None, random indices are generated.
rng (np.random.RandomState, optional): Random number generator for reproducibility. If None, the default NumPy RNG is used.
Returns:¶
np.ndarray: The newly generated candidate solution vector.
Raises:¶
ValueError: If the population size is less than 4 or if invalid indices are provided.
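The textbook DE/best/2 rule is v = best + F * (x[r1] - x[r2]) + F * (x[r3] - x[r4]). A minimal sketch with explicitly supplied indices follows; the name best_2_sketch is hypothetical, and the module's own function may order or validate the indices differently.

```python
import numpy as np

def best_2_sketch(x, best, F, r):
    """Hypothetical DE/best/2 sketch: best vector plus two scaled differences."""
    r1, r2, r3, r4 = r
    return best + F * (x[r1] - x[r2]) + F * (x[r3] - x[r4])

population = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [4.0, 4.0], [7.0, 7.0]])
mutant = best_2_sketch(population, best=population[0], F=0.5, r=[1, 2, 3, 4])
# [0, 0] + 0.5 * ([1, 1] - [2, 2]) + 0.5 * ([4, 4] - [7, 7]) = [-2, -2]
```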
- src.environment.optimizer.dedqn_optimizer.cur_to_rand_1_single(x: numpy.ndarray, F: float, pointer: int, r: numpy.ndarray = None, rng: numpy.random.RandomState = None) numpy.ndarray[source]¶
Introduction¶
Generates a new candidate vector using the “current-to-rand/1” mutation strategy, commonly used in Differential Evolution (DE) algorithms.
Args:¶
x (np.ndarray): Population array of candidate solutions, where each row represents an individual.
F (float): Differential weight, a scaling factor for the mutation.
pointer (int): Index of the current target vector in the population.
r (np.ndarray, optional): Array of three unique random indices for mutation. If None, random indices are generated.
rng (np.random.RandomState, optional): Random number generator for reproducibility. If None, the global numpy RNG is used.
Returns:¶
np.ndarray: The mutated candidate vector generated by the current-to-rand/1 strategy.
Raises:¶
ValueError: If the generated or provided indices in r are not unique or include pointer.
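A common form of current-to-rand/1 is v = x[i] + F * (x[r1] - x[i]) + F * (x[r2] - x[r3]), where i is the pointer. The sketch below uses that form with a single scaling factor F for both terms, which is an assumption; some formulations use a separate coefficient for the first term. The name cur_to_rand_1_sketch is hypothetical.

```python
import numpy as np

def cur_to_rand_1_sketch(x, F, pointer, r):
    """Hypothetical current-to-rand/1 sketch using one factor F for both terms."""
    r1, r2, r3 = r
    current = x[pointer]
    # Move the current vector toward a random one, then add a scaled difference
    return current + F * (x[r1] - current) + F * (x[r2] - x[r3])

population = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [4.0, 4.0]])
mutant = cur_to_rand_1_sketch(population, F=0.5, pointer=0, r=[1, 2, 3])
# [0, 0] + 0.5 * ([1, 1] - [0, 0]) + 0.5 * ([2, 2] - [4, 4]) = [-0.5, -0.5]
```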