src.environment.optimizer.dedqn_optimizer

Module Contents

Classes

DEDQN_Optimizer

Introduction

DEDQN is a Differential Evolution (DE) algorithm with a mixed mutation strategy. A deep Q-network (DQN) agent, trained with deep reinforcement learning, adaptively selects the mutation strategy during the evolution process.

Functions

cal_fdc

Introduction

Calculates the fitness distance correlation (FDC) for a given set of samples and their corresponding fitness values. FDC is a metric used to assess the relationship between the fitness of solutions and their distance to the best solution in the sample set.

cal_rie

Introduction

Calculates the ruggedness of information entropy (RIE) of a given fitness sequence, which quantifies the complexity or unpredictability of changes in the sequence at multiple scales.

cal_acf

Introduction

Calculates the autocorrelation coefficient (ACF) of a given fitness sequence.

cal_nop

Introduction

Calculates the number of optimal values (NOP) metric for a set of samples and their corresponding fitness values. The NOP metric quantifies how often the order of fitness values is preserved when samples are sorted by their distance to the best sample.

random_walk_sampling

Introduction

Generates a sequence of random walk samples within the bounds of a given population in a multi-dimensional space.

cal_reward

Introduction

Calculates a custom reward based on the survival status of elements and a specified pointer index.

clipping

binomial

generate_random_int_single

Introduction

Generates a random array of integers within a specified range, ensuring that a given pointer value is not included in the result.

rand_1_single

Introduction

Generates a new candidate vector using the DE/rand/1 mutation strategy for Differential Evolution (DE) optimization algorithms.

best_2_single

Introduction

Generates a new candidate solution vector using the “best/2” differential evolution strategy. This method perturbs the current best solution by adding the weighted difference of two pairs of randomly selected individuals from the population.

cur_to_rand_1_single

Introduction

Generates a new candidate vector using the “current-to-rand/1” mutation strategy, commonly used in Differential Evolution (DE) algorithms.

API

src.environment.optimizer.dedqn_optimizer.cal_fdc(sample, fitness)[source]

Introduction

Calculates the fitness distance correlation (FDC) for a given set of samples and their corresponding fitness values. FDC is a metric used to assess the relationship between the fitness of solutions and their distance to the best solution in the sample set.

Args:

  • sample (np.ndarray): An array of candidate solutions, where each row represents a solution.

  • fitness (np.ndarray): A 1D array of fitness values corresponding to each solution in sample.

Returns:

  • float: The computed fitness distance correlation coefficient.

Notes:

  • The function normalizes the correlation by the product of the variances of distance and fitness, with a small epsilon added to avoid division by zero.
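
As a rough illustration of the metric described above, the following sketch computes a textbook fitness distance correlation. The name fdc_sketch is hypothetical, minimization is assumed, and the covariance is normalized by the product of standard deviations, which may differ from the normalization this module applies.

```python
import numpy as np

def fdc_sketch(sample, fitness, eps=1e-6):
    """Textbook FDC relative to the best (lowest-cost) sample; illustrative only."""
    sample = np.asarray(sample, dtype=float)
    fitness = np.asarray(fitness, dtype=float)
    best = sample[np.argmin(fitness)]              # assumption: lower fitness is better
    dist = np.linalg.norm(sample - best, axis=1)   # Euclidean distance to the best sample
    cov = np.mean((fitness - fitness.mean()) * (dist - dist.mean()))
    # eps keeps the ratio finite when fitness or distance is constant
    return cov / (fitness.std() * dist.std() + eps)
```

A value near 1 means fitness worsens steadily with distance from the best sample, which usually indicates an easier landscape for a minimizer.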

src.environment.optimizer.dedqn_optimizer.cal_rie(fitness)[source]

Introduction

Calculates the ruggedness of information entropy (RIE) of a given fitness sequence, which quantifies the complexity or unpredictability of changes in the sequence at multiple scales.

Args:

  • fitness (list or np.ndarray): A sequence of numerical fitness values representing the progression of a process or optimization.

Returns:

  • float: The maximum entropy value computed across different epsilon scales, representing the RIE of the input sequence.

Notes:

  • The function uses a multi-scale approach by varying the epsilon threshold to detect significant changes in the fitness sequence.

  • Entropy is normalized by the logarithm of the number of possible state transitions (6 in this case).

  • Zero frequencies are replaced with the length of the fitness sequence to avoid log(0) issues.
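
The multi-scale entropy idea can be sketched as follows. This is not the module's code: rie_sketch and n_scales are hypothetical, the epsilon schedule (largest step halved repeatedly, then zero) is an assumption, and zero-frequency transitions are simply skipped here rather than replaced as the note above describes.

```python
import numpy as np

def rie_sketch(fitness, n_scales=8):
    """Maximum information entropy of slope changes over several epsilon scales."""
    f = np.asarray(fitness, dtype=float)
    diffs = np.diff(f)
    eps_star = np.max(np.abs(diffs))                       # largest step in the sequence
    scales = [eps_star / 2 ** k for k in range(n_scales)] + [0.0]
    best = 0.0
    for eps in scales:
        # encode each step as -1 (down), 0 (flat within eps) or 1 (up)
        symbols = np.where(diffs > eps, 1, np.where(diffs < -eps, -1, 0))
        pairs = list(zip(symbols[:-1], symbols[1:]))
        counts = {}
        for p, q in pairs:
            if p != q:                                     # only the 6 transitions with p != q count
                counts[(p, q)] = counts.get((p, q), 0) + 1
        entropy = 0.0
        for c in counts.values():
            prob = c / len(pairs)
            entropy -= prob * np.log(prob) / np.log(6)     # log base 6 normalization
        best = max(best, entropy)
    return best
```

With this encoding a flat or strictly monotone sequence yields 0, while a sequence that alternates up and down yields log(2)/log(6), since only two of the six transition types occur.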

src.environment.optimizer.dedqn_optimizer.cal_acf(fitness)[source]

Introduction

Calculates the autocorrelation coefficient (ACF) of a given fitness sequence.

Args:

  • fitness (np.ndarray or list of float): Sequence of fitness values for which the autocorrelation coefficient is to be computed.

Returns:

  • float: The computed autocorrelation coefficient of the input fitness sequence.

Notes:

  • A small constant (1e-6) is added to the denominator to prevent division by zero.
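
A minimal sketch of an autocorrelation coefficient with the same kind of guard in the denominator is shown below. The name acf_sketch is hypothetical, and the choice of lag 1 is an assumption, since the lag is not stated in this reference.

```python
import numpy as np

def acf_sketch(fitness, eps=1e-6):
    """Lag-1 autocorrelation of a fitness sequence; illustrative only."""
    f = np.asarray(fitness, dtype=float)
    centered = f - f.mean()
    num = np.sum(centered[:-1] * centered[1:])   # products of neighbouring deviations
    den = np.sum(centered ** 2) + eps            # eps prevents division by zero
    return num / den
```

Smooth sequences give values close to 1, while sequences that flip direction at every step give negative values.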

src.environment.optimizer.dedqn_optimizer.cal_nop(sample, fitness)[source]

Introduction

Calculates the number of optimal values (NOP) metric for a set of samples and their corresponding fitness values. The NOP metric quantifies how often the order of fitness values is preserved when samples are sorted by their distance to the best sample.

Args:

  • sample (np.ndarray): An array of sample points, where each row corresponds to a sample.

  • fitness (np.ndarray): A 1D array of fitness values corresponding to each sample.

Returns:

  • float: The normalized count of order-preserving pairs, representing the proportion of times a better fitness value follows a worse one when samples are sorted by distance to the best sample.

Raises:

  • None
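
One possible reading of this metric is sketched below. The name nop_sketch is hypothetical; the sketch assumes minimization, counts adjacent pairs in the distance-sorted order where the farther sample has the lower fitness, and divides by the number of samples. The module may count or normalize differently.

```python
import numpy as np

def nop_sketch(sample, fitness):
    """Share of adjacent, distance-sorted samples where fitness drops; illustrative only."""
    sample = np.asarray(sample, dtype=float)
    fitness = np.asarray(fitness, dtype=float)
    best = sample[np.argmin(fitness)]                    # assumption: lower fitness is better
    dist = np.linalg.norm(sample - best, axis=1)
    ordered = fitness[np.argsort(dist, kind="stable")]   # fitness ordered by distance to best
    drops = np.sum(ordered[1:] < ordered[:-1])           # farther sample is better than nearer one
    return drops / len(fitness)
```

On a unimodal landscape the fitness grows with distance from the best sample, so this count stays near 0. Larger values hint at additional local optima.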

src.environment.optimizer.dedqn_optimizer.random_walk_sampling(population, dim, steps, rng)[source]

Introduction

Generates a sequence of random walk samples within the bounds of a given population in a multi-dimensional space.

Args:

  • population (np.ndarray): The current population of solutions, shape (n_individuals, dim).

  • dim (int): The dimensionality of the search space.

  • steps (int): The number of steps (samples) to generate in the random walk.

  • rng (np.random.Generator): A NumPy random number generator instance for reproducibility.

Returns:

  • np.ndarray: An array of shape (steps, dim) containing the random walk samples, scaled to the range defined by the population.
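
The sampling scheme could look roughly like the sketch below. The name random_walk_sketch is hypothetical, and the wrap-around step in the unit hypercube is an assumption about how the walk stays inside the population's bounding box.

```python
import numpy as np

def random_walk_sketch(population, dim, steps, rng):
    """Random walk in [0, 1)^dim, rescaled to the population's per-dimension range."""
    population = np.asarray(population, dtype=float)
    pmin, pmax = population.min(axis=0), population.max(axis=0)
    point = rng.random(dim)                       # random start in the unit hypercube
    walk = np.empty((steps, dim))
    for t in range(steps):
        walk[t] = point
        point = (point + rng.random(dim)) % 1.0   # step and wrap around the borders
    return pmin + (pmax - pmin) * walk            # scale to the population range

pop = np.array([[0.0, -1.0], [2.0, 3.0], [1.0, 1.0]])
samples = random_walk_sketch(pop, dim=2, steps=10, rng=np.random.default_rng(0))
```

Passing a seeded np.random.Generator, as in the usage lines, makes the walk reproducible.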

src.environment.optimizer.dedqn_optimizer.cal_reward(survival, pointer)[source]

Introduction

Calculates a custom reward based on the survival status of elements and a specified pointer index.

Args:

  • survival (list): A sequence where each element represents the survival status (typically 1 or another integer) of an entity.

  • pointer (int): The index in the survival list that is treated specially in the reward calculation.

Returns:

  • float: The computed reward value, normalized by the length of the survival list.

Notes:

  • If the element at the pointer index has a survival value of 1, the reward is incremented by 1.

  • For all other indices, the reward is incremented by the reciprocal of their survival value.

  • The final reward is averaged over the total number of elements in the survival list.
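
The three notes above translate into the following sketch. The name reward_sketch is hypothetical, and the case where the pointer element has a survival value other than 1 is assumed to contribute nothing, since the notes do not cover it.

```python
def reward_sketch(survival, pointer):
    """Reward following the notes above; illustrative only."""
    reward = 0.0
    for i, s in enumerate(survival):
        if i == pointer:
            if s == 1:
                reward += 1.0     # pointer element with survival value 1
        else:
            reward += 1.0 / s     # every other element adds the reciprocal of its survival value
    return reward / len(survival)  # average over all elements
```

For survival = [1, 2, 4] and pointer = 0 this gives (1 + 1/2 + 1/4) / 3.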

class src.environment.optimizer.dedqn_optimizer.DEDQN_Optimizer(config)[source]

Bases: src.environment.optimizer.learnable_optimizer.Learnable_Optimizer

Introduction

DEDQN is a Differential Evolution (DE) algorithm with a mixed mutation strategy. A deep Q-network (DQN) agent, trained with deep reinforcement learning, adaptively selects the mutation strategy during the evolution process.

Original paper

“Differential evolution with mixed mutation strategy based on deep reinforcement learning.” Applied Soft Computing (2021).

Initialization

Introduction

Initializes the optimizer with the provided configuration, setting up key parameters and internal state variables for the optimization process.

Args:

  • config (object): Config object containing parameters for the optimizer.

    • Attributes needed for the DEDQN_Optimizer are the following:

      • log_interval (int): Interval at which logs are recorded. Default is config.maxFEs/config.n_logpoint.

      • n_logpoint (int): Number of log points to record. Default is 50.

      • full_meta_data (bool): Flag indicating whether to use full meta data. Default is False.

      • maxFEs (int): Maximum number of function evaluations.

Built-in Attribute:

  • self.__config: Stores the configuration object.

  • self.__NP: Population size for the optimizer. Default is 100.

  • self.__F: Mutation factor. Default is 0.5.

  • self.__Cr: Crossover rate. Default is 0.5.

  • self.__maxFEs: Maximum number of function evaluations.

  • self.__rwsteps: Number of random walk steps. Default is 100.

  • self.__solution_pointer: Index indicating which solution receives the action. Default is 0.

  • self.__population: Stores the current population of solutions. Default is None.

  • self.__cost: Stores the cost values for the population. Default is None.

  • self.__gbest: Stores the global best solution found. Default is None.

  • self.__gbest_cost: Stores the cost of the global best solution. Default is None.

  • self.__state: Stores the current state of the optimizer. Default is None.

  • self.__survival: Stores survival information for the population. Default is None.

  • self.fes: Tracks the number of function evaluations performed. Default is None.

  • self.cost: Tracks the cost values. Default is None.

  • self.log_index: Tracks the logging index. Default is None.

  • self.log_interval: Interval for logging progress.

Returns:

  • None

__cal_feature(problem)[source]

Introduction

Calculates a set of feature descriptors for the current population in the optimization process, including FDC, RIE, ACF, and NOP, based on random walk sampling and cost evaluation.

Args:

  • problem (object): The optimization problem instance, which must provide an eval method for evaluating solutions and an optimum attribute for the known optimum value (if available).

Returns:

  • np.ndarray: A NumPy array containing the computed feature values [fdc, rie, acf, nop].

Raises:

  • AttributeError: If the problem object does not have the required eval method or optimum attribute.

init_population(problem)[source]

Introduction

Initializes the population for the optimizer based on the provided problem definition. Sets up the initial population, evaluates their costs, determines the global best solution, and prepares meta-data if required.

Args:

  • problem (object): The problem object, which must provide the attributes dim, lb, ub, and optimum, and an eval() method for evaluating solutions.

Built-in Attribute:

  • self.__dim (int): Dimensionality of the problem.

  • self.__population (np.ndarray): The initialized population within the problem bounds.

  • self.__survival (np.ndarray): Survival status of each individual in the population.

  • self.__cost (np.ndarray): Cost values of the population.

  • self.__gbest (np.ndarray): The best solution found so far.

  • self.__gbest_cost (float): The cost of the best solution.

  • self.fes (int): Function evaluation count.

  • self.log_index (int): Logging index for tracking progress. Default is 1.

  • self.cost (list): History of global best costs.

  • self.__state (np.ndarray): Feature vector representing the current state.

  • self.meta_X (list, optional): History of populations (if full meta-data is enabled).

  • self.meta_Cost (list, optional): History of costs (if full meta-data is enabled).

Returns:

  • np.ndarray: The feature vector representing the current state of the optimizer.

Raises:

  • None explicitly, but may raise exceptions from problem.eval() or if problem attributes are missing.

update(action, problem)[source]

Introduction

Updates the current solution in the population using a specified mutation and crossover strategy, evaluates the new solution, updates the best solution found, and manages logging and meta-data for the optimization process.

Args:

  • action (int): The index of the mutation strategy to use (0: rand/1, 1: current-to-rand/1, 2: best/2).

  • problem (object): The optimization problem instance, which must provide lower and upper bounds (lb, ub), an evaluation function (eval), and optionally an optimum value (optimum).

Returns:

  • state (np.ndarray): The current feature state of the optimizer.

  • reward (float): The reward calculated for the current update step.

  • is_done (bool): Whether the optimization process has reached its termination condition.

  • info (dict): Additional information (currently empty).

Raises:

  • ValueError: If the provided action is not one of the supported mutation strategies (0, 1, or 2).
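
To make the flow of a single update concrete, here is a self-contained sketch of one DE step driven by an action index. It is not the class's implementation: de_step_sketch, objective, and the greedy replacement rule are assumptions, and the state features, reward, and logging are left out.

```python
import numpy as np

def de_step_sketch(pop, cost, pointer, action, F, Cr, lb, ub, objective, rng):
    """Mutate, cross over, clip and greedily replace individual `pointer`."""
    n, dim = pop.shape
    others = np.delete(np.arange(n), pointer)
    r = rng.choice(others, size=4, replace=False)    # distinct indices, none equal to pointer
    x, best = pop[pointer], pop[np.argmin(cost)]
    if action == 0:                                  # rand/1
        v = pop[r[0]] + F * (pop[r[1]] - pop[r[2]])
    elif action == 1:                                # current-to-rand/1
        v = x + F * (pop[r[0]] - x) + F * (pop[r[1]] - pop[r[2]])
    elif action == 2:                                # best/2
        v = best + F * (pop[r[0]] - pop[r[1]]) + F * (pop[r[2]] - pop[r[3]])
    else:
        raise ValueError(f"unsupported action: {action}")
    mask = rng.random(dim) < Cr                      # binomial crossover mask
    mask[rng.integers(dim)] = True                   # at least one gene comes from v
    u = np.clip(np.where(mask, v, x), lb, ub)        # keep the trial inside the bounds
    fu = objective(u)
    pop, cost = pop.copy(), cost.copy()
    if fu <= cost[pointer]:                          # assumption: greedy selection
        pop[pointer], cost[pointer] = u, fu
    return pop, cost
```

The sketch needs a population of at least five individuals so that four distinct partner indices can be drawn.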

src.environment.optimizer.dedqn_optimizer.clipping(x: Union[numpy.ndarray, Iterable], lb: Union[numpy.ndarray, Iterable, int, float, None], ub: Union[numpy.ndarray, Iterable, int, float, None]) numpy.ndarray[source]

src.environment.optimizer.dedqn_optimizer.binomial(x: numpy.ndarray, v: numpy.ndarray, Cr: Union[numpy.ndarray, float], rng) numpy.ndarray[source]

src.environment.optimizer.dedqn_optimizer.generate_random_int_single(NP: int, cols: int, pointer: int, rng: numpy.random.RandomState = None) numpy.ndarray[source]

Introduction

Generates a random array of integers within a specified range, ensuring that a given pointer value is not included in the result.

Args:

  • NP (int): The upper bound (exclusive) for the random integers.

  • cols (int): The number of random integers to generate.

  • pointer (int): The integer value that must not appear in the generated array.

  • rng (np.random.RandomState, optional): A random number generator instance. Defaults to None.

Returns:

  • np.ndarray: An array of randomly generated integers of length cols, excluding the pointer value.
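
One simple way to obtain such an array is to redraw until the pointer value is absent, as in the sketch below. The name random_indices_sketch is hypothetical and the redraw approach is an assumption. Note that this sketch does not enforce uniqueness among the drawn integers.

```python
import numpy as np

def random_indices_sketch(NP, cols, pointer, rng=None):
    """Draw `cols` integers from [0, NP) and redraw while `pointer` is present."""
    rng = np.random.RandomState() if rng is None else rng
    r = rng.randint(low=0, high=NP, size=cols)
    while pointer in r:                      # reject any draw that contains pointer
        r = rng.randint(low=0, high=NP, size=cols)
    return r

idx = random_indices_sketch(NP=10, cols=3, pointer=4, rng=np.random.RandomState(0))
```

The loop only terminates if NP is at least 2, so that some value other than pointer can be drawn.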

src.environment.optimizer.dedqn_optimizer.rand_1_single(x: numpy.ndarray, F: float, pointer: int, r: numpy.ndarray = None, rng: numpy.random.RandomState = None) numpy.ndarray[source]

Introduction

Generates a new candidate vector using the DE/rand/1 mutation strategy for Differential Evolution (DE) optimization algorithms.

Args:

  • x (np.ndarray): Population array of candidate solutions, where each row represents an individual.

  • F (float): Differential weight, a scaling factor for the mutation.

  • pointer (int): Index of the current target vector in the population.

  • r (np.ndarray, optional): Array of three unique indices for mutation. If None, random indices are generated.

  • rng (np.random.RandomState, optional): Random number generator for reproducibility.

Returns:

  • np.ndarray: The mutated vector generated by the DE/rand/1 strategy.

Raises:

  • ValueError: If the generated or provided indices in r are not unique or include the pointer index.
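
The DE/rand/1 rule is v = x[r1] + F * (x[r2] - x[r3]). The sketch below applies it with either supplied or randomly drawn indices. The name rand_1_sketch and the way missing indices are drawn are assumptions, not the module's code.

```python
import numpy as np

def rand_1_sketch(x, F, pointer, r=None, rng=None):
    """DE/rand/1 mutant vector for the individual at `pointer`; illustrative only."""
    x = np.asarray(x, dtype=float)
    if r is None:
        rng = np.random.RandomState() if rng is None else rng
        candidates = np.delete(np.arange(len(x)), pointer)   # exclude the target index
        r = rng.choice(candidates, size=3, replace=False)    # three distinct indices
    r1, r2, r3 = r
    return x[r1] + F * (x[r2] - x[r3])

x = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 4.0], [3.0, 9.0]])
v = rand_1_sketch(x, F=0.5, pointer=0, r=[1, 2, 3])
```

In the usage lines, v equals [0.5, -1.5], because [1, 1] + 0.5 * ([2, 4] - [3, 9]) = [0.5, -1.5].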

src.environment.optimizer.dedqn_optimizer.best_2_single(x: numpy.ndarray, best: numpy.ndarray, F: float, pointer: int, r: numpy.ndarray = None, rng: numpy.random.RandomState = None) numpy.ndarray[source]

Introduction

Generates a new candidate solution vector using the “best/2” differential evolution strategy. This method perturbs the current best solution by adding the weighted difference of two pairs of randomly selected individuals from the population.

Args:

  • x (np.ndarray): The population array of candidate solutions.

  • best (np.ndarray): The current best solution vector.

  • F (float): The differential weight (scaling factor) applied to the difference vectors.

  • pointer (int): The index of the target individual in the population.

  • r (np.ndarray, optional): An array of 4 unique random indices for selecting individuals from the population. If None, random indices are generated.

  • rng (np.random.RandomState, optional): Random number generator for reproducibility. If None, the default NumPy RNG is used.

Returns:

  • np.ndarray: The newly generated candidate solution vector.

Raises:

  • ValueError: If the population size is less than 4 or if invalid indices are provided.
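
The best/2 rule is v = best + F * (x[r1] - x[r2]) + F * (x[r3] - x[r4]). A minimal sketch follows. The name best_2_sketch and the index-drawing fallback are assumptions, not the module's code.

```python
import numpy as np

def best_2_sketch(x, best, F, pointer, r=None, rng=None):
    """DE/best/2 mutant vector built around the current best solution."""
    x = np.asarray(x, dtype=float)
    best = np.asarray(best, dtype=float)
    if r is None:
        rng = np.random.RandomState() if rng is None else rng
        candidates = np.delete(np.arange(len(x)), pointer)   # exclude the target index
        r = rng.choice(candidates, size=4, replace=False)    # four distinct indices
    r1, r2, r3, r4 = r
    return best + F * (x[r1] - x[r2]) + F * (x[r3] - x[r4])

x = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 4.0], [3.0, 9.0], [4.0, 16.0]])
v = best_2_sketch(x, best=x[0], F=0.5, pointer=0, r=[1, 2, 3, 4])
```

In the usage lines, the two difference vectors are [-1, -3] and [-1, -7], so v equals [0, 0] + 0.5 * [-2, -10] = [-1, -5].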

src.environment.optimizer.dedqn_optimizer.cur_to_rand_1_single(x: numpy.ndarray, F: float, pointer: int, r: numpy.ndarray = None, rng: numpy.random.RandomState = None) numpy.ndarray[source]

Introduction

Generates a new candidate vector using the “current-to-rand/1” mutation strategy, commonly used in Differential Evolution (DE) algorithms.

Args:

  • x (np.ndarray): Population array of candidate solutions, where each row represents an individual.

  • F (float): Differential weight, a scaling factor for the mutation.

  • pointer (int): Index of the current target vector in the population.

  • r (np.ndarray, optional): Array of three unique random indices for mutation. If None, random indices are generated.

  • rng (np.random.RandomState, optional): Random number generator for reproducibility. If None, the global numpy RNG is used.

Returns:

  • np.ndarray: The mutated candidate vector generated by the current-to-rand/1 strategy.

Raises:

  • ValueError: If the generated or provided indices in r are not unique or include pointer.
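
A common form of current-to-rand/1 is v = x[i] + F * (x[r1] - x[i]) + F * (x[r2] - x[r3]), where i is the pointer. The sketch below uses that form with a single weight F for both terms, which is an assumption; the name cur_to_rand_1_sketch is hypothetical as well.

```python
import numpy as np

def cur_to_rand_1_sketch(x, F, pointer, r=None, rng=None):
    """Current-to-rand/1 mutant vector for the individual at `pointer`."""
    x = np.asarray(x, dtype=float)
    if r is None:
        rng = np.random.RandomState() if rng is None else rng
        candidates = np.delete(np.arange(len(x)), pointer)   # exclude the target index
        r = rng.choice(candidates, size=3, replace=False)    # three distinct indices
    r1, r2, r3 = r
    cur = x[pointer]
    # move from the current vector toward a random one, plus a scaled difference
    return cur + F * (x[r1] - cur) + F * (x[r2] - x[r3])

x = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 4.0], [3.0, 9.0]])
v = cur_to_rand_1_sketch(x, F=0.5, pointer=0, r=[1, 2, 3])
```

In the usage lines, v equals [0, 0] + [0.5, 0.5] + [-0.5, -2.5] = [0, -2].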