src.environment.optimizer.deddqn_optimizer

Module Contents

Classes

DEDDQN_Optimizer

Introduction

DE-DDQN is an adaptive operator selection method that uses Double Deep Q-Learning (DDQN), a deep reinforcement learning method, to select among the mutation strategies of Differential Evolution (DE).

Functions

clipping

Clips the input array or iterable x element-wise to the specified lower and upper bounds.

binomial

Introduction

Performs the binomial crossover operation commonly used in evolutionary algorithms, combining two populations (x and v) according to a crossover rate (Cr). At least one element from v is selected for each individual.

generate_random_int_single

Introduction

Generates a random array of integers within a specified range, ensuring that a given pointer value is not included in the result.

rand_1_single

Introduction

Implements the “rand/1” mutation strategy commonly used in Differential Evolution (DE) optimization algorithms. This function generates a new candidate solution by combining three randomly selected vectors from the population.

rand_2_single

Introduction

Generates a new vector based on the DE/rand/2 mutation strategy used in Differential Evolution (DE) algorithms. This method combines elements from multiple vectors in the population to create a trial vector.

rand_to_best_2_single

Introduction

Generates a new candidate solution vector using the “rand-to-best/2” mutation strategy, commonly used in Differential Evolution algorithms.

cur_to_rand_1_single

Introduction

Generates a new vector using the “current-to-rand/1” mutation strategy commonly used in Differential Evolution algorithms. The function perturbs the vector at the specified pointer index by adding a scaled difference of randomly selected vectors.

API

class src.environment.optimizer.deddqn_optimizer.DEDDQN_Optimizer(config)[source]

Bases: src.environment.optimizer.learnable_optimizer.Learnable_Optimizer

Introduction

DE-DDQN is an adaptive operator selection method that uses Double Deep Q-Learning (DDQN), a deep reinforcement learning method, to select among the mutation strategies of Differential Evolution (DE).

Original paper

“Deep reinforcement learning based parameter control in differential evolution.” Proceedings of the Genetic and Evolutionary Computation Conference (2019).

Official Implementation

DE-DDQN

Initialization

Introduction

Initializes the optimizer with the provided configuration, setting up algorithm-specific parameters and internal state variables for the optimization process.

Args:

  • config (object): Config object containing hyperparameters and settings for the optimizer.

    • DEDDQN_Optimizer needs the following attributes:

      • log_interval (int): Interval at which logs are recorded. Default is config.maxFEs/config.n_logpoint.

      • n_logpoint (int): Number of log points to record. Default is 50.

      • full_meta_data (bool): Flag indicating whether to use full meta data. Default is False.

      • maxFEs (int): Maximum number of function evaluations.

      • dim (int): Dimensionality of the problem. Default is 10.

      • fes (int): Counter for the number of function evaluations. Default is 0.

      • cost (float): Current cost of the best solution found. Default is None.

      • log_index (int): Index for logging. Default is 1.

      • __config (object): Stores the config object from src/config.py.

      • __F (float): Mutation factor for the optimizer. Default is 0.5.

      • __Cr (float): Crossover rate for the optimizer. Default is 1.0.

      • __NP (int): Population size. Default is 100.

      • __maxFEs (int): Maximum number of function evaluations.

      • __gen_max (int): Maximum number of generations. Default is 10.

      • __W (int): Window size. Default is 50.

      • __dim_max (int): Maximum dimensionality of the problem. Default is 10.

Built-in Attributes:

  • __config (object): Configuration object containing algorithm parameters.

  • __gen (int): Current generation number.

  • __pointer (int): Pointer to the current individual being updated.

  • __stagcount (int): Stagnation counter for tracking progress.

  • __X (ndarray): Population of individuals.

  • __cost (ndarray): Cost values of the population.

  • __X_gbest (ndarray): Global best individual.

  • __c_gbest (float): Cost of the global best individual.

  • __c_gworst (float): Cost of the worst individual in the population.

  • __X_prebest (ndarray): Previous best individual.

  • __c_prebest (float): Cost of the previous best individual.

  • __OM (list): Operator metadata for tracking operator performance.

  • __N_succ (list): Success counts for each operator.

  • __N_tot (list): Total counts for each operator.

  • __OM_W (list): Operator weights for adaptive selection.

  • __r (ndarray): Random indexes used for generating states and mutation.

  • fes (int): Total number of function evaluations.

  • cost (list): List of best costs at each generation.

  • log_index (int): Index for logging progress.

Returns:

  • None

__str__()[source]

Returns a string representation of the DEDDQN optimizer instance.

Returns:

  • str: The name of the optimizer, “DEDDQN_Optimizer”.

init_population(problem)[source]

Introduction

Initializes the population for an optimization problem, setting up the initial solutions, their costs, and various tracking variables.

Args:

  • problem (object): An instance of the optimization problem, which must provide the following attributes and methods:

    • ub (array-like): Upper bounds of the problem’s search space.

    • lb (array-like): Lower bounds of the problem’s search space.

    • optimum (float or None): The known optimal value of the problem, if available.

    • eval(X) (callable): A method to evaluate the cost of a given population X.

Returns:

  • dict: The initial state of the optimizer, including population, costs, and other relevant metadata.

Built-in Attributes:

  • __dim (int): Dimensionality of the problem.

  • __X (ndarray): Initial population of solutions. Default is None.

  • __cost (ndarray): Costs of the initial population. Default is None.

  • __X_gbest (ndarray): Global best solution found so far. Default is None.

  • __c_gbest (float): Cost of the global best solution. Default is None.

  • __c_gworst (float): Cost of the worst solution in the population. Default is None.

  • __X_prebest (ndarray): Previous best solution. Default is None.

  • __c_prebest (float): Cost of the previous best solution. Default is None.

  • __OM (list): Operator metadata for tracking operator performance. Default is a list of empty lists.

  • __N_succ (list): Success counts for each operator. Default is a list of empty lists.

  • __N_tot (list): Total counts for each operator. Default is a list of empty lists.

  • __OM_W (list): Operator weights for adaptive selection. Default is a list of empty lists.

  • fes (int): Total number of function evaluations. Default is 0.

  • cost (list): List of best costs at each generation. Default is a list with the initial global best cost.

  • log_index (int): Index for logging progress. Default is 1.

  • log_interval (int): Interval for logging progress. Default is config.maxFEs/config.n_logpoint.

  • meta_X (list): List to store the population positions at each iteration. Default is an empty list.

  • meta_Cost (list): List to store the corresponding costs for each population. Default is an empty list.

  • meta_tmp_x (list): Temporary list to store the current population positions. Default is an empty list.

  • meta_tmp_cost (list): Temporary list to store the current costs. Default is an empty list.

Notes:

  • This method assumes that the problem object provides the necessary attributes and methods for population initialization and evaluation.

  • If problem.optimum is provided, the costs are adjusted relative to the optimum.
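
The initialization described above can be illustrated with a minimal sketch. It is not the module's code: the function name init_population_sketch, its parameters, and the use of uniform sampling inside [lb, ub] are all assumptions made for illustration. Only the bounds, the evaluation call, and the adjustment of costs relative to a known optimum come from the description.

```python
import numpy as np

def init_population_sketch(lb, ub, NP, dim, evaluate, optimum=None, rng=None):
    """Hypothetical helper: sample a population, evaluate it, shift by the optimum."""
    rng = np.random.RandomState() if rng is None else rng
    lb = np.broadcast_to(np.asarray(lb, dtype=float), (dim,))
    ub = np.broadcast_to(np.asarray(ub, dtype=float), (dim,))
    # Assumption: individuals are drawn uniformly inside the box [lb, ub].
    X = lb + rng.rand(NP, dim) * (ub - lb)
    cost = evaluate(X)
    # When the optimum is known, report costs relative to it.
    if optimum is not None:
        cost = cost - optimum
    return X, cost

# Usage on a shifted sphere function whose optimum value is 100.
X0, cost0 = init_population_sketch(
    -5.0, 5.0, NP=8, dim=3,
    evaluate=lambda X: np.sum(X ** 2, axis=1) + 100.0,
    optimum=100.0,
    rng=np.random.RandomState(0),
)
```

With the optimum subtracted, every entry of cost0 is a non-negative gap to the best attainable value, which makes costs comparable across problems.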

__get_state(problem)[source]

Introduction

Generates a feature vector representing the current state of the optimization problem. The features are derived from various properties of the optimization process, including population diversity, fitness values, and historical operator performance.

Args:

  • problem (object): The optimization problem instance containing bounds and other relevant data.

Returns:

  • numpy.ndarray: A 1D array of 99 features representing the current state of the optimization process.

Notes:

  • The feature vector includes normalized fitness values, distances between solutions, operator success rates, and other statistical measures.

  • The method uses internal attributes such as population positions, fitness values, and operator performance metrics to compute the features.

update(action, problem)[source]

Introduction

Updates the optimizer’s state based on the selected action and the problem instance. This function implements the core logic of the DE-based optimizer, including mutation, crossover, selection, and reward computation.

Args:

  • action (int): The action index representing the mutation strategy to use. Valid values are:

    • 0: ‘rand/1’

    • 1: ‘rand/2’

    • 2: ‘rand-to-best/2’

    • 3: ‘cur-to-rand/1’

  • problem (Problem): The optimization problem instance containing the objective function, bounds, and other problem-specific details.

Returns:

  • next_state (np.ndarray): The next state of the optimizer after applying the action.

  • reward (float): The reward obtained from the action, calculated as the improvement in cost.

  • is_done (bool): A flag indicating whether the optimization process has reached its termination condition.

  • info (dict): Additional information about the current state of the optimizer.

Raises:

  • ValueError: If the provided action is not a valid mutation strategy index.
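
The action encoding listed above can be sketched as a lookup table. The names STRATEGIES and strategy_name are hypothetical and exist only for this illustration; the real update method performs mutation, crossover, selection, and reward computation in addition to this check.

```python
# Action indices and strategy names as listed in the Args section.
STRATEGIES = {
    0: "rand/1",
    1: "rand/2",
    2: "rand-to-best/2",
    3: "cur-to-rand/1",
}

def strategy_name(action):
    """Hypothetical helper: map an action index to its mutation strategy name."""
    if action not in STRATEGIES:
        raise ValueError(f"invalid action {action!r}; expected one of {sorted(STRATEGIES)}")
    return STRATEGIES[action]

chosen = strategy_name(2)
```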

src.environment.optimizer.deddqn_optimizer.clipping(x: Union[numpy.ndarray, Iterable], lb: Union[numpy.ndarray, Iterable, int, float, None], ub: Union[numpy.ndarray, Iterable, int, float, None]) numpy.ndarray[source]

Clips the input array or iterable x element-wise to the specified lower and upper bounds.

Args:

  • x (Union[np.ndarray, Iterable]): The input array or iterable to be clipped.

  • lb (Union[np.ndarray, Iterable, int, float, None]): The lower bound(s) for clipping. If None, no lower bound is applied.

  • ub (Union[np.ndarray, Iterable, int, float, None]): The upper bound(s) for clipping. If None, no upper bound is applied.

Returns:

  • np.ndarray: The clipped array, with values limited to the interval [lb, ub].

Raises:

  • ValueError: If the shapes of x, lb, and ub are not broadcastable to a common shape.
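
As a rough illustration of the behavior described above, the following sketch wraps numpy.clip, which accepts None for a missing bound and broadcasts the bounds against x. The name clipping_sketch is hypothetical, and whether the module itself delegates to numpy.clip is an assumption.

```python
import numpy as np

def clipping_sketch(x, lb, ub):
    """Hypothetical stand-in for clipping; not the module's actual code."""
    # np.clip broadcasts lb and ub against x and skips a bound given as None.
    return np.clip(np.asarray(x, dtype=float), lb, ub)

both_sides = clipping_sketch([-7.0, 0.5, 12.0], lb=-5.0, ub=5.0)
upper_only = clipping_sketch([-7.0, 0.5, 12.0], lb=None, ub=5.0)
per_dim = clipping_sketch([[-7.0, 12.0]], lb=np.array([-1.0, -2.0]), ub=np.array([1.0, 2.0]))
```

Passing arrays for lb and ub gives per-dimension bounds, which is the usual case for box-constrained problems.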

src.environment.optimizer.deddqn_optimizer.binomial(x: numpy.ndarray, v: numpy.ndarray, Cr: Union[numpy.ndarray, float], rng) numpy.ndarray[source]

Introduction

Performs the binomial crossover operation commonly used in evolutionary algorithms, combining two populations (x and v) according to a crossover rate (Cr). At least one element from v is selected for each individual.

Args:

  • x (np.ndarray): The current population array of shape (NP, dim) or (dim,).

  • v (np.ndarray): The donor population array of the same shape as x.

  • Cr (Union[np.ndarray, float]): The crossover rate(s), either a float or a 1D array of length NP.

  • rng: A random number generator with rand and randint methods (e.g., numpy.random.RandomState).

Returns:

  • np.ndarray: The trial population array after binomial crossover, with the same shape as x.

Notes:

  • If x and v are 1D arrays, they are reshaped to 2D for processing and squeezed back before returning.

  • Guarantees that at least one dimension per individual is inherited from v.
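
The crossover described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the module's code: the name binomial_sketch is hypothetical, and rng is assumed to be a numpy.random.RandomState, which provides the rand and randint methods mentioned in the arguments.

```python
import numpy as np

def binomial_sketch(x, v, Cr, rng):
    """Hypothetical binomial crossover between target x and donor v."""
    x2, v2 = np.atleast_2d(x), np.atleast_2d(v)
    NP, dim = x2.shape
    # Cr may be a scalar or one rate per individual; make it a column.
    cr = np.broadcast_to(np.asarray(Cr, dtype=float).reshape(-1, 1), (NP, 1))
    # Take a component from v wherever a uniform draw falls below Cr.
    take_v = rng.rand(NP, dim) < cr
    # Force one randomly chosen dimension per individual to come from v.
    jrand = rng.randint(dim, size=NP)
    take_v[np.arange(NP), jrand] = True
    u = np.where(take_v, v2, x2)
    # Return a 1D result when the inputs were 1D.
    return u[0] if np.ndim(x) == 1 else u

rng = np.random.RandomState(0)
x = np.zeros((4, 3))
v = np.ones((4, 3))
u_min = binomial_sketch(x, v, 0.0, rng)   # only the forced component comes from v
u_max = binomial_sketch(x, v, 1.0, rng)   # every component comes from v
```

With Cr = 0 each trial vector still differs from its target in exactly one dimension, which is the guarantee stated in the notes.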

src.environment.optimizer.deddqn_optimizer.generate_random_int_single(NP: int, cols: int, pointer: int, rng: numpy.random.RandomState = None) numpy.ndarray[source]

Introduction

Generates a random array of integers within a specified range, ensuring that a given pointer value is not included in the result.

Args:

  • NP (int): The upper bound (exclusive) for the random integers.

  • cols (int): The number of random integers to generate.

  • pointer (int): The integer value that must not appear in the generated array.

  • rng (np.random.RandomState, optional): A random number generator instance. Defaults to None.

Returns:

  • np.ndarray: An array of randomly generated integers of length cols, excluding the pointer value.
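
One way to obtain such indices is to sample from the range with the pointer removed. The sketch below is hypothetical (the name random_indices_sketch is not part of the module) and assumes the returned indices are also distinct from each other, which the mutation functions' descriptions of r suggest but this function's description does not state.

```python
import numpy as np

def random_indices_sketch(NP, cols, pointer, rng=None):
    """Hypothetical helper: draw cols indices from [0, NP) that exclude pointer."""
    rng = np.random.RandomState() if rng is None else rng
    # Remove the pointer from the candidate pool before sampling.
    candidates = np.delete(np.arange(NP), pointer)
    # Assumption: indices are mutually distinct (sampling without replacement).
    return rng.choice(candidates, size=cols, replace=False)

r = random_indices_sketch(NP=10, cols=5, pointer=3, rng=np.random.RandomState(1))
```

Sampling without replacement requires cols to be at most NP - 1.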

src.environment.optimizer.deddqn_optimizer.rand_1_single(x: numpy.ndarray, F: float, pointer: int, r: numpy.ndarray = None, rng: numpy.random.RandomState = None) numpy.ndarray[source]

Introduction

Implements the “rand/1” mutation strategy commonly used in Differential Evolution (DE) optimization algorithms. This function generates a new candidate solution by combining three randomly selected vectors from the population.

Args:

  • x (np.ndarray): The population of candidate solutions, where each row represents an individual solution.

  • F (float): The scaling factor used to control the amplification of the differential variation.

  • pointer (int): The index of the current candidate solution in the population.

  • r (np.ndarray, optional): An array of three unique random indices used to select individuals from the population. If None, the indices will be generated automatically.

  • rng (np.random.RandomState, optional): A random number generator for reproducibility. If None, the default RNG is used.

Returns:

  • np.ndarray: A new candidate solution generated by applying the “rand/1” mutation strategy.
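
In its textbook form, rand/1 computes v = x[r1] + F * (x[r2] - x[r3]). The sketch below shows that formula with explicitly supplied indices; the name rand_1_sketch is hypothetical, and the order in which the module assigns the entries of r to the three roles is an assumption.

```python
import numpy as np

def rand_1_sketch(x, F, r):
    """Hypothetical DE/rand/1 mutation using three supplied indices."""
    r1, r2, r3 = r
    # Base vector plus one scaled difference vector.
    return x[r1] + F * (x[r2] - x[r3])

x = np.arange(12.0).reshape(6, 2)      # rows: [0,1], [2,3], ..., [10,11]
v = rand_1_sketch(x, F=0.5, r=(1, 4, 2))
print(v)                               # [4. 5.]
```

In this strategy the pointer only determines which index must be excluded from r; the target vector itself does not enter the formula.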

src.environment.optimizer.deddqn_optimizer.rand_2_single(x: numpy.ndarray, F: float, pointer: int, r: numpy.ndarray = None, rng: numpy.random.RandomState = None) numpy.ndarray[source]

Introduction

Generates a new vector based on the DE/rand/2 mutation strategy used in Differential Evolution (DE) algorithms. This method combines elements from multiple vectors in the population to create a trial vector.

Args:

  • x (np.ndarray): The population array where each row represents an individual vector.

  • F (float): The scaling factor used to control the amplification of the differential variation.

  • pointer (int): The index of the current target vector in the population.

  • r (np.ndarray, optional): An array of indices used to select vectors from the population. If None, random indices are generated.

  • rng (np.random.RandomState, optional): A random number generator instance for reproducibility. If None, the default RNG is used.

Returns:

  • np.ndarray: A new vector generated by applying the DE/rand/2 mutation strategy.
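
The textbook DE/rand/2 formula adds two scaled difference vectors to a random base: v = x[r1] + F * (x[r2] - x[r3]) + F * (x[r4] - x[r5]). The sketch below assumes that form and therefore five indices; the name rand_2_sketch and the role assigned to each index are assumptions.

```python
import numpy as np

def rand_2_sketch(x, F, r):
    """Hypothetical DE/rand/2 mutation using five supplied indices."""
    r1, r2, r3, r4, r5 = r
    # Base vector plus two scaled difference vectors.
    return x[r1] + F * (x[r2] - x[r3]) + F * (x[r4] - x[r5])

x = np.arange(12.0).reshape(6, 2)      # rows: [0,1], [2,3], ..., [10,11]
v = rand_2_sketch(x, F=0.5, r=(0, 5, 1, 4, 2))
print(v)                               # [6. 7.]
```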

src.environment.optimizer.deddqn_optimizer.rand_to_best_2_single(x: numpy.ndarray, best: numpy.ndarray, F: float, pointer: int, r: numpy.ndarray = None, rng: numpy.random.RandomState = None) numpy.ndarray[source]

Introduction

Generates a new candidate solution vector using the “rand-to-best/2” mutation strategy, commonly used in Differential Evolution algorithms.

Args:

  • x (np.ndarray): Population array of candidate solutions.

  • best (np.ndarray): The current best solution vector.

  • F (float): Differential weight, a scaling factor for the mutation.

  • pointer (int): Index of the target vector in the population.

  • r (np.ndarray, optional): Array of 5 unique random indices for mutation. If None, they are generated automatically.

  • rng (np.random.RandomState, optional): Random number generator for reproducibility.

Returns:

  • np.ndarray: The mutated candidate solution vector.

Raises:

  • ValueError: If the input arrays have incompatible shapes or if insufficient unique indices are available for mutation.
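
One common formulation of rand-to-best/2 pulls a random base vector toward the best solution and adds two difference vectors: v = x[r1] + F * (best - x[r1]) + F * (x[r2] - x[r3]) + F * (x[r4] - x[r5]). The sketch below assumes this form, which matches the five indices mentioned above; the name rand_to_best_2_sketch is hypothetical and the module's exact formula may differ.

```python
import numpy as np

def rand_to_best_2_sketch(x, best, F, r):
    """Hypothetical rand-to-best/2 mutation using five supplied indices."""
    r1, r2, r3, r4, r5 = r
    return (
        x[r1]
        + F * (best - x[r1])       # move the base toward the best solution
        + F * (x[r2] - x[r3])      # first difference vector
        + F * (x[r4] - x[r5])      # second difference vector
    )

x = np.arange(12.0).reshape(6, 2)      # rows: [0,1], [2,3], ..., [10,11]
v = rand_to_best_2_sketch(x, best=x[0], F=0.5, r=(3, 5, 1, 4, 2))
print(v)                               # [ 9. 10.]
```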

src.environment.optimizer.deddqn_optimizer.cur_to_rand_1_single(x: numpy.ndarray, F: float, pointer: int, r: numpy.ndarray = None, rng: numpy.random.RandomState = None) numpy.ndarray[source]

Introduction

Generates a new vector using the “current-to-rand/1” mutation strategy commonly used in Differential Evolution algorithms. The function perturbs the vector at the specified pointer index by adding a scaled difference of randomly selected vectors.

Args:

  • x (np.ndarray): Population array of candidate solutions, where each row is an individual.

  • F (float): Scaling factor for the mutation.

  • pointer (int): Index of the target vector in the population to be mutated.

  • r (np.ndarray, optional): Array of three unique random indices for mutation. If None, they are generated automatically.

  • rng (np.random.RandomState, optional): Random number generator for reproducibility. If None, the default NumPy RNG is used.

Returns:

  • np.ndarray: The mutated vector generated by the current-to-rand/1 strategy.

Raises:

  • ValueError: If the generated or provided indices in r are not unique or include the pointer index.
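
A common form of current-to-rand/1 is v = x[i] + F * (x[r1] - x[i]) + F * (x[r2] - x[r3]), where i is the pointer. The sketch below assumes that form and adds the index validation described under Raises; the name cur_to_rand_1_sketch is hypothetical and the module's exact formula may differ.

```python
import numpy as np

def cur_to_rand_1_sketch(x, F, pointer, r):
    """Hypothetical current-to-rand/1 mutation using three supplied indices."""
    r = np.asarray(r)
    # Indices must be mutually distinct and must not include the target.
    if len(set(r.tolist())) != len(r) or pointer in r:
        raise ValueError("r must hold unique indices that exclude pointer")
    r1, r2, r3 = r
    return x[pointer] + F * (x[r1] - x[pointer]) + F * (x[r2] - x[r3])

x = np.arange(12.0).reshape(6, 2)      # rows: [0,1], [2,3], ..., [10,11]
v = cur_to_rand_1_sketch(x, F=0.5, pointer=0, r=(1, 4, 2))
print(v)                               # [3. 4.]
```

Unlike rand/1, the target vector x[pointer] is the base of the mutation here, so the result stays closer to the current individual.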