src.environment.optimizer.l2t_optimizer

Module Contents

Classes

L2T_Optimizer

Introduction

L2T_Optimizer is a learnable optimizer designed for multi-task optimization problems. It leverages evolutionary strategies and knowledge transfer mechanisms to optimize multiple tasks simultaneously. The optimizer maintains separate populations for each task and applies both standard and knowledge transfer-based differential evolution operations to generate new candidate solutions. It tracks various statistics such as stagnation, improvement flags, and rewards to guide the optimization process.

Functions

DE_mutation

Introduction

Performs the mutation operation for Differential Evolution (DE) on a population of candidate solutions.

DE_crossover

Introduction

Performs the crossover operation in Differential Evolution (DE) by combining mutant and population vectors to produce trial vectors.

DE_rand_1

Introduction

Applies the DE/rand/1 strategy from Differential Evolution to a population, generating new candidate solutions (offsprings) through mutation and crossover operations.

mixed_DE

Introduction

Performs a mixed Differential Evolution (DE) mutation and crossover operation on given populations, generating a new set of candidate solutions (mutants) based on the provided actions and source populations.

API

src.environment.optimizer.l2t_optimizer.DE_mutation(populations)[source]

Introduction

Performs the mutation operation for Differential Evolution (DE) on a population of candidate solutions.

Args:

  • populations (np.ndarray): A 2D numpy array of shape (population_cnt, dim), where each row represents an individual in the population.

Returns:

  • np.ndarray: A 2D numpy array of shape (population_cnt, dim) containing the mutated individuals (mutants).

Notes:

  • The mutation is performed by selecting three distinct individuals (other than the current one) and combining them according to the DE/rand/1 scheme.

  • The resulting mutant vectors are clipped to the range [0, 1].
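The two notes above can be sketched as follows. This is a minimal illustration, not the module's actual code: the function name `de_rand_1_mutation`, the scale factor `F=0.5`, and the `rng` parameter are assumptions introduced for the example.

```python
import numpy as np

def de_rand_1_mutation(populations, F=0.5, rng=None):
    """Illustrative DE/rand/1 mutation; F=0.5 is an assumed scale factor."""
    rng = np.random.default_rng() if rng is None else rng
    pop_cnt, dim = populations.shape
    if pop_cnt < 4:
        raise ValueError("DE/rand/1 needs at least 4 individuals")
    mutants = np.empty((pop_cnt, dim), dtype=float)
    for i in range(pop_cnt):
        # Candidate indices exclude the current individual i.
        candidates = np.delete(np.arange(pop_cnt), i)
        r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
        # Base vector plus a scaled difference of two other vectors.
        mutants[i] = populations[r1] + F * (populations[r2] - populations[r3])
    # Clip every coordinate to [0, 1], as stated in the notes.
    return np.clip(mutants, 0.0, 1.0)
```

Sampling without replacement from the indices that exclude `i` guarantees the three donors are distinct from each other and from the current individual.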

src.environment.optimizer.l2t_optimizer.DE_crossover(mutants, populations)[source]

Introduction

Performs the crossover operation in Differential Evolution (DE) by combining mutant and population vectors to produce trial vectors.

Args:

  • mutants (np.ndarray): Array of mutant vectors with shape (population_cnt, dim).

  • populations (np.ndarray): Array of current population vectors with shape (population_cnt, dim).

Returns:

  • np.ndarray: Array of trial vectors after crossover, with the same shape as mutants and populations.

Raises:

  • ValueError: If the input arrays do not have the expected shape or are incompatible for crossover.
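A common way to implement this step is binomial crossover. The sketch below assumes that variant; the source does not state which crossover scheme the module uses, and the name `binomial_crossover`, the rate `CR=0.7`, and the `rng` parameter are hypothetical.

```python
import numpy as np

def binomial_crossover(mutants, populations, CR=0.7, rng=None):
    """Illustrative binomial crossover; CR=0.7 is an assumed crossover rate."""
    rng = np.random.default_rng() if rng is None else rng
    if mutants.ndim != 2 or mutants.shape != populations.shape:
        raise ValueError("mutants and populations must share a 2D shape")
    pop_cnt, dim = populations.shape
    # Take each gene from the mutant with probability CR.
    mask = rng.random((pop_cnt, dim)) < CR
    # Force one randomly chosen gene per row to come from the mutant,
    # so a trial vector is never an exact copy of its parent.
    j_rand = rng.integers(0, dim, size=pop_cnt)
    mask[np.arange(pop_cnt), j_rand] = True
    return np.where(mask, mutants, populations)
```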

src.environment.optimizer.l2t_optimizer.DE_rand_1(populations)[source]

Introduction

Applies the DE/rand/1 strategy from Differential Evolution to a population, generating new candidate solutions (offsprings) through mutation and crossover operations.

Args:

  • populations (np.ndarray): The current population of candidate solutions, typically represented as a 2D NumPy array where each row is an individual.

Returns:

  • np.ndarray: The new population (offsprings) generated after mutation and crossover.

Raises:

  • None

src.environment.optimizer.l2t_optimizer.mixed_DE(populations, source_pupulations, KT_index, action_2, action_3)[source]

Introduction

Performs a mixed Differential Evolution (DE) mutation and crossover operation on given populations, generating a new set of candidate solutions (mutants) based on the provided actions and source populations.

Args:

  • populations (np.ndarray): Array of target populations, where each row represents an individual solution.

  • source_pupulations (np.ndarray): Array of source populations used for mutation, with the same shape as populations.

  • KT_index (int): Index specifying which population in populations is the target for mutation and crossover.

  • action_2 (float): Mixing coefficient controlling the contribution of target and source populations in mutation.

  • action_3 (float): Mixing coefficient controlling the contribution of different mutation strategies.

Returns:

  • np.ndarray: Array of new candidate solutions (mutants) generated after mutation and crossover.

Raises:

  • ValueError: If the number of available individuals in source_pupulations is less than 6, as unique selection is required.
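The role of the two mixing coefficients can be illustrated with a convex blend. The sketch below is one plausible reading, not the module's formula: the function `blended_transfer_mutant`, the way six distinct source individuals are split between two DE/rand/1 candidates, and `F=0.5` are all assumptions made for the example.

```python
import numpy as np

def blended_transfer_mutant(target_pop, source_pop, idx, w_source, w_strategy,
                            F=0.5, rng=None):
    """Hypothetical blend of a target individual with source-based mutants."""
    rng = np.random.default_rng() if rng is None else rng
    if len(source_pop) < 6:
        raise ValueError("need at least 6 source individuals for unique selection")
    # Draw six distinct rows from the source population.
    s = source_pop[rng.choice(len(source_pop), size=6, replace=False)]
    # Two independent DE/rand/1 candidates built from disjoint triples.
    mutant_a = s[0] + F * (s[1] - s[2])
    mutant_b = s[3] + F * (s[4] - s[5])
    # w_strategy (standing in for action_3) blends the two candidates.
    source_mutant = (1.0 - w_strategy) * mutant_a + w_strategy * mutant_b
    # w_source (standing in for action_2) blends target and source information.
    mutant = (1.0 - w_source) * target_pop[idx] + w_source * source_mutant
    return np.clip(mutant, 0.0, 1.0)
```

With `w_source=0` the result is the unchanged target individual, and with `w_source=1` it is built entirely from the source population, which shows how a single scalar can control the strength of the transfer.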

class src.environment.optimizer.l2t_optimizer.L2T_Optimizer(config)[source]

Bases: src.environment.optimizer.learnable_optimizer.Learnable_Optimizer

Introduction

L2T_Optimizer is a learnable optimizer designed for multi-task optimization problems. It leverages evolutionary strategies and knowledge transfer mechanisms to optimize multiple tasks simultaneously. The optimizer maintains separate populations for each task and applies both standard and knowledge transfer-based differential evolution operations to generate new candidate solutions. It tracks various statistics such as stagnation, improvement flags, and rewards to guide the optimization process.

Initialization

Introduction

Initializes the optimizer with configuration parameters, sets up task-specific attributes, and allocates memory for tracking optimization progress and statistics.

Args:

  • config (object): Config object containing problem settings.

    • Attributes needed for the L2T_Optimizer are the following:

      • train_problem (str): The training problem to be used.

      • test_problem (str): The testing problem to be used.

      • dim (int): Dimensionality of the optimization problem.

      • log_interval (int): Interval for logging progress.

      • n_logpoint (int): Number of log points to record.

      • full_meta_data (bool): Flag indicating whether to use full meta data.

      • device (str): Device to use for computations (e.g., “cpu”, “cuda”).

Built-in Attributes:

  • __config (object): Configuration object containing algorithm parameters.

  • task_cnt (int): Number of tasks to be optimized. Determined by the problem type.

  • dim (int): Dimensionality of the optimization problem. Default is 50.

  • generation (int): Current generation count. Default is 0.

  • pop_cnt (int): Population size for each task. Default is 50.

  • total_generation (int): Total number of generations for the optimization process.

  • flag_improved (np.ndarray): Array to track improvement flags for each task.

  • stagnation (np.ndarray): Array to track stagnation counts for each task.

  • old_action_1 (np.ndarray): Array to store the most recent value of action_1 for each task.

  • old_action_2 (np.ndarray): Array to store the most recent value of action_2 for each task.

  • old_action_3 (np.ndarray): Array to store the most recent value of action_3 for each task.

  • N_kt (np.ndarray): Array to track the number of knowledge transfer operations for each task. Determined by the problem type.

  • Q_kt (np.ndarray): Array to track the quality of knowledge transfer for each task.

  • gbest (np.ndarray): Array to store the best fitness values for each task.

  • task (Any): Placeholder for the current task being optimized. Default is None.

  • offsprings (np.ndarray): Array to store the generated offsprings for each task.

  • noKT_offsprings (np.ndarray): Array to store the generated offsprings without knowledge transfer for each task.

  • KT_offsprings (list): List to store the generated offsprings with knowledge transfer for each task.

  • KT_index (list): List to store the indices of individuals selected for knowledge transfer for each task.

  • parent_population (np.ndarray): Array to store the current population for each task.

  • reward (list): List to store the rewards for each task.

  • total_reward (float): Total accumulated reward across all tasks.

  • begin_best (list): List to store the best fitness values at the beginning of the optimization for each task.

  • last_gen_best (list): List to store the best fitness values from the last generation for each task.

  • this_gen_best (list): List to store the best fitness values from the current generation for each task.

  • optimal_value (list): List to store the optimal values for each task.

  • fes (int): Counter for the number of function evaluations. Default is None.

  • cost (list): List to store the best cost values during optimization. Default is None.

  • log_index (int): Index for logging progress. Default is None.

Returns:

  • None

Raises:

  • None

get_state()[source]

Introduction

Computes and returns the current state representation of the optimizer, aggregating various statistics and features for each task.

Args:

None

Returns:

  • np.ndarray: A 1D array of type float32 containing the concatenated state features for all tasks, including normalized generation count, stagnation, improvement flags, Q-values, population statistics, and previous actions.

Raises:

None
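A state vector of the kind described above might be assembled as in the sketch below. The helper `build_state`, the feature order, and the choice of population statistic (mean per-dimension standard deviation) are assumptions; the source only lists the feature categories.

```python
import numpy as np

def build_state(generation, total_generation, stagnation, flag_improved,
                q_kt, populations, old_actions):
    """Hypothetical per-task feature concatenation into a float32 vector."""
    features = []
    for t, pop in enumerate(populations):
        features.extend([
            generation / total_generation,        # normalized generation count
            float(stagnation[t]),                 # stagnation counter
            float(flag_improved[t]),              # improvement flag
            float(q_kt[t]),                       # knowledge transfer quality
            float(np.mean(np.std(pop, axis=0))),  # assumed population statistic
        ])
        # Previous actions (action_1, action_2, action_3) for this task.
        features.extend(float(a) for a in old_actions[t])
    return np.asarray(features, dtype=np.float32)
```

In this layout each task contributes eight features, so the length of the vector scales linearly with the number of tasks.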

init_population(tasks)[source]

Introduction

Initializes the population for a multi-task optimization process, evaluates their fitness, and prepares meta-data if required.

Args:

  • tasks (list): A list of task objects, each providing an eval method to compute the fitness of a population.

Returns:

  • state (Any): The current state of the optimizer after population initialization, as returned by self.get_state().

Side Effects:

  • Initializes and updates several instance attributes including self.fes, self.task, self.parent_population, self.log_index, self.gbest, self.cost, self.meta_X, and self.meta_Cost.

  • Evaluates the fitness of the initial population for each task and stores the results.

  • Optionally stores meta-data if self.__config.full_meta_data is set to True.
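The initialization flow can be sketched as below. Only the existence of an `eval` method on each task comes from the description above; the `SphereTask` class, the uniform sampling in [0, 1], and the function `init_populations` are hypothetical stand-ins.

```python
import numpy as np

class SphereTask:
    """Hypothetical task exposing the eval method described above."""
    def eval(self, population):
        # Sum of squared distances from the center of the unit box.
        return np.sum((population - 0.5) ** 2, axis=1)

def init_populations(tasks, pop_cnt=50, dim=50, rng=None):
    """Illustrative initialization: sample, evaluate, record bests."""
    rng = np.random.default_rng() if rng is None else rng
    # One population per task, sampled uniformly in [0, 1) (an assumption).
    parent_population = rng.random((len(tasks), pop_cnt, dim))
    fitness = np.stack([task.eval(parent_population[t])
                        for t, task in enumerate(tasks)])
    gbest = fitness.min(axis=1)    # best fitness per task
    fes = len(tasks) * pop_cnt     # function evaluations used so far
    return parent_population, fitness, gbest, fes
```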

self_update()[source]

Introduction

Updates the noKT_offsprings attribute for each task by generating new offsprings using the DE_rand_1 differential evolution strategy.

Args:

None

Returns:

None

Raises:

  • IndexError: If self.parent_population or self.noKT_offsprings do not have sufficient elements for the range of self.task_cnt.

transfer(actions)[source]

Introduction

Transfers knowledge between tasks by generating offsprings using actions and a randomly selected source population. This method applies a mixed differential evolution (DE) strategy to a subset of individuals in each task’s population, based on the provided actions.

Args:

  • actions (list or array-like): A sequence of three action values [action_1, action_2, action_3] used to control the transfer process and DE parameters.

Notes:

  • The method ensures that the source population for transfer is different from the target task.

  • At least one individual is always selected for transfer per task.

  • Uses mixed_DE for generating transferred offsprings and copy.deepcopy to preserve non-transferred offsprings.
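The first two notes describe constraints that are easy to express in code. The sketch below assumes that action_1 is a fraction in [0, 1] that scales the number of transferred individuals; that mapping and the helper `pick_transfer_plan` are illustrative guesses, not the module's documented behavior.

```python
import numpy as np

def pick_transfer_plan(task_id, task_cnt, pop_cnt, action_1, rng=None):
    """Hypothetical choice of source task and transferred individuals."""
    rng = np.random.default_rng() if rng is None else rng
    if task_cnt < 2:
        raise ValueError("knowledge transfer needs at least two tasks")
    # The source task must differ from the target task.
    others = [t for t in range(task_cnt) if t != task_id]
    source_id = int(rng.choice(others))
    # Always transfer at least one individual, and never more than pop_cnt.
    kt_count = min(pop_cnt, max(1, int(round(action_1 * pop_cnt))))
    kt_index = rng.choice(pop_cnt, size=kt_count, replace=False)
    return source_id, kt_index
```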

seletion()[source]

Introduction

Performs the selection operation in an evolutionary optimization process, updating the parent population based on the fitness of offspring and parent individuals. It also updates rewards, quality metrics, and tracks improvements or stagnation for each task.

Args:

None

Returns:

  • np.ndarray: A 1D array of type float32 containing the concatenated state features for all tasks, including normalized generation count, stagnation, improvement flags, Q-values, population statistics, and previous actions.

Side Effects:

  • Updates self.parent_population, self.reward, self.Q_kt, self.gbest, self.flag_improved, self.stagnation, and meta-data lists (self.meta_X, self.meta_Cost) as part of the selection process.

Notes:

  • Assumes that self.task, self.parent_population, self.offsprings, self.KT_index, self.KT_count, self.gbest, self.flag_improved, self.stagnation, and self.__config.full_meta_data are properly initialized and maintained elsewhere in the class.

  • Uses deep copies to avoid unintended side effects when updating populations.
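The core of such a selection step is usually a one-to-one greedy comparison between each parent and its offspring. The sketch below shows that comparison together with a simple stagnation update; it assumes minimization, and the helpers `greedy_selection` and `update_stagnation` are hypothetical names that omit the reward and Q-value bookkeeping mentioned above.

```python
import numpy as np

def greedy_selection(parents, parent_fit, offsprings, offspring_fit):
    """Keep each offspring only if it is strictly better than its parent."""
    better = offspring_fit < parent_fit
    new_parents = np.where(better[:, None], offsprings, parents)
    new_fit = np.where(better, offspring_fit, parent_fit)
    # The task improved if its best fitness decreased this generation.
    improved = bool(new_fit.min() < parent_fit.min())
    return new_parents, new_fit, improved

def update_stagnation(stagnation, improved):
    """Reset the counter on improvement, otherwise increment it."""
    return 0 if improved else stagnation + 1
```

`np.where` builds new arrays, so the parent arrays passed in are left unmodified.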

update(actions, tasks)[source]

Introduction

Updates the optimizer’s state based on the provided actions and tasks, manages reward accumulation, logging, and determines if the optimization process has ended.

Args:

  • actions (Any): Actions to be applied in the current update step.

  • tasks (Any): Tasks relevant to the current optimization step.

Returns:

  • next_state (Any): The next state after applying the actions.

  • total_reward (float): The accumulated reward after the update.

  • is_end (bool): Flag indicating whether the optimization process has ended.

  • info (dict): Additional information (currently empty).

Raises:

  • None