src.tester¶
Module Contents¶
Classes¶
- BBO_TestUnit: A test unit designed for running batch episodes of black-box optimization (BBO) algorithms in parallel using Ray. It encapsulates a problem instance and an optimizer, and ensures reproducibility by managing random seeds and PyTorch settings.
- MetaBBO_TestUnit: A test unit designed for parallel execution using Ray, encapsulating an agent, an environment, and a random seed for reproducibility. It facilitates the evaluation of agent performance on a given environment, with optional checkpointing.
- Tester
Functions¶
- cal_t0: Estimates the average time (in milliseconds) required to perform a set of basic NumPy vectorized operations. T0 is used to calculate the complexity of algorithms.
- cal_t1: Measures the average time (in milliseconds) required to evaluate a problem’s objective function over a batch of randomly generated solutions. T1 is used to calculate the complexity of the algorithm.
- record_data: Processes a list of data items, updating the results and meta_results dictionaries with information extracted from each item. Handles both standard result keys and metadata, organizing results by problem and agent.
- store_meta_data: Stores and updates metadata results for different process names in pickle files within a specified log directory. Ensures that metadata is accumulated and persisted across multiple calls, and clears in-memory storage after saving.
API¶
- src.tester.cal_t0(dim, fes)[source]¶
Introduction¶
Estimates the average time (in milliseconds) required to perform a set of basic NumPy vectorized operations. T0 will be used to calculate the complexity of algorithms.
Args:¶
dim (int): The dimensionality of the random NumPy arrays to generate.
fes (int): The number of function evaluations (iterations of operations) to perform in each timing loop.
Returns:¶
float: The average elapsed time in milliseconds over 10 runs for performing the specified operations.
Notes:¶
The function performs addition, division, multiplication, square root, logarithm, and exponential operations on randomly generated NumPy arrays.
Timing is measured using time.perf_counter() for higher precision.
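A minimal sketch of how such a baseline could be measured, assuming the operations listed in the notes are applied to a fresh random array on each of the fes iterations. The name estimate_t0, the runs parameter, and the exact order and form of the operations are illustrative assumptions, not the source of cal_t0.

```python
import time

import numpy as np


def estimate_t0(dim, fes, runs=10):
    """Hypothetical stand-in for cal_t0: mean time in ms over `runs` timing loops."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        for _ in range(fes):
            x = np.random.rand(dim)   # values in [0, 1)
            x = x + x                 # addition
            x = x / (x + 2.0)         # division; denominator is always >= 2
            x = x * x                 # multiplication
            x = np.sqrt(x)            # square root
            x = np.log(x + 1.0)       # logarithm; the shift keeps the argument positive
            x = np.exp(x)             # exponential
        total += time.perf_counter() - start
    return total / runs * 1000.0      # seconds -> milliseconds


t0 = estimate_t0(dim=10, fes=1000)
print(f"T0 = {t0:.3f} ms")
```

The absolute value depends on the machine, which is why a baseline like this is typically used as a normalizer rather than reported on its own.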
- src.tester.cal_t1(problem, dim, fes)[source]¶
Introduction¶
Measures the average time (in milliseconds) required to evaluate a problem’s objective function over a batch of randomly generated solutions. T1 will be used to calculate the complexity of the algorithm.
Args:¶
problem (object): The problem instance whose objective function is timed.
dim (int): The dimensionality of each solution vector.
fes (int): The number of function evaluations to time.
Returns:¶
float: The average elapsed time (in milliseconds) to evaluate the batch, computed over 10 runs.
Notes:¶
The function generates random solutions using np.random.rand.
Timing is performed using time.perf_counter.
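The measurement described above can be sketched as follows. SphereProblem, its eval method, the estimate_t1 name, and the choice to generate the batch outside the timed region are all assumptions made for illustration; the real problem interface and cal_t1 implementation may differ.

```python
import time

import numpy as np


class SphereProblem:
    """Hypothetical problem; the real problem interface may differ."""

    def eval(self, x):
        # x has shape (batch, dim); returns one objective value per row.
        return np.sum(x ** 2, axis=-1)


def estimate_t1(problem, dim, fes, runs=10):
    """Hypothetical stand-in for cal_t1: mean time in ms to evaluate `fes` random solutions."""
    total = 0.0
    for _ in range(runs):
        x = np.random.rand(fes, dim)   # batch of random solutions (assumed untimed)
        start = time.perf_counter()
        problem.eval(x)
        total += time.perf_counter() - start
    return total / runs * 1000.0       # seconds -> milliseconds


t1 = estimate_t1(SphereProblem(), dim=10, fes=2000)
print(f"T1 = {t1:.3f} ms")
```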
- src.tester.record_data(data, test_set, agent_for_rollout, checkpoints, results, meta_results, config)[source]¶
Introduction¶
Processes a list of data items, updating results and meta_results dictionaries with information extracted from each item. Handles both standard result keys and metadata, organizing results by problem and agent.
Args:¶
TODO: Decide where the detailed structure of this metadata should be documented.
data (list): The rollout data items. Each item is a dict containing rollout test results, similar to a test result but with more detail.
test_set (object): The problem dataset for the test process.
agent_for_rollout (str): The base name or identifier for the agent used during rollout.
checkpoints (list): List of checkpoint identifiers for agents.
results (dict): A dictionary to store or update results, initialized with only the config information.
meta_results (dict): An empty dictionary to store or update metadata results.
config (object): Configuration object with attributes such as full_meta_data to control metadata processing.
Returns:¶
tuple: A tuple containing the updated results and meta_results dictionaries.
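A rough sketch of the grouping this function performs, written as a standalone helper. The item layout is assumed: the 'metadata' key name, the nesting order (metric, then problem, then agent), and the record_items name are hypothetical and are not taken from the source of record_data.

```python
def record_items(items, results, meta_results, full_meta_data=True):
    """Hypothetical sketch: group each item's metrics by problem and agent."""
    for item in items:
        problem, agent = item["problem_name"], item["agent_name"]
        for key, value in item.items():
            if key in ("problem_name", "agent_name"):
                continue  # identifiers, not metrics
            if key == "metadata":  # assumed key name
                if full_meta_data:
                    meta_results.setdefault(problem, {}).setdefault(agent, []).append(value)
                continue
            results.setdefault(key, {}).setdefault(problem, {}).setdefault(agent, []).append(value)
    return results, meta_results


items = [
    {"problem_name": "f1", "agent_name": "agent_a", "cost": [3.0, 1.0], "fes": 200,
     "metadata": {"X": [[0.1, 0.2]]}},
    {"problem_name": "f1", "agent_name": "agent_a", "cost": [2.5, 0.5], "fes": 200,
     "metadata": {"X": [[0.3, 0.4]]}},
]
results, meta_results = record_items(items, {}, {})
print(results["cost"]["f1"]["agent_a"])
```

Each metric ends up as a list with one entry per run, so repeated runs of the same agent on the same problem accumulate instead of overwriting each other.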
- src.tester.store_meta_data(log_dir, meta_data_results)[source]¶
Introduction¶
Stores and updates meta data results for different process names into pickle files within a specified log directory. Ensures that meta data is accumulated and persisted across multiple calls, and clears in-memory storage after saving.
Args:¶
log_dir (str): The directory path where the metadata should be stored.
meta_data_results (dict): A dictionary where keys are process names and values are dictionaries mapping agent names to lists of meta data.
Returns:¶
dict: The updated meta_data_results dictionary with in-memory lists cleared after saving.
Raises:¶
OSError: If the function fails to create the required directories or write to files.
pickle.PickleError: If there is an error during pickling or unpickling the data.
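The accumulate-then-clear behavior can be sketched as below. The store_meta name, the "metadata" sub-directory, and the one-pickle-per-process-name layout are assumptions for illustration; the actual file layout used by store_meta_data is not specified here.

```python
import os
import pickle
import tempfile


def store_meta(log_dir, meta_data_results):
    """Hypothetical sketch: append in-memory lists to one pickle per process name."""
    meta_dir = os.path.join(log_dir, "metadata")  # assumed sub-directory name
    os.makedirs(meta_dir, exist_ok=True)
    for process_name, per_agent in meta_data_results.items():
        path = os.path.join(meta_dir, f"{process_name}.pkl")
        stored = {}
        if os.path.exists(path):
            with open(path, "rb") as f:
                stored = pickle.load(f)      # data saved by earlier calls
        for agent, records in per_agent.items():
            stored.setdefault(agent, []).extend(records)
            per_agent[agent] = []            # clear the in-memory list
        with open(path, "wb") as f:
            pickle.dump(stored, f)
    return meta_data_results


with tempfile.TemporaryDirectory() as log_dir:
    store_meta(log_dir, {"run_a": {"agent_a": [{"run": 0}]}})
    cleared = store_meta(log_dir, {"run_a": {"agent_a": [{"run": 1}]}})
    with open(os.path.join(log_dir, "metadata", "run_a.pkl"), "rb") as f:
        on_disk = pickle.load(f)

print(on_disk)
print(cleared)
```

Clearing the lists after each save keeps memory use bounded during long test runs while the files on disk keep the full history.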
- class src.tester.BBO_TestUnit(optimizer: src.environment.optimizer.basic_optimizer.Basic_Optimizer, problem: src.environment.problem.basic_problem.Basic_Problem, seed: int)[source]¶
Introduction¶
BBO_TestUnit is a test unit designed for running batch episodes of black-box optimization (BBO) algorithms in parallel using Ray. It encapsulates a problem instance and an optimizer, and ensures reproducibility by managing random seeds and PyTorch settings.
Args:¶
optimizer (Basic_Optimizer): The optimizer instance to be tested.
problem (Basic_Problem): The problem instance on which the optimizer will be evaluated.
seed (int): The random seed for reproducibility.
Methods:¶
run_batch_episode(): Runs a single batch episode of the optimizer on the problem, returning a dictionary of results and timing information.
Attributes:¶
optimizer (Basic_Optimizer): The optimizer used in the test unit.
problem (Basic_Problem): The problem instance for evaluation.
seed (int): The random seed for reproducibility.
Initialization
- run_batch_episode()[source]¶
Introduction¶
Runs a single batch episode for the optimizer on the given problem, ensuring reproducibility by setting random seeds and configuring PyTorch settings.
Args:¶
None
Returns:¶
dict: A dictionary containing the results of the optimizer’s episode, including timing information, agent and problem names, and additional metrics.
Raises:¶
None
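To illustrate why the unit carries its own seed, here is a simplified sketch using only the standard library. RandomSearch, Sphere, SeededTestUnit, the run_episode method, and the elapsed_ms key are hypothetical stand-ins. The documented unit also configures PyTorch, which is omitted here.

```python
import random
import time


class RandomSearch:
    """Hypothetical optimizer; stands in for a Basic_Optimizer subclass."""

    def run_episode(self, problem, max_fes=200):
        best = float("inf")
        for _ in range(max_fes):
            x = [random.uniform(-5.0, 5.0) for _ in range(problem.dim)]
            best = min(best, problem.eval(x))
        return {"cost": best, "fes": max_fes}


class Sphere:
    """Hypothetical problem; stands in for a Basic_Problem subclass."""

    dim = 3

    def eval(self, x):
        return sum(v * v for v in x)


class SeededTestUnit:
    """Sketch of a seed-managing test unit (not the BBO_TestUnit source)."""

    def __init__(self, optimizer, problem, seed):
        self.optimizer, self.problem, self.seed = optimizer, problem, seed

    def run_batch_episode(self):
        random.seed(self.seed)  # seed inside the call so each worker is reproducible
        start = time.perf_counter()
        result = self.optimizer.run_episode(self.problem)
        result["elapsed_ms"] = (time.perf_counter() - start) * 1000.0
        result["agent_name"] = type(self.optimizer).__name__
        result["problem_name"] = type(self.problem).__name__
        return result


first = SeededTestUnit(RandomSearch(), Sphere(), seed=1).run_batch_episode()
second = SeededTestUnit(RandomSearch(), Sphere(), seed=1).run_batch_episode()
print(first["cost"] == second["cost"])
```

Seeding inside run_batch_episode rather than in the constructor means the result does not depend on which worker process picks up the unit or on what ran there before.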
- class src.tester.MetaBBO_TestUnit(agent: src.rl.Basic_Agent, env: src.environment.basic_environment.PBO_Env, seed: int)[source]¶
Introduction¶
MetaBBO_TestUnit is a test unit designed for parallel execution using Ray, encapsulating an agent, an environment, and a random seed for reproducibility. It facilitates the evaluation of agent performance on a given environment, with optional checkpointing.
Args:¶
agent (Basic_Agent): The agent to be evaluated.
env (PBO_Env): The environment in which the agent operates.
seed (int): The random seed for reproducibility.
checkpoint (int, optional): An optional checkpoint identifier for the agent. Defaults to None.
Methods:¶
run_batch_episode(required_info: dict = {}): Runs a single batch episode with the specified agent and environment, ensuring reproducibility by setting random seeds and configuring PyTorch settings. Returns a dictionary containing episode results, timing, agent and problem names, and any additional rollout results.
Initialization
- run_batch_episode(required_info={})[source]¶
Introduction¶
Runs a single batch episode using the agent and environment, ensuring reproducibility by setting random seeds and configuring PyTorch settings.
Args:¶
TODO: Decide whether an example is needed here.
required_info (dict, optional): Additional information required for the episode rollout. Defaults to an empty dictionary.
Returns:¶
dict: A dictionary containing the results of the episode, including timing information, agent and problem names, and any additional rollout results.
- class src.tester.Tester(config, baselines, user_datasets=None)[source]¶
Bases: object
Initialization
- initialize_record(key)[source]¶
Introduction¶
Initializes a record in the test_results dictionary for a given key, setting up nested dictionaries for each problem, agent, and optimizer.
Args:¶
key (str): The identifier for the test record to initialize.
Side Effects:¶
Modifies the self.test_results attribute by adding a new entry for key if it does not already exist. For each problem in self.test_set.data, creates sub-entries for each agent in self.agent_name_list and each optimizer in self.t_optimizer_for_cp, initializing them as empty lists.
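The nested layout described above can be sketched with a free function that takes plain name lists in place of self.test_set.data, self.agent_name_list, and self.t_optimizer_for_cp. This standalone signature is an assumption made so the example runs on its own.

```python
def initialize_record(test_results, key, problem_names, agent_names, optimizer_names):
    """Hypothetical free-function version of the method, taking plain name lists."""
    if key in test_results:
        return test_results  # existing records are left untouched
    test_results[key] = {
        problem: {name: [] for name in list(agent_names) + list(optimizer_names)}
        for problem in problem_names
    }
    return test_results


test_results = {}
initialize_record(test_results, "cost", ["f1", "f2"], ["agent_a"], ["opt_b"])
test_results["cost"]["f1"]["agent_a"].append(0.5)
# A second call with the same key does not reset the recorded value.
initialize_record(test_results, "cost", ["f1", "f2"], ["agent_a"], ["opt_b"])
print(test_results["cost"])
```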
- record_test_data(data: list)[source]¶
Introduction¶
Records test data from a list of dictionaries, organizing results by problem and agent names. Handles both metadata and other test result keys, updating internal result structures accordingly.
Args:¶
data (list): The rollout data items. Each item is a dict containing rollout test results, similar to a test result but with more detail.
Side Effects:¶
Updates self.meta_data_results with metadata if self.config.full_meta_data is True.
Updates self.test_results with other test result metrics, initializing records as needed.
Notes:¶
Assumes that self.meta_data_results, self.test_results, and self.config.full_meta_data are properly initialized.
Ignores the keys ‘agent_name’ and ‘problem_name’ when recording test results.
- test(log=True)[source]¶
Introduction¶
Runs tests on agents and optimizers using different parallelization strategies and records the results.
Args:¶
log (bool, optional): Defaults to True.
Side Effects:¶
Records test data and stores meta data results after each test run.
Saves the final test results to a pickle file in the log directory.
Raises:¶
NotImplementedError: If an unsupported parallelization mode is specified in the configuration.
- test_for_random_search()[source]¶
Introduction¶
Executes a comprehensive test suite for the Random Search optimizer across a set of benchmark problems, logging performance metrics and timing information for analysis.
Args:¶
None (uses self.config for configuration).
Returns:¶
TODO: Decide whether the detailed structure of test_results should be documented here, since it is built in this method, and whether a figure should be included.
dict: A dictionary test_results containing the following metrics:
‘cost’: Nested dict mapping problem names to optimizer names to lists of cost arrays (one per run).
‘fes’: Nested dict mapping problem names to optimizer names to lists of function evaluation counts (one per run).
‘T0’: Baseline timing value computed from problem dimension and max function evaluations.
‘T1’: Dict mapping optimizer names to average problem-specific timing metric.
‘T2’: Dict mapping optimizer names to average wall-clock time per run (in milliseconds).
Notes:¶
Runs 51 independent trials per problem.
Uses tqdm for progress visualization.
Seeds numpy’s RNG for reproducibility.
Pads cost arrays to length 51 if necessary.
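As an illustration of the returned structure, the sketch below builds a test_results dictionary with placeholder values (not measured results). Two details are assumptions: padding repeats the last recorded cost, and the complexity ratio (T2 - T1) / T0 follows the convention commonly used in CEC-style benchmarks. Neither is stated in this documentation, and the optimizer and problem names are hypothetical.

```python
def pad_cost(cost, length=51):
    """Pad a cost history to `length` entries by repeating its last value (assumed rule)."""
    cost = list(cost)
    if not cost:
        raise ValueError("cost history must not be empty")
    # A non-positive repeat count yields an empty list, so long histories are unchanged.
    return cost + [cost[-1]] * (length - len(cost))


def complexity(t0, t1, t2):
    """Assumed CEC-style ratio (T2 - T1) / T0; not confirmed by this documentation."""
    return (t2 - t1) / t0


name = "Random_search"  # hypothetical optimizer name
test_results = {
    "cost": {"f1": {name: [pad_cost([9.0, 4.0, 1.0])]}},  # one padded array per run
    "fes": {"f1": {name: [20000]}},                       # one count per run
    "T0": 0.5,                                            # placeholder timings in ms
    "T1": {name: 2.0},
    "T2": {name: 6.0},
}

print(len(test_results["cost"]["f1"][name][0]))
print(complexity(test_results["T0"], test_results["T1"][name], test_results["T2"][name]))
```

Padding every cost history to the same length lets the per-run arrays be stacked and averaged across runs without special cases for early termination.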
- static name_translate(problem)[source]¶
Introduction¶
Translates a given problem identifier into a human-readable problem name.
Args:¶
problem (str): The identifier of the problem to be translated. Expected values include ‘bbob’, ‘bbob-torch’, ‘bbob-noisy’, ‘bbob-noisy-torch’, ‘protein’, or ‘protein-torch’.
Returns:¶
str: The human-readable name corresponding to the given problem identifier.
Raises:¶
ValueError: If the provided problem identifier is not recognized.
- static mgd_test(config, agent: str, from_problem: str, from_difficulty: str, from_test_path: str, to_test_path: str)[source]¶
- static mte_test(config, agent: str, pre_train_problem: str, pre_train_difficulty: str, pre_train_data_path: str, scratch_data_path: str, pdf_fig: bool = True)[source]¶
Introduction¶
Evaluates and visualizes the Model Transfer Efficiency (MTE) between a pre-trained agent and a scratch agent on a transfer learning task. The method loads experiment results, processes performance data, computes MTE, and generates a comparative plot of average returns over learning steps.
Args:¶
None. Uses configuration from self.config.
Returns:¶
None. Prints the computed MTE value and saves a plot comparing pre-trained and scratch agent performance.
Raises:¶
FileNotFoundError: If the required JSON or pickle files are not found.
KeyError: If expected keys are missing in the loaded data.
Exception: For errors during data processing or plotting.
- static rollout_batch(config, rollout_dir, rollout_opt, rollout_datasets, checkpoints=None, log=True)[source]¶
TODO: Rewrite this docstring.
Introduction¶
Executes a batch rollout of agents on a test set of problems using various parallelization strategies. The function loads agent checkpoints, sets up environments, and evaluates agent performance across multiple seeds and problems, storing the results for further analysis.
Args:¶
config (object): Configuration object containing all parameters needed for the experiment. See config.py for details.
Returns:¶
None: The function saves the rollout results and metadata to disk but does not return any value.
Raises:¶
KeyError: If the specified agent key is missing in the model.json file.
NotImplementedError: If the specified parallelization mode in config.test_parallel_mode is not supported.