src.environment.problem.SOO.COCO_BBOB.bbob_dataset

Problem Difficulty Classification

BBOB (F1-F24)

| Difficulty Mode | Training Set | Testing Set |
|---|---|---|
| easy | 4, 6-14, 18-20, 22-24 | 1, 2, 3, 5, 15, 16, 17, 21 |
| difficult | 1, 2, 3, 5, 15, 16, 17, 21 | 4, 6-14, 18-20, 22-24 |

BBOB-Noisy (F101-F130)

| Difficulty Mode | Training Set | Testing Set |
|---|---|---|
| easy | 102-104, 106-114, 118, 121-124, 126-130 | 101, 105, 115-117, 119, 120, 125 |
| difficult | 101, 105, 115-117, 119, 120, 125 | 102-104, 106-114, 118, 121-124, 126-130 |

Note: When difficulty is ‘all’, both training and testing sets contain all problems in the suite.
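The splits listed above can be written out as explicit ID sets and checked for consistency. The sketch below is only an illustration: `expand`, `SPLITS`, `FULL`, and `check_splits` are hypothetical names that do not come from the module. It verifies that, in each mode, the training and testing sets are disjoint and together cover the whole suite.

```python
def expand(spec):
    """Expand a spec such as '4, 6-14' into a set of integer function IDs."""
    ids = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = part.split("-")
            ids.update(range(int(lo), int(hi) + 1))
        else:
            ids.add(int(part))
    return ids


# Hypothetical container; the ID specs are copied from the lists above.
# Each entry is (training set, testing set).
SPLITS = {
    "bbob": {
        "easy": ("4, 6-14, 18-20, 22-24", "1, 2, 3, 5, 15, 16, 17, 21"),
        "difficult": ("1, 2, 3, 5, 15, 16, 17, 21", "4, 6-14, 18-20, 22-24"),
    },
    "bbob-noisy": {
        "easy": ("102-104, 106-114, 118, 121-124, 126-130",
                 "101, 105, 115-117, 119, 120, 125"),
        "difficult": ("101, 105, 115-117, 119, 120, 125",
                      "102-104, 106-114, 118, 121-124, 126-130"),
    },
}

# Full ID ranges of the two suites (F1-F24 and F101-F130).
FULL = {"bbob": set(range(1, 25)), "bbob-noisy": set(range(101, 131))}


def check_splits():
    """Return True if every split is a disjoint cover of its suite."""
    for suite, modes in SPLITS.items():
        for train_spec, test_spec in modes.values():
            train, test = expand(train_spec), expand(test_spec)
            assert train.isdisjoint(test)
            assert train | test == FULL[suite]
    return True
```

In both suites the 'difficult' mode is the 'easy' mode with the training and testing sets swapped, so the smaller set of 8 functions is used for training in 'difficult' mode.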

Module Contents

Classes

BBOB_Dataset

Introduction

BBOB-Surrogate investigates the integration of surrogate modeling techniques into MetaBBO, enabling data-driven approximation of expensive objective functions while maintaining optimization fidelity.

API

class src.environment.problem.SOO.COCO_BBOB.bbob_dataset.BBOB_Dataset(data, batch_size=1)[source]

Bases: torch.utils.data.Dataset

Introduction

BBOB-Surrogate investigates the integration of surrogate modeling techniques into MetaBBO, enabling data-driven approximation of expensive objective functions while maintaining optimization fidelity.

Original paper

“Surrogate Learning in Meta-Black-Box Optimization: A Preliminary Study.” arXiv preprint arXiv:2503.18060 (2025).

Official Implementation

BBOB-Surrogate

License

None

Problem Suite Composition

BBOB-Surrogate contains 72 optimization problems: 24 problems in each of three dimensions (2, 5, and 10). Each problem is a trained KAN or MLP network fitted to one of the 24 black-box functions in the COCO-BBOB benchmark, so the network serves as a surrogate model of the original function.

Args:

  • data (list): A list of BBOB problem instances.

  • batch_size (int, optional): Number of instances per batch. Defaults to 1.

Attributes:

  • data (list): The list of BBOB problem instances.

  • batch_size (int): The batch size for sampling.

  • N (int): Total number of problem instances.

  • ptr (list): List of starting indices for each batch.

  • index (np.ndarray): Permuted indices for shuffling.

  • maxdim (int): Maximum dimensionality among all problem instances.

Methods:

  • get_datasets(…): Static method to generate train and test datasets based on suit, difficulty, and other configuration options.

  • __getitem__(item): Returns a batch of problem instances at the specified batch index.

  • __len__(): Returns the total number of problem instances.

  • __add__(other): Concatenates two BBOB_Dataset objects.

  • shuffle(): Randomly permutes the order of the dataset.

Raises:

  • ValueError: If required arguments are missing or invalid (e.g., unsupported suit, invalid difficulty, or upperbound too low).

Example:

    batch = train_set[0]
    train_set.shuffle()


Initialization

Introduction

Initializes the dataset object with provided data and batch size, and computes relevant attributes for batching and dimensionality.

Args:

  • data (list): The dataset to be managed, where each item is expected to have a dim attribute.

  • batch_size (int, optional): The number of samples per batch. Defaults to 1.

Built-in Attribute:

  • self.data (list): Stores the input dataset.

  • self.batch_size (int): Stores the batch size.

  • self.N (int): The total number of data items.

  • self.ptr (list): List of starting indices for each batch.

  • self.index (np.ndarray): Array of indices for the dataset.

  • self.maxdim (int): The maximum dimension found among all items in the dataset.

Returns:

  • None

Raises:

  • AttributeError: If any item in data does not have a dim attribute.
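One way the attributes listed above could be derived from data and batch_size is sketched below. This is an illustration under stated assumptions, not the library's code: `DatasetSketch` is a hypothetical stand-in, a plain list replaces the np.ndarray used for index, and the dummy problems are `SimpleNamespace` objects that carry only a dim attribute.

```python
from types import SimpleNamespace


class DatasetSketch:
    """Hypothetical stand-in for BBOB_Dataset, for illustration only."""

    def __init__(self, data, batch_size=1):
        self.data = data
        self.batch_size = batch_size
        self.N = len(data)
        # Starting offset of each batch: 0, batch_size, 2 * batch_size, ...
        self.ptr = list(range(0, self.N, batch_size))
        # Identity order; a plain list stands in for the np.ndarray.
        self.index = list(range(self.N))
        # Accessing .dim raises AttributeError for items that lack it.
        self.maxdim = max(item.dim for item in data)


# Five dummy problems with dimensions 2, 5, 10, 5, 2.
problems = [SimpleNamespace(dim=d) for d in (2, 5, 10, 5, 2)]
ds = DatasetSketch(problems, batch_size=2)
print(ds.N, ds.ptr, ds.maxdim)  # prints: 5 [0, 2, 4] 10
```

With five items and a batch size of 2, the last batch starts at offset 4 and holds a single item.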

static get_datasets(suit, upperbound, shifted=True, rotated=True, biased=True, train_batch_size=1, test_batch_size=1, difficulty=None, version='numpy', instance_seed=3849, user_train_list=None, user_test_list=None, device=None)[source]

Introduction

Generates training and testing datasets for BBOB (Black-Box Optimization Benchmarking) or BBOB-noisy function suites, with configurable properties such as shifting, rotation, bias, and difficulty level.

Args:

  • suit (str): The function suite and dimension, e.g., ‘bbob10’ or ‘bbob-noisy20’.

  • upperbound (float): The upper bound for the function domain (must be at least 5).

  • shifted (bool, optional): Whether to apply a random shift to the function. Defaults to True.

  • rotated (bool, optional): Whether to apply a random rotation to the function. Defaults to True.

  • biased (bool, optional): Whether to add a random bias to the function. Defaults to True.

  • train_batch_size (int, optional): Batch size for the training dataset. Defaults to 1.

  • test_batch_size (int, optional): Batch size for the testing dataset. Defaults to 1.

  • difficulty (str, optional): Difficulty level of the functions to include (‘easy’, ‘difficult’, ‘all’, or None). Defaults to None.

  • version (str, optional): Version of the function implementation (‘numpy’ or other). Defaults to ‘numpy’.

  • instance_seed (int, optional): Random seed for reproducibility. Defaults to 3849.

  • user_train_list (list, optional): List of function IDs to include in the training set. Defaults to None.

  • user_test_list (list, optional): List of function IDs to include in the testing set. Defaults to None.

  • device (torch.device, optional): Device for torch tensors if using the torch version. Defaults to None.

Returns:

  • Tuple[BBOB_Dataset, BBOB_Dataset]: A tuple containing the training and testing datasets.

Raises:

  • ValueError: If difficulty is None and user_train_list and user_test_list are not both provided.

  • ValueError: If the function suite is invalid or not supported.

  • ValueError: If the difficulty level is invalid.

  • AssertionError: If upperbound is less than 5.
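The argument rules above can be illustrated with a small validation sketch. Everything here is an assumption made for illustration: `parse_suit` and `check_args` are hypothetical helpers, the regular expression covers only the two suit forms quoted above (‘bbob10’ and ‘bbob-noisy20’), and the order of the checks is a guess. The real method may accept other suit strings and validate differently.

```python
import re

# Assumed pattern: suite name followed directly by the dimension.
_SUIT_RE = re.compile(r"^(bbob-noisy|bbob)(\d+)$")


def parse_suit(suit):
    """Split a suit string such as 'bbob10' into (suite name, dimension)."""
    match = _SUIT_RE.match(suit)
    if match is None:
        raise ValueError(f"unsupported suit: {suit!r}")
    return match.group(1), int(match.group(2))


def check_args(suit, upperbound, difficulty=None,
               user_train_list=None, user_test_list=None):
    """Mirror the documented Raises section (hypothetical ordering)."""
    if difficulty is None and (user_train_list is None or user_test_list is None):
        raise ValueError("provide difficulty, or both user_train_list and user_test_list")
    if difficulty not in (None, "easy", "difficult", "all"):
        raise ValueError(f"invalid difficulty: {difficulty!r}")
    assert upperbound >= 5, "upperbound must be at least 5"
    return parse_suit(suit)


print(check_args("bbob-noisy20", 5.0, difficulty="easy"))  # prints: ('bbob-noisy', 20)
```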

__getitem__(item)[source]

Introduction

Retrieves a batch of data items corresponding to the given index or indices.

Args:

  • item (int or slice): The index or indices specifying which batch to retrieve.

Returns:

  • list: A list containing the data items from the dataset for the specified batch.

Raises:

  • IndexError: If the provided index is out of range.
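Batch retrieval as described above can be sketched with the ptr and index attributes. The function `get_batch` is hypothetical and takes the attributes as explicit arguments. It assumes that ptr holds batch start offsets and that index maps positions to items; the library's actual method may differ, for example in how it handles a batch size of 1 or a slice.

```python
def get_batch(data, index, ptr, batch_size, item):
    """Return the items of batch number `item` in the current index order."""
    start = ptr[item]  # raises IndexError if `item` is out of range
    end = min(start + batch_size, len(data))  # the last batch may be shorter
    return [data[i] for i in index[start:end]]


data = ["f1", "f2", "f3", "f4", "f5"]
ptr = [0, 2, 4]  # batch start offsets for batch_size=2

identity = [0, 1, 2, 3, 4]
print(get_batch(data, identity, ptr, 2, 1))  # prints: ['f3', 'f4']
print(get_batch(data, identity, ptr, 2, 2))  # prints: ['f5']

# After a shuffle only `index` changes, so the same batch number
# yields different items.
reversed_index = [4, 3, 2, 1, 0]
print(get_batch(data, reversed_index, ptr, 2, 0))  # prints: ['f5', 'f4']
```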

__len__()[source]

Introduction

Returns the number of elements in the dataset.

Returns:

  • int: The total number of elements contained in the dataset.

__add__(other: src.environment.problem.SOO.COCO_BBOB.bbob_dataset.BBOB_Dataset)[source]

Introduction

Implements the addition operator for the BBOB_Dataset class, allowing two datasets to be combined.

Args:

  • other (BBOB_Dataset): Another instance of BBOB_Dataset to be added to the current dataset.

Returns:

  • BBOB_Dataset: A new BBOB_Dataset instance containing the combined data from both datasets, with the same batch size as the original.

Raises:

  • AttributeError: If other does not have a data attribute.

  • TypeError: If other is not an instance of BBOB_Dataset.
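The concatenation behaviour can be sketched with a minimal class. `MiniDataset` is a hypothetical stand-in, not the library class, and it keeps only the two fields needed to show the operator. It assumes the result takes the batch size of the left operand, as the Returns entry above suggests.

```python
class MiniDataset:
    """Hypothetical stand-in that shows only the `+` behaviour."""

    def __init__(self, data, batch_size=1):
        self.data = list(data)
        self.batch_size = batch_size

    def __len__(self):
        return len(self.data)

    def __add__(self, other):
        if not isinstance(other, MiniDataset):
            # Python turns NotImplemented into a TypeError for the caller.
            return NotImplemented
        # New object; neither operand is modified.
        return MiniDataset(self.data + other.data, self.batch_size)


a = MiniDataset(["f1", "f2", "f3"], batch_size=2)
b = MiniDataset(["f4", "f5"], batch_size=4)
combined = a + b
print(len(combined), combined.batch_size)  # prints: 5 2
```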

shuffle()[source]

Introduction

Randomly shuffles the indices of the dataset, updating the internal index order.

Built-in Attribute:

  • self.N (int): The total number of elements in the dataset.

  • self.index (np.ndarray): The array storing the current order of indices.

Returns:

  • None

Notes:

This method modifies the self.index attribute in-place using a random permutation.
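A random permutation of the indices can be sketched with the standard library. The function `shuffled_index` is hypothetical, and a plain list stands in for the np.ndarray that the class stores. It returns a new ordering rather than mutating an object, so a caller would assign the result to its own index attribute.

```python
import random


def shuffled_index(n, rng=None):
    """Return the integers 0..n-1 in a random order."""
    rng = rng or random.Random()
    index = list(range(n))
    rng.shuffle(index)  # in-place Fisher-Yates shuffle of the list
    return index


# A seeded generator makes the order reproducible between runs.
order = shuffled_index(5, random.Random(0))
print(sorted(order) == [0, 1, 2, 3, 4])  # prints: True
```

Only the order of the indices changes: every item still appears exactly once, so the dataset length and the batch offsets in ptr are unaffected.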