src.environment.problem.SOO.COCO_BBOB.kan.KANLayer¶
Module Contents¶
Classes¶
KANLayer class
API¶
- class src.environment.problem.SOO.COCO_BBOB.kan.KANLayer.KANLayer(in_dim=3, out_dim=2, num=5, k=3, noise_scale=0.5, scale_base_mu=0.0, scale_base_sigma=1.0, scale_sp=1.0, base_fun=torch.nn.SiLU(), grid_eps=0.02, grid_range=[-1, 1], sp_trainable=True, sb_trainable=True, save_plot_data=True, device='cpu', sparse_init=False)[source]¶
Bases: torch.nn.Module
KANLayer class
Attributes:¶
in_dim: int
    input dimension
out_dim: int
    output dimension
num: int
    the number of grid intervals
k: int
    the piecewise polynomial order of splines
noise_scale: float
    spline scale at initialization
coef: 2D torch.tensor
    coefficients of B-spline bases
scale_base_mu: float
    magnitude of the residual function b(x) is drawn from N(mu, sigma^2), mu = scale_base_mu
scale_base_sigma: float
    magnitude of the residual function b(x) is drawn from N(mu, sigma^2), sigma = scale_base_sigma
scale_sp: float
    magnitude of the spline function spline(x)
base_fun: fun
    residual function b(x)
mask: 1D torch.float
    mask of spline functions. Setting an element of the mask to zero sets the corresponding activation to the zero function.
grid_eps: float in [0,1]
    a hyperparameter used in update_grid_from_samples. When grid_eps = 1, the grid is uniform; when grid_eps = 0, the grid is partitioned using percentiles of samples. 0 < grid_eps < 1 interpolates between the two extremes.
    the id of activation functions that are locked
device: str
    device
Initialization
Initialize a KANLayer.
Args:¶
in_dim : int
    input dimension. Default: 3.
out_dim : int
    output dimension. Default: 2.
num : int
    the number of grid intervals = G. Default: 5.
k : int
    the order of piecewise polynomial. Default: 3.
noise_scale : float
    the scale of noise injected at initialization. Default: 0.5.
scale_base_mu : float
    the scale of the residual function b(x) is initialized to be N(scale_base_mu, scale_base_sigma^2).
scale_base_sigma : float
    the scale of the residual function b(x) is initialized to be N(scale_base_mu, scale_base_sigma^2).
scale_sp : float
    the scale of the spline function spline(x).
base_fun : function
    residual function b(x). Default: torch.nn.SiLU()
grid_eps : float
    When grid_eps = 1, the grid is uniform; when grid_eps = 0, the grid is partitioned using percentiles of samples. 0 < grid_eps < 1 interpolates between the two extremes.
grid_range : list/np.array of shape (2,)
    setting the range of grids. Default: [-1,1].
sp_trainable : bool
    If true, scale_sp is trainable
sb_trainable : bool
    If true, scale_base is trainable
device : str
    device
sparse_init : bool
    if sparse_init = True, sparse initialization is applied.
Returns:¶
self
Example¶
    from kan.KANLayer import *
    model = KANLayer(in_dim=3, out_dim=5)
    (model.in_dim, model.out_dim)
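The roles of num and k can be illustrated with a small, library-free sketch of a B-spline basis. It assumes a uniform grid over grid_range that is padded by k extra intervals on each side, and it evaluates the basis with the standard Cox-de Boor recursion. The helper names extend_grid and bspline_basis are hypothetical and are not part of KANLayer; the layer's own grid handling may differ.

```python
def extend_grid(lo, hi, num, k):
    """Uniform knots on [lo, hi] with num intervals, padded by k intervals per side.

    Hypothetical helper: the padding scheme is an assumption, not taken from KANLayer.
    """
    h = (hi - lo) / num
    return [lo + (i - k) * h for i in range(num + 2 * k + 1)]


def bspline_basis(x, knots, k):
    """Evaluate all order-k B-spline basis functions at x (Cox-de Boor recursion)."""
    # Order 0: indicator of the half-open knot interval that contains x.
    basis = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
             for i in range(len(knots) - 1)]
    # Raise the order one step at a time; each step yields one fewer function.
    for p in range(1, k + 1):
        nxt = []
        for i in range(len(knots) - p - 1):
            left = (x - knots[i]) / (knots[i + p] - knots[i]) * basis[i]
            right = ((knots[i + p + 1] - x)
                     / (knots[i + p + 1] - knots[i + 1]) * basis[i + 1])
            nxt.append(left + right)
        basis = nxt
    return basis


knots = extend_grid(-1.0, 1.0, num=5, k=3)
values = bspline_basis(0.3, knots, k=3)
print(len(knots), len(values))   # 12 knots, num + k = 8 basis functions
print(round(sum(values), 6))     # the basis sums to 1 inside the grid range
```

Under these assumptions, each activation function needs num + k spline coefficients, which is why a finer grid (larger num) increases the number of parameters per edge.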
- forward(x)[source]¶
KANLayer forward given input x
Args:¶
x : 2D torch.float inputs, shape (number of samples, input dimension)
Returns:¶
y : 2D torch.float
    outputs, shape (number of samples, output dimension)
preacts : 3D torch.float
    fan out x into activations, shape (number of samples, output dimension, input dimension)
postacts : 3D torch.float
    the outputs of activation functions with preacts as inputs
postspline : 3D torch.float
    the outputs of spline functions with preacts as inputs
Example¶
    import torch
    from kan.KANLayer import *
    model = KANLayer(in_dim=3, out_dim=5)
    x = torch.normal(0,1,size=(100,3))
    y, preacts, postacts, postspline = model(x)
    y.shape, preacts.shape, postacts.shape, postspline.shape
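How the four return values relate to each other can be sketched in plain Python. This is a simplified illustration, not the layer's implementation: it assumes each edge activation is scale_base * b(x) + scale_sp * spline(x) and that y sums the activations over the input dimension. It ignores the mask, and a placeholder callable stands in for the learned splines. The name forward_sketch and the [output][input] indexing of the scale tables are hypothetical.

```python
import math


def silu(v):
    """SiLU residual function b(x) = x * sigmoid(x)."""
    return v / (1.0 + math.exp(-v))


def forward_sketch(x, spline, scale_base, scale_sp):
    """x is a list of samples, each a list of in_dim floats.

    spline[j][i] is a callable standing in for the spline on the edge from
    input i to output j; scale_base[j][i] and scale_sp[j][i] are per-edge scales.
    """
    out_dim, in_dim = len(spline), len(spline[0])
    y, preacts, postacts, postspline = [], [], [], []
    for sample in x:
        # Fan the sample out so every output neuron sees every input.
        pre = [[sample[i] for i in range(in_dim)] for _ in range(out_dim)]
        sp = [[spline[j][i](pre[j][i]) for i in range(in_dim)]
              for j in range(out_dim)]
        post = [[scale_base[j][i] * silu(pre[j][i]) + scale_sp[j][i] * sp[j][i]
                 for i in range(in_dim)] for j in range(out_dim)]
        preacts.append(pre)
        postspline.append(sp)
        postacts.append(post)
        # Each output is the sum of its incoming edge activations.
        y.append([sum(row) for row in post])
    return y, preacts, postacts, postspline


# Placeholder "splines": every edge uses v -> v * v.
spline = [[(lambda v: v * v) for _ in range(3)] for _ in range(2)]
ones = [[1.0] * 3 for _ in range(2)]
x = [[0.1 * (b + i) for i in range(3)] for b in range(4)]
y, preacts, postacts, postspline = forward_sketch(x, spline, ones, ones)
print(len(y), len(y[0]))                                   # 4 2
print(len(preacts), len(preacts[0]), len(preacts[0][0]))   # 4 2 3
```

The nested-list shapes mirror the documented tensor shapes: y is (samples, output dimension), while preacts, postacts and postspline are (samples, output dimension, input dimension).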
- update_grid_from_samples(x, mode='sample')[source]¶
update grid from samples
Args:¶
x : 2D torch.float inputs, shape (number of samples, input dimension)
Returns:¶
None
Example¶
    import torch
    from kan.KANLayer import *
    model = KANLayer(in_dim=1, out_dim=1, num=5, k=3)
    print(model.grid.data)
    x = torch.linspace(-3,3,steps=100)[:,None]
    model.update_grid_from_samples(x)
    print(model.grid.data)
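The effect of grid_eps described in the class attributes can be sketched for a single input dimension. The following is a minimal illustration under assumptions: the uniform grid spans the sample minimum and maximum, the percentile grid takes equally spaced ranks of the sorted samples, and the two are blended linearly. The function name blended_grid is hypothetical, and the real method may differ in details such as padding the grid for the spline order k.

```python
def blended_grid(samples, num, grid_eps):
    """Return num + 1 grid points for one input dimension."""
    xs = sorted(samples)
    n = len(xs)
    lo, hi = xs[0], xs[-1]
    # Percentile grid: equally spaced ranks in the sorted samples.
    adaptive = [xs[(i * (n - 1)) // num] for i in range(num + 1)]
    # Uniform grid: equally spaced values between the sample min and max.
    uniform = [lo + i * (hi - lo) / num for i in range(num + 1)]
    # grid_eps = 1 keeps the uniform grid, grid_eps = 0 keeps the percentiles.
    return [grid_eps * u + (1.0 - grid_eps) * a
            for u, a in zip(uniform, adaptive)]


# Samples clustered around zero (cubes of evenly spaced points in [-1, 1]).
samples = [((i - 50) / 50) ** 3 for i in range(101)]
g_uniform = blended_grid(samples, num=5, grid_eps=1.0)
g_adaptive = blended_grid(samples, num=5, grid_eps=0.0)
print([round(g, 3) for g in g_uniform])    # [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]
print([round(g, 3) for g in g_adaptive])   # [-1.0, -0.216, -0.008, 0.008, 0.216, 1.0]
```

With the percentile grid, more grid points land where the samples are dense (here, near zero), which is the motivation for choosing a small grid_eps such as the default 0.02.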
- initialize_grid_from_parent(parent, x, mode='sample')[source]¶
update grid from a parent KANLayer & samples
Args:¶
parent : KANLayer
    a parent KANLayer (whose grid is usually coarser than the current model)
x : 2D torch.float
    inputs, shape (number of samples, input dimension)
Returns:¶
None
Example¶
    import torch
    from kan.KANLayer import *
    batch = 100
    parent_model = KANLayer(in_dim=1, out_dim=1, num=5, k=3)
    print(parent_model.grid.data)
    model = KANLayer(in_dim=1, out_dim=1, num=10, k=3)
    x = torch.normal(0,1,size=(batch, 1))
    model.initialize_grid_from_parent(parent_model, x)
    print(model.grid.data)
- get_subset(in_id, out_id)[source]¶
get a smaller KANLayer from a larger KANLayer (used for pruning)
Args:¶
in_id : list
    id of selected input neurons
out_id : list
    id of selected output neurons
Returns:¶
spb : KANLayer
Example¶
    from kan.KANLayer import *
    kanlayer_large = KANLayer(in_dim=10, out_dim=10, num=5, k=3)
    kanlayer_small = kanlayer_large.get_subset([0,9],[1,2,3])
    kanlayer_small.in_dim, kanlayer_small.out_dim
    # (2, 3)
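The index selection behind get_subset can be pictured with a plain nested list. This sketch assumes per-edge parameters are stored in a table indexed as [input][output]; the helper select_edges is hypothetical and only illustrates which edges survive, not how the layer copies its tensors.

```python
def select_edges(table, in_id, out_id):
    """Keep only the edges from inputs in in_id to outputs in out_id.

    table[i][j] holds the parameter of the edge from input i to output j.
    """
    return [[table[i][j] for j in out_id] for i in in_id]


# A 10 x 10 table whose entries name the edge they belong to.
large = [[f"edge_{i}_{j}" for j in range(10)] for i in range(10)]
small = select_edges(large, in_id=[0, 9], out_id=[1, 2, 3])
print(len(small), len(small[0]))   # 2 3
print(small[1][0])                 # edge_9_1
```

The resulting table has len(in_id) rows and len(out_id) columns, matching the (2, 3) dimensions in the example above.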
- swap(i1, i2, mode='in')[source]¶
swap the i1 neuron with the i2 neuron in input (if mode == ‘in’) or output (if mode == ‘out’)
Args:¶
i1 : int
i2 : int
mode : str
    mode = 'in' or 'out'
Returns:¶
None
Example¶
    from kan.KANLayer import *
    model = KANLayer(in_dim=2, out_dim=2, num=5, k=3)
    print(model.coef)
    model.swap(0,1,mode='in')
    print(model.coef)
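What swapping means for per-edge parameters can be sketched on a nested list, again assuming a table indexed as [input][output]. The function swap_neurons is a hypothetical stand-in: mode 'in' exchanges two rows (input neurons) and mode 'out' exchanges two columns (output neurons).

```python
def swap_neurons(table, i1, i2, mode="in"):
    """Swap two input rows or two output columns of table[i][j] in place."""
    if mode == "in":
        table[i1], table[i2] = table[i2], table[i1]
    elif mode == "out":
        for row in table:
            row[i1], row[i2] = row[i2], row[i1]
    else:
        raise ValueError("mode must be 'in' or 'out'")


table = [["a0", "a1"],
         ["b0", "b1"]]
swap_neurons(table, 0, 1, mode="in")
print(table)   # [['b0', 'b1'], ['a0', 'a1']]
swap_neurons(table, 0, 1, mode="out")
print(table)   # [['b1', 'b0'], ['a1', 'a0']]
```

Applying the same swap twice restores the original order, so the operation only relabels neurons and does not change the set of edges.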