Autoconfig
This module provides functionality for automatically configuring a tabular
data engine and searching for its optimal parameters. The main class,
Autoconfig, runs a grid search over model architectures and batch sizes to
find the configuration with the lowest reconstruction loss.
- class clearbox_synthetic.utils.autoconfig.autoconfig.Autoconfig(train_ds: ndarray, numerical_features_sizes: int, categorical_features_sizes: List, y_train_ds: ndarray | None = None)
Bases: object
A class for automatically configuring a tabular engine and searching for its optimal parameters.
- train_ds
The training dataset.
- Type:
np.ndarray
- y_train_ds
The target values for the training dataset.
- Type:
np.ndarray, optional
- numerical_features_sizes
The size of ordinal features.
- Type:
int
- categorical_features_sizes
The sizes of categorical features.
- Type:
List
- grid_search()
Performs a grid search to find the optimal model configuration.
The grid search iterates over different model architectures and batch sizes, fits each candidate model using multiple threads, and evaluates the fitted models to determine the configuration with the lowest mean reconstruction loss.
- Returns:
The optimal configuration (architecture and batch size) based on the evaluation loss.
- Return type:
list
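The search described for grid_search() can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the candidate architectures, the batch sizes, the `evaluate` stand-in, and the `max_workers` parameter are all hypothetical, chosen only to show the pattern of scoring every (architecture, batch size) pair on a thread pool and keeping the pair with the lowest mean reconstruction loss.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from statistics import mean

# Hypothetical search space; the candidates Autoconfig actually tries
# are not listed in this documentation.
ARCHITECTURES = [[64], [128, 64], [256, 128, 64]]
BATCH_SIZES = [32, 64, 128]


def evaluate(architecture, batch_size):
    """Stand-in for fitting a model and returning reconstruction losses.

    A toy deterministic formula is used so the example runs without
    any training backend.
    """
    base = 1.0 / sum(architecture) + abs(batch_size - 64) / 1000.0
    return [base, base * 1.1, base * 0.9]


def grid_search_sketch(architectures=ARCHITECTURES,
                       batch_sizes=BATCH_SIZES,
                       max_workers=4):
    """Return [architecture, batch_size] with the lowest mean loss."""
    candidates = list(product(architectures, batch_sizes))
    # Score every candidate concurrently; pool.map preserves input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        losses = list(pool.map(lambda c: mean(evaluate(*c)), candidates))
    best_index = min(range(len(candidates)), key=losses.__getitem__)
    architecture, batch_size = candidates[best_index]
    return [architecture, batch_size]


best = grid_search_sketch()
```

With the toy loss above, the widest architecture paired with a batch size of 64 is selected. A real evaluation function would train a model and report its loss on held-out data.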
- clearbox_synthetic.utils.autoconfig.autoconfig.learning_rule(training_rows_size: int)
Determines the learning rate, number of epochs, and batch size based on the size of the training data.
- Parameters:
training_rows_size (int) – The number of rows in the training dataset.
- Returns:
A tuple containing (learning_rate, epochs, batch_size).
- Return type:
Tuple[int, int, int]
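A size-based rule of this kind might look like the sketch below. The thresholds and returned values are invented for illustration and are not the ones used by learning_rule; the function name `learning_rule_sketch` is likewise hypothetical. The sketch also annotates the learning rate as a float, which is an assumption about how such a value is normally represented.

```python
from typing import Tuple


def learning_rule_sketch(training_rows_size: int) -> Tuple[float, int, int]:
    """Pick (learning_rate, epochs, batch_size) from the row count.

    All thresholds and values below are hypothetical.
    """
    if training_rows_size <= 0:
        raise ValueError("training_rows_size must be positive")
    if training_rows_size < 1_000:
        # Small datasets: small batches and more passes over the data.
        return 1e-2, 200, 32
    if training_rows_size < 100_000:
        # Medium datasets: moderate batch size and epoch count.
        return 1e-3, 100, 128
    # Large datasets: large batches and fewer epochs.
    return 1e-3, 30, 512


learning_rate, epochs, batch_size = learning_rule_sketch(50_000)
```

The general idea is that larger datasets tolerate larger batches and need fewer epochs, while small datasets benefit from more passes over the data.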