gklr package

Submodules

gklr.calcs module

GKLR calcs module.

class gklr.calcs.Calcs(K)[source]

Bases: ABC

Base Calcs class object.

Constructor.

Parameters:

K (KernelMatrix) – KernelMatrix object.

abstract calc_G(Y)[source]

Calculate the generating function G of a Generalized Extreme Value (GEV) model and its derivative.

calc_P(Y, G, G_j)[source]

Calculate the matrix of probabilities for each alternative for each row of the dataset.

Parameters:
  • Y (ndarray) – The auxiliary matrix Y that contains the exponentiated values of the matrix f. Shape: (n_samples, num_alternatives).

  • G (ndarray) – The auxiliary matrix G. Shape: (n_samples, 1).

  • G_j (ndarray) – The derivative of the auxiliary matrix G. Shape: (n_samples, num_alternatives).

Returns:

The matrix of probabilities for each alternative for each row of the dataset. Each column corresponds to an alternative and each row to a row of the dataset. The sum of the probabilities for each row is 1. Shape: (n_samples, num_alternatives).

Return type:

ndarray
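As a minimal sketch of how the three arguments combine, the usual GEV choice probability is P[i, j] = Y[i, j] * G_j[i, j] / G[i]. The snippet below is not taken from the GKLR source: the helper name gev_probabilities is hypothetical, and the GEV scale parameter is assumed to be 1.

```python
import numpy as np

def gev_probabilities(Y, G, G_j):
    """Hypothetical helper: P[i, j] = Y[i, j] * G_j[i, j] / G[i, 0]."""
    # G has shape (n_samples, 1), so it broadcasts across the columns of Y.
    return Y * G_j / G

# Logit special case: G is the row sum of Y and its derivative is a matrix of ones.
f = np.array([[0.0, 1.0, 2.0],
              [1.0, 1.0, 1.0]])
Y = np.exp(f)
G = Y.sum(axis=1, keepdims=True)
G_j = np.ones_like(Y)
P = gev_probabilities(Y, G, G_j)
```

In this special case each row of P sums to 1, and equal utilities give equal probabilities.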

calc_Y(f)[source]

Calculate the auxiliary matrix Y that contains the exponentiated values of the matrix f.

Parameters:

f (ndarray) – The matrix of utility function values for each alternative for each row of the dataset. Shape: (n_samples, num_alternatives).

Returns:

The auxiliary matrix Y that contains the exponentiated values of the matrix f. Shape: (n_samples, num_alternatives).

Return type:

ndarray

abstract calc_f(alpha)[source]

Calculate the value of utility function for each alternative for each row of the dataset.

abstract calc_probabilities(alpha)[source]

Calculate the probabilities for each alternative.

abstract gradient(alpha, P)[source]

Calculate the log-likelihood of the model and its gradient for the given parameters.

abstract log_likelihood(alpha, P, choice_indices)[source]

Calculate the log-likelihood of the model for the given parameters.

gklr.config module

GKLR Config module.

class gklr.config.Config[source]

Bases: object

Configuration class for the GKLR package.

This class stores the configuration and hyperparameters for the GKLR package.

Constructor.

check_values()[source]

Checks validity of hyperparameter values. Raises an error if any of the hyperparameters is not valid.

remove_hyperparameter(key)[source]

Helper method to remove a hyperparameter from GKLR.

Parameters:

key (str) – The hyperparameter to remove.

set_hyperparameter(key, value)[source]

Helper method to set the hyperparameters of GKLR.

Parameters:
  • key (str) – The hyperparameter to set.

  • value (Any) – The value to set the hyperparameter to.

gklr.estimation module

GKLR estimation module.

class gklr.estimation.Estimation(calcs, pmle=None, pmle_lambda=0.0, method='L-BFGS-B', verbose=1)[source]

Bases: ABC

Base Estimation class object.

Constructor.

Parameters:
  • calcs (Calcs) – Calcs object.

  • pmle (str | None) – Indicates the penalization method for the penalized maximum likelihood estimation. If ‘None’ a maximum likelihood estimation without penalization is performed. Default: None.

  • pmle_lambda (float) – The value of the regularization parameter for the PMLE method. Default: 0.0.

  • method (str) – The optimization method. Default: “L-BFGS-B”.

  • verbose (int) – Indicates the level of verbosity of the function. If 0, no output will be printed. If 1, basic information about the estimation procedure will be printed. If 2, the information about each iteration will be printed. Default: 1.

abstract gradient()[source]

minimize(params, loss_tol=1e-06, options=None)[source]

Minimize the objective function.

Parameters:
  • params (ndarray) – The initial values of the model parameters. Shape: (n_params,).

  • loss_tol (float) – The tolerance for the loss function. Default: 1e-06.

  • options (Dict[str, Any] | None) – A dict with advanced options for the optimization method. Default: None.

Returns:

A dict with the results of the optimization.

Return type:

Dict[str, Any]

abstract objective_function()[source]

abstract objective_function_with_gradient()[source]

gklr.gklr module

GKLR main module.

class gklr.gklr.KernelModel(model_params=None)[source]

Bases: object

Main class for GKLR models.

Constructor.

Parameters:

model_params (Optional[Dict[str, Any]]) – A dict where the keys are the parameters of the kernel model and the value they contain. Default: None.

clear_kernel(dataset='train')[source]

Clear the kernel matrices previously computed.

Removes the train and test kernel matrices and frees the memory.

Parameters:

dataset (str) – The kernel matrix to be deleted. It can take the values: “train”, “test” or “both”. Default: “train”.

Return type:

None

fit(init_parms=None, pmle='Tikhonov', pmle_lambda=0, method='L-BFGS-B', options=None, verbose=1)[source]

Fit the kernel model.

Perform the estimation of the kernel model and store post-estimation results.

Parameters:
  • init_parms (ndarray | None) – Initial value of the parameters to be optimized. Shape: (num_cols_kernel_matrix, n_features). Default: None

  • pmle (str) – Penalization method. Default: “Tikhonov”.

  • pmle_lambda (float) – Parameter for the penalization method. Default: 0

  • method (str) – Optimization method. Default: “L-BFGS-B”.

  • options (Dict[str, Any] | None) – Options for the optimization method. Default: None.

  • verbose (int) – Indicates the level of verbosity of the function. If 0, no output will be printed. If 1, basic information about the time spent and the Log-likelihood value will be displayed. Default: 1.

Return type:

None

get_kernel(dataset='train')[source]

Returns the train and/or test KernelMatrix object.

Parameters:

dataset (str) – The kernel matrix to be retrieved. It can take the values: “train”, “test” or “both”. Default: “train”.

Returns:

The KernelMatrix object.

Return type:

KernelMatrix | None

predict(train=False)[source]

Predict class for the train or test kernel.

Parameters:

train (bool) – A boolean that indicates if the prediction belongs to the training set (True) or test set (False), only in the case that a test kernel matrix is defined. Default: False.

Returns:

Vector containing the class labels of the sample.

Return type:

ndarray

predict_log_proba(train=False)[source]

Predict the natural logarithm of the class probabilities for the train or test kernel.

Parameters:

train (bool) – A boolean that indicates if the probability estimates belong to the training set (True) or test set (False), only in the case that a test kernel matrix is defined. Default: False.

Returns:

Log-probability of the sample for each class in the model.

Return type:

ndarray

predict_proba(train=False)[source]

Predict class probabilities for the train or test kernel.

Parameters:

train (bool) – A boolean that indicates if the probability estimates belong to the training set (True) or test set (False), only in the case that a test kernel matrix is defined. Default: False.

Returns:

Probability of the sample for each class in the model.

Return type:

ndarray

score()[source]

Predict the mean accuracy on the test kernel.

Returns:

Mean accuracy of self.predict().

Return type:

float | np.float64
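The prediction methods relate to each other in a simple way, sketched below with plain NumPy. This is an illustration under stated assumptions, not the package's code: it assumes that predict returns the alternative with the highest probability and that score is the fraction of correct predictions. All arrays are made up for the example.

```python
import numpy as np

# Hypothetical probabilities for 4 observations and 3 alternatives (rows sum to 1).
proba = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.6, 0.3],
                  [0.3, 0.3, 0.4],
                  [0.5, 0.4, 0.1]])
alternatives = np.array([1, 2, 3])   # assumed alternative IDs
observed = np.array([1, 2, 1, 1])    # assumed observed choices

log_proba = np.log(proba)                         # analogue of predict_log_proba
predicted = alternatives[proba.argmax(axis=1)]    # analogue of predict
accuracy = float(np.mean(predicted == observed))  # analogue of score
```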

set_kernel_test(Z, choice_column=None, attributes=None, verbose=1)[source]

Computes the kernel matrix for the test dataset.

Processes the test dataset and creates the corresponding kernel matrix. The kernel matrix is encapsulated and stored using the KernelMatrix class.

Parameters:
  • Z (DataFrame) – Test dataset stored in a pandas DataFrame. Shape: (n_samples, n_features)

  • choice_column (str | None) – Name of the column of DataFrame Z that contains the ID of the chosen alternative.

  • attributes (Dict[int, List[str]] | None) – A dict that contains the columns of DataFrame Z that are considered for each alternative. This dict is indexed by the ID of the available alternatives in the dataset and the values are lists containing the names of all the columns considered for that alternative.

  • verbose (int) – Indicates the level of verbosity of the function. If 0, no output will be printed. If 1, basic information about the time spent and the size of the matrix will be displayed. Default: 1.

Return type:

None

set_kernel_train(X, choice_column, attributes, hyperparams, verbose=1)[source]

Computes the kernel matrix for the train dataset.

Processes the train dataset and creates the corresponding kernel matrix. The kernel matrix is encapsulated and stored using the KernelMatrix class.

Parameters:
  • X (DataFrame) – Train dataset stored in a pandas DataFrame. Shape: (n_samples, n_features)

  • choice_column (str) – Name of the column of DataFrame X that contains the ID of the chosen alternative.

  • attributes (Dict[int, List[str]]) – A dict that contains the columns of DataFrame X that are considered for each alternative. This dict is indexed by the ID of the available alternatives in the dataset and the values are lists containing the names of all the columns considered for that alternative.

  • hyperparams (Dict[str, Any]) – A dict where the keys are the hyperparameters passed to the kernel function and the value they contain.

  • verbose (int) – Indicates the level of verbosity of the function. If 0, no output will be printed. If 1, basic information about the time spent and the size of the matrix will be displayed. Default: 1.

Return type:

None
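The shape of the attributes argument may be easier to see in a small example. The column names and alternative IDs below are hypothetical, and the keys accepted in hyperparams are not listed in this reference, so they are left out of the sketch.

```python
# Hypothetical column names for a three-alternative mode-choice dataset.
attributes = {
    1: ["time_car", "cost_car"],
    2: ["time_bus", "cost_bus"],
    3: ["time_train", "cost_train"],
}
choice_column = "choice"  # column holding the ID (1, 2 or 3) of the chosen alternative

# Columns that the DataFrame X would be expected to contain for this setup.
required_columns = sorted(
    {col for cols in attributes.values() for col in cols} | {choice_column}
)
```

With a DataFrame X holding these columns, the call would take the form set_kernel_train(X, choice_column, attributes, hyperparams).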

summary()[source]

Print a summary of the estimation results.

Return type:

None

gklr.kernel_calcs module

GKLR kernel_calcs module.

class gklr.kernel_calcs.KernelCalcs(K)[source]

Bases: Calcs

Main calculations for the Kernel Logistic Regression (KLR) model.

Constructor.

Parameters:

K (KernelMatrix) – KernelMatrix object.

calc_G(Y)[source]

Calculate the generating function G of a Generalized Extreme Value (GEV) model and its derivative. For the KLR model, the generating function is the sum of the exponentiated utilities of the alternatives for each row of the dataset.

Parameters:

Y (ndarray) – The auxiliary matrix Y that contains the exponentiated values of the matrix f. Shape: (n_samples, num_alternatives).

Returns:

A tuple with the auxiliary matrix G and its derivative.

The auxiliary matrix G is a numpy array of shape: (n_samples, 1) and its derivative G_j is a numpy array of shape: (n_samples, num_alternatives).

Return type:

Tuple[ndarray, ndarray]
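For the logit case described here, the generating function and its derivative can be sketched in a few lines of NumPy. The function name klr_generating_function is hypothetical and the code only illustrates the documented shapes. It is not the package's implementation.

```python
import numpy as np

def klr_generating_function(Y):
    """Hypothetical stand-in for calc_G in the KLR (logit) case."""
    G = Y.sum(axis=1, keepdims=True)   # shape (n_samples, 1)
    G_j = np.ones_like(Y)              # dG/dY_j = 1 for every alternative
    return G, G_j

Y = np.exp(np.array([[0.0, 0.5], [1.0, -1.0]]))
G, G_j = klr_generating_function(Y)
```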

calc_f(alpha, indices=None)[source]

Calculate the value of utility function for each alternative for each row of the dataset.

Parameters:
  • alpha (ndarray) – The vector of parameters. Shape: (num_cols_kernel_matrix, num_alternatives).

  • indices (ndarray | None) – The indices of the rows of the dataset for which the utility function is calculated. If None, all the rows are used. Default: None.

Returns:

A matrix where each row corresponds to the utility of each alternative for each row of the dataset. Shape: (n_samples, num_alternatives).

Return type:

ndarray
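The documented shapes suggest that the utilities are a product of the kernel matrix and alpha. The sketch below assumes a single kernel matrix shared by all alternatives and a hypothetical RBF kernel. Neither assumption is confirmed by this reference.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_alternatives = 5, 3

# Hypothetical RBF kernel matrix built from random features (square in the train case).
X = rng.normal(size=(n_samples, 2))
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-0.5 * sq_dists)                      # shape (n_samples, n_samples)

alpha = rng.normal(size=(K.shape[1], n_alternatives))
f = K @ alpha                                    # shape (n_samples, n_alternatives)

indices = np.array([0, 2])
f_subset = K[indices, :] @ alpha                 # utilities for the selected rows only
```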

calc_probabilities(alpha, indices=None)[source]

Calculate the probabilities for each alternative.

Obtain the probabilities for each alternative for each row of the dataset.

Parameters:
  • alpha (ndarray) – The vector of parameters. Shape: (num_cols_kernel_matrix, num_alternatives).

  • indices (ndarray | None) – The indices of the rows of the dataset for which the probabilities are calculated. If None, the probabilities are calculated for all rows of the dataset. Default: None.

Returns:

A matrix of probabilities for each alternative for each row of the dataset. Each column corresponds to an alternative and each row to a row of the dataset. The sum of the probabilities for each row is 1. Shape: (n_samples, num_alternatives).

Return type:

ndarray

gradient(alpha, P=None, pmle=None, pmle_lambda=0, indices=None)[source]

Calculate the gradient of the log-likelihood function of the KLR model for the given parameters.

Parameters:
  • alpha (ndarray) – The vector of parameters. Shape: (num_cols_kernel_matrix, num_alternatives).

  • pmle (str | None) – It specifies the type of penalization for performing a penalized maximum likelihood estimation. Default: None.

  • pmle_lambda (float) – The lambda parameter for the penalized maximum likelihood. Default: 0.

  • P (ndarray | None) – The matrix of probabilities of each alternative for each row of the dataset. If None, the probabilities are calculated. Shape: (n_samples, num_alternatives). Default: None.

  • indices (ndarray | None) – The indices of the rows of the dataset for which the gradient is calculated. If None, the gradient is calculated for all rows of the dataset. Default: None.

Returns:

The gradient of the log-likelihood function of the KLR model for the given parameters. Shape: (num_rows_kernel_matrix * num_alternatives, ).

Return type:

ndarray

log_likelihood(alpha, P=None, choice_indices=None, pmle=None, pmle_lambda=0, indices=None)[source]

Calculate the log-likelihood of the KLR model for the given parameters.

Parameters:
  • alpha (ndarray) – The vector of parameters. Shape: (num_cols_kernel_matrix, num_alternatives).

  • P (ndarray | None) – The matrix of probabilities of each alternative for each row of the dataset. If None, the probabilities are calculated. Shape: (n_samples, num_alternatives). Default: None.

  • choice_indices (ndarray | None) – The indices of the chosen alternatives for each row of the dataset. If None, the indices are obtained from the KernelMatrix object. Shape: (n_samples,). Default: None.

  • pmle (str | None) – It specifies the type of penalization for performing a penalized maximum likelihood estimation. Default: None.

  • pmle_lambda (float) – The lambda parameter for the penalized maximum likelihood. Default: 0.

  • indices (ndarray | None) – The indices of the rows of the dataset for which the log-likelihood is calculated. If None, the log-likelihood is calculated for all rows of the dataset. Default: None.

Returns:

The log-likelihood of the KLR model for the given parameters.

Return type:

float
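Without the penalty term, a multinomial log-likelihood is the sum of the log-probabilities of the chosen alternatives. The sketch below shows that computation. The function name is hypothetical, and the package's sign and penalty conventions may differ.

```python
import numpy as np

def log_likelihood_sketch(P, choice_indices):
    """Sum of log-probabilities of the chosen alternatives (no penalty term)."""
    rows = np.arange(P.shape[0])
    return float(np.log(P[rows, choice_indices]).sum())

P = np.array([[0.5, 0.25, 0.25],
              [0.1, 0.8, 0.1]])
choice_indices = np.array([0, 1])   # chosen column for each row
ll = log_likelihood_sketch(P, choice_indices)
```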

tikhonov_penalty(alpha, pmle_lambda)[source]

Calculate the Tikhonov penalty for the given parameters.

Parameters:
  • alpha (ndarray) – The vector of parameters. Shape: (num_cols_kernel_matrix, num_alternatives).

  • pmle_lambda (float) – The lambda parameter for the penalized maximum likelihood.

Returns:

The Tikhonov penalty for the given parameters.

Return type:

float
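One common form of the Tikhonov penalty in kernel methods is lambda times the sum over alternatives of alpha_j^T K alpha_j. The sketch below uses that form as an assumption. The scaling used by the package (for example a factor of 1/2) is not stated in this reference, and the function name is hypothetical.

```python
import numpy as np

def tikhonov_penalty_sketch(alpha, K, pmle_lambda):
    """Assumed form: lambda * sum_j alpha_j^T K alpha_j."""
    # Element-wise product with K @ alpha sums the quadratic form over all columns.
    return float(pmle_lambda * np.sum(alpha * (K @ alpha)))

K = np.array([[2.0, 0.0], [0.0, 1.0]])
alpha = np.array([[1.0, 0.0], [0.0, 3.0]])
penalty = tikhonov_penalty_sketch(alpha, K, pmle_lambda=0.5)
```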

tikhonov_penalty_gradient(alpha, pmle_lambda, indices=None)[source]

Calculate the gradient of the Tikhonov penalty for the given parameters.

Parameters:
  • alpha (ndarray) – The vector of parameters. Shape: (num_cols_kernel_matrix, num_alternatives).

  • pmle_lambda (float) – The lambda parameter for the penalized maximum likelihood.

  • indices (ndarray | None) – The indices of the rows of the dataset for which the gradient is calculated. If None, all the rows are used. Default: None.

Returns:

The gradient of the Tikhonov penalty for the given parameters.

If indices is None, the shape is (num_cols_kernel_matrix, num_alternatives), otherwise, the shape is (len(indices), num_alternatives).

Return type:

ndarray

gklr.kernel_estimator module

GKLR kernel_estimator module.

class gklr.kernel_estimator.KernelEstimator(calcs, pmle=None, pmle_lambda=0.0, method='L-BFGS-B', verbose=1)[source]

Bases: Estimation

Estimation object for the Kernel Logistic Regression (KLR) model.

Constructor.

Parameters:
  • calcs (KernelCalcs) – Calcs object.

  • pmle (str | None) – Indicates the penalization method for the penalized maximum likelihood estimation. If ‘None’ a maximum likelihood estimation without penalization is performed. Default: None.

  • pmle_lambda (float) – The value of the regularization parameter for the PMLE method. Default: 0.0.

  • method (str) – The optimization method. Default: “L-BFGS-B”.

  • verbose (int) – Indicates the level of verbosity of the function. If 0, no output will be printed. If 1, basic information about the estimation procedure will be printed. If 2, the information about each iteration will be printed. Default: 1.

gradient(params, indices=None)[source]

Compute the gradient of the objective function for the Kernel Logistic Regression (KLR) model.

This function is used by the optimization methods that do not require the computation of the objective function. If the objective function is also required, it is more efficient to use the ‘objective_function_with_gradient’ method.

Parameters:
  • params (ndarray) – The model parameters. Shape: (n_params,).

  • indices (ndarray | None) – The indices of the samples to be used in the computation of the gradient. If ‘None’ all the samples will be used. Default: None.

Returns:

The gradient of the objective function with respect to the model parameters with shape: (num_rows_kernel_matrix * num_alternatives,).

Return type:

ndarray

minimize(params, loss_tol=1e-06, options=None, **kargs)[source]

Minimize the objective function.

Parameters:
  • params (ndarray) – The initial values of the model parameters. Shape: (n_params,).

  • loss_tol (float) – The tolerance for the loss function. Default: 1e-06.

  • options (Dict[str, Any] | None) – A dict with advanced options for the optimization method. Default: None.

  • **kargs (Dict[str, Any]) – Additional arguments for the minimization function.

Returns:

A dict with the results of the optimization.

Return type:

Dict[str, Any]

objective_function(params, indices=None)[source]

Compute the objective function for the Kernel Logistic Regression (KLR) model.

Parameters:
  • params (ndarray) – The model parameters. Shape: (n_params,).

  • indices (ndarray | None) – The indices of the samples to be used in the computation of the objective function. If ‘None’ all the samples will be used. Default: None.

Returns:

The value of the objective function.

Return type:

float

objective_function_with_gradient(params, indices=None)[source]

Compute the objective function for the Kernel Logistic Regression (KLR) model and its gradient.

Parameters:
  • params (ndarray) – The model parameters. Shape: (n_params,).

  • indices (ndarray | None) – The indices of the samples to be used in the computation of the objective function. If ‘None’ all the samples will be used. Default: None.

Returns:

A tuple with the value of the objective function and its gradient. The first element of the tuple is the value of the objective function and the second element is the gradient of the objective function with respect to the model parameters with shape: (num_rows_kernel_matrix * num_alternatives,)

Return type:

Tuple[float, ndarray]

gklr.kernel_matrix module

GKLR kernel_matrix module.

class gklr.kernel_matrix.KernelMatrix(X, choice_column, attributes, config, Z=None)[source]

Bases: object

Class to store the kernel matrix and its associated data.

Constructor.

Parameters:
  • X (DataFrame) – Train dataset stored in a pandas DataFrame. Shape: (n_samples, n_features).

  • choice_column (str) – Name of the column of DataFrame X that contains the ID of the chosen alternative.

  • attributes (Dict[int, List[str]]) – A dict that contains the columns of DataFrame X that are considered for each alternative. This dict is indexed by the ID of the available alternatives in the dataset and the values are lists containing the names of all the columns considered for that alternative.

  • config (Config) – A Config object that contains the hyperparameters of the GKLR model.

  • Z (DataFrame | None) – Test dataset stored in a pandas DataFrame. Shape: (n_samples, n_features). Default: None

dot(A, K_index=0, row_indices=None, col_indices=None)[source]

Implements the dot product of the kernel matrix and numpy array A.

Implements the matrix multiplication K ∙ A, where K is the kernel matrix and A is a numpy array given as argument.

Parameters:
  • A (ndarray) – Numpy array to be multiplied by the kernel matrix. Shape: (num_cols_kernel_matrix, •)

  • K_index (int) – Index of the kernel matrix to be used.

  • row_indices (ndarray | None) – Indices of the rows of the kernel matrix to be used in the dot product. If None, all the rows are used. Default: None.

  • col_indices (ndarray | None) – Indices of the columns of the kernel matrix to be used in the dot product. If None, all the columns are used. Default: None.

Returns:

The dot product of the kernel matrix and A.

Shape: (num_rows_kernel_matrix, •)

Return type:

ndarray
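The effect of row_indices and col_indices can be sketched with a plain NumPy matrix. The helper below is hypothetical, and it assumes that when columns are selected the matching rows of A are used. The package may treat A differently.

```python
import numpy as np

def kernel_dot_sketch(K, A, row_indices=None, col_indices=None):
    """Hypothetical stand-in: multiply a sub-block of K by A."""
    rows = np.arange(K.shape[0]) if row_indices is None else np.asarray(row_indices)
    cols = np.arange(K.shape[1]) if col_indices is None else np.asarray(col_indices)
    # Assumption: when columns are selected, the matching rows of A are used.
    return K[np.ix_(rows, cols)] @ A[cols]

K = np.arange(12, dtype=float).reshape(4, 3)
A = np.ones((3, 2))
full = kernel_dot_sketch(K, A)                          # all rows and columns
partial = kernel_dot_sketch(K, A, row_indices=[0, 3])   # rows 0 and 3 only
```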

get_K(alt=None, index=None)[source]

Returns the kernel matrix for all the alternatives, for alternative alt, or the matrix at index index.

Parameters:
  • alt (int | None) – Alternative for which the kernel matrix is to be returned.

  • index (int | None) – Index of the kernel matrix to be returned.

Returns:

The kernel matrix for all the alternatives, for alternative alt, or the matrix at index index.

Return type:

ndarray | Dict[int, ndarray]

get_alternatives()[source]

Return the available alternatives.

Returns:

A numpy array with the available alternatives. Shape: (n_alternatives,).

Return type:

ndarray

get_choices()[source]

Return the choices per observation.

Returns:

A numpy array with the choices per observation. Shape: (n_samples,).

Return type:

ndarray

get_choices_indices()[source]

Return the choices per observation as alternative indices.

Returns:

A numpy array with the choices per observation as alternative indices. Shape: (n_samples,).

Return type:

ndarray

get_choices_matrix()[source]

Return the choices per observation as a matrix.

Obtain a sparse matrix with one row per observation and one column per alternative. A cell Z_ij of the matrix takes value 1 if individual i chooses alternative j; the cell contains 0 otherwise.

Returns:

A numpy array with the choices per observation as a matrix.

Shape: (n_samples, n_alternatives).

Return type:

ndarray
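The choice matrix is a one-hot encoding of the choice indices. A minimal NumPy sketch, using made-up indices in place of the output of get_choices_indices(), could look like this:

```python
import numpy as np

choice_indices = np.array([0, 2, 1, 2])  # hypothetical output of get_choices_indices()
n_alternatives = 3

# Z[i, j] = 1 if observation i chose the alternative at index j, 0 otherwise.
Z = np.zeros((choice_indices.size, n_alternatives), dtype=int)
Z[np.arange(choice_indices.size), choice_indices] = 1
```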

get_num_alternatives()[source]

Return the number of available alternatives.

Returns:

Number of available alternatives.

Return type:

int

get_num_cols()[source]

Return the number of columns of the kernel matrix.

Returns:

Number of columns of the kernel matrix, which corresponds to the number of reference observations.

Return type:

int

get_num_rows()[source]

Return the number of rows of the kernel matrix.

Returns:

Number of rows of the kernel matrix, which corresponds to the number of observations.

Return type:

int

get_num_samples()[source]

Return the number of observations in the dataset.

Returns:

Number of observations in the dataset.

Return type:

int

gklr.kernel_utils module

gklr.kernel_utils.convert_size_bytes_to_human_readable(size_in_bytes)[source]

Convert the size from bytes to other units like KB, MB or GB.

Parameters:

size_in_bytes – Size in bytes.

Returns:

A string with the size in bytes, KB, MB or GB.
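A conversion of this kind can be sketched as follows. This is a hypothetical re-implementation for illustration. The exact output format of the package function (number of decimals, unit labels) is an assumption.

```python
def human_readable_size(size_in_bytes):
    """Hypothetical re-implementation; the package's exact output format may differ."""
    size = float(size_in_bytes)
    for unit in ("Bytes", "KB", "MB"):
        if size < 1024:
            return f"{size:.2f} {unit}"
        size /= 1024   # move to the next larger unit
    return f"{size:.2f} GB"

label = human_readable_size(8 * 1024 ** 2)
```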

gklr.kernel_utils.elapsed_time_to_str(elapsed_time_sec)[source]

Convert the elapsed time in seconds to a string with the appropriate units.

Parameters:

elapsed_time_sec (float) – Elapsed time in seconds

Returns:

A string with the elapsed time in seconds or minutes.

Return type:

str
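A possible shape for such a helper is sketched below. The name elapsed_to_str, the 60-second threshold, and the output format are assumptions made for the example.

```python
def elapsed_to_str(elapsed_time_sec):
    """Hypothetical re-implementation: seconds below one minute, minutes otherwise."""
    if elapsed_time_sec < 60:
        return f"{elapsed_time_sec:.2f} seconds"
    return f"{elapsed_time_sec / 60:.2f} minutes"

short = elapsed_to_str(12.5)
long_run = elapsed_to_str(90)
```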

gklr.logger module

GKLR logger module.

gklr.logger.logger_critical(msg, *args, **kwargs)

Log ‘msg % args’ with severity ‘CRITICAL’.

To pass exception information, use the keyword argument exc_info with a true value, e.g.

logger.critical("Houston, we have a %s", "major disaster", exc_info=1)

gklr.logger.logger_debug(msg, *args, **kwargs)

Log ‘msg % args’ with severity ‘DEBUG’.

To pass exception information, use the keyword argument exc_info with a true value, e.g.

logger.debug("Houston, we have a %s", "thorny problem", exc_info=1)

gklr.logger.logger_error(msg, *args, **kwargs)

Log ‘msg % args’ with severity ‘ERROR’.

To pass exception information, use the keyword argument exc_info with a true value, e.g.

logger.error("Houston, we have a %s", "major problem", exc_info=1)

gklr.logger.logger_get_level()[source]

Gets the level of the logger.

Returns:

The current level of the logger.

Return type:

int

gklr.logger.logger_info(msg, *args, **kwargs)

Log ‘msg % args’ with severity ‘INFO’.

To pass exception information, use the keyword argument exc_info with a true value, e.g.

logger.info("Houston, we have a %s", "interesting problem", exc_info=1)

gklr.logger.logger_log(level, msg, *args, **kwargs)

Log ‘msg % args’ with the integer severity ‘level’.

To pass exception information, use the keyword argument exc_info with a true value, e.g.

logger.log(level, "We have a %s", "mysterious problem", exc_info=1)

gklr.logger.logger_set_level(level)[source]

Set the level of the logger.

Parameters:

level (int) – The level desired for the logger.

Return type:

None
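The level is an integer severity. The sketch below uses the standard library logging module to show how such levels behave. It assumes that the GKLR logger follows the same integer scale (the docstrings in this module match those of logging.Logger), and the logger name "gklr_example" is made up for the illustration.

```python
import logging

# Stand-in logger; the name "gklr_example" is an assumption for illustration.
logger = logging.getLogger("gklr_example")
logger.setLevel(logging.WARNING)

level = logger.getEffectiveLevel()            # integer severity, here 30
shown = logger.isEnabledFor(logging.ERROR)    # ERROR (40) is at or above WARNING (30)
hidden = logger.isEnabledFor(logging.INFO)    # INFO (20) is below WARNING (30)
```

Under that assumption, a call such as logger_set_level(logging.WARNING) would silence informational messages while keeping warnings and errors.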

gklr.logger.logger_warning(msg, *args, **kwargs)

Log ‘msg % args’ with severity ‘WARNING’.

To pass exception information, use the keyword argument exc_info with a true value, e.g.

logger.warning("Houston, we have a %s", "bit of a problem", exc_info=1)

gklr.optimizer module

GKLR optimizer module.

class gklr.optimizer.AcceleratedLinearSearch(gamma=1.1, theta=0.5, max_alpha=1.5, n_epochs=10)[source]

Bases: object

Class for the accelerated linear search algorithm.

Initialize the accelerated linear search algorithm.

Parameters:
  • gamma (float) – The gamma parameter. Default: 1.1.

  • theta (float) – The theta parameter. Default: 0.5.

  • max_alpha (float) – The maximum alpha value. Default: 1.5.

  • n_epochs (int) – Number of epochs in the main algorithm to perform one step of the accelerated linear search. Default: 10.

initialize(y_t)[source]

Initialize the accelerated linear search algorithm.

Parameters:

y_t (ndarray) – The value of the parameters at the current iteration.

Return type:

None

update_params(fun, y_t, *args)[source]

Execute the accelerated linear search algorithm and update the parameters.

Parameters:
  • fun (Callable) – The objective function to be minimized. fun(x, *args) -> float, where x is the input vector and args are the additional arguments of the objective function.

  • y_t (ndarray) – The value of the parameters at the current iteration.

  • *args – Additional arguments of the objective function.

Returns:

The new value of the weights.

Return type:

ndarray

class gklr.optimizer.LearningRateScheduler(lr_scheduler=None, lr_decay_rate=1, lr_decay_step=100)[source]

Bases: object

Implements different learning rate scheduling methods.

Initialize the learning rate scheduler.

Parameters:
  • lr_scheduler (str | None) – The method for the learning rate decay. Default: None.

  • lr_decay_rate (float) – The learning rate decay rate. Default: 1.

  • lr_decay_step (int) – The learning rate decay step for the step decay method. Default: 100.
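The names accepted by lr_scheduler are not listed in this reference. As an illustration of what a step decay typically does, the sketch below multiplies the learning rate by lr_decay_rate once every lr_decay_step epochs. The function name and the exact formula are assumptions.

```python
def step_decay(initial_lr, epoch, lr_decay_rate=1.0, lr_decay_step=100):
    """Assumed step decay: multiply by lr_decay_rate every lr_decay_step epochs."""
    return initial_lr * lr_decay_rate ** (epoch // lr_decay_step)

lr_150 = step_decay(0.1, 150, lr_decay_rate=0.5, lr_decay_step=100)  # one decay applied
lr_default = step_decay(0.1, 150)  # a decay rate of 1 keeps the rate constant
```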

class gklr.optimizer.MemoizeJac(fun)[source]

Bases: object

Decorator that caches the return values of a function returning (fun, grad) each time it is called.
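The idea of caching a (value, gradient) pair can be sketched as below. The class MemoizeJacSketch is a hypothetical illustration and not the package's code. It re-evaluates the wrapped function only when the input point changes, so that asking for the value and then the derivative at the same point costs one evaluation.

```python
import numpy as np

class MemoizeJacSketch:
    """Hypothetical caching wrapper for a function returning (value, gradient)."""

    def __init__(self, fun):
        self.fun = fun
        self.x = None
        self._value = None
        self.jac = None
        self.n_calls = 0

    def _compute_if_needed(self, x, *args):
        # Re-evaluate only when x differs from the cached point.
        if self.x is None or not np.array_equal(x, self.x):
            self.x = np.array(x, copy=True)
            self._value, self.jac = self.fun(x, *args)
            self.n_calls += 1

    def __call__(self, x, *args):
        self._compute_if_needed(x, *args)
        return self._value

    def derivative(self, x, *args):
        self._compute_if_needed(x, *args)
        return self.jac

def quadratic(x):
    # Returns the value x.x and its gradient 2x.
    return float(x @ x), 2.0 * x

wrapped = MemoizeJacSketch(quadratic)
x0 = np.array([1.0, 2.0])
value = wrapped(x0)
grad = wrapped.derivative(x0)   # served from the cache, no second evaluation
```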

derivative(x, *args)[source]

class gklr.optimizer.Optimizer[source]

Bases: object

Optimizer class object.

Constructor.

minimize(fun, x0, args=(), method='SGD', jac=None, hess=None, tol=1e-06, options=None)[source]

Minimize the objective function using the specified optimization method.

Parameters:
  • fun (Callable) – The objective function to be minimized. fun(x, *args) -> float, where x is the input vector and args are the additional arguments of the objective function.

  • x0 (ndarray) – The initial guess of the parameters.

  • args (tuple) – Additional arguments passed to the objective function.

  • method (str) – The optimization method. Default: “SGD”.

  • jac (Callable | bool | None) – The gradient of the objective function. Default: None.

  • hess (Callable | None) – The Hessian of the objective function. Default: None.

  • tol (float) – The tolerance for the termination. Default: 1e-06.

  • options (Dict[str, Any] | None) – A dictionary of solver options. Default: None.

Returns:

A dictionary containing the result of the optimization procedure, with the following keys:
  • fun: The value of the objective function at the solution.

  • x: A 1-D ndarray containing the solution.

  • success: A boolean indicating whether the optimization converged.

  • message: A string describing the cause of the termination.

Return type:

dict

minimize_mini_batch_sgd(fun, x0, jac=None, optimizer='SGD', args=(), learning_rate=0.001, mini_batch_size=None, n_samples=0, beta=0.9, beta1=0.9, beta2=0.999, epsilon=1e-08, learning_rate_scheduler=None, accelerated_linear_search=None, maxiter=1000, print_every=0, seed=0, **kwards)[source]

Minimize the objective function using the mini-batch stochastic gradient descent method.

Parameters:
  • fun (Callable) – The objective function to be minimized. fun(x, *args) -> float, where x is the input vector and args are the additional arguments of the objective function.

  • x0 (ndarray) – The initial guess of the parameters.

  • jac (Callable | None) – The gradient of the objective function. Default: None.

  • optimizer (str) – The variant of the mini-batch stochastic gradient descent method to be used. Default: “SGD”.

  • args (tuple) – Additional arguments passed to the objective function.

  • learning_rate (float) – The learning rate. Default: 1e-03.

  • mini_batch_size (int | None) – The mini-batch size. Default: None.

  • n_samples (int) – The number of samples in the dataset. Default: 0.

  • beta (float) – The momentum parameter. Default: 0.9.

  • beta1 (float) – The exponential decay rate for the first moment estimates (gradients) in the Adam method. Default: 0.9.

  • beta2 (float) – The exponential decay rate for the second moment estimates (squared gradients) in the Adam method. Default: 0.999.

  • epsilon (float) – A small constant for numerical stability in the Adam method. Default: 1e-08.

  • learning_rate_scheduler (Callable | None) – A function that computes the learning rate at each iteration. Default: None.

  • accelerated_linear_search (AcceleratedLinearSearch | None) – An instance of the AcceleratedLinearSearch class. If None, the accelerated linear search is not used. Default: None.

  • maxiter (int) – The maximum number of iterations or epochs. Default: 1000.

  • print_every (int) – The number of iterations to print the loss. If 0, the loss is not computed. If -1, the loss is computed at each iteration but not printed. If -2, the loss and time per iteration are computed but not printed. Default: 0.

  • seed (int) – The seed for the random number generator. Default: 0.

  • **kwards – Additional arguments passed to the objective function.

Returns:

A dictionary containing the result of the optimization procedure, with the following keys:
  • fun: The value of the objective function at the solution.

  • x: A 1-D ndarray containing the solution.

  • jac: The gradient of the objective function at the solution.

  • nit: The number of iterations.

  • nfev: The number of function evaluations.

  • success: A boolean indicating whether the optimization converged.

  • message: A string describing the cause of the termination.

  • history: A dictionary containing the loss history.

Return type:

dict
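To make the mini-batch loop concrete, here is a minimal sketch of plain mini-batch SGD on a toy least-squares problem. It is not the package's implementation: the function names, the callback signature fun_and_grad(x, indices), and the subset of result keys are assumptions, and momentum, Adam, learning rate scheduling and the accelerated linear search are left out.

```python
import numpy as np

def mini_batch_sgd_sketch(fun_and_grad, x0, n_samples, mini_batch_size=2,
                          learning_rate=0.05, maxiter=200, seed=0):
    """Plain mini-batch SGD; fun_and_grad(x, indices) -> (loss, gradient)."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(maxiter):
        order = rng.permutation(n_samples)          # reshuffle once per epoch
        for start in range(0, n_samples, mini_batch_size):
            batch = order[start:start + mini_batch_size]
            _, grad = fun_and_grad(x, batch)
            x -= learning_rate * grad               # SGD update on this batch
    loss, grad = fun_and_grad(x, np.arange(n_samples))
    return {"fun": loss, "x": x, "jac": grad, "nit": maxiter,
            "success": True, "message": "Maximum number of epochs reached."}

# Toy noiseless least-squares problem with known solution w = [2, -1].
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0],
              [2.0, 1.0], [1.0, 2.0], [0.5, -0.5], [-1.0, 0.5]])
b = A @ np.array([2.0, -1.0])

def least_squares(x, indices):
    # Mean squared residual over the selected rows and its gradient.
    r = A[indices] @ x - b[indices]
    return float(r @ r) / len(indices), 2.0 * A[indices].T @ r / len(indices)

result = mini_batch_sgd_sketch(least_squares, np.zeros(2), n_samples=8)
```

Because the toy data are noiseless, the iterates approach the exact solution even with a constant learning rate. With noisy data a decaying learning rate would usually be needed.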

Module contents

GKLR package

gklr.display_info()[source]

Display GKLR module information.