pflacco.classical_ela_features
- pflacco.classical_ela_features.calculate_cm_angle(X: Union[DataFrame, ndarray, List[List[float]]], y: Union[Series, ndarray, List[float]], lower_bound: Union[List[float], float], upper_bound: Union[List[float], float], blocks: Optional[Union[List[int], ndarray, int]] = None, minimize: bool = True) → Dict[str, Union[int, float]]
Calculation of Cell Mapping Angle features, similar to the R-package flacco.
- Parameters
X (Union[pd.DataFrame, np.ndarray, List[List[float]]]) – A collection-like object which contains a sample of the decision space. Can be created with sampling.create_initial_sample.
y (Union[pd.Series, np.ndarray, List[float]]) – A list-like object which contains the respective objective values of X.
lower_bound (Union[List[float], float]) – Lower bound of variables of the decision space.
upper_bound (Union[List[float], float]) – Upper bound of variables of the decision space.
blocks (Optional[Union[List[int], np.ndarray, int]], optional) – Number of blocks per dimension, by default None.
minimize (bool, optional) – Indicator whether the objective function should be minimized or maximized, by default True.
- Returns
Dictionary consisting of the calculated features.
- Return type
Dict[str, Union[int, float]]
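The cell mapping features rely on a grid over the decision space that is defined by lower_bound, upper_bound and blocks. The following is a minimal sketch, not pflacco's implementation, of how scalar or per-dimension arguments could be broadcast and how sample points could be mapped to grid cells. The helper names broadcast and assign_cells are hypothetical.

```python
from typing import List, Sequence, Tuple, Union

Number = Union[int, float]


def broadcast(value: Union[Number, Sequence[Number]], dim: int) -> List[Number]:
    """Expand a scalar to a list of length dim, or validate a given sequence."""
    if isinstance(value, (int, float)):
        return [value] * dim
    values = list(value)
    if len(values) != dim:
        raise ValueError(f"expected {dim} entries, got {len(values)}")
    return values


def assign_cells(
    X: Sequence[Sequence[float]],
    lower_bound: Union[Number, Sequence[Number]],
    upper_bound: Union[Number, Sequence[Number]],
    blocks: Union[int, Sequence[int]],
) -> List[Tuple[int, ...]]:
    """Return, for every point, the per-dimension index of the cell it falls into."""
    dim = len(X[0])
    lower = broadcast(lower_bound, dim)
    upper = broadcast(upper_bound, dim)
    n_blocks = broadcast(blocks, dim)
    cells = []
    for x in X:
        index = []
        for xi, lo, hi, b in zip(x, lower, upper, n_blocks):
            width = (hi - lo) / b
            # Clamp so that a point on the upper bound lands in the last block.
            index.append(min(int((xi - lo) / width), b - 1))
        cells.append(tuple(index))
    return cells


# Hypothetical 2-D sample on [-5, 5]^2 with 2 blocks in x1 and 4 blocks in x2.
X = [[-4.0, -4.0], [0.0, 0.0], [4.9, -0.1], [5.0, 5.0]]
cells = assign_cells(X, lower_bound=-5, upper_bound=5, blocks=[2, 4])
```

Passing a scalar applies the same value to every dimension, whereas a list gives one value per dimension.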
- pflacco.classical_ela_features.calculate_cm_conv(X: Union[DataFrame, ndarray, List[List[float]]], y: Union[Series, ndarray, List[float]], lower_bound: Union[List[float], float], upper_bound: Union[List[float], float], blocks: Optional[Union[List[int], ndarray, int]] = None, minimize: bool = True, cm_conv_diag: bool = False, cm_conv_fast_k: float = 0.05) → Dict[str, Union[int, float]]
Calculation of Cell Mapping Convexity features, similar to the R-package flacco.
- Parameters
X (Union[pd.DataFrame, np.ndarray, List[List[float]]]) – A collection-like object which contains a sample of the decision space. Can be created with sampling.create_initial_sample.
y (Union[pd.Series, np.ndarray, List[float]]) – A list-like object which contains the respective objective values of X.
lower_bound (Union[List[float], float]) – Lower bound of variables of the decision space.
upper_bound (Union[List[float], float]) – Upper bound of variables of the decision space.
blocks (Optional[Union[List[int], np.ndarray, int]], optional) – Number of blocks per dimension, by default None.
minimize (bool, optional) – Indicator whether the objective function should be minimized or maximized, by default True.
cm_conv_diag (bool, optional) – Indicator which, when True, also considers cells on the diagonal as neighbours, by default False.
cm_conv_fast_k (float, optional) – Percentage of elements that should be considered within the nearest neighbour computation, by default 0.05.
- Returns
Dictionary consisting of the calculated features.
- Return type
Dict[str, Union[int, float]]
- pflacco.classical_ela_features.calculate_cm_grad(X: Union[DataFrame, ndarray, List[List[float]]], y: Union[Series, ndarray, List[float]], lower_bound: Union[List[float], float], upper_bound: Union[List[float], float], blocks: Optional[Union[List[int], ndarray, int]] = None, minimize: bool = True) → Dict[str, Union[int, float]]
Calculation of Cell Mapping Gradient Homogeneity features, similar to the R-package flacco.
- Parameters
X (Union[pd.DataFrame, np.ndarray, List[List[float]]]) – A collection-like object which contains a sample of the decision space. Can be created with sampling.create_initial_sample.
y (Union[pd.Series, np.ndarray, List[float]]) – A list-like object which contains the respective objective values of X.
lower_bound (Union[List[float], float]) – Lower bound of variables of the decision space.
upper_bound (Union[List[float], float]) – Upper bound of variables of the decision space.
blocks (Optional[Union[List[int], np.ndarray, int]], optional) – Number of blocks per dimension, by default None.
minimize (bool, optional) – Indicator whether the objective function should be minimized or maximized, by default True.
- Returns
Dictionary consisting of the calculated features.
- Return type
Dict[str, Union[int, float]]
- pflacco.classical_ela_features.calculate_dispersion(X: Union[DataFrame, ndarray, List[List[float]]], y: Union[Series, ndarray, List[float]], disp_quantiles: List[float] = [0.02, 0.05, 0.1, 0.25], dist_method: str = 'euclidean', dist_p: int = 2, minimize: bool = True) → Dict[str, Union[int, float]]
Calculation of Dispersion features, similar to the R-package flacco.
- Parameters
X (Union[pd.DataFrame, np.ndarray, List[List[float]]]) – A collection-like object which contains a sample of the decision space. Can be created with sampling.create_initial_sample.
y (Union[pd.Series, np.ndarray, List[float]]) – A list-like object which contains the respective objective values of X.
disp_quantiles (List[float], optional) – Quantiles which are used to determine the best elements of the entire sample, by default [0.02, 0.05, 0.1, 0.25].
dist_method (str, optional) – Distance method to use. The value is passed to scipy.spatial.distance.pdist, by default ‘euclidean’.
dist_p (int, optional) – The p-norm to apply for Minkowski. This is only considered when dist_method = ‘minkowski’, by default 2.
minimize (bool, optional) – Indicator whether the objective function should be minimized or maximized, by default True.
- Returns
Dictionary consisting of the calculated features.
- Return type
Dict[str, Union[int, float]]
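To illustrate what disp_quantiles and minimize control, here is a rough, standard-library-only sketch of the dispersion idea: the mean pairwise distance among the best fraction of the sample is compared with the mean pairwise distance of the whole sample. The function names and the rounding rule for the subset size are assumptions for illustration. This is not pflacco's code, which delegates the distance computation to scipy.spatial.distance.pdist.

```python
import math
from itertools import combinations
from typing import List, Sequence


def mean_pairwise_distance(points: Sequence[Sequence[float]]) -> float:
    """Average Euclidean distance over all unordered pairs of points."""
    pairs = list(combinations(points, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def dispersion_ratio(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    quantile: float,
    minimize: bool = True,
) -> float:
    """Ratio of the dispersion of the best points to the dispersion of all points."""
    # Best first: ascending objective values when minimizing, descending otherwise.
    order = sorted(range(len(y)), key=lambda i: y[i], reverse=not minimize)
    # Assumption: round the subset size up and keep at least two points.
    n_best = max(2, math.ceil(quantile * len(y)))
    best = [X[i] for i in order[:n_best]]
    return mean_pairwise_distance(best) / mean_pairwise_distance(X)


# Hypothetical sample: a 5 x 5 grid evaluated on the sphere function.
X: List[List[float]] = [[float(a), float(b)] for a in range(-2, 3) for b in range(-2, 3)]
y = [x1 ** 2 + x2 ** 2 for x1, x2 in X]
ratio = dispersion_ratio(X, y, quantile=0.25)
```

For a unimodal function such as the sphere, the best points cluster around the optimum, so the ratio is below one.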
- pflacco.classical_ela_features.calculate_ela_conv(X: Union[DataFrame, ndarray, List[List[float]]], y: Union[Series, ndarray, List[float]], f: Callable[[List[float]], float], ela_conv_nsample: int = 1000, ela_conv_threshold: float = 1e-10) → Dict[str, Union[int, float]]
Calculation of ELA Convexity features, similar to the R-package flacco.
- Parameters
X (Union[pd.DataFrame, np.ndarray, List[List[float]]]) – A collection-like object which contains a sample of the decision space. Can be created with sampling.create_initial_sample.
y (Union[pd.Series, np.ndarray, List[float]]) – A list-like object which contains the respective objective values of X.
f (Callable[[List[float]], float]) – Objective function to be optimized.
ela_conv_nsample (int, optional) – Number of samples that are drawn for calculating the convexity features, by default 1000.
ela_conv_threshold (float, optional) – Threshold for linearity, i.e., the maximum deviation from perfect linearity that is still considered linear, by default 1e-10.
- Returns
Dictionary consisting of the calculated features.
- Return type
Dict[str, Union[int, float]]
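The roles of f, ela_conv_nsample and ela_conv_threshold can be sketched as follows. The sketch assumes the usual convexity test: draw two sample points, evaluate f at a random convex combination of them, and compare the result with the same combination of their objective values. The function name and the returned pair are hypothetical, and pflacco's actual feature set may differ.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple


def convexity_probabilities(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    f: Callable[[List[float]], float],
    nsample: int = 1000,
    threshold: float = 1e-10,
    seed: Optional[int] = None,
) -> Tuple[float, float]:
    """Estimate the share of convex and of linear point combinations."""
    rng = random.Random(seed)
    n_convex = 0
    n_linear = 0
    for _ in range(nsample):
        i, j = rng.sample(range(len(X)), 2)
        w = rng.random()
        x_new = [w * a + (1 - w) * b for a, b in zip(X[i], X[j])]
        # Negative delta: the function lies below the connecting line (convex).
        delta = f(x_new) - (w * y[i] + (1 - w) * y[j])
        if delta < -threshold:
            n_convex += 1
        elif abs(delta) <= threshold:
            n_linear += 1
    return n_convex / nsample, n_linear / nsample


def sphere(x: List[float]) -> float:
    return sum(v * v for v in x)


def plane(x: List[float]) -> float:
    return sum(x)


rng = random.Random(1)
X = [[rng.uniform(-5, 5), rng.uniform(-5, 5)] for _ in range(50)]
conv_sphere, lin_sphere = convexity_probabilities(X, [sphere(x) for x in X], sphere, seed=1)
conv_plane, lin_plane = convexity_probabilities(X, [plane(x) for x in X], plane, seed=1)
```

Because f is evaluated at new points, this feature set requires additional function evaluations beyond the initial sample.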
- pflacco.classical_ela_features.calculate_ela_curvate(X: Union[DataFrame, ndarray, List[List[float]]], y: Union[Series, ndarray, List[float]], f: Callable[[List[float]], float], dim: int, lower_bound: Union[List[float], float], upper_bound: Union[List[float], float], sample_size_factor: int = 100, delta: float = 0.0001, eps: float = 0.0001, zero_tol: float = 6.378748342528005e-156, r: int = 4, v: int = 2, seed: Optional[int] = None) → Dict[str, Union[int, float]]
Calculation of ELA Curvature features, similar to the R-package flacco.
- Parameters
X (Union[pd.DataFrame, np.ndarray, List[List[float]]]) – A collection-like object which contains a sample of the decision space. Can be created with sampling.create_initial_sample.
y (Union[pd.Series, np.ndarray, List[float]]) – A list-like object which contains the respective objective values of X.
f (Callable[[List[float]], float]) – Objective function to be optimized.
dim (int) – Dimensionality of the decision space.
lower_bound (Union[List[float], float]) – Lower bound of variables of the decision space.
upper_bound (Union[List[float], float]) – Upper bound of variables of the decision space.
sample_size_factor (int, optional) – Factor which determines the sample size by sample_size_factor * dim, by default 100.
delta (float, optional) – Parameter used to approximate the gradient and Hessian. See grad and hessian of the R-package numDeriv for more details, by default 10**-4.
eps (float, optional) – Parameter used to approximate the gradient and Hessian. See grad and hessian of the R-package numDeriv for more details, by default 10**-4.
zero_tol (float, optional) – Parameter used to approximate the gradient and Hessian. See grad and hessian of the R-package numDeriv for more details, by default np.sqrt(np.nextafter(0, 1)/70**-7).
r (int, optional) – Parameter used to approximate the gradient and Hessian. See grad and hessian of the R-package numDeriv for more details, by default 4.
v (int, optional) – Parameter used to approximate the gradient and Hessian. See grad and hessian of the R-package numDeriv for more details, by default 2.
seed (Optional[int], optional) – Seed for reproducibility, by default None.
- Returns
Dictionary consisting of the calculated features.
- Return type
Dict[str, Union[int, float]]
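The parameters delta, eps, zero_tol, r and v are described only by reference to numDeriv. As a rough one-dimensional illustration, the sketch below assumes a Richardson extrapolation scheme of the kind numDeriv documents: r central-difference estimates are computed with a step that shrinks by the factor v, and the estimates are then combined to cancel the leading error terms. The function name is hypothetical and this is not pflacco's implementation.

```python
import math
from typing import Callable


def richardson_derivative(
    f: Callable[[float], float],
    x: float,
    delta: float = 1e-4,
    eps: float = 1e-4,
    zero_tol: float = 6.378748342528005e-156,
    r: int = 4,
    v: int = 2,
) -> float:
    """Approximate f'(x) with central differences and Richardson extrapolation."""
    # Initial step: relative to x, with an absolute fallback when x is (almost) zero.
    h = abs(delta * x) + (eps if abs(x) < zero_tol else 0.0)
    estimates = []
    for _ in range(r):
        estimates.append((f(x + h) - f(x - h)) / (2 * h))
        h /= v  # shrink the step for the next estimate
    # Each pass removes the next even-order error term of the central difference.
    for m in range(1, r):
        factor = (v * v) ** m
        estimates = [
            (factor * estimates[k + 1] - estimates[k]) / (factor - 1)
            for k in range(len(estimates) - 1)
        ]
    return estimates[0]


d_cubic = richardson_derivative(lambda t: t ** 3, 2.0)   # exact value: 12
d_sin_at_zero = richardson_derivative(math.sin, 0.0)     # exact value: 1
```

The second call shows why zero_tol and eps exist: at x = 0 the relative step delta * x vanishes, so the absolute step eps is used instead.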
- pflacco.classical_ela_features.calculate_ela_distribution(X: Union[DataFrame, ndarray, List[List[float]]], y: Union[Series, ndarray, List[float]], ela_distr_skewness_type: int = 3, ela_distr_kurtosis_type: int = 3) → Dict[str, Union[int, float]]
Calculation of ELA Distribution features, similar to the R-package flacco.
- Parameters
X (Union[pd.DataFrame, np.ndarray, List[List[float]]]) – A collection-like object which contains a sample of the decision space. Can be created with sampling.create_initial_sample.
y (Union[pd.Series, np.ndarray, List[float]]) – A list-like object which contains the respective objective values of X.
ela_distr_skewness_type (int, optional) – Integer indicating which algorithm to use, by default 3.
ela_distr_kurtosis_type (int, optional) – Integer indicating which algorithm to use, by default 3.
- Returns
Dictionary consisting of the calculated features.
- Return type
Dict[str, Union[int, float]]
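The description of ela_distr_skewness_type does not say what the integer selects. The sketch below assumes that the types follow the convention of the R package e1071 (types 1, 2 and 3), which is an assumption about pflacco rather than something stated above. It shows how the three variants differ only by a sample-size correction.

```python
import math
from typing import Sequence


def skewness(values: Sequence[float], skew_type: int = 3) -> float:
    """Sample skewness under the three e1071-style conventions (assumed)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5
    if skew_type == 1:
        return g1
    if skew_type == 2:
        return g1 * math.sqrt(n * (n - 1)) / (n - 2)
    if skew_type == 3:
        return g1 * ((n - 1) / n) ** 1.5
    raise ValueError("skew_type must be 1, 2 or 3")


symmetric = skewness([1.0, 2.0, 3.0, 4.0, 5.0])
right_skewed = [skewness([1.0, 1.0, 1.0, 10.0], t) for t in (1, 2, 3)]
```

For large samples the three variants converge, so the choice mainly matters for small initial designs.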
- pflacco.classical_ela_features.calculate_ela_level(X: Union[DataFrame, ndarray, List[List[float]]], y: Union[Series, ndarray, List[float]], ela_level_quantiles: List[float] = [0.1, 0.25, 0.5], interface_mda_from_R: bool = False, ela_level_resample_iterations: int = 10) → Dict[str, Union[int, float]]
Calculation of ELA Levelset features, similar to the R-package flacco.
- Parameters
X (Union[pd.DataFrame, np.ndarray, List[List[float]]]) – A collection-like object which contains a sample of the decision space. Can be created with sampling.create_initial_sample.
y (Union[pd.Series, np.ndarray, List[float]]) – A list-like object which contains the respective objective values of X.
ela_level_quantiles (List[float], optional) – Cutpoints (quantiles of the objective values) for splitting the objective space, by default [0.1, 0.25, 0.5].
interface_mda_from_R (bool, optional) – Indicator whether to interface missing functionality from R, by default False.
ela_level_resample_iterations (int, optional) – Number of iterations of the resampling method, by default 10.
- Returns
Dictionary consisting of the calculated features.
- Return type
Dict[str, Union[int, float]]
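The following sketch shows how ela_level_quantiles can be read as cutpoints that split the objective values into two classes per quantile. It uses linear interpolation between order statistics, which is an assumption; pflacco may compute quantiles differently. The helper names are hypothetical, and the classification step that would be trained on these labels is omitted.

```python
from typing import Dict, List, Sequence


def quantile(values: Sequence[float], q: float) -> float:
    """Quantile with linear interpolation between neighbouring order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def level_set_labels(y: Sequence[float], quantiles: Sequence[float]) -> Dict[float, List[bool]]:
    """For each quantile, flag the observations that lie below the cutpoint."""
    labels = {}
    for q in quantiles:
        cut = quantile(y, q)
        labels[q] = [value < cut for value in y]
    return labels


# Hypothetical objective values 1.0, 2.0, ..., 11.0.
y = [float(v) for v in range(1, 12)]
labels = level_set_labels(y, [0.1, 0.25, 0.5])
low_counts = {q: sum(flags) for q, flags in labels.items()}
```

Small quantiles produce strongly imbalanced classes, which is worth keeping in mind when the sample is small.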
- pflacco.classical_ela_features.calculate_ela_local(X: Union[DataFrame, ndarray, List[List[float]]], y: Union[Series, ndarray, List[float]], f: Callable[[List[float]], float], dim: int, lower_bound: Union[List[float], float], upper_bound: Union[List[float], float], minimize: bool = True, ela_local_local_searches_factor: int = 50, ela_local_optim_method: str = 'L-BFGS-B', ela_local_clust_method: str = 'single', seed: Optional[int] = None, **minimizer_kwargs) → Dict[str, Union[int, float]]
Calculation of ELA Local Search features, similar to the R-package flacco.
- Parameters
X (Union[pd.DataFrame, np.ndarray, List[List[float]]]) – A collection-like object which contains a sample of the decision space. Can be created with sampling.create_initial_sample.
y (Union[pd.Series, np.ndarray, List[float]]) – A list-like object which contains the respective objective values of X.
f (Callable[[List[float]], float]) – Objective function to be optimized.
dim (int) – Dimensionality of the decision space.
lower_bound (Union[List[float], float]) – Lower bound of variables of the decision space.
upper_bound (Union[List[float], float]) – Upper bound of variables of the decision space.
minimize (bool, optional) – Indicator whether the objective function should be minimized or maximized, by default True.
ela_local_local_searches_factor (int, optional) – Factor which determines the number of local searches by ela_local_local_searches_factor * dim, by default 50.
ela_local_optim_method (str, optional) – Type of solver. Any method supported by scipy.optimize.minimize can be used, by default ‘L-BFGS-B’.
ela_local_clust_method (str, optional) – Hierarchical clustering method to use, by default ‘single’.
seed (Optional[int], optional) – Seed for reproducibility, by default None.
- Returns
Dictionary consisting of the calculated features.
- Return type
Dict[str, Union[int, float]]
- pflacco.classical_ela_features.calculate_ela_meta(X: Union[DataFrame, ndarray, List[List[float]]], y: Union[Series, ndarray, List[float]]) → Dict[str, Union[int, float]]
Calculation of ELA Meta Model features, similar to the R-package flacco.
- Parameters
X (Union[pd.DataFrame, np.ndarray, List[List[float]]]) – A collection-like object which contains a sample of the decision space. Can be created with sampling.create_initial_sample.
y (Union[pd.Series, np.ndarray, List[float]]) – A list-like object which contains the respective objective values of X.
- Returns
Dictionary consisting of the calculated features.
- Return type
Dict[str, Union[int, float]]
- pflacco.classical_ela_features.calculate_information_content(X: Union[DataFrame, ndarray, List[List[float]]], y: Union[Series, ndarray, List[float]], ic_sorting: str = 'nn', ic_nn_neighborhood: int = 20, ic_nn_start: Optional[int] = None, ic_epsilon: List[float] = array([0.00000000e+00, 1.00000000e-05, 1.04717682e-05, ..., 9.11926760e+14, 9.54948564e+14, 1.00000000e+15]), ic_settling_sensitivity: float = 0.05, ic_info_sensitivity: float = 0.5, seed: Optional[int] = None) → Dict[str, Union[int, float]]
Calculation of Information Content features, similar to the R-package flacco.
- Parameters
X (Union[pd.DataFrame, np.ndarray, List[List[float]]]) – A collection-like object which contains a sample of the decision space. Can be created with sampling.create_initial_sample.
y (Union[pd.Series, np.ndarray, List[float]]) – A list-like object which contains the respective objective values of X.
ic_sorting (str, optional) – Sorting strategy, which is used to define the tour through the landscape. Possible values are ‘nn’ and ‘random’, by default ‘nn’.
ic_nn_neighborhood (int, optional) – Number of neighbours to be considered in the computation, by default 20.
ic_nn_start (Optional[int], optional) – Index of the observation which should be used as the starting point. When none is supplied, it is chosen randomly, by default None.
ic_epsilon (List[float], optional) – Epsilon values as described in section V.A of [1], by default np.insert(10 ** np.linspace(start = -5, stop = 15, num = 1000), 0, 0).
ic_settling_sensitivity (float, optional) – Threshold, which should be used for computing the settling sensitivity of [1], by default 0.05.
ic_info_sensitivity (float, optional) – Portion of partial information sensitivity of [1], by default 0.5.
seed (Optional[int], optional) – Seed for reproducibility, by default None.
- Returns
Dictionary consisting of the calculated features.
- Return type
Dict[str, Union[int, float]]
References
- [1] Muñoz, M.A., Kirley, M. and Halgamuge, S.K., 2014. Exploratory landscape analysis of continuous space optimization problems using information content. IEEE Transactions on Evolutionary Computation, 19(1), pp.74-87.
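Two aspects of the information content parameters can be made concrete with a short sketch. First, the documented default of ic_epsilon, np.insert(10 ** np.linspace(start = -5, stop = 15, num = 1000), 0, 0), is rebuilt here with the standard library only. Second, the way a single epsilon turns a tour through the sample into a sequence of symbols is shown, assuming the encoding described in [1]: the slope between consecutive tour points is mapped to -1, 0 or 1 depending on whether it is below -epsilon, within the band, or above epsilon. The function names are hypothetical and this is not pflacco's code.

```python
import math
from typing import List, Sequence


def default_epsilons(num: int = 1000, start: float = -5.0, stop: float = 15.0) -> List[float]:
    """Zero followed by num logarithmically spaced values from 10**start to 10**stop."""
    step = (stop - start) / (num - 1)
    return [0.0] + [10 ** (start + i * step) for i in range(num)]


def symbol_sequence(y_tour: Sequence[float], x_dist: Sequence[float], eps: float) -> List[int]:
    """Encode the slopes between consecutive tour points as -1, 0 or 1."""
    symbols = []
    for i in range(len(y_tour) - 1):
        slope = (y_tour[i + 1] - y_tour[i]) / x_dist[i]
        if slope < -eps:
            symbols.append(-1)
        elif slope > eps:
            symbols.append(1)
        else:
            symbols.append(0)
    return symbols


epsilons = default_epsilons()

# Hypothetical tour: objective values in visiting order and the distances between them.
symbols = symbol_sequence([0.0, 1.0, 1.001, 0.5], x_dist=[1.0, 1.0, 1.0], eps=0.01)
```

A larger epsilon flattens more slopes to 0, which is why the features are evaluated over a whole range of epsilon values rather than a single one.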
- pflacco.classical_ela_features.calculate_limo(X: Union[DataFrame, ndarray, List[List[float]]], y: Union[Series, ndarray, List[float]], lower_bound: Union[List[float], float], upper_bound: Union[List[float], float], blocks: Optional[Union[List[int], ndarray, int]] = None) → Dict[str, Optional[Union[int, float]]]
Calculation of Linear Model features, similar to the R-package flacco.
- Parameters
X (Union[pd.DataFrame, np.ndarray, List[List[float]]]) – A collection-like object which contains a sample of the decision space. Can be created with sampling.create_initial_sample.
y (Union[pd.Series, np.ndarray, List[float]]) – A list-like object which contains the respective objective values of X.
lower_bound (Union[List[float], float]) – Lower bound of variables of the decision space.
upper_bound (Union[List[float], float]) – Upper bound of variables of the decision space.
blocks (Optional[Union[List[int], np.ndarray, int]], optional) – Number of blocks per dimension, by default None.
- Returns
Dictionary consisting of the calculated features.
- Return type
Dict[str, Optional[Union[int, float]]]
- pflacco.classical_ela_features.calculate_nbc(X: Union[DataFrame, ndarray, List[List[float]]], y: Union[Series, ndarray, List[float]], fast_k: float = 0.05, dist_tie_breaker: str = 'sample', minimize: bool = True) → Dict[str, Union[int, float]]
Calculation of Nearest Better Clustering features, similar to the R-package flacco.
- Parameters
X (Union[pd.DataFrame, np.ndarray, List[List[float]]]) – A collection-like object which contains a sample of the decision space. Can be created with sampling.create_initial_sample.
y (Union[pd.Series, np.ndarray, List[float]]) – A list-like object which contains the respective objective values of X.
fast_k (float, optional) – Controls the percentage of observations that should be considered when looking for the nearest better neighbour, by default 0.05.
dist_tie_breaker (str, optional) – Strategy to break ties between observations. Currently allows sample, by default ‘sample’.
minimize (bool, optional) – Indicator whether the objective function should be minimized or maximized, by default True.
- Returns
Dictionary consisting of the calculated features.
- Return type
Dict[str, Union[int, float]]
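The notion of a nearest better neighbour, which fast_k and minimize refer to, can be sketched with a brute-force search: for every observation, find the closest observation with a strictly better objective value. The sketch ignores fast_k and dist_tie_breaker, uses a hypothetical function name, and is not pflacco's implementation.

```python
import math
from typing import List, Optional, Sequence, Tuple


def nearest_better(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    minimize: bool = True,
) -> List[Tuple[Optional[int], Optional[float]]]:
    """Return (index, distance) of the nearest better neighbour of every point."""
    result: List[Tuple[Optional[int], Optional[float]]] = []
    for i in range(len(X)):
        if minimize:
            better = [j for j in range(len(X)) if y[j] < y[i]]
        else:
            better = [j for j in range(len(X)) if y[j] > y[i]]
        if not better:
            # The best observation has no better neighbour.
            result.append((None, None))
            continue
        j_best = min(better, key=lambda j: math.dist(X[i], X[j]))
        result.append((j_best, math.dist(X[i], X[j_best])))
    return result


# Hypothetical 1-D sample.
X = [[0.0], [1.0], [3.0], [4.0]]
y = [0.0, 2.0, 1.0, 3.0]
nb = nearest_better(X, y)
```

The brute-force search is quadratic in the sample size, which is the cost that a restriction such as fast_k is meant to reduce.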
- pflacco.classical_ela_features.calculate_pca(X: Union[DataFrame, ndarray, List[List[float]]], y: Union[Series, ndarray, List[float]], prop_cov_x: float = 0.9, prop_cor_x: float = 0.9, prop_cov_init: float = 0.9, prop_cor_init: float = 0.9) → Dict[str, Union[int, float]]
Calculation of Principal Component features, similar to the R-package flacco.
- Parameters
X (Union[pd.DataFrame, np.ndarray, List[List[float]]]) – A collection-like object which contains a sample of the decision space. Can be created with sampling.create_initial_sample.
y (Union[pd.Series, np.ndarray, List[float]]) – A list-like object which contains the respective objective values of X.
prop_cov_x (float, optional) – Proportion of the explained variance by the first PC based on the covariance matrix, by default 0.9.
prop_cor_x (float, optional) – Proportion of the explained variance by the first PC based on the correlation matrix, by default 0.9.
prop_cov_init (float, optional) – Proportion of the explained variance by the first PC based on the covariance matrix, by default 0.9.
prop_cor_init (float, optional) – Proportion of the explained variance by the first PC based on the correlation matrix, by default 0.9.
- Returns
Dictionary consisting of the calculated features.
- Return type
Dict[str, Union[int, float]]