nyaggle.ensemble
- nyaggle.ensemble.averaging(test_predictions, oof_predictions=None, y=None, weights=None, eval_func=None, rank_averaging=False)
Perform averaging on model predictions.
- Parameters
test_predictions (List[ndarray]) – List of predicted values on test data.
oof_predictions (Optional[List[ndarray]]) – List of predicted values on out-of-fold training data.
y (Optional[Series]) – Target values.
weights (Optional[List[float]]) – Weights for each set of predictions.
eval_func (Optional[Callable]) – Evaluation metric used to calculate the resulting score. Used only if oof_predictions and y are given.
rank_averaging (bool) – If True, predictions are converted to ranks before averaging.
- Return type
EnsembleResult
- Returns
Namedtuple with the following members:
- test_prediction:
numpy array, average prediction on test data.
- oof_prediction:
numpy array, average prediction on out-of-fold validation data. None if oof_predictions=None.
- score:
float, score calculated on out-of-fold data. None if eval_func is None.
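As a rough illustration of what weighted averaging and rank averaging compute, the sketch below reproduces the idea with plain NumPy instead of calling nyaggle. The helpers weighted_average and to_rank are hypothetical names introduced here, and dividing by the sum of the weights (which np.average does) is an assumption; the library may treat weights differently.

```python
import numpy as np


def weighted_average(predictions, weights=None):
    """Average a list of equally shaped prediction arrays (hypothetical helper)."""
    stacked = np.stack([np.asarray(p, dtype=float) for p in predictions])
    # np.average divides by the sum of the weights; weights=None is a plain mean.
    return np.average(stacked, axis=0, weights=weights)


def to_rank(prediction):
    """Replace each value by its 0-based rank (hypothetical helper)."""
    order = np.argsort(prediction, kind="stable")
    return np.argsort(order, kind="stable").astype(float)


# Toy test-set predictions from two models.
model_a = np.array([0.1, 0.4, 0.35, 0.8])
model_b = np.array([0.2, 0.3, 0.50, 0.6])

plain = weighted_average([model_a, model_b])
weighted = weighted_average([model_a, model_b], weights=[0.75, 0.25])

# Rank averaging: only the ordering within each model matters, not the scale.
ranked = weighted_average([to_rank(model_a), to_rank(model_b)])
print(ranked)  # [0.  1.5 1.5 3. ]
```

Rank averaging is mainly useful when the metric depends only on ordering (AUC, for example) and the models produce scores on different scales.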
- nyaggle.ensemble.averaging_opt(test_predictions, oof_predictions, y, eval_func, higher_is_better, weight_bounds=(0.0, 1.0), rank_averaging=False, method=None)
Perform averaging with optimal weights using scipy.optimize.
- Parameters
test_predictions (List[ndarray]) – List of predicted values on test data.
oof_predictions (Optional[List[ndarray]]) – List of predicted values on out-of-fold training data.
y (Optional[Series]) – Target values.
eval_func (Optional[Callable[[ndarray, ndarray], float]]) – Evaluation metric f(y_true, y_pred) used to calculate the resulting score. Used only if oof_predictions and y are given.
higher_is_better (bool) – Determines the direction in which eval_func is optimized.
weight_bounds (Tuple[float, float]) – Lower and upper bounds applied to each weight.
rank_averaging (bool) – If True, predictions are converted to ranks before averaging.
method (Optional[str]) – Type of solver. If None, SLSQP is used.
- Return type
EnsembleResult
- Returns
Namedtuple with the following members:
- test_prediction:
numpy array, average prediction on test data.
- oof_prediction:
numpy array, average prediction on out-of-fold validation data. None if oof_predictions=None.
- score:
float, score calculated on out-of-fold data. None if eval_func is None.
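To make the weight search concrete, here is a minimal sketch of optimizing blend weights with scipy.optimize.minimize and the SLSQP solver. It is not nyaggle's implementation: the function optimize_weights, the equal-weight starting point, and the sum-to-one constraint are assumptions chosen for illustration, and the data is synthetic.

```python
import numpy as np
from scipy.optimize import minimize


def optimize_weights(oof_predictions, y, eval_func, higher_is_better,
                     weight_bounds=(0.0, 1.0), method="SLSQP"):
    """Search blend weights that optimize eval_func on out-of-fold predictions."""
    oof = np.stack(oof_predictions)  # shape: (n_models, n_samples)
    n_models = oof.shape[0]

    def objective(w):
        score = eval_func(y, w @ oof)
        # minimize() always minimizes, so negate "higher is better" metrics.
        return -score if higher_is_better else score

    result = minimize(
        objective,
        x0=np.full(n_models, 1.0 / n_models),  # assumed start: equal weights
        method=method,
        bounds=[weight_bounds] * n_models,
        # Assumed constraint: the weights sum to one.
        constraints={"type": "eq", "fun": lambda w: np.sum(w) - 1.0},
    )
    return result.x


def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))


# Synthetic out-of-fold predictions: one accurate model, one noisy model.
rng = np.random.default_rng(0)
y = rng.normal(size=200)
good = y + rng.normal(scale=0.1, size=200)
bad = y + rng.normal(scale=1.0, size=200)

weights = optimize_weights([good, bad], y, mse, higher_is_better=False)
blend_mse = mse(y, weights @ np.stack([good, bad]))
equal_mse = mse(y, 0.5 * (good + bad))
```

Because the weights are fitted on the same out-of-fold predictions that produce the score, the reported score tends to be slightly optimistic; the effect grows with the number of models being blended.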
- nyaggle.ensemble.stacking(test_predictions, oof_predictions, y, estimator=None, cv=None, groups=None, type_of_target='auto', eval_func=None)
Perform stacking on predictions.
- Parameters
test_predictions (List[ndarray]) – List of predicted values on test data.
oof_predictions (List[ndarray]) – List of predicted values on out-of-fold training data.
y (Series) – Target values.
estimator (Optional[BaseEstimator]) – Estimator used for the 2nd-level model. If None, the default estimator (an auto-tuned linear model) is used.
cv (Union[int, Iterable, BaseCrossValidator, None]) – int, cross-validation generator, or an iterable that determines the cross-validation splitting strategy. Possible values:
- None, to use the default KFold(5, random_state=0, shuffle=True),
- an integer, to specify the number of folds in a (Stratified)KFold,
- a CV splitter (an instance of BaseCrossValidator),
- an iterable yielding (train, test) splits as arrays of indices.
groups (Optional[Series]) – Group labels for the samples. Only used in conjunction with a “Group” cv instance (e.g., GroupKFold).
type_of_target (str) – The type of the target variable. If auto, the type is inferred by sklearn.utils.multiclass.type_of_target. Otherwise, binary, continuous, or multiclass are supported.
eval_func (Optional[Callable]) – Evaluation metric used to calculate the resulting score. Used only if oof_predictions and y are given.
- Return type
EnsembleResult
- Returns
Namedtuple with the following members:
- test_prediction:
numpy array, average prediction on test data.
- oof_prediction:
numpy array, average prediction on out-of-fold validation data. None if oof_predictions=None.
- score:
float, score calculated on out-of-fold data. None if eval_func is None.
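The general stacking procedure can be sketched with scikit-learn directly: the first-level predictions become feature columns, and a second-level model is cross-validated on them. This is an illustration under stated assumptions, not the library's code. Ridge stands in for the default auto-tuned linear model, the data is synthetic, and only the KFold(5, random_state=0, shuffle=True) splitter is taken from the parameter description above.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_predict

rng = np.random.default_rng(0)
y_train = rng.normal(size=300)
y_test = rng.normal(size=100)

# Hypothetical first-level outputs: two noisy views of the target.
oof_predictions = [y_train + rng.normal(scale=s, size=300) for s in (0.3, 0.6)]
test_predictions = [y_test + rng.normal(scale=s, size=100) for s in (0.3, 0.6)]

# Each first-level model contributes one feature column to the 2nd-level model.
X_train = np.column_stack(oof_predictions)
X_test = np.column_stack(test_predictions)

meta_model = Ridge(alpha=1.0)  # assumed stand-in for the default estimator
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Out-of-fold predictions of the 2nd-level model, one per training row.
stacked_oof = cross_val_predict(meta_model, X_train, y_train, cv=cv)

# Refit on all training rows, then predict the test set.
stacked_test = meta_model.fit(X_train, y_train).predict(X_test)


def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))


stacked_score = mse(y_train, stacked_oof)
```

Scoring the second-level model on its own out-of-fold predictions, rather than on predictions from a model fitted to all rows, keeps the score from being inflated by training-set fit.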