skbonus.meta package¶
Module contents¶
Meta estimators.
class skbonus.meta.ExplainableBoostingMetaRegressor(base_regressor: Optional[Any] = None, max_rounds: int = 5000, learning_rate: float = 0.01, grid_points: int = 1000)¶
Bases: sklearn.base.BaseEstimator, sklearn.base.RegressorMixin
A meta regressor that outputs a transparent, explainable model given blackbox models.
It works exactly like the ExplainableBoostingRegressor by the interpretml team, but here you can choose any base regressor instead of being restricted to trees. For example, you can use scikit-learn’s IsotonicRegression to create a model that is monotonically increasing or decreasing in some of the features, while still being explainable and well-performing.
See the notes below to find a nice explanation of how the algorithm works at a high level.
- Parameters
  base_regressor (Any, default=DecisionTreeRegressor(max_depth=4)) – A single scikit-learn compatible regressor, or a list of such regressors of length n_features.
  max_rounds (int, default=5000) – Conduct the boosting for this many rounds.
  learning_rate (float, default=0.01) – The learning rate. Should be quite small.
  grid_points (int, default=1000) – The more grid points, the more detailed the explanations get and the better the model performs, but the slower the algorithm becomes.
Examples
>>> import numpy as np
>>> from sklearn.isotonic import IsotonicRegression
>>> np.random.seed(0)
>>> X = np.random.randn(100, 2)
>>> y = 2 * X[:, 0] - 3 * X[:, 1] + np.random.randn(100)
>>> e = ExplainableBoostingMetaRegressor(
...     base_regressor=[IsotonicRegression(), IsotonicRegression(increasing=False)],
...     grid_points=20
... ).fit(X, y)
>>> e.score(X, y)
0.9377382292348461
>>> e.outputs_[0]  # increasing in the first feature, as it should be
array([-4.47984456, -4.47984456, -4.47984456, -4.47984456, -3.00182713,
       -2.96627696, -1.60843287, -1.06601264, -0.92013822, -0.7217753 ,
       -0.66440783,  0.28132994,  1.33664486,  1.47592253,  1.96677286,
        2.88969439,  2.96292906,  4.33642573,  4.38506967,  6.42967225])
>>> e.outputs_[1]  # decreasing in the second feature, as it should be
array([ 6.35605214,  6.06407947,  6.05458114,  4.8488004 ,  4.41880876,
        3.45056373,  2.64560385,  1.6138303 ,  0.89860987,  0.458301  ,
        0.33455608, -0.43609495, -1.55600464, -2.05142528, -2.42791679,
       -3.58961475, -4.80134218, -4.94421252, -5.94858712, -6.36828774])
Notes
Check out the original authors’ GitHub repository at https://github.com/interpretml/interpret and the talk at https://www.youtube.com/watch?v=MREiHgHgl0k for a great introduction to how the algorithm works.
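At a high level, the algorithm boosts one feature at a time: it cycles through the features, fits the base regressor to the current residuals using only that single feature, and adds a small fraction (the learning rate) of its prediction to the running model. The sketch below illustrates this idea under those assumptions; the function name `cyclic_boosting_sketch` and all defaults are hypothetical and not part of skbonus.

```python
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeRegressor


def cyclic_boosting_sketch(X, y, base=None, rounds=50, lr=0.1):
    """Illustrative cyclic boosting over single features (not the library's code)."""
    if base is None:
        base = DecisionTreeRegressor(max_depth=4)
    n_samples, n_features = X.shape
    prediction = np.full(n_samples, y.mean())   # start from the global mean
    learners = [[] for _ in range(n_features)]  # one list of learners per feature
    for _ in range(rounds):
        for j in range(n_features):             # cycle through the features
            residual = y - prediction
            h = clone(base).fit(X[:, [j]], residual)       # fit residual on feature j only
            prediction += lr * h.predict(X[:, [j]])        # take a small step
            learners[j].append(h)
    return learners, prediction


np.random.seed(0)
X = np.random.randn(200, 2)
y = 2 * X[:, 0] - 3 * X[:, 1]
learners, pred = cyclic_boosting_sketch(X, y)
print(np.mean((y - pred) ** 2))  # training error shrinks with more rounds
```

Because each learner sees only one feature, summing a feature's learners yields a one-dimensional function of that feature — which is exactly what makes the final model explainable.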
fit(X: numpy.ndarray, y: numpy.ndarray, sample_weight: Optional[numpy.ndarray] = None) → skbonus.meta._explainable_regressor.ExplainableBoostingMetaRegressor¶
Fit the model.
- Parameters
  X (np.ndarray of shape (n_samples, n_features)) – The training data.
  y (np.ndarray, 1-dimensional) – The target values.
  sample_weight (Optional[np.ndarray], default=None) – Individual weights for each sample.
- Returns
  Fitted regressor.
- Return type
  ExplainableBoostingMetaRegressor
predict(X: numpy.ndarray) → numpy.ndarray¶
Get predictions.
- Parameters
  X (np.ndarray of shape (n_samples, n_features)) – Samples to get predictions for.
- Returns
  y – The predicted values.
- Return type
  np.ndarray of shape (n_samples,)
class skbonus.meta.ZeroInflatedRegressor(classifier: Any, regressor: Any)¶
Bases: sklearn.base.BaseEstimator, sklearn.base.RegressorMixin
A meta regressor for zero-inflated datasets, i.e. the targets contain a lot of zeroes.
ZeroInflatedRegressor consists of a classifier and a regressor.
The classifier’s task is to find out whether the target is zero or not.
The regressor’s task is to output a (usually positive) prediction whenever the classifier indicates that the prediction should be non-zero.
The regressor is only trained on examples where the target is non-zero, which makes it easier for it to focus.
At prediction time, the classifier is first asked whether the output should be zero. If yes, output zero. Otherwise, ask the regressor for its prediction and output that.
- Parameters
  classifier (Any, scikit-learn classifier) – A classifier that answers the question “Should the output be zero?”.
  regressor (Any, scikit-learn regressor) – A regressor for predicting the target. Its prediction is only used if the classifier says that the output is non-zero.
Examples
>>> import numpy as np
>>> from sklearn.ensemble import ExtraTreesClassifier, ExtraTreesRegressor
>>> np.random.seed(0)
>>> X = np.random.randn(10000, 4)
>>> y = ((X[:, 0] > 0) & (X[:, 1] > 0)) * np.abs(X[:, 2] * X[:, 3]**2)
>>> z = ZeroInflatedRegressor(
...     classifier=ExtraTreesClassifier(random_state=0),
...     regressor=ExtraTreesRegressor(random_state=0)
... )
>>> z.fit(X, y)
ZeroInflatedRegressor(classifier=ExtraTreesClassifier(random_state=0),
                      regressor=ExtraTreesRegressor(random_state=0))
>>> z.predict(X)[:5]
array([4.91483294, 0.        , 0.        , 0.04941909, 0.        ])
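The fit-and-gate logic described above can be sketched directly with plain scikit-learn estimators: train the classifier on the binarized target, train the regressor only on the non-zero rows, and let the classifier gate the regressor at prediction time. This is an illustrative re-implementation under those assumptions, not skbonus’s actual code.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, ExtraTreesRegressor

np.random.seed(0)
X = np.random.randn(1000, 4)
y = ((X[:, 0] > 0) & (X[:, 1] > 0)) * np.abs(X[:, 2] * X[:, 3] ** 2)

# Classifier answers "is the target non-zero?" on the binarized target.
clf = ExtraTreesClassifier(random_state=0).fit(X, y != 0)

# Regressor is trained only on the rows with a non-zero target.
nonzero = y != 0
reg = ExtraTreesRegressor(random_state=0).fit(X[nonzero], y[nonzero])

# Prediction: ask the classifier first; query the regressor only where it says non-zero.
pred = np.zeros(len(X))
mask = clf.predict(X).astype(bool)
pred[mask] = reg.predict(X[mask])
print(pred[:5])
```

The gating guarantees exact zeros wherever the classifier predicts zero, which a single regressor trained on the full data typically cannot produce.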
fit(X: numpy.ndarray, y: numpy.ndarray) → skbonus.meta._zero_inflated_regressor.ZeroInflatedRegressor¶
Fit the model.
- Parameters
  X (np.ndarray of shape (n_samples, n_features)) – The training data.
  y (np.ndarray, 1-dimensional) – The target values.
- Returns
  Fitted regressor.
- Return type
  ZeroInflatedRegressor
predict(X: numpy.ndarray) → numpy.ndarray¶
Get predictions.
- Parameters
  X (np.ndarray of shape (n_samples, n_features)) – Samples to get predictions for.
- Returns
  y – The predicted values.
- Return type
  np.ndarray of shape (n_samples,)