- Python Machine Learning Cookbook (Second Edition)
- Giuseppe Ciaburro, Prateek Joshi
- 2021-06-24 15:41:05
How to do it...
Let's see how to implement a stacking method:
- We start by importing the libraries:
from heamy.dataset import Dataset
from heamy.estimator import Regressor
from heamy.pipeline import ModelsPipeline
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
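- Load the Boston dataset, already used in Chapter 1, The Realm of Supervised Learning, in the Estimating housing prices recipe (note that load_boston was deprecated in scikit-learn 1.0 and removed in 1.2, so this call requires an older release):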
data = load_boston()
- Separate the features from the target, then hold out 10% of the samples as a test set:
X, y = data['data'], data['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=2)
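To make the effect of test_size=0.1 concrete, here is a minimal sketch of a shuffled hold-out split in plain Python. The helper shuffle_split is hypothetical and written only for illustration; it rounds the test share up, which we believe matches how train_test_split sizes the test set, but its shuffle order will not reproduce the one scikit-learn draws for random_state=2.

```python
import math
import random

def shuffle_split(n_samples, test_size, seed):
    """Return (train_indices, test_indices) for a shuffled hold-out split.

    Hypothetical helper: an illustration, not scikit-learn's implementation.
    """
    n_test = math.ceil(test_size * n_samples)  # test share is rounded up
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # seeded, so the split is repeatable
    return indices[n_test:], indices[:n_test]

# The Boston dataset has 506 samples.
train_idx, test_idx = shuffle_split(n_samples=506, test_size=0.1, seed=2)
print(len(train_idx), len(test_idx))  # 455 51
```

Every sample lands in exactly one of the two index lists, so nothing from the test set leaks into training.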
- Let's wrap the training and test data in a heamy Dataset object:
Data = Dataset(X_train, y_train, X_test)
- Now we can build the two models that we will use in the stacking procedure:
RfModel = Regressor(dataset=Data, estimator=RandomForestRegressor, parameters={'n_estimators': 50}, name='rf')
LRModel = Regressor(dataset=Data, estimator=LinearRegression, parameters={'normalize': True}, name='lr')
- It's time to stack these models:
Pipeline = ModelsPipeline(RfModel, LRModel)
StackModel = Pipeline.stack(k=10, seed=2)
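The stacking step can be pictured as building one new feature column per base model, where each value is a prediction made by a model that never saw that sample during training (an out-of-fold prediction). The following is a minimal sketch of that idea in plain Python, not heamy's actual implementation: the helpers k_fold_indices, fit_mean, fit_line, and stack_features are hypothetical, the two toy base models stand in for the random forest and the linear regression, and the data is synthetic.

```python
import random

def k_fold_indices(n_samples, k, seed):
    """Shuffle the sample indices and deal them into k roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_mean(xs, ys):
    """Toy base model 1: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Toy base model 2: one-feature least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x if var_x else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def stack_features(xs, ys, fitters, k, seed):
    """Return one out-of-fold prediction column per base model."""
    stacked = [[0.0] * len(fitters) for _ in xs]
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        # Train only on the samples outside the current fold.
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        for col, fit in enumerate(fitters):
            model = fit(train_x, train_y)
            for i in fold:
                stacked[i][col] = model(xs[i])  # prediction on unseen sample
    return stacked

# Synthetic data for illustration: y = 2x + 1
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
stacked = stack_features(xs, ys, [fit_mean, fit_line], k=5, seed=2)
```

Each row of stacked now holds the two base-model predictions for one training sample, and this table is what a second-level model is trained on.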
- Now we will train a LinearRegression model on the stacked data:
Stacker = Regressor(dataset=StackModel, estimator=LinearRegression)
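To see what the second-level model learns, consider a simplified case with two columns of base-model predictions. The sketch below fits blending weights by ordinary least squares using the 2x2 normal equations. It is an illustration under stated assumptions: the helper fit_blend_weights is hypothetical, the numbers are made up, and the intercept that LinearRegression fits by default is omitted to keep the algebra short.

```python
def fit_blend_weights(col_a, col_b, targets):
    """Least-squares weights (w_a, w_b) so that w_a*col_a + w_b*col_b fits targets.

    Hypothetical helper: solves the 2x2 normal equations, with no intercept.
    """
    a11 = sum(a * a for a in col_a)
    a12 = sum(a * b for a, b in zip(col_a, col_b))
    a22 = sum(b * b for b in col_b)
    b1 = sum(a * t for a, t in zip(col_a, targets))
    b2 = sum(b * t for b, t in zip(col_b, targets))
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("base predictions are collinear")
    return (b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det

# Made-up base-model predictions and targets built as 0.25*a + 0.75*b.
preds_a = [1.0, 2.0, 3.0, 4.0]
preds_b = [2.0, 1.0, 4.0, 3.0]
targets = [0.25 * a + 0.75 * b for a, b in zip(preds_a, preds_b)]

w_a, w_b = fit_blend_weights(preds_a, preds_b, targets)
```

Here the fit recovers weights close to 0.25 and 0.75, meaning the second-level model has learned to trust the second base model more than the first.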
- Finally, we predict on the test set and validate the stacked model with 10-fold cross-validation, using the mean absolute error as the score:
Predictions = Stacker.predict()
Results = Stacker.validate(k=10, scorer=mean_absolute_error)
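The scorer passed to the validation step is the mean absolute error, that is, the average of |y_true - y_pred| over all samples. A minimal plain-Python sketch of the metric follows; the function name mean_abs_error is hypothetical (chosen to avoid clashing with the scikit-learn import), and the values are made up for illustration.

```python
def mean_abs_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
mae = mean_abs_error(y_true, y_pred)
print(mae)  # 0.5
```

Because the errors are not squared, the score is expressed in the same units as the target and is less dominated by a few large misses than the mean squared error.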