API overview
AdaBoost algorithm (a shrinkage coefficient needs to be added, otherwise the weights of abnormal samples are amplified without bound and the error grows)
Parameters (AdaBoostClassifier vs. AdaBoostRegressor)

base_estimator
    AdaBoostClassifier: the weak learner object; the default is the CART classification tree DecisionTreeClassifier
    AdaBoostRegressor: the weak learner object; the default is the CART regression tree DecisionTreeRegressor

algorithm
    AdaBoostClassifier: "SAMME" or "SAMME.R". SAMME weights each weak classifier by its classification error on the sample set; SAMME.R weights it by the predicted class probabilities. Because SAMME.R works with continuous probability estimates, it generally converges faster than SAMME, so the default is "SAMME.R". Note: SAMME.R requires that the weak learner given in base_estimator supports probability prediction, i.e. it must have a predict_proba method.
    AdaBoostRegressor: not supported

loss
    AdaBoostClassifier: not supported
    AdaBoostRegressor: how the error is computed; the options are "linear", "square" and "exponential", and the default is "linear". It usually does not need to be changed.

n_estimators
    The number of weak learners. Too small a value may underfit, too large a value may overfit; 50–100 is usually suitable, and the default is 50.

learning_rate
    The shrinkage coefficient ν applied to the weight of each weak learner; the default is 1. It is usually tuned starting from a fairly small value: the smaller the value, the more weak learners are needed. (Without shrinkage, the iterations before an abnormal sample is reached stay fairly accurate, while the later ones get progressively worse.) A usage sketch follows this list.
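A minimal usage sketch of the two estimators above; the synthetic datasets and the specific parameter values are illustrative assumptions, not recommendations from the notes. (In newer scikit-learn releases the base_estimator parameter has been renamed to estimator.)

from sklearn.datasets import make_classification, make_regression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier, AdaBoostRegressor

# Classification: with algorithm="SAMME.R" the base learner must implement predict_proba
Xc, yc = make_classification(n_samples=200, random_state=0)   # assumed toy data
clf = AdaBoostClassifier(base_estimator=DecisionTreeClassifier(max_depth=1),
                         n_estimators=50, learning_rate=1.0, algorithm="SAMME.R")
clf.fit(Xc, yc)
print(clf.score(Xc, yc))

# Regression: loss can be "linear", "square" or "exponential"
Xr, yr = make_regression(n_samples=200, n_features=4, noise=5.0, random_state=0)  # assumed toy data
reg = AdaBoostRegressor(n_estimators=50, learning_rate=1.0, loss="linear")
reg.fit(Xr, yr)
print(reg.score(Xr, yr))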
GBDT algorithm
Parameters (GradientBoostingClassifier vs. GradientBoostingRegressor)

alpha
    GradientBoostingClassifier: not supported
    GradientBoostingRegressor: the quantile value required when the huber or quantile loss is used; the default is 0.9. If the data contain a lot of noise, the value can be lowered somewhat (tuning it reduces the influence of abnormal samples).

loss
    GradientBoostingClassifier: the loss function; the log-likelihood loss "deviance" and the exponential loss "exponential" are available, and the default is "deviance". Changing it is not recommended.
    GradientBoostingRegressor: the loss function; the squared loss "ls", absolute loss "lad", Huber loss "huber" and quantile loss "quantile" are available, and the default is "ls". The default is usually fine; if the data contain a lot of noise, "huber" is recommended, and for piecewise (quantile) prediction, "quantile" is recommended.

n_estimators
    The maximum number of iterations (weak learners). Too small a value may underfit, too large a value may overfit; around 50–100 is usually suitable, and the default is 100.

learning_rate
    The shrinkage coefficient ν applied to the weight of each weak learner; the default is 0.1. It is usually tuned starting from a fairly small value: the smaller the value, the more weak learners are needed.

subsample
    The fraction of samples used to fit each tree, in the range (0, 1]; the default is 1, i.e. no subsampling. A value below 1 means only part of the data is used for each tree, which reduces overfitting; [0.5, 0.8] is recommended. Sampling is done without replacement.

init
    The initial estimator; it can be left unset. A usage sketch follows this list.
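A minimal sketch showing how the parameters above are typically passed; the synthetic data and the chosen values are illustrative assumptions rather than settings from the notes.

from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

# Regression: Huber loss with a slightly lowered alpha and subsample < 1 to dampen noisy samples
Xr, yr = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)  # assumed toy data
gbr = GradientBoostingRegressor(loss="huber", alpha=0.8, n_estimators=100,
                                learning_rate=0.1, subsample=0.8, random_state=0)
gbr.fit(Xr, yr)
print(gbr.score(Xr, yr))

# Classification: the default (log-likelihood) loss is kept, only the shared parameters are set
Xc, yc = make_classification(n_samples=300, random_state=0)  # assumed toy data
gbc = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                 subsample=0.8, random_state=0)
gbc.fit(Xc, yc)
print(gbc.score(Xc, yc))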
GBDT regression implemented from scratch (the underlying boosting-tree code)
The core of GBDT is the residual.
The base learners depend on each other: the current learner is fitted on the residuals left by the previous one.
As the residuals shrink, the sum of all base learners gets closer and closer to the true values.
import pandas as pd
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

# Build the dataset
df = pd.DataFrame([[1, 5.56], [2, 5.7], [3, 5.91], [4, 6.4], [5, 6.8],
                   [6, 7.05], [7, 8.9], [8, 8.7], [9, 9], [10, 9.05]],
                  columns=["x", "y"])

# Containers and settings for the base learners
M = []                    # list that stores the decision-tree models
n_trees = 4               # number of trees
X = df.iloc[:, [0]]       # features
Y = df.iloc[:, [-1]]      # target
y_ = Y.copy()             # keep a copy of the original target for later comparison

# Fit each base learner, then use its residuals as the target of the next learner
for i in range(n_trees):
    model = DecisionTreeRegressor(max_depth=2).fit(X, Y)   # new decision tree fitted on the current residuals
    M.append(model)                                        # store the model
    Y_hat = pd.DataFrame(model.predict(X), columns=["y"])  # predictions of this tree as a DataFrame
    print(Y_hat)
    Y = Y - Y_hat                                          # residuals become the target of the next learner
    print(i, Y)

# Sum the predictions of all base learners for every sample
res = np.zeros(df.shape[0])   # all-zero accumulator
for m in M:                   # iterate over the stored models
    res += m.predict(X)       # add each model's predictions to res
print(res)                    # final prediction for each sample
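As a quick sanity check, the mean_squared_error import above can be used to measure how close the summed predictions res are to the original target y_ copied before the loop (a minimal sketch using only the variables defined above):

# Training error of the additive model built by hand
print(mean_squared_error(y_, res))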
GBDT regression using the scikit-learn library
# Using the library is much simpler
import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.ensemble import GradientBoostingRegressor
import warnings
warnings.filterwarnings('ignore')

X = df.iloc[:, [0]]   # features from the same df as above
y = df.iloc[:, -1]    # original target (the loop above overwrote Y with residuals)
model = GradientBoostingRegressor(n_estimators=50)
model.fit(X, y)
y_pred = model.predict(X)
print(y_pred)
print(mean_squared_error(y, y_pred))
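As a hedged follow-up sketch, staged_predict can be used to watch the training error fall as more trees are added, which mirrors the shrinking-residual idea from the hand-rolled version (it reuses model, X and y from the snippet above):

# Training MSE after selected boosting stages
for i, stage_pred in enumerate(model.staged_predict(X), start=1):
    if i % 10 == 0:
        print(i, mean_squared_error(y, stage_pred))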