Sentiment classification of Chinese comments by fine-tuning BERT (CPU and GPU supported)

Posted by aunquarra on Wed, 15 Dec 2021 09:15:06 +0100

This post migrates the Chinese BERT model (chinese-bert-wwm) and fine-tunes it to classify the sentiment of more than 20,000 comments into three classes: 0 - good, 1 - average, 2 - poor.
After learning how powerful BERT is, I couldn't help feeling that the model I used in a previous competition was too weak to look at, so I decided to fine-tune BERT on my own dataset and re-run the competition task with it.

Text classification is done with the ktrain library.

0. Allocate GPU (CPU version omitted)

%reload_ext autoreload
%autoreload 2
%matplotlib inline
import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID";
os.environ["CUDA_VISIBLE_DEVICES"]="0";   #Specify GPU

GPU configuration steps
1. Install Python 3.7 + CUDA 10.1 + cuDNN 7.6 + tensorflow-gpu 1.13 in a Windows 10 environment

2. Pay attention to how CUDA and cuDNN are installed (environment variables)

3. The tensorflow-gpu version, CUDA version, and Python version must match, so download the corresponding releases
Download and install by following these three articles together. A quick check that TensorFlow can actually see the GPU is sketched below.
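
A minimal sanity check, assuming TensorFlow is already installed (the exact call depends on whether TF 1.x or TF 2.x is in use):

import tensorflow as tf

print(tf.__version__)
# TF 1.x (e.g. tensorflow-gpu 1.13): boolean check for a usable GPU
print(tf.test.is_gpu_available())
# TF 2.x alternative: list the physical GPUs TensorFlow can see
# print(tf.config.list_physical_devices('GPU'))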

1. Load data

#0 - very good, 1 - average, 2 - poor
import pandas as pd
import numpy as np
train = pd.read_excel('D:/python relevant/data/train_sentiment.xls')
test = pd.read_excel('D:/python relevant/data/test_sentiment.xls')
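
Before shuffling, it is worth a quick sanity check of what was loaded; the sketch below assumes both Excel files have the content and cls columns used in the rest of this post.

# Quick look at the loaded data: sizes, label distribution, missing text
print(train.shape, test.shape)
print(train['cls'].value_counts())    # counts of labels 0 / 1 / 2
print(train['content'].isna().sum())  # comments with missing text
print(train.head())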

2. Shuffle the data

X_data = train.content
y_data = np.asarray(train.cls, dtype=np.float32)
x_data_test = test.content
y_data_test = np.asarray(test.cls, dtype=np.float32)

# Shuffle the 9999 training samples and the 3602 test samples
train_idx = np.random.permutation(9999)
test_idx = np.random.permutation(3602)
x_train = [X_data[i] for i in train_idx]
y_train = np.asarray([y_data[i] for i in train_idx], dtype=int)  # np.int is deprecated; plain int works
x_test = [x_data_test[i] for i in test_idx]
y_test = np.asarray([y_data_test[i] for i in test_idx], dtype=int)
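
The same shuffle can be written more compactly; below is a sketch using sklearn.utils.shuffle (assuming scikit-learn is installed, and keeping the 9999/3602 subset sizes from above).

from sklearn.utils import shuffle

# Shuffle texts and labels together so they stay aligned
x_train, y_train = shuffle(list(X_data[:9999]), y_data[:9999].astype(int))
x_test, y_test = shuffle(list(x_data_test[:3602]), y_data_test[:3602].astype(int))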

3. Preprocess the data and build a Chinese transformer model

import ktrain
from ktrain import text
MODEL_NAME = 'hfl/chinese-bert-wwm'
t = text.Transformer(MODEL_NAME, maxlen=300, class_names=[0,1,2])  # maxlen=300 is a bit small; 400 or 500 can be used if memory allows
trn = t.preprocess_train(x_train, y_train)
val = t.preprocess_test(x_test, y_test)
model = t.get_classifier()
learner = ktrain.get_learner(model, train_data=trn, val_data=val, batch_size=1)
# GPU memory is limited, so batch_size is set as small as possible; the results suffer accordingly

preprocessing train...
language: zh-cn
Is Multi-Label? False
preprocessing test...
language: zh-cn

Estimate a better learning rate

For transformer models, experience and published results suggest that a learning rate between 2e-5 and 5e-5 usually gives good performance. Here we can use the lr_find method to look for a better learning rate on our specific data.

Here we run only one epoch (it takes too long otherwise; readers with a stronger GPU can try more).

learner.lr_find(show_plot=True, max_epochs=1)


Use the loss curve to choose the learning rate: the lower the loss at a given rate, the better (conventionally a rate just before the loss bottoms out is chosen).
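
If the curve needs to be inspected again, or a numeric suggestion is wanted, ktrain exposes helpers for this; a short sketch (the lr_estimate call assumes a reasonably recent ktrain version):

# Re-plot the loss vs. learning-rate curve recorded by lr_find
learner.lr_plot()

# Newer ktrain versions can also suggest learning rates numerically
# (rate at minimum loss and rate at the steepest drop)
# print(learner.lr_estimate())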

Train the model

Parameter settings: learning_rate = 2e-5, epochs = 4.
Training on the CPU takes a very long time; the GPU is more than ten times faster.

learner.fit_onecycle(2e-5, 4)
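
Because a full run is slow, it can help to save weights after every epoch so the run can be resumed or rolled back; a sketch using fit_onecycle's checkpoint_folder option (the folder name here is just an example):

# Save the model weights after each epoch into the given folder
learner.fit_onecycle(2e-5, 4, checkpoint_folder='checkpoints')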

Evaluate the model

learner.validate(class_names=t.get_classes())


The accuracy for category 0 is noticeably lower than for the other two categories, which may be due to the amount of data in that class.

Inspect misclassifications

Find the sample with the largest loss

learner.view_top_losses(n=1, preproc=t)

#id:397 | loss:7.4 | true:0 | pred:2

# Check the error. The reason for this misclassification is clear: the text reads like a counter-example, i.e. a negative review (category 2: bad comments).
This suggests the model itself is still quite reasonable.

print(x_test[397])

#1. Due to the design of the keyboard, some keys are not very convenient to press. 2. The lens of the camera is not protected and is exposed. If the lens surface is accidentally scratched, it will affect the effect of taking pictures. 3. The mobile phone is a little big and heavy. It's not very convenient to carry. 4. The sound box attached to the fuselage is not very good, and there will be some noise when the volume is high. 5. The battery life is a little

Test on new data

predictor = ktrain.get_predictor(learner.model, preproc=t)
predictor.predict("How come it's like this... Even copying the Hollywood structure is not enough! 'Seven days off' and 'the first day of the Spring Festival' are obvious hype. Forced sentimentality. Two elderly people died.")

2  # result (bad review)

predictor.explain("How come it's like this... Even copying the Hollywood structure is not enough! 'Seven days off' and 'the first day of the Spring Festival' are obvious hype. Forced sentimentality. Two elderly people died.")
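
To see how confident the model is rather than just the predicted label, the predictor can also return class probabilities (a sketch using ktrain's return_proba option):

# Return a probability for each of the three classes instead of a single label
probs = predictor.predict("How come it's like this... Even copying the Hollywood structure is not enough!", return_proba=True)
print(probs)  # array of probabilities for classes 0, 1, 2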

Save the model so that it can be called later.

predictor.save('/my_commentgroup_predictor')
reloaded_predictor = ktrain.load_predictor('/my_commentgroup_predictor')
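
As a quick usage check, the reloaded predictor should behave exactly like the original one; the comment text below is just a made-up positive example:

# A hypothetical positive comment; it should come back as class 0 (good)
print(reloaded_predictor.predict('The phone is great value and the battery lasts all day.'))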

Result analysis

The BERT model is used here for sentiment classification of Chinese comments. The data comes from a competition case set of about 20,000 records; because of hardware limits and training cost, it was reduced to 9,999 training samples and 3,602 test samples. Possibly because the data itself is not accurate enough and the hyperparameters had to be scaled down, the accuracy comes out at 88%; an English model on an English classification task gave better results. Before fine-tuning I read many blogs to learn the workflow, but getting started was still difficult. The GPU has little memory and cannot hold larger models, which directly limits how far the model parameters can be tuned.

Experimental summary

With more optimization the results could still be improved for competition use. The main difficulty lies in debugging the fine-tuning setup: CPU training is slow and the process is full of pitfalls, so I configured a GPU environment, which kept throwing errors and brought plenty of pitfalls of its own, leaving me exhausted from stepping in them. The environment now works, but compute power and memory are limited, so I will stop here for the time being. It reminds me again how far my skills still have to go. There is a long way ahead. Keep going!

Topics: Python Deep Learning NLP BERT