Using a QE model to analyze the influence of sentence error types on cognitive difficulty

Posted by duvys on Thu, 20 Jan 2022 16:55:58 +0100

Processing corpus

Extract src, mt, and time from the raw corpus and write them to three separate files.
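A minimal sketch of the extraction, assuming (the raw format is not specified here, so this is an assumption) that the raw corpus is a tab-separated file raw.tsv with one src/mt/time triple per line:

# extract_fields.py -- split the raw corpus into three parallel files.
# Assumption: raw.tsv is tab-separated, one "src<TAB>mt<TAB>time" triple per line.
with open("raw.tsv", encoding="utf-8") as raw, \
     open("corpus.src", "w", encoding="utf-8") as src, \
     open("corpus.mt", "w", encoding="utf-8") as mt, \
     open("corpus.time", "w", encoding="utf-8") as tim:
    for line in raw:
        s, m, t = line.rstrip("\n").split("\t")
        src.write(s + "\n")
        mt.write(m + "\n")
        tim.write(t + "\n")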

The sentences in src and mt need to have their annotation marks stripped.

The time values need to be normalized. Normalization method: divide each time by the maximum time value (normalizing with softmax yields values that are too small).
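A minimal sketch of the max-normalization, reusing the hypothetical corpus.time file from the sketch above:

# normalize_time.py -- divide every reading time by the maximum time.
with open("corpus.time", encoding="utf-8") as f:
    times = [float(line) for line in f]
max_t = max(times)
with open("corpus.time.norm", "w", encoding="utf-8") as out:
    for t in times:
        out.write(f"{t / max_t:.6f}\n")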

The resulting src and mt files are still in their original order, so the corpus needs to be shuffled next. All three files must be shuffled with the same permutation so their lines stay aligned (see the sketch below).
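A minimal sketch of the joint shuffle, again with the hypothetical file names from above:

# shuffle_corpus.py -- shuffle all three files with one shared permutation,
# so that line i of each output file still describes the same sentence pair.
import random

src = open("corpus.src", encoding="utf-8").readlines()
mt = open("corpus.mt", encoding="utf-8").readlines()
tim = open("corpus.time.norm", encoding="utf-8").readlines()
assert len(src) == len(mt) == len(tim)

order = list(range(len(src)))
random.seed(42)  # fixed seed so the shuffle is reproducible
random.shuffle(order)

for name, lines in (("shuffled.src", src), ("shuffled.mt", mt), ("shuffled.time", tim)):
    with open(name, "w", encoding="utf-8") as out:
        out.writelines(lines[i] for i in order)

The shuffled files can then be cut into the train/dev/test splits listed below.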

Finally

The files required for the model are:

Training set: train.src, train.mt, train.mt.hter (must be normalized!)

Test set: test.src, test.mt, test.mt.hter (must be normalized!)

Validation set: dev.src, dev.mt, dev.mt.hter (dev can be the same as test)
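With datadir=./data/qe/error-corpus as set in qe_train.sh below, the expected layout would be:

data/qe/error-corpus/
    train.src    train.mt    train.mt.hter
    dev.src      dev.mt      dev.mt.hter
    test.src     test.mt     test.mt.hter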

Modify the script parameters

qe_train.sh:

Select the GPU card on the server:

export CUDA_VISIBLE_DEVICES=3

Data path: this directory contains the training, test, and validation sets.

datadir=./data/qe/error-corpus

Vocabulary path: this path does not need to change. The QE model here still uses the previous, standard QE vocabulary; only the training corpus and the gold labels change. The corpus is replaced by the error-annotated corpus, and the original HTER values are replaced by the normalized times, which serve as the targets for prediction.

vocabdir=./data/vocab/ccmt2021/55

Model path: where the model trained on the new corpus will be saved.

modeldir=./saved_error_score_model

This block passes the parameters to qe_model.py. Make sure the names of train, test, and dev are spelled exactly right. The --lab=mt.hter option on the fifth line is the suffix of the HTER label file from the original experiment; changing it to another suffix raises an error, so the mt.hter suffix is kept here, and the normalized-time files are simply renamed to end in mt.hter.
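For example, assuming the normalized-time files ended up named train.time, dev.time, and test.time (hypothetical names following the sketches above), the renaming is just:

# rename_labels.py -- give the normalized-time files the required mt.hter suffix.
import os

for split in ("train", "dev", "test"):
    os.rename(f"{split}.time", f"{split}.mt.hter")  # hypothetical source names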

--num_train_steps=40000 is the number of training steps.

python qe_model.py \
      --src=src \
      --mt=mt \
      --fea=sent.tfrecord \
      --lab=mt.hter \
      --train_prefix=${datadir}/train \
      --dev_prefix=${datadir}/dev \
      --test_prefix=${datadir}/dev \
      --vocab_prefix=${vocabdir}/vocab.low \
      --max_vocab_size=120000 \
      --out_dir=${modeldir} \
      --optimizer=lazyadam \
      --warmup_steps=8000 \
      --learning_rate=2.0 \
      --num_train_steps=40000 \
      --steps_per_stats=100 \
      --steps_per_external_eval=1000 \
      --rnn_units=128 \
      --rnn_layers=1 \
      --embedding_size=512 \
      --num_units=512 \
      --num_layers=4 \
      --ffn_inner_dim=512 \
      --qe_batch_size=64 \
      --infer_batch_size=64 \
      --metrics=pearson \
      --use_hf=False \
      --num_buckets=4 \
      --dim_hf=17 \
      --train_level=sent \
      --avg_ckpts=True \
      --fixed_exp=True \
      --label_smoothing=0.1 \
      --exp_model_dir=${exp_modeldir}

Start training

Then you can start training:

bash qe_train.sh

Training takes six hours, and the trained model is saved in saved_error_score_model.

Perform inference

After training finishes, run the model on the test set:

bash qe_infer.sh

The generated prediction file is saved in the infer-error folder.

qe_infer.sh:

export CUDA_VISIBLE_DEVICES=0

metrics=pearson

datadir=./data/qe/error-corpus
modeldir=./saved_error_score_model
inferdir=${datadir}/infer-error

mkdir -p ${inferdir}

data=test

python qe_model.py \
    --out_dir=${modeldir} \
    --ckpt=${modeldir}/avg_best_${metrics} \
    --inference_src_file=${datadir}/${data}.src \
    --inference_mt_file=${datadir}/${data}.mt \
    --inference_fea_file=${datadir}/${data}.sent.tfrecord \
    --inference_output_file=${inferdir}/${data}.infer2

Calculate Pearson coefficient

Compute the Pearson coefficient between the generated test.infer2 and test.mt.hter.
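A minimal sketch of the Pearson calculation with scipy.stats.pearsonr, assuming both files contain one float per line:

# pearson.py -- Pearson correlation between the predictions and the normalized times.
from scipy.stats import pearsonr

with open("data/qe/error-corpus/infer-error/test.infer2", encoding="utf-8") as f:
    pred = [float(line) for line in f]
with open("data/qe/error-corpus/test.mt.hter", encoding="utf-8") as f:
    gold = [float(line) for line in f]

r, p = pearsonr(pred, gold)
print(f"pearson = {r:.4f}  (p = {p:.3g})")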

Topics: NLP