A practical, highly detailed introductory tutorial on NLP text classification
Contents
Preface
I. Data loading
1. Load packages
2. Read data
II. Text processing
1. Remove useless characters
2. Text segmentation
3. Remove stop words
4. Remove low-frequency words
5. Split into training and test sets
III. Convert text into vector form
1. Convert text into TF-IDF vectors
2. Convert text into word2vec vectors
3. Conv ...
Posted by adamwhiles on Fri, 28 Jan 2022 02:58:21 +0100
Detailed explanation of label smoothing, with PyTorch and TensorFlow implementations
Definition: Label smoothing, like L1, L2 and dropout, is a regularization method in machine learning. It is usually used for classification problems. Its purpose is to keep the model from predicting labels too confidently during training, which mitigates poor generalization. Background: For the classification proble ...
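A minimal sketch of the idea, assuming the common uniform formulation in which a fraction epsilon of the probability mass is spread evenly over all classes. The excerpt does not show the post's own formula, so the names `smooth_labels`, `cross_entropy` and `epsilon` are illustrative only:

```python
import math


def smooth_labels(label, num_classes, epsilon=0.1):
    """Return a smoothed target distribution for one integer class label.

    Assumed formulation: (1 - epsilon) * one_hot + epsilon / num_classes.
    """
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    # Every class receives an equal share of the smoothing mass.
    target = [epsilon / num_classes] * num_classes
    # The true class keeps the remaining probability mass.
    target[label] += 1.0 - epsilon
    return target


def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


if __name__ == "__main__":
    hard = smooth_labels(1, 4, epsilon=0.0)   # plain one-hot target
    soft = smooth_labels(1, 4, epsilon=0.1)   # smoothed target
    confident = [0.001, 0.997, 0.001, 0.001]  # a very confident prediction
    print(cross_entropy(hard, confident), cross_entropy(soft, confident))
```

With the smoothed target, a very confident prediction is still penalized for the small probabilities it assigns to the other classes, which is how the method discourages overconfidence.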
Posted by runfastrick on Thu, 27 Jan 2022 14:28:53 +0100
[PyTorch] 13 Image Caption: let a neural network look at pictures and tell stories
1. Data set acquisition
Data source: AI Challenger 2017 image description dataset. Baidu online disk: https://pan.baidu.com/s/1g1XaPKzNvOurH9M44p1qrw Extraction code: bag3
Since the original training set is too large, only the validation set, ai_challenger_caption_validation_20170910.zip, is used here. Unzip it.
2. Text data processing
...
Posted by Syranide on Tue, 25 Jan 2022 10:54:37 +0100
Documentation for word segmentation
The Git repository for word segmentation cannot be opened, so its content is reposted here for easy reference.
How to use word segmentation:
1. Quick experience
Run the script demo-word.bat in the project root directory to quickly try out word segmentation.
usage: command [text] [input] [output]
The optional values for command are: demo, text ...
Posted by eco on Tue, 25 Jan 2022 04:17:34 +0100
Detailed explanation of NLP Transformer
Transformer details
"Attention Is All You Need" is a paper from Google that takes the idea of attention to its fullest. It proposes a new model called the Transformer, which abandons the CNNs and RNNs used in earlier deep learning tasks (in fact not completely, since it still uses one-dimensional convolution). This model is ...
Posted by SieRobin on Tue, 25 Jan 2022 00:30:01 +0100
Create a "theft note" with PaddleOCR
While following AI Studio courses and other online courses, are you troubled by problems like these?
Taking notes is too slow. The notes are incomplete. You cannot keep up with the teacher's pace. You miss part of the lecture because you are busy taking notes.
Come and create a "theft note" with PaddleOCR!
Introduction ...
Posted by presence on Sun, 23 Jan 2022 05:53:05 +0100
Master every video on Bilibili (station B) in one move: a Python script that downloads your favorite videos, whatever you want
For those who like watching videos on their phones, this post writes a script that downloads Bilibili dance videos. Interested readers can try it for practice and, once they have mastered the method, download videos from other sections such as animation, music, fashion, ghost and tiktok. Bilibili can also help with learning. Download ...
Posted by knowj on Sat, 22 Jan 2022 16:27:51 +0100
How to build a BERT text classification model. Come and take a look!
1 Title
Quality analysis model for enterprise hidden-danger investigation based on text mining
2 Competition background
Having enterprises report production-safety hazards on their own is of great significance for eliminating risks while accidents are still in the embryonic stage. When enterprises fill in hidden dangers, they often d ...
Posted by mverrier on Fri, 21 Jan 2022 13:10:00 +0100
Using a QE model to analyze the influence of sentence error types on cognitive difficulty
Processing corpus
Extract src, mt and time from the original data and put them in three separate files.
Sentences in src and mt need to have their markup removed.
Time needs to be normalized. Normalization method: divide each time by the maximum value in time (values normalized with softmax come out too small).
The resulting src and mt are in order. Next, ...
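The normalization step above can be sketched as follows. This is an illustration under assumptions, not the post's code: the function names and the sample times are made up, and softmax is included only to show why its outputs come out small by comparison.

```python
import math


def max_normalize(times):
    """Divide every value by the maximum, so the largest becomes 1.0."""
    peak = max(times)
    if peak <= 0:
        raise ValueError("expected at least one positive time value")
    return [t / peak for t in times]


def softmax(values):
    """Numerically stable softmax, shown only for comparison."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    times = [2.0, 4.0, 8.0]  # hypothetical times for three sentences
    print(max_normalize(times))
    print(softmax(times))
```

Max-normalization keeps the ratios between the times and maps them into (0, 1]. Softmax forces the values to sum to 1 and exponentially favors the largest, so for these sample times the smallest output is well below 0.01.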
Posted by duvys on Thu, 20 Jan 2022 16:55:58 +0100
Analysis of adversarial training for boosting scores in NLP competitions
Preface
In NLP competitions, adversarial training is a common way to improve scores. This article introduces the application scenarios, purpose, types, concrete implementation and future prospects of adversarial training in detail.
Adversarial training application scenarios
Szegedy proposed the concept of adversarial examples at ICLR 2014. ...
Posted by dormouse1976 on Thu, 20 Jan 2022 16:04:38 +0100