I found the reason with Python and became even more hopeless

Posted by jogisarge on Thu, 27 Jan 2022 07:02:58 +0100

Kevin, who disappeared for a week, is back! We won't talk about sensitive topics in this issue, because I've had enough.

Let me tell you about a friend's bitter history with Yaohao (the license plate lottery), and how I used Python (crawling, data analysis, visualization) to help him find the reason, which only made him sadder.

The data in this article is for discussion and learning only and should not be used as a basis for anything else. The text is written in the first person, but the one entering the lottery is my friend.

Bitter history

My Yaohao counter has silently executed once again: += 1,

If I had known from the start that I wouldn't win in six years, I would have bought a plate at a low price as soon as possible..

The individual winning rate in February was 0.54%!!

What does that even mean??

On average, it takes nearly 200 draws to win once!! 200 draws, 16.6 years!!

I guess by then, people will be flying..
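To check the arithmetic above, here is a rough sketch. It assumes every draw is independent, the win rate stays at the 0.54% February figure, and there are 12 draws a year (which is what "200 times, 16.6 years" implies). The function names are made up for illustration. Under those assumptions the exact expectation is about 185 draws, a bit over 15 years, and the chance of winning at least once in six years is roughly one in three.

```python
# Assumptions (not taken from the lottery rules): independent draws,
# a constant win rate, and 12 draws per year.
WIN_RATE = 0.0054      # the 0.54% individual win rate quoted above
DRAWS_PER_YEAR = 12    # implied by "200 times, 16.6 years"


def expected_draws(p):
    """Mean number of draws until the first win (geometric distribution)."""
    return 1 / p


def chance_of_winning_within(p, draws):
    """Probability of at least one win in the given number of draws."""
    return 1 - (1 - p) ** draws


if __name__ == "__main__":
    draws = expected_draws(WIN_RATE)
    print(f"expected draws: {draws:.0f}")
    print(f"expected years: {draws / DRAWS_PER_YEAR:.1f}")
    six_years = chance_of_winning_within(WIN_RATE, 6 * DRAWS_PER_YEAR)
    print(f"chance of a win within six years: {six_years:.1%}")
```

In reality the win rate changes from draw to draw, so treat these numbers as a ballpark only.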


Lao Wang, who lived in the same university dormitory as me,

after starting work, traded his bicycle for an electric scooter,

and after entering the lottery just three times, traded the electric scooter for a car.

Every day he drives a different girl around!!!

That is what memories of youth are made of!!!!


After I started work, I went from a bicycle to an electric scooter, and then straight from the electric scooter to the subway!!!

And watching a million people compete for just over 5,000 license plates,

I feel like I'm only there to pad the denominator in 5000 / 1000000.


No way! I want to fight against fate,

I want to find the pattern behind the license plate lottery!!


Just do it!

Get data

The data can be obtained with a crawler or copied manually. Here, let's pretend the copy-and-paste method is used; in reality, I used a crawler. How to get the relevant code is explained at the end of the article.

The target is one city's passenger car quota website. You can analyze your own city's website in the same way.
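The crawler itself is not shown in this article. A minimal sketch of the idea, using only the standard library, might look like the following. Everything specific here is an assumption: the URL is a placeholder, the results page is assumed to put each winner's name in a `<td class="name">` cell, and the first character of the name is taken as the surname (so compound surnames are cut short). The names `NameCellParser`, `extract_surnames`, and `crawl` are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

# Hypothetical address: replace it with the results page of your own city.
RESULT_URL = "https://example.com/yaohao/results?page=1"


class NameCellParser(HTMLParser):
    """Collect the text of every <td class="name"> cell (assumed layout)."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_name_cell = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (attribute, value) pairs
        if tag == "td" and ("class", "name") in attrs:
            self._in_name_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_name_cell = False

    def handle_data(self, data):
        if self._in_name_cell and data.strip():
            self.names.append(data.strip())


def extract_surnames(html):
    """Return the first character of each name found in the page."""
    parser = NameCellParser()
    parser.feed(html)
    return [name[0] for name in parser.names]


def crawl(url=RESULT_URL, out_path="yaohao_data_analysis.txt"):
    """Download one page and append its surnames to the analysis file."""
    with urlopen(url) as response:
        html = response.read().decode("utf-8")
    with open(out_path, "a", encoding="utf-8") as f:
        f.write(" ".join(extract_surnames(html)) + "\n")
```

Calling `crawl()` once per results page would build up the text file that the analysis code later reads. Check the site's terms of use and keep the request rate low before running anything like this against a real site.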


So I deftly typed the URL that has brought me both joy and worry into the browser,


After a series of operations,

Copy and paste, copy and paste, copy and paste

All copied.


Next is the data analysis stage.

Start data analysis

Please give me some time ~ I want to know which surname always wins the lottery

After three days and nights of hard work, I finally finished sorting the data


Continuing the analysis, let's bring out Python and make a word cloud for a more intuitive view:


Share code:

import matplotlib.pyplot as plt
from wordcloud import WordCloud


# 1. Read the text data
# WordCloud splits the text into tokens, counts them, and sizes each token by how often it appears.
with open(r'yaohao_data_analysis.txt', 'r', encoding='utf-8') as f:
    text = f.read()

# 2. Generate the word cloud. WordCloud does not support Chinese by default, so a downloaded Chinese font is required here
# No mask image is used, so the pixel size of the output must be set explicitly. The default background color is black; mode and colormap control the color mode and the text color scheme.
wc = WordCloud(
        # Without a Chinese font, the characters render as garbled boxes
        font_path=r'./font.otf',
        # Background color
        background_color='white',
        # Canvas width in pixels
        width=800,
        # Canvas height in pixels
        height=600,
        # Largest font size
        max_font_size=200,
        # Smallest font size
        min_font_size=50,
        mode='RGBA',
        # colormap='pink'
        )
# Generate the word cloud
wc.generate(text)
# Save the image at the configured pixel size; the saved file is sharper than the matplotlib preview below
wc.to_file(r"wordcloud.png")
# 3. Display the image
# Name the figure window
plt.figure("Yaohao analysis chart")
# Show the word cloud as an image
plt.imshow(wc)
# Hide the axes
plt.axis("off")
plt.show()

Next, a bar chart of the 20 surnames that appear most often among the winners:


Share code:

import re
from collections import Counter
import matplotlib.pyplot as plt

# Match every Chinese character with a regular expression; each one is treated as a surname
with open('./yaohao_data_analysis.txt', 'r', encoding='utf-8') as f:
    text = f.read()
f_name_list = re.findall(r'[\u4E00-\u9FA5]', text)

# Count how many times each surname appears
count = Counter(f_name_list)

# Take the 20 most common surnames, sorted by count in descending order
top20 = count.most_common(20)
key = [k for k, v in top20]
value = [v for k, v in top20]

# Generate the bar chart
# These two settings let matplotlib render Chinese labels and the minus sign correctly
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

plt.bar(key, value)
plt.title('Yaohao analysis bar chart')
plt.show()

Number one really is Lao Wang next door!!
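One caveat: the chart counts winners, and a count is not a winning rate. A very common surname will top the chart simply because more of its bearers apply. To compare rates you would also need the number of applicants per surname, which this article does not have. The sketch below shows only the calculation; the numbers are invented toy data and `win_rate_by_surname` is a made-up helper name.

```python
from collections import Counter


def win_rate_by_surname(winners, applicants):
    """Winners divided by applicants for each surname that has applicants."""
    won = Counter(winners)
    applied = Counter(applicants)
    return {name: won[name] / total for name, total in applied.items()}


# Invented toy data, only to show the calculation
applicants = ["王"] * 1000 + ["李"] * 800 + ["赵"] * 200
winners = ["王"] * 5 + ["李"] * 4 + ["赵"] * 1

for name, rate in win_rate_by_surname(winners, applicants).items():
    print(name, f"{rate:.2%}")
```

In this toy data 王 has the most winners, yet all three surnames end up with the same rate.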


Postscript

Learning Python well pays off, whether for a job or a side gig, but you still need a learning plan. Finally, here is a full set of Python learning materials to help anyone who wants to learn Python!

1, Python learning roadmap for every direction

The roadmap sorts the commonly used Python techniques into a summary of knowledge points for each field, so that you can look up matching learning resources for each point and learn more comprehensively.

2, Learning software

A craftsman who wants to do good work must first sharpen his tools. The development software commonly used for learning Python is all here, which saves you a lot of time.

3, Getting started video

When watching videos, we can't just use our eyes and brain and never our hands. A more effective way to learn is to apply things once you understand them, and practice projects are well suited to that.

4, Hands-on cases

Theory alone is useless. We should type the code along as we learn and practice hands-on, so that what we learn gets applied. Working through some real cases is a good way to do that.

5, Interview materials

We learn Python in order to find a high-paying job. The following interview questions are the latest interview materials from top Internet companies such as Alibaba, Tencent, and ByteDance, with authoritative answers from senior Alibaba engineers. After working through this set of interview materials, I believe everyone can find a satisfactory job.


This complete set of Python learning materials has been uploaded to CSDN. You can scan the official CSDN QR code below on WeChat to get it for free [guaranteed to be 100% free]

Topics: Python Back-end crawler