Hello, I'm Chen Cheng!
Not long ago I came across a short video claiming that "of the 170 million people born in the 1990s, only about 10 million couples have married, a marriage rate below 10%". I can't verify the source or accuracy of that figure, but I do regularly hear friends complain about how hard it is to stop being single and find a suitable partner.
Today I wrote a simple Python script to scrape publicly posted matchmaking ads and see what kind of people go on blind dates, what they look for in a partner, and who is more likely to find one.
Writing the code
First we import the required libraries. We use the requests library to send HTTP requests and receive the responses, the standard re module to parse the data with regular expressions, and tenacity for retries:
    import re
    import time

    import requests
    from tenacity import retry, stop_after_attempt
Requests often time out, so when an error occurs we want to try again a few times. The retry decorator from tenacity does this; here it gives up after five attempts:
    # headers and proxies are assumed to be defined elsewhere in the script
    @retry(stop=stop_after_attempt(5))
    def do_requests(url):
        response = requests.get(url, headers=headers, proxies=proxies, timeout=10)
        return response.text
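To make clear what a retry decorator does, here is a minimal standard-library sketch of the same idea. It is not how tenacity is implemented, and the names retry_sketch and flaky_fetch are hypothetical. One difference worth knowing: by default tenacity wraps the final failure in its own RetryError, whereas this sketch simply re-raises the last exception.

```python
import functools
import time


def retry_sketch(max_attempts=5, delay_seconds=0.0):
    """Hypothetical stand-in for a retry decorator (not tenacity's code)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as error:  # retry on any exception
                    last_error = error
                    if attempt < max_attempts:
                        time.sleep(delay_seconds)
            # Every attempt failed: surface the last error to the caller.
            raise last_error
        return wrapper
    return decorator


calls = {"count": 0}


@retry_sketch(max_attempts=5)
def flaky_fetch():
    """Simulated request that times out twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "<html>ok</html>"


result = flaky_fetch()
```

The caller sees only the successful third attempt; the two simulated timeouts are absorbed by the decorator.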
The fields we capture include date of birth, height/weight, education, income, occupation, self-introduction, mate selection criteria, and house and car ownership. Each one is extracted with a regular expression from the re module:
    # Literal square brackets are escaped so they are not read as a character class
    date_of_birth = re.compile(r"<br/>①date of birth/constellation(.*?)<br/>", re.M | re.S)
    sex = re.compile(r"<br/>\[basic information\](.*?)<br/>")
    height = re.compile(r"<br/>②height/weight(.*?)<br/>")
    education = re.compile(r"<br/>⑤education(.*?)<br/>")
    jobs_1 = re.compile(r"<br/>⑥occupation(.*?)<br/>")
    income = re.compile(r"<br/>⑦Average monthly income(.*?)<br/>")
    married = re.compile(r"<br/>⑨Have you ever been married(.*?)<br/>")
    house_cars = re.compile(r"<br/>⑧Garage conditions(.*?)<br/>")
    self_intro = re.compile(r"<br/>⑪ self-introduction(.*?)<br/>")
    requirements = re.compile(r"<br/>\[Spouse selection criteria\]<br/>(.*?)</a>")
    family_member = re.compile(r"<br/>⑩member of family(.*?)<br/>")
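As a rough illustration of how patterns like these turn a page into a record, the sketch below runs a few of them over a made-up HTML fragment. The sample markup, the extract_fields helper and the trimming of colons are assumptions for demonstration; the real pages may be laid out differently.

```python
import re

# Made-up fragment for illustration; the real markup may differ.
sample_html = (
    "<br/>②height/weight: 165cm/50kg"
    "<br/>⑤education: Bachelor"
    "<br/>⑦Average monthly income: 15k<br/>"
)

patterns = {
    "height": re.compile(r"<br/>②height/weight(.*?)<br/>"),
    "education": re.compile(r"<br/>⑤education(.*?)<br/>"),
    "income": re.compile(r"<br/>⑦Average monthly income(.*?)<br/>"),
    "married": re.compile(r"<br/>⑨Have you ever been married(.*?)<br/>"),
}


def extract_fields(html):
    """Apply each pattern once and collect the captured text (None if absent)."""
    record = {}
    for name, pattern in patterns.items():
        match = pattern.search(html)
        # Trim spaces and ASCII or full-width colons left around the value.
        record[name] = match.group(1).strip(" :：") if match else None
    return record


record = extract_fields(sample_html)
print(record)
```

Fields that are missing from an ad come back as None instead of raising an error, which keeps one incomplete ad from stopping the whole crawl.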
Visualizing the results
Let's look at the sex ratio first. Women make up the larger share of the people posting, mainly because the data comes from matchmaking ads in big cities such as Beijing, Shanghai and Hangzhou. It seems that women in big cities have a harder time finding a partner.
Now let's look at the single women. Most were born in 1994, 1993 and 1995, which puts them right at the typical age for marriage.
As for education, the vast majority hold a bachelor's degree. Junior college graduates form the second-largest group, while master's and doctoral degree holders are in the minority.
I also tallied the zodiac signs of the single women and found that Virgo, Libra, Sagittarius and Aries were slightly overrepresented.
Finally, let's look at their mate selection criteria. I extracted the criteria separately and drew a word cloud from them:
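The breakdowns above (sex, birth year, education, zodiac sign) all come down to counting how often each value occurs and dividing by the total. A minimal sketch of that step follows; the share_by_value helper and the sample list are made up for illustration and are not the scraped data.

```python
from collections import Counter


def share_by_value(values):
    """Return each distinct value's share of the total, most common first."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.most_common()}


# Made-up sample, not the real data set.
sample_sex = ["female", "female", "male", "female"]
shares = share_by_value(sample_sex)
print(shares)
```

The same helper can be applied to any extracted column, and the resulting shares fed to whichever charting library is used for the figures.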
    from collections import Counter

    import stylecloud

    # df_girls and get_cut_words are assumed to be defined earlier in the script
    review_list = []
    reviews = get_cut_words("".join(df_girls["requirements"].astype(str).tolist()))
    reviews_counter = Counter(reviews).most_common(200)
    print(reviews_counter)
    # Repeat each word by its frequency so the cloud reflects the counts
    for word, count in reviews_counter:
        review_list.append((" " + word + " ") * count)
    stylecloud.gen_stylecloud(text=" ".join(review_list), max_words=500,
                              collocations=False, font_path="KAITI.ttf",
                              icon_name="fab fa-apple", size=653, output_name="4.png")
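The text passed to the word cloud is built by counting tokens and repeating each one as many times as it occurs. The sketch below shows that step using only the standard library. The get_cut_words_sketch function is a hypothetical stand-in (the original get_cut_words is not shown and presumably performs Chinese word segmentation), and the stop-word list and sample sentence are invented.

```python
from collections import Counter

STOP_WORDS = {"a", "and", "the", "with"}  # illustrative only


def get_cut_words_sketch(text):
    """Hypothetical tokenizer: lowercase, split on whitespace, drop stop words."""
    return [token for token in text.lower().split() if token not in STOP_WORDS]


def build_cloud_text(words, top_n=200):
    """Repeat each of the top_n words by its count, most frequent first."""
    pieces = []
    for word, count in Counter(words).most_common(top_n):
        pieces.append(" ".join([word] * count))
    return " ".join(pieces)


sample = "stable job and a house and a car stable job"  # invented example
words = get_cut_words_sketch(sample)
cloud_text = build_cloud_text(words)
print(cloud_text)
```

Keeping only the top 200 words before building the text drops rare tokens, so the cloud is dominated by the terms that actually recur across ads.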
The resulting word cloud is shown in the figure below.
The word cloud suggests that women in the matchmaking market first of all want a man who owns a house and a car. Next, they mind if the man has been married before. A stable job, ability and a sense of responsibility tend to leave a better impression. As for external conditions, most ask for a height of about 175-180 cm and a birth year between 1990 and 1997.
Final thoughts
In recent years, as attitudes have changed, blind dates have gradually been accepted by young people, especially those who have a small social circle and few chances to meet the opposite sex. I hope everyone finds love in the end and goes on to a better life.