After the epidemic: building a visual travel guide with Python

Posted by flyersman on Sat, 27 Jun 2020 08:49:04 +0200

 

Preface

The text and pictures in this article come from the Internet and are for learning and exchange only, not for any commercial purpose. The copyright belongs to the original author. If you have any concerns, please contact us promptly so we can handle them.

What do you most want to do once the outbreak is over? Take off the mask and go out into the sun? Go out for Korean food, grilled skewers, or hot pot? Spend a few days with your family to make up for the time together you missed during the Spring Festival? Or go back to the once-familiar cinema and catch up on all the New Year's blockbusters you missed?

 

When the epidemic is over, do you want to visit the hero city of Wuhan, see the cherry blossoms, eat a bowl of hot dry noodles in Hubu Alley, and watch the cars come and go on the Yangtze River Bridge? Perhaps you have even more travel plans, and want to see the beautiful scenery wherever it is.

 

Travel is about relaxing and experiencing local character. A perfect trip needs a detailed travel guide. For this reason, in the previous article the editor crawled the travel guide library of Qunar ("Where to go") and obtained nearly 38,000 guide records. The data fields include: area, destination, title, link, guide author, departure date, days, number of photos, number of people, play method, cost, number of readings, number of likes, number of comments, itinerary, etc.

Data preprocessing

The crawled data needs further processing before it can be analyzed. The main processing steps are as follows:

  • Delete duplicate values
  • Correct field formats (for example, strip units so that values are numeric)
  • Delete unnecessary fields

Code implementation

# Imports
import re
import pandas as pd

# Load the crawled guide data
base_data = pd.read_excel('trip_data_merge.xlsx')
# Delete duplicate rows
base_data.drop_duplicates(inplace=True)
# Drop fields not needed in the analysis
base_data = base_data.drop(['link'], axis=1)

# Field correction: keep only the digits so the values are easy to aggregate
def digits_only(x):
    return re.sub(r"\D", "", str(x))

base_data['Days'] = base_data['Days'].apply(digits_only)
base_data['Number of photos'] = base_data['Number of photos'].apply(digits_only)
# A missing cost becomes 0
base_data['cost'] = base_data['cost'].apply(digits_only).apply(lambda x: int(x) if x else 0)
# Split the departure date into a date string and a year
base_data['date'] = base_data['Departure date'].apply(lambda x: x.split()[0])
base_data['date_year'] = base_data['Departure date'].apply(lambda x: x.split()[0][:4])

# Expand reading counts given in units of ten thousand, e.g. '1.2 ten thousand' -> 12000.
# Test with `in`: str.find() returns -1 (truthy) when the text is absent.
def parse_readings(x):
    s = str(x)
    if 'ten thousand' in s:
        return int(round(float(re.sub(r"[^\d.]", "", s)) * 10000))
    return x

base_data['Number of readings'] = base_data['Number of readings'].apply(parse_readings)
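
The cleaning rules can be tried on a few hand-made values without the Excel file. This is only a minimal sketch: the sample rows and the helper names `digits_only` and `parse_readings` are made up for illustration, and it assumes reading counts look like `1.2 ten thousand`, which is what the test in the code above suggests.

```python
import re
import pandas as pd

def digits_only(value):
    """Keep only the digits of a value, e.g. '5 days' -> '5'."""
    return re.sub(r"\D", "", str(value))

def parse_readings(value):
    """Expand counts written in units of ten thousand into integers."""
    text = str(value)
    if "ten thousand" in text:
        # Keep digits and the decimal point, then scale by 10,000.
        number = float(re.sub(r"[^\d.]", "", text))
        return int(round(number * 10000))
    return int(digits_only(text))

# Hypothetical sample rows, not taken from the crawled data.
sample = pd.DataFrame({
    "Days": ["5 days", "12 days"],
    "cost": ["3000 yuan", None],
    "Number of readings": ["1.2 ten thousand", "860"],
})

sample["Days"] = sample["Days"].apply(digits_only).astype(int)
# A missing cost has no digits left after cleaning, so it becomes 0.
sample["cost"] = sample["cost"].apply(digits_only).apply(lambda s: int(s) if s else 0)
sample["Number of readings"] = sample["Number of readings"].apply(parse_readings)
print(sample)
```

The membership test `"ten thousand" in text` is used on purpose: `str.find` returns -1 when the text is absent and 0 when it sits at the start of the string, so using its result directly as a condition gives the opposite of what is intended in both cases.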

Data analysis and visualization

 

Cost issues

Cost is the first thing to consider when traveling. Because of the epidemic, we excluded the 2020 data from the cost statistics and used only the data for 2017, 2018 and 2019.
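
The year filter and the per capita average could be computed roughly as follows. This is a sketch under assumptions, not the author's code: the sample rows are invented, the column names `destination`, `date_year` and `cost` follow the field list and preprocessing in this article, and treating a cost of 0 as "not stated" is my own assumption.

```python
import pandas as pd

# Hypothetical sample rows; the real data comes from the crawled guides.
guides = pd.DataFrame({
    "destination": ["Shanghai", "Shanghai", "Maldives", "Maldives", "Sanya"],
    "date_year": ["2018", "2020", "2019", "2017", "2019"],
    "cost": [3000, 2500, 20000, 18000, 0],
})

# Keep 2017-2019 only and ignore guides with no stated cost (stored as 0).
in_range = guides["date_year"].isin(["2017", "2018", "2019"])
recent = guides[in_range & (guides["cost"] > 0)]

# Average per capita cost for each destination, highest first.
avg_cost = (
    recent.groupby("destination")["cost"]
    .mean()
    .sort_values(ascending=False)
)
print(avg_cost)
```

Leaving the zero-cost rows in would pull every average down, so whether to drop them is worth deciding explicitly before comparing destinations.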

 

 

 

The figure above shows the per capita spending at popular destinations over the past three years, both domestic and overseas. According to the statistics, average per capita spending is 9461 yuan for overseas destinations and 3313 yuan for domestic ones, so tourists spend about 2.85 times as much abroad as in China. The top four domestic destinations by per capita spending are Lijiang, Sanya, Hong Kong and Shanghai. The top four overseas are the Maldives, France, the United States and Japan. Why is per capita spending in the Maldives 6 times higher than in Shanghai?

 

Tourists

 

Distribution of tourists in Maldives

 

 

 

Distribution of tourists in Shanghai

 

 

 

The Maldives, a place whose name alone stirs the imagination, has been called a necklace that God scattered over the world. This "last paradise on earth" attracts many people for holidays and leisure, and couples account for as much as 54.8% of its visitors. In addition, the cost of flights and hotels is an important reason for the high spending in the Maldives. Tourists in Shanghai are more evenly spread: couples make up about 15%, while solo travelers and small groups of friends account for a relatively high share.
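
Shares like these can be derived by counting the companion type per destination. The sketch below is illustrative only: the sample rows and the helper `companion_share` are hypothetical, and it assumes the `number of people` field holds one companion label per guide.

```python
import pandas as pd

# Hypothetical sample rows, one companion label per guide.
guides = pd.DataFrame({
    "destination": ["Maldives", "Maldives", "Maldives",
                    "Shanghai", "Shanghai", "Shanghai", "Shanghai"],
    "number of people": ["couple", "couple", "family",
                         "solo", "friends", "couple", "solo"],
})

def companion_share(df, city):
    """Return the percentage of guides per companion type for one destination."""
    subset = df[df["destination"] == city]
    shares = subset["number of people"].value_counts(normalize=True) * 100
    return shares.round(1)

maldives_share = companion_share(guides, "Maldives")
shanghai_share = companion_share(guides, "Shanghai")
print(maldives_share)
print(shanghai_share)
```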

Length of stay

Length of stay of Shanghai tourists

 

 

 

Length of stay of Maldives tourists

 

 

 

Length of stay is the core indicator of how attractive a city is to tourists. From the figures above, stays of 4-7 days and 8-10 days together account for more than 80% of trips to the Maldives. In Shanghai, stays of 1-3 days account for 52.45%, while stays of 4-7 days and 8-10 days together account for about 41%. The longer stays are an important factor in the high per capita spending in the Maldives.
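
The stay-length shares come from grouping the number of days into ranges. A minimal sketch of that binning with `pandas.cut` is shown here. The sample values are invented, the bucket above 10 days is my own addition, and how the author actually built the table behind the charts is not shown in the article.

```python
import pandas as pd

# Hypothetical sample: trip lengths in days for guides about one city.
days = pd.Series([2, 3, 1, 5, 6, 9, 12, 3], name="Days")

# Bucket edges are right-inclusive: (0, 3], (3, 7], (7, 10], (10, inf).
bins = [0, 3, 7, 10, float("inf")]
labels = ["1-3 days", "4-7 days", "8-10 days", "more than 10 days"]
buckets = pd.cut(days, bins=bins, labels=labels)

# Count guides per bucket and convert the counts to percentages.
stay = (
    buckets.value_counts(sort=False)
    .rename_axis("Days")
    .reset_index(name="num")
)
stay["percent"] = (stay["num"] / stay["num"].sum() * 100).round(2)
print(stay)
```

The result has a `Days` column and a `num` column, which is the shape the pie chart code at the end of this article reads from `base_data_city_day_sh`.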

Play strategy

We can see that food, shopping plus food, short weekend trips, seaside islands and self-driving are the most popular ways to travel. Exploration, touring, cycling and so on are also popular with many people. Which do you like?
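
A ranking like this can be produced by counting the tags in the play method field. The sketch below assumes each guide stores its tags as one space-separated string. That separator, the sample tags and the variable names are all assumptions, since the article does not show the raw field.

```python
from collections import Counter

import pandas as pd

# Hypothetical sample of the play method field; None marks a guide with no tags.
play = pd.Series([
    "food shopping",
    "food short-weekend",
    "seaside-island self-driving",
    "food",
    None,
])

# Split each entry into tags and tally them across all guides.
counter = Counter()
for tags in play.dropna():
    counter.update(tags.split())

top_tags = counter.most_common(3)
print(top_tags)
```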

Must-visit attractions

Wherever you travel, some scenic spots are a must. In an unfamiliar city, how can you quickly decide which spots to check in at? The editor picked Shanghai, Chengdu and Wuhan to see whether you have missed any must-visit spots.
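
One way to find the most mentioned attractions is to split each itinerary into stops and count them. This sketch assumes the `itinerary` field lists stops separated by `>`. That format, the sample rows and the helper `attractions_for` are hypothetical. The resulting list could play the role of the `result` variable that the word cloud code at the end of this article expects.

```python
import pandas as pd

# Hypothetical sample rows with '>'-separated stops.
guides = pd.DataFrame({
    "destination": ["Wuhan", "Wuhan", "Shanghai"],
    "itinerary": [
        "Yellow Crane Tower>Hubu Alley>Yangtze River Bridge",
        "Hubu Alley>East Lake",
        "The Bund>Yu Garden",
    ],
})

def attractions_for(df, city, sep=">"):
    """Collect every stop mentioned in the itineraries of one destination."""
    stops = []
    routes = df.loc[df["destination"] == city, "itinerary"].dropna()
    for route in routes:
        stops.extend(part.strip() for part in route.split(sep) if part.strip())
    return stops

result = attractions_for(guides, "Wuhan")
# Most frequently mentioned stops first.
top_spots = pd.Series(result).value_counts()
print(top_spots)
```

Names that contain spaces would be broken into separate words if the list is later joined with spaces for a word cloud, so replacing the inner spaces first may be necessary.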

 

 

 

 

Best route

With so many spots we want to visit, we need the best possible route. The editor has picked out the routes that netizens liked the most. Are you satisfied with them? Let's take a look.
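
If "liked the most" is read as the guide with the highest number of likes for each destination, the selection could look like the sketch below. That reading, the sample rows and the placeholder stops are assumptions.

```python
import pandas as pd

# Hypothetical sample rows; the itinerary values are placeholders.
guides = pd.DataFrame({
    "destination": ["Wuhan", "Wuhan", "Chengdu"],
    "number of likes": [12, 40, 7],
    "itinerary": ["A>B", "B>C>D", "E>F"],
})

# Row label of the most-liked guide within each destination.
best_idx = guides.groupby("destination")["number of likes"].idxmax()

# Look up those rows and index the result by destination.
best_routes = (
    guides.loc[best_idx, ["destination", "number of likes", "itinerary"]]
    .set_index("destination")
)
print(best_routes)
```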

 

 

Conclusion

 

So far, the editor has walked you through the average spending at popular destinations, the popular ways to travel, the must-visit attractions and the best routes. If you have any questions, please leave a message in the comment area. Because of space limits, only part of the core code is attached below.

Core code display

# Proportion of stay days of Shanghai tourists (ring chart)
# base_data_city_day_sh is assumed to be a DataFrame with 'Days' and 'num' columns
from pyecharts import options as opts
from pyecharts.charts import Pie

c = (
    Pie()
    .add(
        "",
        [list(z) for z in zip(list(base_data_city_day_sh['Days']), list(base_data_city_day_sh['num']))],
        radius=["40%", "55%"],
        label_opts=opts.LabelOpts(
            position="outside",
            formatter="{a|{a}}{abg|}\n{hr|}\n {b|{b}: }{c}  {per|{d}%}  ",
            background_color="#eee",
            border_color="#aaa",
            border_width=1,
            border_radius=4,
            rich={
                "a": {"color": "#999", "lineHeight": 22, "align": "center"},
                "abg": {
                    "backgroundColor": "#e3e3e3",
                    "width": "100%",
                    "align": "right",
                    "height": 22,
                    "borderRadius": [4, 4, 0, 0],
                },
                "hr": {
                    "borderColor": "#aaa",
                    "width": "100%",
                    "borderWidth": 0.5,
                    "height": 0,
                },
                "b": {"fontSize": 16, "lineHeight": 33},
                "per": {
                    "color": "#eee",
                    "backgroundColor": "#334455",
                    "padding": [2, 4],
                    "borderRadius": 2,
                },
            },
        ),
    )
    .set_global_opts(title_opts=opts.TitleOpts(title="Proportion of stay time of Shanghai tourists"))
    .render("Proportion of stay time of Shanghai tourists.html")
)

 

Word cloud

import stylecloud
from IPython.display import Image  # Used to display local pictures in JupyterLab
# result is assumed to be a list of attraction names collected beforehand
result_gap = ' '.join(result)
# Draw word cloud
stylecloud.gen_stylecloud(text=result_gap, 
                          max_words=1000,
                          collocations=False,
                          font_path=r'msyh.ttf',
                          icon_name='fas fa-plane-departure',
                          size=624,
                          output_name='Cloud chart of punch words.png')

Image(filename='Cloud chart of punch words.png') 
