Sharing some time-tested Python snippets you can use directly!

Posted by Drace on Wed, 26 Jan 2022 04:27:04 +0100

Today I'm sharing a few code snippets that I use often at work and elsewhere. They cover basic, frequently needed operations, and most of them can be dropped into your own projects as-is or with minor changes.

Date generation

We often need to generate dates in batches. There are many ways to do this; here are two snippets.

Get the dates of the past N days

import datetime

def get_nday_list(n):
    before_n_days = []
    for i in range(1, n + 1)[::-1]:
        before_n_days.append(str(datetime.date.today() - datetime.timedelta(days=i)))
    return before_n_days

a = get_nday_list(30)
print(a)

Output:

['2021-12-23', '2021-12-24', '2021-12-25', '2021-12-26', '2021-12-27', '2021-12-28', '2021-12-29', '2021-12-30', '2021-12-31', '2022-01-01', '2022-01-02', '2022-01-03', '2022-01-04', '2022-01-05', '2022-01-06', '2022-01-07', '2022-01-08', '2022-01-09', '2022-01-10', '2022-01-11', '2022-01-12', '2022-01-13', '2022-01-14', '2022-01-15', '2022-01-16', '2022-01-17', '2022-01-18', '2022-01-19', '2022-01-20', '2022-01-21']

Generate all dates within a period

import datetime

def create_assist_date(datestart=None, dateend=None):
    # Build an auxiliary list of dates from datestart to dateend (inclusive)

    if datestart is None:
        datestart = '2016-01-01'
    if dateend is None:
        dateend = datetime.datetime.now().strftime('%Y-%m-%d')

    # Parse the strings into datetime objects
    datestart = datetime.datetime.strptime(datestart, '%Y-%m-%d')
    dateend = datetime.datetime.strptime(dateend, '%Y-%m-%d')
    date_list = []
    date_list.append(datestart.strftime('%Y-%m-%d'))
    while datestart < dateend:
        # Advance by one day
        datestart += datetime.timedelta(days=1)
        # Store the date as a string in the list
        date_list.append(datestart.strftime('%Y-%m-%d'))
    return date_list

d_list = create_assist_date(datestart='2021-12-27', dateend='2021-12-30')
print(d_list)

Output:

['2021-12-27', '2021-12-28', '2021-12-29', '2021-12-30']
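
The same range can also be built from datetime.date objects with a list comprehension. This is only a minimal sketch, not the method used above: the name date_range_list is hypothetical, and it assumes both arguments are 'YYYY-MM-DD' strings. Unlike create_assist_date, it returns an empty list when the end date is earlier than the start date.

```python
import datetime

def date_range_list(start, end):
    # Hypothetical helper: inclusive list of 'YYYY-MM-DD' strings from start to end
    first = datetime.date.fromisoformat(start)
    last = datetime.date.fromisoformat(end)
    n_days = (last - first).days
    # range() is empty when end is earlier than start, so the result is []
    return [str(first + datetime.timedelta(days=i)) for i in range(n_days + 1)]

print(date_range_list('2021-12-27', '2021-12-30'))
```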

Save data to CSV

Saving data to CSV is a very common operation. Here is the approach I personally prefer.

import csv
import os

def save_data(data, date):
    filename = "2021_data_%s.csv" % date
    # Write the header only when the file is being created for the first time
    write_header = not os.path.exists(filename)
    # newline='' lets the csv module control the line endings
    with open(filename, "a", encoding='utf-8', newline='') as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(["title", "degree of heat", "time", "url"])
        for i in data:
            # csv.writer quotes fields that contain commas or quotes
            writer.writerow([i["title"], i["extra"], i["time"], i["url"]])

Pyecharts with background color

Pyecharts, an excellent Python implementation of Echarts, is popular with many developers. When drawing with pyecharts, a pleasant background can make a chart considerably more attractive.

Taking a pie chart as an example, we change the background color by passing in JavaScript code.

from pyecharts import options as opts
from pyecharts.charts import Pie
from pyecharts.commons.utils import JsCode

def pie_rosetype(data) -> Pie:
    # JavaScript expression for a vertical gradient from '#c86589' (top) to '#06a7ff' (bottom)
    background_color_js = (
        "new echarts.graphic.LinearGradient(0, 0, 0, 1, "
        "[{offset: 0, color: '#c86589'}, {offset: 1, color: '#06a7ff'}], false)"
    )
    c = (
        Pie(init_opts=opts.InitOpts(bg_color=JsCode(background_color_js)))
        .add(
            "",
            data,
            radius=["30%", "75%"],
            center=["45%", "50%"],
            rosetype="radius",
            label_opts=opts.LabelOpts(formatter="{b}: {c}"),
        )
        .set_global_opts(title_opts=opts.TitleOpts(title=""))
    )
    return c
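
To try other color pairs, the gradient expression can be produced by a small function. This is a sketch under assumptions: the name linear_gradient_js is hypothetical, and it only reproduces the string format used in the pie chart code above.

```python
def linear_gradient_js(top_color, bottom_color):
    # Hypothetical helper: build the same vertical-gradient JavaScript
    # expression as above, with the two colors as parameters
    return (
        "new echarts.graphic.LinearGradient(0, 0, 0, 1, "
        "[{offset: 0, color: '%s'}, {offset: 1, color: '%s'}], false)"
        % (top_color, bottom_color)
    )

print(linear_gradient_js('#c86589', '#06a7ff'))
```

The returned string can be wrapped in JsCode(...) and passed as bg_color in the same way as background_color_js.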

Calling the requests library

The requests library is reportedly one of the most widely used third-party libraries in the Python ecosystem, which shows how highly the community regards it!

Send a GET request

import requests

# Placeholder URL; replace it with the real endpoint
url = "https://test.test"
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
    'cookie': 'some_cookie'
}
response = requests.request("GET", url, headers=headers)

Send a POST request

import requests

# Placeholder URL; replace it with the real endpoint
url = "https://test.test"
payload = {}
files = []
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
    'cookie': 'some_cookie'
}
response = requests.request("POST", url, headers=headers, data=payload, files=files)

Send requests in a loop driven by some condition, such as the generated dates

def get_data(mydate):
    date_list = create_assist_date(mydate)
    url = "https://test.test"
    files=[]
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
        'cookie': ''
        }
    for d in date_list:
        payload={'p': '10',
        'day': d,
        'nodeid': '1',
        't': 'itemsbydate',
        'c': 'node'}
        for i in range(1, 100):
            payload['p'] = str(i)
            print("get data of %s in page %s" % (d, str(i)))
            response = requests.request("POST", url, headers=headers, data=payload, files=files)
            items = response.json()['data']['items']
            if items:
                save_data(items, d)
            else:
                break
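
The "request pages until one comes back empty" pattern in the inner loop can be separated from the HTTP details. The following is a rough sketch, not part of the original code: iter_pages and fetch_page are hypothetical names, and fake_data stands in for a real server so that the example runs without a network.

```python
def iter_pages(fetch_page, max_pages=99):
    # fetch_page is a hypothetical callable: page number -> list of items
    for page in range(1, max_pages + 1):
        items = fetch_page(page)
        if not items:
            # An empty page means there is no more data
            break
        yield page, items

# Stand-in for a real request: three pages of data, then nothing
fake_data = {1: ['a', 'b'], 2: ['c'], 3: ['d']}
pages = list(iter_pages(lambda p: fake_data.get(p, [])))
print(pages)
```

In get_data, fetch_page would wrap the POST request for one date and return response.json()['data']['items'].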

Working with various databases in Python

Working with Redis

Connect to Redis

import redis


def redis_conn_pool():
    pool = redis.ConnectionPool(host='localhost', port=6379, decode_responses=True)
    rd = redis.Redis(connection_pool=pool)
    return rd

Write to Redis

# Assumes the connection code above is saved as redis_conn.py
from redis_conn import redis_conn_pool


rd = redis_conn_pool()
rd.set('test_data', 'mytest')

Working with MongoDB

Connect to MongoDB

from pymongo import MongoClient


conn = MongoClient("mongodb://%s:%s@ipaddress:49974/mydb" % ('username', 'password'))
db = conn.mydb
mongo_collection = db.mydata

Batch insert data

import requests

# url and query are placeholders for the API endpoint and its query parameters
res = requests.get(url, params=query).json()
commentList = res['data']['commentList']
mongo_collection.insert_many(commentList)

Working with MySQL

Connect to MySQL

import MySQLdb

#Open database connection
db = MySQLdb.connect("localhost", "testuser", "test123", "TESTDB", charset='utf8' )

#Use the cursor() method to get the operation cursor
cursor = db.cursor()

Execute SQL statement

#Execute an SQL statement using the execute() method
cursor.execute("SELECT VERSION()")

#Fetch a single row using the fetchone() method
data = cursor.fetchone()

print("Database version : %s " % data)

#Close database connection
db.close()

Output:

Database version : 5.0.45

Merging local files

Organizing files covers many different needs. What I share here merges multiple local CSV files into a single file.

import pandas as pd
import os


df_list = []
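for i in os.listdir():
    # Match on the extension so names that merely contain "csv" are skipped
    if i.endswith(".csv"):
        # The day is the part of the file name after the last underscore
        day = i.split('.')[0].split('_')[-1]
        df = pd.read_csv(i)
        df['day'] = day
        df_list.append(df)
df = pd.concat(df_list, axis=0)
df.to_csv("total.txt", index=False)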
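
When pandas is not available, a similar merge can be done with the standard library csv module. This is a sketch under assumptions rather than a drop-in replacement: merge_csv_files is a hypothetical name, and it assumes every file has the same header row and a name ending in _<day>.csv.

```python
import csv
import glob
import os

def merge_csv_files(pattern, out_path):
    # Hypothetical helper: merge all CSV files matching pattern into out_path,
    # adding a 'day' column taken from the file name (text after the last '_')
    out_abs = os.path.abspath(out_path)
    # List the inputs first and never read the output file itself
    paths = [p for p in sorted(glob.glob(pattern)) if os.path.abspath(p) != out_abs]
    header_written = False
    with open(out_path, 'w', encoding='utf-8', newline='') as out:
        writer = csv.writer(out)
        for path in paths:
            day = os.path.basename(path).split('.')[0].split('_')[-1]
            with open(path, encoding='utf-8', newline='') as f:
                reader = csv.reader(f)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty files
                if not header_written:
                    writer.writerow(header + ['day'])
                    header_written = True
                for row in reader:
                    writer.writerow(row + [day])
```

A call such as merge_csv_files('*.csv', 'total.txt') would then mirror the pandas version above.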

Multithreaded code

There are many ways to implement multithreading; choose the one you are most familiar with.

import threading
import time

exitFlag = 0

class myThread(threading.Thread):
    def __init__(self, threadID, name, delay):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.delay = delay
    def run(self):
        print("Start thread:" + self.name)
        print_time(self.name, self.delay, 5)
        print("Exit thread:" + self.name)

def print_time(threadName, delay, counter):
    while counter:
        if exitFlag:
            # threadName is a plain string, so stop by returning from the function
            return
        time.sleep(delay)
        print("%s: %s" % (threadName, time.ctime(time.time())))
        counter -= 1

#Create a new thread
thread1 = myThread(1, "Thread-1", 1)
thread2 = myThread(2, "Thread-2", 2)

#Start a new thread
thread1.start()
thread2.start()
thread1.join()
thread2.join()
print ("Exit main thread")
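
As one of the other ways mentioned above, the standard library's concurrent.futures module manages the threads for you. A minimal sketch, assuming a hypothetical task function work that only sleeps and returns a string:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def work(name, delay):
    # Hypothetical task: sleep briefly, then report which worker finished
    time.sleep(delay)
    return "%s done" % name

with ThreadPoolExecutor(max_workers=2) as pool:
    # map() runs the calls in worker threads and yields results in input order
    results = list(pool.map(work, ["Thread-1", "Thread-2"], [0.02, 0.01]))

print(results)
```

Leaving the with block waits for all submitted tasks, so no explicit join() calls are needed.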

Asynchronous programming code

Crawl a website asynchronously

import asyncio
import aiohttp
import aiofiles

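async def get_html(session, url, retries=3):
    # retries caps the number of attempts so a failing URL cannot recurse forever
    try:
        timeout = aiohttp.ClientTimeout(total=8)
        async with session.get(url=url, timeout=timeout) as resp:
            if not resp.status // 100 == 2:
                print(resp.status)
                print("Crawling", url, "An error occurred")
            else:
                # Decode the response body as UTF-8
                text = await resp.text(encoding='utf-8')
                return text
    except Exception as e:
        print("An error occurred", e)
        if retries > 1:
            # Return the result of the retry so the caller receives the text
            return await get_html(session, url, retries - 1)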

Once the requests are asynchronous, saving the files should be asynchronous too: when one part is asynchronous, everything around it has to be as well.

async def download(title_list, content_list):
    async with aiofiles.open('{}.txt'.format(title_list[0]), 'a',
                             encoding='utf-8') as f:
        await f.write('{}'.format(str(content_list)))
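
Coroutines like these only pay off when several of them run concurrently. The sketch below shows one possible way to do that with asyncio.gather and a semaphore that limits concurrency. It is an illustration under assumptions: fake_fetch is a hypothetical stand-in for get_html so the example runs without a network, and crawl and limit are names introduced here.

```python
import asyncio

async def fake_fetch(url):
    # Hypothetical stand-in for get_html: pretend to download a page
    await asyncio.sleep(0.01)
    return "content of %s" % url

async def crawl(urls, limit=2):
    # The semaphore allows at most `limit` fetches to run at the same time
    sem = asyncio.Semaphore(limit)

    async def bounded(url):
        async with sem:
            return await fake_fetch(url)

    # gather() returns the results in the same order as the input URLs
    return await asyncio.gather(*(bounded(u) for u in urls))

urls = ["https://test.test/1", "https://test.test/2", "https://test.test/3"]
pages = asyncio.run(crawl(urls))
print(len(pages))
```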

Topics: Python