Today I'd like to share a few pieces of code I use regularly in work and life. They cover the most basic functions and operations, most of which come up frequently, and many can be dropped into your own projects directly or with only minor changes.
Date generation
We often need to generate dates in batches. There are many ways to do it; here are two snippets.
Get the dates of the past N days
import datetime

def get_nday_list(n):
    before_n_days = []
    # Iterate from n days ago up to yesterday
    for i in range(1, n + 1)[::-1]:
        before_n_days.append(str(datetime.date.today() - datetime.timedelta(days=i)))
    return before_n_days

a = get_nday_list(30)
print(a)
Output:
['2021-12-23', '2021-12-24', '2021-12-25', '2021-12-26', '2021-12-27', '2021-12-28', '2021-12-29', '2021-12-30', '2021-12-31', '2022-01-01', '2022-01-02', '2022-01-03', '2022-01-04', '2022-01-05', '2022-01-06', '2022-01-07', '2022-01-08', '2022-01-09', '2022-01-10', '2022-01-11', '2022-01-12', '2022-01-13', '2022-01-14', '2022-01-15', '2022-01-16', '2022-01-17', '2022-01-18', '2022-01-19', '2022-01-20', '2022-01-21']
Generate all dates within a period of time
import datetime

def create_assist_date(datestart=None, dateend=None):
    # Build a date helper list
    if datestart is None:
        datestart = '2016-01-01'
    if dateend is None:
        dateend = datetime.datetime.now().strftime('%Y-%m-%d')
    # Convert the strings to date objects
    datestart = datetime.datetime.strptime(datestart, '%Y-%m-%d')
    dateend = datetime.datetime.strptime(dateend, '%Y-%m-%d')
    date_list = []
    date_list.append(datestart.strftime('%Y-%m-%d'))
    while datestart < dateend:
        # Advance the date by one day
        datestart += datetime.timedelta(days=1)
        # Convert the date back to a string and store it
        date_list.append(datestart.strftime('%Y-%m-%d'))
    return date_list

d_list = create_assist_date(datestart='2021-12-27', dateend='2021-12-30')
print(d_list)
Output:
['2021-12-27', '2021-12-28', '2021-12-29', '2021-12-30']
Save data to CSV
Saving data to CSV is a very common operation. Here is a pattern I personally prefer:
import os

def save_data(data, date):
    filename = "2021_data_%s.csv" % date
    # Write the header row only the first time the file is created
    if not os.path.exists(filename):
        with open(filename, "a+", encoding='utf-8') as f:
            f.write("title,degree of heat,time,url\n")
    with open(filename, "a+", encoding='utf-8') as f:
        for i in data:
            row = '{},{},{},{}'.format(i["title"], i["extra"], i['time'], i["url"])
            f.write(row)
            f.write('\n')
Pyecharts with background color
Pyecharts, an excellent Python implementation of ECharts, is favored by many developers. Drawing on a pleasant background also adds a lot of polish to our charts.
Taking the pie chart as an example, the background color is changed by injecting a snippet of JavaScript:
from pyecharts import options as opts
from pyecharts.charts import Pie
from pyecharts.commons.utils import JsCode

def pie_rosetype(data) -> Pie:
    # A JavaScript linear-gradient expression, evaluated by ECharts
    background_color_js = (
        "new echarts.graphic.LinearGradient(0, 0, 0, 1, "
        "[{offset: 0, color: '#c86589'}, {offset: 1, color: '#06a7ff'}], false)"
    )
    c = (
        Pie(init_opts=opts.InitOpts(bg_color=JsCode(background_color_js)))
        .add(
            "",
            data,
            radius=["30%", "75%"],
            center=["45%", "50%"],
            rosetype="radius",
            label_opts=opts.LabelOpts(formatter="{b}: {c}"),
        )
        .set_global_opts(title_opts=opts.TitleOpts(title=""))
    )
    return c
requests library call
By many accounts, requests is among the most widely used third-party libraries in the Python ecosystem, which speaks to its standing!
Send GET request
import requests

url = "https://test.test"  # replace with the target URL
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
    'cookie': 'some_cookie'
}
response = requests.request("GET", url, headers=headers)
Send POST request
import requests

url = "https://test.test"  # replace with the target URL
payload = {}
files = []
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
    'cookie': 'some_cookie'
}
response = requests.request("POST", url, headers=headers, data=payload, files=files)
Loop requests based on some condition, such as the dates generated earlier:
def get_data(mydate):
    date_list = create_assist_date(mydate)
    url = "https://test.test"
    files = []
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
        'cookie': ''
    }
    for d in date_list:
        payload = {'p': '10', 'day': d, 'nodeid': '1', 't': 'itemsbydate', 'c': 'node'}
        # Page through the results for each day until a page comes back empty
        for i in range(1, 100):
            payload['p'] = str(i)
            print("get data of %s in page %s" % (d, str(i)))
            response = requests.request("POST", url, headers=headers, data=payload, files=files)
            items = response.json()['data']['items']
            if items:
                save_data(items, d)
            else:
                break
Python operates various databases
Operating Redis
Connect to Redis
import redis

def redis_conn_pool():
    # A connection pool avoids re-opening a connection for every command
    pool = redis.ConnectionPool(host='localhost', port=6379, decode_responses=True)
    rd = redis.Redis(connection_pool=pool)
    return rd
Write to Redis
from redis_conn import redis_conn_pool

rd = redis_conn_pool()
rd.set('test_data', 'mytest')
Operating MongoDB
Connect to MongoDB
from pymongo import MongoClient

conn = MongoClient("mongodb://%s:%s@ipaddress:49974/mydb" % ('username', 'password'))
db = conn.mydb
mongo_collection = db.mydata
Batch insert data
res = requests.get(url, params=query).json()
commentList = res['data']['commentList']
# insert_many writes the whole list of documents in one round trip
mongo_collection.insert_many(commentList)
Operating MySQL
Connect to MySQL
import MySQLdb

# Open a database connection
db = MySQLdb.connect("localhost", "testuser", "test123", "TESTDB", charset='utf8')
# Get an operation cursor with the cursor() method
cursor = db.cursor()
Execute SQL statement
# Execute a SQL statement with the execute() method
cursor.execute("SELECT VERSION()")
# Fetch a single row with the fetchone() method
data = cursor.fetchone()
print("Database version : %s " % data)
# Close the database connection
db.close()
Output:
Database version : 5.0.45
Local file collation
Organizing files covers many needs; what is shared here is merging multiple local CSV files into a single file.
import os
import pandas as pd

df_list = []
for i in os.listdir():
    if i.endswith(".csv"):
        # The date is encoded in the file name, e.g. 2021_data_2021-12-27.csv
        day = i.split('.')[0].split('_')[-1]
        df = pd.read_csv(i)
        df['day'] = day
        df_list.append(df)
df = pd.concat(df_list, axis=0)
df.to_csv("total.txt", index=False)
Multithreaded code
There are also many ways to implement multithreading; pick the one you are most familiar with.
import threading
import time

exitFlag = 0

class myThread(threading.Thread):
    def __init__(self, threadID, name, delay):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.delay = delay

    def run(self):
        print("Start thread:" + self.name)
        print_time(self.name, self.delay, 5)
        print("Exit thread:" + self.name)

def print_time(threadName, delay, counter):
    while counter:
        if exitFlag:
            break  # stop when the exit flag is set
        time.sleep(delay)
        print("%s: %s" % (threadName, time.ctime(time.time())))
        counter -= 1

# Create new threads
thread1 = myThread(1, "Thread-1", 1)
thread2 = myThread(2, "Thread-2", 2)

# Start the threads and wait for them to finish
thread1.start()
thread2.start()
thread1.join()
thread2.join()
print("Exit main thread")
Asynchronous programming code
Asynchronously crawl a website
import asyncio
import aiohttp
import aiofiles

async def get_html(session, url):
    try:
        async with session.get(url=url, timeout=8) as resp:
            if not resp.status // 100 == 2:
                print(resp.status)
                print("Crawling", url, "raised an error")
            else:
                text = await resp.text(encoding='utf-8')
                return text
    except Exception as e:
        print("An error occurred:", e)
        # Retry the request on failure
        return await get_html(session, url)
Once the requests are asynchronous, the file saving needs to be asynchronous as well — async in one place means async everywhere.
async def download(title_list, content_list):
    async with aiofiles.open('{}.txt'.format(title_list[0]), 'a', encoding='utf-8') as f:
        await f.write('{}'.format(str(content_list)))