Write a Python script to delete years of QQ space posts

Posted by jiayanhuang on Sun, 27 Feb 2022 04:28:23 +0100

Honestly, when I wrote this script I didn't even know the basic syntax of Python. All I can say is that the language doesn't matter much, as long as it is easy to use...

As for why I wanted to delete my QQ space posts (the "dynamics"): I just didn't want to keep seeing the childish things I posted years ago resurfacing as "on this day" reminders. There are a lot of posts (even though I haven't used QQ in recent years), and deleting them by hand is too tiring, so I wrote a script to do it automatically.

To delete all posts, I need to do two things:

  1. Get the information for all posts

  2. Get the delete URL, iterate over all the posts obtained, and delete them one by one.

In addition, there is one big prerequisite: you need the session information first. This is simple. After all, it's your own QQ account. Log in directly, and everything you need is available in the browser.

Of course, it's easy to say, and doing it... is also very simple.

I originally wanted to do it with a shell script. Then I fetched the post data and found that it was JSON. After thinking about it, I decided Python would be easier to use (after first rushing through two hours of Python syntax).

Here we begin:

  1. Log in to QQ space and open the Talk (shuoshuo) page. First, find the URL that returns the list of posts. I'll give my analysis result directly: there are page numbers below the post list, and clicking one sends a request to a URL like this example:
https://h5.qzone.qq.com/proxy/domain/taotao.qq.com/cgi-bin/emotion_cgi_msglist_v6?uin=763795151&inCharset=utf-8&outCharset=utf-8&hostUin=763795151&notice=0&sort=0&pos=840&num=20&cgi_host=http%3A%2F%2Ftaotao.qq.com%2Fcgi-bin%2Femotion_cgi_msglist_v6&code_version=1&format=jsonp&need_private_comment=1&g_tk=635656033&qzonetoken=8f1208fd31ba56f14fb6f71f498501c179fde08e0a6866589e12672dc8fe8c2af00b924c63f35c

It looks roughly like the above. Two parameters in it matter: pos, the index of the first post to fetch, and num, the number of posts to fetch. Adjust these two values as needed.
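To make the paging concrete, here is a minimal sketch of building that URL for each page. The helper names build_list_url and page_offsets are made up for illustration, the parameter names are copied from the example URL above, and g_tk and qzonetoken are assumed to be values you copy from your own browser session.

```python
from urllib.parse import urlencode

# Endpoint copied from the example URL above.
BASE_URL = ("https://h5.qzone.qq.com/proxy/domain/taotao.qq.com"
            "/cgi-bin/emotion_cgi_msglist_v6")


def build_list_url(uin, pos, num, g_tk, qzonetoken):
    """Build the list URL for one page (hypothetical helper).

    pos is the index of the first post, num is how many posts to fetch.
    """
    params = {
        "uin": uin,
        "inCharset": "utf-8",
        "outCharset": "utf-8",
        "hostUin": uin,
        "notice": 0,
        "sort": 0,
        "pos": pos,
        "num": num,
        "cgi_host": "http://taotao.qq.com/cgi-bin/emotion_cgi_msglist_v6",
        "code_version": 1,
        "format": "jsonp",
        "need_private_comment": 1,
        "g_tk": g_tk,              # copy from your own session
        "qzonetoken": qzonetoken,  # copy from your own session
    }
    return BASE_URL + "?" + urlencode(params)


def page_offsets(total, num):
    """Return the pos value of each page needed to cover `total` posts."""
    return list(range(0, total, num))
```

For example, page_offsets(50, 20) gives the three pos values 0, 20 and 40, and each one can be passed to build_list_url together with num=20.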

Then request this URL in the browser (I use Chrome) and copy out the request headers, cookie and other information. With those, the posts can be fetched for any values of pos and num. The response is a string whose middle part is a JSON string, wrapped in a callback (JSONP).

I have already deleted my posts, so I can't give an example and will just describe the string. The JSON has a field msglist, which is a list and may be empty. When it is not empty, each entry has a field tid, whose value is the request parameter needed to delete that post. Save the tid of every post, then iterate over them and delete.
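As a rough sketch of that parsing step: the wrapper name _Callback is taken from the script further down, while the helper name extract_tids and the sample string are invented for illustration (the real response contains many more fields).

```python
import json
import re

# Matches a JSONP wrapper such as _Callback({...}); and captures the JSON inside.
JSONP_PATTERN = re.compile(r"^\s*\w+\((.*)\);?\s*$", re.S)


def extract_tids(response_text):
    """Return the tid of every post in a JSONP response (hypothetical helper)."""
    match = JSONP_PATTERN.match(response_text)
    if match is None:
        return []
    data = json.loads(match.group(1))
    # msglist may be missing, null, or an empty list.
    msglist = data.get("msglist") or []
    return [msg["tid"] for msg in msglist]


# Made-up sample, only to show the shape described above.
sample = '_Callback({"msglist": [{"tid": "abc123"}, {"tid": "def456"}]});'
print(extract_tids(sample))  # ['abc123', 'def456']
```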

  2. Get the delete URL. This is very simple: delete one post by hand, then pull the delete request out of the browser. It is a POST request; note its URL, request headers, cookie and request parameters, then configure them in the script.

  3. Iterate over all the tids obtained in step 1, set each tid as the parameter of the delete request, and delete them in turn.
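The last step can be sketched like this. The form fields are the ones from the delete request I captured (they appear in the script below), and build_delete_body, delete_all and the send callable are hypothetical names used only for this illustration. Passing the sender in as a function makes it possible to try the loop without touching the network.

```python
def build_delete_body(uin, tid):
    """Form fields for one delete request, modelled on the captured request."""
    return {
        "hostuin": uin,
        "t1_source": "1",
        "code_version": "1",
        "format": "fs",
        # Referrer path copied from the captured request; yours may differ.
        "qzreferrer": "https://user.qzone.qq.com/%s/311" % uin,
        "tid": tid,
    }


def delete_all(tids, uin, send):
    """Call send(body) once per tid and collect the results.

    `send` is a hypothetical callable; in a real run it could wrap
    requests.post(delete_url, data=body, headers=delete_headers).
    """
    return [send(build_delete_body(uin, tid)) for tid in tids]
```

Building a fresh dict for each tid, rather than reusing one, keeps every request body independent of the previous one.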

It's not complicated and there isn't much technical content. I'll post my implementation below. This is my first time using Python, so I don't know whether I'm doing it the right way.

code:

#! /usr/bin/python

import requests
import re
import json

#Total number of posts to delete
total=840
limit=10
#Integer division, so range(pages) gets an int on both Python 2 and 3
pages=total//limit
start_pos=2
tids=[]
#The response is JSONP (JSON wrapped in _Callback(...);), so a regex extracts the JSON part
pattern=re.compile(r'^_Callback\((.*)\);$')
#Cookie used when fetching the tids of all posts
cookie="Change to your own browser cookie information"
#Request header
request_headers={'authority':'h5.qzone.qq.com','method':'GET','path':'/proxy/domain/taotao.qq.com/cgi-bin/emotion_cgi_msglist_v6?uin=763795151&inCharset=utf-8&outCharset=utf-8&hostUin=763795151&notice=0&sort=0&pos=820&num=40&cgi_host=http%3A%2F%2Ftaotao.qq.com%2Fcgi-bin%2Femotion_cgi_msglist_v6&code_version=1&format=jsonp&need_private_comment=1&g_tk=1874324779&qzonetoken=98cd2ff9067f620f90a72cbe07da56012b4593659ebbf1deba0b350e82d53be364cd48d038e7ca','scheme':'https','accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3','accept-language':'zh-CN,zh;q=0.9,en-US;q=0.8,en;q=0.7','cache-control':'max-age=0','cookie':cookie, 'upgrade-insecure-requests':'1'}
def getTids(say_list_url):
    res=requests.get(say_list_url, headers=request_headers)
    text=pattern.findall(res.text)[0]
    resolveTid(text)
    print("tid len: %d" % len(tids))
    
def resolveTid(text):
    jsonObj=json.loads(text)
    #No msglist field: print the raw response to help debugging
    if 'msglist' not in jsonObj:
        print(text)
        return False
    msglist=jsonObj['msglist']
    if msglist is None:
        return False
    for msg in msglist:
        print("tid: %s" % msg['tid'])
        tids.append(msg['tid'])

for page in range(pages):
    num=limit*page
    #Dynamically set pos and num values to get all the information
    say_list_url="https://h5.qzone.qq.com/proxy/domain/taotao.qq.com/cgi-bin/emotion_cgi_msglist_v6?uin=763795151&inCharset=utf-8&outCharset=utf-8&hostUin=763795151&notice=0&sort=0&pos=%d&num=%d" % (start_pos+num, limit) + "&cgi_host=http%3A%2F%2Ftaotao.qq.com%2Fcgi-bin%2Femotion_cgi_msglist_v6&code_version=1&format=jsonp&need_private_comment=1&g_tk=1168850316&qzonetoken=87520ec2a71a9f349d23d7e462ffdef0bc3075786bc42114239de62af56e92ca85ccb2eb098144" 
    getTids(say_list_url)

print("total: %d" % len(tids))
print("start delete...")

#Delete the url and replace it with your own
delete_url="https://user.qzone.qq.com/proxy/domain/taotao.qzone.qq.com/cgi-bin/emotion_cgi_delete_v6?qzonetoken=fc557d5f5542fbb951479b8e8b2f1a0f163db952eb44cef275261b76b758edeb51c34834c16f80&g_tk=597814207"
delete_cookie="Change to your own browser's cookie"
#Change to your own
delete_headers={'authority':'user.qzone.qq.com','method':'POST','path':'/proxy/domain/taotao.qzone.qq.com/cgi-bin/emotion_cgi_delete_v6?qzonetoken=9a40f79cdd9908ccbd41466b3b2276314f155525357ae4fb8a7d90093f6379e5144d31eefeabc6&g_tk=597814207','scheme':'https','accept':'*/*','accept-language':'zh-CN,zh;q=0.9,en-US;q=0.8,en;q=0.7','content-length':'178','content-type':'application/x-www-form-urlencoded;charset=UTF-8','origin':'https://user.qzone.qq.com','referer':'https://user.qzone.qq.com/763795151/311', 'cookie': delete_cookie, 'user-agent':'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Mobile Safari/537.36'}
body={'hostuin':'763795151','t1_source':'1','code_version':'1','format':'fs','qzreferrer':'https://user.qzone.qq.com/763795151/311','tid':''}
for tid in tids:
    body['tid']=tid
    print(body)
    dr=requests.post(delete_url, data=body, headers=delete_headers)
    print(dr)

If the session has been idle too long and expires, log in again and update the values; anyone who works with the web knows this. If a 403 appears, the configuration is wrong: make sure the request headers, cookie and URL were copied correctly.
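A small sketch of how the loop could stop early instead of sending hundreds of failing requests. check_response is a hypothetical helper, not part of the script above, and treating every non-200 status as an error is my assumption; a 200 status alone does not prove that the post was actually deleted.

```python
def check_response(status_code):
    """Raise with a hint when a request did not succeed (hypothetical helper)."""
    if status_code == 403:
        raise RuntimeError("403: re-check the request headers, cookie and URL")
    if status_code != 200:
        # Assumption: any other status is treated as a failure.
        raise RuntimeError("unexpected HTTP status %d" % status_code)
```

In the delete loop it could be called as check_response(dr.status_code) right after each requests.post call.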

This is the first program I wrote in Python. Surprisingly, it wasn't hello, world~

When I ran the script above, it really did clear years of posts from my QQ space in one go. Later, the girl renting the room next door heard about it and wanted me to clear hers too. I tried, but for some reason I couldn't clear everything in one run. In the end I gave up and never looked into the exact cause.

Topics: Python Programmer crawler http