Crawling a whole website [advanced crawler notes]

From crawling one page of data to crawling all of it. Let's first go over the general process of a static web crawler. Data loading method: click through to the second page and you will see that an extra ?start=25 field has been appended to the URL. This part is called the "query string". The query string is transmitted to the server as a ...

Posted by themistral on Sun, 06 Mar 2022 08:20:50 +0100
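
As a rough illustration of the query-string paging described in the teaser above, the sketch below builds one URL per page by stepping the `start` field. The base URL is a made-up placeholder, and the page size of 25 is only inferred from the `?start=25` seen on the second page; the original article may do this differently.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical listing URL; the real site is not named in the teaser.
BASE_URL = "https://example.com/top"
# Assumption: page 2 starts at 25, so each page holds 25 items.
PAGE_SIZE = 25


def page_urls(base_url, total_items, page_size=PAGE_SIZE):
    """Return one URL per page, varying only the `start` query parameter."""
    return [
        f"{base_url}?{urlencode({'start': start})}"
        for start in range(0, total_items, page_size)
    ]


urls = page_urls(BASE_URL, total_items=100)
for url in urls:
    # The server receives the query string as key/value pairs.
    print(url, parse_qs(urlsplit(url).query))
```

Each generated URL can then be handed to whatever page-fetching function the crawler already uses for a single page.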

No audio material for your videos? Just 16 lines of Python code will get you more than you can ever use, with very detailed steps

Preface: As the new generation of young people, we should all know at least a little about making short videos, right? Haha, we are today's self-media creators, after all~ When making videos, don't you always need some funny sounds? Or strange sounds? Music, and so on~ Downloading them one by one is so slow, so let's use Python to batch download them tod ...

Posted by hhstables on Wed, 23 Feb 2022 15:03:01 +0100

Visualization | Analyzing nearly 5,000 tourist attractions in Python to tell you where to go for the holiday

Hello, I'm Ou K. The May Day holiday is coming, and this year's break is a generous five days. How do you plan to spend such a long holiday? Like this? Or like this? In this issue, we briefly analyze the distribution of popular scenic spots and nationwide travel in China through the sales of ti ...

Posted by malam on Sat, 19 Feb 2022 04:14:51 +0100

Crawling dynamic data - simulating a browser (Selenium from introduction to practice)

Contents: I. Preparing the environment for browser simulation 1. Introduction to Selenium 2. Installing Selenium 3. Installing WebDriver (1) Installing chromedriver (2) Adding chromedriver to the environment variable II. Example: using Selenium to drive a browser and fetch QQ Music list data 1. Start the browser 2. Use WebDriver to open qq mus ...

Posted by GremlinP1R on Thu, 17 Feb 2022 21:50:25 +0100

python_crawler 04: the requests library

Contents: 1. Installation and documentation link 2. Sending GET requests: adding headers and query parameters; the difference between response.text and response.content 3. Sending POST requests 4. Using a proxy 5. Cookies 6. Session 7. Handling untrusted SSL certificates. Although the urllib module in Python's standard library already contains most of th ...

Posted by leocon on Tue, 01 Feb 2022 16:40:48 +0100

python_crawler 21: downloader middleware in the Scrapy framework

Contents: Downloader Middleware 1. process_request(self, request, spider) 2. process_response(self, request, response, spider) 3. Random request-header middleware: setting.py, middlewares.py, httpbin.py 4. IP proxy pool middleware: (1) Purchasing proxies (2) Using an IP proxy pool (3) Dedicated proxy pool. Downloader Middleware Downloader ...

Posted by Tjorriemorrie on Tue, 01 Feb 2022 09:16:35 +0100
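
To make the random request-header idea from the teaser above more concrete, here is a minimal sketch of a downloader middleware built around the two hook signatures it lists. The class name and the User-Agent strings are invented for illustration and are not taken from the original article; the demo at the bottom uses a stand-in request object so it runs without Scrapy installed.

```python
import random

# Hypothetical pool of User-Agent strings; replace with real, current values.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0",
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/2.0",
]


class RandomUserAgentMiddleware:
    """Sketch of a downloader middleware that picks a User-Agent per request."""

    def __init__(self, user_agents=USER_AGENTS):
        self.user_agents = list(user_agents)

    def process_request(self, request, spider):
        # Overwrite the header, then return None so that processing of this
        # request continues through the remaining middlewares.
        request.headers["User-Agent"] = random.choice(self.user_agents)
        return None

    def process_response(self, request, response, spider):
        # Pass the response through unchanged.
        return response


class FakeRequest:
    """Stand-in for a Scrapy Request, used only for this demo."""

    def __init__(self):
        self.headers = {}


fake = FakeRequest()
middleware = RandomUserAgentMiddleware()
result = middleware.process_request(fake, spider=None)
print(fake.headers["User-Agent"])
```

In a real project such a class would be enabled through the `DOWNLOADER_MIDDLEWARES` setting, for example `{"myproject.middlewares.RandomUserAgentMiddleware": 543}`, where the module path and the priority are placeholders.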

Python web crawler - data storage

Data storage: two storage methods are mainly introduced. (1) Storing in files, including text files and CSV files. (2) Storing in databases, including the MySQL relational database and the MongoDB database. Storing to txt: title = "First text" # w: create and write; w+: create, read and write # r: read; r+: read and write # a: append; a+: read and append with open(r'C:\Users ...

Posted by WickedStylis on Sat, 29 Jan 2022 06:36:12 +0100
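
The file modes listed in the teaser above can be tried out with a short sketch like the one below. It writes to a temporary directory instead of the truncated Windows path from the original, and the helper names and the sample CSV row are invented for illustration.

```python
import csv
import tempfile
from pathlib import Path


def save_text(path, text, mode="w"):
    """Write text to path; 'w' creates or truncates, 'a' appends."""
    with open(path, mode, encoding="utf-8") as f:
        f.write(text)


def save_rows_csv(path, header, rows):
    """Write a header and data rows to a CSV file."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        txt_path = Path(tmp) / "title.txt"
        save_text(txt_path, "First text\n")             # 'w': create and write
        save_text(txt_path, "Second text\n", mode="a")  # 'a': append
        text = txt_path.read_text(encoding="utf-8")     # read back

        csv_path = Path(tmp) / "rows.csv"
        save_rows_csv(csv_path, ["title", "score"], [["First text", 9.1]])
        with open(csv_path, newline="", encoding="utf-8") as f:
            rows = list(csv.reader(f))
    return text, rows


text, rows = demo()
print(text)
print(rows)
```

Note that values read back through `csv.reader` are always strings, so numeric columns need converting after loading.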

Automating Ctrip Travel login with Selenium

Automating Ctrip's login is still a little troublesome. Let's look at the official website first: Needless to say, you first have to locate the element inside the red box, then call click() to jump to the following page: Here, first locate the elements where the user name and password are entered, and then send_keys() can ...

Posted by kr9091 on Wed, 26 Jan 2022 12:10:16 +0100

Crawling Baidu Translate (translates between Chinese and English)

I have an introductory Python course next semester, so I've been teaching myself over the winter vacation. After all, I can't afford to fail it, and it's an easy credit anyway. On a whim, I recently decided to try crawling Baidu Translate, and after a day of grinding I finally got it working. Enough talk, let's get started (the environmen ...

Posted by Dark_AngeL on Tue, 25 Jan 2022 01:41:41 +0100

[Python from zero to one] 10. A ten-thousand-word guide to crawling online encyclopedia knowledge with Selenium (an essential skill for NLP corpus construction)

Welcome to "Python from zero to one", where I will share about 200 articles in the Python series, learning and playing along with you and exploring the interesting world of Python. All articles are explained with cases, code and the author's experience. I really want to share my nearly ten years of programming experience with you. I ho ...

Posted by qaokpl on Mon, 24 Jan 2022 09:25:56 +0100