Preface
The text and pictures in this article come from the internet and are for learning and communication only, with no commercial use. Copyright belongs to the original author. If you have any questions, please contact us promptly so that we can address them.
Author: Luo Luopan
Love at first sight is not about love; it is about the face.
Love that grows over time is not about the face; it is about feeling.
Project introduction
This project uses a Python crawler and the Baidu face recognition API to crawl user photos from the Jianshu dating column and score them (the photos will be removed on request if they infringe anyone's rights). The project covers the following:
- Picture crawler
- Face Recognition API Use
- Face score and file categorization
Picture crawler
Users on the major dating sites often post photos of themselves. This article crawls every post in the Jianshu dating column, opens each post's detail page, and downloads all of its pictures to the local disk.
Code
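import os
import time

import requests
from lxml import etree

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36'
}


def get_url(url):
    # Fetch one listing page of the column and visit every post on it.
    res = requests.get(url, headers=headers)
    html = etree.HTML(res.text)
    infos = html.xpath('//ul[@class="note-list"]/li')
    root = 'https://www.jianshu.com'
    for info in infos:
        # Assumption: the listing as published stops after `root`. The
        # relative link is assumed to sit on the first anchor of the item.
        hrefs = info.xpath('div/a/@href')
        if hrefs:
            get_img(root + hrefs[0])
    # Assumption: pause between listing pages (`time` was imported but unused).
    time.sleep(3)


def get_img(url):
    # Download every picture on a post's detail page into row_img/.
    res = requests.get(url, headers=headers)
    html = etree.HTML(res.text)
    title = html.xpath('//div[@class="article"]/h1/text()')[0].strip('|').split(',')[0]
    name = html.xpath('//div[@class="author"]/div/span/a/text()')[0].strip('|')
    infos = html.xpath('//div[@class = "image-package"]')
    i = 1
    for info in infos:
        try:
            img_url = info.xpath('div[1]/div[2]/img/@data-original-src')[0]
            print(img_url)
            data = requests.get('http:' + img_url, headers=headers)
            try:
                with open('row_img/' + title + '+' + name + '+' + str(i) + '.jpg', 'wb') as fp:
                    fp.write(data.content)
            except OSError:
                # The title may contain characters that are invalid in a
                # file name, so fall back to the author name only.
                with open('row_img/' + name + '+' + str(i) + '.jpg', 'wb') as fp:
                    fp.write(data.content)
        except IndexError:
            # No image URL in this block; skip it.
            pass
        i = i + 1


if __name__ == '__main__':
    # The crawler writes into row_img/, so make sure the folder exists.
    os.makedirs('row_img', exist_ok=True)
    urls = ['https://www.jianshu.com/c/bd38bd199ec6?order_by=added_at&page={}'.format(str(i)) for i in range(1, 201)]
    for url in urls:
        get_url(url)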
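The crawler above falls back to a file name without the title when `open` raises `OSError`, which typically happens when the title contains characters the file system rejects. A minimal sketch of an alternative follows, assuming a hypothetical helper `safe_filename` (not part of the original code) that strips such characters before the file is opened. The character set and the length limit are assumptions chosen for illustration.

```python
import re

# Characters that Windows rejects in file names, plus ASCII control characters.
_INVALID_CHARS = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def safe_filename(title, name, index, max_length=80):
    """Build '<title>+<name>+<index>.jpg' with unsafe characters replaced.

    Hypothetical helper for illustration; the original code builds the
    name by plain string concatenation.
    """
    stem = '{}+{}+{}'.format(title, name, index)
    stem = _INVALID_CHARS.sub('_', stem).strip(' .')
    # Truncate by character count: a rough guard, not an exact byte limit.
    return stem[:max_length] + '.jpg'


print(safe_filename('Hello: world?', 'user/1', 3))
# prints: Hello_ world_+user_1+3.jpg
```

With a helper like this, the inner `try`/`except OSError` would become a single `open('row_img/' + safe_filename(title, name, i), 'wb')`, and the title would be preserved in more file names.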
Face Recognition API Use
Because the crawler downloads every picture in each post, the results include many pictures that contain no face, and the goal is also to find the girls with high scores. Screening by hand would take a lot of effort, so Baidu's face recognition API is used here to filter the pictures and score the faces.
Application for Face Recognition
First, go to the Baidu face recognition website, click the button to start using it immediately, and log in with a Baidu account (register one if you do not have one).
Create an application. Once that is done, click Manage Apps to see the AppID and the other credentials, which are needed when calling the API.
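The code in this article leaves `APP_ID`, `API_KEY`, and `SECRET_KEY` as empty strings to be filled in. One possible way to keep the real values out of the script is to read them from environment variables. This is only a sketch: the variable names `BAIDU_APP_ID`, `BAIDU_API_KEY`, and `BAIDU_SECRET_KEY` and the helper `load_credentials` are hypothetical, not something the Baidu SDK defines.

```python
import os


def load_credentials(env=os.environ):
    """Read the three Baidu credentials from a mapping of environment variables.

    The variable names are hypothetical, chosen for this sketch.
    """
    names = ('BAIDU_APP_ID', 'BAIDU_API_KEY', 'BAIDU_SECRET_KEY')
    missing = [n for n in names if not env.get(n)]
    if missing:
        raise RuntimeError('missing environment variables: ' + ', '.join(missing))
    return tuple(env[n] for n in names)


# Usage (commented out so the sketch runs without the variables set):
# APP_ID, API_KEY, SECRET_KEY = load_credentials()
# aipFace = AipFace(APP_ID, API_KEY, SECRET_KEY)
```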
API Calls
First, try the API on a picture of Yang Chao. The result shows a score of 75, which is relatively high (I also tested some internet celebrities and stars: the average score was around 80, and the highest did not exceed 90).
import base64

from aip import AipFace

# Fill in the credentials shown on the Manage Apps page.
APP_ID = ''
API_KEY = ''
SECRET_KEY = ''

aipFace = AipFace(APP_ID, API_KEY, SECRET_KEY)

filePath = r'C:\Users\LP\Desktop\6.jpg'


def get_file_content(filePath):
    # Read the image and return it as a Base64-encoded string.
    with open(filePath, 'rb') as fp:
        content = base64.b64encode(fp.read())
        return content.decode('utf-8')


imageType = "BASE64"

# Request the age, gender, and beauty fields for each detected face.
options = {}
options["face_field"] = "age,gender,beauty"

result = aipFace.detect(get_file_content(filePath), imageType, options)
print(result)
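The `print(result)` call above dumps the whole response dictionary. The sketch below pulls out only the fields used later in this article, assuming the response has the shape that the classification code reads (`result['result']['face_list'][...]['gender']['type']` and `['beauty']`). The helper `summarize_face` is hypothetical, and the sample dictionaries are hand-written for illustration rather than real API output.

```python
def summarize_face(result):
    """Return (gender, beauty) for the last detected face, or None.

    Assumes the field names that this article's code reads from the
    response; returns None when the response carries no face list.
    """
    face_list = (result.get('result') or {}).get('face_list') or []
    if not face_list:
        return None
    face = face_list[-1]
    return face['gender']['type'], face['beauty']


# Hand-written sample responses, not real API output.
sample = {
    'error_code': 0,
    'result': {'face_list': [{'age': 22, 'beauty': 75.0, 'gender': {'type': 'female'}}]},
}
no_face = {'error_code': 222202, 'result': None}

print(summarize_face(sample))   # prints: ('female', 75.0)
print(summarize_face(no_face))  # prints: None
```

Returning `None` for a response without a face list lets the caller skip the picture with a plain `if` instead of catching `KeyError` and `TypeError`.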
Face score and file categorization
Finally, combine the crawled pictures with the face scoring. The code filters out pictures without a person and pictures of men, obtains the score for each remaining picture of a girl (rescaled here to 1-10 points), and moves the pictures into different folders according to score.
import base64
import os
import time

from aip import AipFace

APP_ID = ''
API_KEY = ''
SECRET_KEY = ''

aipFace = AipFace(APP_ID, API_KEY, SECRET_KEY)


def get_file_content(filePath):
    # Read the image and return it as a Base64-encoded string.
    with open(filePath, 'rb') as fp:
        content = base64.b64encode(fp.read())
        return content.decode('utf-8')


imageType = "BASE64"

options = {}
options["face_field"] = "age,gender,beauty"

file_path = 'row_img'
file_lists = os.listdir(file_path)
for file_list in file_lists:
    result = aipFace.detect(get_file_content(os.path.join(file_path, file_list)), imageType, options)
    error_code = result['error_code']
    # 222202 is the error code this script treats as "no face"; skip those pictures.
    if error_code == 222202:
        continue

    try:
        sex_type = result['result']['face_list'][-1]['gender']['type']
        if sex_type == 'male':
            continue
        # print(result)
        beauty = result['result']['face_list'][-1]['beauty']
        # Rescale the 0-100 score to one decimal place out of 10.
        new_beauty = round(beauty / 10, 1)
        print(file_list, new_beauty)
        # The target folders must already exist, or os.rename will fail.
        if new_beauty >= 8:
            os.rename(os.path.join(file_path, file_list), os.path.join('8 branch', str(new_beauty) + '+' + file_list))
        elif new_beauty >= 7:
            os.rename(os.path.join(file_path, file_list), os.path.join('7 branch', str(new_beauty) + '+' + file_list))
        elif new_beauty >= 6:
            os.rename(os.path.join(file_path, file_list), os.path.join('6 branch', str(new_beauty) + '+' + file_list))
        elif new_beauty >= 5:
            os.rename(os.path.join(file_path, file_list), os.path.join('5 branch', str(new_beauty) + '+' + file_list))
        else:
            os.rename(os.path.join(file_path, file_list), os.path.join('Other points', str(new_beauty) + '+' + file_list))
        time.sleep(1)
    except KeyError:
        pass
    except TypeError:
        pass
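The five `os.rename` branches above differ only in the folder name. As a sketch, the mapping from score to folder can be pulled into one small function that is easy to check on its own. `score_folder` is a hypothetical helper, not part of the original code; it reuses the folder names and thresholds from the listing above.

```python
def score_folder(beauty):
    """Map a raw beauty score (0-100) to (scaled score, folder name).

    Uses the same thresholds and folder names as the listing above.
    """
    scaled = round(beauty / 10, 1)
    for threshold in (8, 7, 6, 5):
        if scaled >= threshold:
            return scaled, '{} branch'.format(threshold)
    return scaled, 'Other points'


print(score_folder(75.3))  # prints: (7.5, '7 branch')
print(score_folder(42))    # prints: (4.2, 'Other points')
```

In the main loop this would reduce the branching to a single `os.rename` call using the returned folder. Calling `os.makedirs(folder, exist_ok=True)` before the rename would also remove the need to create the folders by hand.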
In the final result, very few girls score above 8 points, as shown in the figure (the pictures will be removed on request if they infringe anyone's rights).
Discussion
- The Jianshu dating column has only a small number of girls; readers can try the same approach on internet celebrities on Weibo or beauties on Zhihu.
- Although this is an age that judges by looks, liking someone starts with looks, deepens with talent, and stays loyal to character (ending on a positive note, to avoid being blocked).