5000+ Pictures to Find Your Favorite TA: Python Crawler + Face Scoring

Posted by miob on Sat, 28 Mar 2020 10:02:46 +0100

Preface

The text and pictures in this article come from the internet and are for learning and exchange only, with no commercial use. Copyright belongs to the original author. If you have any concerns, please contact us promptly so we can handle them.

Author: Luo Luopan


Love at first sight is not about love, but about the face

It's not a face, it's a feeling

Project introduction

This project uses a Python crawler and the Baidu Face Recognition API to crawl and score user photos from the Jianshu dating column (photos will be removed on request if they infringe). The project covers the following:

Picture crawler
Face Recognition API Use
Face score and file categorization

Picture crawler

Dating sections on the major sites now attract plenty of users who post their photos. This article crawls every post in the Jianshu dating column, then visits each post's detail page to collect all of its pictures and download them locally.
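Both steps of the crawl (list page, then detail page) need to turn links found in the HTML into full URLs. The listing in this article does that by string concatenation (a site root for post links, an 'http:' prefix for image addresses). As a minimal sketch, the standard library's urljoin can handle both cases. The post path and image path below are made-up samples, not real addresses, and note that urljoin keeps the scheme of the base page (https here) rather than forcing http.

```python
from urllib.parse import urljoin

# First list page of the column, taken from the URL pattern used in this article.
LIST_PAGE = 'https://www.jianshu.com/c/bd38bd199ec6?order_by=added_at&page=1'

def absolute_url(base, link):
    """Resolve a link found in a page against the URL of that page."""
    return urljoin(base, link)

# Hypothetical sample values: a site-relative post link and a
# protocol-relative image address (starting with '//').
post_url = absolute_url(LIST_PAGE, '/p/0123abcd')
img_url = absolute_url(post_url, '//upload-images.jianshu.io/upload_images/1.jpg')

print(post_url)
print(img_url)
```

The benefit over concatenation is that the same helper works whether the page hands back a relative path, a protocol-relative address, or an already absolute URL.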

Code

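import os
import time

import requests
from lxml import etree

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36'
}

def get_url(url):
    # Fetch one list page of the column and visit every post on it.
    res = requests.get(url, headers=headers)
    html = etree.HTML(res.text)
    infos = html.xpath('//ul[@class="note-list"]/li')
    root = 'https://www.jianshu.com'
    for info in infos:
        # The original listing is cut off at this point. Assumption: the post
        # link is the first <a href> inside the list item; adjust the XPath
        # to the real page markup if needed.
        hrefs = info.xpath('.//a/@href')
        if hrefs:
            get_img(root + hrefs[0])
    # Assumption: pause between list pages ("time" is imported but unused
    # in the original listing).
    time.sleep(3)

def get_img(url):
    # Download every picture on one post's detail page into row_img/.
    res = requests.get(url, headers=headers)
    html = etree.HTML(res.text)
    title = html.xpath('//div[@class="article"]/h1/text()')[0].strip('|').split('，')[0]
    name = html.xpath('//div[@class="author"]/div/span/a/text()')[0].strip('|')
    infos = html.xpath('//div[@class = "image-package"]')
    i = 1
    for info in infos:
        try:
            img_url = info.xpath('div[1]/div[2]/img/@data-original-src')[0]
            print(img_url)
            data = requests.get('http:' + img_url, headers=headers)
            try:
                with open('row_img/' + title + '+' + name + '+' + str(i) + '.jpg', 'wb') as fp:
                    fp.write(data.content)
            except OSError:
                # The title may contain characters that are invalid in file
                # names, so fall back to the author name only.
                with open('row_img/' + name + '+' + str(i) + '.jpg', 'wb') as fp:
                    fp.write(data.content)
        except IndexError:
            # This image block has no original-size address: skip it.
            pass
        i = i + 1

if __name__ == '__main__':
    # The output folder must exist before files are written into it.
    os.makedirs('row_img', exist_ok=True)
    urls = ['https://www.jianshu.com/c/bd38bd199ec6?order_by=added_at&page={}'.format(str(i)) for i in range(1, 201)]
    for url in urls:
        get_url(url)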
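The listing falls back to a shorter file name when open() raises OSError, which usually happens because the post title contains characters the file system rejects. One alternative, sketched here under my own assumptions, is to clean the name before opening the file. The helper name safe_filename, the '_' replacement character and the 100-character limit are all hypothetical choices, not part of the original project.

```python
import re

# Characters that Windows rejects in file names, plus control characters.
_INVALID = re.compile(r'[\\/:*?"<>|\x00-\x1f]')

def safe_filename(title, name, index, ext='.jpg', max_len=100):
    """Build 'title+name+index.jpg' with unsafe characters replaced by '_'."""
    stem = '{}+{}+{}'.format(title, name, index)
    # Replace unsafe characters, then trim spaces and dots from both ends.
    stem = _INVALID.sub('_', stem).strip(' .')
    # Cap the length so very long titles do not exceed path limits.
    return stem[:max_len] + ext

print(safe_filename('What is love?', 'a/b', 1))  # What is love_+a_b+1.jpg
```

With a helper like this the title is always kept in the file name, at the cost of a slightly altered spelling when it contained special characters.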

Face Recognition API Use

Because every picture in each post was crawled, the collection contains all kinds of pictures, including many without a face, while the goal is to find the good-looking girls. Screening them by hand would take a lot of effort, so Baidu's face recognition API is called here to filter the pictures and score the faces.

Application for Face Recognition

First, go to the Baidu Face Recognition website, click "Use now", and log in with a Baidu account (register one if you do not have it).

Create an app. When that is done, click "Manage Apps" to see the AppID and the other credentials, which you will need when calling the API.
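The listings in this article leave the three credentials as empty strings to be filled in. If you would rather not paste them into the script, one option is to read them from environment variables. This is only a sketch: the variable names BAIDU_APP_ID, BAIDU_API_KEY and BAIDU_SECRET_KEY and the helper load_credentials are hypothetical, and the demo passes a plain dict in place of the real environment.

```python
import os

def load_credentials(env=os.environ):
    """Read the three Baidu credentials from environment variables.

    The variable names are hypothetical; use whatever names you prefer.
    """
    keys = ('BAIDU_APP_ID', 'BAIDU_API_KEY', 'BAIDU_SECRET_KEY')
    missing = [k for k in keys if not env.get(k)]
    if missing:
        raise RuntimeError('missing environment variables: ' + ', '.join(missing))
    return tuple(env[k] for k in keys)

# Demo with a plain dict standing in for os.environ.
demo = {'BAIDU_APP_ID': '123', 'BAIDU_API_KEY': 'key', 'BAIDU_SECRET_KEY': 'secret'}
APP_ID, API_KEY, SECRET_KEY = load_credentials(demo)
print(APP_ID)
```

Failing early with a clear message is handy here, because a call made with empty credentials only fails later, at the first API request.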

API Calls

Here a picture of Yang Chao is used for a first test. The result shows a score of 75, which is relatively high (I also tested some internet celebrities and stars: the average score is around 80, and the highest does not exceed 90).

from aip import AipFace
import base64

# Fill in the values shown for your app in the Baidu console.
APP_ID = ''
API_KEY = ''
SECRET_KEY = ''

aipFace = AipFace(APP_ID, API_KEY, SECRET_KEY)

filePath = r'C:\Users\LP\Desktop\6.jpg'

def get_file_content(filePath):
    # Read the image and return it as a Base64 string.
    with open(filePath, 'rb') as fp:
        content = base64.b64encode(fp.read())
        return content.decode('utf-8')

imageType = "BASE64"

# Ask the API to return age, gender and beauty for the detected face.
options = {}
options["face_field"] = "age,gender,beauty"
result = aipFace.detect(get_file_content(filePath), imageType, options)
print(result)
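The printed result is a nested dict, and the next section digs the gender and the score out of it. A small helper can make that lookup explicit. The sketch below is based only on the fields this article itself reads (error_code, result, face_list, gender.type, beauty). The sample dicts are hand-written stand-ins rather than real API output, the error_code of 0 on success is my assumption, and the helper name extract_face is hypothetical.

```python
def extract_face(result):
    """Return (gender, beauty) for the first detected face, or None."""
    if not result or result.get('error_code'):
        # Any non-zero error code, for example a picture without a face.
        return None
    faces = (result.get('result') or {}).get('face_list') or []
    if not faces:
        return None
    face = faces[0]
    return face['gender']['type'], face['beauty']

# Hand-written dicts shaped like the fields this article reads.
ok = {'error_code': 0, 'result': {'face_list': [
    {'gender': {'type': 'female'}, 'beauty': 75.0}]}}
no_face = {'error_code': 222202, 'result': None}

print(extract_face(ok))
print(extract_face(no_face))
```

Returning None for every failure case lets the calling loop skip a picture with a single check instead of catching KeyError and TypeError.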

Face score and file categorization

Finally, the crawled pictures and the face score are combined: the code filters out pictures without a person and pictures of men, computes the score of each remaining picture (rescaled here to a 10-point scale), and moves the files into different folders by score.

from aip import AipFace
import base64
import os
import time

APP_ID = ''
API_KEY = ''
SECRET_KEY = ''

aipFace = AipFace(APP_ID, API_KEY, SECRET_KEY)

def get_file_content(filePath):
    # Read the image and return it as a Base64 string.
    with open(filePath, 'rb') as fp:
        content = base64.b64encode(fp.read())
        return content.decode('utf-8')

imageType = "BASE64"

options = {}
options["face_field"] = "age,gender,beauty"

file_path = 'row_img'
# The target folders must exist before os.rename can move files into them.
folders = ['8 points', '7 points', '6 points', '5 points', 'Other points']
for folder in folders:
    os.makedirs(folder, exist_ok=True)

file_lists = os.listdir(file_path)
for file_list in file_lists:
    result = aipFace.detect(get_file_content(os.path.join(file_path, file_list)), imageType, options)
    error_code = result['error_code']
    if error_code == 222202:
        # No face found in the picture: skip it.
        continue

    try:
        sex_type = result['result']['face_list'][-1]['gender']['type']
        if sex_type == 'male':
            continue
        # print(result)
        beauty = result['result']['face_list'][-1]['beauty']
        new_beauty = round(beauty / 10, 1)  # rescale 0-100 to a 10-point scale
        print(file_list, new_beauty)
        if new_beauty >= 8:
            folder = '8 points'
        elif new_beauty >= 7:
            folder = '7 points'
        elif new_beauty >= 6:
            folder = '6 points'
        elif new_beauty >= 5:
            folder = '5 points'
        else:
            folder = 'Other points'
        # Move the file and prefix its name with the score.
        os.rename(os.path.join(file_path, file_list), os.path.join(folder, str(new_beauty) + '+' + file_list))
        time.sleep(1)
    except KeyError:
        pass
    except TypeError:
        pass
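The bucketing rule in the listing can be pulled out into a small pure function, which makes it easy to check without calling the API or touching any files. This is a sketch of the same thresholds (8, 7, 6, 5, everything else). The function name score_to_folder is hypothetical and the folder names are illustrative.

```python
def score_to_folder(beauty):
    """Map a 0-100 beauty score to (10-point score, folder name)."""
    # Round first, then compare, in the same order as the listing.
    score = round(beauty / 10, 1)
    for floor in (8, 7, 6, 5):
        if score >= floor:
            return score, '{} points'.format(floor)
    return score, 'Other points'

print(score_to_folder(83.2))  # (8.3, '8 points')
print(score_to_folder(75.0))  # (7.5, '7 points')
print(score_to_folder(41.9))  # (4.2, 'Other points')
```

One detail worth knowing: because rounding happens before the comparison, a raw score such as 49.9 becomes 5.0 and lands in the 5-point folder.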

In the end, very few girls score above 8 points, as shown in the figure (it will be removed on request if it infringes).

Discussion

The Jianshu dating column has only a small number of girls, so readers can try the same approach on Weibo internet celebrities or on Zhihu.
Although this is an age that judges by looks, liking someone starts with appearance, deepens with talent, and stays loyal to character (ending on a positive note, to avoid being blocked).

Topics: Python network Windows
