Zero basics Python - geek time - learning notes

Posted by tress on Wed, 19 Jan 2022 08:14:05 +0100

Chapter 2 basic Python syntax

Writing rules for Python programs

Basic data type

Type judgment

type()

Cast type

Target type (data to convert)
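
As a minimal sketch of the two ideas above (the values are made up for illustration), type() reports the type of a value, and writing the target type as a function call converts the data:

```python
# Check a value's type with type(), then cast with target_type(data).
text = '123'
print(type(text))        # <class 'str'>

number = int(text)       # str -> int
print(type(number))      # <class 'int'>
print(number + 1)        # 124

ratio = float('3.5')     # str -> float
label = str(99) + '!'    # int -> str, so it can be joined to another string
print(ratio * 2, label)  # 7.0 99!
```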

Definition and common operations of variables




exercises

Title:

Exercise 1 Definition and use of variables

  1. Define two variables: the US dollar amount and the exchange rate
  2. Find the exchange rate of US dollar to RMB through a search engine
  3. Use Python to calculate how much RMB US $100 converts to, and output it with print()

code:

Chapter 3 sequences

Concept of sequence

case

concept

code

Definition and use of strings

Common operations of string

Basic operation of sequence

Membership operator

Join operator

Repeat operator
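
The three operators above work on any sequence (strings, lists, tuples). A short sketch with made-up values:

```python
letters = 'abc'
numbers = [1, 2, 3]

# Membership operators: in / not in
has_b = 'b' in letters        # True
no_five = 5 not in numbers    # True

# Join operator: + concatenates two sequences of the same type
joined = letters + 'xyz'      # 'abcxyz'
merged = numbers + [4, 5]     # [1, 2, 3, 4, 5]

# Repeat operator: * repeats a sequence
repeated = letters * 3        # 'abcabcabc'

print(has_b, no_five, joined, merged, repeated)
```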

Definition and common operations of tuples

Comparison of number sizes in tuples

Single number:

Two numbers:
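Tuples are compared element by element from left to right. For small numbers, this can be pictured as joining the two numbers together.

For example, a comparison such as (1, 20) > (2, 20) behaves like comparing 120 with 220: 120 < 220, so the result is False.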
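
A small sketch of how tuple comparison behaves (the tuples are illustrative, in the same (month, day) shape the zodiac example uses later). Python compares the first elements and only looks at the second when the first ones are equal, so the "joined digits" picture is just a shortcut:

```python
# The first elements differ, so they decide the result immediately.
first_decides = (1, 20) > (2, 20)     # False, because 1 < 2

# The first elements tie, so the second elements decide.
second_decides = (1, 20) > (1, 3)     # True, because 20 > 3

# The joined-digits picture can mislead: "1200" looks bigger than "21",
# but 1 < 2 is checked first.
shortcut_fails = (1, 200) < (2, 1)    # True

print(first_decides, second_decides, shortcut_fails)
```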

The difference between lists and tuples

  • A list is written with square brackets [], and a tuple with parentheses ()
  • The contents in the list can be changed, and the contents in the tuple cannot be changed

The filter function

Format:
filter(lambda x: x < b, a)
Take out the elements smaller than b in a

Show extracted elements
list(filter(lambda x: x < b, a))

example:

Count the number of extracted elements:
Format:
len(list(filter(lambda x: x < b, a)))
Count the elements in a that are smaller than b

example:
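
A minimal sketch of both formats above, assuming a is a list of numbers and b is a threshold (the values are made up):

```python
a = [1, 5, 8, 12, 20]
b = 10

# Take out the elements of a that are smaller than b.
smaller = list(filter(lambda x: x < b, a))
print(smaller)   # [1, 5, 8]

# Count how many elements were taken out.
count = len(list(filter(lambda x: x < b, a)))
print(count)     # 3
```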

Realize the zodiac search function

List definition and common operations

Basic operation:

  • Add an element
  • Remove an element



exercises

Exercise 1 strings

Title:

  1. Define a string Hello Python and output it using print()
  2. Define the second string Let's go and output it using print()
  3. Define the third string "The Zen of Python" – by Tim Peters and output it using print()

code:

Exercise 2 basic operation of string

Title:

  1. Define two strings: xyz and abc
  2. Concatenate two strings
  3. Take the second and third elements of the xyz string
  4. Output the abc string repeated 10 times
  5. Check whether the character a exists in the xyz and abc strings and output the results

code:

Exercise 3 basic operation of list

  1. Define a list of 5 numbers
  2. Add an element 100 to the list
  3. Observe the change of the list after deleting an element with remove()
  4. Use the slice operation to extract the first three elements of the list and the last element of the list

Exercise 4 basic operation of tuples

Title:

  1. Define an arbitrary tuple, and call append() on it to see the error message
  2. Access the penultimate element in the tuple
  3. Define a new tuple, and concatenate it with the tuple from step 1 to form a new tuple
  4. Calculate the number of tuple elements

code:

Chapter 4 conditions and loops

Conditional statement

grammar

code


for loop


Purpose:
The for loop is often used to traverse sequences
code:

while Loop

usage

It is usually used with if conditional judgment statements

break statement

Function:
Terminates the current loop

continue Statement

Function:
Skips the rest of the current iteration and moves on to the next one
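
The difference between the two statements is easiest to see side by side. A small sketch (the list names are only for illustration):

```python
# break: leave the loop entirely when i reaches 3.
with_break = []
for i in range(1, 6):
    if i == 3:
        break
    with_break.append(i)
print(with_break)      # [1, 2]

# continue: skip only the iteration where i is 3, then keep looping.
with_continue = []
for i in range(1, 6):
    if i == 3:
        continue
    with_continue.append(i)
print(with_continue)   # [1, 2, 4, 5]
```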

summary

if nesting in for statement

Determining the constellation with a for loop:

if nesting in while loop statements

exercises

Exercise 1 use of conditional statements

Title:

  1. Use the if statement to judge whether the length of the string is equal to 10, and output different results according to the judgment results
  2. Prompt the user to input a number between 1-40. Use the if statement to judge according to the size of the input number. If the input number is 1-10, 11-20, 21-30 and 31-40, output different numbers respectively

code:

Exercise 2 use of loop statements

Title:

  1. Use the for statement to output all even numbers between 1 and 100
  2. Use the while statement to output all numbers between 1 and 100 that are divisible by 3

code:

Chapter 5 mappings and dictionaries

Dictionary definition and common operations

Define and add elements

Case improvement of zodiac and constellation


code:

# Tuple storing the 12 zodiac animals, indexed by year % 12
chinese_zodiac = ('monkey', 'chicken', 'dog', 'pig', 'rat', 'ox',
                  'tiger', 'rabbit', 'dragon', 'snake', 'horse', 'sheep')

zodiac_name = ('Capricorn', 'Aquarius', 'Pisces', 'Aries', 'Taurus', 'Gemini',
               'Cancer', 'Leo', 'Virgo', 'Libra', 'Scorpio', 'Sagittarius')
zodiac_days = ((1, 20), (2, 19), (3, 21), (4, 21), (5, 21), (6, 22),
               (7, 23), (8, 23), (9, 23), (10, 23), (11, 23), (12, 23),)

# Define the dictionaries
cz_num = {}
z_num = {}
# Initialize the keys
for i in chinese_zodiac:
    cz_num[i] = 0  # Each zodiac animal becomes a key with a count of 0

for i in zodiac_name:
    z_num[i] = 0  # Each constellation name becomes a key with a count of 0

while True:
    # The user enters the year, month, and day of birth
    year = int(input('Please enter the year: '))
    month = int(input('Please enter the month: '))
    day = int(input('Please enter the day: '))

    n = 0
    while zodiac_days[n] < (month, day):
        if month == 12 and day > 23:
            break  # Dates after December 23 wrap around to Capricorn (index 0)
        n += 1
    # Output zodiac and constellation
    print('Your constellation is: %s' % (zodiac_name[n]))

    print('The zodiac sign for %s is %s' % (year, chinese_zodiac[year % 12]))

    # Update the counts in the initialized dictionaries
    cz_num[chinese_zodiac[year % 12]] += 1  # Add one to the count of the user's zodiac animal
    z_num[zodiac_name[n]] += 1  # Add one to the count of the user's constellation

    # Output statistics of zodiac and constellation
    for each_key in cz_num.keys():  # .keys() retrieves all keys in the dictionary
        print('Zodiac %s: %d' % (each_key, cz_num[each_key]))

    for each_key in z_num.keys():
        print('Constellation %s: %d' % (each_key, z_num[each_key]))

result:

Please enter the year: 2018
Please enter the month: 1
Please enter the day: 3
Your constellation is: Capricorn
The zodiac sign for 2018 is dog
Zodiac monkey: 0
Zodiac chicken: 0
Zodiac dog: 1
Zodiac pig: 0
Zodiac rat: 0
Zodiac ox: 0
Zodiac tiger: 0
Zodiac rabbit: 0
Zodiac dragon: 0
Zodiac snake: 0
Zodiac horse: 0
Zodiac sheep: 0
Constellation Capricorn: 1
Constellation Aquarius: 0
Constellation Pisces: 0
Constellation Aries: 0
Constellation Taurus: 0
Constellation Gemini: 0
Constellation Cancer: 0
Constellation Leo: 0
Constellation Virgo: 0
Constellation Libra: 0
Constellation Scorpio: 0
Constellation Sagittarius: 0
Please enter the year: 2021
Please enter the month: 3
Please enter the day: 25
Your constellation is: Aries
The zodiac sign for 2021 is ox
Zodiac monkey: 0
Zodiac chicken: 0
Zodiac dog: 1
Zodiac pig: 0
Zodiac rat: 0
Zodiac ox: 1
Zodiac tiger: 0
Zodiac rabbit: 0
Zodiac dragon: 0
Zodiac snake: 0
Zodiac horse: 0
Zodiac sheep: 0
Constellation Capricorn: 1
Constellation Aquarius: 0
Constellation Pisces: 0
Constellation Aries: 1
Constellation Taurus: 0
Constellation Gemini: 0
Constellation Cancer: 0
Constellation Leo: 0
Constellation Virgo: 0
Constellation Libra: 0
Constellation Scorpio: 0
Constellation Sagittarius: 0
Please enter the year:

List comprehensions and dictionary comprehensions
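
The notes give no code for this section, so the following is only a sketch with made-up values. A comprehension builds a list or dictionary in one expression; the last line shows how a dictionary of counters, like the ones initialized with for loops in the zodiac case, might be written this way:

```python
# List comprehension: the squares of 1..5.
squares = [x * x for x in range(1, 6)]
print(squares)        # [1, 4, 9, 16, 25]

# List comprehension with a condition: squares of the even numbers in 1..10.
even_squares = [x * x for x in range(1, 11) if x % 2 == 0]
print(even_squares)   # [4, 16, 36, 64, 100]

# Dictionary comprehension: every name becomes a key with a count of 0.
animal_count = {name: 0 for name in ('rat', 'ox', 'tiger')}
print(animal_count)   # {'rat': 0, 'ox': 0, 'tiger': 0}
```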

exercises

Exercise 1 use of dictionaries

Title:

  1. Define a dictionary. Use a, b, c and d as the keys of the dictionary, with any content as the values
  2. After adding an element 'c': 'cake' to the dictionary, output the dictionary to the screen
  3. Take out the value stored under key d in the dictionary

code:

Exercise 2 use of sets

Title:

  1. Assign each character in the string hello to a set and output the set to the screen

code:

Chapter 6 file input and output

Built in functions for files


Common operations of files

code:

# # Record the main characters of the novel in a file
#
# # Basic process of writing a file: open() -> write() -> close()
# file1 = open('name.txt', 'w')  # Open "name.txt" in write mode and assign the file object to a variable
#
# file1.write(u'Zhuge Liang')  # Write a character's name
# file1.close()  # Close and save
#
# # Basic process of reading a file: open() -> read() -> close()
# file2 = open('name.txt')  # The mode defaults to 'r' (read-only)
# print(file2.read())
# file2.close()
#
# # Append a new character's name
# file3 = open('name.txt', 'a')
# file3.write('liu Bei ')
# file3.close()

# # Read a single line from a multi-line file
# file4 = open('name.txt')
# print(file4.readline())
#
# # Read every line and process the lines one by one
# file5 = open('name.txt')
# for line in file5.readlines():  # readlines() reads all lines into a list
#     print(line)
#     print('=====')

# Perform an operation, return to the beginning of the file afterwards, and then operate on the file again
file6 = open('name.txt')
file6.tell()  # Returns the current position of the "file pointer"
# What the pointer does: unless it is moved manually, the program records the current position and continues from there
print('The location of the current file pointer %s' % file6.tell())

print('A character is currently read. The content of the character is %s' % file6.read(1))  # Read only 1 character of the file
print('The location of the current file pointer %s' % file6.tell())

# Requirement: after the operation is completed, go back to the beginning of the file and operate again
# Moving the pointer
# First parameter: the offset. Second parameter: 0 -- offset from the beginning of the file, 1 -- offset from the current position, 2 -- offset from the end of the file
file6.seek(0)  # file6.seek(5, 0) would move the pointer 5 positions forward from the beginning of the file
print('We did seek operation')
print('The location of the current file pointer %s' % file6.tell())
# Read one character again
print('A character is currently read. The content of the character is %s' % file6.read(1))  # Read only 1 character of the file
print('The location of the current file pointer %s' % file6.tell())
file6.close()

result:

The location of the current file pointer 0
A character is currently read. The content of the character is a
The location of the current file pointer 1
We did seek operation
The location of the current file pointer 0
A character is currently read. The content of the character is a
The location of the current file pointer 1

Exercise 1 creating and using files

Title:

  1. Create a file and write the current date
  2. Open the file again, read the first 4 characters of the file and exit

code:

# 1. Create a file and write the current date
import datetime

now = datetime.datetime.now()  # The now variable stores the current time

new_file = open('date.txt', 'w')
new_file.write(str(now))
new_file.close()

# 2. Open the file again, read the first 4 characters of the file and exit
again_file = open('date.txt')

print(again_file.read(4))  # Print read 4 characters

print(again_file.tell())  # File pointer, indicating that 4 characters have been read

again_file.close()

result:

2021
4

Chapter 7 errors and exceptions

Exception detection and handling

Common errors

1. NameError


2. SyntaxError


3. IndexError


4. KeyError


5. ValueError


6. AttributeError


7. ZeroDivisionError


8. TypeError
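
The notes list the error names without examples, so here is a small sketch that triggers each of the eight errors on purpose and prints its name. The helper show() and the trigger expressions are assumptions chosen for illustration, not code from the course:

```python
def show(label, action):
    """Run action() and report which exception it raises."""
    try:
        action()
    except Exception as error:
        print('%s -> %s: %s' % (label, type(error).__name__, error))
        return type(error).__name__
    return None


caught = [
    show('undefined variable', lambda: undefined_name),
    show('incomplete expression', lambda: eval('1 +')),
    show('list index out of range', lambda: [1, 2, 3][5]),
    show('missing dictionary key', lambda: {'a': 1}['b']),
    show('bad value for int()', lambda: int('abc')),
    show('missing attribute', lambda: 'abc'.no_such_attribute),
    show('division by zero', lambda: 1 / 0),
    show('adding str and int', lambda: '1' + 1),
]
```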


Catch all errors

except Exception


exception handling


except can catch multiple exceptions

Format:

Note:
Multiple exception types must be enclosed in parentheses and passed together as a single tuple.
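
A minimal sketch of catching several exception types with one except clause. The function safe_divide() is a hypothetical example; binding the exception with "as error" also makes its message available for printing:

```python
def safe_divide(text_a, text_b):
    """Divide two numbers given as strings; return None if that fails."""
    try:
        return int(text_a) / int(text_b)
    except (ValueError, ZeroDivisionError) as error:
        # Both exception types are handled by this one clause.
        print('Could not divide: %s' % error)
        return None


ok = safe_divide('10', '2')       # 5.0
by_zero = safe_divide('10', '0')  # ZeroDivisionError is caught
not_int = safe_divide('ten', '2') # ValueError is caught
```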

Displaying the error message after the exception is caught



Note: it is generally used when debugging programs

Throw exception manually


Python raise usage

finally

The finally block is executed whether or not an error occurs.
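
A short sketch combining raise and finally. The function check_age() and the log list are made up for illustration; the point is that 'done' is recorded on both the normal path and the error path:

```python
log = []


def check_age(age):
    try:
        if age < 0:
            # Throw an exception manually with raise.
            raise ValueError('age must not be negative')
        log.append('valid')
    except ValueError as error:
        log.append('error: %s' % error)
    finally:
        # Runs in both cases.
        log.append('done')


check_age(5)
check_age(-1)
print(log)
```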

summary

exercises

Exercise 1 exception

Title:

  1. In a Python program, use an undefined variable, access a list index that does not exist, and access a dictionary key that does not exist, and observe the error messages Python reports
  2. Trigger an IndexError in a Python program and catch and handle the exception with try

code:

Chapter 8 function

Function definition and common operations


No function version:

Function version:

Full version:

import re


def find_main_characters(character_name):
    with open('sanguo.txt', encoding='UTF-8') as f:
        data = f.read().replace("\n", "")
        name_num = re.findall(character_name,data)

    return character_name, len(name_num)

name_dict = {}
with open('name.txt') as f:
    for line in f:
        names = line.strip().split('|')  # Strip the trailing '\n' so the last name matches too
        for n in names:
            char_name, char_number = find_main_characters(n)
            name_dict[char_name] = char_number

weapon_dict = {}
with open('weapon.txt', encoding="UTF-8") as f:  # Iterating over a file reads it line by line
    i = 1
    for line in f:
        if i % 2 == 1:  # Only the odd-numbered lines are used as weapon names
            weapon_name, weapon_number = find_main_characters(line.strip('\n'))  # Strip the trailing '\n' from the line
            weapon_dict[weapon_name] = weapon_number
        i = i + 1

name_sorted = sorted(name_dict.items(), key=lambda item: item[1], reverse=True)
print(name_sorted[0:10])

weapon_sorted = sorted(weapon_dict.items(), key=lambda item: item[1], reverse=True)
print(weapon_sorted[0:10])

result:

Variable length arguments to functions

Keyword parameters

Purpose:
Lets you pass arguments by name, so they do not have to be written in order.

Advantages:

  1. It is not necessary to write arguments in order
  2. The meaning of each argument is stated explicitly at the call site

code:
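
The notes leave this code section empty, so the following is only a sketch. The function describe() and its parameters are hypothetical; it shows that a keyword call gives the same result regardless of argument order:

```python
def describe(name, hp, occupation):
    return '%s: %s %s' % (name, hp, occupation)


# Positional arguments must follow the order of the parameters.
positional = describe('tom', 100, 'war')

# Keyword arguments can be written in any order, and each one is labeled.
by_keyword = describe(occupation='war', name='tom', hp=100)

print(positional)
print(by_keyword)
```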

Variable length parameter
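
No code accompanies this heading in the notes. As a sketch with hypothetical function names: a parameter written as *numbers collects any number of positional arguments into a tuple, and **fields collects any number of keyword arguments into a dictionary:

```python
def total(*numbers):
    # numbers is a tuple holding every positional argument.
    return sum(numbers)


def describe(**fields):
    # fields is a dict holding every keyword argument.
    return ', '.join('%s=%s' % (key, value) for key, value in sorted(fields.items()))


print(total(1, 2, 3))                  # 6
print(total())                         # 0
print(describe(name='tom', hp=100))    # hp=100, name=tom
```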

Variable scope of function

global variable
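
A sketch of how scope behaves, with made-up names. Assigning to a name inside a function creates a local variable unless the name is declared with global:

```python
counter = 0  # Global variable


def shadow():
    counter = 100  # Local variable; the global counter is untouched
    return counter


def increment():
    global counter  # Refer to the global variable instead of creating a local one
    counter += 1


local_value = shadow()
after_shadow = counter   # Still 0

increment()
increment()
print(local_value, after_shadow, counter)
```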

Iterators and generators for functions

iterator

function

Taking each element of a list and processing the elements in turn is called iteration. An object that implements this behavior is called an iterator.

Two functions (Methods)

iter() and next()
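
A minimal sketch of the two functions: iter() creates an iterator from a sequence, and each call to next() returns the following element. When nothing is left, next() raises StopIteration, which is how a for loop knows when to stop:

```python
numbers = [1, 2, 3]
it = iter(numbers)

first = next(it)
second = next(it)
third = next(it)

try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)
```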

generator

definition

An iterator that you define yourself is called a generator.
It is a function that contains yield.

Custom iterator

Code 1:

 for i in range(10, 20, 0.5):
     print(i)

This raises a TypeError, because range() only accepts integers and does not allow a float as its step size.

Code 2:

# Implement a range that supports decimal step growth
def frange(start, stop, step):
    x = start
    while x < stop:
        yield x  # Execution pauses at yield and the current position is recorded; the next call to next() resumes from here and returns the next value
        x += step


for i in frange(10, 20, 0.5):
    print(i)

result:

10
10.5
11.0
11.5
12.0
12.5
13.0
13.5
14.0
14.5
15.0
15.5
16.0
16.5
17.0
17.5
18.0
18.5
19.0
19.5

Lambda expression

effect

Simplify functions.
code:

def true():
    return True
# is equivalent to
def true(): return True
# is equivalent to
lambda: True

def add(x, y):
    return x + y
# is equivalent to
def add(x, y): return x + y
# is equivalent to
lambda x, y: x + y  # The parameters are x and y, and the return value is x + y

Evaluating a lambda expression returns a function object:
<function <lambda> at 0x0000015D077E20D0>

purpose

Code 1:

lambda x: x <= (month, day)  # The parameter is x, and the returned value is x <= (month, day)
# Convert to function
def find(x):
    return x <= (month, day)

Code 2:

lambda item: item[1]  # The parameter is: item, and the returned value is: item[1]
# Convert to function
def find2(item):  # Takes a (key, value) pair from the dictionary and returns the value
    return item[1]


adict = {'a': '123', 'b': '456'}
for i in adict.items():
    print(find2(i)) # Take dictionary value

Usage of the Python dictionary items() function.

Python built-in functions

filter

Function:
filter(function, sequence)
Keeps the elements of sequence for which function returns True

code:

a = [1, 2, 3, 4, 5, 6, 7]
b = list(filter(lambda x: x > 2, a))  # Keep the numbers in a that are greater than 2
print(b)

result:

[3, 4, 5, 6, 7]

In Python 3, filter() returns a lazy iterator, so the lambda is not executed until the result is consumed, for example by converting it to a list.
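
The laziness can be observed directly. In this sketch (the names are made up), a named function records every value it is asked to test; nothing is recorded until list() consumes the filter object:

```python
calls = []


def greater_than_two(x):
    calls.append(x)  # Record each value the filter actually tests
    return x > 2


lazy = filter(greater_than_two, [1, 2, 3, 4])
calls_before = list(calls)   # Nothing has been tested yet

result = list(lazy)          # Consuming the iterator runs the function
calls_after = list(calls)

print(calls_before, result, calls_after)
```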

map

Function:
map(function, sequence)
Applies function to each value in sequence

Code 1:

c = [1, 2, 3]
map(lambda x: x, c)  # Returns each value in c unchanged
d = list(map(lambda x: x, c))
print(d)

map(lambda x: x + 1, c)  # Returns each value in c plus 1
e = list(map(lambda x: x + 1, c))
print(e)

result:

[1, 2, 3]
[2, 3, 4]

Code 2:

a = [1, 2, 3]
b = [4, 5, 6]
map(lambda x, y: x + y, a, b)
c = list(map(lambda x, y: x + y, a, b))
print(c)

result:

[5, 7, 9]

The corresponding items are added and output in turn.

reduce

Function:
reduce(function, sequence[, initial])
Cumulatively applies function to the initial value and the elements of sequence, reducing them to a single value.

code:

from functools import reduce

total = reduce(lambda x, y: x + y, [2, 3, 4], 1)  # Starts by applying the function to the initial value 1 and the first element of the list
print(total)
# ((1+2)+3)+4

result:

10

zip

Function 1: vertical integration

code:

exchange = zip((1, 2, 3), (4, 5, 6))

for i in exchange:
    print(i)
# Vertical integration is comparable to a matrix transpose in linear algebra

result:

(1, 4)
(2, 5)
(3, 6)

That is, the rows (1, 2, 3) and (4, 5, 6) are regrouped column by column into (1, 4), (2, 5) and (3, 6).

Function 2: key and value exchange in the dictionary

code:

dicta = {'a': '123', 'b': '456'}
dictb = zip(dicta.values(), dicta.keys())
# ('123', '456'), ('a', 'b') --> ('123', 'a'), ('456', 'b')
print(dictb)
print(dict(dictb))  # The zip object has to be converted to dict

result:

<zip object at 0x0000016A531F7E40>
{'123': 'a', '456': 'b'}

Definition of closure

closure

definition:
When an inner function references variables of its outer function, and the outer function returns that inner function, the result is called a "closure"

code:

def func():
    a = 1
    b = 2
    return a + b


def sum(a):  # Note: this name shadows Python's built-in sum()
    def add(b):
        return a + b  # Returns an int
    # Return the inner function itself, not the result of calling it
    return add  # add is the name of the inner function, i.e. a reference to the function object


num1 = func()
num2 = sum(2)  # num2 now refers to the inner function add, with a fixed at 2
print(num2(4))  # Call num2 as a function (equivalent to add(4)) and pass in the second number

# print(type(num1)) --- int
# print(type(num2)) --- function

result:

6

add is the function name or a reference to the function
add() is a call to a function

Counter

Function:
Implement a counter, which counts + 1 every time it is called

Code 1:

# Implement a counter, which counts + 1 every time it is called
def counter():
    cnt = [0]  # Define a list, only one element is 0

    def add_one():
        cnt[0] += 1  # The counter is + 1 each time the function is called
        return cnt[0]  # Returns the added value

    return add_one  # Return inner function


num1 = counter()  # Assign the function returned by counter() to the variable num1
print(num1)  # Prints the function object itself (add_one), not a count
print(num1())  # Calls add_one and prints the returned count
print(num1())
print(num1())
print(num1())
print(num1())

result:

1
2
3
4
5

Code 2:

# Implement a counter. According to the specified initial value, the count will be + 1 every time it is called
def counter(FIRST=0):  # If no parameter is passed in, it starts with 0 by default
    cnt = [FIRST]  # Define a list, and only one element is FIRST

    def add_one():
        cnt[0] += 1  # The counter is + 1 each time the function is called
        return cnt[0]  # Returns the added value

    return add_one  # Return inner function


num5 = counter(5)
num10 = counter(10)

print(num5())
print(num5())
print(num5())
print(num10())
print(num10())

result:

6
7
8
11
12

Use of closures

Case: mathematical operation

Closure:

# a * x + b = y
# a and b stay fixed and only x changes, so a closure is a good fit
def a_line(a, b):
    def arg_y(x):
        return a * x + b

    return arg_y


# a = 3, b = 5
# x = 10, y = ?
# x = 20, y = ?

line1 = a_line(3, 5) # line1 is equivalent to arg_y
print(line1(10)) # line1(10) is equivalent to arg_y(10)
print(line1(20)) # line1(20) is equivalent to arg_y(20)

# Find another line
# a = 5, b = 10

line2 = a_line(5, 10)
print(line2(10))
print(line2(20))

result:

35
65
60
110

The outer function holds the values that stay constant, and the inner function takes the value that varies.

Ordinary function version:

def func1(a, b, x):
    return a * x + b

All three values a, b and x need to be passed in on every call, which is not concise or elegant enough

lambda:

def a_line2(a, b):
    return lambda x: a * x + b


# a = 3, b = 5
# x = 10, y = ?
# x = 20, y = ?

new_line1 = a_line2(3, 5)
print(new_line1(10))
print(new_line1(20))

More concise and elegant!

The difference between closure and function

  1. A function passes variables; a closure passes (returns) a function;
  2. A closure call takes fewer arguments than the equivalent plain function call;
  3. Closure code is more elegant.

Definition of decorator

Description:
If you want to add some functionality to a function, but do not want to add the corresponding code inside the function itself, you can use a "decorator".

time library

sleep() method

Function: pauses the program for the given number of seconds

import time

time.sleep(3)  # Pause for 3 seconds

time() method

Function:
Returns the number of seconds elapsed since January 1, 1970.
code:

import time

print(time.time())  # Number of seconds elapsed since January 1, 1970

result:

1611718601.273848

case

Measuring a function's running time:

import time


# Measure how long the function runs
def i_can_sleep():
    time.sleep(3)


start_time = time.time()

i_can_sleep()

stop_time = time.time()

print('The function ran for %s seconds' % (stop_time - start_time))

Decorator

Function:
Do repetitive things only once.

The difference between closures and decorators:

  • Closures pass in variables, and internal functions reference variables
  • The decorator passes in a function, and the internal function references a function

code:

import time


def timer(func):
    def wrapper():
        start_time = time.time()
        func()
        stop_time = time.time()
        print("Run time is %s second" % (stop_time - start_time))

    return wrapper


# Measure how long the function runs
@timer  # Syntactic sugar: timer is the decorating function and i_can_sleep is the decorated function; the additional code lives in the decorator
def i_can_sleep():
    time.sleep(3)


i_can_sleep()  # Without @timer, the equivalent would be: num = timer(i_can_sleep); num()

result:

Run time is 3.0005228519439697 second

Use of decorators

Adding decorators to functions with arguments

code:

# Adding decorators to functions with arguments
def tips(func):
    def nei(a, b):
        print('start')
        func(a, b)
        print('stop')

    return nei


# Implement addition
@tips
def add(a, b):
    print(a + b)


# Implement subtraction
@tips
def sub(a, b):
    print(a - b)


print(add(4, 5))
print(sub(4, 5))

result:

start
9
stop
None
start
-1
stop
None
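
The None lines in the result appear because the inner function nei never returns anything, so print(add(4, 5)) prints None. One possible variation, sketched here rather than taken from the course, forwards the return value, accepts any arguments with *args and **kwargs, and uses functools.wraps to keep the decorated function's name:

```python
import functools


def tips(func):
    @functools.wraps(func)  # Keep func's __name__ and docstring on the wrapper
    def wrapper(*args, **kwargs):
        print('start')
        result = func(*args, **kwargs)  # Works for any argument list
        print('stop')
        return result                   # Pass the return value back to the caller

    return wrapper


@tips
def add(a, b):
    return a + b


total = add(4, 5)
print(total)
```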

Decorators with parameters

Code 1:

# Decorators with parameters
# To make the decorator behave differently for different functions, give the decorator a parameter

def new_tips(argv):
    def tips(func):
        def nei(a, b):
            print('start %s' % argv)
            func(a, b)
            print('stop')

        return nei

    return tips


@new_tips('add')
def add(a, b):
    print(a + b)


@new_tips('sub')
def sub(a, b):
    print(a - b)


print(add(4, 5))
print(sub(4, 5))

result:

start add
9
stop
None
start sub
-1
stop
None

Code 2:

# The decorator can use not only its own parameter but also the name of the decorated function
def new_tips(argv):
    def tips(func):
        def nei(a, b):
            print('start %s %s' % (argv, func.__name__))  # Name of the output function
            func(a, b)
            print('stop')

        return nei

    return tips


@new_tips('add_module')
def add(a, b):
    print(a + b)


@new_tips('sub_module')
def sub(a, b):
    print(a - b)


print(add(4, 5))
print(sub(4, 5))

result:

start add_module add
9
stop
None
start sub_module sub
-1
stop
None

Benefits of decorators

  1. The extra code does not have to be repeated at every function call; it is written once in the decorator;
  2. Decorator code is easy to reuse: just write @ followed by the decorator name.

exercises

Exercise 1 define a decorator that prints the execution time of a function

  1. Count the time when the function starts and ends execution
  2. Extension exercise: pass in the timeout for the decorator, and exit after the function execution exceeds the specified time

code:

# 1. Count the time when the function starts and ends execution
import time


def timer(func):
    def nei():
        print('start time %s' % time.time())
        func()
        print('end time %s' % time.time())

    return nei


@timer
def i_can_sleep():
    time.sleep(3)


i_can_sleep()

result:

start time 1612241463.787381
end time 1612241466.787973

Exercise 2 define the decorator to realize the function of displaying the execution results in different colors

  1. Pass parameters to the decorator, and obtain the output color through the passed parameters
  2. The print() output of the decorated function is output according to the color obtained by the decorator

code:

# 1. Pass parameters to the decorator and obtain the output color through the passed parameters
# 2. The print() output of the decorated function is output according to the color obtained by the decorator

def make_color(code):
    colors = {0: 'white', 1: 'black'}  # Map the decorator parameter to a color name

    def decorator(func):
        def wrapper():
            if code in colors:
                return func(colors[code])
            print('wrong')

        return wrapper  # Return a function, so the decorated name stays callable

    return decorator


@make_color(0)
def color_func(s):
    print('The color is: %s' % s)


color_func()

result:

The color is: white
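
The exercise also asks for the print() output itself to appear in the chosen color. One way to approach that, offered only as a sketch, is to capture what the decorated function prints and wrap it in ANSI escape codes. The color table and the function greet() are assumptions, and whether the codes render as colors depends on the terminal:

```python
import contextlib
import functools
import io

# ANSI escape codes (assumed to be supported by the terminal in use).
ANSI_COLORS = {'red': '\033[31m', 'green': '\033[32m', 'blue': '\033[34m'}
ANSI_RESET = '\033[0m'


def make_color(color):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            captured = io.StringIO()
            # Collect everything func prints instead of showing it directly.
            with contextlib.redirect_stdout(captured):
                result = func(*args, **kwargs)
            # Re-print the captured text wrapped in the color codes.
            print(ANSI_COLORS[color] + captured.getvalue() + ANSI_RESET, end='')
            return result

        return wrapper

    return decorator


@make_color('green')
def greet():
    print('hello')


greet()
```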

Custom context manager

code:

fd = open('name.txt')
try:
    for line in fd:
        print(line)
finally:
    fd.close()

# Context manager
# With the with statement you do not need to write finally: the file is closed automatically when the block ends, even if an exception occurs (described in detail later)
with open('name.txt') as f:
    for line in f:
        print(line)

result:

Zhuge Liang|Guan Yu|Liu Bei|Cao Cao|Sun Quan|Guan Yu|Fei Zhang|Lv Bu|Zhou Yu|Zhao Yun|Pang Tong|Sima Yi|Huang Zhong|Ma Chao
 Zhuge Liang|Guan Yu|Liu Bei|Cao Cao|Sun Quan|Guan Yu|Fei Zhang|Lv Bu|Zhou Yu|Zhao Yun|Pang Tong|Sima Yi|Huang Zhong|Ma Chao
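
The code above uses the built-in context manager returned by open(). A custom one can be sketched as a class with __enter__ and __exit__ methods. The class name ManagedFile is hypothetical, and a temporary directory is used so the sketch does not touch any real name.txt:

```python
import os
import tempfile


class ManagedFile:
    def __init__(self, path, mode='r'):
        self.path = path
        self.mode = mode
        self.file = None

    def __enter__(self):
        # Runs when the with block starts; the return value is bound by "as".
        self.file = open(self.path, self.mode)
        return self.file

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs when the with block ends, whether or not an exception occurred.
        self.file.close()
        return False  # Do not suppress exceptions


path = os.path.join(tempfile.mkdtemp(), 'name.txt')

with ManagedFile(path, 'w') as f:
    f.write('Zhuge Liang')

with ManagedFile(path) as f:
    content = f.read()

print(content)
print(f.closed)
```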

summary

exercises

Exercise 1 functions

  1. Create a function to receive the number entered by the user and calculate the sum of the number entered by the user
  2. Create a function, pass in n integers, and return the maximum number and the minimum number
  3. Create a function, pass in a parameter n and return the factorial of n

code:

# 1. Create a function to receive the number entered by the user and calculate the sum of the number entered by the user


def func1():
    two_num = input('Please enter two numbers separated by spaces:')
    # Check whether the user input is legal
    func2(two_num)
    # print(type(two_num))
    num1, num2 = two_num.split()  # Split on whitespace so multi-digit numbers work
    print('The sum of %s and %s is:' % (num1, num2))
    print(int(num1) + int(num2))


def func2(check_number):
    pass


func1()


# 2. Create a function, pass in n integers, and return the maximum number and the minimum number
def func3(*nums):
    print('The maximum number is: %s' % max(nums))
    print('The smallest number is: %s' % min(nums))


func3(1, 5, 8, 32, 654, 765, 4, 6, 7)


# 3. Create a function, pass in a parameter n and return the factorial of n
def func4(n):
    if n == 0 or n == 1:
        return 1
    else:
        return n * func4(n - 1)


num = int(input('Please enter the number whose factorial you want:'))  # input() returns a string, so convert it to int first
print('The factorial of %s is: %s' % (num, func4(num)))

Chapter 9 modules

Definition of module

Rename module

Rename the long module name to simplify the name.

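import time as t  # Rename module

t.time()  # Call the time() function in the time module

Importing a name so the module name need not be written:

from time import sleep  # Not recommended, because the imported name may clash with other names

sleep(1)  # sleep() requires the number of seconds as an argument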

Custom module and its call

Own module:

def print_me():
    print('me')

# A module rarely calls print_me() itself. It usually only contains the function definitions

Calling module:

import mymod  # Do not add the .py suffix when importing

mymod.print_me()  # Call the function defined in that file

exercises

Exercise 1 modules

  1. Import the os module and use help(os) to view the help documentation for the os module

code:

# 1. Import the os module and use help(os) to view the help documentation for the os module
import os

help(os)  # help() prints the documentation itself, so wrapping it in print() is not needed

Chapter 10 coding conventions

PEP8 coding specification

Relevant instructions in class:

https://www.python.org/dev/peps/pep-0008/



Installing autopep8 for PyCharm
In a cmd window, enter: pip install autopep8
Tools→External Tools→Click the plus sign

Name: Autopep8 (any name will do)
- Tools settings:
    - Programs: `autopep8` (if you have installed it)
    - Parameters:`--in-place --aggressive --aggressive $FilePath$`
    - Working directory:`$ProjectFileDir$`
- Click Output Filters→Add, and in the dialog box, under "Regular expression to match output", enter:`$FILE_PATH$\:$LINE$\:$COLUMN$\:.*`

If unsuccessful, please refer to the following:
pycharm setting autopep8

Chapter 11 object oriented programming

Classes and instances

Procedure-oriented programming

characteristic:

  • Write corresponding functions from top to bottom according to the sequence of program execution

code:

# Procedure-oriented
user1 = {'name': 'tom', 'hp': 100}
user2 = {'name': 'jerry', 'hp': 80}


def print_role(rolename):
    print('name is %s , hp is %s' % (rolename['name'], rolename['hp']))


print_role(user1)

object-oriented programming

definition:
The features shared by different objects are extracted, and the result of that abstraction is called a class.

code:

# object-oriented
class Player():  # Define a class. The class name should start with an uppercase letter
    def __init__(self, name, hp):  # __init__ is a special method that runs automatically when the class is instantiated
        self.name = name  # self refers to the instance created from the Player class, similar to this in Java
        self.hp = hp

    def print_role(self):  # Define a method
        print('%s: %s' % (self.name, self.hp))


user1 = Player('tom', 100)  # Class instantiation
user2 = Player('jerry', 90)
user1.print_role()
user2.print_role()

# Note: in a class, the first parameter of every method receives the instance and is named self by convention

Note:

  • In a class, the first parameter of every method receives the instance and is named self by convention
  • Class names should start with an uppercase letter

How to add class properties and methods

Add a property and method

code:

# object-oriented
class Player(): 
    def __init__(self, name, hp, occu): 
        self.name = name 
        self.hp = hp  
        self.occu = occu  # Add an attribute

    def print_role(self): 
        print('%s: %s %s' % (self.name, self.hp, self.occu))

    def updateName(self, newname):  # Add a method that renames the player
        self.name = newname

user1 = Player('tom', 100, 'war')  # Class instantiation
user2 = Player('jerry', 90, 'master')
user1.print_role()
user2.print_role()

user1.updateName('wilson')
user1.print_role()


class Monster():
    'Define monster class'
    pass

Class encapsulation

Encapsulation is used when you do not want the attributes of a class to be accessed from outside.

  • Prefix the attribute name with two underscores "__"

code:

# object-oriented
class Player(): 
    def __init__(self, name, hp, occu): 
        self.__name = name 
        self.hp = hp  
        self.occu = occu  # Add an instance attribute

    def print_role(self): 
        print('%s: %s %s' % (self.__name, self.hp, self.occu))

    def updateName(self, newname):  # Add a method that renames the player
        self.__name = newname

user1 = Player('tom', 100, 'war')  # Class instantiation
user2 = Player('jerry', 90, 'master')
user1.print_role()
user2.print_role()

user1.updateName('wilson')
user1.print_role()

result:

tom: 100 war
jerry: 90 master
wilson: 100 war

In this way, the attribute can no longer be read or changed directly from outside the class as user1.__name.
It can only be changed through the methods of the class.
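The effect of the double underscore can be sketched as follows. This is a minimal illustration, assuming a reduced Player class; the get_name method is hypothetical and not part of the notes. Python renames __name to _Player__name internally (name mangling), so direct access from outside fails, although the mangled name remains reachable.

```python
class Player:
    def __init__(self, name, hp):
        self.__name = name  # stored internally as _Player__name
        self.hp = hp

    def get_name(self):  # hypothetical accessor, added for illustration
        return self.__name


user1 = Player('tom', 100)

try:
    print(user1.__name)  # fails: no attribute called __name exists on the instance
except AttributeError as err:
    print('AttributeError:', err)

print(user1.get_name())     # access through a method works
print(user1._Player__name)  # the mangled name still works, but using it is discouraged
```

So the double underscore is a strong hint rather than a hard lock: it prevents accidental access, not deliberate access.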

Class inheritance

inherit

code:

# A cat inherits the methods of the cat family
# The cat family is called the parent class of the cat, and the cat is called a child class of the cat family

class Monster():
    'Define monster class'

    def __init__(self, hp=100):  # hp (hit points) gets a default value at initialization
        self.hp = hp

    def run(self):
        print('Move to a location')


class Animals(Monster):  # A subclass names its parent class in parentheses
    'Ordinary monster'

    def __init__(self, hp=10):
        self.hp = hp


class Boss(Monster):
    'Boss monster class'
    pass


# Parent class
a1 = Monster(200)
print(a1.hp)
a1.run()

# Subclass
a2 = Animals(1)
print(a2.hp)
a2.run()

Subclasses can use the attributes and methods defined in the parent class.

super

effect:
super() calls a method of the parent class, so an attribute that the parent class already initializes does not have to be initialized again in the subclass.

code:

class Animals(Monster):  # A subclass names its parent class in parentheses
    'Ordinary monster'
    def __init__(self,hp=10):
        super().__init__(hp)  # Animals no longer sets hp itself; the parent class __init__ does it

Subclass method and parent method have the same name

When a method is called on a subclass instance, the subclass method overrides the parent class method with the same name.

Polymorphism

Which version of a same-named method runs (the subclass one or the parent one) is only decided when the method is actually called, based on the class of the object. The same method name can therefore behave in several ways at runtime. This feature is called "polymorphism".
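A small sketch of overriding and polymorphism, assuming simplified versions of the Monster classes from above. Here run() returns a string instead of printing, and describe() is a hypothetical helper; both are choices made only for this illustration.

```python
class Monster:
    def __init__(self, hp=100):
        self.hp = hp

    def run(self):
        return 'Monster moves to a location'


class Animals(Monster):
    def run(self):  # overrides Monster.run
        return 'Animal runs away'


class Boss(Monster):
    pass  # no override, so Monster.run is inherited


def describe(monster):
    # Which run() executes is decided at call time from the object's class
    return monster.run()


for m in (Monster(), Animals(), Boss()):
    print(type(m).__name__, '->', describe(m))
```

describe() never checks the type of its argument; the same call produces different behavior depending on the object it receives.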

Judge the inheritance relationship of classes

code:

# Judge the inheritance relationship of classes
a3 = Boss(800)  # a3 is not defined in the snippets above; a Boss instance is assumed here
print('The type of a1 is %s' % type(a1))
print('The type of a2 is %s' % type(a2))
print('The type of a3 is %s' % type(a3))

print(isinstance(a2, Monster))  # True if a2 is an instance of Monster or of one of its subclasses, otherwise False

result:

The type of a1 is <class '__main__.Monster'>
The type of a2 is <class '__main__.Animals'>
The type of a3 is <class '__main__.Boss'>
True

Additional knowledge

Tuples, lists, strings and other built-in types are also classes

# Tuples, lists, strings and other built-in types are also classes
print(type(tuple))
print(type(list))
print(type('123'))

result:

<class 'type'>
<class 'type'>
<class 'str'>

Everything is an instance of object, the base class of all classes

# Everything is an instance of object, the base class of all classes
print(isinstance(tuple, object))
print(isinstance(list, object))
print(isinstance('123', object))

result:

True
True
True

summary

  1. A class describes a group of objects that share the same attributes and methods;
  2. The three core features are encapsulation, inheritance and polymorphism;
  3. A class needs to be instantiated before it can be used.

Class - Custom with statement

Function:

  • Automatic exception handling
  • Exception and object-oriented

Custom with method:
code:

class Testwith():
    def __enter__(self):  # Called when the with block is entered
        print('run')

    def __exit__(self, exc_type, exc_val, exc_tb):  # Called when the with block is left
        if exc_tb is None:  # exc_tb is None when no exception occurred; test this with "is None"
            print('Normal end')
        else:
            print('has error %s' % exc_tb)


# Use the class in a with statement and raise an exception
# with simplifies the code that handles exceptions
with Testwith():
    print('Test is running')
    raise NameError('testNameError')  # Throw exception manually

result:

run
Test is running
has error <traceback object at 0x000002016D762B80>
Traceback (most recent call last):
  File "D:\Python\pythonProject\with_test.py", line 15, in <module>
    raise NameError('testNameError')  # Throw exception manually
NameError: testNameError

Key point:

  • with can simplify exception-handling code (try... except... finally...)
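As a sketch of how far this can go: when __exit__ returns True, Python treats the exception as handled and the program continues after the with block. The class name SuppressNameError is hypothetical and chosen only for this example.

```python
class SuppressNameError:  # hypothetical name, for illustration only
    def __enter__(self):
        print('run')
        return self  # the value bound by "with ... as x"

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type is None:
            print('Normal end')
            return False
        print('has error: %s' % exc_val)
        # Returning True tells Python the exception has been handled;
        # here only NameError (and its subclasses) is suppressed
        return issubclass(exc_type, NameError)


with SuppressNameError():
    raise NameError('testNameError')

print('the program continues')
```

Any other exception type still propagates, because __exit__ returns False for it.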

Chapter 12: multithreaded programming

Definition of multithreaded programming

Process:
A program in the state of being executed.

Multithreaded programming:
When a large number of requests arrive at the same time, all of them need to be handled. Multithreaded programming is one way to handle them concurrently.

Running without threads

code:

def myThread(arg1, arg2):
    print('%s %s' % (arg1, arg2))

for i in range(1, 6, 1):
    t1 = myThread(i, i + 1)  # An ordinary function call, no thread involved

result:

1 2
2 3
3 4
4 5
5 6

Running with multiple threads

code:

import threading


def myThread(arg1, arg2):
    print('%s %s' % (arg1, arg2))

for i in range(1, 6, 1):
    # t1 = myThread(i, i + 1)
    t1 = threading.Thread(target=myThread, args=(i, i + 1))
    t1.start()  # Start the thread

result:

1 2
2 3
3 4
4 5
5 6

Using sleep() to pause each thread:
code:

import threading
import time


def myThread(arg1, arg2):
    print('%s %s' % (arg1, arg2))
    time.sleep(1)

for i in range(1, 6, 1):
    # t1 = myThread(i, i + 1)
    t1 = threading.Thread(target=myThread, args=(i, i + 1))
    t1.start()  # Start the thread

result:

1 2
2 3
3 4
4 5
5 6

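The program prints all five lines almost at once and then waits about 1 s before it exits. The five sleeps overlap instead of adding up to 5 s, which shows that the threads run concurrently.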

Display the current thread running status:

import threading
import time
from threading import current_thread  # Returns the thread that is currently running


def myThread(arg1, arg2):
    print(current_thread().getName(), 'start')  # First argument: the name of the current thread; second: a label of your choice
    print('%s %s' % (arg1, arg2))
    time.sleep(1)  # The threads do not wait for each other's 1 s sleep, which shows that they run concurrently
    print(current_thread().getName(), 'stop')


for i in range(1, 6, 1):
    # t1 = myThread(i, i + 1) # No thread
    t1 = threading.Thread(target=myThread, args=(i, i + 1))  # target: the function to run; args: its arguments. The loop creates 5 new threads
    t1.start()  # Start the thread

print(current_thread().getName(), 'end')
# The main thread prints "MainThread end" first
# Thread-1 to Thread-5 print "stop" afterwards, once their sleep is over

result:

Thread-1 start
1 2
Thread-2 start
2 3
Thread-3 start
3 4
Thread-4 start
4 5
Thread-5 start
5 6
MainThread end
Thread-1 stop
Thread-3 stop
Thread-2 stop
Thread-4 stop
Thread-5 stop

The main thread reaches its last line first and prints "MainThread end"; the five worker threads print "stop" only after their sleep is over.

Synchronization between threads

One thread can wait for another thread to finish by calling join() on it.

code:

# Make "stop" appear before "end"
import threading
from threading import current_thread  # Returns the thread that is currently running


# threading.Thread().run()  # run() is the function that a thread executes
# Inherit from threading.Thread and override run() -- polymorphism

class Mythread(threading.Thread):  # Name the parent class itself; do not call it with ()
    def run(self):  # Override the run() method
        # Print the name of the current thread to see which thread is running
        print(current_thread().getName(), 'start')
        print('run')
        print(current_thread().getName(), 'stop')


t1 = Mythread()
t1.start()
# join() makes the main thread wait, so the thread ends first and main ends later
t1.join()

print(current_thread().getName(), 'end')  # Printed by the main thread

result:

Thread-1 start
run
Thread-1 stop
MainThread end
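The same idea extends to several threads. The following sketch assumes a hypothetical worker function and a shared results list protected by a lock; the main thread joins every worker before it reads the results.

```python
import threading
import time

results = []
results_lock = threading.Lock()


def worker(n):  # hypothetical worker, for illustration only
    time.sleep(0.1)  # simulate some work
    with results_lock:  # protect the shared list
        results.append(n * n)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # block until this thread has finished

print(sorted(results))
print(threading.current_thread().name, 'end')
```

Starting all threads first and joining them afterwards keeps the work concurrent; joining inside the first loop would run the threads one after another.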

Classic producer and consumer problem

While a program runs, some threads (producers) continuously generate data and other threads (consumers) use that data up. Coordinating the two sides is the producer-consumer problem. (Compare filling and draining a water tank.)

queue

A queue passes data safely between different threads.

code:

# Using a queue
import queue

q = queue.Queue()  # Create a queue
q.put(1)  # Add an item to the queue
q.put(2)
q.put(3)
q.get()  # Items are read in the order they were added (first in, first out), so this returns 1
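A short sketch of two properties the producer-consumer code relies on: the first-in, first-out order and the size limit of a bounded queue. The item values are made up for the example.

```python
import queue

q = queue.Queue(maxsize=2)  # bounded queue that holds at most 2 items
q.put('first')
q.put('second')
print(q.full())

try:
    q.put_nowait('third')  # put() would block here; put_nowait() raises instead
except queue.Full:
    print('queue is full')

items = [q.get(), q.get()]  # items come out in the order they went in
print(items)
print(q.empty())
```

With plain put() and get(), a full queue makes the producer wait and an empty queue makes the consumer wait, which is what keeps the two sides in step.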

code implementation

Code 1:

from threading import Thread, current_thread  # Threads let several producers and consumers run concurrently
import time  # For sleeping
import random  # For generating random data
from queue import Queue  # Import the queue class

queue = Queue(5)  # A queue that holds at most 5 items


class ProducerThread(Thread):
    def run(self):
        name = current_thread().getName()  # Get the name of the producer thread
        nums = range(100)
        global queue  # Use the module-level queue
        while True:
            num = random.choice(nums)  # Pick a number at random
            queue.put(num)  # Put the number into the queue (blocks while the queue is full)
            print('producer %s Production data %s' % (name, num))
            t = random.randint(1, 3)  # Random sleep time
            time.sleep(t)  # Put the producer to sleep
            print('producer %s Sleep %s second' % (name, t))


class ConsumerThread(Thread):
    def run(self):
        name = current_thread().getName()  # Get the name of the consumer thread
        global queue

        while True:
            num = queue.get()  # Take the next number from the queue (blocks while the queue is empty)
            queue.task_done()  # Tell the queue that this item has been processed
            print('consumer %s Consumed data %s' % (name, num))
            t = random.randint(1, 5)  # Random sleep time
            time.sleep(t)
            print('consumer %s Sleep %s second' % (name, t))


# One producer and one consumer
p1 = ProducerThread(name='p1')
p1.start()
c1 = ConsumerThread(name='c1')
c1.start()

result:

producer p1 Production data 74
consumer c1 Consumed data 74
producer p1 Sleep 1 second
producer p1 Production data 82
producer p1 Sleep 1 second
producer p1 Production data 67
consumer c1 Sleep 3 second
consumer c1 Consumed data 82
producer p1 Sleep 1 second
producer p1 Production data 28
producer p1 Sleep 1 second
producer p1 Production data 75

Code 2:

from threading import Thread, current_thread  # Threads let several producers and consumers run concurrently
import time  # For sleeping
import random  # For generating random data
from queue import Queue  # Import the queue class

queue = Queue(5)  # A queue that holds at most 5 items


class ProducerThread(Thread):
    def run(self):
        name = current_thread().getName()  # Get the name of the producer thread
        nums = range(100)
        global queue  # Use the module-level queue
        while True:
            # Add a random number to the queue, then sleep for a random time
            num = random.choice(nums)  # Pick a number at random
            queue.put(num)  # Put the number into the queue (blocks while the queue is full)
            print('producer %s Production data %s' % (name, num))
            t = random.randint(1, 3)  # Random sleep time
            time.sleep(t)  # Put the producer to sleep
            print('producer %s Sleep %s second' % (name, t))


class ConsumerThread(Thread):
    def run(self):
        name = current_thread().getName()  # Get the name of the consumer thread
        global queue

        while True:
            # Take a number from the queue, then sleep for a random time
            num = queue.get()  # Take the next number from the queue (blocks while the queue is empty)
            queue.task_done()  # Tell the queue that this item has been processed
            print('consumer %s Consumed data %s' % (name, num))
            t = random.randint(1, 5)  # Random sleep time
            time.sleep(t)
            print('consumer %s Sleep %s second' % (name, t))


# # One producer and one consumer (slow production and fast consumption)
# p1 = ProducerThread(name='p1')
# p1.start()
# c1 = ConsumerThread(name='c1')
# c1.start()

# Three producers and two consumers (fast production and slow consumption)
p1 = ProducerThread(name='p1')
p1.start()
p2 = ProducerThread(name='p2')
p2.start()
p3 = ProducerThread(name='p3')
p3.start()
c1 = ConsumerThread(name='c1')
c1.start()
c2 = ConsumerThread(name='c2')
c2.start()
# When the queue is full, the producers stop producing and resume only after a consumer has taken an item

result:

producer p1 Production data 70
producer p2 Production data 33
producer p3 Production data 95
consumer c1 Consumed data 70
consumer c2 Consumed data 33
consumer c1 Sleep 1 second
consumer c1 Consumed data 95
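The programs above loop forever. A terminating variant can be sketched as follows, assuming one producer, one consumer and a sentinel value (None) that signals "no more data". The names SENTINEL, work, producer and consumer are hypothetical and not part of the notes.

```python
import threading
from queue import Queue

SENTINEL = None  # hypothetical marker meaning "no more data"
work = Queue(maxsize=5)
consumed = []


def producer(count):
    for num in range(count):
        work.put(num)  # blocks while the queue is full
    work.put(SENTINEL)  # tell the consumer to stop


def consumer():
    while True:
        num = work.get()  # blocks while the queue is empty
        if num is SENTINEL:
            break
        consumed.append(num)


p = threading.Thread(target=producer, args=(20,))
c = threading.Thread(target=consumer)
p.start()
c.start()
p.join()
c.join()
print(consumed == list(range(20)))
```

With several consumers, one sentinel per consumer would be needed; this sketch deliberately stays with a single pair to keep the order of the data predictable.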

Chapter 13 standard library

Definition of Python standard library

Python standard library official documentation

Key points:

Regular expression library re

matching

code:

import re

# matching
p = re.compile('a')  # Compile the pattern to match
print(p.match('a'))  # The string matches the pattern
print(p.match('b'))  # No match, so None is returned

# Match strings that follow a regular pattern
# Special characters that express rules such as repetition are called "metacharacters"
p = re.compile('cat')
print(p.match('caaaaat'))  # Failed to match, None
# The advantage of regular expressions: special symbols express special functions
p = re.compile('ca*t')  # * lets the preceding a repeat zero or more times
print(p.match('caaaaat'))  # Match successful

result:

<re.Match object; span=(0, 1), match='a'>
None
None
<re.Match object; span=(0, 7), match='caaaaat'>

Metacharacters of regular expressions

Common metacharacters

Metacharacter   Function
.               Matches any single character
^               Matches the start of the string
$               Matches the end of the string
*               Matches the preceding character zero or more times
+               Matches the preceding character one or more times
?               Matches the preceding character zero or one time
{m}             Matches the preceding character exactly m times
{m,n}           Matches the preceding character m to n times
[]              Matches any one of the characters inside the brackets
|               Matches either the left or the right alternative; often used with ()
\d              Matches a single digit, equivalent to [0-9]; use \d+ for a run of digits
\D              Matches a single character that is not a digit
\s              Matches a single whitespace character (space, tab, newline)
()              Group
^$              Matches an empty line, that is, a line that contains nothing
.*?             Non-greedy mode: matches as few characters as possible

. metacharacter

Function:
Match any single character

code:

import re

p = re.compile('.')
print(p.match('c'))
print(p.match('d'))

# Match three characters
p = re.compile('...')
print(p.match('abc'))

result:

<re.Match object; span=(0, 1), match='c'>
<re.Match object; span=(0, 1), match='d'>
<re.Match object; span=(0, 3), match='abc'>

^ and $ metacharacters

Function:
^: anchors the pattern to the start of the string
$: anchors the pattern to the end of the string

code:

# ^ says what the string must start with, $ says what it must end with
# match() always starts at the beginning of the string; search() scans the string for the first match
import re

p = re.compile('^jpg')  # Match strings starting with jpg
print(p.match('jpg'))
p = re.compile('jpg$')  # Match strings ending in jpg, for example file names with the "jpg" extension
print(p.match('jpg'))

result:

<re.Match object; span=(0, 3), match='jpg'>
<re.Match object; span=(0, 3), match='jpg'>
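Because match() always starts at the beginning of the string, the anchors show their effect more clearly with search(). A small sketch with made-up file names:

```python
import re

names = ['photo.jpg', 'jpg_notes.txt', 'scan.jpg.bak']

ends_with_jpg = re.compile(r'jpg$')
starts_with_jpg = re.compile(r'^jpg')

# search() scans the whole string, so the anchors do the real work here
jpg_files = [n for n in names if ends_with_jpg.search(n)]
jpg_prefixed = [n for n in names if starts_with_jpg.search(n)]

print(jpg_files)
print(jpg_prefixed)
```

'scan.jpg.bak' contains jpg but neither starts nor ends with it, so it appears in neither list.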

* metacharacter

Function:
Matches the preceding character zero or more times.

code:

# * matches the preceding character zero or more times
import re

p = re.compile('ca*t')
print(p.match('ct'))  # a occurs 0 times - success
print(p.match('caaaaaat'))  # a occurs several times - success

result:

<re.Match object; span=(0, 2), match='ct'>
<re.Match object; span=(0, 8), match='caaaaaat'>

+ and ? metacharacters

Function:
+: matches the preceding character one or more times
?: matches the preceding character zero or one time

code:

# + matches the preceding character one or more times
# ? matches the preceding character zero or one time
import re

p = re.compile('c?t')
print(p.match('t'))  # c occurs 0 times
print(p.match('ct'))  # c occurs once

q = re.compile('c+t')
print(q.match('ct'))  # c occurs once
print(q.match('cccct'))  # c occurs more than once

result:

<re.Match object; span=(0, 1), match='t'>
<re.Match object; span=(0, 2), match='ct'>
<re.Match object; span=(0, 2), match='ct'>
<re.Match object; span=(0, 5), match='cccct'>

{m} metacharacter

Function:
Matches the preceding character exactly m times

code:

import re

p = re.compile('ca{4}t')  # a must occur exactly 4 times
print(p.match('caaaat'))

result:

<re.Match object; span=(0, 6), match='caaaat'>

{m,n} metacharacter

Function:
Matches the preceding character m to n times

code:

import re

p = re.compile('ca{4,6}t')  # a must occur 4 to 6 times
print(p.match('caaaaat'))

result:

<re.Match object; span=(0, 7), match='caaaaat'>

[] metacharacter

Function:
Matches any one of the characters inside the brackets.

code:

# [] matches any one of the characters inside the brackets
import re

p = re.compile('c[bcd]t')  # Any one of b, c or d matches
print(p.match('cat'))  # Matching failed
print(p.match('cbt'))  # Match successful
print(p.match('cct'))  # Match successful
print(p.match('cdt'))  # Match successful

result:

None
<re.Match object; span=(0, 3), match='cbt'>
<re.Match object; span=(0, 3), match='cct'>
<re.Match object; span=(0, 3), match='cdt'>

| metacharacter

Function:
Matches either the expression on its left or the one on its right. It is often used together with parentheses ().
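A minimal sketch of alternation, with and without parentheses (the sample words are made up for the example):

```python
import re

p = re.compile(r'c(a|o)t')  # 'c', then 'a' or 'o', then 't'
results = [bool(p.match(s)) for s in ('cat', 'cot', 'cut')]
print(results)

q = re.compile(r'cat|dog')  # without parentheses, | splits the whole pattern
print(q.match('dog'))
print(q.match('cow'))
```

The parentheses limit how far the alternatives reach: in the first pattern only the middle letter varies, in the second the entire word does.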

\d metacharacter

Function:
Matches a single digit, equivalent to [0-9]. To match a run of digits, write \d+ or [0-9]+

\D metacharacter

Function:
Matches a single character that is not a digit

\s metacharacter

Function:
Matches a single whitespace character (space, tab, newline, etc.). To match the letters a-z, use [a-z] instead
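The three character classes can be compared on one sample string. This is a small sketch; the text value is made up for the example.

```python
import re

text = 'tel: 010 1234'

digits = re.findall(r'\d', text)       # every single digit
digit_runs = re.findall(r'\d+', text)  # runs of consecutive digits
non_digits = re.findall(r'\D+', text)  # runs of characters that are not digits
spaces = re.findall(r'\s', text)       # every whitespace character

print(digits)
print(digit_runs)
print(non_digits)
print(spaces)
```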

() metacharacter

Function:
Group

code:

# () group
# Extract a date (yyyy-mm-dd)
# 2018-03-04
# (2018)-(03)-(04)
# Calling .group(n) on the match of (2018)-(03)-(04) extracts one of the groups

# Match strings that differ but look very similar
# 2018-03-04
# 2018-04-12
# To extract 2018-03 or 2018-04
# use (03|04), which matches only 03 or 04
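The comments above can be turned into a runnable sketch. The exact pattern, with {4} and {2} as digit counts, is an assumption made for this example.

```python
import re

p = re.compile(r'(\d{4})-(03|04)-(\d{2})')  # year, month 03 or 04, day
dates = ['2018-03-04', '2018-04-12', '2018-05-10']

for d in dates:
    m = p.match(d)
    print(d, m.groups() if m else None)
```

Only the first two dates match; for them, group(2) holds the month.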

^$ metacharacter

Function:
Matches an empty line: when text is matched line by line, a line that contains nothing at all
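A small sketch that counts the empty lines in a text by testing each line against ^$ (the sample text is made up):

```python
import re

text = 'first line\n\nsecond line\n\n\nthird line'

blank = re.compile(r'^$')
blank_lines = [i for i, line in enumerate(text.splitlines()) if blank.match(line)]

print(blank_lines)       # positions of the empty lines
print(len(blank_lines))  # how many lines are empty
```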

.*? metacharacter

Function:
Non-greedy mode: match as few characters as possible

code:

# .*?  Non-greedy mode
# Greedy mode
# abcccccd
# abc*  # Matches every c before the d: a greedy match is as long as possible

# Non-greedy mode
# abcccccd
# abc*?
# Matches as little as possible. This is very common when matching web page content such as
# <img   /img>
# <img   /img>
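The difference can be checked directly. This is a small sketch; the html string is made up for the example.

```python
import re

# Greedy versus non-greedy on the string from the comments above
greedy_c = re.match(r'abc*', 'abcccccd').group()
lazy_c = re.match(r'abc*?', 'abcccccd').group()
print(greedy_c, lazy_c)

# The typical web page case: two tags on one line
html = '<img src="a.png"><img src="b.png">'
greedy_tags = re.findall(r'<img.*>', html)
lazy_tags = re.findall(r'<img.*?>', html)
print(greedy_tags)
print(lazy_tags)
```

The greedy pattern swallows both tags as one match, while the non-greedy pattern stops at the first > and returns each tag separately.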

Regular expression grouping function example

code:

# Using regular expressions to implement grouping
# Check whether the characters that appear match the pattern we want
import re

p = re.compile('.{3}')  # Match any three characters, equivalent to '...'
print(p.match('bat'))

# Match a year-month-day date, then take out the year, month and day
# q = re.compile('....-..-..')
# To require digits, possibly several in a row:
# q = re.compile('\d-\d-\d')  # Without the r prefix, '\d' is an invalid string escape that newer Python versions warn about
q = re.compile(r'\d-\d-\d')  # r marks a raw string: backslashes reach the re module unchanged
q = re.compile(r'(\d)-(\d)-(\d)')  # () groups the parts you want to take out
q = re.compile(r'(\d+)-(\d+)-(\d+)')  # + lets each group match one or more consecutive digits (both 05 and 5 match)

print(q.match('2018-05-10'))
print(q.match('2018-05-10').group())  # group() returns the whole match
print(q.match('2018-05-10').group(1))  # group(1) returns the content of the first pair of parentheses
print(q.match('2018-05-10').group(2))  # group(2) returns the content of the second pair of parentheses
print(q.match('2018-05-10').groups())  # groups() returns all groups as a tuple
year, month, day = q.match('2018-05-10').groups()  # Assign the groups to variables
print(year, month, day)

# A raw string keeps special sequences such as \n from being escaped
print('\nx\n')
print(r'\nx\n')

result:

<re.Match object; span=(0, 3), match='bat'>
<re.Match object; span=(0, 10), match='2018-05-10'>
2018-05-10
2018
05
('2018', '05', '10')
2018 05 10

x

\nx\n


The difference between regular expression library function match and search

match

code:

# match
# match() only matches at the start of the string, so you need to know exactly what form the string takes before matching
import re

p = re.compile(r'(\d+)-(\d+)-(\d+)')
print(p.match('aa2018-05-10bb').group(2)) # Matching failed, unable to group, and then unable to match
print(p.match('2018-05-10').group())

result:

Traceback (most recent call last):
  File "D:\Python\pythonProject\43.py", line 129, in <module>
    print(p.match('aa2018-05-10bb').group(2)) # Matching failed, unable to group, and then unable to match
AttributeError: 'NoneType' object has no attribute 'group'
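The AttributeError above occurs because match() returns None when nothing matches. A defensive sketch checks the result before calling group(); the helper name month_of is hypothetical.

```python
import re

p = re.compile(r'(\d+)-(\d+)-(\d+)')


def month_of(text):  # hypothetical helper, for illustration only
    m = p.match(text)
    if m is None:  # match() returns None when the string does not start with the pattern
        return None
    return m.group(2)


print(month_of('2018-05-10'))
print(month_of('aa2018-05-10bb'))
```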

search

code:

import re

p = re.compile(r'(\d+)-(\d+)-(\d+)')
# search
# A partial match is enough: the pattern does not have to cover the whole input
print(p.search('aa2018-05-10bb'))  # search() scans the string until it finds a match, so any string that contains the pattern matches

result:

<re.Match object; span=(2, 12), match='2018-05-10'>

Typical uses of match and search

  • search is typically used to find a pattern anywhere inside a string
  • match is typically used when the string is known to start with the pattern, often followed by grouping

An instance of the regular expression library replacement function sub()

Function:
Replace the string.

code:

# sub
# Function: string replacement
# sub(arg1, arg2, arg3)  arg1: the pattern to match (may use metacharacters)  arg2: the replacement text  arg3: the string to process

# Remove the '#' and everything after it
import re
phone = '123-456-789 # This is the phone number '
p2 = re.sub(r'#.*$', '', phone)  # arg1: '#' followed by any characters up to the end  arg2: replace with an empty string  arg3: the string phone
print(p2)
# Remove the - characters in the middle
p3 = re.sub(r'\D', '', p2)  # Replace every non-digit character with an empty string
print(p3)

result:

123-456-789 
123456789

Key points:

  • match and search return only the first match
  • findall returns every match
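A short sketch of the difference, using a made-up text that contains two dates:

```python
import re

p = re.compile(r'(\d+)-(\d+)-(\d+)')
text = 'start 2018-05-10, end 2018-06-01'

first = p.search(text).group()  # only the first match
all_groups = p.findall(text)    # with groups in the pattern, findall returns tuples of groups
all_dates = [m.group() for m in p.finditer(text)]  # whole matches, one per occurrence

print(first)
print(all_groups)
print(all_dates)
```

When the pattern contains groups, findall() returns the groups rather than the whole match; finditer() is used here to get the complete date strings.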

Date and time library

time module

effect:
Viewing the date and time.

code:

import time

# Viewing the date and time
print(time.time())  # Seconds since January 1, 1970
print(time.localtime())  # Local date and time as a struct_time
print(time.strftime('%Y-%m-%d %H:%M:%S'))  # Output the date and time in a custom format
print(time.strftime('%Y%m%d'))

result:

1612482375.7951758
time.struct_time(tm_year=2021, tm_mon=2, tm_mday=5, tm_hour=7, tm_min=46, tm_sec=15, tm_wday=4, tm_yday=36, tm_isdst=0)
2021-02-05 07:46:15
20210205

datetime module

effect:
Modification of date and time.

code:

# datetime
# 1. Modifying the date and time
# Get the time 10 minutes from now
import datetime

print(datetime.datetime.now())  # Current time
newtime = datetime.timedelta(minutes=10)  # A timedelta of 10 minutes -- the offset
print(datetime.datetime.now() + (newtime))  # Current time + offset
# 2. Get the date a given number of days after a specified date
# 10 days after May 27, 2008
one_day = datetime.datetime(2008, 5, 27)
new_date = datetime.timedelta(days=10)
print(one_day + new_date)

result:

2021-02-05 08:48:36.233056
2021-02-05 08:58:36.233056
2008-06-06 00:00:00
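The same arithmetic also works in the other direction: subtracting two datetime values gives a timedelta. A small sketch, assuming the date arrives as text and is parsed with strptime (the variable names are made up):

```python
import datetime

start = datetime.datetime.strptime('2008-05-27', '%Y-%m-%d')  # parse text into a datetime
end = start + datetime.timedelta(days=10)

gap = end - start  # subtracting two datetimes gives a timedelta
print(end.strftime('%Y-%m-%d'))
print(gap.days)
```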

Math-related libraries

random

Function:
Pick random values within given limits

code:

import random

# Pick random values within given limits

r = random.randint(1, 5)  # A random integer from 1 to 5, both ends included
print(r)

s = random.choice(['aa', 'bb', 'cc'])  # A random element of the list
print(s)

result:

1
aa

Use the command line to manipulate files and folders

File and directory operation Library

os.path Library

Function:
File and directory access.

code:

# File and directory access
import os

# 1. Get the absolute path of the current directory from the relative path '.'
jd = os.path.abspath('.')
print(jd)

# Get the absolute path of the parent directory from the relative path '..'
last_jd = os.path.abspath('..')
print(last_jd)

# 2. Check whether the path exists
isExist = os.path.exists('/Python')  # The path to check goes in parentheses
print(isExist)

# 3. Check whether it is a file
isFile = os.path.isfile('/Python')
print(isFile)

# Check whether it is a directory
isDir = os.path.isdir('/Python')
print(isDir)

# 4. Join paths
dirJoint = os.path.join('/tmp/a/', 'b/c')  # The second argument is the path to append
print(dirJoint)

result:

D:\Python\pythonProject
D:\Python
True
False
True
/tmp/a/b/c

pathlib Library

code:

from pathlib import Path

# 1. Get the absolute path corresponding to the relative path '.'
p = Path('.')  # Wrap '.' in a Path object
print(p.resolve())  # The absolute path of the relative path, equivalent to os.path.abspath('.')

# 2. List all directories under the current path -- list comprehension

# 3. Determine whether it is a directory
p.is_dir()

# 4. Create a new directory (key point)
q = Path('/Python/a/b/c/d/e')  # Make sure the path is written with forward slashes '/'

q.mkdir(parents=True)  # Create the directory
# parents=True creates missing parent directories automatically; parents=False does not

q.rmdir()  # Delete the specified directory. A non-empty directory cannot be deleted

result:

D:\Python\pythonProject
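Step 2 in the comments (listing directories with a list comprehension) has no code in the notes. A possible sketch follows, assuming a temporary directory so that nothing is left behind on disk; the directory and file names are made up for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    nested = base / 'a' / 'b' / 'c'  # the / operator joins path parts
    nested.mkdir(parents=True)       # also creates the missing parents a and a/b
    (base / 'note.txt').write_text('hello')

    # List only the directories directly under base (list comprehension)
    subdirs = [x.name for x in base.iterdir() if x.is_dir()]
    print(subdirs)

    nested.rmdir()  # works because the directory is empty
    removed = not nested.exists()
    print(removed)
```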


Chapter 14 machine learning library

General process of machine learning and NumPy installation

General processing steps

Data collection:

  1. Questionnaires
  2. Collecting information from the network

Data preprocessing:

  1. Unit unification
  2. Format adjustment

Data cleaning:

  1. Remove missing values and outliers from the data
  2. Obtain high-quality data

Data modeling:

  1. Design an algorithm that suits what you want to do
  2. Feed the data to the machine
  3. The machine obtains results from the data and the algorithm
  4. If the results pass testing, the algorithm is feasible and the model is established

Data test:

  1. The model is then used for tasks such as autonomous driving, image classification and speech prediction

NumPy Library

Function:
Data preprocessing.

Array and data type of NumPy


code:

import numpy as np

# The dtype is chosen automatically from the type of the input data
arr1 = np.array([2, 3, 4])  # Create an array from a list
print(arr1)  # A NumPy array; arithmetic on it is much faster than on a plain Python list
print(arr1.dtype)  # integer

arr2 = np.array([1.2, 2.3, 3.4])
print(arr2)
print(arr2.dtype)

# Arithmetic: element-wise addition of two arrays
print(arr1 + arr2)

result:

[2 3 4]
int32
[1.2 2.3 3.4]
float64
[3.2 5.3 7.4]

NumPy array and scalar calculation

code:

import numpy as np

# Scalar operation
arr = np.array([1.2, 2.3, 3.4])
print(arr * 10)

# Define a two-dimensional array (matrix)
data = [[1, 2, 3], [4, 5, 6]]  # Two lists nested in one list
# Convert the nested list to a two-dimensional NumPy array
arr2 = np.array(data)
print(arr2)
print(arr2.dtype)

# Create arrays filled with 0 or 1
one = np.zeros(10)  # A one-dimensional array of length 10, initialized to 0
print(one)

two = np.zeros((3, 5))  # A two-dimensional 3 × 5 array, initialized to 0
print(two)

# Make the array contents all 1
all_one = np.ones((4, 6))  # A two-dimensional 4 × 6 array, initialized to 1
print(all_one)

# Create an array without initializing its contents
all_empty = np.empty((2, 3, 2))  # A three-dimensional 2 × 3 × 2 array whose values are left uninitialized
print(all_empty)  # The values are arbitrary (whatever was in memory), so assign them before reading

result:

[12. 23. 34.]
[[1 2 3]
 [4 5 6]]
int32
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]
[[1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1.]]
[[[0. 0.]
  [0. 0.]
  [0. 0.]]

 [[0. 0.]
  [0. 0.]
  [0. 0.]]]

Index and slice of NumPy array

code:

# Slicing operation
import numpy as np

arr = np.arange(10)
print(arr)

print(arr[5])
print(arr[5:8])

# Assign directly to a slice (this changes the original array)
arr[5:8] = 10
print(arr)

# Assign to a copy so that the original array is not changed
arr_slice = arr[5:8].copy()  # copy

arr_slice[:] = 15  # [:] selects every element, from the first to the last

print(arr)
print(arr_slice)

result:

[0 1 2 3 4 5 6 7 8 9]
5
[5 6 7]
[ 0  1  2  3  4 10 10 10  8  9]
[ 0  1  2  3  4 10 10 10  8  9]
[15 15 15]
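The reason copy() is needed can be sketched as follows: a plain slice of a NumPy array is a view on the same memory, so writing to it changes the original array. The variable names view and copied are made up for the example.

```python
import numpy as np

arr = np.arange(10)

view = arr[5:8]  # a slice is a view: it shares memory with arr
view[:] = 10     # so this also changes arr

copied = arr[2:4].copy()  # copy() allocates separate memory
copied[:] = 99            # arr is not affected

print(arr)
print(copied)
print(np.shares_memory(arr, view), np.shares_memory(arr, copied))
```

This differs from Python lists, where slicing always produces a new list.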

pandas installation and Series structure

pandas function:
Data preprocessing and data cleaning.

code:

from pandas import Series, DataFrame
import pandas as pd

# Aligns and displays data automatically or in a customized way
# Handles missing data flexibly (for example, filling with the mean or a specified value)
# Supports join operations

# One-dimensional arrays in pandas
obj = Series([4, 5, 6, -7])  # A Series wraps a NumPy array
print(obj)
# Benefit: an index is added automatically, which makes the data easier to access

# Index values in pandas may repeat
print(obj.index)
# Take out the values
print(obj.values)

# Hashing (the keys of a dictionary cannot repeat)
# Hashing a simple value yields a fixed, more complex value
# 'a' --> 'asdfasdfasdfasd'  # The same hash algorithm always gives the same result for the same input
# Internal storage of hash values (hash table)
{'a': 1, 'b': 2, 'c': 3}
# 1. The keys a, b, c are mapped to hash values, which are stored in memory
# a -> asdhfljasdf
# b -> askldjfaisddf
# c -> dofjwoifjife
# Dictionary keys can be int, float, str, tuple, etc.
# Dictionary keys cannot be list or set, because these types are mutable
{['a']: 1}

result:

0    4
1    5
2    6
3   -7
dtype: int64
RangeIndex(start=0, stop=4, step=1)
[ 4  5  6 -7]
Traceback (most recent call last):
  File "D:\Python\pythonProject\pandas_test.py", line 29, in <module>
    {['a']: 1}
TypeError: unhashable type: 'list'
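Which values can serve as dictionary keys can be checked with the built-in hash(). A small sketch; the helper name is_hashable is hypothetical.

```python
def is_hashable(value):  # hypothetical helper, for illustration only
    try:
        hash(value)
    except TypeError:
        return False
    return True


candidates = [1, 2.5, 'a', ('a', 1), ['a'], {'a'}, {'a': 1}]
report = [(type(c).__name__, is_hashable(c)) for c in candidates]

for name, ok in report:
    print(name, ok)
```

The immutable types (int, float, str, tuple) are hashable; list, set and dict are mutable and are rejected as keys.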

Basic operation of Series

code:

import pandas as pd
from pandas import Series

# Operations on a one-dimensional array (Series) in pandas
obj2 = Series([4, 7, -5, 3], index=['d', 'b', 'c', 'a'])  # Manually specify the index
print(obj2)
# Assign a value by index label
obj2['c'] = 6
print(obj2)

# Check whether a label is in the index of the Series
print('a' in obj2)
print('f' in obj2)

# Convert a dictionary to a Series
sdata = {'beijing': 35000, 'shanghai': 71000, 'guangzhou': 16000, 'shenzhen': 5000}
obj3 = Series(sdata)
print(obj3)

# Modifying the index of a Series (the new labels must follow the existing order)
obj3.index = ['bj', 'sh', 'gz', 'sz']
print(obj3)

result:

d    4
b    7
c   -5
a    3
dtype: int64
d    4
b    7
c    6
a    3
dtype: int64
True
False
beijing      35000
shanghai     71000
guangzhou    16000
shenzhen      5000
dtype: int64
bj    35000
sh    71000
gz    16000
sz     5000
dtype: int64

Basic operations of Dataframe

code:

from pandas import Series, DataFrame  # Similar to spreadsheet

# Generate a DataFrame
# Pass in lists of equal length or numpy arrays
# Here a dictionary is used: each key is a column name and each value is a list of equal length
data = {'city': ['shanghai', 'shanghai', 'shanghai', 'beijing', 'beijing'],
        'year': [2016, 2017, 2018, 2017, 2018],
        'pop': [1.5, 1.7, 3.6, 2.4, 2.9]}

# Assign dictionary to DataFrame
frame = DataFrame(data)
print(frame)

# Arrange the columns in the specified order
frame2 = DataFrame(data, columns=['year', 'city', 'pop'])  # Display the columns as year, city, pop
print(frame2)

# Extract one-dimensional data (two-dimensional - > one-dimensional)
print(frame2['city'])
print(frame2.year)

# Add a column
frame2['new'] = 100
print(frame2)
# Generate a new column from a calculation
# The new column is True where city is beijing and False everywhere else
frame2['cap'] = frame2.city == 'beijing'
print(frame2)

# Dictionary nesting
pop = {'beijing': {2008: 1.5, 2009: 2.0},
       'shanghai': {2008: 2.0, 2009: 3.6}
       }
frame3 = DataFrame(pop)
print(frame3)

# Swap rows and columns (transpose, as for a matrix)
print(frame3.T)

# Reindex (build a new Series that follows the given index; labels not present before get NaN)
obj4 = Series([4.5, 7.2, -5.3, 3.6], index=['b', 'd', 'c', 'a'])
obj5 = obj4.reindex(['a', 'b', 'c', 'd', 'e'])
print(obj5)  # NaN stands for null value

# Handling missing values
# Fill every missing value with the same value
obj6 = obj4.reindex(['a', 'b', 'c', 'd', 'e'], fill_value=0)
print(obj6)

# Fill missing values with the values of adjacent elements
# Each gap takes the value just before it or just after it
obj7 = Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])
obj7_new = obj7.reindex(range(6))
print(obj7_new)
# Fill with previous values
obj7_newf = obj7.reindex(range(6), method='ffill')
print(obj7_newf)
# Fill with the following values
obj7_newb = obj7.reindex(range(6), method='bfill')
print(obj7_newb)

# Delete missing data
from numpy import nan as NA  # Import a missing value from numpy and rename it NA

data = Series([1, NA, 2])
print(data)
print(data.dropna())  # Delete missing values

# Deletion of DataFrame missing data
data2 = DataFrame([[1., 6.5, 3], [1., NA, NA], [NA, NA, NA]])
print(data2)
print(data2.dropna())  # By default, every row that contains any NA is deleted

# Delete only the rows whose values are all missing; keep rows with some missing values
print(data2.dropna(how='all'))

data2[4] = NA
print(data2)
# Delete only the columns whose values are all missing; keep columns with some missing values
print(data2.dropna(axis=1, how='all'))

# Filling method: fillna(0) fills all missing values with 0
data2.fillna(0)  # Returns a filled copy; data2 itself is unchanged
print(data2.fillna(0))
print(data2.fillna(0, inplace=True))  # inplace=True modifies data2 directly and returns None, so this prints None
print(data2)

result:

       city  year  pop
0  shanghai  2016  1.5
1  shanghai  2017  1.7
2  shanghai  2018  3.6
3   beijing  2017  2.4
4   beijing  2018  2.9
   year      city  pop
0  2016  shanghai  1.5
1  2017  shanghai  1.7
2  2018  shanghai  3.6
3  2017   beijing  2.4
4  2018   beijing  2.9
0    shanghai
1    shanghai
2    shanghai
3     beijing
4     beijing
Name: city, dtype: object
0    2016
1    2017
2    2018
3    2017
4    2018
Name: year, dtype: int64
   year      city  pop  new
0  2016  shanghai  1.5  100
1  2017  shanghai  1.7  100
2  2018  shanghai  3.6  100
3  2017   beijing  2.4  100
4  2018   beijing  2.9  100
   year      city  pop  new    cap
0  2016  shanghai  1.5  100  False
1  2017  shanghai  1.7  100  False
2  2018  shanghai  3.6  100  False
3  2017   beijing  2.4  100   True
4  2018   beijing  2.9  100   True
      beijing  shanghai
2008      1.5       2.0
2009      2.0       3.6
          2008  2009
beijing    1.5   2.0
shanghai   2.0   3.6
a    3.6
b    4.5
c   -5.3
d    7.2
e    NaN
dtype: float64
a    3.6
b    4.5
c   -5.3
d    7.2
e    0.0
dtype: float64
0      blue
1       NaN
2    purple
3       NaN
4    yellow
5       NaN
dtype: object
0      blue
1      blue
2    purple
3    purple
4    yellow
5    yellow
dtype: object
0      blue
1    purple
2    purple
3    yellow
4    yellow
5       NaN
dtype: object
0    1.0
1    NaN
2    2.0
dtype: float64
0    1.0
2    2.0
dtype: float64
     0    1    2
0  1.0  6.5  3.0
1  1.0  NaN  NaN
2  NaN  NaN  NaN
     0    1    2
0  1.0  6.5  3.0
     0    1    2
0  1.0  6.5  3.0
1  1.0  NaN  NaN
     0    1    2   4
0  1.0  6.5  3.0 NaN
1  1.0  NaN  NaN NaN
2  NaN  NaN  NaN NaN
     0    1    2
0  1.0  6.5  3.0
1  1.0  NaN  NaN
2  NaN  NaN  NaN
     0    1    2    4
0  1.0  6.5  3.0  0.0
1  1.0  0.0  0.0  0.0
2  0.0  0.0  0.0  0.0
None
     0    1    2    4
0  1.0  6.5  3.0  0.0
1  1.0  0.0  0.0  0.0
2  0.0  0.0  0.0  0.0
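
To make the `method='ffill'` and `method='bfill'` results above easier to follow, here is a plain-Python sketch of the same idea without pandas. The helpers `reindex`, `forward_fill` and `backward_fill` are hypothetical names written for this illustration, and `None` stands in for NaN; pandas' own implementation is different.

```python
def reindex(values, new_index):
    """Look up each new label; labels that are missing become None."""
    return [values.get(label) for label in new_index]


def forward_fill(items):
    """Replace each None with the most recent non-None value before it."""
    filled, last = [], None
    for item in items:
        if item is not None:
            last = item
        filled.append(last)
    return filled


def backward_fill(items):
    """Replace each None with the next non-None value after it."""
    return forward_fill(items[::-1])[::-1]


colors = {0: 'blue', 2: 'purple', 4: 'yellow'}
sparse = reindex(colors, range(6))
print(sparse)
print(forward_fill(sparse))
print(backward_fill(sparse))
```

The last element stays empty after the backward fill because nothing follows it, which matches the NaN at index 5 in the pandas output.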

Hierarchical index

code:

# Hierarchical index
# Multilevel index
# Series -- one dimensional data DataFrame -- two dimensional data
from pandas import Series, DataFrame
import numpy as np

data3 = Series(np.random.randn(10),
               index=[['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd'],
                      [1, 2, 3, 1, 2, 3, 1, 2, 2, 3]])
print(data3)
# Extract data under index b
print(data3['b'])

# Extract multiple indexes
print(data3['b':'c'])

# Convert Series into DataFrame one-dimensional data into two-dimensional data
print(data3.unstack())

# Transform DataFrame into Series two-dimensional data into one-dimensional data
print(data3.unstack().stack())

result:

a  1    0.254799
   2    1.096555
   3    0.283251
b  1    1.009634
   2   -0.643421
   3    1.774040
c  1    0.550576
   2   -0.687114
d  2    0.308054
   3    0.788189
dtype: float64
1    1.009634
2   -0.643421
3    1.774040
dtype: float64
b  1    1.009634
   2   -0.643421
   3    1.774040
c  1    0.550576
   2   -0.687114
dtype: float64
          1         2         3
a  0.254799  1.096555  0.283251
b  1.009634 -0.643421  1.774040
c  0.550576 -0.687114       NaN
d       NaN  0.308054  0.788189
a  1    0.254799
   2    1.096555
   3    0.283251
b  1    1.009634
   2   -0.643421
   3    1.774040
c  1    0.550576
   2   -0.687114
d  2    0.308054
   3    0.788189
dtype: float64
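
The round trip between a two-level Series and a table can also be pictured with ordinary dictionaries. This is only a sketch of the idea under simplifying assumptions: the functions `unstack` and `stack` below are hypothetical stand-ins written for this note, the sample numbers are rounded from the output above, and `None` plays the role of NaN.

```python
def unstack(series):
    """Pivot {(outer, inner): value} into {outer: {inner: value or None}}."""
    inner_labels = sorted({inner for _, inner in series})
    table = {}
    for outer, _ in series:
        table.setdefault(outer, {label: None for label in inner_labels})
    for (outer, inner), value in series.items():
        table[outer][inner] = value
    return table


def stack(table):
    """Flatten the table back to {(outer, inner): value}, dropping empty cells."""
    return {(outer, inner): value
            for outer, row in table.items()
            for inner, value in row.items()
            if value is not None}


series = {('c', 1): 0.55, ('c', 2): -0.69, ('d', 2): 0.31, ('d', 3): 0.79}
table = unstack(series)
print(table)
print(stack(table) == series)
```

Cells that had no value in the original data are created as empty when pivoting and dropped again when flattening, which is why the round trip in the output above returns the original ten rows.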

Installation and drawing of Matplotlib

code:

import matplotlib.pyplot as plt
import numpy as np

# # Draw simple curves
# plt.plot([1, 3, 5], [4, 8, 10])  # x-axis y-axis
# plt.show()  # Display curve
#
# # Drawing data in numpy
# x = np.linspace(-np.pi, np.pi, 100)  # x ranges from -3.14 to 3.14, sampled at 100 evenly spaced points
# plt.plot(x, np.sin(x))  # Abscissa: x value ordinate: sinx value
# plt.show()
#
# # Draw multiple curves
# x = np.linspace(-np.pi * 2, np.pi * 2, 100)  # The definition domain is: - 2pi to 2pi
# plt.figure(1, dpi=50)  # Create figure 1. dpi: the higher the value, the finer the detail and the larger the image file
# for i in range(1, 5):  # Draw four lines
#     plt.plot(x, np.sin(x / i))
#
# plt.show()
#
# # Draw histogram
# plt.figure(1, dpi=50)  # Create figure 1. dpi sets the image resolution: the larger the dpi, the larger the file; for magazine printing it should be 300 or more
# data = [1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 5, 6, 4]
# plt.hist(data)  # As long as the data is passed in, the histogram will count the number of times the data appears
#
# plt.show()
#
# # Scatter plot
# x = np.arange(1, 10)
# y = x
# fig = plt.figure()  # Create chart
# plt.scatter(x, y, c='r', marker='o')  # c = 'r' indicates that the color of the scatter is red, and marker indicates that the shape of the specified scatter is circular
# plt.show()

# Combination of pandas and matplotlib
# pandas reading data -- matplotlib drawing
import pandas as pd
#
# iris = pd.read_csv("./iris_training.csv")
# print(iris.head())  # Displays the first five lines of iris information
#
# # Scatter plot based on the first two columns
# iris.plot(kind='scatter', x='120', y='4')  # Scatter chart x-axis y-axis
#
# # Has no effect on the data; it only makes the pandas plot() output appear when run from PyCharm
# plt.show()

# A library that encapsulates the matplotlib library
import seaborn as sns
import warnings  # Ignore warning

warnings.filterwarnings("ignore")  # When warnings are generated, ignore

iris = pd.read_csv("./iris_training.csv")
# Set style
sns.set(style="white", color_codes=True)
# Joint plot: scatter chart of the two columns with their marginal distributions
# sns.jointplot(x="120", y="4", data=iris, height=5)  # height replaces the older size parameter
# Distribution plot
# sns.distplot(iris['120'])  # Draw the distribution (histogram and density curve) of column 120

# Has no effect on the data; it only makes the figure appear when run from PyCharm
# plt.show()

# Color the points by class
# FacetGrid: general-purpose grid for drawing
# hue: the column whose values (0 / 1 / 2) decide the color
# plt.scatter: draw each subset as a scatter plot
# add_legend(): show a legend describing the classes
# sns.FacetGrid(iris, hue="virginica", height=5).map(plt.scatter, "120", "4").add_legend()  # hue draws the points in different colors based on the values of the virginica column

# Plot columns 3 and 4 against each other
# height replaces the size parameter used by older seaborn versions
sns.FacetGrid(iris, hue="virginica", height=5).map(plt.scatter, "setosa", "versicolor").add_legend()

# Has no effect on the data; it only makes the figure appear when run from PyCharm
plt.show()

result:

Principles of machine learning classification

working principle:
Feed the feature values and their corresponding class labels into a machine learning algorithm to obtain a model, test the model, then input new values for prediction.
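
The train, test, predict sequence can be sketched with a very small classifier written in plain Python. This is not the TensorFlow model used in the course: it is a nearest-centroid classifier chosen only because it fits in a few lines, and the data, the function names `train` and `predict`, and the variable names are all made up for illustration.

```python
from statistics import mean


def train(samples, labels):
    """Build the model: one centroid (per-feature mean) for each class label."""
    grouped = {}
    for features, label in zip(samples, labels):
        grouped.setdefault(label, []).append(features)
    return {label: [mean(column) for column in zip(*rows)]
            for label, rows in grouped.items()}


def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


# Made-up feature values (two measurements per sample) and class labels.
train_x = [[1.0, 1.2], [0.8, 1.0], [1.2, 0.9], [5.0, 5.1], [5.2, 4.8], [4.9, 5.3]]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [[1.1, 1.0], [5.1, 5.0]]
test_y = [0, 1]

model = train(train_x, train_y)                       # 1. learn a model
predictions = [predict(model, x) for x in test_x]     # 2. test the model
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
new_label = predict(model, [4.7, 5.5])                # 3. predict a new value
print(predictions, accuracy, new_label)
```

The test data is kept separate from the training data so that the accuracy says something about samples the model has not seen.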

Installation of Tensorflow

Models and codes classified according to eigenvalues

Chapter 15 web crawlers

Web data collection and urllib Library

code:

# Get a web page
from urllib import request

url = 'http://www.baidu.com'  # HTTP protocol
# Open the url
response = request.urlopen(url, timeout=1)  # timeout: give up if the page has not responded within 1 s
print(response.read().decode('utf-8'))  # read() returns raw bytes, which must be decoded (here as utf-8) to get readable text

Two common request methods for web pages: get and post

code:

# There are two ways to request web pages
from urllib import parse  # parse is used to process post data
from urllib import request

# GET mode
# The parameters to be passed to the web page are appended to the web address
# That is, the information sent to the web server is written into the url itself
# httpbin.org/get?a=123&b=456 passes two values, a and b, to the server, which echoes them back
# Benefit: it is easy to pass data to the server; drawback: the amount of data that can be sent is limited

response = request.urlopen('http://httpbin.org/get', timeout=1)  # Without a data argument, urlopen sends a GET request
print(response.read())

# POST mode
# User names and passwords are usually submitted with POST: after they are entered in the web page, they do not appear in the url address bar

data = bytes(parse.urlencode({'word': 'hellp'}), encoding='utf8')  # Encode the form data as bytes
response2 = request.urlopen('http://httpbin.org/post', data=data)  # Passing data makes urlopen send a POST request
print(response2.read().decode('utf-8'))

import urllib.error
import socket  # socket library (network timeouts are raised from here)

# Timeout condition
try:
    response3 = request.urlopen('http://httpbin.org/get', timeout=0.1)
except urllib.error.URLError as e:
    if isinstance(e.reason, socket.timeout): # Determine whether the error of error is caused by timeout
        print('TIME OUT')

result:

b'{\n  "args": {}, \n  "headers": {\n    "Accept-Encoding": "identity", \n    "Host": "httpbin.org", \n    "User-Agent": "Python-urllib/3.8", \n    "X-Amzn-Trace-Id": "Root=1-601f3179-477c6caf31b7d3e6394fe965"\n  }, \n  "origin": "1.50.125.140", \n  "url": "http://httpbin.org/get"\n}\n'
{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "word": "hellp"
  }, 
  "headers": {
    "Accept-Encoding": "identity", 
    "Content-Length": "10", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "User-Agent": "Python-urllib/3.8", 
    "X-Amzn-Trace-Id": "Root=1-601f317a-116916d86dd759e02cda01f4"
  }, 
  "json": null, 
  "origin": "1.50.125.140", 
  "url": "http://httpbin.org/post"
}

TIME OUT
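
The difference between the two request methods comes down to where the encoded parameters travel. The sketch below builds both forms with `urllib.parse` and sends nothing over the network; the parameter names `a` and `b` and the word `hello` are sample values only.

```python
from urllib import parse

base_url = 'http://httpbin.org/get'
params = {'a': 123, 'b': 456}

# GET: the parameters are encoded into the URL after a '?'.
query = parse.urlencode(params)
get_url = base_url + '?' + query

# POST: the same encoding is sent as bytes in the request body instead.
post_body = parse.urlencode({'word': 'hello'}).encode('utf-8')

# parse_qs reverses the encoding; every value comes back as a list of strings.
decoded = parse.parse_qs(parse.urlsplit(get_url).query)
print(get_url, post_body, decoded)
```

The length of `post_body` is what ends up in the Content-Length header of a POST request.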

Simulation of HTTP header information

# Use urllib to disguise the request as coming from a browser
from urllib import request, parse

url = 'http://httpbin.org/post'

headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Cache-Control": "max-age=259200",
    "Host": "httpbin.org",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36",
    "X-Amzn-Trace-Id": "Root=1-601f3d71-47afea5509bb17d15ab795ec"
}

form_data = {
    'name': 'value'
}

data = bytes(parse.urlencode(form_data), encoding='utf8')
req = request.Request(url=url, data=data, headers=headers, method='POST')
response = request.urlopen(req)
print(response.read().decode('utf-8'))

{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "name": "value"
  }, 
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9", 
    "Accept-Encoding": "gzip, deflate", 
    "Accept-Language": "zh-CN,zh;q=0.9", 
    "Cache-Control": "max-age=259200", 
    "Content-Length": "10", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "Upgrade-Insecure-Requests": "1", 
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36", 
    "X-Amzn-Trace-Id": "Self=1-601f3dee-7861d78308a6c5be459b26a8;Root=1-601f3d71-47afea5509bb17d15ab795ec"
  }, 
  "json": null, 
  "origin": "1.50.125.140", 
  "url": "http://httpbin.org/post"
}
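
A `Request` object can be inspected before it is sent, which is a convenient way to check that the headers and body are what you intended. The sketch below makes no network call; the User-Agent string is a placeholder, not a real browser signature.

```python
from urllib import parse, request

form = {'name': 'value'}
req = request.Request(
    url='http://httpbin.org/post',
    data=parse.urlencode(form).encode('utf-8'),
    headers={'User-Agent': 'Mozilla/5.0 (illustrative value)'},
    method='POST',
)

# Nothing has been sent yet; these calls only read the prepared request.
method = req.get_method()
body = req.data
sent_headers = dict(req.header_items())  # urllib stores the name as 'User-agent'
print(method, body, sent_headers)
```

Headers such as Content-Length and Content-Type are added by urllib only when the request is actually opened, so they do not appear here.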

Basic use of requests Library

# get request
import requests

url = 'http://httpbin.org/get'
data = {'key': 'value', 'abc': 'xyz'}  # Data passed to the server
# .get() requests the url with the GET method; dictionary data needs no extra processing
response = requests.get(url, params=data)
print(response.text)

# post request
url = 'http://httpbin.org/post'
data = {'key': 'value', 'abc': 'xyz'}
# .post() requests the url with the POST method
response = requests.post(url, data=data)
# httpbin returns JSON, which .json() parses into a dictionary
print(response.json())

Crawling image links with regular expressions

import requests
import re

content = requests.get('http://www.cnu.cc/discoveryPage/hot-人像').text

print(content)

# The parts in parentheses are what the pattern should capture
# <div class="grid-item work-thumbnail">
# <a href="(http://www.cnu.cc/works/244938)" class="thumbnail" target="_blank">
# <div class="title">(Fashion kids | retro childhood)</div>
# <div class="author">LynnWei</div>

# <a href="(.*?)".*?title">(.*?)</div>

pattern = re.compile(r'<a href="(.*?)".*?title">(.*?)</div>', re.S)
results = re.findall(pattern, content)
print(results)  # Displayed as a list of tuples

# Take out each element in the list
for result in results:
    url, name = result
    print(url, re.sub(r'\s', '', name))  # \s matches whitespace characters such as \n; each match is replaced with the empty string ''
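
The pattern can be tried without touching the network by running it on a small inline sample. The HTML below is a hand-written fragment modelled on the structure shown in the comments above, not a copy of the real page.

```python
import re

# Inline sample modelled on the structure described in the notes.
html = '''
<div class="grid-item work-thumbnail">
<a href="http://www.cnu.cc/works/244938" class="thumbnail" target="_blank">
<div class="title">
    Fashion kids | retro childhood
</div>
<div class="author">LynnWei</div>
</a></div>
'''

# re.S lets '.' match newlines, so one match can span several lines.
pattern = re.compile(r'<a href="(.*?)".*?title">(.*?)</div>', re.S)
pairs = [(url, re.sub(r'\s', '', title)) for url, title in pattern.findall(html)]
print(pairs)
```

Note that replacing every `\s` also removes the spaces between words; calling `title.strip()` instead would trim only the surrounding whitespace.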

Installation and use of Beautiful Soup

# Beautiful Soup lets you pick elements out of HTML without writing regular expressions
html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>

<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p class="story">...</p>
"""

from bs4 import BeautifulSoup

# Load the contents of the web page
soup = BeautifulSoup(html_doc, 'lxml')  # Parse the HTML with the lxml parser

# Print the document as consistently indented HTML
print(soup.prettify())

# Find the title tag
print(soup.title)

# Content in the title tag
print(soup.title.string)

# Find p tag
print(soup.p)

# Get the class attribute of the first p tag
print(soup.p['class'])

# Find the first a tag
print(soup.a)

# Find all a tags
print(soup.find_all('a'))

# Find the tag with id link3
print(soup.find(id="link3"))

# Find the links of all <a> tags
for link in soup.find_all('a'):
    print(link.get('href'))

# Find all text content in the document
print(soup.get_text())

Use crawlers to crawl news websites

# requests fetches the page text; Beautiful Soup processes it
from bs4 import BeautifulSoup
import requests

# Tell the website that we are a legitimate browser
headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    "Accept-Language": "zh-CN,zh;q=0.8",
    "Connection": "close",
    "Cookie": "_gauges_unique_hour=1; _gauges_unique_day=1; _gauges_unique_month=1; _gauges_unique_year=1; _gauges_unique=1",
    "Referer": "http://www.infoq.com",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36 LBBROWSER"
}

url = 'https://www.infoq.com/news/'


# Get the full content of the web page


def craw(url):
    response = requests.get(url, headers=headers)
    print(response.text)


# craw(url)

# Get news headlines


def craw2(url):
    response = requests.get(url, headers=headers)

    soup = BeautifulSoup(response.text, 'lxml')

    for title_href in soup.find_all('div', class_='news_type_block'):
        print([title.get('title')
               for title in title_href.find_all('a') if title.get('title')])


craw2(url)

# Turn pages
for i in range(15, 46, 15):
    url = 'http://www.infoq.com/cn/news/' + str(i)
    # print(url)
    craw2(url)

Use the crawler to crawl the picture link and download the picture

# Batch download of pictures through requests
# Get the link address of the picture
from bs4 import BeautifulSoup
import requests
import os
import shutil

headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    "Accept-Language": "zh-CN,zh;q=0.8",
    "Connection": "close",
    "Cookie": "_gauges_unique_hour=1; _gauges_unique_day=1; _gauges_unique_month=1; _gauges_unique_year=1; _gauges_unique=1",
    "Referer": "http://www.infoq.com",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36 LBBROWSER"
}

url = 'http://www.infoq.com/presentations'


# Download pictures
# The requests library wraps complex interfaces in a user-friendly HTTP client, but it does not directly provide a function for downloading files.
# Downloading is done by setting the stream parameter on the request. When stream is True,
# the request downloads only the HTTP response headers and keeps the connection open;
# the response body is downloaded only when it is read, for example through response.content or response.raw


def download_jpg(image_url, image_localpath):
    response = requests.get(image_url, stream=True)
    if response.status_code == 200:
        with open(image_localpath, 'wb') as f:
            response.raw.decode_content = True
            shutil.copyfileobj(response.raw, f)


# Get the pictures of the presentations
def craw3(url):
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.text, 'lxml')
    for pic_href in soup.find_all('div', class_='items__content'):
        for pic in pic_href.find_all('img'):
            imgurl = pic.get('src')
            save_dir = os.path.abspath('.')
            filename = os.path.basename(imgurl)
            imgpath = os.path.join(save_dir, filename)
            print('Start downloading %s' % imgurl)
            download_jpg(imgurl, imgpath)


# craw3(url)

# Turn pages
j = 0
for i in range(12, 37, 12):
    url = 'http://www.infoq.com/presentations/' + str(i)
    j += 1
    print('Page %d' % j)
    craw3(url)
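
One detail in `craw3` that may need attention: `os.path.basename(imgurl)` works on the whole URL, so a query string such as `?w=200` would end up in the file name. The sketch below shows one way to derive the local path from the URL path only. The function name `local_image_path`, the fallback name `image.jpg` and the example.com URLs are assumptions made for illustration.

```python
import os
from urllib.parse import urlsplit


def local_image_path(image_url, save_dir='.'):
    """Derive a local file path from an image URL, ignoring any query string."""
    url_path = urlsplit(image_url).path            # drops '?w=200' and '#fragment'
    filename = os.path.basename(url_path) or 'image.jpg'  # fallback name is an assumption
    return os.path.join(save_dir, filename)


plain = local_image_path('http://example.com/img/talk.jpg', 'downloads')
with_query = local_image_path('http://example.com/img/talk.jpg?w=200', 'downloads')
no_name = local_image_path('http://example.com/img/', 'downloads')
print(plain, with_query, no_name)
```

The returned path could then be passed to `download_jpg` in place of `imgpath`.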