Week 9: Overview of the Python computing ecosystem

Posted by hkothari on Wed, 05 Jan 2022 12:32:40 +0100

9.1 From data processing to artificial intelligence

Data representation → data cleaning → data statistics → data visualization → data mining → artificial intelligence
Data representation: express data in a form a program can work with
Data cleaning: normalization, conversion and outlier handling
Data statistics: summarize the data, e.g. count, distribution, median
Data visualization: display what the data means in a visual form
Data mining: extract knowledge from data analysis and create value beyond the data itself
Artificial intelligence: in-depth analysis and decision-making over data, language, images and vision
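
The first three stages can be sketched with the standard library alone. The readings below are made-up sample values, and dropping anything more than 1.5 IQR outside the quartiles is only one common outlier convention, assumed here for illustration rather than taken from the course.

```python
import statistics

# Data representation: hypothetical readings stored as text in a list
raw = ["12.5", "13.1", " 12.9", "250.0", "13.4", "12.7", "13.0"]

# Data cleaning: convert text to float, then drop outliers (1.5 * IQR rule)
values = [float(s.strip()) for s in raw]
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
cleaned = [v for v in values if low <= v <= high]

# Data statistics: summarize what is left
print("count :", len(cleaned))
print("mean  :", round(statistics.mean(cleaned), 2))
print("median:", statistics.median(cleaned))
```

Visualization, mining and AI then build on the cleaned values, typically with the third-party libraries listed in this section.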

Python libraries for data analysis
Python libraries for data visualization
Python libraries for text processing
Python libraries for machine learning

NumPy: the most basic library for representing N-dimensional arrays
Python interface over a C implementation, with excellent computing speed
A foundational library for data analysis and scientific computing in Python; Pandas and others build on it
Provides direct matrix operations, broadcasting, linear algebra and other functions

def pySum():
	a = [0, 1, 2, 3, 4]
	b = [9, 8, 7, 6, 5]
	c = []
	
	for i in range(len(a)):
		c.append(a[i]**2 + b[i]**3)
	
	return c
print(pySum())

# The same calculation with NumPy: whole arrays are operated on at once, no explicit loop
import numpy as np

def npSum():
	a = np.array([0, 1, 2, 3, 4])
	b = np.array([9, 8, 7, 6, 5])
	
	c = a**2 + b**3

	return c

print(npSum())
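
The broadcasting and linear algebra features mentioned above can be sketched in the same way. The array values are arbitrary and chosen only for illustration.

```python
import numpy as np

m = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([10.0, 20.0])

# Broadcasting: the 1-D array v is applied to every row of m
shifted = m + v

# Matrix operation: matrix-vector product
product = m @ v

# Linear algebra: solve the system m x = v
x = np.linalg.solve(m, v)

print(shifted)
print(product)
print(x)
```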

http://www.numpy.org

Pandas: high-level Python library for data analysis
Provides simple, easy-to-use data structures and data analysis tools
Key idea: understand how data types relate to their index; operating on the index is operating on the data
Python's main data analysis library, built on NumPy
Series = index + one-dimensional data
DataFrame = row and column index + two-dimensional data
http://pandas.pydata.org
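
A minimal sketch of the two structures described above, assuming pandas is installed. The labels and scores are made up for illustration.

```python
import pandas as pd

# Series = index + one-dimensional data
s = pd.Series([9, 8, 7], index=["a", "b", "c"])
print(s["b"])                      # look up a value by its index label

# DataFrame = row and column index + two-dimensional data
df = pd.DataFrame(
    {"math": [90, 72], "art": [65, 88]},   # made-up scores
    index=["alice", "bob"],
)
print(df.loc["bob", "art"])        # row label, then column label

# Operating on the index operates on the data: reindexing
# reorders the values, and an unknown label yields a missing value
reordered = s.reindex(["c", "a", "x"])
print(reordered)
```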

SciPy: library for mathematical, scientific and engineering computing
Provides a collection of mathematical algorithms and engineering data functions
Similar to Matlab; usable for applications such as Fourier transforms and signal processing
Python's main scientific computing library, built on NumPy
http://www.scipy.org

Python libraries for data visualization
Matplotlib: high-quality 2D data visualization library
Provides more than 100 kinds of data visualization effects
Each effect is called through the matplotlib.pyplot sub-library
Python's main data visualization library, built on NumPy
http://matplotlib.org

Seaborn: statistical data visualization library
Provides a set of high-level statistical visualization effects
Mainly shows distributions, categories and linear relationships in data
Built on Matplotlib; supports NumPy and Pandas
http://seaborn.pydata.org/

Mayavi: 3D scientific data visualization library
Provides a set of easy-to-use 3D visualization effects for scientific computing data
The current version is Mayavi2, the main third-party library for 3D visualization
Supports NumPy, TVTK, Traits, Envisage and other third-party libraries
http://docs.enthought.com/mayavi/mayavi/

Python libraries for text processing
PyPDF2: toolkit for processing PDF files
Provides functions for batch processing of PDF files
Supports reading document information, splitting and merging files, encryption and decryption, etc
Implemented entirely in Python, with no additional dependencies and stable functionality

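# Written against the PyPDF2 1.x API; newer releases rename PdfFileMerger to PdfMerger
from PyPDF2 import PdfFileMerger

merger = PdfFileMerger()
with open("document1.pdf", "rb") as input1, open("document2.pdf", "rb") as input2:
    # Append the first three pages of document1
    merger.append(fileobj=input1, pages=(0, 3))
    # Insert the first page of document2 at position 2
    merger.merge(position=2, fileobj=input2, pages=(0, 1))
    # Write the merged result while the input files are still open
    with open("document-output.pdf", "wb") as output:
        merger.write(output)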

http://mstamy2.github.io/PyPDF2

NLTK: third-party library for natural language text processing
Provides a set of simple, easy-to-use natural language processing functions
Supports text classification, tagging, syntactic parsing, semantic analysis, etc
One of the best-known Python natural language processing libraries

from nltk.corpus import treebank
t = treebank.parsed_sents('wsj_0001.mrg')[0]
t.draw()

http://www.nltk.org/

python-docx: third-party library for creating or updating Microsoft Word files
Provides functions for creating or updating .docx files
Add and configure paragraphs, pictures, tables, text, etc., with comprehensive functionality

from docx import Document
document = Document()
document.add_heading('Document Title', 0)
p = document.add_paragraph('A plain paragraph having some ')
document.add_page_break()
document.save('demo.docx')

http://python-docx.readthedocs.io/en/latest/index.html

Python libraries for machine learning
Scikit-learn: toolkit of machine learning methods
Provides a unified set of function interfaces for machine learning methods
Provides clustering, classification, regression, dimensionality reduction and other computing functions
The most basic and well-regarded third-party Python library for machine learning
http://scikit-learn.org/

TensorFlow: the machine learning computing framework behind AlphaGo
Open-source machine learning framework driven by Google
Based on data flow graphs: graph nodes represent operations and edges represent tensors
One way of applying machine learning methods, supporting Google's artificial intelligence applications

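# TensorFlow 1.x graph-style API (on 2.x: import tensorflow.compat.v1 as tf; tf.disable_v2_behavior())
import tensorflow as tf
# The source never defines `result`; this small graph is a stand-in so the snippet runs
a = tf.constant(3.0)
b = tf.constant(4.0)
result = tf.add(a, b)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
res = sess.run(result)
print('result:', res)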

https://www.tensorflow.org/

MXNet: neural-network-based deep learning computing framework
Provides scalable neural network and deep learning computing functions
Used in many fields such as autonomous driving, machine translation and speech recognition
An important deep learning computing framework for Python

9.2 Example 15: radar chart of Holland personality analysis

Radar Chart
A radar chart is an intuitive way to display several characteristics at once

Holland held that there is an inherent correspondence between personality, interests and occupation

Requirement: use a radar chart to check Holland's personality analysis
Input: survey data on the interests of various occupational groups
Output: a radar chart

#HollandRadarDraw
import numpy as np
import matplotlib.pyplot as plt
import matplotlib

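matplotlib.rcParams['font.family'] = 'SimHei'  # CJK font; only needed when the labels are in Chinese
# The first label is repeated at the end so the label list matches the closed polygon
radar_labels = np.array(['Research type(I)', 'Artistic type(A)', 'Social type(S)',
                        'Enterprising type(E)', 'Conventional type(C)', 'Realistic type(R)', 'Research type(I)'])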
data = np.array([[0.40, 0.32, 0.35, 0.30, 0.30, 0.88],
                 [0.85, 0.35, 0.30, 0.40, 0.40, 0.30],
                 [0.43, 0.89, 0.30, 0.28, 0.22, 0.30],
                 [0.30, 0.25, 0.48, 0.85, 0.45, 0.40],
                 [0.20, 0.38, 0.87, 0.45, 0.32, 0.28],
                 [0.34, 0.31, 0.38, 0.40, 0.92, 0.28]
                 ]) #Data value
data_labels = ('Artist', 'Experimenter', 'Engineer', 'Salesman', 'Social worker', 'Recorder')

angles = np.linspace(0, 2*np.pi, 6, endpoint=False)  # six evenly spaced axes
data = np.concatenate((data, [data[0]]))              # repeat the first row to close each polygon
angles = np.concatenate((angles, [angles[0]]))        # repeat the first angle to match
fig = plt.figure(facecolor="white")
plt.subplot(111, polar=True)
plt.plot(angles, data, 'o-', linewidth=1, alpha=0.2) 
plt.fill(angles, data, alpha=0.25)
plt.thetagrids(angles*180/np.pi, radar_labels) 

plt.figtext(0.52, 0.95, 'Holland personality analysis', ha = 'center', size = 20)
legend = plt.legend(data_labels, loc=(0.94, 0.80), labelspacing=0.1)
plt.setp(legend.get_texts(), fontsize='large')
plt.grid(True)
plt.savefig('holland_radar.jpg')
plt.show()             

Goal + immersion + proficiency

9.3 From Web parsing to cyberspace

Python libraries for web crawling
Requests: the most user-friendly web crawling library
Provides simple, easy-to-use web crawling functions modeled on the HTTP protocol
Supports connection pooling, SSL, cookies, HTTP(S) proxies, etc
Python's main library for page-level web crawling

import requests
r = requests.get('https://api.github.com/user',\
                 auth = ('user', 'pass'))
r.status_code
r.headers['content-type']
r.encoding
r.text               

http://www.python-requests.org/

Scrapy: an excellent web crawler framework
Provides a framework and half-finished skeleton for building web crawler systems
Supports batch and scheduled crawling, provides a data processing pipeline, etc
Python's most important and most professional web crawler framework
https://scrapy.org

pyspider: a powerful web page crawling system
Provides everything needed to build a complete web crawling system
Supports database backends, message queues, priorities, distributed architecture, etc
An important third-party Python library for web crawling
http://docs.pyspider.org

Python libraries for web page information extraction
Beautiful Soup: parsing library for HTML and XML
Provides functions for parsing web content such as HTML and XML
Also known as beautifulsoup4 or bs4; it can load a variety of parsing engines
Often used together with web crawler libraries such as Scrapy and Requests
https://www.crummy.com/software/BeautifulSoup/bs4

re: regular expression parsing and processing library
Provides a set of general-purpose functions for defining and matching regular expressions
Usable in many scenarios, including targeted extraction of information from web pages
One of Python's most important standard libraries; no installation required

re.search()
re.match()
re.findall()
re.split()
re.finditer()
re.sub()
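
A short sketch of how the six functions listed above differ. The sample text and patterns are made up for illustration.

```python
import re

text = "Order A12 shipped 2022-01-05; order B7 shipped 2022-02-11."
date = r"(\d{4})-(\d{2})-(\d{2})"

# search: first match anywhere in the string
first = re.search(date, text)
print(first.group(0), first.group(1))

# match: succeeds only if the pattern matches at the start of the string
starts_with_order = re.match(r"Order", text) is not None
starts_with_date = re.match(date, text) is not None
print(starts_with_order, starts_with_date)

# findall: every match, returned as a list of strings
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)
print(dates)

# split: cut the string wherever the pattern matches
parts = re.split(r";\s*", text)
print(parts)

# finditer: iterate over match objects, e.g. to read positions
codes = [m.group() for m in re.finditer(r"[A-Z]\d+", text)]
print(codes)

# sub: replace every match, here reordering the captured groups
swapped = re.sub(date, r"\3/\2/\1", text)
print(swapped)
```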

https://docs.python.org/3.6/library/re.html

Python-Goose: library for extracting content from article-type web pages
Provides functions for extracting article text, video and other metadata from web pages
Targets a specific type of web page, but has wide application coverage
Python's main web information extraction library

from goose import Goose
url = 'http://www.elmundo.es/elmundo/2012/10/28/espana/1351388909.html'
g = Goose({'use_meta_language': False, 'target_language': 'es'})
article = g.extract(url=url)
article.cleaned_text[:150]

https://github.com/grangier/python-goose

Python libraries for Web development
Django: the most popular Web application framework
Provides the basic application framework for building Web systems
MTV pattern: Model, Template, View
Python's most important Web application framework, though somewhat complex
https://www.djangoproject.com

Pyramid: a moderately sized Web application framework
Provides a simple, convenient application framework for building Web systems
Medium in size and scale; suited to applications that are built quickly and extended moderately
A production-grade Python Web framework that is easy to start with and scales well

from wsgiref.simple_server import make_server
from pyramid.config import Configurator
from pyramid.response import Response
def hello_world(request):
	return Response('Hello World!')
if __name__ == '__main__':
	with Configurator() as config:
		config.add_route('hello', '/')
		config.add_view(hello_world, route_name='hello')
		app = config.make_wsgi_app()
	server = make_server('0.0.0.0', 6543, app)
	server.serve_forever()

https://trypyramid.com/

Flask: micro-framework for Web application development
Provides the simplest application framework for building Web systems
Features: simple, small in scale, fast

from flask import Flask
app = Flask(__name__)
@app.route('/')
def hello_world():
	return 'Hello, World!'

http://flask.pocoo.org

Python libraries for network application development
WeRoBot: development framework for WeChat official accounts
Provides functions for parsing messages from the WeChat server and sending replies
An important technical means of building WeChat bots

import werobot
robot = werobot.WeRoBot(token='tokenhere')
@robot.handler
def hello(message):
	return 'Hello World!'

https://github.com/offu/WeRoBot

aip: interface to the Baidu AI open platform
Provides Python function interfaces for accessing Baidu AI services
Covers voice, face, OCR, NLP, knowledge graph, image search and other fields
The main way to build Baidu AI applications in Python
https://github.com/Baidu-AIP/python-sdk

MyQR: third-party library for generating QR codes
Provides a series of functions for generating QR codes
Supports basic, artistic and animated QR codes
https://github.com/x-hw/amazing-qr

9.4 From human-computer interaction to art design

Python libraries for graphical user interfaces (GUI)
PyQt5: Python interface to the Qt development framework
Provides a Python API for creating Qt5 programs
Qt is a very mature cross-platform desktop application development system with a complete GUI toolkit
The recommended third-party library for Python GUI development
https://www.riverbankcomputing.com/software/pyqt

wxPython: cross-platform GUI development framework
Provides a cross-platform GUI development framework dedicated to Python

import wx
app = wx.App(False)
frame = wx.Frame(None, wx.ID_ANY, "Hello World")
frame.Show(True)
app.MainLoop()

https://www.wxpython.org

PyGObject: library for developing GUIs with GTK+
Provides bindings that integrate GTK+, WebKitGTK+ and other libraries
GTK+: a cross-platform framework for graphical user interfaces
Example: Anaconda uses this library to build its GUI

import gi
gi.require_version("Gtk", "3.0")
from gi.repository import Gtk
window = Gtk.Window(title="Hello World")
window.show()
window.connect("destroy", Gtk.main_quit)
Gtk.main()

https://pygobject.readthedocs.io

Python libraries for game development
PyGame: a simple game development library
Provides simple game development functions and an implementation engine based on SDL
Useful for understanding how a game responds to external input and how characters are built and interact
The main third-party library for getting started with game development in Python
http://www.pygame.org

Panda3D: open-source, cross-platform 3D rendering and game development library
A 3D game engine with Python and C++ interfaces
Supports many advanced features: normal maps, gloss maps, cartoon rendering, etc
Jointly developed by Disney and Carnegie Mellon University
http://www.panda3d.org

cocos2d: framework for building 2D games and interactive graphical applications
Provides OpenGL-based graphics rendering for game development
Supports GPU acceleration and manages game objects hierarchically in a tree structure
Suitable for professional 2D game development
http://python.cocos2d.org/
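
The idea of managing game objects in a tree can be sketched in plain Python. The Node class below is hypothetical and is not the cocos2d API; it only shows how a child's position is kept relative to its parent, so that moving a parent moves everything beneath it.

```python
class Node:
    """Hypothetical scene node for illustration; not a cocos2d class."""

    def __init__(self, name, x=0, y=0):
        self.name = name
        self.x, self.y = x, y          # position relative to the parent
        self.parent = None
        self.children = []

    def add(self, child):
        """Attach a child node and return it."""
        child.parent = self
        self.children.append(child)
        return child

    def world_position(self):
        """Sum the offsets from this node up to the root."""
        if self.parent is None:
            return (self.x, self.y)
        px, py = self.parent.world_position()
        return (px + self.x, py + self.y)

    def walk(self, depth=0):
        """Yield (depth, node) pairs, parents before children."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


scene = Node("scene")
layer = scene.add(Node("layer", 100, 50))
hero = layer.add(Node("hero", 10, 20))
layer.add(Node("enemy", -30, 0))

for depth, node in scene.walk():
    print("  " * depth + node.name, node.world_position())

layer.x += 5                           # moving the layer moves its children too
print(hero.world_position())
```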

Python libraries for virtual reality
VR Zero: Python library for developing VR applications on the Raspberry Pi
Provides a large number of functions related to VR development
A VR development library for the Raspberry Pi that supports small devices and simplified configuration
Well suited to beginners practicing VR development and applications
https://github.com/WayneKeenan/python-vrzero

pyovr: Python development interface for the Oculus Rift
Python development library for Oculus VR devices
Built on mature VR hardware, with full documentation and industrial-grade equipment
One avenue for exploring Python + virtual reality
https://github.com/cmbruns/pyovr

Vizard: general-purpose VR development engine based on Python
A professional, enterprise-grade virtual reality development engine
Provides detailed official documentation
Supports a variety of mainstream VR hardware devices and is fairly general-purpose
http://www.worldviz.com/vizard-virtual-reality-software

Python libraries for graphic art
Quads: the art of iteration
Repeatedly subdivides an image into four quadrants to produce a pixelated style
Can generate animated or static images
Easy to use and visually striking
https://github.com/fogleman/Quads
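
The four-way subdivision idea can be sketched without any image library. This is not the Quads project's code: the sketch works on a small made-up grayscale grid and uses a simple brightness-spread threshold (the tolerance parameter, an assumption) to decide when a block is uniform enough to stop splitting.

```python
def block_values(grid, x, y, w, h):
    """Collect the grayscale values inside one rectangular block."""
    return [grid[row][col] for row in range(y, y + h) for col in range(x, x + w)]


def subdivide(grid, x, y, w, h, tolerance=16):
    """Return leaf blocks as (x, y, width, height, average value)."""
    values = block_values(grid, x, y, w, h)
    # Stop when the block is uniform enough or cannot be halved
    if max(values) - min(values) <= tolerance or w < 2 or h < 2:
        return [(x, y, w, h, sum(values) / len(values))]
    half_w, half_h = w // 2, h // 2
    quadrants = [
        (x, y, half_w, half_h),                              # top left
        (x + half_w, y, w - half_w, half_h),                 # top right
        (x, y + half_h, half_w, h - half_h),                 # bottom left
        (x + half_w, y + half_h, w - half_w, h - half_h),    # bottom right
    ]
    leaves = []
    for qx, qy, qw, qh in quadrants:
        leaves.extend(subdivide(grid, qx, qy, qw, qh, tolerance))
    return leaves


# Hypothetical 4x4 grayscale image: three flat areas and one detailed corner
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10,  90, 250],
    [10, 10,  30, 120],
]

leaves = subdivide(image, 0, 0, 4, 4)
for leaf in leaves:
    print(leaf)
```

Flat areas stay as large blocks while detailed areas are split further, which is what gives the pixelated look when each leaf is painted in its average value.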

ascii_art: ASCII art library
Converts ordinary pictures into ASCII art
Output can be plain text or colored text
Output can also be saved in image format
https://github.com/jontonsoup4/ascii_art
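
The core of such a conversion is mapping brightness to characters. The sketch below is not the ascii_art library's API: it takes a made-up nested list of grayscale values instead of an image file, and the character ramp is an arbitrary choice.

```python
# Characters ordered from dark to light (an arbitrary ramp chosen for this sketch)
RAMP = "@%#*+=-:. "


def to_ascii(pixels):
    """Map a 2D list of grayscale values (0-255) to lines of text."""
    lines = []
    for row in pixels:
        # Scale 0-255 onto the valid index range of RAMP
        chars = [RAMP[value * (len(RAMP) - 1) // 255] for value in row]
        lines.append("".join(chars))
    return "\n".join(lines)


# Hypothetical 3x5 grayscale image: a bright diagonal on a dark background
image = [
    [255, 128,   0,   0,   0],
    [  0, 255, 128,   0,   0],
    [  0,   0, 255, 128,   0],
]

print(to_ascii(image))
```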

turtle: turtle drawing system
Random Art
https://docs.python.org/3/library/turtle.html

9.5 Example 16: rose drawing

#RoseDraw.py
import turtle as t
# Define a curve drawing function
def DegreeCurve(n, r, d=1):
    for i in range(n):
        t.left(d)
        t.circle(r, abs(d))
# Initial position setting
s = 0.2 # size
t.setup(450*5*s, 750*5*s)
t.pencolor("black")
t.fillcolor("red")
t.speed(100)
t.penup()
t.goto(0, 900*s)
t.pendown()
# Draw flower shape
t.begin_fill()
t.circle(200*s,30)
DegreeCurve(60, 50*s)
t.circle(200*s,30)
DegreeCurve(4, 100*s)
t.circle(200*s,50)
DegreeCurve(50, 50*s)
t.circle(350*s,65)
DegreeCurve(40, 70*s)
t.circle(150*s,50)
DegreeCurve(20, 50*s, -1)
t.circle(400*s,60)
DegreeCurve(18, 50*s)
t.fd(250*s)
t.right(150)
t.circle(-500*s,12)
t.left(140)
t.circle(550*s,110)
t.left(27)
t.circle(650*s,100)
t.left(130)
t.circle(-300*s,20)
t.right(123)
t.circle(220*s,57)
t.end_fill()
# Draw flower branch shape
t.left(120)
t.fd(280*s)
t.left(115)
t.circle(300*s,33)
t.left(180)
t.circle(-300*s,33)
DegreeCurve(70, 225*s, -1)
t.circle(350*s,104)
t.left(90)
t.circle(200*s,105)
t.circle(-500*s,63)
t.penup()
t.goto(170*s,-30*s)
t.pendown()
t.left(160)
DegreeCurve(20, 2500*s)
DegreeCurve(220, 250*s, -1)
# Draw a green leaf
t.fillcolor('green')
t.penup()
t.goto(670*s,-180*s)
t.pendown()
t.right(140)
t.begin_fill()
t.circle(300*s,120)
t.left(60)
t.circle(300*s,120)
t.end_fill()
t.penup()
t.goto(180*s,-550*s)
t.pendown()
t.right(85)
t.circle(600*s,40)
# Draw another green leaf
t.penup()
t.goto(-150*s,-1000*s)
t.pendown()
t.begin_fill()
t.rt(120)
t.circle(300*s,115)
t.left(75)
t.circle(300*s,100)
t.end_fill()
t.penup()
t.goto(430*s,-1070*s)
t.pendown()
t.right(30)
t.circle(-600*s,35)
t.done()


Art: ideas come first; programming is the means
Design: ideas and programming are equally important
Engineering: programming comes first, ideas second