9.1 From data processing to artificial intelligence
Data representation → data cleaning → data statistics → data visualization → data mining → artificial intelligence
Data representation: express data in a program using appropriate structures
Data cleaning: data normalization, data conversion and outlier handling
Data statistics: summary descriptions of data such as count, distribution and median
Data visualization: displaying the meaning of data visually
Data mining: extracting knowledge from data analysis to create value beyond the data itself
Artificial intelligence: in-depth analysis and decision-making over data, language, images and vision
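The first three stages of the chain can be sketched with the standard library alone. The sample readings, the 0 to 100 valid range and the helper name clean are assumptions made for illustration, not part of the course material.

```python
import statistics

# Data representation: raw readings held in a plain list (hypothetical sample)
raw = ["12.5", "13.1", "n/a", "12.9", "250.0", "13.4"]

def clean(values, low=0.0, high=100.0):
    """Data cleaning: convert strings to floats, drop unparsable items and outliers."""
    result = []
    for v in values:
        try:
            x = float(v)
        except ValueError:
            continue  # skip entries that are not numbers
        if low <= x <= high:
            result.append(x)
    return result

data = clean(raw)

# Data statistics: count, mean and median summarize the cleaned data
summary = {
    "count": len(data),
    "mean": statistics.mean(data),
    "median": statistics.median(data),
}
print(summary)
```

Visualization, mining and AI then build on the cleaned values and their summary, typically with the third-party libraries listed in this chapter.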
Python libraries for data analysis
Python libraries for data visualization
Python libraries for text processing
Python libraries for machine learning
NumPy: the fundamental library for representing N-dimensional arrays
Implemented in C with a Python interface, giving excellent computing speed
The foundation of Python data analysis and scientific computing; pandas and other libraries build on it
Provides direct matrix operations, broadcasting, linear algebra and more
def pySum():
    a = [0, 1, 2, 3, 4]
    b = [9, 8, 7, 6, 5]
    c = []
    # Plain Python: loop over the elements one by one
    for i in range(len(a)):
        c.append(a[i]**2 + b[i]**3)
    return c

print(pySum())

import numpy as np

def npSum():
    a = np.array([0, 1, 2, 3, 4])
    b = np.array([9, 8, 7, 6, 5])
    # NumPy: the same computation applied to whole arrays at once
    c = a**2 + b**3
    return c

print(npSum())
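Broadcasting, matrix operations and linear algebra are named above but not shown. The following is a minimal sketch; the array values and variable names are arbitrary choices for illustration.

```python
import numpy as np

m = np.array([[1.0, 2.0],
              [3.0, 4.0]])
row = np.array([10.0, 20.0])

# Broadcasting: the 1-D row is stretched across both rows of the 2-D array
shifted = m + row

# Matrix operation: matrix product of m with its transpose
product = m @ m.T

# Linear algebra: solve m @ x = row for x
x = np.linalg.solve(m, row)

print(shifted)
print(product)
print(x)
```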
http://www.numpy.org
pandas: high-level Python library for data analysis
Provides simple, easy-to-use data structures and data analysis tools
Understand the relationship between data types and indexes: operating on the index is operating on the data
The main Python data analysis library, built on NumPy
Series = index + one-dimensional data
DataFrame = row and column indexes + two-dimensional data
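The two structures and the idea that operating on the index is operating on the data can be sketched as follows. The labels and scores are made-up sample values.

```python
import pandas as pd

# Series = index + one-dimensional data
s = pd.Series([9, 8, 7], index=["a", "b", "c"])

# DataFrame = row and column indexes + two-dimensional data
df = pd.DataFrame(
    {"math": [90, 75], "art": [60, 88]},
    index=["alice", "bob"],
)

# Selecting by index label retrieves the matching data
print(s["b"])                 # one element of the Series
print(df.loc["bob", "art"])   # one cell, addressed by row and column label
print(df["math"].mean())      # a whole column, then a summary of it
```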
http://pandas.pydata.org
SciPy: library for mathematics, science and engineering computing
Provides a range of mathematical algorithms and engineering data functions
Similar to MATLAB; usable for applications such as Fourier transforms and signal processing
A main Python scientific computing library, built on NumPy
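As one illustration of the Fourier transform use case, the sketch below finds the dominant frequency of a synthetic signal with scipy.fft. The 5 Hz sine and the 100 Hz sampling rate are assumed test values.

```python
import numpy as np
from scipy import fft

# One second of a 5 Hz sine wave sampled at 100 Hz
rate = 100
t = np.arange(rate) / rate
signal = np.sin(2 * np.pi * 5 * t)

# Fourier transform of the real signal and the matching frequency axis
spectrum = np.abs(fft.rfft(signal))
freqs = fft.rfftfreq(len(signal), d=1 / rate)

# The strongest component should sit at the sine's frequency
peak = freqs[np.argmax(spectrum)]
print(peak)
```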
http://www.scipy.org
Python libraries for data visualization
Matplotlib: high-quality 2D data visualization library
Provides more than 100 kinds of data visualization effects
Each effect is called through the matplotlib.pyplot sublibrary
The main Python data visualization library, built on NumPy
http://matplotlib.org
Seaborn: statistical data visualization library
Provides a set of high-level statistical visualization effects
Mainly shows distributions, categories and linear relationships in data
Built on Matplotlib; supports NumPy and pandas
http://seaborn.pydata.org/
Mayavi: 3D scientific data visualization library
Provides a set of easy-to-use 3D visualization effects for scientific computing data
The current version is Mayavi2, the main third-party library for 3D visualization
Supports NumPy, TVTK, Traits, Envisage and other third-party libraries
http://docs.enthought.com/mayavi/mayavi/
Python libraries for text processing
PyPDF2: toolkit for processing PDF files
Provides functions for batch processing of PDF files
Supports information extraction, splitting and merging files, encryption and decryption, etc.
Implemented entirely in Python with no additional dependencies; functionally stable
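# Written for PyPDF2 releases before 3.0, which renamed PdfFileMerger to PdfMerger
from PyPDF2 import PdfFileMerger

merger = PdfFileMerger()
input1 = open("document1.pdf", "rb")
input2 = open("document2.pdf", "rb")
# Append the first three pages of document1 to the output
merger.append(fileobj = input1, pages = (0,3))
# Insert the first page of document2 at position 2 of the output
merger.merge(position = 2, fileobj = input2, pages = (0,1))
# Write the merged result, then release the input files
with open("document-output.pdf", "wb") as output:
    merger.write(output)
input1.close()
input2.close()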
http://mstamy2.github.io/PyPDF2
NLTK: third-party library for natural language text processing
Provides a range of simple, easy-to-use natural language processing functions
Supports text classification, tagging, syntactic analysis, semantic analysis, etc.
The best Python natural language processing library
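from nltk.corpus import treebank

# Requires the sample corpus: run nltk.download('treebank') once beforehand
t = treebank.parsed_sents('wsj_0001.mrg')[0]
t.draw()  # opens a window showing the parse tree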
http://www.nltk.org/
python-docx: third-party library for creating and updating Microsoft Word files
Provides functions for creating or updating .docx files
Adds and configures paragraphs, pictures, tables, text and more; comprehensive in function
from docx import Document

document = Document()
document.add_heading('Document Title', 0)
p = document.add_paragraph('A plain paragraph having some ')
document.add_page_break()
document.save('demo.docx')
http://python-docx.readthedocs.io/en/latest/index.html
Python libraries for machine learning
scikit-learn: toolkit of machine learning methods
Provides a unified function interface across machine learning methods
Provides computing functions such as clustering, classification, regression and dimensionality reduction
The most basic and excellent Python third-party library for machine learning
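The unified interface means that different methods share the same fit-style calls. Here is a minimal sketch with one classifier and one clustering method; the four sample points and their labels are invented for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

# Two tight groups of points (hypothetical data)
X = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]]
y = [0, 0, 1, 1]

# Classification: fit on labelled points, then predict a new one
clf = KNeighborsClassifier(n_neighbors=1)
clf.fit(X, y)
predicted = clf.predict([[4.8, 5.2]])

# Clustering: the same fit call, but no labels are needed
km = KMeans(n_clusters=2, n_init=10, random_state=0)
km.fit(X)
clusters = km.labels_

print(predicted, clusters)
```

Cluster numbering is arbitrary, so only the grouping of the points is meaningful, not which group is called 0 or 1.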
http://scikit-learn.org/
TensorFlow: the machine learning framework behind AlphaGo
Open-source machine learning framework promoted by Google
Based on data flow graphs: nodes represent operations and edges represent tensors
One way of applying machine learning methods; it supports Google's artificial intelligence applications
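# Session-based (TensorFlow 1.x style) code; the compat.v1 module keeps it usable on TensorFlow 2.x
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

# 'result' was not defined in the original snippet; a sum of two constants is assumed here
result = tf.add(tf.constant(1.0), tf.constant(2.0))
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
res = sess.run(result)
print('result:', res)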
https://www.tensorflow.org/
MXNet: deep learning computing framework based on neural networks
Provides scalable neural network and deep learning computing functions
Usable in many fields such as autonomous driving, machine translation and speech recognition
One of Python's most important deep learning computing frameworks
9.2 Example 15: radar chart of Holland personality analysis
Radar chart
A radar chart is an important way to display multiple characteristics intuitively
Holland held that there is an intrinsic correspondence between personality, interests and occupation
Requirement: use a radar chart to verify Holland's personality analysis
Input: survey data on the interests of various occupational groups
Output: radar chart
#HollandRadarDraw
import numpy as np
import matplotlib.pyplot as plt
import matplotlib

# SimHei is only needed for Chinese labels; drop this line if the font is not installed
matplotlib.rcParams['font.family'] = 'SimHei'
# Seven labels: the first is repeated to match the closed set of angles below
radar_labels = np.array(['Investigative type(I)', 'Artistic type(A)', 'Social type(S)',
                         'Enterprising type(E)', 'Conventional type(C)', 'Realistic type(R)',
                         'Investigative type(I)'])
data = np.array([[0.40, 0.32, 0.35, 0.30, 0.30, 0.88],
                 [0.85, 0.35, 0.30, 0.40, 0.40, 0.30],
                 [0.43, 0.89, 0.30, 0.28, 0.22, 0.30],
                 [0.30, 0.25, 0.48, 0.85, 0.45, 0.40],
                 [0.20, 0.38, 0.87, 0.45, 0.32, 0.28],
                 [0.34, 0.31, 0.38, 0.40, 0.92, 0.28]]) #Data value
data_labels = ('Artist', 'Experimenter', 'Engineer', 'Salesman', 'Social worker', 'Recorder')
angles = np.linspace(0, 2*np.pi, 6, endpoint=False)
# Repeat the first row and the first angle so that each polygon closes
data = np.concatenate((data, [data[0]]))
angles = np.concatenate((angles, [angles[0]]))
fig = plt.figure(facecolor="white")
plt.subplot(111, polar=True)
plt.plot(angles, data, 'o-', linewidth=1, alpha=0.2)
plt.fill(angles, data, alpha=0.25)
plt.thetagrids(angles*180/np.pi, radar_labels)
plt.figtext(0.52, 0.95, 'Holland personality analysis', ha = 'center', size = 20)
legend = plt.legend(data_labels, loc=(0.94, 0.80), labelspacing=0.1)
plt.setp(legend.get_texts(), fontsize='large')
plt.grid(True)
plt.savefig('holland_radar.jpg')
plt.show()
Goal + immersion + proficiency
9.3 From Web parsing to cyberspace
Python libraries for web crawling
Requests: the friendliest web crawling library
Provides simple, easy-to-use crawling functions modeled on the HTTP protocol
Supports connection pooling, SSL, cookies, HTTP(S) proxies, etc.
The main Python library for page-level web crawling
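import requests

r = requests.get('https://api.github.com/user', auth = ('user', 'pass'))
r.status_code              # HTTP status code of the response
r.headers['content-type']  # value of the Content-Type response header
r.encoding                 # text encoding used to decode the body
r.text                     # response body decoded as text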
http://www.python-requests.org/
Scrapy: excellent web crawler framework
Provides the framework and semi-finished components for building a web crawler system
Supports batch and scheduled crawling, provides a data processing pipeline, etc.
Python's most important and most professional web crawler framework
https://scrapy.org
pyspider: a powerful web crawling system
Provides everything needed to build a complete web crawling system
Supports database backends, message queues, priorities, distributed architecture, etc.
An important third-party Python library for web crawling
http://docs.pyspider.org
Python libraries for web page information extraction
Beautiful Soup: parsing library for HTML and XML
Provides functions for parsing Web information such as HTML and XML
Also known as beautifulsoup4 or bs4; it can load a variety of parsing engines
Often used with web crawling libraries such as Scrapy and Requests
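A minimal parsing sketch follows. The HTML string is a made-up page, so no crawler or network access is involved, and "html.parser" is just one of the engines that can be loaded.

```python
from bs4 import BeautifulSoup

# Hypothetical page content; in practice this would come from a crawler
html = """
<html><head><title>Demo page</title></head>
<body>
  <a href="/first">First link</a>
  <a href="/second">Second link</a>
</body></html>
"""

# "html.parser" is the engine bundled with Python; lxml or html5lib can be swapped in
soup = BeautifulSoup(html, "html.parser")

title = soup.title.string
links = [a["href"] for a in soup.find_all("a")]

print(title)
print(links)
```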
https://www.crummy.com/software/BeautifulSoup/bs4
re: regular expression parsing and processing library
Provides a set of general functions for defining and parsing regular expressions
Usable in many scenarios, including targeted extraction of Web information
One of Python's most important standard libraries; no installation needed
re.search() re.match() re.findall() re.split() re.finditer() re.sub()
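The six functions listed above can be compared on one string. The sample text and the date pattern are assumptions chosen for illustration.

```python
import re

text = "Order A12 shipped 2023-05-01; order B7 shipped 2023-06-15."
date = r"\d{4}-\d{2}-\d{2}"

first = re.search(date, text)        # first match anywhere in the string
at_start = re.match(date, text)      # None here: the string does not start with a date
all_dates = re.findall(date, text)   # list of every matching substring
parts = re.split(r";\s*", text)      # split the string on a separator pattern
years = [m.group()[:4] for m in re.finditer(date, text)]  # iterate over match objects
masked = re.sub(date, "<date>", text)  # replace every match

print(first.group(), at_start, all_dates, parts, years, masked, sep="\n")
```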
https://docs.python.org/3.6/library/re.html
Python-Goose: library for extracting article-type Web pages
Provides functions for extracting article content, video and other metadata from Web pages
Targets one specific type of Web page, but with wide application coverage
A main Python library for Web information extraction
from goose import Goose

url = 'http://www.elmundo.es/elmundo/2012/10/28/espana/1351388909.html'
g = Goose({'use_meta_language': False, 'target_language': 'es'})
article = g.extract(url=url)
article.cleaned_text[:150]  # first 150 characters of the extracted article text
https://github.com/grangier/python-goose
Python libraries for web development
Django: the most popular Web application framework
Provides the basic application framework for building Web systems
MTV pattern: Model, Template, View
Python's most important Web application framework; slightly complex to use
https://www.djangoproject.com
Pyramid: a moderate-scale Web application framework
Provides a simple, convenient application framework for building Web systems
Medium-sized; suitable for quickly building product-level applications and extending them moderately
A product-level Python Web application framework: simple to start with and highly extensible
from wsgiref.simple_server import make_server
from pyramid.config import Configurator
from pyramid.response import Response

def hello_world(request):
    return Response('Hello World!')

if __name__ == '__main__':
    with Configurator() as config:
        config.add_route('hello', '/')
        config.add_view(hello_world, route_name='hello')
        app = config.make_wsgi_app()
    server = make_server('0.0.0.0', 6543, app)
    server.serve_forever()
https://trypyramid.com/
Flask: micro framework for Web application development
Provides the simplest application framework for building Web systems
Features: simple, small-scale, fast
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello, World!'
http://flask.pocoo.org
Python libraries for network application development
WeRoBot: WeChat official account development framework
Provides functions for parsing WeChat server messages and replying to them
An important technical means of building WeChat bots
import werobot

robot = werobot.WeRoBot(token='tokenhere')

@robot.handler
def hello(message):
    return 'Hello World!'
https://github.com/offu/WeRoBot
aip: Baidu AI open platform interface
Provides a Python function interface for accessing Baidu AI services
Covers speech, face recognition, OCR, NLP, knowledge graph, image search and other fields
The main way to use Baidu AI applications from Python
https://github.com/Baidu-AIP/python-sdk
MyQR: third-party library for generating QR codes
Provides a series of functions for generating QR codes
Basic QR codes, artistic QR codes and animated QR codes
https://github.com/x-hw/amazing-qr
9.4 From human-computer interaction to art design
Python libraries for graphical user interfaces (GUI)
PyQt5: Python interface to the Qt development framework
Provides a Python API for creating Qt5 programs
Qt is a very mature cross-platform desktop application development system with a complete GUI
A recommended third-party library for Python GUI development
https://www.riverbankcomputing.com/software/pyqt
wxPython: cross-platform GUI development framework
Provides a cross-platform GUI development framework dedicated to Python
import wx

app = wx.App(False)
frame = wx.Frame(None, wx.ID_ANY, "Hello World")
frame.Show(True)
app.MainLoop()
https://www.wxpython.org
PyGObject: library for developing GUIs with GTK+
Provides integration with GTK+, WebKitGTK+ and other libraries
GTK+: a cross-platform framework for graphical user interfaces
Example: Anaconda uses this library to build its GUI
import gi
gi.require_version("Gtk", "3.0")
from gi.repository import Gtk

window = Gtk.Window(title="Hello World")
window.show()
window.connect("destroy", Gtk.main_quit)
Gtk.main()
https://pygobject.readthedocs.io
Python libraries for game development
PyGame: a simple game development library
Provides simple game development functions and an implementation engine based on SDL
Helps in understanding how a game responds to external input and how characters are built and interact
The main third-party library for getting started with Python games
http://www.pygame.org
Panda3D: open-source, cross-platform 3D rendering and game development library
A 3D game engine with Python and C++ interfaces
Supports many advanced features: normal maps, gloss maps, cartoon rendering, etc.
Jointly developed by Disney and Carnegie Mellon University
http://www.panda3d.org
cocos2d: a framework for building 2D games and interactive graphical applications
Provides OpenGL-based graphics rendering for game development
Supports GPU acceleration and manages game object types hierarchically in a tree structure
Suitable for professional 2D game development
http://python.cocos2d.org/
Python libraries for virtual reality
VR Zero: Python library for developing VR applications on the Raspberry Pi
Provides a large number of functions related to VR development
Aimed at the Raspberry Pi; supports small devices and simplified configuration
Well suited to beginners practicing VR development and applications
https://github.com/WayneKeenan/python-vrzero
pyovr: Python development interface for the Oculus Rift
Python development library for Oculus VR devices
Built on mature VR hardware, with full documentation and industrial-grade devices
One avenue for exploring Python with virtual reality
https://github.com/cmbruns/pyovr
Vizard: general-purpose VR development engine based on Python
Professional enterprise-grade virtual reality development engine
Provides detailed official documentation
Supports a variety of mainstream VR hardware devices and is reasonably general-purpose
http://www.worldviz.com/vizard-virtual-reality-software
Python libraries for graphic art
Quads: the art of iteration
Iteratively splits an image into four quadrants, producing a pixel-art style
Can generate animated or static images
Easy to use, with striking visual results
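The idea of iteratively splitting an image into four can be sketched in plain Python. This is not the Quads project's own code: it assumes a square grayscale image stored as nested lists, and the rule of always splitting the region with the largest squared error, as well as the function names, are choices made for this illustration.

```python
def region_stats(img, x, y, w, h):
    """Return (mean, error) of a region; error is the sum of squared deviations."""
    pixels = [img[r][c] for r in range(y, y + h) for c in range(x, x + w)]
    mean = sum(pixels) / len(pixels)
    error = sum((p - mean) ** 2 for p in pixels)
    return mean, error

def quads(img, iterations):
    """Repeatedly split the region with the largest error into four quadrants."""
    size = len(img)  # assumes a square image whose side is a power of two
    regions = [(0, 0, size, size)]
    for _ in range(iterations):
        # Only regions wider than one pixel can still be split
        candidates = [r for r in regions if r[2] > 1]
        if not candidates:
            break
        worst = max(candidates, key=lambda r: region_stats(img, *r)[1])
        regions.remove(worst)
        x, y, w, h = worst
        hw, hh = w // 2, h // 2
        regions += [(x, y, hw, hh), (x + hw, y, hw, hh),
                    (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    return regions

def render(img, regions):
    """Paint every region with its mean value to get the blocky output image."""
    out = [[0] * len(img) for _ in img]
    for x, y, w, h in regions:
        mean, _ = region_stats(img, x, y, w, h)
        for r in range(y, y + h):
            for c in range(x, x + w):
                out[r][c] = round(mean)
    return out

# Hypothetical 4x4 grayscale image: dark except for a bright top-left block
image = [
    [255, 255, 0, 0],
    [255, 255, 0, 0],
    [0,   0,   0, 0],
    [0,   0,   0, 0],
]
regions = quads(image, iterations=1)
result = render(image, regions)
```

More iterations give finer blocks where the image has detail, while flat areas stay as large squares.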
https://github.com/fogleman/Quads
ascii_art: ASCII art library
Converts ordinary pictures to ASCII art
The output can be plain text or colored text
It can also be output as an image
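The conversion idea, mapping pixel brightness to characters, can be sketched without any library. This does not use the ascii_art package's API: the character ramp, the function name to_ascii and the tiny sample image are all assumptions for illustration.

```python
# Characters ordered from darkest to brightest; this ramp is an arbitrary choice
RAMP = "@#*+-. "

def to_ascii(pixels):
    """Map each grayscale value (0-255) to a character and join rows with newlines."""
    lines = []
    for row in pixels:
        chars = [RAMP[value * (len(RAMP) - 1) // 255] for value in row]
        lines.append("".join(chars))
    return "\n".join(lines)

# Hypothetical 3x4 grayscale image given as nested lists
image = [
    [0,   64,  128, 255],
    [255, 128, 64,  0],
    [0,   0,   255, 255],
]
art = to_ascii(image)
print(art)
```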
https://github.com/jontonsoup4/ascii_art
turtle: turtle drawing system
Random Art
https://docs.python.org/3/library/turtle.html
9.5 Example 16: rose drawing
#RoseDraw.py
import turtle as t

# Define a curve drawing function
def DegreeCurve(n, r, d=1):
    for i in range(n):
        t.left(d)
        t.circle(r, abs(d))

# Initial position setting
s = 0.2 # size
t.setup(450*5*s, 750*5*s)
t.pencolor("black")
t.fillcolor("red")
t.speed(100)
t.penup()
t.goto(0, 900*s)
t.pendown()

# Draw flower shape
t.begin_fill()
t.circle(200*s,30)
DegreeCurve(60, 50*s)
t.circle(200*s,30)
DegreeCurve(4, 100*s)
t.circle(200*s,50)
DegreeCurve(50, 50*s)
t.circle(350*s,65)
DegreeCurve(40, 70*s)
t.circle(150*s,50)
DegreeCurve(20, 50*s, -1)
t.circle(400*s,60)
DegreeCurve(18, 50*s)
t.fd(250*s)
t.right(150)
t.circle(-500*s,12)
t.left(140)
t.circle(550*s,110)
t.left(27)
t.circle(650*s,100)
t.left(130)
t.circle(-300*s,20)
t.right(123)
t.circle(220*s,57)
t.end_fill()

# Draw flower branch shape
t.left(120)
t.fd(280*s)
t.left(115)
t.circle(300*s,33)
t.left(180)
t.circle(-300*s,33)
DegreeCurve(70, 225*s, -1)
t.circle(350*s,104)
t.left(90)
t.circle(200*s,105)
t.circle(-500*s,63)
t.penup()
t.goto(170*s,-30*s)
t.pendown()
t.left(160)
DegreeCurve(20, 2500*s)
DegreeCurve(220, 250*s, -1)

# Draw a green leaf
t.fillcolor('green')
t.penup()
t.goto(670*s,-180*s)
t.pendown()
t.right(140)
t.begin_fill()
t.circle(300*s,120)
t.left(60)
t.circle(300*s,120)
t.end_fill()
t.penup()
t.goto(180*s,-550*s)
t.pendown()
t.right(85)
t.circle(600*s,40)

# Draw another green leaf
t.penup()
t.goto(-150*s,-1000*s)
t.pendown()
t.begin_fill()
t.rt(120)
t.circle(300*s,115)
t.left(75)
t.circle(300*s,100)
t.end_fill()
t.penup()
t.goto(430*s,-1070*s)
t.pendown()
t.right(30)
t.circle(-600*s,35)
t.done()
Art: thought first, programming is the means
Design: ideas are as important as programming
Engineering: programming first, thought second