Python Introductory Built-in Module - Serialization Module (json Module, pickle Module)

Posted by brooky on Wed, 11 Sep 2019 09:41:42 +0200

1. Serialization

There are three kinds of serialization modules in Python:

json module:

JSON is a data interchange format shared by different languages; the serialized result is a specially formatted string that any of them can parse. (For example, the Python list [1, 2, 3] is converted by json into such a string, encoded to bytes, and sent to a PHP developer. The PHP developer decodes the bytes back into the string and then deserializes it into the original array (list): [1, 2, 3].)

json serialization supports only some Python data types: dict, list, tuple, str, int, float, True, False, None. Note that a tuple is written as a JSON array, so it comes back as a list after deserialization.

pickle module:

A data format specific to Python; it can only be used between Python programs.

It supports almost all Python data types, including instances of classes.

shelve module: A dictionary-like interface for storing and retrieving serialized objects.
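The article does not show shelve in use, so here is a minimal sketch of the dictionary-like interface it mentions. The file location (a temporary directory) and the keys "user" and "scores" are assumptions made only for this illustration.

```python
import os
import shelve
import tempfile

# Hypothetical file location, chosen only for this sketch.
db_path = os.path.join(tempfile.mkdtemp(), "demo_shelf")

# Store values under string keys; shelve pickles each value behind the scenes.
with shelve.open(db_path) as db:
    db["user"] = {"name": "oldboy1", "tags": {1, 2, 3}}
    db["scores"] = [90, 85]

# Reopen the shelf later and read the values back like a dictionary.
with shelve.open(db_path) as db:
    user = db["user"]
    keys = sorted(db.keys())

print(user)
print(keys)
```

Because the values are pickled, types that json cannot handle (such as the set above) survive the round trip.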

The essence of serialization is to transform a data structure (such as a dictionary or a list) into a special sequence (a string or bytes).

(1) A serialization module transforms an ordinary data structure into a special sequence, and that sequence can later be converted back into the original structure (deserialization).

(2) Main uses:

<1> File Read-Write Data

<2> Data Transfer over Network
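For the network case, the serialized string still has to be encoded to bytes before it is sent, and decoded again on the receiving side. The sketch below only simulates the transfer with a variable; the names payload and wire_bytes are made up for the example, and no real socket is involved.

```python
import json

payload = [1, 2, 3]

# Sender: data structure -> JSON string -> bytes
text = json.dumps(payload)
wire_bytes = text.encode("utf-8")

# ... wire_bytes would be sent over a socket or written to a file here ...

# Receiver: bytes -> JSON string -> data structure
received_text = wire_bytes.decode("utf-8")
restored = json.loads(received_text)

print(wire_bytes)
print(restored)
```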

(3) json module

<1> The json module converts data structures that satisfy its conditions into a special string, and can deserialize that string back into the data structure.

<2> Data types that can be serialized: dictionaries, lists, tuples
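A short sketch of what the type restrictions mean in practice. The sample dictionary is made up for the illustration; it shows that a round trip through json is not always lossless (tuples and non-string keys change type) and that an unsupported type such as a set is rejected.

```python
import json

data = {"point": (1, 2), "ok": True, "nothing": None, 1: "int key"}

text = json.dumps(data)
restored = json.loads(text)

print(text)
print(restored)  # the tuple is now a list and the key 1 is now the string "1"

# A set is not one of the supported types, so json.dumps raises TypeError.
try:
    json.dumps({"tags": {1, 2, 3}})
    set_supported = True
except TypeError:
    set_supported = False

print(set_supported)
```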

Four methods in two groups:

dumps loads - for network transmission
dump load - for file storage

1> dumps,loads

[1] Convert a dictionary to a string
import json
dic = {'k1':'v1','k2':'v2','k3':'v3'}
str_dic = json.dumps(dic)  # Serialization: convert a dictionary into a string
print(type(str_dic),str_dic)  # <class 'str'> {"k1": "v1", "k2": "v2", "k3": "v3"}
# Note: strings inside the JSON text are always written with double quotes ("").
[2] Convert a JSON string back to a dictionary
import json
dic2 = json.loads(str_dic)  # Deserialization: convert a JSON string into a dictionary
# Note: strings inside the text passed to json.loads must use double quotes ("").
print(type(dic2),dic2)  # <class 'dict'> {'k1': 'v1', 'k2': 'v2', 'k3': 'v3'}
[3] List types are also supported
list_dic = [1,['a','b','c'],3,{'k1':'v1','k2':'v2'}]
str_dic = json.dumps(list_dic) #Nested data types can also be handled 
print(type(str_dic),str_dic) #<class 'str'> [1, ["a", "b", "c"], 3, {"k1": "v1", "k2": "v2"}]
list_dic2 = json.loads(str_dic)
print(type(list_dic2),list_dic2) #<class 'list'> [1, ['a', 'b', 'c'], 3, {'k1': 'v1', 'k2': 'v2'}]

2> dump,load

[1] Convert an object to a JSON string and write it to a file
import json
dic = {'k1':'v1','k2':'v2','k3':'v3'}
with open('json_file.json','w') as f:
    json.dump(dic,f)  # dump takes a file handle and writes the dictionary to the file as a JSON string
# A .json file is an ordinary text file that stores a JSON string.
[2] Read the JSON string from a file and convert it back to a dictionary
import json
with open('json_file.json') as f:
    dic2 = json.load(f)  # load takes a file handle, parses the JSON string in the file, and returns the data structure
print(type(dic2),dic2)
Other parameter descriptions

ensure_ascii: When True (the default), every non-ASCII character is written as a \uXXXX escape sequence. Set ensure_ascii=False when dumping so that Chinese text is stored and displayed as-is.

separators: A tuple of (item_separator, key_separator). When indent is None the default is (", ", ": "): items are separated by ", " and each key is separated from its value by ": ". Pass (",", ":") for the most compact output.

sort_keys: When True, dictionaries are written with their keys in sorted order.
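The effect of these three parameters is easiest to see side by side. This is a small sketch with a made-up dictionary; only the standard json.dumps keyword arguments described above are used.

```python
import json

data = {"name": "小明", "b": 2, "a": 1}

# Default: non-ASCII characters are escaped as \uXXXX sequences.
default_text = json.dumps(data)

# ensure_ascii=False keeps the original characters.
readable_text = json.dumps(data, ensure_ascii=False)

# separators=(",", ":") removes the spaces after "," and ":".
compact_text = json.dumps(data, ensure_ascii=False, separators=(",", ":"))

# sort_keys=True writes the keys in sorted order: a, b, name.
sorted_text = json.dumps(data, ensure_ascii=False, sort_keys=True)

for text in (default_text, readable_text, compact_text, sorted_text):
    print(text)
```

When writing with ensure_ascii=False to a file, open the file with an explicit encoding such as utf-8 so the characters are stored correctly.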

json serialization stores multiple data into the same file

Storing multiple values in one file with json.dump is problematic, because a JSON file is expected to hold a single JSON value. There is a workaround, though. The following example first shows the problem:

# Storing multiple JSON values in one file (this version fails on load)
import json
dic1 = {'name':'oldboy1'}
dic2 = {'name':'oldboy2'}
dic3 = {'name':'oldboy3'}
f = open('serialize',encoding='utf-8',mode='a')
json.dump(dic1,f)
json.dump(dic2,f)
json.dump(dic3,f)
f.close()

f = open('serialize',encoding='utf-8')
ret = json.load(f)
ret1 = json.load(f)
ret2 = json.load(f)
print(ret)

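The code above raises an error: the three dump calls write three JSON objects back to back, and json.load tries to parse the whole file as a single JSON value, so it fails with json.JSONDecodeError ("Extra data"). Solution: write one JSON string per line and parse the file line by line.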

import json
dic1 = {'name':'oldboy1'}
dic2 = {'name':'oldboy2'}
dic3 = {'name':'oldboy3'}
# mode='w' discards the unparseable content left in the file by the previous example
f = open('serialize',encoding='utf-8',mode='w')
str1 = json.dumps(dic1)
f.write(str1+'\n')  # one JSON value per line
str2 = json.dumps(dic2)
f.write(str2+'\n')
str3 = json.dumps(dic3)
f.write(str3+'\n')
f.close()

f = open('serialize',encoding='utf-8')
for line in f:
    print(json.loads(line))  # each line is a complete JSON value
f.close()
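If a file already contains JSON values written back to back without newlines (as in the failing example), they can still be recovered. The sketch below is an alternative technique, not the article's method: it uses json.JSONDecoder.raw_decode, which parses one value and reports where it ended. The helper name iter_concatenated is made up for this example.

```python
import json

def iter_concatenated(text):
    """Yield each JSON value from a string of values written back to back."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

raw = '{"name": "oldboy1"}{"name": "oldboy2"}\n{"name": "oldboy3"}'
values = list(iter_concatenated(raw))
print(values)
```

For new files, the one-value-per-line layout shown above is simpler and easier to inspect by hand.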

(4) pickle module

<1> The pickle module converts almost all Python data structures and objects into bytes, which can then be deserialized and restored.

<2> It exists only in Python. It can serialize almost all Python data types, but anonymous (lambda) functions cannot be serialized.
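A quick way to check the lambda restriction is to try the dump and catch the failure. This is a minimal sketch; the helper name can_pickle is made up, and the exception tuple is deliberately broad because the exact exception raised for unpicklable objects differs between cases and Python versions.

```python
import pickle

def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True

# Tuple keys and sets are fine for pickle (unlike json).
print(can_pickle({(1, 2): 'oldboy', 'set': {1, 2, 3}}))

# A lambda has no importable name, so pickle cannot store a reference to it.
print(can_pickle(lambda x: x * 2))
```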

Its usage is almost the same as json: four methods in two groups.

dumps loads - for network transmission
dump load - for file storage

1> dumps,loads

import pickle
dic = {'k1':'v1','k2':'v2','k3':'v3'}
str_dic = pickle.dumps(dic)
print(str_dic)  # bytes type

dic2 = pickle.loads(str_dic)
print(dic2)    # dict
# You can also serialize functions
import pickle
def func():
    print(666)

ret = pickle.dumps(func)
print(ret,type(ret))  # e.g. b'\x80\x03c__main__\nfunc\nq\x00.' <class 'bytes'> (exact bytes depend on the pickle protocol)
f1 = pickle.loads(ret)  # pickle stores only the function's module and name, so f1 refers to the same func object
f1()  # Calls func and prints 666

2> dump,load

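dic = {(1,2):'oldboy',1:True,'set':{1,2,3}}
with open('pick serialize',mode='wb') as f:
    pickle.dump(dic,f)  # pickle data is bytes, so the file must be opened in binary mode

with open('pick serialize',mode='rb') as f1:
    print(pickle.load(f1))  # restores the dictionary, including the tuple key and the set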

pickle serialization stores multiple data into a file

dic1 = {'name':'oldboy1'}
dic2 = {'name':'oldboy2'}
dic3 = {'name':'oldboy3'}

f = open('pick Multiple data',mode='wb')
pickle.dump(dic1,f)
pickle.dump(dic2,f)
pickle.dump(dic3,f)
f.close()

f = open('pick Multiple data',mode='rb')
while True:
    try:
        print(pickle.load(f))
    except EOFError:
        break
f.close()

Write a context manager for reading and writing pickle files

import pickle

class MyPickle:
    def __init__(self,path,mode='load'):
        self.path = path
        self.mode = 'ab' if mode=='dump' else 'rb'

    def __enter__(self):
        self.f = open(self.path, mode=self.mode)
        return self

    def dump(self,content):
        pickle.dump(content,self.f)

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.f.close()

    def __iter__(self):
        while True:
            try:
                yield pickle.load(self.f)
            except EOFError:
                break


class Course:
    def __init__(self,name,price,period):
        self.name = name
        self.price = price
        self.period = period
python = Course('python',19800,'6 months')
linux = Course('linux',19800,'6 months')


# Dump first, so that 'course_file' exists before it is opened for reading.
with MyPickle('course_file','dump') as p:
    p.dump(python)
    p.dump(linux)
with MyPickle('course_file') as p:
    for obj in p:
        print(obj.__dict__)

with open('course_file','ab') as f:
    pickle.dump(linux,f)

with open('course_file','rb') as f:
    while True:
        try:
            obj = pickle.load(f)
            print(obj.__dict__)
        except EOFError:
            break

Topics: Python JSON encoding Linux