Crawler: JSON file store

Posted by sebastiaandraaisma on Mon, 03 Jan 2022 12:33:14 +0100

JSON, short for JavaScript Object Notation, represents data through combinations of objects and arrays. Its syntax is simple, yet it can describe highly structured data. It is a lightweight data-interchange format.

Contents

Objects and arrays

Read JSON

Output JSON

Objects and arrays

In JavaScript, almost everything is an object, and JSON can represent any of its supported types: string, number, object, array, boolean, and null. Objects and arrays are the two most commonly used, so here is a brief introduction to each.

Object: in JavaScript, the content wrapped in curly brackets {}. The data is organized as key-value pairs, {key1: value1, key2: value2, ...}. In object-oriented terms, the key is a property of the object and the value is that property's value. In JavaScript a key name can be written as an integer or a string; in JSON text itself, keys must be double-quoted strings. The value can be of any type.

Array: in JavaScript, the content wrapped in square brackets []. The data is organized by index, for example ["java", "javascript", "vb", ...]. In JavaScript an array is a special data type: it can also hold key-value pairs like an object, but indexed access is far more common. As with objects, the values can be of any type.

json_s = [{
    "name": "Bod",
    "gender": "male",
    "birthday": "1992-10-18"
},{
    "name": "Selina",
    "gender": "female",
    "birthday": "1994-10-18"
}]
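To make the rule about key names concrete, here is a minimal sketch using Python's standard json module. The sample dictionary is made up for illustration. Because JSON text only allows string keys, dumps() turns an integer key into a string, and it stays a string when the text is parsed again.

```python
import json

# Hypothetical sample data: one integer key and one string key.
record = {1: "java", "name": "Bod"}

# Serializing converts the integer key 1 into the string "1",
# because JSON object keys must be strings.
text = json.dumps(record)
print(text)  # {"1": "java", "name": "Bod"}

# Parsing the text back yields string keys only.
restored = json.loads(text)
print(list(restored.keys()))  # ['1', 'name']
```

This is worth keeping in mind when a crawler stores numeric IDs as dictionary keys: after a save and reload they will be strings.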

Read JSON

Python's built-in json module provides a simple way to read and write JSON:

loads(): parses a JSON text string into a Python object (such as a list or dict)
dumps(): serializes a Python object into a JSON text string
dump(): serializes a Python object and writes it to an open file
load(): reads JSON from an open file and parses it into a Python object

import json

json_str = """[{
    "name": "Bod",
    "gender": "male",
    "birthday": "1992-10-18"
},{
    "name": "Selina",
    "gender": "female",
    "birthday": "1994-10-18"
}]
"""
print(type(json_str))
# Parse the JSON text into Python data
data = json.loads(json_str)
print(data)
print(type(data))

Result:
<class 'str'>
[{'name': 'Bod', 'gender': 'male', 'birthday': '1992-10-18'}, {'name': 'Selina', 'gender': 'female', 'birthday': '1994-10-18'}]
<class 'list'>

loads() converts the string into Python data, here a list of dictionaries, so you can use an index to get each item:

data = json.loads(json_str)
print(data[0])
print(data[1])

Result:
{'name': 'Bod', 'gender': 'male', 'birthday': '1992-10-18'}
{'name': 'Selina', 'gender': 'female', 'birthday': '1994-10-18'}
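The examples above use loads() on a string. As a rough sketch of the file-based counterpart, load() accepts an open file object instead. The temporary file path and the "age" key below are assumptions made only so the example runs on its own.

```python
import json
import os
import tempfile

json_str = '[{"name": "Bod", "gender": "male"}, {"name": "Selina", "gender": "female"}]'

# Write the text to a temporary file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "data.json")
with open(path, "w", encoding="utf-8") as fd:
    fd.write(json_str)

# load() takes an open file object and returns the parsed Python data.
with open(path, "r", encoding="utf-8") as fd:
    data = json.load(fd)

# The result is a list of dicts: index into the list, then look up a key.
first_name = data[0]["name"]

# get() returns a default instead of raising KeyError when a key is absent.
age = data[1].get("age", "unknown")

print(first_name, age)  # Bod unknown
```

Using get() with a default is a common choice when crawled records do not all share the same fields.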

Output JSON

We can also call the dumps() method to convert a Python object into a JSON string:

import json

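# A Python list of dicts (not yet a JSON string)
data = [{
    "name": "Bod",
    "gender": "male",
    "birthday": "1992-10-18"
},{
    "name": "Selina",
    "gender": "female",
    "birthday": "1994-10-18"
}]

# dumps() builds the JSON text; indent=2 pretty-prints it
with open("data.json", "w") as fd:
    fd.write(json.dumps(data, indent=2))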

dumps() turns the Python object into a JSON string, and write() then writes that text to the file. To save the file in a readable, indented layout, pass the indent argument to dumps(); its value is the number of spaces used for each indentation level.
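The list of functions earlier also mentions dump(), which serializes and writes in a single call. A minimal sketch follows, assuming a temporary file path chosen only to keep the example self-contained:

```python
import json
import os
import tempfile

data = [
    {"name": "Bod", "gender": "male", "birthday": "1992-10-18"},
    {"name": "Selina", "gender": "female", "birthday": "1994-10-18"},
]

# Temporary location (an assumption for this sketch, not a required path).
path = os.path.join(tempfile.mkdtemp(), "data.json")

# dump() serializes the object and writes it to the open file in one step.
with open(path, "w", encoding="utf-8") as fd:
    json.dump(data, fd, indent=2)

# Read the file back to inspect the indented layout.
with open(path, "r", encoding="utf-8") as fd:
    text = fd.read()

lines = text.splitlines()
print(lines[0])  # [
print(lines[2])  #     "name": "Bod",

# The saved text parses back to the original data.
round_trip = json.loads(text)
```

With indent=2, each nesting level is indented by two more spaces, so the keys inside each record start four spaces in.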

What if the data contains Chinese characters?

import json

# Records whose names contain Chinese characters
data = [{
    "name": "张三",
    "gender": "male",
    "birthday": "1992-10-18"
},{
    "name": "李四",
    "gender": "female",
    "birthday": "1994-10-18"
}]

# Open the file as UTF-8 and keep non-ASCII characters unescaped
with open("data.json", "w", encoding="utf-8") as fd:
    fd.write(json.dumps(data, indent=2, ensure_ascii=False))

When writing the file, pass the encoding parameter to open(). dumps() also needs ensure_ascii=False; otherwise non-ASCII characters are written as \uXXXX escape sequences instead of the characters themselves.
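To see what the ensure_ascii parameter changes, the following sketch compares the two outputs for the same record. The record itself is illustrative data.

```python
import json

# Sample record with a Chinese name (illustrative data).
record = {"name": "张三"}

# Default behaviour (ensure_ascii=True): non-ASCII characters are escaped.
escaped = json.dumps(record)

# With ensure_ascii=False the characters are kept as they are.
readable = json.dumps(record, ensure_ascii=False)

print(escaped)   # {"name": "\u5f20\u4e09"}
print(readable)  # {"name": "张三"}

# Both strings decode to the same Python data.
same = json.loads(escaped) == json.loads(readable) == record
print(same)  # True
```

Both forms are valid JSON; the unescaped form is simply easier to read when you open the file, which is why it is paired with an explicit UTF-8 encoding on open().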

 

 

Topics: JSON crawler