An XML document is essentially a tree-structured data store, and working with it comes down to four basic operations: add, delete, modify and query.
Parse tree structure
- Read from hard disk
- Read from string
Note: the xml.etree.ElementTree module is not secure against maliciously constructed data.
    from xml.etree import ElementTree

    # load the XML file from disk into a tree
    tree = ElementTree.parse("path/to/file.xml")  # replace with the path of your XML file
    # pick the root of the XML tree
    root = tree.getroot()
Note: when reading from a string, neither parse() nor getroot() is needed, because fromstring() returns the root node directly.
    from xml.etree import ElementTree

    # pick the root of the XML tree
    root = ElementTree.fromstring(country_data_as_string)
Here, tree is the ElementTree object that represents the whole XML file, and root is its root node.
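The two ways of loading can be compared side by side. The following is a minimal sketch, assuming a small made-up document and a temporary file (the sample data and the file name country_data.xml are illustrative, not from the original project):

```python
import os
import tempfile
from xml.etree import ElementTree

# Hypothetical sample document, used only for illustration.
country_data_as_string = """<data>
    <country name="Liechtenstein"><rank>1</rank></country>
    <country name="Singapore"><rank>4</rank></country>
</data>"""

# Reading from a string: fromstring() returns the root Element directly.
root_from_string = ElementTree.fromstring(country_data_as_string)

# Reading from disk: parse() returns an ElementTree, so getroot() is needed.
with tempfile.TemporaryDirectory() as tmp_dir:
    xml_path = os.path.join(tmp_dir, "country_data.xml")
    with open(xml_path, "w", encoding="utf-8") as f:
        f.write(country_data_as_string)
    tree = ElementTree.parse(xml_path)
    root_from_file = tree.getroot()

print(type(tree).__name__)    # ElementTree
print(root_from_file.tag)     # data
print(root_from_string.tag)   # data
```

Both paths end with the same kind of object, an Element, so everything that follows applies regardless of where the XML came from.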
root is an Element object and has the following attributes:
- tag: a string giving the element's name, which indicates the kind of data it holds.
- attrib: a dictionary holding the element's attributes.
- text: a string holding the content of the element.
- tail: a string holding the text that follows the element's closing tag.
- Zero or more child elements, which can be accessed by index.
    <tag attrib1="1">text</tag>tail
     1   2           3         4
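To see these attributes on a real object, here is a small sketch using a made-up snippet (the element names a, b and c are purely illustrative):

```python
from xml.etree import ElementTree

# Hypothetical snippet: <b> is followed by tail text inside <a>.
root = ElementTree.fromstring('<a id="1">hello<b kind="x">inner</b>after b<c/></a>')

print(root.tag)      # a
print(root.attrib)   # {'id': '1'}
print(root.text)     # hello

first_child = root[0]       # child elements are reached by index
print(first_child.tag)      # b
print(first_child.text)     # inner
print(first_child.tail)     # after b
print(len(root))            # 2
print(root[1].text)         # None (an empty element has no text)
```

Note that tail belongs to the element it follows, not to the parent: the string "after b" is stored on b, even though it sits inside a.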
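Tip: on old Python versions, parsing can be sped up with xml.etree.cElementTree, the implementation written in C, which should be tried first when importing. Since Python 3.3 the standard xml.etree.ElementTree uses the C accelerator automatically, and cElementTree was removed in Python 3.9, so the fallback import below only matters for legacy code.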
    try:
        import xml.etree.cElementTree as ET
    except ImportError:
        import xml.etree.ElementTree as ET
ElementTree.Element Class
    class xml.etree.ElementTree.Element(tag, attrib={}, **extra)

    # Attributes
    tag: string, the type of data the element represents.
    attrib: dictionary, the element's attributes.
    text: string, the content of the element.
    tail: string, the text after the element's closing tag.

    # Operations on attributes
    clear(): remove all child elements and attributes, and set text and tail to None.
    get(key, default=None): return the attribute value for key, or default if the attribute does not exist.
    items(): return the attributes as a sequence of (key, value) pairs.
    keys(): return a list of the element's attribute names.
    set(key, value): set an attribute to the given value.

    # Operations on child elements
    ## Add elements
    append(subelement): add a direct child element at the end.
    extend(subelements): add a sequence of elements as children.
    insert(index, element): insert a child element at the given position.
    ## Delete elements
    remove(subelement): remove a direct child element.
    ## Search and traverse (returns an element, a list, or an iterator)
    find(match): find the first matching child element; match can be a tag name or a path.
    findall(match): find all matching child elements and return them as a list; match can be a tag name or a path.
    findtext(match): find the first matching child element and return its text; match can be a tag name or a path.
    iter(tag=None): return an iterator over this element and all its descendants; if tag is given, only elements with that tag are yielded.
    iterfind(match): like findall(), but return an iterator; match can be a tag name or a path.
    itertext(): iterate over this element and all its descendants, yielding their text.
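The methods above cover the four basic operations. A short sketch on a hand-built tree (the tag and attribute names are made up for illustration):

```python
from xml.etree.ElementTree import Element, SubElement, tostring

# Build a small hypothetical tree by hand.
root = Element("data")
first = SubElement(root, "country", {"name": "A"})
first.text = "one"

# Add: append() puts a child at the end, insert() at a given position.
second = Element("country", {"name": "B"})
root.append(second)
root.insert(0, Element("header"))

# Modify: set() writes an attribute, get() reads it with a default.
second.set("rank", "2")
missing = second.get("population", "unknown")

# Query: findall() matches direct children by tag, iter() walks the whole subtree.
names = [c.get("name") for c in root.findall("country")]
tags = [e.tag for e in root.iter()]

# Delete: remove() only works on a direct child of the element.
root.remove(root.find("header"))

print(missing)  # unknown
print(names)    # ['A', 'B']
print(tags)     # ['data', 'header', 'country', 'country']
print(tostring(root, encoding="unicode"))
```

One detail worth remembering: with a plain tag name, find() and findall() look only at direct children, while iter() also descends into grandchildren and below.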
ElementTree Object
    class xml.etree.ElementTree.ElementTree(element=None, file=None)

    element: if given, the root node of the new ElementTree.
    _setroot(element): replace the current root node with the given element. Use with caution.
    getroot(): return the root node.
    parse(source, parser=None): load an XML document; source can be a file name or a file object.

    # Write the tree back out
    write(file, encoding="us-ascii", xml_declaration=None, default_namespace=None, method="xml")

    # The following methods behave like the methods of the same name on Element, except that they operate on the root node.
    find(match)
    findall(match)
    findtext(match, default=None)
    iter(tag=None)
    iterfind(match)
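A round trip through write() and parse() might look like the sketch below. It uses an in-memory buffer instead of a real file purely to stay self-contained; the tag names are illustrative:

```python
import io
from xml.etree.ElementTree import Element, ElementTree, SubElement

root = Element("data")
SubElement(root, "item").text = "hello"
tree = ElementTree(root)

# write() accepts a file name or a file object opened for writing;
# an in-memory buffer keeps this sketch self-contained.
buffer = io.BytesIO()
tree.write(buffer, encoding="utf-8", xml_declaration=True)

# parse() loads the document back; getroot() returns its root element.
buffer.seek(0)
reloaded = ElementTree()
reloaded.parse(buffer)

print(reloaded.getroot().tag)      # data
print(reloaded.findtext("item"))   # hello
```

Passing encoding explicitly is usually a good idea, since the default "us-ascii" escapes every non-ASCII character as a numeric character reference.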
Add, delete, modify and query
On reflection, an object-oriented approach makes this easier to reason about and organize. In practice, the XML file should be treated as an object, with the methods above collected into a dedicated class.
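One possible shape for such a class is sketched below. This is not the author's actual class: the name XmlFile and its add, delete, modify, query and save methods are hypothetical, chosen only to mirror the four operations.

```python
import os
import tempfile
from xml.etree import ElementTree


class XmlFile:
    """Hypothetical wrapper; class and method names are illustrative only."""

    def __init__(self, path):
        self.path = path
        self.tree = ElementTree.parse(path)
        self.root = self.tree.getroot()

    def add(self, tag, text=None, **attrib):
        # Append a new direct child of the root.
        child = ElementTree.SubElement(self.root, tag, attrib)
        child.text = text
        return child

    def delete(self, tag):
        # findall() returns a list, so removing while looping is safe.
        for child in self.root.findall(tag):
            self.root.remove(child)

    def modify(self, tag, text):
        for child in self.root.findall(tag):
            child.text = text

    def query(self, tag):
        return [child.text for child in self.root.findall(tag)]

    def save(self):
        self.tree.write(self.path, encoding="utf-8", xml_declaration=True)


# Usage on a throwaway file.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.xml")
    with open(path, "w", encoding="utf-8") as f:
        f.write("<data><item>old</item></data>")

    doc = XmlFile(path)
    doc.add("note", "hi", lang="en")
    doc.modify("item", "new")
    doc.save()

    reloaded = XmlFile(path)
    items = reloaded.query("item")
    notes = reloaded.query("note")
    reloaded.delete("note")
    remaining = reloaded.query("note")

print(items)      # ['new']
print(notes)      # ['hi']
print(remaining)  # []
```

Keeping the tree and root on the instance means every operation works on the same in-memory document, and nothing touches the disk until save() is called.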
Practical application of AI tuner in small projects:
class xmlResolver(xmlFilePath) xmlWri
Python object-oriented review
Methods
self refers to the instance the method is called on. It must be the first parameter when defining an instance method, but it is not passed explicitly when the method is called.
`__init__()` is a special method, known as the constructor or initializer of a class. It is called whenever an instance of the class is created.
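A minimal sketch of both points, using a made-up Counter class:

```python
class Counter:
    def __init__(self, start=0):
        # Called automatically by Counter(...); self is the new instance.
        self.value = start

    def increment(self, step=1):
        # self is supplied by Python when the method is called on an instance.
        self.value += step
        return self.value


counter = Counter(10)               # __init__ runs here
print(counter.increment())          # 11
print(Counter.increment(counter))   # 12, the explicit equivalent
```

The last line shows what Python does behind the scenes: calling a method on an instance is the same as calling the function on the class with the instance as the first argument.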
Class
`__dict__`: the attributes of the class (a dictionary made up of the class's data attributes)
`__doc__`: the docstring of the class
`__name__`: the class name
`__module__`: the module in which the class is defined (the full name of the class is `__main__.className`; if the class lives in an imported module mymod, then `className.__module__` equals mymod)
`__bases__`: all the parent classes of the class (a tuple containing every direct parent class)
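These attributes can be inspected directly. A small sketch with two made-up classes:

```python
class Base:
    pass


class Employee(Base):
    """Common base for all employees."""
    count = 0


print(Employee.__name__)               # Employee
print(Employee.__doc__)                # Common base for all employees.
print(Employee.__bases__ == (Base,))   # True
print(Employee.__module__)             # __main__ when run as a script
print("count" in Employee.__dict__)    # True
```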
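Subclasses and parent classes
    class DerivedClassName(BaseClassName): ...
Note: Python allows a class to inherit from several parent classes, which is called multiple inheritance.
Method overriding in Python means that a subclass redefines a method inherited from its parent class. This is not the same as Java-style overloading: Python has no signature-based overloading, and a later definition with the same name simply replaces the earlier one.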
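Inheritance, multiple inheritance and overriding can be seen together in the sketch below (all class names are made up for illustration):

```python
class Animal:
    def speak(self):
        return "..."

    def describe(self):
        # Calls whichever speak() the actual instance's class provides.
        return f"{type(self).__name__} says {self.speak()}"


class Swimmer:
    def move(self):
        return "swims"


class Duck(Animal, Swimmer):    # multiple inheritance
    def speak(self):            # overrides Animal.speak
        return "Quack"


duck = Duck()
print(duck.describe())                      # Duck says Quack
print(duck.move())                          # swims
print([c.__name__ for c in Duck.__mro__])   # ['Duck', 'Animal', 'Swimmer', 'object']
```

The `__mro__` tuple shows the order in which Python searches the classes for a method, which is what decides the winner when several parents define the same name.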
Basic special methods that can be overridden
No. | Method | Description | Simple call |
---|---|---|---|
1 | `__init__(self [, args...])` | Constructor | obj = className(args) |
2 | `__del__(self)` | Destructor, called when an object is deleted | del obj |
3 | `__repr__(self)` | Convert to a form meant for the interpreter to read | repr(obj) |
4 | `__str__(self)` | Convert a value to a form suitable for human reading | str(obj) |
5 | `__cmp__(self, x)` | Object comparison (Python 2 only; Python 3 uses `__eq__`, `__lt__` and the other rich comparison methods) | cmp(obj, x) |
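As an illustration, here is a sketch of a made-up Version class that overrides several of these methods. Because cmp() no longer exists in Python 3, the comparison is written with `__eq__` and `__lt__` instead:

```python
class Version:
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def __repr__(self):
        # Unambiguous form, aimed at developers and the interpreter.
        return f"Version({self.major}, {self.minor})"

    def __str__(self):
        # Readable form, aimed at end users.
        return f"{self.major}.{self.minor}"

    def __eq__(self, other):
        return (self.major, self.minor) == (other.major, other.minor)

    def __lt__(self, other):
        return (self.major, self.minor) < (other.major, other.minor)


a = Version(1, 2)
b = Version(1, 10)
print(repr(a))               # Version(1, 2)
print(str(a))                # 1.2
print(a < b)                 # True
print(a == Version(1, 2))    # True
```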
Private class method
`__private_method`: a name starting with two underscores declares the method private, so it cannot be called directly from outside the class. Inside the class, call it as `self.__private_method()`.
Single underscore, double underscore, and double underscores at both ends
`__foo__`: names with double underscores at both ends are special methods, generally system-defined names such as `__init__()`.
`_foo`: names starting with a single underscore are protected variables: they are meant to be accessed only by the class itself and its subclasses, and they are not imported by from module import *.
`__foo`: names starting with a double underscore are private variables, which can only be accessed by the class itself.
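How these naming rules behave in practice can be checked with a short sketch (the Account class is made up). Note that Python enforces privacy through name mangling rather than true access control, so the mangled name remains reachable:

```python
class Account:
    def __init__(self):
        self._balance = 100     # single underscore: protected by convention only
        self.__pin = "1234"     # double underscore: stored as _Account__pin

    def __check(self):
        return self.__pin == "1234"

    def verify(self):
        # A private method is called through self inside the class.
        return self.__check()


account = Account()
print(account._balance)                    # 100 (allowed, but discouraged)
print(account.verify())                    # True
print(hasattr(account, "__pin"))           # False
print(getattr(account, "_Account__pin"))   # 1234 (mangled name still reachable)
```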