Chapter 21: Navigating a Document with %XML.TextReader

Posted by matanoosh on Tue, 25 Jan 2022 02:19:39 +0100


Navigating the document

To navigate through a document, use the following methods of a text reader: Read(), ReadStartElement(), MoveToAttributeIndex(), MoveToAttributeName(), MoveToElement(), MoveToContent(), and Rewind().

Navigate to the next node

To move to the next node in the document, use the Read() method. Read() returns true until no more nodes can be read (that is, until the end of the document is reached). The previous example uses this method in the loop shown below:

 While (textreader.Read()) {

...

 }
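As a minimal sketch of this loop in context, the following class method parses an XML string and counts its element nodes. The method name CountElements is hypothetical and chosen only for illustration; ParseString(), Read(), and NodeType are the text reader members described in this chapter.

```objectscript
/// Hypothetical helper: returns the number of element nodes in an XML string
ClassMethod CountElements(xml As %String) As %Integer
{
  // parse the string; the text reader is returned by reference
  Set status = ##class(%XML.TextReader).ParseString(xml,.textreader)
  If $$$ISERR(status) {Do $System.Status.DisplayError(status) Quit 0}

  Set count = 0
  // Read() returns true until there are no more nodes to read
  While (textreader.Read()) {
    If (textreader.NodeType = "element") {
      Set count = count + 1
    }
  }
  Quit count
}
```

The loop tests only the node type, so start tags are counted and the matching endelement nodes are ignored.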

Navigate to the first occurrence of a specific element

You can move to the first occurrence of a specific element in the document. To do this, use the ReadStartElement() method. This method returns true if the element is found. If the element is not found, the reader advances to the end of the file.

The ReadStartElement() method has two parameters: the name of the element and, optionally, the namespace URI. Note that the %XML.TextReader class does not process namespace prefixes. Therefore, the ReadStartElement() method treats the following two elements as having different names:

<Person xmlns="http://www.person.org">Smith,Ellen W.</Person>

<s01:Person xmlns:s01="http://www.person.org">Smith,Ellen W.</s01:Person>
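The following sketch shows one way ReadStartElement() might be used to jump directly to an element. The method name ShowFirstPerson and the element name Person are assumptions for illustration; the Name and Depth properties are used only to report where the reader stopped.

```objectscript
/// Hypothetical helper: moves to the first <Person> element in a file
ClassMethod ShowFirstPerson(filename As %String)
{
  Set status = ##class(%XML.TextReader).ParseFile(filename,.textreader)
  If $$$ISERR(status) {Do $System.Status.DisplayError(status) Quit}

  // advance to the first element named Person;
  // an optional second argument would give the namespace URI
  If (textreader.ReadStartElement("Person")) {
    Write !,"Found element ",textreader.Name," at depth ",textreader.Depth
  } Else {
    // the element was not found, so the reader is at the end of the file
    Write !,"No Person element found"
  }
}
```

Because the prefix counts as part of the name, a document that uses <s01:Person> would presumably need "s01:Person" as the first argument.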

Navigate to attributes

When navigating to an element, if the element has attributes, you can navigate to these attributes in one of two ways:

  • Use the MoveToAttributeIndex() method to move to a specific attribute by index (the ordinal position of the attribute in the element). This method has only one parameter: the index number of the attribute. Note that you can use the AttributeCount property to see how many attributes a given element has.

  • Use the MoveToAttributeName() method to move to a specific attribute by name. This method has two parameters: the attribute name and, optionally, the namespace URI. Note that the %XML.TextReader class does not process namespace prefixes; if an attribute has a prefix, the prefix is considered part of the attribute name.

When you have finished with the attributes of the current element, you can move to the next element in the document by calling one of the navigation methods, such as Read(). Alternatively, you can call the MoveToElement() method to return to the element that contains the current attribute.

For example, the following code lists all the attributes of the current node by index number:

 If (textreader.NodeType = "element") {
     // list attributes for this node
     For a = 1:1:textreader.AttributeCount {
         Do textreader.MoveToAttributeIndex(a)
         Write textreader.LocalName," = ",textreader.Value,!
     }
 }

The following code finds the value of the color attribute of the current node:

 If (textreader.NodeType = "element") {
     // find color attribute for this node
     If (textreader.MoveToAttributeName("color")) {
         Write "color = ",textreader.Value,!
     }
 }
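To illustrate returning from an attribute to its element, the sketch below lists the attributes of every element in an XML string and then calls MoveToElement() before reading on. The method name ListAllAttributes is hypothetical; the reader methods and properties are the ones described above.

```objectscript
/// Hypothetical helper: lists every attribute of every element in an XML string
ClassMethod ListAllAttributes(xml As %String)
{
  Set status = ##class(%XML.TextReader).ParseString(xml,.textreader)
  If $$$ISERR(status) {Do $System.Status.DisplayError(status) Quit}

  While (textreader.Read()) {
    If (textreader.NodeType = "element") {
      // remember the element name before moving onto its attributes
      Set elementName = textreader.Name
      For a = 1:1:textreader.AttributeCount {
        Do textreader.MoveToAttributeIndex(a)
        Write !,elementName," @",textreader.Name," = ",textreader.Value
      }
      // return to the element that contains the attributes
      Do textreader.MoveToElement()
    }
  }
}
```

The explicit MoveToElement() call is not strictly required before Read(), but it leaves the reader on the element in case later code inspects the element's own properties.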

Navigate to the next node that contains content

The MoveToContent() method helps find content. Specifically:

  • If the node is not of type "chars", this method advances to the next node of type "chars".
  • If the node is of type "chars", this method does not advance in the file.
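The behavior described above can be sketched as follows. The method name ShowFirstContent is hypothetical, and the initial Read() call is an assumption made here so that the reader is positioned on a node before MoveToContent() is called.

```objectscript
/// Hypothetical helper: writes the first character content found in an XML string
ClassMethod ShowFirstContent(xml As %String)
{
  Set status = ##class(%XML.TextReader).ParseString(xml,.textreader)
  If $$$ISERR(status) {Do $System.Status.DisplayError(status) Quit}

  // position the reader on the first node (assumed to be needed here)
  If 'textreader.Read() {Write !,"Document has no nodes" Quit}

  // advance to the next node of type "chars", if not already on one
  Do textreader.MoveToContent()
  If (textreader.NodeType = "chars") {
    Write !,"First content: ",textreader.Value
  } Else {
    Write !,"No content found"
  }
}
```

Checking NodeType afterward, rather than relying on a return value, keeps the sketch independent of exactly what MoveToContent() returns.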

Rewinding

All the methods described here advance in the document, except the Rewind() method, which navigates to the beginning of the document and resets all properties.
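As a sketch of when Rewind() is useful, the following method makes two passes over the same document: the first counts the elements and the second lists them. The method name TwoPassReport is hypothetical.

```objectscript
/// Hypothetical helper: reads a document twice using Rewind()
ClassMethod TwoPassReport(xml As %String)
{
  Set status = ##class(%XML.TextReader).ParseString(xml,.textreader)
  If $$$ISERR(status) {Do $System.Status.DisplayError(status) Quit}

  // first pass: count the elements
  Set count = 0
  While (textreader.Read()) {
    If (textreader.NodeType = "element") {Set count = count + 1}
  }
  Write !,"Document contains ",count," element(s):"

  // return to the beginning of the document and reset all properties
  Do textreader.Rewind()

  // second pass: list the element names
  While (textreader.Read()) {
    If (textreader.NodeType = "element") {Write !,"  ",textreader.Name}
  }
}
```

Rewinding avoids parsing the source a second time when more than one pass is needed.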

Perform validation

By default, the source document is validated against any DTD or schema document provided. If the document contains a DTD section, the document is validated against that DTD. To validate against a schema document, specify the schema in the parameter list of ParseFile(), ParseStream(), ParseString(), or ParseURL(), as described in "Parameter list of the Parse methods".

Most types of validation problems are not fatal and result in either an error or a warning. Specifically, a node of type "error" or "warning" is automatically added to the document tree at the location where the problem occurred. You can navigate to and examine these nodes in the same way as any other type of node.

For example, consider the following XML document:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Root [
  <!ELEMENT Root (Person)>
  <!ELEMENT Person (#PCDATA)>
]>
<Root>
   <Person>Smith,Joe C.</Person>
</Root>

In this case, we do not expect any validation errors. Recall the example method WriteNodes() shown earlier in this chapter. If we use this method to read this document, the output will be as follows:

Node 1 is a(n) element named: Root
    and has no value
Node 2 is a(n) ignorablewhitespace and has no name
    with value:
 
Node 3 is a(n) element named: Person
    and has no value
Node 4 is a(n) chars and has no name
    with value: Smith,Joe C.
Node 5 is a(n) endelement named: Person
    and has no value
Node 6 is a(n) ignorablewhitespace and has no name
    with value:
 
Node 7 is a(n) endelement named: Root
    and has no value

Suppose instead that the file is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Root [
  <!ELEMENT Root (Person)>
  <!ELEMENT Person (#PCDATA)>
]>
<Root>
   <Employee>Smith,Joe C.</Employee>
</Root>

In this case, we expect errors because the <Employee> element is not declared in the DTD section. Here, if we use the example method WriteNodes() to read the document, the output will be as follows:

Node 1 is a(n) element named: Root
    and has no value
Node 2 is a(n) ignorablewhitespace and has no name
    with value:
 
Node 3 is a(n) error and has no name
    with value: Unknown element 'Employee' 
while processing c:/TextReader/docwdtd2.txt at line 7 offset 14
Node 4 is a(n) element named: Employee
    and has no value
Node 5 is a(n) chars and has no name
    with value: Smith,Joe C.
Node 6 is a(n) endelement named: Employee
    and has no value
Node 7 is a(n) ignorablewhitespace and has no name
    with value:
 
Node 8 is a(n) error and has no name
    with value: Element 'Employee' is not valid for content model: '(Person)' 
while processing c:/TextReader/docwdtd2.txt at line 8 offset 8
Node 9 is a(n) endelement named: Root
    and has no value
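Because validation problems appear as ordinary nodes, a program can collect them with the same Read() loop used elsewhere in this chapter. The following is a minimal sketch; the method name ReportValidationProblems is hypothetical, and only the "error" and "warning" node types shown above are checked.

```objectscript
/// Hypothetical helper: writes each validation problem and returns how many were found
ClassMethod ReportValidationProblems(filename As %String) As %Integer
{
  Set status = ##class(%XML.TextReader).ParseFile(filename,.textreader)
  If $$$ISERR(status) {Do $System.Status.DisplayError(status) Quit 0}

  Set problems = 0
  While (textreader.Read()) {
    Set type = textreader.NodeType
    // validation problems are reported as nodes of type error or warning
    If ((type = "error") || (type = "warning")) {
      Set problems = problems + 1
      Write !,type,": ",textreader.Value
    }
  }
  Quit problems
}
```

For the second file shown above, a method like this would be expected to report the two error nodes that appear in the WriteNodes() output.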

Example: namespace report

The following example method reads any XML file and indicates the namespace to which each element and attribute belongs:

ClassMethod ShowNamespacesInFile(filename As %String)
{
  Set status = ##class(%XML.TextReader).ParseFile(filename,.textreader)
  
  //check status
  If $$$ISERR(status) {do $System.Status.DisplayError(status) quit}
  
  //iterate through document, node by node
  While textreader.Read()
  {
    If (textreader.NodeType = "element")
    {
      Write !,"The element ",textreader.LocalName
      Write " is in the namespace ",textreader.NamespaceUri
    }
    If (textreader.NodeType = "attribute")
    {
      Write !,"The attribute ",textreader.LocalName
      Write " is in the namespace ",textreader.NamespaceUri
    }
  }
}

When used in the terminal, this method generates the following output:

 
The element Person is in the namespace www://www.person.com
The element Name is in the namespace www://www.person.com

The following variation accepts an XML-enabled object, writes it to a stream, and then uses the stream to generate the same type of report:

ClassMethod ShowNamespacesInObject(obj)
{
  set writer=##class(%XML.Writer).%New()

  set str=##class(%GlobalCharacterStream).%New()
  set status=writer.OutputToStream(str)
  if $$$ISERR(status) {do $System.Status.DisplayError(status) quit}

  //write to the stream
  set status=writer.RootObject(obj)
  if $$$ISERR(status) {do $System.Status.DisplayError(status) quit }

  Set status = ##class(%XML.TextReader).ParseStream(str,.textreader)
  
  //check status
  If $$$ISERR(status) {do $System.Status.DisplayError(status) quit}
  
  //iterate through document, node by node
  While textreader.Read()
  {
    If (textreader.NodeType = "element")
    {
      Write !,"The element ",textreader.LocalName
      Write " is in the namespace ",textreader.NamespaceUri
    }
    If (textreader.NodeType = "attribute")
    {
      Write !,"The attribute ",textreader.LocalName
      Write " is in the namespace ",textreader.NamespaceUri
    }
  }
}
