TinyXML-2: an incomplete tutorial

XML is one of the most commonly used data formats in software development, and nearly every language or development environment has libraries for processing it. In C++, TinyXML-2 is one such library.

TinyXML-2 is a simple, small, efficient C++ XML parser that can be easily integrated into other programs.

Two pages related to TinyXML-2 are as follows:

  • GitHub homepage: github.com/leethomason…
  • Online documentation: leethomason.github.io/tinyxml2/

Getting TinyXML-2

TinyXML-2 consists of two main files:

  • tinyxml2.cpp
  • tinyxml2.h

Generally, you only need to add these two files to your project and compile them together with your other source files.

In addition, there is a test file:

  • xmltest.cpp

This file shows many ways of using TinyXML-2. If you only want to use the library, you can ignore it for now.

Use TinyXML-2 to parse XML files

Reading an XML file with TinyXML-2 is simple. An XML file can be parsed into an XMLDocument object with just two lines of code:

XMLDocument doc;
doc.LoadFile( "dream.xml" );

Once the file is parsed into an XMLDocument object, all the information in the XML file can be retrieved from that object. Like any other C++ object, an XMLDocument can be created on the stack, or created on the heap with new and destroyed with delete.

The following code reads and iterates through the entire contents of the XML file. This is the most important part of this article.

The code is also available at gist.github.com/Lee-swifter… (a like would be appreciated).

#include <cstdlib>
#include <iostream>
#include "tinyxml2.h"

using namespace std;
using namespace tinyxml2;

// Print the given node, then recursively print all of its descendants.
void traversingXML(XMLNode* node) {
    if (node == nullptr)
        return;

    // Each ToXXX() returns a typed pointer, or nullptr if the node has another type,
    // so no dynamic_cast is needed.
    if (XMLDeclaration* declaration = node->ToDeclaration()) {
        cout << "XML declaration, value=" << declaration->Value() << endl;
    }
    if (XMLElement* element = node->ToElement()) {
        cout << "XML element name=" << element->Name() << ", value=" << element->Value() << endl;
        const XMLAttribute* attribute = element->FirstAttribute();
        while (attribute != nullptr) {
            cout << "\t attribute " << attribute->Name() << "=" << attribute->Value() << endl;
            attribute = attribute->Next();
        }
    }
    if (XMLText* text = node->ToText()) {
        cout << "XML text:" << text->Value() << endl;
    }
    if (XMLComment* comment = node->ToComment()) {
        cout << "XML comment:" << comment->Value() << endl;
    }
    if (XMLUnknown* unknown = node->ToUnknown()) {
        cout << "XML unknown:" << unknown->Value() << endl;
    }
    if (XMLDocument* document = node->ToDocument()) {
        cout << "XML document:" << document->ErrorName() << endl;
    }

    if (node->NoChildren()) {
        return;
    }

    // Walk the children: first child, then each next sibling.
    XMLNode* child = node->FirstChild();
    while (child != nullptr) {
        traversingXML(child);
        child = child->NextSibling();
    }
}

int main() {
    XMLDocument xmlDocument;
    XMLError error = xmlDocument.LoadFile("test.xml");
    if (error != XML_SUCCESS) {
        cout << "Reading XML failed:" << xmlDocument.ErrorStr() << endl;
        return EXIT_FAILURE;
    }

    traversingXML(&xmlDocument);
    return EXIT_SUCCESS;
}

Before walking through this code in detail, we need to understand the memory model of XMLDocument.

Memory model for TinyXML-2

[Figure: TinyXML-2 types]

The diagram above shows the classes TinyXML-2 uses to represent XML content. TinyXML-2 treats everything in an XML document as an XMLNode. XMLNode is an abstract class, and several classes derive from it to distinguish the different kinds of content in XML:

  • XMLDocument represents an XML document
  • XMLElement represents an XML element
  • XMLComment represents a comment in XML
  • XMLDeclaration represents the XML declaration, usually the first item in the document
  • XMLText represents text content in XML
  • XMLUnknown represents content of an unknown type

These types may not be intuitive when described in the abstract, so here’s an example:

TinyXML-2 represents the XML tree in memory using the child-sibling (first child, next sibling) representation. The two most important methods in XMLNode are:

  • const XMLNode * FirstChild() const
  • const XMLNode * NextSibling() const

[Figure: sibling representation in TinyXML-2]

These two methods are the usual traversal functions in the child-sibling representation. If you are unfamiliar with it, take a look at this chart:

For now, we can simply treat every node in the diagram as an XMLNode object. Using the two methods above, you can recursively traverse all the nodes in an XML document.
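The representation itself can be sketched without TinyXML-2. The Node struct below is a hypothetical stand-in for XMLNode, not the library's actual layout: each node stores only a link to its first child and a link to its next sibling, and that is enough to reach the whole tree.

```cpp
#include <string>

// A tree node in child-sibling form: one link down, one link sideways.
struct Node {
    std::string value;
    Node* firstChild;   // leftmost child, or nullptr for a leaf
    Node* nextSibling;  // next node with the same parent, or nullptr
    explicit Node(const std::string& v) : value(v), firstChild(nullptr), nextSibling(nullptr) {}
};

// Depth-first traversal using only the two links; appends each value to out.
void visit(const Node* node, std::string& out) {
    if (node == nullptr) return;
    out += node->value;
    for (const Node* child = node->firstChild; child != nullptr; child = child->nextSibling)
        visit(child, out);
}

// Build a tree where A has children B, C, D and B has a single child E,
// then return the order in which the nodes are visited.
std::string demoOrder() {
    Node a("A"), b("B"), c("C"), d("D"), e("E");
    a.firstChild = &b;   // A's children start at B
    b.nextSibling = &c;  // B -> C -> D are siblings
    c.nextSibling = &d;
    b.firstChild = &e;   // E is B's only child
    std::string order;
    visit(&a, order);
    return order;
}
```

Here demoOrder() visits A, then B and its child E, then C and D. The loop in visit has the same shape as a loop over FirstChild() and NextSibling().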

The content of an XMLNode can be retrieved with the Value() method, and the node can be safely converted to its subtype with the corresponding ToXXX() method.

TinyXML-2 operations on XML

Combining the two points above, code for traversing an XML document follows naturally:

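#include <cstdio>

void traversingXML(XMLNode* node) {
    // Visit the node. Value() can be null (for example on the document node),
    // and it must never be passed to printf as the format string.
    if (node->Value() != nullptr)
        printf("%s\n", node->Value());

    if (node->NoChildren())
        return;

    XMLNode* child = node->FirstChild();
    while (child != nullptr) {
        traversingXML(child);
        child = child->NextSibling();
    }
}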

If you need to modify the XML, you simply create a node, set it up, and add it to another node. One special note: every node in an XMLDocument, such as an XMLElement or XMLText, can only be created by calling the corresponding factory method of the document, namely XMLDocument::NewElement, XMLDocument::NewText, and so on. Although you can hold pointers to these node objects, the nodes still belong to their XMLDocument. When the XMLDocument is destroyed, all the nodes it contains are destroyed with it.

For an XML document, the object to care about most is the corresponding XMLDocument. It is used to create nodes and to perform the most important operations. To insert and delete nodes, use the following XMLNode methods:

  • XMLNode * InsertFirstChild(XMLNode *addThis)
  • XMLNode * InsertAfterChild(XMLNode *afterThis, XMLNode *addThis)
  • XMLNode * InsertEndChild(XMLNode *addThis)
  • void DeleteChild(XMLNode *node)
  • void DeleteChildren()

For example, the following code creates an XML file:

void createXML() {
    XMLDocument document;
    XMLDeclaration* declaration = document.NewDeclaration("xml version='1.0' encoding='utf-8' standalone='yes'");

    XMLComment* comment = document.NewComment("This is a comment.");
    XMLUnknown* unknown = document.NewUnknown("Unknown type");

    XMLElement* root = document.NewElement("svg");
    root->SetAttribute("height", "1080");
    root->SetAttribute("width", "1920");
    root->SetAttribute("viewBox", "0 0 1920 1080");

    XMLElement* g = document.NewElement("g");
    XMLElement* path = document.NewElement("path");
    path->SetAttribute("stroke-width", "3.5");

    XMLText* text = document.NewText("text in path tag.");

    // Assemble the element tree from the inside out.
    path->InsertEndChild(text);
    g->InsertEndChild(path);
    root->InsertEndChild(g);

    // Attach the top-level nodes to the document in output order.
    document.InsertEndChild(declaration);
    document.InsertEndChild(comment);
    document.InsertEndChild(unknown);
    document.InsertEndChild(root);

    document.SaveFile("test_save.xml");
}

It creates the following XML file:

<?xml version='1.0' encoding='utf-8' standalone='yes'?>
<!--This is a comment.-->
<!Unknown type>
<svg height="1080" width="1920" viewBox="0 0 1920 1080">
    <g>
        <path stroke-width="3.5">text in path tag.</path>
    </g>
</svg>

Use TinyXML-2 to save XML files

To save XML to a file, simply call the following method:

XMLDocument doc;
...
doc.SaveFile( "foo.xml" );

A few caveats

Below is a translation of a few points from the TinyXML-2 home page. You can skip them if you just want to get up and running quickly.

File encoding

When parsing XML, TinyXML-2 uses UTF-8 exclusively and assumes that all XML is encoded as UTF-8.

File names for loading and saving are passed unchanged to the underlying operating system.

Whitespace

Preserving whitespace (default)

Microsoft has a great article on whitespace handling: msdn.microsoft.com/en-us/libra…

By default, TinyXML-2 preserves whitespace in a reasonable way. All newlines, carriage returns and line feeds are normalized to the line-feed character, as required by the XML specification.

Whitespace inside text is preserved. For example:

<element> Hello,  World</element>

In this XML, the leading space before Hello and the two spaces after the comma are preserved. Newlines in text are also preserved, as in the following XML:

<element> Hello again,
          World</element>

However, whitespace between elements is not preserved. Tracking and reporting whitespace between elements is awkward and usually of no value, so TinyXML-2 treats the following two pieces of XML as the same:

<document>
	<data>1</data>
	<data>2</data>
	<data>3</data>
</document>

<document><data>1</data><data>2</data><data>3</data></document>

Collapsing whitespace

Some applications prefer whitespace to be collapsed. TinyXML-2 supports this through the whitespace parameter of the XMLDocument constructor. The default is to preserve whitespace, as described above.

When COLLAPSE_WHITESPACE is used, TinyXML-2:

  • Removes leading and trailing whitespace
  • Converts newlines into a space character
  • Collapses a run of whitespace characters into a single space

Note, however, that COLLAPSE_WHITESPACE has a performance cost: it essentially causes the XML to be parsed twice.

Error reporting

If an error occurs while parsing XML, TinyXML-2 reports the line number at which it occurred. In addition, all nodes (elements, declarations, text, comments, etc.) and attributes record the line number at which they were parsed. This lets the application perform additional validation on the parsed XML document.

Special characters

TinyXML-2 recognizes the predefined character entities, namely:

&amp;	&
&lt;	<
&gt;	>
&quot;	"
&apos;	'

When an XML document is read, these entities are translated into their UTF-8 character values. For example, if a piece of text in XML is:

Far &amp; Away

If you call Value() on this XMLText object, you will get “Far & Away”.