Question: In what ways can XML data be parsed, and how does each one work?

Four ways to parse

DOM parsing
SAX parsing
JDOM parsing
DOM4J parsing

Worked examples

DOM parsing

DOM stands for Document Object Model. A DOM-based XML parser transforms an XML document into a collection of object models (usually called the DOM tree), and the application works with the XML data by operating on this object model. An XML document is itself tree-shaped, so the DOM represents it as a tree of nodes. The top of the DOM tree is the Document, which represents the whole document and contains exactly one root node.

Note: In the DOM, each run of text is also a node, called a text node.
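To make the note about text nodes concrete, here is a minimal sketch using the JDK's built-in DOM parser. The class name, the helper method, and the sample XML are illustrative assumptions, not part of the original article. It counts how many direct children of the root are elements and how many are text nodes; the line breaks and indentation between tags show up as text nodes of their own.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TextNodeDemo {
    // Counts the direct children of the root element that have the given node type.
    static int countChildren(String xml, short nodeType) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        NodeList children = doc.getDocumentElement().getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == nodeType) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document; the whitespace between tags becomes text nodes.
        String xml = "<people>\n  <person>lebyte</person>\n  <person>tom</person>\n</people>";
        System.out.println("element children: " + countChildren(xml, Node.ELEMENT_NODE));
        System.out.println("text children: " + countChildren(xml, Node.TEXT_NODE));
    }
}
```

This is why code that walks getChildNodes() usually checks the node type before casting a child to Element.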

Core operation interface

There are four core operation interfaces in DOM parsing:

Document: This interface represents the entire XML document and is the root of the DOM tree. It provides access to and manipulation of the data in the document; every element of the XML file can be reached through the Document node.

Node: This interface plays a central role in the DOM tree. Most of the core DOM interfaces, such as Document and Element, inherit from Node, and each Node object represents one node in the DOM tree.

NodeList: This interface represents an ordered collection of nodes, such as the children of a node. The collection is live: changes to the document are immediately reflected in the NodeList.

NamedNodeMap: This interface represents a one-to-one mapping between a group of nodes and their unique names. It is mainly used to represent attribute nodes.
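Two of the interfaces above are easy to see in action. The following is a small sketch, assuming the JDK's default DOM implementation; the class name, method names, and sample XML are made up for illustration. It shows that a NodeList obtained once reflects a later change to the document, and that attributes are read through a NamedNodeMap.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeListDemo {
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Returns the NodeList length before and after appending one child to the root.
    static int[] lengthsAroundAppend(Document doc) {
        Element root = doc.getDocumentElement();
        NodeList children = root.getChildNodes();   // obtained once, never re-queried
        int before = children.getLength();
        root.appendChild(doc.createElement("person"));
        int after = children.getLength();           // the same list now sees the new child
        return new int[] {before, after};
    }

    // Looks up an attribute of the root's first child through NamedNodeMap.
    static String firstChildAttribute(Document doc, String name) {
        NamedNodeMap attrs = doc.getDocumentElement().getFirstChild().getAttributes();
        Node attr = attrs.getNamedItem(name);
        return attr == null ? null : attr.getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document with no whitespace between tags.
        Document doc = parse("<people><person id=\"1\"/></people>");
        System.out.println("id = " + firstChildAttribute(doc, "id"));
        int[] lengths = lengthsAroundAppend(doc);
        System.out.println("children before/after append: " + lengths[0] + "/" + lengths[1]);
    }
}
```

Because the list is live, removing nodes while looping over it by index can skip elements; iterating backwards or copying the nodes into a separate list first avoids that.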

DOM parsing process

An application that needs to parse and read XML with DOM follows these steps:

(1) Create a DocumentBuilderFactory: DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
(2) Create a DocumentBuilder: DocumentBuilder builder = factory.newDocumentBuilder();
(3) Parse the file into a Document: Document doc = builder.parse("file path to parse");
(4) Get a NodeList: NodeList nl = doc.getElementsByTagName("node to read");
(5) Read the XML information from the nodes
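The five steps can be put together as follows. This is a minimal sketch rather than a definitive implementation: the class name, the method name, and the sample XML are assumptions, and the document is parsed from a string instead of a file path so that the example is self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomReadDemo {
    // Returns the text of every element with the given tag name, in document order.
    static List<String> readTexts(String xml, String tagName) throws Exception {
        // (1) Create the factory
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // (2) Create the builder
        DocumentBuilder builder = factory.newDocumentBuilder();
        // (3) Parse the XML into a Document (from a string here, not a file path)
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        // (4) Get the nodes to read
        NodeList nl = doc.getElementsByTagName(tagName);
        // (5) Read the information from each node
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < nl.getLength(); i++) {
            texts.add(nl.item(i).getTextContent());
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document.
        String xml = "<people><person><name>lebyte</name><age>10</age></person></people>";
        System.out.println(readTexts(xml, "name"));
        System.out.println(readTexts(xml, "age"));
    }
}
```

To read from a file instead, pass a java.io.File to builder.parse in step (3).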

SAX parsing

SAX (Simple API for XML) parses an XML file step by step, in document order. SAX has no official standards body: it does not belong to any standards organization, company, or individual, and is instead a technology that anyone can use.

Unlike DOM, SAX accesses the document sequentially, which makes it a fast way to read XML data. As the SAX parser scans the document it triggers a series of events: when it reaches the start or end of the document, or the start or end of an element, it calls the corresponding handler method, and these methods do their work until the whole document has been scanned.

To use SAX parsing, first build a SAX parser.

// create a parser factory
SAXParserFactory factory = SAXParserFactory.newInstance();
// get the parser
SAXParser parser = factory.newSAXParser();
// MySaxHandler is a handler class that extends DefaultHandler
File file = new File("resource/demo01.xml");
// Parse the file, dispatching events to the handler
parser.parse(file, new MySaxHandler());
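The article does not show the MySaxHandler class itself. The sketch below is one possible shape for it, under stated assumptions: the events list, the text buffer, and the static parse helper are all hypothetical, and the handler only records the callbacks it receives so that the event order is visible.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records each SAX callback as a string.
class MySaxHandler extends DefaultHandler {
    final List<String> events = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0);
        events.add("start:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one run of text, so accumulate.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String content = text.toString().trim();
        if (!content.isEmpty()) {
            events.add("text:" + content);
        }
        text.setLength(0);
        events.add("end:" + qName);
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    // Illustrative helper: parses an XML string and returns the recorded events.
    static List<String> parse(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        MySaxHandler handler = new MySaxHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : parse("<people><name>lebyte</name></people>")) {
            System.out.println(event);
        }
    }
}
```

The handler never holds the whole document in memory, which is what makes SAX suitable for large files; the trade-off is that any state needed across callbacks has to be tracked by hand, as the text buffer does here.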

JDOM parsing

DOM and SAX are the standard ways to work with XML (DOM is a W3C recommendation, while SAX is a de facto standard), but from a development point of view each has its own limits: DOM can modify a document but is not suited to reading large files, while SAX can read large files but cannot modify them. The so-called JDOM = the modifiability of DOM + the large-file reading of SAX. JDOM itself is a free open source component that can be downloaded directly from www.jdom.org.

Common JDOM classes for manipulating XML:

Document: represents the entire XML document, which is a tree structure

Element: represents an XML element and provides methods to manipulate its children, text, attributes, and namespaces

Attribute: represents an attribute of an element

Text: represents XML text content

XMLOutputter: the XML output class, which writes through JDK streams underneath

Format: provides encoding, style, and layout settings for XML output

JDOM's output operations are much more convenient and intuitive than those of traditional DOM. JDOM supports DOM-style parsing, and it also supports SAX features, so you can use SAX to do the parsing.

// Get the SAX parser
SAXBuilder builder = new SAXBuilder();
File file = new File("resource/demo01.xml");
// Get the document
Document doc = builder.build(file);
// Get the root node
Element root = doc.getRootElement();  
System.out.println(root.getName());
// Get all the children of the root node, or only the direct children with a given tag name
List<Element> list = root.getChildren();
System.out.println(list.size());
for (int x = 0; x < list.size(); x++) {
    Element e = list.get(x);
    // Get the name of the element and the text inside it
    String name = e.getName();
    System.out.println(name + "=" + e.getText());
    System.out.println("==================");
}

DOM4J parsing

dom4j is a simple open source library for working with XML, XPath, and XSLT on the Java platform. It uses the Java collections framework and fully integrates DOM, SAX, and JAXP. Download path:

www.dom4j.org/dom4j-1.6.1…

Sourceforge.net/projects/do…

DOM4J, like JDOM, is a free open source XML component, and it is widely used in current development frameworks such as Hibernate and Spring, so it is worth getting to know. Neither component is strictly better than the other, but DOM4J is used more often in frameworks than JDOM. DOM4J also offers many features of its own, such as convenient output formatting.

File file = new File("resource/outputdom4j.xml");
SAXReader reader = new SAXReader();
// Read the file as a document
Document doc = reader.read(file);
// Get the root element of the document
Element root = doc.getRootElement();
// Iterate over all child elements of the root element
Iterator<Element> iter = root.elementIterator();
while(iter.hasNext()){
    Element name = iter.next();
    System.out.println("value = " + name.getText());
}

Extension: creating XML

DOM creation

To generate an XML file, create the document with the newDocument() method of DocumentBuilder.

Outputting a DOM document is more cumbersome, because it has to go through a Transformer. It is worth writing this code once and reusing it.

public static void createXml() throws Exception {
    // Get the parser factory
    DocumentBuilderFactory factory=DocumentBuilderFactory.newInstance();  
    // Get the parser
    DocumentBuilder builder=factory.newDocumentBuilder();  
    // Create a document
    Document doc=builder.newDocument();  
    // Create the element and set the relationship
    Element root=doc.createElement("people");  
    Element person=doc.createElement("person");  
    Element name=doc.createElement("name");  
    Element age=doc.createElement("age");  
    name.appendChild(doc.createTextNode("lebyte"));  
    age.appendChild(doc.createTextNode("10"));  
    doc.appendChild(root);  
    root.appendChild(person);  
    person.appendChild(name);  
    person.appendChild(age);  
    // Write out
    // Get the transformer factory
    TransformerFactory tsf=TransformerFactory.newInstance();
    Transformer ts=tsf.newTransformer();
    // Set the encoding
    ts.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    // Wrap the DOM tree as the source of the transformation
    DOMSource source=new DOMSource(doc);
    // The StreamResult holds the result of the transformation
    File file=new File("src/output.xml");  
    StreamResult result=new StreamResult(file);  
    ts.transform(source, result);  
} 
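Since the Transformer boilerplate is the cumbersome part of DOM output, it can be factored into a small helper that is written once and reused. The following is a sketch under assumptions: the class and method names are hypothetical, and the helper serializes to a String rather than a file so the result is easy to inspect. The XML declaration is omitted here only to keep the output short.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomWriteDemo {
    // Reusable helper: serializes any DOM Document to a String.
    static String toXml(Document doc) throws Exception {
        Transformer ts = TransformerFactory.newInstance().newTransformer();
        ts.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        ts.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        ts.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Builds the same people/person/name/age tree as the listing above.
    static Document buildPeople() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("people");
        Element person = doc.createElement("person");
        Element name = doc.createElement("name");
        Element age = doc.createElement("age");
        name.appendChild(doc.createTextNode("lebyte"));
        age.appendChild(doc.createTextNode("10"));
        doc.appendChild(root);
        root.appendChild(person);
        person.appendChild(name);
        person.appendChild(age);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml(buildPeople()));
    }
}
```

To write to a file instead, pass new StreamResult(file) as in the listing above; the rest of the helper stays the same.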

SAX creation

// Create a SAXTransformerFactory object
SAXTransformerFactory stf = (SAXTransformerFactory) SAXTransformerFactory.newInstance();
try {
    // Create a TransformerHandler object from the SAXTransformerFactory object
    TransformerHandler handler = stf.newTransformerHandler();
    // Create a Transformer object from the TransformerHandler object
    Transformer tf = handler.getTransformer();
    // Set the properties of the Transformer object
    tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    tf.setOutputProperty(OutputKeys.INDENT, "yes");
    // Create a Result object and associate it with handler
    File file = new File("src/output.xml");
    if (!file.exists()) {
        file.createNewFile();
    }
    Result result = new StreamResult(new FileOutputStream(file));
    handler.setResult(result);
    // Write the content of the XML with the handler
    // Open the document
    handler.startDocument();
    AttributesImpl attr = new AttributesImpl();
    // Create the root node bookstore
    handler.startElement("", "", "bookstore", attr);
    attr.clear();
    attr.addAttribute("", "", "id", "", "1");
    handler.startElement("", "", "book", attr);
    attr.clear();
    handler.startElement("", "", "name", attr);
    String title = "Cervical spondylosis rehabilitation Guide";
    handler.characters(title.toCharArray(), 0, title.length());
    handler.endElement("", "", "name");
    // Close each node
    handler.endElement("", "", "book");
    handler.endElement("", "", "bookstore");
    handler.endDocument();
} catch (SAXException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
} catch (FileNotFoundException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
} catch (TransformerConfigurationException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

JDOM creation

// Create a node
Element person = new Element("person");  
Element name = new Element("name");  
Element age = new Element("age");  
// Create a property
Attribute id = new Attribute("id", "1");
// Set the text
name.setText("lebyte");  
age.setText("10");  
// Set the relationship
Document doc = new Document(person);  
person.addContent(name);  
name.setAttribute(id);  
person.addContent(age);  
XMLOutputter out = new XMLOutputter();  
File file = new File("resource/outputjdom.xml");  
out.output(doc, new FileOutputStream(file.getAbsoluteFile())); 

DOM4J creation

// Use DocumentHelper to create the Document object
Document document = DocumentHelper.createDocument();  
// Create the element and set the relationship
Element person = document.addElement("person");  
Element name = person.addElement("name");   
Element age = person.addElement("age");  
// Set the text
name.setText("lebyte");
age.setText("10");
// Create a formatter output
OutputFormat of = OutputFormat.createPrettyPrint();  
of.setEncoding("utf-8");  
// Output to a file
File file = new File("resource/outputdom4j.xml");  
XMLWriter writer = new XMLWriter(new FileOutputStream(new File(file.getAbsolutePath())), of);
// Write
writer.write(document);  
writer.flush();  
writer.close(); 