Parsing XML in Java with DOM, SAX, and StAX

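   While reading the Java API documentation, I came across the chapter on parsing XML documents, and I then tried out several different ways of parsing XML. I want to share these approaches on my blog for others to learn from and refer to. In this article I parse the same XML document in three different ways, turning its content into objects and collecting them in a List.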

   The XML used in the examples is as follows:

<employees>
  <employee id="111">
    <firstName>Rakesh</firstName>
    <lastName>Mishra</lastName>
    <location>Bangalore</location>
  </employee>
  <employee id="112">
    <firstName>John</firstName>
    <lastName>Davis</lastName>
    <location>Chennai</location>
  </employee>
  <employee id="113">
    <firstName>Rajesh</firstName>
    <lastName>Sharma</lastName>
    <location>Pune</location>
  </employee>
</employees>

Each <employee> node in the XML maps to the following Java class:

class Employee{
  String id;
  String firstName;
  String lastName;
  String location;
  @Override
  public String toString() {
    return firstName+" "+lastName+"("+id+")"+location;
  }
}

Below are the three approaches I used to parse the XML document:

DOM parsing, SAX parsing, and StAX parsing:

  • Parsing with DOM:

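This uses the DOM parser that ships with the Java 7 JDK. The DOM parser loads the XML content into a tree structure, and we then extract the data by traversing nodes and node lists. The code is as follows: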

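import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class DOMParserDemo {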
  public static void main(String[] args) throws Exception {
    //Get the DOM Builder Factory
    DocumentBuilderFactory factory =
        DocumentBuilderFactory.newInstance();
    //Get the DOM Builder
    DocumentBuilder builder = factory.newDocumentBuilder();
    //Load and Parse the XML document
    //document contains the complete XML as a Tree.
    Document document =
      builder.parse(
        ClassLoader.getSystemResourceAsStream("xml/employee.xml"));
    List<Employee> empList = new ArrayList<>();
    //Iterating through the nodes and extracting the data.
    NodeList nodeList = document.getDocumentElement().getChildNodes();
    for (int i = 0; i < nodeList.getLength(); i++) {
      Node node = nodeList.item(i);
      //Element children of the root are the <employee> tags;
      //whitespace between tags shows up as text nodes and is skipped.
      if (node instanceof Element) {
        Employee emp = new Employee();
        emp.id = node.getAttributes().
            getNamedItem("id").getNodeValue();
        NodeList childNodes = node.getChildNodes();
        for (int j = 0; j < childNodes.getLength(); j++) {
          Node cNode = childNodes.item(j);
          //Identifying the child tag of employee encountered.
          if (cNode instanceof Element) {
            //getTextContent() on the element itself also works for
            //empty tags, where getLastChild() would return null.
            String content = cNode.getTextContent().trim();
            switch (cNode.getNodeName()) {
              case "firstName":
                emp.firstName = content;
                break;
              case "lastName":
                emp.lastName = content;
                break;
              case "location":
                emp.location = content;
                break;
            }
          }
        }
        empList.add(emp);
      }
    }
    //Printing the Employee list populated.
    for (Employee emp : empList) {
      System.out.println(emp);
    }
  }
}
class Employee{
  String id;
  String firstName;
  String lastName;
  String location;
  @Override
  public String toString() {
    return firstName+" "+lastName+"("+id+")"+location;
  }
}

Program output:

Rakesh Mishra(111)Bangalore
John Davis(112)Chennai
Rajesh Sharma(113)Pune
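
The DOM listing above walks the child nodes by hand. As a point of comparison, here is a minimal sketch that looks elements up by tag name instead, using getElementsByTagName. The class name DomByTagNameSketch and the helpers parse and text are made up for this illustration, and the XML is read from a string rather than from xml/employee.xml so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch class, not part of the original article.
class DomByTagNameSketch {

  // Returns one "firstName lastName(id)location" string per <employee>.
  static List<String> parse(String xml) {
    List<String> result = new ArrayList<>();
    try {
      DocumentBuilder builder =
          DocumentBuilderFactory.newInstance().newDocumentBuilder();
      Document document = builder.parse(
          new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      // Finds every <employee> element anywhere in the tree.
      NodeList employees = document.getElementsByTagName("employee");
      for (int i = 0; i < employees.getLength(); i++) {
        Element e = (Element) employees.item(i);
        result.add(text(e, "firstName") + " " + text(e, "lastName")
            + "(" + e.getAttribute("id") + ")" + text(e, "location"));
      }
    } catch (Exception ex) {
      // Wrapped only to keep the sketch easy to call.
      throw new IllegalStateException(ex);
    }
    return result;
  }

  // Text of the first descendant with the given tag, or "" if absent.
  static String text(Element parent, String tag) {
    NodeList list = parent.getElementsByTagName(tag);
    return list.getLength() == 0 ? "" : list.item(0).getTextContent().trim();
  }

  public static void main(String[] args) {
    String xml = "<employees>"
        + "<employee id=\"111\"><firstName>Rakesh</firstName>"
        + "<lastName>Mishra</lastName><location>Bangalore</location></employee>"
        + "<employee id=\"112\"><firstName>John</firstName>"
        + "<lastName>Davis</lastName><location>Chennai</location></employee>"
        + "</employees>";
    for (String line : parse(xml)) {
      System.out.println(line);
    }
  }
}
```

This avoids the instanceof checks needed to skip whitespace text nodes, at the cost of searching the subtree once per tag name.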

  • Parsing with SAX:

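The SAX parser differs from the DOM parser in that it does not load the whole XML document into memory. Instead, it fires different events as it encounters the parts of the document: opening tags, closing tags, character data, comments, and so on. This is why the SAX parser is described as event-driven.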
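
To make the event-driven model concrete, the following sketch records the order in which a SAX parser reports document and element events for a small input. The class name SaxEventTraceSketch and the method trace are invented for this illustration. Character data is left out of the trace on purpose, because a parser may deliver one piece of text in several chunks.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch class, not part of the original article.
class SaxEventTraceSketch {

  // Returns the sequence of document and element events as strings.
  static List<String> trace(String xml) {
    List<String> events = new ArrayList<>();
    DefaultHandler handler = new DefaultHandler() {
      @Override
      public void startDocument() {
        events.add("startDocument");
      }
      @Override
      public void startElement(String uri, String localName,
                               String qName, Attributes attributes) {
        events.add("start:" + qName);
      }
      @Override
      public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
      }
      @Override
      public void endDocument() {
        events.add("endDocument");
      }
    };
    try {
      SAXParserFactory.newInstance().newSAXParser().parse(
          new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
          handler);
    } catch (Exception ex) {
      // Wrapped only to keep the sketch easy to call.
      throw new IllegalStateException(ex);
    }
    return events;
  }

  public static void main(String[] args) {
    String xml = "<employees><employee id=\"111\">"
        + "<firstName>Rakesh</firstName></employee></employees>";
    for (String event : trace(xml)) {
      System.out.println(event);
    }
  }
}
```

The parser drives the calls and the handler only reacts, so no tree is ever built in memory.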

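Based on the structure of the XML document, we can write a handler that extends the DefaultHandler class. DefaultHandler provides several kinds of callback methods; the ones we are interested in are: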

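  • startElement() – called when the start of an element (tag) is encountered

  • endElement() – called when the end of an element (tag) is encountered

  • characters() – called with the character data found inside an element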

The code is as follows:

import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class SAXParserDemo {
  public static void main(String[] args) throws Exception {
    SAXParserFactory parserFactor = SAXParserFactory.newInstance();
    SAXParser parser = parserFactor.newSAXParser();
    SAXHandler handler = new SAXHandler();
    parser.parse(ClassLoader.getSystemResourceAsStream("xml/employee.xml"),
                 handler);
    //Printing the list of employees obtained from XML
    for ( Employee emp : handler.empList){
      System.out.println(emp);
    }
  }
}
/**
 * The Handler for SAX Events.
 */
class SAXHandler extends DefaultHandler {
  List<Employee> empList = new ArrayList<>();
  Employee emp = null;
  //Accumulates text, because characters() may deliver the text of
  //one element in several chunks.
  StringBuilder content = new StringBuilder();
  //Triggered when the start of tag is found.
  @Override
  public void startElement(String uri, String localName,
                           String qName, Attributes attributes)
                           throws SAXException {
    //Discard text collected before this tag (whitespace between tags).
    content.setLength(0);
    //Create a new Employee object when the start tag is found
    if ("employee".equals(qName)) {
      emp = new Employee();
      emp.id = attributes.getValue("id");
    }
  }
  //Triggered when the end of tag is found.
  @Override
  public void endElement(String uri, String localName,
                         String qName) throws SAXException {
    String text = content.toString().trim();
    switch(qName){
      //Add the employee to list once end tag is found
      case "employee":
        empList.add(emp);
        break;
      //For all other end tags the employee has to be updated.
      case "firstName":
        emp.firstName = text;
        break;
      case "lastName":
        emp.lastName = text;
        break;
      case "location":
        emp.location = text;
        break;
    }
  }
  //Triggered for character data; may be called more than once per element.
  @Override
  public void characters(char[] ch, int start, int length)
          throws SAXException {
    content.append(ch, start, length);
  }
}
class Employee {
  String id;
  String firstName;
  String lastName;
  String location;
  @Override
  public String toString() {
    return firstName + " " + lastName + "(" + id + ")" + location;
  }
}

Program output:

Rakesh Mishra(111)Bangalore
John Davis(112)Chennai
Rajesh Sharma(113)Pune

  • Parsing with StAX:

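StAX (Streaming API for XML) is a stream-oriented way of parsing XML. Unlike the DOM parser, it processes the document as a stream, much as SAX does, but it still differs from SAX in a few ways: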

  • Unlike SAX, which pushes data to the application, StAX lets the application pull only the data it needs from the XML document

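  • In StAX, the program's entry point is a cursor that represents a position in the XML document; the application moves the cursor forward when it needs to and pulls information from the parser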
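
The pull model described in the two points above can be sketched as follows: the application advances the cursor itself and stops reading as soon as it has what it wants. The class name StaxPullSketch and the method firstNameOf are invented for this illustration, and the XML is passed in as a string so that the example is self-contained.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch class, not part of the original article.
class StaxPullSketch {

  // Returns the first name of the employee with the given id, or null.
  static String firstNameOf(String xml, String id) {
    try {
      XMLStreamReader reader = XMLInputFactory.newInstance()
          .createXMLStreamReader(new StringReader(xml));
      boolean inTarget = false;
      while (reader.hasNext()) {
        // The application decides when to move the cursor forward.
        if (reader.next() == XMLStreamConstants.START_ELEMENT) {
          String name = reader.getLocalName();
          if ("employee".equals(name)) {
            inTarget = id.equals(reader.getAttributeValue(null, "id"));
          } else if (inTarget && "firstName".equals(name)) {
            // getElementText() reads the text up to the matching end tag.
            String text = reader.getElementText().trim();
            reader.close();
            return text; // stop here; the rest is never requested
          }
        }
      }
      reader.close();
      return null;
    } catch (XMLStreamException ex) {
      // Wrapped only to keep the sketch easy to call.
      throw new IllegalStateException(ex);
    }
  }

  public static void main(String[] args) {
    String xml = "<employees>"
        + "<employee id=\"111\"><firstName>Rakesh</firstName></employee>"
        + "<employee id=\"112\"><firstName>John</firstName></employee>"
        + "<employee id=\"113\"><firstName>Rajesh</firstName></employee>"
        + "</employees>";
    System.out.println(firstNameOf(xml, "112"));
  }
}
```

With SAX the handler would keep receiving callbacks unless it aborted parsing by throwing an exception; here the loop simply returns.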

XMLInputFactory and XMLStreamReader are the two types used to load an XML file. As we read through the file with XMLStreamReader, each event is reported as an integer value, which is then compared with the constants in XMLStreamConstants. The code below shows how to parse XML with the StAX parser:

import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
public class StaxParserDemo {
  public static void main(String[] args) throws XMLStreamException {
    List<Employee> empList = null;
    Employee currEmp = null;
    String tagContent = null;
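    XMLInputFactory factory = XMLInputFactory.newInstance();
    //Ask for adjacent character data to be reported as a single event,
    //so the text of an element is not split across CHARACTERS events.
    factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);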
    XMLStreamReader reader =
        factory.createXMLStreamReader(
        ClassLoader.getSystemResourceAsStream("xml/employee.xml"));
    while(reader.hasNext()){
      int event = reader.next();
      switch(event){
        case XMLStreamConstants.START_ELEMENT:
          if ("employee".equals(reader.getLocalName())){
            currEmp = new Employee();
            //Look the attribute up by name rather than by position.
            currEmp.id = reader.getAttributeValue(null, "id");
          }
          if("employees".equals(reader.getLocalName())){
            empList = new ArrayList<>();
          }
          break;
        case XMLStreamConstants.CHARACTERS:
          tagContent = reader.getText().trim();
          break;
        case XMLStreamConstants.END_ELEMENT:
          switch(reader.getLocalName()){
            case "employee":
              empList.add(currEmp);
              break;
            case "firstName":
              currEmp.firstName = tagContent;
              break;
            case "lastName":
              currEmp.lastName = tagContent;
              break;
            case "location":
              currEmp.location = tagContent;
              break;
          }
          break;
      }
    }
    //Print the employee list populated from XML
    for ( Employee emp : empList){
      System.out.println(emp);
    }
  }
}
class Employee{
  String id;
  String firstName;
  String lastName;
  String location;
  @Override
  public String toString(){
    return firstName+" "+lastName+"("+id+") "+location;
  }
}

Program output:

Rakesh Mishra(111) Bangalore
John Davis(112) Chennai
Rajesh Sharma(113) Pune
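
The listing above uses the StAX cursor API (XMLStreamReader). StAX also provides an iterator API, XMLEventReader, which returns each event as an XMLEvent object instead of an integer code. The following is a minimal sketch of that style; the class name StaxEventReaderSketch and the method employeeIds are invented for this illustration, and only the id attributes are collected to keep it short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical sketch class, not part of the original article.
class StaxEventReaderSketch {

  // Collects the id attribute of every <employee> element.
  static List<String> employeeIds(String xml) {
    List<String> ids = new ArrayList<>();
    try {
      XMLEventReader reader = XMLInputFactory.newInstance()
          .createXMLEventReader(new StringReader(xml));
      while (reader.hasNext()) {
        XMLEvent event = reader.nextEvent();
        if (event.isStartElement()) {
          StartElement start = event.asStartElement();
          if ("employee".equals(start.getName().getLocalPart())) {
            Attribute id = start.getAttributeByName(new QName("id"));
            if (id != null) {
              ids.add(id.getValue());
            }
          }
        }
      }
      reader.close();
    } catch (XMLStreamException ex) {
      // Wrapped only to keep the sketch easy to call.
      throw new IllegalStateException(ex);
    }
    return ids;
  }

  public static void main(String[] args) {
    String xml = "<employees>"
        + "<employee id=\"111\"><firstName>Rakesh</firstName></employee>"
        + "<employee id=\"112\"><firstName>John</firstName></employee>"
        + "</employees>";
    System.out.println(employeeIds(xml));
  }
}
```

Event objects are convenient to pass around or store, while the cursor API avoids allocating an object per event.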


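Translated and compiled by 程序爱好者. The original article is from Java Code Geeks, and its content belongs to the original author. Please credit the source when reposting.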
