Parsing XML in Java with XPath and dom4j

1 Four approaches to parsing XML files

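There are four typical ways to parse an XML file. Two of them are fundamental: SAX and DOM. SAX parses the document as a stream of events, while DOM parses it into a tree that mirrors the document structure. JDOM was built on top of these to reduce the amount of code that DOM and SAX require; it follows the 80/20 rule (the Pareto principle) and cuts the code volume considerably. JDOM is generally used when the task is simple, such as parsing or creating documents. Under the hood, however, JDOM still relies on SAX (the most common choice), DOM, or Xalan documents. The fourth option is DOM4J, an excellent Java XML API known for good performance, rich functionality, and ease of use; it is also open source. More and more Java software now uses DOM4J to read and write XML, and it is worth noting that even Sun's JAXM uses it. Detailed introductions to all four approaches are easy to find with a web search.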
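To make the contrast between the two fundamental styles concrete, here is a minimal sketch that uses only the parsers bundled with the JDK. The class name `SaxVsDomSketch` and the inline sample XML are assumptions made for illustration; they do not come from the original article.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxVsDomSketch {
    // Hypothetical sample document, used only for this illustration.
    static final String XML = "<books><book>Java</book><book>XML</book></books>";

    static ByteArrayInputStream input() {
        return new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8));
    }

    // SAX: the parser pushes events to a handler; no tree is kept in memory.
    static List<String> elementNamesViaSax() throws Exception {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(qName); // one event per opening tag, in document order
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(input(), handler);
        return names;
    }

    // DOM: the whole document is loaded into a tree that can be navigated afterwards.
    static List<String> bookTitlesViaDom() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(input());
        NodeList books = doc.getElementsByTagName("book");
        List<String> titles = new ArrayList<>();
        for (int i = 0; i < books.getLength(); i++) {
            titles.add(books.item(i).getTextContent());
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("SAX start-element events: " + elementNamesViaSax());
        System.out.println("DOM book titles: " + bookTitlesViaDom());
    }
}
```

The SAX version sees each element once as it streams past, which suits large files; the DOM version can revisit any node after parsing, at the cost of holding the whole tree in memory.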

2 A brief introduction to XPath

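XPath is a language for finding information in an XML document. It navigates through the elements and attributes of a document and can traverse them. XPath is a major element of the W3C XSLT standard, and both XQuery and XPointer are built on XPath expressions, so understanding XPath is the foundation for many advanced XML applications. XPath plays a role much like SQL does for databases, or like jQuery selectors: it makes it easy for developers to pull out the parts of a document they need. DOM4J supports XPath as well.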
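As a small illustration of navigating by element path and filtering by attribute, the following sketch uses the JDK's own `javax.xml.xpath` API rather than DOM4J. The class name `XPathBasicsSketch` and the sample document are hypothetical and chosen only for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathBasicsSketch {
    // Hypothetical sample document (no namespaces), used only for this illustration.
    static final String XML =
        "<library>"
      + "<book lang=\"en\"><title>Effective Java</title></book>"
      + "<book lang=\"zh\"><title>XML Basics</title></book>"
      + "</library>";

    static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
    }

    // Navigate by element path and position: the title of the second book.
    static String secondTitle() throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate("/library/book[2]/title", parse());
    }

    // Filter by attribute value, then count how many nodes match.
    // The value is concatenated for brevity; do not do this with untrusted input.
    static int countBooksInLanguage(String lang) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList hits = (NodeList) xpath.evaluate(
                "//book[@lang='" + lang + "']", parse(), XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Second title: " + secondTitle());
        System.out.println("Books in English: " + countBooksInLanguage("en"));
    }
}
```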

3 Using XPath with DOM4J

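To parse an XML document with XPath in DOM4J, first add two JARs to the project:

dom4j-1.6.1.jar: the DOM4J package, available from http://sourceforge.net/projects/dom4j/;

jaxen-xx.xx.jar: if this JAR is missing, an exception is usually thrown (java.lang.NoClassDefFoundError: org/jaxen/JaxenException); available from http://www.jaxen.org/releases.html.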
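One way to catch the missing-JAR problem early is to probe the classpath before running any XPath query. The sketch below is only an illustration: the class name `JaxenCheckSketch` and the helper `isOnClasspath` are hypothetical, and the only library-specific detail it relies on is the class name `org.jaxen.JaxenException` taken from the error message above.

```java
class JaxenCheckSketch {
    // Returns true if the named class can be found on the current classpath.
    // The class is located but not initialized.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, JaxenCheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the NoClassDefFoundError message quoted in the text.
        boolean jaxenPresent = isOnClasspath("org.jaxen.JaxenException");
        System.out.println("jaxen on classpath: " + jaxenPresent);
    }
}
```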

3.1 Namespace interference

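When processing XML that was converted from an Excel file or another format, XPath queries often return nothing. This is usually caused by namespaces. Take the XML file below as an example: a simple query with XPath="//Workbook/Worksheet/Table/Row[1]/Cell[1]/Data[1]" generally yields no result, because of the default namespace declaration (xmlns="urn:schemas-microsoft-com:office:spreadsheet"). In XPath 1.0 an unprefixed name matches only elements that are in no namespace, whereas every element in this file inherits the default namespace.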

<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
  <Worksheet ss:Name="Sheet1">
    <Table ss:ExpandedColumnCount="81" ss:ExpandedRowCount="687" x:FullColumns="1" x:FullRows="1" ss:DefaultColumnWidth="52.5" ss:DefaultRowHeight="15.5625">
      <Row ss:AutoFitHeight="0">
        <Cell>
          <Data ss:Type="String">敲代码的耗子</Data>
        </Cell>
      </Row>
      <Row ss:AutoFitHeight="0">
        <Cell>
          <Data ss:Type="String">Sunny</Data>
        </Cell>
      </Row>
    </Table>
  </Worksheet>
</Workbook>
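The effect can be reproduced with the JDK's namespace-aware DOM parser and `javax.xml.xpath`, without DOM4J. This is a sketch under stated assumptions: the class name `NamespaceInterferenceSketch` is hypothetical, and the inline XML is a reduced stand-in for the workbook above, with placeholder text in the first cell.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceInterferenceSketch {
    // Reduced stand-in for the sample workbook: same default namespace, two rows.
    static final String XML =
        "<Workbook xmlns=\"urn:schemas-microsoft-com:office:spreadsheet\""
      + " xmlns:ss=\"urn:schemas-microsoft-com:office:spreadsheet\">"
      + "<Worksheet ss:Name=\"Sheet1\"><Table>"
      + "<Row><Cell><Data ss:Type=\"String\">first</Data></Cell></Row>"
      + "<Row><Cell><Data ss:Type=\"String\">Sunny</Data></Cell></Row>"
      + "</Table></Worksheet></Workbook>";

    static Document parse() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // keep namespace information in the tree
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
    }

    // Number of nodes selected by the expression.
    static int countMatches(String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList hits = (NodeList) xpath.evaluate(expression, parse(), XPathConstants.NODESET);
        return hits.getLength();
    }

    // String value of the first node selected by the expression.
    static String text(String expression) throws Exception {
        return XPathFactory.newInstance().newXPath().evaluate(expression, parse());
    }

    public static void main(String[] args) throws Exception {
        String plain = "//Workbook/Worksheet/Table/Row[1]/Cell[1]/Data[1]";
        String byLocalName =
            "//*[local-name()='Row'][2]/*[local-name()='Cell'][1]/*[local-name()='Data'][1]";
        // Unprefixed names do not match elements in the default namespace.
        System.out.println("plain path matches: " + countMatches(plain));
        // Matching on local-name() ignores the namespace.
        System.out.println("local-name() path text: " + text(byLocalName));
    }
}
```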

3.2 Parsing namespaced XML with XPath

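Method 1 (read1()): use XPath's built-in local-name() and namespace-uri() functions to specify the node name and namespace you want. The XPath expressions are rather tedious to write.

Method 2 (read2()): set the namespace on the XPath object with setNamespaceURIs().

Method 3 (read3()): set the namespace on the DocumentFactory with setXPathNamespaceURIs(). The XPath expressions for methods 2 and 3 are comparatively simple.

Method 4 (read4()): the same approach as method 3, but with a different XPath expression (see the code). Its purpose is to test whether the form of the expression, mainly how complete the path is, affects query efficiency.

(All four methods above parse the XML file with DOM4J and XPath.)

Method 5 (read5()): parse the XML file with DOM and XPath, mainly to compare performance.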
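For the plain DOM route there is also a namespace-aware alternative that mirrors the prefix binding of methods 2 and 3: register a `NamespaceContext` on the JDK `XPath` object. The sketch below is an illustration, not the article's code. The class name `PrefixBindingSketch`, the prefix `wb`, and the reduced inline workbook are all assumptions chosen for this example.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class PrefixBindingSketch {
    static final String SPREADSHEET_NS = "urn:schemas-microsoft-com:office:spreadsheet";

    // Reduced stand-in for the sample workbook, with placeholder text in the first cell.
    static final String XML =
        "<Workbook xmlns=\"" + SPREADSHEET_NS + "\"><Worksheet><Table>"
      + "<Row><Cell><Data>first</Data></Cell></Row>"
      + "<Row><Cell><Data>Sunny</Data></Cell></Row>"
      + "</Table></Worksheet></Workbook>";

    // Binds the arbitrarily chosen prefix "wb" to the spreadsheet namespace.
    static final NamespaceContext CONTEXT = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            return "wb".equals(prefix) ? SPREADSHEET_NS : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String namespaceURI) {
            return SPREADSHEET_NS.equals(namespaceURI) ? "wb" : null;
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            String prefix = getPrefix(namespaceURI);
            return prefix == null
                    ? Collections.<String>emptyIterator()
                    : Collections.singletonList(prefix).iterator();
        }
    };

    static String secondRowText() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // required for prefixed names to match
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(CONTEXT);
        // The prefix in the expression need not appear in the document; only the URI matters.
        return xpath.evaluate("//wb:Row[2]/wb:Cell[1]/wb:Data[1]", doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("show = " + secondRowText());
    }
}
```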

Nothing explains this better than code, so here it is.

package XPath;

import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.XPath;
import org.dom4j.io.SAXReader;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/**
 * DOM4J, DOM, XML, XPath
 */
public class TestDom4jXpath {

    public static void main(String[] args) {
        read1();
        read2();
        read3();
        read4(); // same approach as read3(), but with a different XPath expression
        read5();
    }

    public static void read1() {
        /*
         * Use local-name() and namespace-uri() in XPath
         */
        try {
            long startTime = System.currentTimeMillis();
            SAXReader reader = new SAXReader();
            // Classpath resource names use '/' as the separator on every platform
            InputStream in = TestDom4jXpath.class.getClassLoader().getResourceAsStream("XPath/XXX.xml");
            Document doc = reader.read(in);
            // Fully qualified alternative:
            // String xpath = "//*[local-name()='Workbook' and namespace-uri()='urn:schemas-microsoft-com:office:spreadsheet']"
            //         + "/*[local-name()='Worksheet']"
            //         + "/*[local-name()='Table']"
            //         + "/*[local-name()='Row'][4]"
            //         + "/*[local-name()='Cell'][3]"
            //         + "/*[local-name()='Data'][1]";
            String xpath = "//*[local-name()='Row'][4]/*[local-name()='Cell'][3]/*[local-name()='Data'][1]";
            System.err.println("===== use local-name() and namespace-uri() in XPath ====");
            System.err.println("XPath: " + xpath);
            // List<?> compiles whether selectNodes() returns a raw List or List<Node>
            List<?> list = doc.selectNodes(xpath);
            for (Object o : list) {
                Element e = (Element) o;
                String show = e.getStringValue();
                System.out.println("show = " + show);
            }
            // Measure once, after all results have been printed
            long endTime = System.currentTimeMillis();
            System.out.println("Elapsed time: " + (endTime - startTime) + "ms");
        } catch (DocumentException e) {
            e.printStackTrace();
        }
    }

    public static void read2() {
        /*
         * Set the XPath namespace (setNamespaceURIs)
         */
        try {
            long startTime = System.currentTimeMillis();
            // Bind the prefix "Workbook" to the spreadsheet namespace
            Map<String, String> map = new HashMap<>();
            map.put("Workbook", "urn:schemas-microsoft-com:office:spreadsheet");
            SAXReader reader = new SAXReader();
            // Classpath resource names use '/' as the separator on every platform
            InputStream in = TestDom4jXpath.class.getClassLoader().getResourceAsStream("XPath/XXX.xml");
            Document doc = reader.read(in);
            String xpath = "//Workbook:Row[4]/Workbook:Cell[3]/Workbook:Data[1]";
            System.err.println("===== use setNamespaceURIs() to set xpath namespace ====");
            System.err.println("XPath: " + xpath);
            XPath x = doc.createXPath(xpath);
            x.setNamespaceURIs(map);
            // List<?> compiles whether selectNodes() returns a raw List or List<Node>
            List<?> list = x.selectNodes(doc);
            for (Object o : list) {
                Element e = (Element) o;
                String show = e.getStringValue();
                System.out.println("show = " + show);
            }
            // Measure once, after all results have been printed
            long endTime = System.currentTimeMillis();
            System.out.println("Elapsed time: " + (endTime - startTime) + "ms");
        } catch (DocumentException e) {
            e.printStackTrace();
        }
    }

    public static void read3() {
        /*
         * Set the DocumentFactory namespace (setXPathNamespaceURIs)
         */
        try {
            long startTime = System.currentTimeMillis();
            // Bind the prefix "Workbook" to the spreadsheet namespace
            Map<String, String> map = new HashMap<>();
            map.put("Workbook", "urn:schemas-microsoft-com:office:spreadsheet");
            SAXReader reader = new SAXReader();
            // Classpath resource names use '/' as the separator on every platform
            InputStream in = TestDom4jXpath.class.getClassLoader().getResourceAsStream("XPath/XXX.xml");
            // Register the mapping on the factory before reading the document
            reader.getDocumentFactory().setXPathNamespaceURIs(map);
            Document doc = reader.read(in);
            String xpath = "//Workbook:Row[4]/Workbook:Cell[3]/Workbook:Data[1]";
            System.err.println("===== use setXPathNamespaceURIs() to set DocumentFactory() namespace ====");
            System.err.println("XPath: " + xpath);
            // List<?> compiles whether selectNodes() returns a raw List or List<Node>
            List<?> list = doc.selectNodes(xpath);
            for (Object o : list) {
                Element e = (Element) o;
                String show = e.getStringValue();
                System.out.println("show = " + show);
            }
            // Measure once, after all results have been printed
            long endTime = System.currentTimeMillis();
            System.out.println("Elapsed time: " + (endTime - startTime) + "ms");
        } catch (DocumentException e) {
            e.printStackTrace();
        }
    }

    public static void read4() {
        /*
         * Same approach as read3(), but with a different XPath expression
         */
        try {
            long startTime = System.currentTimeMillis();
            // Bind the prefix "Workbook" to the spreadsheet namespace
            Map<String, String> map = new HashMap<>();
            map.put("Workbook", "urn:schemas-microsoft-com:office:spreadsheet");
            SAXReader reader = new SAXReader();
            // Classpath resource names use '/' as the separator on every platform
            InputStream in = TestDom4jXpath.class.getClassLoader().getResourceAsStream("XPath/XXX.xml");
            reader.getDocumentFactory().setXPathNamespaceURIs(map);
            Document doc = reader.read(in);
            // More complete path than in read3(), to compare query efficiency
            String xpath = "//Workbook:Worksheet/Workbook:Table/Workbook:Row[4]/Workbook:Cell[3]/Workbook:Data[1]";
            System.err.println("===== use setXPathNamespaceURIs() to set DocumentFactory() namespace ====");
            System.err.println("XPath: " + xpath);
            // List<?> compiles whether selectNodes() returns a raw List or List<Node>
            List<?> list = doc.selectNodes(xpath);
            for (Object o : list) {
                Element e = (Element) o;
                String show = e.getStringValue();
                System.out.println("show = " + show);
            }
            // Measure once, after all results have been printed
            long endTime = System.currentTimeMillis();
            System.out.println("Elapsed time: " + (endTime - startTime) + "ms");
        } catch (DocumentException e) {
            e.printStackTrace();
        }
    }

    public static void read5() {
        /*
         * DOM and XPath
         */
        try {
            long startTime = System.currentTimeMillis();
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            // Parse without namespace awareness; the unprefixed path below relies on this
            dbf.setNamespaceAware(false);
            DocumentBuilder builder = dbf.newDocumentBuilder();
            // Classpath resource names use '/' as the separator on every platform
            InputStream in = TestDom4jXpath.class.getClassLoader().getResourceAsStream("XPath/XXX.xml");
            org.w3c.dom.Document doc = builder.parse(in);
            XPathFactory factory = XPathFactory.newInstance();
            javax.xml.xpath.XPath x = factory.newXPath();
            // Select the Data element by its full path
            String xpath = "//Workbook/Worksheet/Table/Row[4]/Cell[3]/Data[1]";
            System.err.println("===== Dom XPath ====");
            System.err.println("XPath: " + xpath);
            XPathExpression expr = x.compile(xpath);
            // NODESET is the return type that is specified to yield a NodeList
            NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                // getTextContent() returns the element's text; getNodeValue() is null for elements
                System.out.println("show = " + nodes.item(i).getTextContent());
            }
            // Measure once, after all results have been printed
            long endTime = System.currentTimeMillis();
            System.out.println("Elapsed time: " + (endTime - startTime) + "ms");
        } catch (XPathExpressionException e) {
            e.printStackTrace();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
