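This article was originally published on the 择维士 community.
XPath is a language for extracting information from XML content, much as JSONPath does for JSON (see "How to use JSONPath"). This article covers the basic XPath syntax and how to evaluate XPath expressions in Java.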
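The general expression format is a path such as /foo/bar, which matches the bar nodes in the following XML:
<foo>
  <bar/>
</foo>
Or:
<foo>
  <bar/>
  <bar/>
  <bar/>
</foo>
An expression that starts with // ignores depth: it matches nodes at any level of the document.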
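The difference between an absolute path and a // path can be checked with a minimal sketch like the one below. The class name PathDepthSketch, the helper count, and the nested sample document are assumptions made for illustration; only the javax.xml APIs come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class PathDepthSketch {
    // Hypothetical helper: parses the XML and returns how many nodes the expression selects.
    static int count(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String flat = "<foo><bar/><bar/><bar/></foo>";
        // Assumed sample: foo is not the root here, and one bar lives outside foo.
        String nested = "<root><foo><bar/></foo><wrapper><bar/></wrapper></root>";

        System.out.println(count(flat, "/foo/bar"));    // 3
        System.out.println(count(nested, "/foo/bar"));  // 0, because the root element is not foo
        System.out.println(count(nested, "//foo/bar")); // 1
        System.out.println(count(nested, "//bar"));     // 2
    }
}
```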
Common node types:
| Location Path | Description |
|---|---|
| /foo/bar/@id | The id attribute of the bar element |
| /foo/bar/text() | The text value of the bar element |
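As a rough illustration of the two location paths in the table, the sketch below reads an attribute and a text value as strings. The sample document, the class name NodeTypeSketch, and the helper valueOf are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

class NodeTypeSketch {
    // Assumed sample document with one bar element carrying an id and some text.
    static final String XML = "<foo><bar id='b1'>hello</bar></foo>";

    // Hypothetical helper: evaluates the expression and returns the string value of the first match.
    static String valueOf(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return (String) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.STRING);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(valueOf(XML, "/foo/bar/@id"));    // b1
        System.out.println(valueOf(XML, "/foo/bar/text()")); // hello
    }
}
```

Asking for XPathConstants.STRING avoids casting to a DOM node when only the value is needed.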
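Predicates let us find nodes that satisfy a condition. The format is [expression]. For example:
Select every foo node, at any depth, that has an include attribute whose value is true:
//foo[@include='true']
Predicates can be chained; this one additionally requires a mode attribute equal to bar:
//foo[@include='true'][@mode='bar']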
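The two predicate expressions can be tried out with a small sketch such as the following. The sample document and its id attributes are invented here so that the matches are easy to tell apart; PredicateSketch and ids are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class PredicateSketch {
    // Assumed sample: four foo elements at different depths, labelled with an id attribute.
    static final String XML =
            "<root>"
            + "<foo id='a' include='true' mode='bar'/>"
            + "<foo id='b' include='true' mode='baz'/>"
            + "<foo id='c' include='false' mode='bar'/>"
            + "<group><foo id='d' include='true' mode='bar'/></group>"
            + "</root>";

    // Hypothetical helper: returns the id attribute of every element the expression selects.
    static List<String> ids(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("id"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(ids(XML, "//foo[@include='true']"));               // [a, b, d]
        System.out.println(ids(XML, "//foo[@include='true'][@mode='bar']"));  // [a, d]
    }
}
```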
<Tutorials>
  <Tutorial tutId="01" type="java">
    <title>Guava</title>
    <description>Introduction to Guava</description>
    <date>04/04/2016</date>
    <author>GuavaAuthor</author>
  </Tutorial>
  <Tutorial tutId="02" type="java">
    <title>XML</title>
    <description>Introduction to XPath</description>
    <date>04/05/2016</date>
    <author>XMLAuthor</author>
  </Tutorial>
</Tutorials>
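For example, with the XML above (XPath 1.0 has no first() function, so the first node is selected with [1] or [position()=1]):
/Tutorials/Tutorial[1]
/Tutorials/Tutorial[position()=1]
/Tutorials/Tutorial[position()<4]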
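A minimal sketch of positional predicates against the Tutorials document, assuming the XML shown above with whitespace removed. PositionSketch and tutIds are hypothetical names; last() is the built-in XPath 1.0 function for the final position.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class PositionSketch {
    // Shortened copy of the Tutorials document; only the fields needed here are kept.
    static final String XML =
            "<Tutorials>"
            + "<Tutorial tutId='01' type='java'><title>Guava</title></Tutorial>"
            + "<Tutorial tutId='02' type='java'><title>XML</title></Tutorial>"
            + "</Tutorials>";

    // Hypothetical helper: returns the tutId of every Tutorial the expression selects.
    static List<String> tutIds(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("tutId"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tutIds("/Tutorials/Tutorial[1]"));            // [01]
        System.out.println(tutIds("/Tutorials/Tutorial[last()]"));       // [02]
        System.out.println(tutIds("/Tutorials/Tutorial[position()<4]")); // [01, 02]
    }
}
```

Positions in XPath start at 1, not 0.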
JDK 11 supports XPath evaluation natively through the javax.xml.xpath package. Taking the XML above as an example:
Return all /Tutorials/Tutorial nodes:
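import org.w3c.dom.*;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
import java.io.*;
import java.nio.charset.StandardCharsets;

public class XmlDemo {
    public static void main(String[] args) throws Exception {
        // Parse the XML string into a DOM document.
        DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = builderFactory.newDocumentBuilder();
        Document xmlDocument = builder.parse(
                new ByteArrayInputStream(EXAMPLE_STRING.getBytes(StandardCharsets.UTF_8)));

        // Compile the expression and evaluate it as a node set.
        XPath xPath = XPathFactory.newInstance().newXPath();
        String expression = "/Tutorials/Tutorial";
        NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
        System.out.println("Found nodes length = " + nodeList.getLength());
    }

    // The same Tutorials XML as in the example above.
    static final String EXAMPLE_STRING = "" +
            "<Tutorials>\n" +
            "  <Tutorial tutId=\"01\" type=\"java\">\n" +
            "    <title>Guava</title>\n" +
            "    <description>Introduction to Guava</description>\n" +
            "    <date>04/04/2016</date>\n" +
            "    <author>GuavaAuthor</author>\n" +
            "  </Tutorial>\n" +
            "  <Tutorial tutId=\"02\" type=\"java\">\n" +
            "    <title>XML</title>\n" +
            "    <description>Introduction to XPath</description>\n" +
            "    <date>04/05/2016</date>\n" +
            "    <author>XMLAuthor</author>\n" +
            "  </Tutorial>\n" +
            "</Tutorials>";
}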
Get the Tutorial node whose tutId is 01:
String expression = "/Tutorials/Tutorial[@tutId=\"01\"]";
Node node = (Node) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODE);
System.out.println("Found node: " + node);
Get the Tutorial nodes that contain a title whose value is Guava:
String expression = "//Tutorial[descendant::title[text()='Guava']]";
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
System.out.println("Found title=Guava length: " + nodeList.getLength());
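Printing a DOM node directly shows little more than its name, so a natural next step is to read values from the matched node. The sketch below evaluates a relative expression against the node found by the first query. ChildValueSketch and childText are hypothetical names, and the XML is a shortened copy of the Tutorials document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class ChildValueSketch {
    // Shortened copy of the Tutorials document used throughout the article.
    static final String XML =
            "<Tutorials>"
            + "<Tutorial tutId='01' type='java'><title>Guava</title><author>GuavaAuthor</author></Tutorial>"
            + "<Tutorial tutId='02' type='java'><title>XML</title><author>XMLAuthor</author></Tutorial>"
            + "</Tutorials>";

    // Hypothetical helper: finds one node, then reads the text of one of its children.
    static String childText(String xml, String nodeExpression, String childName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            Node node = (Node) xPath.evaluate(nodeExpression, doc, XPathConstants.NODE);
            if (node == null) {
                return null; // nothing matched the first expression
            }
            // A relative expression is resolved against the node passed as context.
            return xPath.evaluate(childName, node);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + nodeExpression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(childText(XML, "/Tutorials/Tutorial[@tutId='01']", "title"));  // Guava
        System.out.println(childText(XML, "/Tutorials/Tutorial[@tutId='02']", "author")); // XMLAuthor
    }
}
```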