Basic Usage of XPath
-
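What is XPath
XPath is a query language for extracting information from XML content, much as JSONPath does for JSON (see: How to Use JSONPath). This article introduces the basic XPath syntax and shows how to use XPath to extract information in Java.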
Basic XPath syntax
XPath expressions
A typical expression looks like this:
/foo/bar
It matches the bar nodes in XML content such as:

    <foo>
      <bar/>
    </foo>

or:

    <foo>
      <bar/>
      <bar/>
      <bar/>
    </foo>
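As a quick check of the matching rule above, the following sketch evaluates /foo/bar against both XML snippets and counts the selected nodes. The class name PathMatchDemo and the helper countMatches are made up for illustration; only the javax.xml.xpath and DOM calls come from the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the name and helper method are illustrative only.
class PathMatchDemo {

    // Parses the XML string and returns how many nodes the expression selects.
    static int countMatches(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // One bar child under the root foo.
        System.out.println(countMatches("<foo><bar/></foo>", "/foo/bar"));
        // Three bar children under the root foo.
        System.out.println(countMatches("<foo><bar/><bar/><bar/></foo>", "/foo/bar"));
    }
}
```

The first call selects one node and the second selects three, since /foo/bar matches every bar element that is a direct child of the root foo.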
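An expression that starts with // is not restricted to a fixed depth: it matches nodes at any level of the document. Common node types in a location path: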
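| Location Path | Description |
| --- | --- |
| /foo/bar/@id | the id attribute of the bar element |
| /foo/bar/text() | the text value of the bar element |

Predicates let us select only the nodes that satisfy a condition. A predicate is written as [expression]. For example, to select every foo node, at any depth, that has an include attribute whose value is true:

    //foo[@include='true']

Predicates can be chained, and each additional predicate narrows the result further:

    //foo[@include='true'][@mode='bar']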
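The sketch below tries these forms (attribute and text() steps, the // prefix, and single and chained predicates) against a small made-up document. The sample XML, the class name LocationPathDemo, and its helper methods are assumptions for illustration and do not come from the original article.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the sample XML and method names are illustrative only.
class LocationPathDemo {

    // Made-up sample: a root foo, plus two more foo elements nested deeper.
    static final String SAMPLE =
            "<foo include='true' mode='bar'>"
          + "<bar id='b1'>hello</bar>"
          + "<nested>"
          + "<foo include='true' mode='other'><bar id='b2'>world</bar></foo>"
          + "<foo include='false'><bar id='b3'>skip</bar></foo>"
          + "</nested>"
          + "</foo>";

    static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
    }

    // Evaluates the expression and returns its string value.
    static String stringValue(String expression) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expression, parse());
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    // Evaluates the expression as a node set and returns the number of nodes.
    static int count(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, parse(), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(stringValue("/foo/bar/@id"));     // attribute of the top-level bar
        System.out.println(stringValue("/foo/bar/text()"));  // text of the top-level bar
        System.out.println(count("/foo/bar"));               // direct children only
        System.out.println(count("//bar"));                  // bar elements at any depth
        System.out.println(count("//foo[@include='true']"));
        System.out.println(count("//foo[@include='true'][@mode='bar']"));
    }
}
```

With this sample, /foo/bar selects one element while //bar selects three, and adding the second predicate reduces the two foo elements with include='true' to the single one whose mode is bar.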
More predicate forms
    <?xml version="1.0"?>
    <Tutorials>
      <Tutorial tutId="01" type="java">
        <title>Guava</title>
        <description>Introduction to Guava</description>
        <date>04/04/2016</date>
        <author>GuavaAuthor</author>
      </Tutorial>
      <Tutorial tutId="02" type="java">
        <title>XML</title>
        <description>Introduction to XPath</description>
        <date>04/05/2016</date>
        <author>XMLAuthor</author>
      </Tutorial>
    </Tutorials>
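For the example above, predicates can also select nodes by position:

    /Tutorials/Tutorial[1]               the first Tutorial node (positions start at 1)
    /Tutorials/Tutorial[last()]          the last Tutorial node
    /Tutorials/Tutorial[position()<4]    the first three Tutorial nodes

XPath 1.0 provides last() and position() but has no first() function; use [1] to select the first node.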
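To see how positional predicates behave, here is a minimal sketch that evaluates [1], [last()], and [position()<4] against a trimmed copy of the Tutorials document (only the title elements are kept to keep it short). The class name PositionPredicateDemo and its helpers are hypothetical; last() and position() are standard XPath 1.0 functions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; names are illustrative only.
class PositionPredicateDemo {

    // Trimmed version of the Tutorials example: two Tutorial nodes, titles only.
    static final String XML =
            "<Tutorials>"
          + "<Tutorial tutId='01' type='java'><title>Guava</title></Tutorial>"
          + "<Tutorial tutId='02' type='java'><title>XML</title></Tutorial>"
          + "</Tutorials>";

    static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
    }

    // Returns the string value of the first node the expression selects.
    static String stringValue(String expression) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expression, parse());
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    // Returns how many nodes the expression selects.
    static int count(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, parse(), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(stringValue("/Tutorials/Tutorial[1]/title"));      // first Tutorial
        System.out.println(stringValue("/Tutorials/Tutorial[last()]/title")); // last Tutorial
        System.out.println(count("/Tutorials/Tutorial[position()<4]"));       // at most three
    }
}
```

Because the document holds only two Tutorial nodes, position()<4 selects both of them; the predicate sets an upper bound rather than requiring three matches.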
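Using XPath in Java
JDK 11 supports XPath evaluation natively (the javax.xml.xpath package has been part of the JDK since Java 5). The examples below parse the XML shown above:
Selecting a set of nodes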
Return all /Tutorials/Tutorial nodes:

    import org.w3c.dom.*;
    import javax.xml.parsers.*;
    import javax.xml.xpath.*;
    import java.io.*;
    import java.nio.charset.StandardCharsets;

    public class XmlDemo {
        public static void main(String[] args) throws Exception {
            // Parse the XML string into a DOM document.
            DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = builderFactory.newDocumentBuilder();
            Document xmlDocument = builder.parse(
                    new ByteArrayInputStream(EXAMPLE_STRING.getBytes(StandardCharsets.UTF_8)));

            // Compile the expression and evaluate it as a node set.
            XPath xPath = XPathFactory.newInstance().newXPath();
            String expression = "/Tutorials/Tutorial";
            NodeList nodeList = (NodeList) xPath.compile(expression)
                    .evaluate(xmlDocument, XPathConstants.NODESET);
            System.out.println("Find nodes length = " + nodeList.getLength());
        }

        static String EXAMPLE_STRING = "<?xml version=\"1.0\"?>" +
                "<Tutorials>\n" +
                "  <Tutorial tutId=\"01\" type=\"java\">\n" +
                "    <title>Guava</title>\n" +
                "    <description>Introduction to Guava</description>\n" +
                "    <date>04/04/2016</date>\n" +
                "    <author>GuavaAuthor</author>\n" +
                "  </Tutorial>\n" +
                "  <Tutorial tutId=\"02\" type=\"java\">\n" +
                "    <title>XML</title>\n" +
                "    <description>Introduction to XPath</description>\n" +
                "    <date>04/05/2016</date>\n" +
                "    <author>XMLAuthor</author>\n" +
                "  </Tutorial>\n" +
                "</Tutorials>";
    }
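The example above only prints how many nodes were found. In practice the next step is usually to walk the NodeList, so here is a minimal sketch that reads the tutId attribute and the title of each Tutorial through the DOM API. The class name TutorialListDemo, the summarize helper, and the "id: title" output format are assumptions for illustration, and the XML is a trimmed copy of the example document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; names and output format are illustrative only.
class TutorialListDemo {

    // Trimmed version of the Tutorials example document.
    static final String XML =
            "<Tutorials>"
          + "<Tutorial tutId='01' type='java'><title>Guava</title></Tutorial>"
          + "<Tutorial tutId='02' type='java'><title>XML</title></Tutorial>"
          + "</Tutorials>";

    // Returns one "tutId: title" entry per Tutorial element.
    static List<String> summarize(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("/Tutorials/Tutorial", doc, XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // The expression selects elements, so the cast to Element is safe here.
                Element tutorial = (Element) nodes.item(i);
                String id = tutorial.getAttribute("tutId");
                String title = tutorial.getElementsByTagName("title").item(0).getTextContent();
                result.add(id + ": " + title);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        summarize(XML).forEach(System.out::println);
    }
}
```

NodeList is not an Iterable, which is why the sketch uses an index-based loop with getLength() and item(i).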
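Selecting a node by id
Get the Tutorial node whose tutId is 01 (this fragment reuses xPath and xmlDocument from XmlDemo above):

    // Single quotes inside the XPath expression avoid escaping Java double quotes.
    String expression = "/Tutorials/Tutorial[@tutId='01']";
    // XPathConstants.NODE returns the first matching node.
    Node node = (Node) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODE);
    System.out.println("Find node: " + node);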
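Once a single node has been selected, further expressions can be evaluated relative to it by passing the node as the context item. The following sketch first selects a Tutorial node and then reads its title with the relative expression title. The class name RelativeXPathDemo and the helper titleOf are hypothetical, and the XML is a trimmed copy of the example document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; names are illustrative only.
class RelativeXPathDemo {

    // Trimmed version of the Tutorials example document.
    static final String XML =
            "<Tutorials>"
          + "<Tutorial tutId='01' type='java'><title>Guava</title></Tutorial>"
          + "<Tutorial tutId='02' type='java'><title>XML</title></Tutorial>"
          + "</Tutorials>";

    // Selects one node with nodeExpression, then evaluates "title" relative to it.
    // Returns null when nodeExpression matches nothing.
    static String titleOf(String nodeExpression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();

            Node tutorial = (Node) xPath.evaluate(nodeExpression, doc, XPathConstants.NODE);
            if (tutorial == null) {
                return null; // no matching node
            }
            // No leading slash: the path is resolved against the selected node.
            return xPath.evaluate("title", tutorial);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titleOf("/Tutorials/Tutorial[@tutId='01']"));
        System.out.println(titleOf("/Tutorials/Tutorial[@tutId='99']"));
    }
}
```

The null check matters because evaluating with XPathConstants.NODE yields null when nothing matches, as in the second call.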
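Selecting nodes by tag
Get the Tutorial nodes that contain a title element whose value is Guava (this fragment also reuses xPath and xmlDocument from XmlDemo above):

    // descendant::title matches a title element at any depth below Tutorial.
    String expression = "//Tutorial[descendant::title[text()='Guava']]";
    NodeList nodeList = (NodeList) xPath.compile(expression)
            .evaluate(xmlDocument, XPathConstants.NODESET);
    System.out.println("Found title=Guava length:" + nodeList.getLength());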
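When the title to search for comes from outside the program, building the expression by string concatenation breaks as soon as the value contains a quote. One alternative, sketched below as an option rather than as the article's own approach, is to reference an XPath variable ($title) and supply its value through XPath.setXPathVariableResolver. The class name TitleSearchDemo, the helper countByTitle, and the variable name title are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; names are illustrative only.
class TitleSearchDemo {

    // Trimmed version of the Tutorials example document.
    static final String XML =
            "<Tutorials>"
          + "<Tutorial tutId='01' type='java'><title>Guava</title></Tutorial>"
          + "<Tutorial tutId='02' type='java'><title>XML</title></Tutorial>"
          + "</Tutorials>";

    // Counts the Tutorial nodes that contain a title equal to the given value.
    static int countByTitle(String title) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();

            // Bind the XPath variable $title to the method argument.
            xPath.setXPathVariableResolver(name ->
                    "title".equals(name.getLocalPart()) ? title : null);

            NodeList nodes = (NodeList) xPath.evaluate(
                    "//Tutorial[descendant::title[text()=$title]]",
                    doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countByTitle("Guava"));
        // A value containing a quote is passed through as data, not as expression syntax.
        System.out.println(countByTitle("O'Reilly"));
    }
}
```

Because the value never becomes part of the expression text, a title such as O'Reilly is compared literally instead of producing a malformed expression.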
References
1. The XPath API in the JDK