Java and XPath tutorial: extracting a subset of an XML document
Q. What is XPath?
A. XPath is a query language for extracting part of an XML document, much as SQL is used to extract part of the data in a database or a regular expression (regex) is used to extract part of a text. In Java, an XPath expression can be evaluated to one of the following return types:
XPathConstants.STRING
XPathConstants.NUMBER
XPathConstants.BOOLEAN
XPathConstants.NODE
XPathConstants.NODESET
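To make the return types concrete, here is a minimal sketch that evaluates a few expressions against a small inline document and asks for a different return type each time. The class name XPathReturnTypes, the eval helper, and the sample XML are assumptions made for illustration only; the calls themselves are the standard javax.xml.xpath API.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathReturnTypes {

    // Hypothetical sample document, used only for this illustration.
    static final String XML =
            "<Employee><name type=\"first\">Peter</name><age>25</age></Employee>";

    // Parses XML and evaluates the expression, returning the requested result type.
    public static Object eval(String expression, QName returnType) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc, returnType);
        } catch (Exception e) {
            // Wrap checked parser/XPath exceptions to keep the sketch short.
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // STRING: string value of the first matching node
        String name = (String) eval("/Employee/name", XPathConstants.STRING);
        // NUMBER: the value converted to a Double
        Double age = (Double) eval("/Employee/age", XPathConstants.NUMBER);
        // BOOLEAN: result of a comparison or existence test
        Boolean adult = (Boolean) eval("/Employee/age > 18", XPathConstants.BOOLEAN);
        // NODE: the first matching DOM node
        Node ageNode = (Node) eval("/Employee/age", XPathConstants.NODE);
        // NODESET: every matching node, as a NodeList
        NodeList children = (NodeList) eval("/Employee/*", XPathConstants.NODESET);

        System.out.println("name     = " + name);
        System.out.println("age      = " + age);
        System.out.println("adult    = " + adult);
        System.out.println("node     = " + ageNode.getNodeName());
        System.out.println("children = " + children.getLength());
    }
}
```

Asking for STRING, NUMBER, or BOOLEAN avoids looping over a NodeList when only a single value is needed; NODESET is the one to use when several nodes may match.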
Here is a very basic example of XPath in Java that processes the following XML:
<Employee>
    <name type="first">Peter</name>
    <age>25</age>
</Employee>
package com.xml;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class XpathQuery {

    public static void main(String[] args) {
        String xml = "<Employee><name type=\"first\">Peter</name><age>25</age></Employee>";

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);

        try {
            // parse the XML string into a DOM document
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            XPathFactory xpathFactory = XPathFactory.newInstance();
            XPath xpath = xpathFactory.newXPath();

            // get the name of the employee with age > 18
            XPathExpression expr = xpath.compile("/Employee[age>18]/name/text()");
            NodeList nodes = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                System.out.println("age > 18 : " + nodes.item(i).getNodeValue());
            }

            // get the age of Peter
            expr = xpath.compile("/Employee[name='Peter']/age/text()");
            nodes = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                System.out.println("age of peter : " + nodes.item(i).getNodeValue());
            }

            // get the name where the attribute type is "first"
            expr = xpath.compile("/Employee/name[@type='first']/text()");
            nodes = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                System.out.println("attr type='first': " + nodes.item(i).getNodeValue());
            }
        } catch (ParserConfigurationException | IOException
                | SAXException | XPathExpressionException e) {
            // multi-catch requires Java 7 or later
            e.printStackTrace();
        }
    }
}

Output:
age > 18 : Peter
age of peter : 25
attr type='first': Peter
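Note: As with SQL, you need to learn the XPath query syntax. For example, @type selects the attribute named "type", and a leading "/" starts the path at the document root. Element and attribute names are case-sensitive, so @Type and @type are different. Consult an XPath syntax reference to learn more.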
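As a rough tour of that syntax, the sketch below runs a handful of expressions against the same Employee document and collects the text of whatever matches. The class name XPathSyntaxTour and the select helper are hypothetical names introduced for this example; only the XPath expressions are the point.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathSyntaxTour {

    // Same sample document as in the tutorial above.
    static final String XML =
            "<Employee><name type=\"first\">Peter</name><age>25</age></Employee>";

    // Returns the text content of every node matched by the expression.
    public static List<String> select(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            // Wrap checked parser/XPath exceptions to keep the sketch short.
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // "/" at the start: absolute path from the document root
        System.out.println(select("/Employee/name"));
        // "//": match elements at any depth
        System.out.println(select("//age"));
        // "@": select an attribute rather than an element
        System.out.println(select("/Employee/name/@type"));
        // "[...]": predicate that filters the matched nodes
        System.out.println(select("/Employee/name[@type='first']"));
        // names are case-sensitive, so @Type matches nothing here
        System.out.println(select("/Employee/name[@Type='first']"));
    }
}
```

An expression that matches nothing is not an error when evaluated as a NODESET; it simply yields an empty list, which is why a typo in an element or attribute name can go unnoticed.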