Java and XPath tutorial: extracting a subset of an XML document
Q. What is XPath?
A. XPath is a query language for extracting part of an XML document, much as SQL is used to extract part of the data in a database, or a regex (i.e. regular expression) is used to extract part of a text. In Java (javax.xml.xpath), the result of an XPath expression can be requested as one of the following types:
XPathConstants.STRING
XPathConstants.NUMBER
XPathConstants.BOOLEAN
XPathConstants.NODE
XPathConstants.NODESET
Here is a very basic example of XPath in Java that processes a very basic XML document:
<Employee>
    <name type="first">Peter</name>
    <age>25</age>
</Employee>
package com.xml;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
public class XpathQuery {

    public static void main(String[] args) {
        String xml = "<Employee><name type=\"first\">Peter</name><age>25</age></Employee>";

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder;
        Document document = null;

        try {
            // parse the XML string into a DOM document
            builder = factory.newDocumentBuilder();
            document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

            XPathFactory xpathFactory = XPathFactory.newInstance();
            XPath xpath = xpathFactory.newXPath();

            // get the name of the employee with age > 18
            XPathExpression expr = xpath.compile("/Employee[age>18]/name/text()");
            NodeList nodes = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                System.out.println("age > 18 : " + nodes.item(i).getNodeValue());
            }

            // get the age of Peter
            expr = xpath.compile("/Employee[name='Peter']/age/text()");
            nodes = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                System.out.println("age of peter : " + nodes.item(i).getNodeValue());
            }

            // get the name where the attribute type='first'
            expr = xpath.compile("/Employee/name[@type='first']/text()");
            nodes = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                System.out.println("attr type='first': " + nodes.item(i).getNodeValue());
            }
        } catch (ParserConfigurationException | IOException | SAXException
                | XPathExpressionException e) { // multi-catch requires Java 7 or later
            e.printStackTrace();
        }
    }
}
Output:
age > 18 : Peter
age of peter : 25
attr type='first': Peter
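The example above always asks for XPathConstants.NODESET. As a rough sketch of the other return types, the same Employee XML can be queried for a STRING, a NUMBER and a BOOLEAN. The class name XpathReturnTypes and the helper evaluate(...) are made up for this illustration; only the javax.xml.xpath calls come from the JDK.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class: the name and the helper below are illustrative only.
class XpathReturnTypes {

    static final String XML =
            "<Employee><name type=\"first\">Peter</name><age>25</age></Employee>";

    // Evaluates the expression against XML and converts the result
    // to the requested XPathConstants type.
    static Object evaluate(String expression, QName returnType) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(XML)), returnType);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // STRING: the text content of the first matching node
        String name = (String) evaluate("/Employee/name", XPathConstants.STRING);
        // NUMBER: always returned as a Double
        Double age = (Double) evaluate("/Employee/age", XPathConstants.NUMBER);
        // BOOLEAN: the result of a comparison or an existence test
        Boolean adult = (Boolean) evaluate("/Employee/age > 18", XPathConstants.BOOLEAN);

        System.out.println("name  : " + name);
        System.out.println("age   : " + age);
        System.out.println("adult : " + adult);
    }
}
```

Asking for a scalar type avoids looping over a NodeList when only one value is expected. Note that NUMBER results come back as a Double, so the age is 25.0 rather than an int.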
Note: As with SQL, you need to learn the XPath query syntax. For example, @type selects the attribute named "type", a leading "/" starts at the root of the document, and so on. Search for "XPath syntax" to learn more.
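To make the syntax note a bit more concrete, here is a small sketch that runs a handful of common XPath constructs against the same Employee XML and prints each result as a string. The class name XpathSyntaxTour and the helper select(...) are invented for this example, and the list of expressions is only a sample, not a full reference.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class: the name and the helper below are illustrative only.
class XpathSyntaxTour {

    static final String XML =
            "<Employee><name type=\"first\">Peter</name><age>25</age></Employee>";

    // Evaluates the expression against XML and returns the result as a string.
    static String select(String expression) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "/Employee/name",                // "/" walks down from the document root
            "//age",                         // "//" matches elements at any depth
            "/Employee/name/@type",          // "@" selects an attribute
            "/Employee/name[@type='first']", // "[...]" filters with a predicate
            "name(/Employee/*[2])"           // "*" is any element, "[2]" the second one
        };
        for (String expression : expressions) {
            System.out.println(expression + " -> " + select(expression));
        }
    }
}
```

Positions in XPath start at 1, not 0, so *[2] is the second child element of Employee.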