What is XML Parsing?
XML Parsing refers to going through an XML
document in order to access or modify data.
What is an XML Parser?
An XML parser provides a way to access or modify
data in an XML document. Java provides multiple options for parsing XML documents.
The following parsers and APIs are commonly used.
· DOM Parser − Parses an XML document by loading its complete contents and building the full hierarchical tree in memory.
· SAX Parser − Parses an XML document using event-based triggers. It does not load the complete document into memory.
· JDOM Parser − Parses an XML document in a similar fashion to the DOM parser, but through an easier, more Java-friendly API.
· StAX Parser − Parses an XML document as a stream, like the SAX parser, but lets the application pull events instead of having them pushed to it.
· XPath Parser − Extracts data from an XML document based on path expressions and is used extensively in conjunction with XSLT.
· DOM4J Parser − A Java library for working with XML, XPath, and XSLT using the Java Collections Framework. It provides support for DOM, SAX, and JAXP.
JAXB and XSLT APIs are also available for handling
XML in an object-oriented way.
Java DOM Parser
The DOM parser parses the entire XML document and loads
it into memory, then models it as a tree structure for easy traversal or
manipulation.
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Get the DOM builder factory
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
// Get the DOM builder
DocumentBuilder builder = factory.newDocumentBuilder();
// Load and parse the XML document; document contains the complete XML as a tree
Document document = builder.parse(ClassLoader.getSystemResourceAsStream("xml/employee.xml"));
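Once the tree is in memory, any part of it can be visited in any order. The following is a minimal sketch of such a traversal, not a definitive implementation. The contents of employee.xml are not shown in this post, so the sample XML, the element names (employees, employee, firstName), the id attribute and the class name DomExample are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class DomExample {
    // Hypothetical sample data; the real employee.xml is not shown in the post.
    static final String SAMPLE_XML =
        "<employees>"
      + "<employee id=\"1\"><firstName>Asha</firstName></employee>"
      + "<employee id=\"2\"><firstName>Ben</firstName></employee>"
      + "</employees>";

    // Parses the whole document into a tree, then collects "id:firstName" pairs.
    static List<String> firstNames(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        List<String> result = new ArrayList<>();
        NodeList employees = document.getElementsByTagName("employee");
        for (int i = 0; i < employees.getLength(); i++) {
            Element employee = (Element) employees.item(i);
            String id = employee.getAttribute("id");
            // Assumes every <employee> has a <firstName> child, as in the sample above.
            String name = employee.getElementsByTagName("firstName")
                                  .item(0).getTextContent();
            result.add(id + ":" + name);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstNames(SAMPLE_XML)); // prints [1:Asha, 2:Ben]
    }
}
```

Because the complete tree is available, the loop could just as well run backwards or jump straight to a single element, which is the random access that DOM is preferred for.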
Advantages
1) It supports both read and write operations, and the API is very simple to use.
2) It is preferred when random access to widely separated parts of a document is required.
Disadvantages
1) It is memory inefficient: the whole XML document needs to be loaded into memory.
2) It is comparatively slower than other parsers.
SAX (Simple API for XML)
The SAX parser differs from the DOM parser in that it does not load the complete XML
into memory. Instead, it reads the XML sequentially and triggers events
as it encounters opening tags, closing tags, character data, comments,
and so on. This is why the SAX parser is called an event-based parser.
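import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

SAXParserFactory parserFactory = SAXParserFactory.newInstance();
SAXParser parser = parserFactory.newSAXParser();
// SAXHandler is the application's own handler class, assumed to extend org.xml.sax.helpers.DefaultHandler
SAXHandler handler = new SAXHandler();
parser.parse(ClassLoader.getSystemResourceAsStream("xml/employee.xml"), handler);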
Features of SAX Parser
It does not create any internal structure.
Clients do not call the parser's methods directly; they
override the callback methods of the API and place their own code inside them.
It is an event-based parser and works like an event handler
in Java.
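To make the callback style concrete, here is a minimal sketch of a handler, assuming the same hypothetical employee data as elsewhere in this post. NameHandler and SaxExample are illustrative names, not the SAXHandler class used in the snippet above, whose source is not shown.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxExample {
    // Hypothetical sample data; the real employee.xml is not shown in the post.
    static final String SAMPLE_XML =
        "<employees>"
      + "<employee id=\"1\"><firstName>Asha</firstName></employee>"
      + "<employee id=\"2\"><firstName>Ben</firstName></employee>"
      + "</employees>";

    // The parser calls these methods; the client only overrides the ones it needs.
    static class NameHandler extends DefaultHandler {
        final List<String> names = new ArrayList<>();
        private StringBuilder text; // non-null only while inside <firstName>

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attributes) {
            if ("firstName".equals(qName)) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Character data may arrive in several chunks, so append rather than assign.
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("firstName".equals(qName)) {
                names.add(text.toString());
                text = null;
            }
        }
    }

    static List<String> firstNames(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        NameHandler handler = new NameHandler();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstNames(SAMPLE_XML)); // prints [Asha, Ben]
    }
}
```

The handler has to keep its own state (the text field here) because the parser never hands over more than one piece of the document at a time.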
Advantages
1) It is simple and memory efficient.
2) It is very fast and works for huge documents.
Disadvantages
1) It is event-based so its API is less intuitive.
2) Clients never know the full information because the data
is broken into pieces.
Java JDOM Parser
JDOM is an open source, Java-based library for parsing
XML documents. Its API is designed to be friendly to Java developers: it is
optimized for Java and uses familiar Java types such as List and arrays.
JDOM works with the DOM and SAX APIs and combines the
best of the two. It has a low memory footprint and is nearly as fast as SAX.
StAX Parser
StAX stands for Streaming API for XML. The StAX
parser differs from DOM in the same way the SAX parser does, and it also
differs from the SAX parser in a few subtle ways.
· The SAX parser pushes the data to the application, whereas the StAX parser lets the application pull the required data from the XML.
· The StAX parser maintains a cursor at the current position in the document and allows the application to extract the content available at the cursor, whereas the SAX parser issues events as and when certain data is encountered.
XMLInputFactory and XMLStreamReader are the two types that can be used to load an XML file.
As we read through the XML file using XMLStreamReader, events are reported
as integer values, which are then compared with the constants in XMLStreamConstants.
The code below shows how to parse XML using the StAX parser:
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;

XMLInputFactory factory = XMLInputFactory.newInstance();
XMLStreamReader reader = factory.createXMLStreamReader(ClassLoader.getSystemResourceAsStream("xml/employee.xml"));
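Creating the reader only opens the stream; the pull model shows up in the loop that drives it. The sketch below is one possible way to write that loop, under the assumption that the document contains firstName elements as in the hypothetical sample data. StaxExample is an illustrative name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class StaxExample {
    // Hypothetical sample data; the real employee.xml is not shown in the post.
    static final String SAMPLE_XML =
        "<employees>"
      + "<employee id=\"1\"><firstName>Asha</firstName></employee>"
      + "<employee id=\"2\"><firstName>Ben</firstName></employee>"
      + "</employees>";

    static List<String> firstNames(String xml) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        List<String> names = new ArrayList<>();
        try {
            // The application decides when to advance the cursor by calling next().
            while (reader.hasNext()) {
                int event = reader.next();
                // Event codes are plain ints, compared against XMLStreamConstants.
                if (event == XMLStreamConstants.START_ELEMENT
                        && "firstName".equals(reader.getLocalName())) {
                    // Reads the text content and leaves the cursor on the end tag.
                    names.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstNames(SAMPLE_XML)); // prints [Asha, Ben]
    }
}
```

Unlike the SAX handler, no state has to be carried between callbacks here, because the application itself asks for the text at the moment it sees the element it wants.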
JAXB (Java Architecture for XML Binding)
JAXB, the Java Architecture for XML Binding, is an efficient technology for converting XML to and from Java objects. It is mostly used to create Java classes from XML in Java web services. JAXB provides two general-purpose operations:
Marshalling – Converts a Java object into XML.
Unmarshalling – Converts XML into a Java object.
JAXB was bundled with the JDK from Java SE 6 through Java SE 10, so on those versions nothing extra needs to be downloaded. On earlier versions, and on Java 11 or later where the module was removed, you need to add the JAXB libraries such as ‘jaxb-api.jar’ and ‘jaxb-impl.jar’ to the classpath.
What is XPath XML parsing?
XPath is a more advanced technique for extracting required data from XML. It
provides extensive support for query-based extraction, so that precisely the
required data can be selected from an XML document.
XPath is similar to SQL in that it is query based: it provides a powerful set of expressions for selecting and extracting data from XML. Java provides full support for XPath, and all the required classes can be found in the ‘javax.xml.xpath’ package. We don’t need to download or add anything else.
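As a rough illustration of the query style, the sketch below selects one value with a path expression. The sample XML, its element names and the class name XPathExample are assumptions made for this example, since the post does not show employee.xml.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathExample {
    // Hypothetical sample data; the real employee.xml is not shown in the post.
    static final String SAMPLE_XML =
        "<employees>"
      + "<employee id=\"1\"><firstName>Asha</firstName></employee>"
      + "<employee id=\"2\"><firstName>Ben</firstName></employee>"
      + "</employees>";

    // Returns the first name of the employee with the given id,
    // or an empty string when no employee matches.
    static String firstNameOfEmployee(String xml, int id) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The predicate [@id='...'] plays a role similar to a WHERE clause in SQL.
        String expression = "/employees/employee[@id='" + id + "']/firstName";
        return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstNameOfEmployee(SAMPLE_XML, 2)); // prints Ben
    }
}
```

The id is taken as an int here so that nothing unexpected can end up inside the expression; values that come from user input should not be concatenated into an XPath string unchecked.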