Wednesday, 16 December 2015

SAX (Simple API for XML)


Architecture of SAX API





To parse XML using the SAX API:
  1. First, create an instance of SAXParserFactory.
  2. The SAXParserFactory creates a SAXParser for you.
  3. The SAXParser wraps a SAX reader (an org.xml.sax.XMLReader), which communicates with the various handlers.
  4. As the reader works through the document, it raises events by calling methods on the registered handlers, and the handlers perform the actual tasks.
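
The four steps above can be sketched roughly as follows. This is a minimal illustration, assuming that the SAX reader mentioned above is the org.xml.sax.XMLReader returned by SAXParser.getXMLReader(); the class name SaxSetupSketch and the method parseRootName are made up for this example.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class, used only to illustrate the four steps.
public class SaxSetupSketch {

    // Parses an in-memory document and returns the name of its root element.
    public static String parseRootName(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance(); // step 1
        SAXParser parser = factory.newSAXParser();                 // step 2
        XMLReader reader = parser.getXMLReader();                  // step 3

        final String[] root = new String[1];
        // Step 4: the reader calls back into the registered handler.
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes attributes) {
                if (root[0] == null) {
                    root[0] = qName; // the first start tag is the root
                }
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return root[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseRootName("<sky><employee/></sky>"));
    }
}
```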
Let's explore the handlers one by one. The SAX API in JAXP has the following handlers:

  1. ContentHandler
  2. ErrorHandler
  3. DTDHandler
  4. EntityResolver

The names are largely self-explanatory, but here is a short description of each.

ContentHandler:

This handler contains the methods for handling the content of an XML document. The event-driven methods of this handler, which are invoked by the XMLReader, include:

startDocument();
endDocument();
startElement();
endElement();
characters();
processingInstruction();

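These methods are invoked as the parser reaches the corresponding part of the document: the start and end of the document, the start and end of each element, the text inside an element, and any processing instruction.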
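
As a rough sketch of how these callbacks arrive, the following example records each event in a list. It extends DefaultHandler (which implements ContentHandler with empty methods); the class name ContentEventsSketch and the sample document are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class: logs the ContentHandler callbacks in the order they fire.
public class ContentEventsSketch {

    public static List<String> events(String xml) throws Exception {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                log.add("startDocument");
            }

            @Override
            public void endDocument() {
                log.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes attributes) {
                log.add("startElement " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("endElement " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Whitespace between tags is also reported; skip it here.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("characters " + text);
                }
            }

            @Override
            public void processingInstruction(String target, String data) {
                log.add("processingInstruction " + target);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        for (String event : events("<?note hi?><sky><name>Sumit</name></sky>")) {
            System.out.println(event);
        }
    }
}
```

Note that a parser is allowed to deliver the text of one element in several characters() calls, so real handlers usually append to a buffer instead of assuming one call per element.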

ErrorHandler:

Whenever the parser encounters a problem while parsing an XML document, it reports it to the ErrorHandler through one of these methods:

error()
fatalError()
warning()
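
A small sketch of an ErrorHandler in use is shown below, assuming a deliberately malformed document. The class name ErrorHandlerSketch and the log format are invented for the example; a mismatched end tag is a well-formedness violation, so it is reported through fatalError().

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical class: collects the problems reported to an ErrorHandler.
public class ErrorHandlerSketch {

    public static List<String> problems(String xml) throws Exception {
        final List<String> log = new ArrayList<>();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                log.add("warning at line " + e.getLineNumber());
            }

            @Override
            public void error(SAXParseException e) {
                log.add("error at line " + e.getLineNumber());
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                log.add("fatalError at line " + e.getLineNumber());
                throw e; // parsing cannot continue after a fatal error
            }
        });
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Already recorded by fatalError(); nothing more to do here.
        }
        return log;
    }

    public static void main(String[] args) throws Exception {
        // <name> is never closed, so </sky> does not match the open element.
        System.out.println(problems("<sky><name></sky>"));
    }
}
```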

DTDHandler:

DTDHandler receives the DTD (Document Type Definition) declarations that an application needs to know about: notation declarations, through notationDecl(), and unparsed entity declarations, through unparsedEntityDecl(). I have already explained what a DTD is in a previous post.
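
The sketch below registers a DTDHandler and parses a document whose internal DTD subset declares one notation and one unparsed entity. The class name DtdHandlerSketch, the SAMPLE document and the names gif and logo are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.DTDHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical class: records the declarations passed to a DTDHandler.
public class DtdHandlerSketch {

    // Sample document with an internal DTD subset (made up for this example).
    public static final String SAMPLE =
            "<!DOCTYPE doc [\n"
            + "<!ELEMENT doc EMPTY>\n"
            + "<!NOTATION gif SYSTEM \"image/gif\">\n"
            + "<!ENTITY logo SYSTEM \"logo.gif\" NDATA gif>\n"
            + "]>\n"
            + "<doc/>";

    public static List<String> declarations(String xml) throws Exception {
        final List<String> log = new ArrayList<>();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setDTDHandler(new DTDHandler() {
            @Override
            public void notationDecl(String name, String publicId, String systemId) {
                log.add("notation " + name);
            }

            @Override
            public void unparsedEntityDecl(String name, String publicId,
                    String systemId, String notationName) {
                log.add("unparsed entity " + name + " (" + notationName + ")");
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return log;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(declarations(SAMPLE));
    }
}
```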

EntityResolver:

Last but not least is EntityResolver. This handler resolves the public ID or system ID (a URI, URL or URN) of an external entity, such as a DTD, to the place where it can actually be read from, for example a local copy instead of a remote document.
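
As an illustration, the following sketch serves a local copy of a DTD in place of the remote one named in the DOCTYPE. The class name EntityResolverSketch and the URL http://example.com/sky.dtd are placeholders invented for the example; the DTD text matches the sky.dtd used elsewhere on this blog.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical class: answers requests for external entities from a local copy.
public class EntityResolverSketch {

    // Local copy of the DTD, kept in memory for this example.
    private static final String LOCAL_DTD =
            "<!ELEMENT sky (employee*)>"
            + "<!ELEMENT employee (name,email)>"
            + "<!ELEMENT name (#PCDATA)>"
            + "<!ELEMENT email (#PCDATA)>";

    // Parses the document and returns the system IDs the parser asked for.
    public static List<String> requestedSystemIds(String xml) throws Exception {
        final List<String> requested = new ArrayList<>();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setEntityResolver(new EntityResolver() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                requested.add(systemId);
                // Returning an InputSource means the parser reads this
                // instead of fetching the system ID itself.
                return new InputSource(new StringReader(LOCAL_DTD));
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return requested;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE sky SYSTEM \"http://example.com/sky.dtd\"><sky/>";
        System.out.println(requestedSystemIds(xml));
    }
}
```

Returning null from resolveEntity() tells the parser to fall back to its default behaviour and open the system ID itself.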

Sunday, 13 December 2015

Basics of JAXP


So far we have seen the basics of the SAX and DOM parsers. Another intimidating term in the XML world is JAXP. There is no need to worry, because you have already seen part of it. Surprised? Let's dig into the basics of JAXP.

JAXP stands for Java API for XML Processing. The current version of JAXP is 1.6.

One of the most interesting questions about JAXP is what sort of API it actually provides. What is it all about?

Put simply, it is a pluggable layer: a set of interfaces into which different XML processing implementations can be plugged, whether the JDK's own or a third-party vendor's. What kind of XML processing does it cover? The API gives you a platform for parsing, presentation (transformation), streaming and other XML processing tasks.

Parsing again! We have already seen the SAX and DOM parsers in previous posts. Here we will look at the basic architecture, along with more examples and a more in-depth explanation, of the SAX and DOM APIs provided in JAXP. Beyond that, we will also see the basics of some other XML processing tasks and how JAXP helps to achieve them.

The JAXP API defines the following packages for these tasks:

  • javax.xml.parsers: This package provides a common interface for SAX and DOM parsing.
  • org.w3c.dom: This package defines the Document interface, which represents an XML document, along with interfaces for the other components of an XML file such as Element, Attr and Node.
  • org.xml.sax: This package provides the basic SAX API.
  • javax.xml.transform: This package contains the API for transforming XML into other forms for presentation. This is the XSLT API. We will see exactly what it is later.
  • javax.xml.stream: This is the newest API, included since JAXP 1.4: the Streaming API for XML, or StAX for short. Note the capitalization: a capital 'S', a small 't', then 'AX' in capitals. We will discuss this later as well.
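
To give a first taste of the last package before the dedicated post, here is a minimal sketch of a javax.xml.stream pull parser that lists element names. The class name StaxSketch and the sample document are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical class: pulls events from a stream reader instead of receiving callbacks.
public class StaxSketch {

    public static List<String> elementNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            // The application asks for the next event when it is ready for it.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames(
                "<sky><employee><name>Sumit Kumar</name></employee></sky>"));
    }
}
```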

We discussed part of parsing with SAX and DOM in a previous post, where we saw the basics of XML and how to parse it using the SAX and DOM parsers. Now we will look more closely at the architecture, with more examples in terms of the JAXP API. We will cover the APIs provided by all the JAXP packages above in detail in the next posts on:

SAX API  
DOM API
XSLT API
StAX API


Tuesday, 13 October 2015

Generate XML using DOM parser


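Jar required: none. The code below uses only the JAXP classes (javax.xml.* and org.w3c.dom.*) that ship with the JDK, so jdom.jar is not needed.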

Directory structure in eclipse:





SKYDOMWriter.java:

package com.sky.parse.dom;

import java.io.File;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class SKYDOMWriter {
    public static void main(String[] args) {
        try {
            // Create an empty DOM document with <sky> as its root element.
            DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder documentBuilder = builderFactory.newDocumentBuilder();
            Document doc = documentBuilder.newDocument();
            Element rootElement = doc.createElement("sky");
            doc.appendChild(rootElement);

            // First employee: <employee empid="101"> with name and email.
            Element employee = doc.createElement("employee");
            rootElement.appendChild(employee);
            employee.setAttribute("empid", "101");
            Element name = doc.createElement("name");
            name.appendChild(doc.createTextNode("Sumit Kumar"));
            employee.appendChild(name);
            Element email = doc.createElement("email");
            email.appendChild(doc.createTextNode("xyz@yahoo.com"));
            employee.appendChild(email);

            // Second employee: <employee empid="102"> with name and email.
            Element employee2 = doc.createElement("employee");
            rootElement.appendChild(employee2);
            employee2.setAttribute("empid", "102");
            Element name2 = doc.createElement("name");
            name2.appendChild(doc.createTextNode("Sunil Kumar"));
            employee2.appendChild(name2);
            Element email2 = doc.createElement("email");
            email2.appendChild(doc.createTextNode("skumar102@outlook.com"));
            employee2.appendChild(email2);

            // Serialize the DOM tree to src/sky.xml. The transformer adds
            // line breaks only if OutputKeys.INDENT is set to "yes".
            TransformerFactory transformerFactory = TransformerFactory.newInstance();
            Transformer transformer = transformerFactory.newTransformer();
            DOMSource source = new DOMSource(doc);
            StreamResult result = new StreamResult(new File("src/sky.xml"));
            transformer.transform(source, result);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}



Output:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<sky>
<employee empid="101">
<name>Sumit Kumar</name>
<email>xyz@yahoo.com</email>
</employee>
<employee empid="102">
<name>Sunil Kumar</name>
<email>skumar102@outlook.com</email>
</employee>
</sky>
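
The output above is shown with one element per line, but a Transformer does not insert line breaks unless asked to. The sketch below shows one way to request them with OutputKeys.INDENT; the class name IndentedWriterSketch is made up, it writes to a String instead of src/sky.xml, and the exact amount of indentation depends on the JDK version.

```java
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class: serializes a small DOM tree with line breaks.
public class IndentedWriterSketch {

    public static String write() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("sky");
        doc.appendChild(root);

        Element employee = doc.createElement("employee");
        employee.setAttribute("empid", "101");
        root.appendChild(employee);

        Element name = doc.createElement("name");
        name.appendChild(doc.createTextNode("Sumit Kumar"));
        employee.appendChild(name);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Ask the serializer to put each element on its own line.
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(write());
    }
}
```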

Delete data in XML using DOM Parser

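Jar required: none. The code below uses only the JAXP classes (javax.xml.* and org.w3c.dom.*) that ship with the JDK, so jdom.jar is not needed.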

Directory structure in eclipse:

XML before deletion:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<sky>
      <employee empid="101">
            <name>Sumit Kumar</name>
            <email>xyz@yahoo.com</email>
      </employee>
      <employee empid="102">
            <name>Sunil Kumar</name>
            <email>skumar102@outlook.com</email>
      </employee>

</sky>

SKYDOMDeleter.java:

package com.sky.parse.dom;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class SKYDOMDeleter {
    public static void main(String[] args) {
        try {
            // Load src/sky.xml into a DOM tree.
            DocumentBuilderFactory documentFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder documentBuilder = documentFactory.newDocumentBuilder();
            Document document = documentBuilder.parse("src/sky.xml");

            // Remove every <email> child of the first <employee>.
            Node employee = document.getElementsByTagName("employee").item(0);
            NodeList list = employee.getChildNodes();
            // The NodeList is live, so iterate backwards: removing a child
            // then never shifts a node that has not been visited yet.
            for (int i = list.getLength() - 1; i >= 0; i--) {
                Node node = list.item(i);
                if ("email".equals(node.getNodeName())) {
                    employee.removeChild(node);
                }
            }

            // Write the modified tree back to the same file.
            TransformerFactory transformerFactory = TransformerFactory.newInstance();
            Transformer transformer = transformerFactory.newTransformer();
            DOMSource source = new DOMSource(document);
            StreamResult result = new StreamResult("src/sky.xml");
            transformer.transform(source, result);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

XML after deletion:


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<sky>
      <employee empid="101">
            <name>Sumit Kumar</name>
      </employee>
      <employee empid="102">
            <name>Sunil Kumar</name>
            <email>skumar102@outlook.com</email>
      </employee>

</sky>

Update data in XML using DOM Parser


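Jar required: none. The code below uses only the JAXP classes (javax.xml.* and org.w3c.dom.*) that ship with the JDK, so jdom.jar is not needed.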

Directory structure in eclipse:


XML before update:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<sky>
      <employee empid="101">
            <name>Sumit Kumar</name>
            <email>xyz@yahoo.com</email>
      </employee>
      <employee empid="102">
            <name>Sunil Kumar</name>
            <email>skumar102@outlook.com</email>
      </employee>

</sky>

SKYDOMUpdater.java:

package com.sky.parse.dom;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

public class SKYDOMUpdater {
    public static void main(String[] args) {
        try {
            // Load src/sky.xml into a DOM tree.
            DocumentBuilderFactory documentFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder documentBuilder = documentFactory.newDocumentBuilder();
            Document document = documentBuilder.parse("src/sky.xml");

            // Change the empid attribute of the first <employee> to 201.
            Node employee = document.getElementsByTagName("employee").item(0);
            NamedNodeMap attribute = employee.getAttributes();
            Node empid = attribute.getNamedItem("empid");
            empid.setTextContent("201");

            // Add a new <mobile> child element to the same employee.
            Element mobileNo = document.createElement("mobile");
            mobileNo.appendChild(document.createTextNode("9999999999"));
            employee.appendChild(mobileNo);

            // Write the modified tree back to the same file.
            TransformerFactory transformerFactory = TransformerFactory.newInstance();
            Transformer transformer = transformerFactory.newTransformer();
            DOMSource source = new DOMSource(document);
            StreamResult result = new StreamResult("src/sky.xml");
            transformer.transform(source, result);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}


XML after update:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<sky>
      <employee empid="201">
            <name>Sumit Kumar</name>
            <email>xyz@yahoo.com</email>
            <mobile>9999999999</mobile>
      </employee>
      <employee empid="102">
            <name>Sunil Kumar</name>
            <email>skumar102@outlook.com</email>
      </employee>

</sky>

Read Data From XML using DOM Parser


Use Xerces-2.0.2.jar if you are using JDK 1.6.

Directory structure in eclipse:




sky.xml:

<?xml version="1.0"?>
<!DOCTYPE sky SYSTEM "sky.dtd">
<sky>
<employee>
<name>Sumit Kumar</name>
<email>sky@outlook.com</email>
</employee>
<employee>
<name>Sunil Kumar</name>
<email>skumar102@outlook.com</email>
</employee>
</sky>


sky.dtd:

<!ELEMENT sky (employee*)>
<!ELEMENT employee (name,email)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT email (#PCDATA)>


SKYDOMReader.java:

package com.sky.parse.dom;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class SKYDOMReader {
    public static void main(String[] args) {
        try {
            // Parse src/sky.xml into a DOM tree.
            DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder documentBuilder = builderFactory.newDocumentBuilder();
            Document document = documentBuilder.parse("src/sky.xml");
            System.out.println("Root element :" + document.getDocumentElement().getNodeName());

            // Print the name and email of every <employee> element.
            NodeList nodeList = document.getElementsByTagName("employee");
            for (int i = 0; i < nodeList.getLength(); i++) {
                Node node = nodeList.item(i);
                Element eElement = (Element) node;
                System.out.println("Name : " + eElement.getElementsByTagName("name").item(0).getTextContent());
                System.out.println("email : " + eElement.getElementsByTagName("email").item(0).getTextContent());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}



Output:

Root element :sky
Name : Sumit Kumar
email : sky@outlook.com
Name : Sunil Kumar
email : skumar102@outlook.com

Sunday, 11 October 2015

XML Parsing

Parsing: Splitting a full string into small chunks using special tokens is known as parsing. Two parsers are available for parsing an XML file:

1.    SAX Parser.
2.    DOM parser.

Both parser APIs require the JAR file Xerces.jar.

SAX (Simple API for XML)

The SAX parser is a read-only parser, used to read data from an XML document. It cannot update, delete or generate XML. It is based on an event-driven model: while parsing the XML, it generates the following types of events:

Document is started.
Element is started.
Character data is found.
Element is ended.
Document is ended, etc.

The SAX parser reads the XML data in sequential order.

Whenever it finds a start tag or an end tag in the XML, it fires the corresponding event handler method.

Basic Example:

Directory Structure in Eclipse






SAXParserTest.java

package com.sky.parse.sax;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

public class SAXParserTest {

    public static void main(String[] args) {
        XMLReader reader;
        try {
            // Create the Xerces SAX parser and register the handler.
            reader = XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
            reader.setContentHandler(new SkyHandler());
            reader.parse("src/sky.xml");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

// Prints every event it receives, along with the details passed to it.
class SkyHandler extends DefaultHandler {

    @Override
    public void startDocument() throws SAXException {
        System.out.println("SkyHandler.startDocument()");
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        System.out.println("SkyHandler.startElement()");
        System.out.println(uri + "\t" + localName + "\t" + qName);
        for (int i = 0; i < attributes.getLength(); i++) {
            System.out.println(attributes.getLocalName(i) + "\t" + attributes.getValue(i));
        }
    }

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        System.out.println("SkyHandler.characters()");
        System.out.println(new String(ch, start, length));
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        System.out.println("SkyHandler.endElement()");
        System.out.println(uri + "\t" + localName + "\t" + qName);
    }

    @Override
    public void endDocument() throws SAXException {
        System.out.println("SkyHandler.endDocument()");
    }
}


OUTPUT:

SkyHandler.startDocument()
SkyHandler.startElement()
http://www.sky.com/sky  sky   sky
schemaLocation    http://www.sky.com/sky http://www.sky.com/sky/sky3.xsd http://www.sky.com/sky/emp http://www.sky.com/sky/emp/sky2.xsd http://www.sky.com/sky/dept http://www.sky.com/sky/dept/sky.xsd
SkyHandler.characters()



SkyHandler.startElement()
http://www.sky.com/sky/dept   hai   dept:hai
SkyHandler.endElement()
http://www.sky.com/sky/dept   hai   dept:hai
SkyHandler.characters()


SkyHandler.startElement()
http://www.sky.com/sky/emp    hello emp:hello
SkyHandler.endElement()
http://www.sky.com/sky/emp    hello emp:hello
SkyHandler.characters()



SkyHandler.endElement()
http://www.sky.com/sky  sky   sky
SkyHandler.endDocument()

DONE



DOM (Document Object Model)

The DOM parser is a read-write parser, used both to read data from an XML document and to write data into one. It can read, update, delete and generate XML. While reading the XML file, the DOM parser constructs a node tree, also called the DOM tree, containing all the elements of the XML.

The kinds of node in a DOM tree include:
Document Node.
Element Node.
Attribute Node.
Text Node (character data).

Example:
<sky>
<Employee empid="101">
<name>Sumit Kumar</name>
</Employee>
</sky>

DOM Tree:
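
One way to see the tree for the example above is to walk it and print each node with its type. The following is a small sketch; the class name DomTreeSketch and its output format are made up for illustration.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class: prints the node tree that the DOM parser builds.
public class DomTreeSketch {

    // Appends one line per node, indented by depth, then recurses into children.
    private static void dump(Node node, String indent, StringBuilder out) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                out.append(indent).append("Document\n");
                break;
            case Node.ELEMENT_NODE: {
                out.append(indent).append("Element: ").append(node.getNodeName()).append('\n');
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node attr = attrs.item(i);
                    out.append(indent).append("  Attribute: ")
                       .append(attr.getNodeName()).append('=')
                       .append(attr.getNodeValue()).append('\n');
                }
                break;
            }
            case Node.TEXT_NODE: {
                String text = node.getNodeValue().trim();
                if (!text.isEmpty()) {
                    out.append(indent).append("Text: ").append(text).append('\n');
                }
                return; // text nodes have no children
            }
            default:
                return; // other node types are ignored in this sketch
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            dump(children.item(i), indent + "  ", out);
        }
    }

    public static String tree(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        StringBuilder out = new StringBuilder();
        dump(doc, "", out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(tree(
                "<sky><Employee empid=\"101\"><name>Sumit Kumar</name></Employee></sky>"));
    }
}
```

For the example document this prints a Document node at the top, the sky element below it, then the Employee element with its empid attribute, the name element, and finally the text node "Sumit Kumar".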



Because the whole tree is available after parsing, the DOM parser allows the XML data to be accessed in random order.

Once parsing with the DOM parser is complete, the entire XML data is loaded into main memory.

Hence, when using the DOM parser, make sure the XML file is not too large; otherwise it can lead to java.lang.OutOfMemoryError: Java heap space.

Basic examples: see the posts on reading, generating, updating and deleting XML using the DOM parser.
Difference between SAX and DOM parser

S.No | SAX | DOM
1 | Read-only parser. | Read-write parser.
2 | Reads data sequentially. | Allows random access to the data.
3 | Occupies less memory because it holds only one element's information in memory at a time. | Occupies more memory because it loads the whole XML data into memory at once.
4 | Follows an event-driven model. | Follows an object (tree) model.
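
To make the comparison concrete, here is a sketch that performs the same task, counting employee elements, once with each parser. The class name SaxVsDomSketch and both method names are invented for this example.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class: the same count done the SAX way and the DOM way.
public class SaxVsDomSketch {

    // SAX: count start-tag events as they stream past; no tree is kept.
    public static int countWithSax(String xml, final String tag) throws Exception {
        final int[] count = {0};
        SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                            String qName, Attributes attributes) {
                        if (tag.equals(qName)) {
                            count[0]++;
                        }
                    }
                });
        return count[0];
    }

    // DOM: build the whole tree in memory first, then query it.
    public static int countWithDom(String xml, String tag) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName(tag).getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<sky><employee/><employee/></sky>";
        System.out.println("SAX: " + countWithSax(xml, "employee"));
        System.out.println("DOM: " + countWithDom(xml, "employee"));
    }
}
```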