Saturday, 10 October 2015

XML - Extensible Markup Language

Introduction

XML stands for Extensible Markup Language. It is a text-based format for storing and transporting data in a structured, self-describing form.


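Advantages of XML

·      XML is human-readable, because it is plain text with descriptive tags.
·      XML is flexible: you define your own elements and attributes.
·      XML is platform independent.
·      XML is independent of any particular programming language or technology.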

XML Validation rules

The structure of an XML document can be defined and validated using either of two specifications:
·      DTD (Document Type Definition).
·      XSD (XML Schema Definition).

Validation rules for an XML document:

1.    An XML document must have a single root element.
2.    Elements must be nested properly.
3.    Each element must contain the correct sub-elements.
4.    The order of sub-elements must be correct.
5.    The number of occurrences of each element must be correct.
6.    Attribute values must be enclosed in single or double quotation marks.
7.    All mandatory attributes must be present.
8.    Only attributes that are declared for an element may be used.

Rules 1, 2 and 6 are well-formedness rules that every XML parser enforces; the remaining rules are checked against a DTD or XSD.
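
The well-formedness rules (single root, proper nesting, quoted attribute values) can be tried out with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed is only illustrative, and the check says nothing about DTD or XSD validity.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Rule 1: a document must have a single root element.
two_roots = is_well_formed("<a/><b/>")
# Rule 2: elements must be nested properly.
bad_nesting = is_well_formed("<a><b></a></b>")
# Rule 6: attribute values must be quoted.
unquoted = is_well_formed("<a id=1/>")
# A small document that satisfies all three rules.
good = is_well_formed("<a id='1'><b/></a>")

print(two_roots, bad_nesting, unquoted, good)
```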

Document Type Definition (DTD)

DTD Validation rules:

To specify validation rules with a DTD, use the following markup declarations and concepts:

1.    <!ELEMENT>
2.    <!ATTLIST>
3.    <!ENTITY>
4.    Cardinality
5.    Attributes

a. <!ELEMENT>

<!ELEMENT> declares an element and its allowed content in an XML file.

e.g.

<!ELEMENT sky (Employees, Departments)>
<!ELEMENT Employees (Employee*)>
<!ELEMENT Departments (Department+)>
<!ELEMENT Employee (name, email, company)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT company EMPTY>
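
A content model such as (name, email, company) behaves like a regular expression over the sequence of child elements. The sketch below illustrates that idea in Python for two of the declarations above. The regular expressions are written by hand for this illustration, the names CONTENT_MODELS and children_match are hypothetical, and this is not how a validating parser is implemented.

```python
import re
import xml.etree.ElementTree as ET

# Hand-written regular expressions standing in for two DTD content models.
# Each child tag is followed by a comma so that names cannot run together.
CONTENT_MODELS = {
    "Employee": re.compile(r"name,email,company,"),   # (name, email, company)
    "Employees": re.compile(r"(Employee,)*"),          # (Employee*)
}

def children_match(element):
    """Check the sequence of child tags against the element's content model."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return True  # no model recorded in this sketch
    sequence = "".join(child.tag + "," for child in element)
    return model.fullmatch(sequence) is not None

in_order = ET.fromstring(
    "<Employee><name>ABC</name><email>abc@xyz.com</email><company/></Employee>")
out_of_order = ET.fromstring(
    "<Employee><email>abc@xyz.com</email><name>ABC</name><company/></Employee>")

print(children_match(in_order), children_match(out_of_order))
```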


b.    <!ATTLIST>

<!ATTLIST> declares the attributes of an element in an XML file. #REQUIRED marks a mandatory attribute and #IMPLIED an optional one.

e.g.

<!ATTLIST Employee
empid CDATA #REQUIRED
phone CDATA #IMPLIED>
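
Python's standard library parser does not validate against a DTD, so the effect of #REQUIRED can only be imitated by hand. The sketch below assumes the attribute list above (empid required, phone optional); the names REQUIRED and missing_required are hypothetical.

```python
import xml.etree.ElementTree as ET

# #REQUIRED attributes per element; phone is #IMPLIED and therefore not listed.
REQUIRED = {"Employee": ["empid"]}

def missing_required(root):
    """Return (tag, attribute) pairs for required attributes that are absent."""
    problems = []
    for element in root.iter():
        for name in REQUIRED.get(element.tag, []):
            if name not in element.attrib:
                problems.append((element.tag, name))
    return problems

doc = ET.fromstring(
    '<Employees>'
    '<Employee empid="101" phone="99999999"/>'
    '<Employee empid="102"/>'
    '<Employee phone="88888888"/>'
    '</Employees>')

print(missing_required(doc))
```

Only the third Employee is reported, because it lacks empid; the second one is fine because phone is optional.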

c.     <!ENTITY>

<!ENTITY> defines your own entities in a DTD. An entity is referenced as &name; and the trailing semicolon is required.

e.g. <!ENTITY manager "Sumit Kumar">

<manager>&manager;</manager>
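
Entity expansion can be observed with a non-validating parser as long as the entity is declared in the internal DTD subset. The following sketch uses Python's xml.etree.ElementTree; the Department wrapper element is only there to make a complete document.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE Department [
  <!ENTITY manager "Sumit Kumar">
]>
<Department>
  <manager>&manager;</manager>
</Department>"""

# The parser replaces &manager; with the declared text.
root = ET.fromstring(DOC)
manager_name = root.find("manager").text
print(manager_name)

# Without the trailing semicolon the reference is not well formed.
BROKEN = DOC.replace("&manager;", "&manager")
try:
    ET.fromstring(BROKEN)
    broken_parses = True
except ET.ParseError:
    broken_parses = False
```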


d.    Cardinality

The number of times an element may occur is specified with a cardinality (occurrence) indicator written directly after the element name:

·      * means zero or more
·      + means one or more
·      ? means zero or one
·      No symbol means exactly once.

e.g.

<!ELEMENT sky (Employees, Departments)>
<!ELEMENT Employees (Employee*)>
<!ELEMENT Departments (Department+)>
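
The occurrence indicators map directly to a minimum and maximum count. The sketch below expresses that mapping in Python and checks it by counting child elements; CARDINALITY and occurrence_ok are hypothetical names, and the check is done by hand rather than by a DTD validator.

```python
import xml.etree.ElementTree as ET

# Occurrence indicator -> (minimum, maximum); None means "no upper limit".
CARDINALITY = {"*": (0, None), "+": (1, None), "?": (0, 1), "": (1, 1)}

def occurrence_ok(parent, child_tag, indicator):
    """Check how often child_tag occurs directly under parent."""
    minimum, maximum = CARDINALITY[indicator]
    count = len(parent.findall(child_tag))
    return count >= minimum and (maximum is None or count <= maximum)

sky = ET.fromstring("<sky><Employees/><Departments/></sky>")

# (Employee*) allows an empty Employees element.
employees_ok = occurrence_ok(sky.find("Employees"), "Employee", "*")
# (Department+) requires at least one Department, so this one fails.
departments_ok = occurrence_ok(sky.find("Departments"), "Department", "+")
# No symbol: exactly one Employees element under sky.
root_ok = occurrence_ok(sky, "Employees", "")

print(employees_ok, departments_ok, root_ok)
```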


e.     Attributes

An attribute may appear at most once on an element.
Attributes may appear in any order.
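
Both attribute rules can be seen with an ordinary parser. In this small Python sketch, two elements that differ only in attribute order yield the same attributes, while a repeated attribute is rejected as not well formed.

```python
import xml.etree.ElementTree as ET

first = ET.fromstring('<Employee empid="101" phone="99999999"/>')
second = ET.fromstring('<Employee phone="99999999" empid="101"/>')

# Attribute order carries no meaning, so both elements compare equal here.
same_attributes = first.attrib == second.attrib

try:
    ET.fromstring('<Employee empid="101" empid="102"/>')
    duplicate_accepted = True
except ET.ParseError:
    # A repeated attribute name is a well-formedness error.
    duplicate_accepted = False

print(same_attributes, duplicate_accepted)
```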


PCDATA vs. CDATA

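PCDATA (parsed character data) is the content type used for the text of elements, whereas CDATA is the type used for attribute values that are plain character strings.

Entity references are resolved both in PCDATA element content and in CDATA attribute values. They are left unresolved only inside a CDATA section (<![CDATA[ ... ]]>), which is a different construct from the CDATA attribute type.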
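
In practice a parser expands entity references in element content and in attribute values alike; only the text of a CDATA section is passed through literally. The sketch below shows the three cases side by side in Python; the element names note, parsed and literal are made up for the illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE note [
  <!ENTITY manager "Sumit Kumar">
]>
<note owner="&manager;">
  <parsed>&manager;</parsed>
  <literal><![CDATA[&manager;]]></literal>
</note>"""

root = ET.fromstring(DOC)
owner = root.get("owner")             # attribute value: reference expanded
parsed = root.find("parsed").text     # element content: reference expanded
literal = root.find("literal").text   # CDATA section: text kept as written

print(owner, parsed, literal, sep=" | ")
```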

Example of DTD:

Sky.xml

<sky>
<Employees>
<Employee empid="101" phone="99999999">
<name>ABC</name>
<email>abc@xyz.com</email>
<company>XYZ Corporation</company>
</Employee>
<Employee empid="102">
<name>DEF</name>
<email>def@xyz.com</email>
<company>XYZ Corporation</company>
</Employee>
</Employees>
<Departments>
<Department deptid="908">
<deptname>Information Technology</deptname>
<manager>John Simpson</manager>
</Department>
<Department deptid="910">
<deptname>Finance</deptname>
<manager>&manager;</manager>
</Department>
</Departments>
</sky>



Sky.dtd

<!ELEMENT sky (Employees, Departments)>
<!ELEMENT Employees (Employee*)>
<!ELEMENT Departments (Department+)>
<!ELEMENT Employee (name, email, company?)>
<!ELEMENT Department (deptname, manager)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT deptname (#PCDATA)>
<!ELEMENT manager (#PCDATA)>
<!ATTLIST Employee empid CDATA #REQUIRED phone CDATA #IMPLIED>
<!ATTLIST Department deptid CDATA #REQUIRED>
<!ENTITY manager "Sumit Kumar">


Including a DTD in an XML document

There are two ways to reference an external DTD:

1.    <!DOCTYPE rootElement SYSTEM "dtd_file_name">
2.    <!DOCTYPE rootElement PUBLIC "public_identifier" "dtd_file_name">








With SYSTEM

Case 1.

When the XML file and the DTD are in the same folder:

e.g.:

<?xml version="1.0"?>
<!DOCTYPE sky SYSTEM "Sky.dtd">
<sky>
</sky>


Case 2.

When the XML file and the DTD are in different folders:

e.g.:

<?xml version="1.0"?>
<!DOCTYPE sky SYSTEM "file:///d:/sky/dtd/Sky.dtd">
<sky>
</sky>

Case 3.

When the DTD is available on the World Wide Web:

e.g.:

<?xml version="1.0"?>
<!DOCTYPE sky SYSTEM "http://www.sky.com/sky/dtd/Sky.dtd">
<sky></sky>


With PUBLIC

Case 1.

When the XML file and the DTD are in the same folder:

<!DOCTYPE sky PUBLIC "sky" "Sky.dtd">

Case 2.

When the XML file and the DTD are in different folders:

<!DOCTYPE sky PUBLIC "sky" "file:///d:/sky/dtd/Sky.dtd">

Case 3.

When the DTD is available on the World Wide Web:

<!DOCTYPE sky PUBLIC "sky" "http://www.sky.com/sky/dtd/Sky.dtd">
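
To see which identifiers a parser actually reads from a DOCTYPE declaration, the SYSTEM and PUBLIC forms can be inspected with Python's xml.dom.minidom. This is a minimal sketch; the helper name doctype_ids is hypothetical, and minidom only records the identifiers here, it does not validate against the DTD.

```python
from xml.dom import minidom

def doctype_ids(xml_text):
    """Return (root name, public identifier, system identifier) of the DOCTYPE."""
    doctype = minidom.parseString(xml_text).doctype
    return doctype.name, doctype.publicId, doctype.systemId

system_form = '<!DOCTYPE sky SYSTEM "Sky.dtd"><sky/>'
public_form = ('<!DOCTYPE sky PUBLIC "sky" '
               '"http://www.sky.com/sky/dtd/Sky.dtd"><sky/>')

system_ids = doctype_ids(system_form)
public_ids = doctype_ids(public_form)

print(system_ids)
print(public_ids)
```

With PUBLIC, the second quoted string is still a system identifier; the public identifier is an additional name that a parser may map to a local copy of the DTD.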


XML Declaration

The XML declaration has an optional attribute called standalone with two possible values, "yes" and "no" (lowercase). When it is omitted, the value "no" is assumed. If the standalone value is no, the document may depend on a DTD that is placed externally in a separate file.

External DTD:

<?xml version="1.0" standalone="no"?>
<!DOCTYPE sky SYSTEM "sky.dtd">
<sky>
….
….
</sky>

If the standalone value is yes, the document does not depend on external declarations, so the DTD is placed internally, inside the DOCTYPE declaration.

Internal DTD:

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE sky [
<!ELEMENT sky (Employees, Departments)>
<!ELEMENT Employees (Employee*)>
<!ELEMENT Departments (Department+)>
………
………
]>
<sky>
….
….
</sky>
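
The standalone value is reported by the parser together with the rest of the XML declaration. The sketch below reads it with Python's xml.parsers.expat, whose XmlDeclHandler callback receives 1 for "yes", 0 for "no" and -1 when the attribute is omitted; the helper name read_standalone is hypothetical.

```python
import xml.parsers.expat

def read_standalone(xml_text):
    """Return "yes", "no", or None when standalone is not declared."""
    result = {}

    def on_declaration(version, encoding, standalone):
        # expat passes 1 for "yes", 0 for "no" and -1 when omitted.
        result["standalone"] = {1: "yes", 0: "no"}.get(standalone)

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_declaration
    parser.Parse(xml_text, True)
    return result.get("standalone")

declared_yes = read_standalone('<?xml version="1.0" standalone="yes"?><sky/>')
declared_no = read_standalone('<?xml version="1.0" standalone="no"?><sky/>')
omitted = read_standalone('<?xml version="1.0"?><sky/>')

print(declared_yes, declared_no, omitted)
```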
  

XML Namespaces

Using the same tag name for different purposes within a single XML document can lead to conflicts when the data is read by a parser.
XML namespaces provide a way to avoid such name conflicts.

An XML namespace declaration consists of:
1.    Namespace prefix.
2.    Namespace URI.

Namespace prefix

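A namespace prefix qualifies element names and is shorthand for the corresponding namespace URI. For example, if the prefix emp is bound to http://www.sky.com/employees, the first name below stands for the second. The second form is conceptual only; it is not legal XML syntax:

<emp:employee>
<http://www.sky.com/employees:employee>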
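
A namespace-aware parser makes this equivalence visible. In the Python sketch below, xml.etree.ElementTree reports the element name in the form {uri}local-name, and two different prefixes bound to the same URI produce the same name. The prefix e is made up for the comparison.

```python
import xml.etree.ElementTree as ET

a = ET.fromstring('<emp:employee xmlns:emp="http://www.sky.com/employees"/>')
b = ET.fromstring('<e:employee xmlns:e="http://www.sky.com/employees"/>')

# ElementTree stores the URI, not the prefix, as part of the tag.
expanded_name = a.tag
same_name = a.tag == b.tag

print(expanded_name, same_name)
```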

Namespace uri

A namespace URI must be unique, and it is recommended to use a URL under a domain you control as the namespace URI.

Example of XML Namespace:

sky.xml (without namespace)

<?xml version="1.0"?>
<sky>
<Employee>
<name>Sumit Kumar</name>
</Employee>
<Department>
<name>Finance</name>
</Department>
</sky>

sky.xml (with namespace)

<?xml version="1.0"?>
<sky xmlns:emp="http://www.sky.com/employees"
xmlns:dept="http://www.sky.com/departments">

<emp:employee>
<emp:name>Sumit Kumar</emp:name>
</emp:employee>

<dept:department>
<dept:name>Finance</dept:name>
</dept:department>

</sky>
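
Once the two name elements live in different namespaces, a parser can tell them apart. The following Python sketch wraps the fragments in a sky root element so that the document is well formed; the namespace URI for dept is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

DOC = """<sky xmlns:emp="http://www.sky.com/employees"
     xmlns:dept="http://www.sky.com/departments">
  <emp:employee><emp:name>Sumit Kumar</emp:name></emp:employee>
  <dept:department><dept:name>Finance</dept:name></dept:department>
</sky>"""

# Prefix-to-URI map used only for the lookups below; the dept URI is assumed.
NS = {"emp": "http://www.sky.com/employees",
      "dept": "http://www.sky.com/departments"}

root = ET.fromstring(DOC)
employee_name = root.find("emp:employee/emp:name", NS).text
department_name = root.find("dept:department/dept:name", NS).text

print(employee_name, department_name, sep=" | ")
```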


XML Schema

XML Schema is an alternative to DTD for validating XML documents. It supports many built-in data types such as int, float, double, boolean, string and date, and it lets you define custom data types. It supports XML namespaces. A schema definition is itself an XML document and must be well formed.

Sky.xsd

<?xml version="1.0"?>
<sk:schema xmlns:sk="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.sky.com/sky/emp"
xmlns:emp="http://www.sky.com/sky/emp">
<sk:element name="hello" type="sk:string"/>
</sk:schema>

sky2.xsd

<?xml version="1.0"?>
<sk:schema xmlns:sk="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.sky.com/sky/dept"
xmlns:dept="http://www.sky.com/sky/dept">
<sk:element name="hai" type="sk:string"/>
</sk:schema>

sky3.xsd

<?xml version="1.0"?>
<sk:schema xmlns:sk="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.sky.com/sky"
xmlns:emp="http://www.sky.com/sky">
<sk:element name="sky" type="sk:string"/>

</sk:schema>


Using the above XSDs:

sky.xml

<?xml version="1.0"?>
<sky xmlns="http://www.sky.com/sky"
xmlns:emp="http://www.sky.com/sky/emp"
xmlns:dept="http://www.sky.com/sky/dept"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.sky.com/sky http://www.sky.com/sky/sky3.xsd
http://www.sky.com/sky/emp http://www.sky.com/sky/emp/sky.xsd
http://www.sky.com/sky/dept http://www.sky.com/sky/dept/sky2.xsd">

<dept:hai/>
<emp:hello/>

</sky>
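
The xsi:schemaLocation attribute is a whitespace-separated list of pairs: a namespace URI followed by the location of the schema for that namespace. The Python sketch below splits the attribute into such pairs. It only reads the hints and does not perform schema validation, which the standard library does not offer; the helper name schema_locations is hypothetical.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

def schema_locations(root):
    """Map each namespace URI to the schema location listed next to it."""
    tokens = root.get("{%s}schemaLocation" % XSI, "").split()
    return dict(zip(tokens[0::2], tokens[1::2]))

DOC = """<sky xmlns="http://www.sky.com/sky"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://www.sky.com/sky http://www.sky.com/sky/sky3.xsd"/>"""

locations = schema_locations(ET.fromstring(DOC))
print(locations)
```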



Difference between DTD and Schema

S.No | DTD | Schema
1 | DTD has very limited data typing: element text is #PCDATA and attribute values are mostly CDATA. | Schema supports many data types such as int, short, boolean, date, time, dateTime, string and double.
2 | DTD does not support namespaces. | Schema supports namespaces.
3 | DTD does not allow custom data types to be defined. | Schema supports custom data type definitions.
4 | A DTD is not an XML document; it has its own syntax. | A schema document must be an XML document.
5 | Only one DTD is allowed per XML document. | One XML document can use multiple schema definitions.
6 | A DTD can be placed inside the XML document or externally. | A schema definition must be placed externally.






