XML

Introduction

XML

XML stands for Extensible Markup Language.

XML is a markup language much like HTML.

XML was designed to carry data, not to display data.

XML tags are not predefined. You must define your own tags.

XML is designed to be self-descriptive.

XML is a W3C Recommendation.

XML is not a replacement for HTML. XML and HTML were designed with different goals. XML was designed to transport and store data, with focus on what data is.

HTML was designed to display data, with focus on how data looks. HTML is about displaying information, while XML is about carrying information.

With XML you invent your own tags. The tags in the example below (like <to> and <from>) are not defined in any XML standard.

<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

These tags are “invented” by the author of the XML document. That is because the XML language has no predefined tags.

The tags used in HTML (and the structure of HTML) are predefined. HTML documents can only use tags defined in the HTML standard (like <p>, <h1>, etc.). XML allows authors to define their own tags and their own document structure.

 

 

 

XML and SGML

 

 

Design goals of XML

Spurred on by dissatisfaction with the existing standard and non-standard formats, a group of companies and organizations called the World Wide Web Consortium (W3C) began work in the mid-1990s on a markup language that combined the flexibility of SGML with the simplicity of HTML. Their philosophy in creating XML was embodied by several important tenets, which are described in the following sections.

 

 Application-Specific Markup Languages

XML doesn’t define any markup elements, but rather tells you how you can make your own. In other words, instead of creating a general-purpose element (say, a paragraph) and hoping it can cover every situation, the designers of XML left this task to you. So, if you want an element called <segmentedlist>, <chapter>, or <rocketship>, that’s your prerogative. Make up your own markup language to express your information in the best way possible. Or, if you like, you can use an existing set of tags that someone else has made.

This means there’s an unlimited number of markup languages that can exist, and there must be a way to prevent programs from breaking down as they attempt to read them all. Along with the freedom to be creative, there are rules XML expects you to follow. If you write your elements a certain way and obey all the syntax rules, your document is considered well-formed and any XML processor can read it. So you can have your cake and eat it too.

 

Unambiguous Structure

XML takes a hard line when it comes to structure. A document should be marked up in such a way that there are no two ways to interpret the names, order, and hierarchy of the elements. This vastly reduces errors and code complexity. Programs don’t have to take an educated guess or try to fix syntax mistakes the way HTML browsers often do, so one XML processor will not produce a different result from another.

Of course, this makes writing good XML markup more difficult. You have to check the document’s syntax with a parser to ensure that programs further down the line will run with few errors, that your data’s integrity is protected, and that the results are consistent.

In addition to the basic syntax check, you can create your own rules for how a document should look. The DTD is a blueprint for document structure. An XML schema can restrict the types of data that are allowed to go inside elements (e.g., dates, numbers, or names). The possibilities for error-checking and structure control are incredible.

 

Presentation Stored Elsewhere

For your document to have maximum flexibility for output format, you should strive to keep the style information out of the document and stored externally. XML allows this by using stylesheets that contain the formatting information. This has many benefits:

  • You can use the same style settings for many documents.
  • If you change your mind about a style setting, you can fix it in one place, and all the documents will be affected.
  • You can swap stylesheets for different purposes, perhaps having one for print and another for web pages.
  • The document’s content and structure are intact no matter what you do to change the presentation. There’s no way to mess up the document by playing with the presentation.
  • The document’s content isn’t cluttered with the vocabulary of style (font changes, spacing, color specifications, etc.). It’s easier to read and maintain.
  • With style information gone, you can choose names that precisely reflect the purpose of items, rather than labeling them according to how they should look. This simplifies editing and transformation.

 

Keep It Simple

For XML to gain widespread acceptance, it has to be simple. People don’t want to learn a complicated system just to author a document. XML is intuitive, easy to read, and elegant. It allows you to devise your own markup language that conforms to logical rules. It’s a narrow subset of SGML, throwing out a lot of stuff that most people don’t need.

Simplicity also benefits application development. If it’s easy to write programs that process XML files, there will be more and cheaper programs available to the public. XML’s rules are strict, but they make the burden of parsing and processing files more predictable and therefore much easier.

Simplicity leads to abundance. You can think of XML as the DNA for many different kinds of information expression. Stylesheets for defining appearance and transforming document structure can be written in an XML-based language called XSL. Schemas for modeling documents are another form of XML. This ubiquity means that you can use the same tools to edit and process many different technologies.

 

 Maximum Error Checking

Some markup languages are so lenient about syntax that errors go undiscovered. When errors build up in a file, it no longer behaves the way you want it to: its appearance in a browser is unpredictable, information may be lost, and programs may act strangely and possibly crash when trying to open the file.

The XML specification says that a file is not well-formed unless it meets a set of minimum syntax requirements. Your XML parser is a faithful guard dog, keeping out errors that will affect your document. It checks the spelling of element names, makes sure the boundaries are air-tight, tells you when an object is out of place, and reports broken links. You may carp about the strictness, and perhaps struggle to bring your document up to standard, but it will be worth it when you’re done. The document’s durability and usefulness will be assured.

 

 

Applications of XML

Most popular applications of XML

– Document applications manipulate information primarily intended for human consumption.

– Data applications manipulate information primarily intended for communication between software.

 

 

Data Applications

• If the structure of a document can be expressed in XML, so can the structure of a database.

• An XML web site can be regarded as a large database that applications can tap.

 

 

 

XML software :

 

 

 

Browsers

 

 

 

Editors

XML is Text-based

XML is a text-based markup language.

One great thing about XML is that XML files can be created and edited using a simple text-editor like Notepad.

However, when you start working with XML, you will soon find that it is better to edit XML documents using a professional XML editor.


Why Not Notepad?

Many web developers use Notepad to edit both HTML and XML documents because Notepad is included with the most common operating system and it is simple to use. Personally I often use Notepad for quick editing of simple HTML, CSS, and XML files.

But, if you use Notepad for XML editing, you will soon run into problems.

Notepad does not know that you are writing XML, so it will not be able to assist you.


Why an XML Editor?

Today XML is an important technology, and development projects use XML-based technologies like:

  • XML Schema to define XML structures and data types
  • XSLT to transform XML data
  • SOAP to exchange XML data between applications
  • WSDL to describe web services
  • RDF to describe web resources
  • XPath and XQuery to access XML data
  • SMIL to describe multimedia presentations

To be able to write error-free XML documents, you will need an intelligent XML editor!


XML Editors

Professional XML editors will help you to write error-free XML documents, validate your XML against a DTD or a schema, and force you to stick to a valid XML structure.

An XML editor should be able to:

  • Add closing tags to your opening tags automatically
  • Force you to write valid XML
  • Verify your XML against a DTD
  • Verify your XML against a Schema
  • Color code your XML syntax

 

 

 

 

 

 

Parsers

All modern browsers have a built-in XML parser.

A parser is a piece of a program that takes a physical representation of some data and converts it into an in-memory form for the program as a whole to use. Parsers are used everywhere in software. An XML parser is a parser designed to read XML and give programs a way to use it. There are different types, and each has its advantages. Unless a program simply and blindly copies the whole XML file as a unit, it must implement or call on an XML parser.

An XML parser converts an XML document into an XML DOM object – which can then be manipulated with JavaScript.
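As a rough illustration of the parse-then-manipulate flow described above, the sketch below uses Python's standard xml.dom.minidom module instead of a browser's JavaScript DOM. That choice is an assumption made only so the example runs outside a browser, and the helper names parse_note and text_of are hypothetical. The document is the note example used elsewhere in this text.

```python
from xml.dom import minidom

NOTE_XML = """<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""


def parse_note(xml_text):
    """Parse XML text into an in-memory DOM document."""
    return minidom.parseString(xml_text)


def text_of(document, tag_name):
    """Return the text of the first element named tag_name."""
    element = document.getElementsByTagName(tag_name)[0]
    return element.firstChild.data


doc = parse_note(NOTE_XML)
recipient = text_of(doc, "to")

# Once parsed, the tree can be changed in memory and serialized again.
doc.getElementsByTagName("to")[0].firstChild.data = "Ola"
updated = doc.documentElement.toxml()
```

The same idea applies in a browser: the parser produces a tree of nodes, and the program reads or changes those nodes rather than the raw text.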

 

What’s a Pull Parser?

SAX is a push parser, since it pushes events out to the calling application. Pull parsers, on the other hand, sit and wait for the application to come calling. They ask for the next available event, and the application basically loops until it runs out of XML.

Pull parsers are useful in streaming applications, where either the data is too large to fit in memory or the data is being assembled just in time for the next stage to use it. A pull parser is designed to be used with large data sources and, unlike SAX, which delivers every event, lets the application skip events (or, in some implementations, whole sections of the document) that it is not interested in. The converters are designed to work with both the SAX and the pull parser interfaces.

In Java, the current leading contender for streaming parsers appears to be StAX, while in Microsoft’s .NET platform, the System.Xml XmlReader is built right in.
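The pull style can be sketched with Python's standard xml.etree.ElementTree.iterparse, where the application loops and asks for one event at a time. Python is an assumption here (the text itself names StAX and XmlReader), and the function name titles_in_category and the shortened bookstore data are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

BOOKS_XML = b"""<bookstore>
<book category="CHILDREN"><title>Harry Potter</title><price>29.99</price></book>
<book category="WEB"><title>Learning XML</title><price>39.95</price></book>
</bookstore>"""


def titles_in_category(stream, category):
    """Pull events one at a time and keep only titles of matching books."""
    titles = []
    for event, element in ET.iterparse(stream, events=("end",)):
        if element.tag == "book":
            if element.get("category") == category:
                titles.append(element.findtext("title"))
            # Discard the finished subtree so memory use stays low.
            element.clear()
    return titles


web_titles = titles_in_category(io.BytesIO(BOOKS_XML), "WEB")
```

Because the loop decides what to keep, books in other categories are read and dropped without ever being collected, which is the property that makes this style suitable for large inputs.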

 

 

 

Processor

XML processor

An XML processor is a necessary piece of the XML puzzle. The XML processor takes the XML document and DTD and processes the information so that it may then be used by applications requesting the information.

What the processor does

The processor is a software module that reads the XML document to find out the structure and content of the XML document. The structure and content can be derived by the processor because XML documents contain self-explanatory data.

 

 

 

 

 

XML tags

 

All XML Elements Must Have a Closing Tag

In HTML, some elements (such as <p> and <li>) do not have to have a closing tag.

In XML, it is illegal to omit the closing tag. All elements must have a closing tag.

 

XML Tags are Case Sensitive

XML tags are case sensitive. The tag <Letter> is different from the tag <letter>.

Opening and closing tags must be written with the same case:

<Message>This is incorrect</message>
<message>This is correct</message>

 

XML Elements Must be Properly Nested

In HTML, you might see improperly nested elements:

<b><i>This text is bold and italic</b></i>

In XML, all elements must be properly nested within each other:

<b><i>This text is bold and italic</i></b>

In the example above, “Properly nested” simply means that since the <i> element is opened inside the <b> element, it must be closed inside the <b> element.
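The three rules above (closing tags, matching case, proper nesting) can be checked mechanically by handing the text to a parser. The sketch below assumes Python's standard xml.etree.ElementTree; the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Each call corresponds to one rule from this section.
missing_close = is_well_formed("<p>This is a paragraph")
case_mismatch = is_well_formed("<Message>This is incorrect</message>")
bad_nesting = is_well_formed("<b><i>This text is bold and italic</b></i>")
good_nesting = is_well_formed("<b><i>This text is bold and italic</i></b>")
```

Only the last call succeeds; the first three are rejected outright rather than repaired, which is the behaviour that separates an XML parser from a lenient HTML browser.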

 

 

 

Structure of XML documents

[Figure: structure of XML]

The first (blue) line indicates we’re dealing with an XML file. This line should appear in each XML file. The file contains both information and pairs of opening and closing tags around the information. A set of opening and closing tags and the information in-between is called an element. The text below shows a complete element:

<name>Joe Jackson</name>

The XML example shown above thus contains the following named elements:

companies, company, companyname, employee, code, name, street, houseno, areacode, place, phone.

Tag names are case sensitive: “name” is not the same as “Name”. A tag is characterized by text surrounded by less-than and greater-than symbols: <tag>. Each opening tag must have a closing tag: </tag>. All text between the opening and closing tags belongs to the element.

If there is no content between the tags, we can combine the opening and closing tags into one self-closing tag: <tag/>. This is done either when there is no content, or when the content is given as an attribute of the tag instead. For example:

<name name="Jan Janssen"/>

carries the same information as

<name>Jan Janssen</name>

The use of attributes doesn’t really improve readability, so paired opening and closing tags are recommended instead.
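To make the difference between the two forms concrete, the following sketch reads the same value once from an attribute and once from element content. It assumes Python's standard xml.etree.ElementTree; the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The same information, stored as an attribute and as element content.
as_attribute = ET.fromstring('<name name="Jan Janssen"/>')
as_content = ET.fromstring("<name>Jan Janssen</name>")

value_from_attribute = as_attribute.get("name")
value_from_content = as_content.text

# A self-closing element has no text content at all.
empty_has_no_text = as_attribute.text is None
```

A program has to know which form a document uses, because the value is reached through a different call in each case.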

The second line of the XML file contains the starting tag of the root element: <companies>. The element “companies” is the so-called Root element of the XML file. Each XML file must contain exactly one root element. Compare an XML file with a tree: a tree has exactly one stem. If there is more than one stem, we have a bush, not a tree.

Next, line 3 indicates the start of a “company” element. Within the “companies” element, multiple “company” elements may be nested.
Line 4 shows the entire “companyname” element, after which lines 5 to 27 contain data of the three employees of company “Stanford and Son”.

 

 

 

 

XML documents

An Example XML Document

XML documents use a self-describing and simple syntax:

<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

The first line is the XML declaration. It defines the XML version (1.0) and the encoding used (ISO-8859-1 = Latin-1/West European character set).
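A short sketch of why the declared encoding matters: the parser uses it to decode the bytes of the file. The example assumes Python's standard xml.etree.ElementTree, and the body text containing an accented character is a hypothetical substitute chosen to make the encoding visible.

```python
import xml.etree.ElementTree as ET

# Build the document as bytes, exactly as it would be stored on disk in Latin-1.
NOTE_BYTES = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    "<note>\n"
    "<to>Tove</to>\n"
    "<from>Jani</from>\n"
    "<heading>Reminder</heading>\n"
    "<body>Caf\u00e9 at noon?</body>\n"
    "</note>"
).encode("iso-8859-1")

# The parser reads the declaration and decodes the bytes accordingly.
root = ET.fromstring(NOTE_BYTES)
child_names = [child.tag for child in root]
body_text = root.find("body").text
```

Note that the declaration itself does not show up as an element: the root is still <note>.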

 

 

 

 

Xml elements tags

What is an XML Element?

An XML element is everything from (including) the element’s start tag to (including) the element’s end tag.

An element can contain:

  • other elements
  • text
  • attributes
  • or a mix of all of the above…

<bookstore>
<book category="CHILDREN">
<title>Harry Potter</title>
<author>J. K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title>Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>

 

 

In the example above, <bookstore> and <book> have element content, because they contain other elements. <book> also has an attribute (category="CHILDREN"). <title>, <author>, <year>, and <price> have text content because they contain text.
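The distinction between element content, attributes, and text content shows up directly in how a program reads the bookstore document. This is a minimal sketch assuming Python's standard xml.etree.ElementTree; the catalog list and its dictionary keys are hypothetical.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_XML = """<bookstore>
<book category="CHILDREN">
<title>Harry Potter</title>
<author>J. K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title>Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE_XML)

catalog = []
for book in root.findall("book"):          # element content: child elements
    catalog.append({
        "category": book.get("category"),  # attribute
        "title": book.findtext("title"),   # text content
        "price": float(book.findtext("price")),
    })
```

All values arrive as strings; converting the price to a number is the program's job unless a schema supplies the data type.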

 

element markup attribute markup HTML document

 

XML Naming Rules

XML elements must follow these naming rules:

  • Names can contain letters, numbers, and other characters
  • Names cannot start with a number or punctuation character
  • Names cannot start with the letters xml (or XML, or Xml, etc)
  • Names cannot contain spaces

Any name can be used, no words are reserved.
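The naming rules listed above can be approximated with a small check. This is only a sketch under stated assumptions: it is restricted to ASCII (real XML names also allow many non-ASCII letters), it leaves out the colon because namespaces use it, and the function name is_valid_element_name is hypothetical.

```python
import re

# ASCII-only approximation: a letter or underscore first, then letters,
# digits, underscores, dots, or hyphens. Spaces are never allowed.
_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_valid_element_name(name):
    """Apply the simplified naming rules from the list above."""
    if name.lower().startswith("xml"):
        return False  # names beginning with "xml" in any case are reserved
    return _NAME_PATTERN.fullmatch(name) is not None


examples = {
    name: is_valid_element_name(name)
    for name in ["book", "_id", "price.usd", "2nd", "-x", "first name", "XmlData"]
}
```

The first three names pass and the last four fail, one for each rule in the list.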

 

 

Adding scripts

Sample XML scripts

This topic presents the following sample scripts to use with the XML Scripting tool:

  • Defining a JDBC subscription
  • Removing tables from a JDBC subscription
  • Adding tables to a JDBC subscription
  • Changing tables of a JDBC Subscription
  • Adding and removing indexes for a JDBC subscription
  • Defining a DataPropagator™ subscription
  • Creating a group or user
  • Changing a user definition
  • Defining a subscription set
  • Changing a subscription set

 

 

 

Data types in XML

xml data types

 

XML  Namespace :

What is an XML namespace?

“XML namespaces are used for providing uniquely named elements and attributes in an XML document. They are defined in a W3C recommendation. An XML instance may contain element or attribute names from more than one XML vocabulary. If each vocabulary is given a namespace, the ambiguity between identically named elements or attributes can be resolved.”

 

 

XML Namespaces provide a method to avoid element name conflicts.

 

Name Conflicts

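In XML, element names are defined by the developer. This often results in a conflict when trying to mix XML documents from different XML applications.

This XML carries HTML table information:

<table>
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>

If another XML fragment used <table> for something else, such as a piece of furniture, the two <table> elements would clash when the fragments were combined, because they have different content and meaning.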

 

Default Namespaces

Defining a default namespace for an element saves us from using prefixes in all the child elements. It has the following syntax:

xmlns="namespaceURI"

 

 

 

 

 

 

Qualified name and Unqualified names

Unqualified Names

All names in a document have a fully-qualified form. Unless explicitly qualified, the outermost element in a document is always assumed to come from the document’s DTD namespace. For contained elements, the explicit namespace and “:” can also be omitted. If so, the enclosing element’s namespace is inferred (for sub-elements, attributes, etc.). These rules ensure that all names in all existing XML documents have fully-qualified names matching their current meaning, and all validations remain unchanged. The processing-instruction syntax used in this section comes from an early namespace proposal; the W3C Namespaces Recommendation declares namespaces with xmlns attributes instead.

Here is an example of explicit and inferred namespaces:

<?xml:namespace href="http://zoo.org/schema.dtd/" as="zoo"?>
<?xml:namespace href="http://foodstore.com/schema.dtd/" as="food"?>
<zoo:animalFriends>
<Bear>Christopher</Bear>
<Rabbit size="small" food:eats="dandelions">Sabrina</Rabbit>
</zoo:animalFriends>

Here, Bear, Rabbit, and size are qualified to the http://zoo.org/schema.dtd/ namespace, while eats is qualified to http://foodstore.com/schema.dtd/.

The pre-defined attribute xml:child-namespace sets a default namespace for contained elements differing from that of the parent.

<?xml:namespace href="http://zoo.org/schema.dtd/" as="zoo"?>
<?xml:namespace href="http://w3.org/HTML/schema.dtd/" as="html"?>
<html:ul xml:child-namespace="zoo">
<Bear>Christopher</Bear>
<Rabbit>Sabrina</Rabbit>
<Pig>Claude</Pig>
<Hedgehog>Alfred</Hedgehog>
</html:ul>

 

 

 

 

 

Namespace scope

The namespace declaration is considered to apply to the element where it is specified and to all elements within the content of that element, unless overridden by another namespace declaration with the same NSAttName part:

<?xml version="1.0"?>
<!-- all elements here are explicitly in the HTML namespace -->
<html:html xmlns:html='http://www.w3.org/TR/REC-html40'>
<html:head><html:title>Frobnostication</html:title></html:head>
<html:body><html:p>Moved to
<html:a href='http://frob.com'>here.</html:a></html:p></html:body>
</html:html>

Multiple namespace prefixes can be declared as attributes of a single element, as shown in this example:

 

<?xml version="1.0"?>
<!-- both namespace prefixes are available throughout -->
<bk:book xmlns:bk='urn:loc.gov:books'
xmlns:isbn='urn:ISBN:0-395-36341-6'>
<bk:title>Cheaper by the Dozen</bk:title>
<isbn:number>1568491379</isbn:number>
</bk:book>
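To see what a namespace-aware parser makes of the prefixes in the book example, the sketch below parses it with Python's standard xml.etree.ElementTree (an assumption; any namespace-aware parser would do). The short prefixes b and i in the lookup map are hypothetical and deliberately differ from the ones in the document.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<bk:book xmlns:bk='urn:loc.gov:books'
         xmlns:isbn='urn:ISBN:0-395-36341-6'>
<bk:title>Cheaper by the Dozen</bk:title>
<isbn:number>1568491379</isbn:number>
</bk:book>"""

root = ET.fromstring(BOOK_XML)

# ElementTree expands each prefixed name to "{namespace URI}local-name".
root_tag = root.tag
child_tags = [child.tag for child in root]

# Lookups use their own prefix map; only the URIs have to match the document.
prefixes = {"b": "urn:loc.gov:books", "i": "urn:ISBN:0-395-36341-6"}
isbn_number = root.findtext("i:number", namespaces=prefixes)
title = root.findtext("b:title", namespaces=prefixes)
```

The prefix is only a local shorthand. What identifies an element is the pair of namespace URI and local name, which is why the lookup works with different prefixes.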

 

 

Default namespace

 

A default namespace is considered to apply to the element where it is declared (if that element has no namespace prefix), and to all elements with no prefix within the content of that element. If the URI reference in a default namespace declaration is empty, then unprefixed elements in the scope of the declaration are not considered to be in any namespace. Note that default namespaces do not apply directly to attributes.

 

<?xml version="1.0"?>
<!-- elements are in the HTML namespace, in this case by default -->
<html xmlns='http://www.w3.org/TR/REC-html40'>
<head><title>Frobnostication</title></head>
<body><p>Moved to
<a href='http://frob.com'>here</a>.</p></body>
</html>

<?xml version="1.0"?>
<!-- unprefixed element types are from "books" -->
<book xmlns='urn:loc.gov:books'
xmlns:isbn='urn:ISBN:0-395-36341-6'>
<title>Cheaper by the Dozen</title>
<isbn:number>1568491379</isbn:number>
</book>
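The point that a default namespace applies to unprefixed elements but not to attributes can be observed directly. The sketch assumes Python's standard xml.etree.ElementTree and reuses the HTML example above; the constant name HTML_NS is hypothetical.

```python
import xml.etree.ElementTree as ET

HTML_NS = "http://www.w3.org/TR/REC-html40"

HTML_XML = """<html xmlns='http://www.w3.org/TR/REC-html40'>
<head><title>Frobnostication</title></head>
<body><p>Moved to
<a href='http://frob.com'>here</a>.</p></body>
</html>"""

root = ET.fromstring(HTML_XML)

# Unprefixed elements inherit the default namespace, so the search
# has to name the element together with its namespace URI.
link = root.find(f".//{{{HTML_NS}}}a")

element_name = link.tag
attribute_names = list(link.attrib)
```

The element name comes back qualified with the namespace URI, while the href attribute keeps its plain name.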

 

The default namespace can be set to the empty string. Within the scope of the declaration, this has the same effect as there being no default namespace.

 

<?xml version='1.0'?>
<Beers>
<!-- the default namespace is now that of HTML -->
<table xmlns='http://www.w3.org/TR/REC-html40'>
<th><td>Name</td><td>Origin</td><td>Description</td></th>
<tr>
<!-- no default namespace inside table cells -->
<td><brandName xmlns="">Huntsman</brandName></td>
<td><origin xmlns="">Bath, UK</origin></td>
<td>
<details xmlns=""><class>Bitter</class><hop>Fuggles</hop>
<pro>Wonderful hop, light alcohol, good summer beer</pro>
<con>Fragile; excessive variance pub to pub</con>
</details>
</td>
</tr>
</table>
</Beers>
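The effect of resetting the default namespace with an empty string can be checked by listing the expanded names a parser reports. The sketch assumes Python's standard xml.etree.ElementTree and uses a shortened, hypothetical cut of the Beers example with a single table cell.

```python
import xml.etree.ElementTree as ET

BEERS_XML = """<Beers>
<table xmlns='http://www.w3.org/TR/REC-html40'>
<tr>
<td><brandName xmlns="">Huntsman</brandName></td>
</tr>
</table>
</Beers>"""

root = ET.fromstring(BEERS_XML)

# Walk the tree in document order and record each expanded element name.
tags = [element.tag for element in root.iter()]
```

Beers and brandName come back without any namespace, while table, tr, and td carry the HTML namespace URI declared on <table>.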

 

 

 

 

font, font size font style text alignment text indent, line height, color and Background

 

 

 

Properties : Foreground color

 

 

 

Background color

 

 

 

Border color

 

 

Background image

 

 

 

 

 

 

 

 

 

 

 

 
