The XML Framework is a collection of components for event-based XML parsing and provides a content-processing architecture.
The XML Framework provides configurable features for parsing XML and WBXML (WAP Binary XML) through a single interface, with options for validating against a specification and auto-correcting spelling errors in the validated text. It is based on the SAX (Simple API for XML) specification.
You must have a basic understanding of XML before using the XML Framework.
The following are the key concepts of the XML Framework:
Attribute
A name-value pair separated by an equals sign, for example author="Jane Austen".
Attribute type
One of the data types defined for attributes, for instance CDATA.
Client
An application which uses the XML framework for parsing or generating a document.
Document Type Definition (DTD)
A document which defines a particular use of XML entities (the names, attributes and values permitted).
Extension
WBXML extends XML syntax with extension tokens, which different applications use in different ways. For example, an extension token can refer to a string table that is created specifically for each message and transmitted in the introduction of the message.
Parser
An interface to the XML framework which allows a client to access the parser plug-ins, each of which is specific to a mark-up language. Examples are the XML Expat parser and the WBXML parser.
String dictionary collection
A class that holds a collection of string dictionaries.
String dictionary plug-ins
The XML Framework allows the strings for a particular DTD, XML namespace or WBXML code page to be stored in an ECOM plug-in that the parser and the client can access as required. These plug-ins are referred to as string dictionary plug-ins.
String pool
A mechanism for storing strings in a way that allows them to be compared quickly.
String table
A table of frequently encountered strings that is used to encode and decode a WBXML document. The body of the document references these strings by index, which compresses the data.
Uniform Resource Identifier (URI)
The web address associated with a prefix. For instance, http://www.w3.org/XML/1998/namespace.
WBXML
WAP Binary XML (WBXML) is a binary representation of XML. It was developed by the Open Mobile Alliance as a standard to allow XML documents to be transmitted in a compact manner over mobile networks and was proposed as an addition to the World Wide Web Consortium's Wireless Application Protocol family of standards.
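The string pool entry above can be illustrated with a minimal sketch in standard C++. It is not the Symbian string pool API: the class name `StringPool` and its member functions are assumptions made for this example. The idea shown is that each distinct string is stored once and identified by a small handle, so comparing two pooled strings becomes an integer comparison.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Illustrative only: a tiny string pool, not the Symbian implementation.
class StringPool {
public:
    using Handle = std::size_t;

    // Returns the handle for `text`, adding the string to the pool on first use.
    Handle intern(const std::string& text) {
        auto found = index_.find(text);
        if (found != index_.end()) {
            return found->second;
        }
        Handle handle = strings_.size();
        strings_.push_back(text);
        index_.emplace(text, handle);
        return handle;
    }

    // Returns the text stored for a handle previously returned by intern().
    const std::string& text(Handle handle) const { return strings_.at(handle); }

    // Number of distinct strings held by the pool.
    std::size_t size() const { return strings_.size(); }

private:
    std::vector<std::string> strings_;
    std::unordered_map<std::string, Handle> index_;
};
```

Interning "title" twice yields the same handle both times, so a parser and a client that share one pool can test element names for equality without comparing characters.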
The following diagram illustrates the XML framework, consisting of a client and a parser:
Figure: Block diagram of XML framework
The XML framework consists of classes which model the main constituents of the architecture - the framework as a whole, the parser plug-ins and extensions to XML, the content processor chain and the content handler mechanism.
The XML and WBXML parsers convert the contents of a document to UTF-8 format. This is to ensure that extended characters are not lost from the document by the String Pool. Expat is the engine behind the XML parser plug-in.
The XML Framework allows strings to be stored for a particular DTD, XML namespace or WBXML code page in an ECOM plug-in that can be accessed when required by the Parser and the Client. These plug-ins are referred to as String Dictionary Plug-ins and they are managed through a string dictionary collection object. See String Dictionary.
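As a rough illustration of how a collection of string dictionaries might be organised, the following standard C++ sketch keeps one dictionary per identifier and resolves numeric tokens to strings. The class names, the identifier `urn:example:notes` and the token values used with it are all hypothetical; the real plug-ins are ECOM components and are not modelled here.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for one string dictionary: it maps the numeric
// tokens of a single DTD, namespace or code page to their strings.
class StringDictionary {
public:
    void define(int token, const std::string& text) { strings_[token] = text; }

    // Returns the string for `token`, or nullptr if the token is unknown.
    const std::string* lookup(int token) const {
        auto found = strings_.find(token);
        return found == strings_.end() ? nullptr : &found->second;
    }

private:
    std::map<int, std::string> strings_;
};

// Hypothetical collection that hands out dictionaries by identifier.
class StringDictionaryCollection {
public:
    void add(const std::string& id, StringDictionary dictionary) {
        dictionaries_[id] = std::move(dictionary);
    }

    // Returns the dictionary registered under `id`, or nullptr if none exists.
    const StringDictionary* find(const std::string& id) const {
        auto found = dictionaries_.find(id);
        return found == dictionaries_.end() ? nullptr : &found->second;
    }

private:
    std::map<std::string, StringDictionary> dictionaries_;
};
```

A parser that meets token 5 while reading a document bound to `urn:example:notes` would ask the collection for that dictionary and then look the token up, receiving a null pointer when either is unknown.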
Libxml2 provides XML processing, parsing and validation APIs. See libxml2.
Plug-in 1 and Plug-in 2 are examples of optional processors, which may be chained to the parser output to allow further processing of the data before the client receives it. Such a plug-in can be, for example, a DTD validator or a document auto-corrector. The chain is not limited to two plug-ins.
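The chaining described above can be sketched in standard C++ as a series of handlers, each of which forwards events to the next. This is only an illustration of the pattern: `ContentHandler`, `Processor`, `LowerCaseNames`, `AllowedNames` and `RecordingClient` are names invented for this example, and the two stages are simplified stand-ins for an auto-corrector and a validator.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Illustrative event interface; not the framework's own classes.
class ContentHandler {
public:
    virtual ~ContentHandler() = default;
    virtual void onStartElement(const std::string& name) = 0;
    virtual void onEndElement(const std::string& name) = 0;
};

// A processor is a handler that passes events on to the next stage.
class Processor : public ContentHandler {
public:
    explicit Processor(ContentHandler& next) : next_(next) {}

protected:
    ContentHandler& next_;
};

// Stand-in for an auto-corrector: normalises element names to lower case.
class LowerCaseNames : public Processor {
public:
    explicit LowerCaseNames(ContentHandler& next) : Processor(next) {}
    void onStartElement(const std::string& name) override { next_.onStartElement(lower(name)); }
    void onEndElement(const std::string& name) override { next_.onEndElement(lower(name)); }

private:
    static std::string lower(std::string text) {
        for (char& c : text) {
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
        return text;
    }
};

// Stand-in for a validator: forwards only elements on an allow-list.
class AllowedNames : public Processor {
public:
    AllowedNames(ContentHandler& next, std::set<std::string> allowed)
        : Processor(next), allowed_(std::move(allowed)) {}

    void onStartElement(const std::string& name) override {
        if (allowed_.count(name)) next_.onStartElement(name); else ++rejected_;
    }
    void onEndElement(const std::string& name) override {
        if (allowed_.count(name)) next_.onEndElement(name); else ++rejected_;
    }

    // Number of events dropped because the element was not allowed.
    std::size_t rejected() const { return rejected_; }

private:
    std::set<std::string> allowed_;
    std::size_t rejected_ = 0;
};

// The client at the end of the chain records what it receives.
class RecordingClient : public ContentHandler {
public:
    std::vector<std::string> events;
    void onStartElement(const std::string& name) override { events.push_back("start:" + name); }
    void onEndElement(const std::string& name) override { events.push_back("end:" + name); }
};
```

The chain is built back to front: the client is created first, the validator is given the client as its next stage, and the corrector is given the validator. The parser then only needs a reference to the first stage, which is why further stages can be added without changing the parser or the client.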
Parser framework
The XML framework contains the parser framework, which is represented by the CParser class. A client with an XML document to be parsed creates a CParser object and calls its parse functions. CParser obtains the data about the plug-ins from the CMatchData class and the data about the document to be parsed from the RDocumentParameters class.
The parser framework conforms to the event-based SAX specification. It outputs an event when it starts or finishes reading one of the following:
a document
a start tag
an end tag
a prefix mapping
a processing instruction
character data
ignorable white space
For more information on XML-related concepts, refer to W3C or similar sources.
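To make the event-based model concrete, here is a toy scanner in standard C++ that accepts a document in chunks and reports events to a handler. It is a sketch under strong assumptions: it recognises only `<name>`, `</name>` and character data, it has no attributes, entities, namespaces or error handling, and the names `EventHandler`, `TinyScanner`, `parseBegin`, `parse` and `parseEnd` are invented for this example rather than taken from the framework.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified, hypothetical subset of SAX-style callbacks.
class EventHandler {
public:
    virtual ~EventHandler() = default;
    virtual void onStartDocument() = 0;
    virtual void onStartElement(const std::string& name) = 0;
    virtual void onContent(const std::string& text) = 0;
    virtual void onEndElement(const std::string& name) = 0;
    virtual void onEndDocument() = 0;
};

// Toy scanner: emits an event as soon as a complete item has been read.
class TinyScanner {
public:
    explicit TinyScanner(EventHandler& handler) : handler_(handler) {}

    void parseBegin() {
        pending_.clear();
        handler_.onStartDocument();
    }

    // Accepts the next chunk of the document; a tag may be split across chunks.
    void parse(const std::string& chunk) {
        pending_ += chunk;
        std::size_t pos = 0;
        while (pos < pending_.size()) {
            if (pending_[pos] == '<') {
                std::size_t close = pending_.find('>', pos);
                if (close == std::string::npos) break;  // incomplete tag: wait for more data
                std::string tag = pending_.substr(pos + 1, close - pos - 1);
                if (!tag.empty() && tag[0] == '/') {
                    handler_.onEndElement(tag.substr(1));
                } else {
                    handler_.onStartElement(tag);
                }
                pos = close + 1;
            } else {
                std::size_t open = pending_.find('<', pos);
                if (open == std::string::npos) open = pending_.size();
                handler_.onContent(pending_.substr(pos, open - pos));
                pos = open;
            }
        }
        pending_.erase(0, pos);  // keep only the unconsumed tail
    }

    void parseEnd() { handler_.onEndDocument(); }

private:
    EventHandler& handler_;
    std::string pending_;
};

// Example client that records the events it receives, in order.
class RecordingHandler : public EventHandler {
public:
    std::vector<std::string> events;
    void onStartDocument() override { events.push_back("startDocument"); }
    void onStartElement(const std::string& name) override { events.push_back("start:" + name); }
    void onContent(const std::string& text) override { events.push_back("text:" + text); }
    void onEndElement(const std::string& name) override { events.push_back("end:" + name); }
    void onEndDocument() override { events.push_back("endDocument"); }
};
```

Because events are emitted as data arrives, character data that spans two chunks is delivered in two `onContent` calls. A client that needs the whole text of an element has to accumulate it between the start and end events.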
Parser plug-ins
CParser is the interface to the XML framework that allows the client to access the parser plug-ins, each of which is specific to a mark-up language (for example XML or WBXML). Each parser plug-in implements the MParser interface. It is associated, through the TParserInitParams class, with a character set converter (to convert other formats to Unicode), a string dictionary and an element stack.
The Symbian platform framework is delivered with three parser plug-ins, two for XML and one for WBXML.
The first XML parser consists of the CXmlParser class, which is wrapped around the CExpat class, an implementation of the stream-based Expat parser.
The second XML parser consists of the CXMLEngineSAXPlugin class, which encapsulates the SAX parser of the libxml2 component. It is not available if the Symbian platform build excludes this component.
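One way to picture how a single front end can give access to several mark-up-specific plug-ins is a registry keyed by document type. The standard C++ sketch below assumes that plug-ins are selected by a MIME-type string. That key, the type strings used with it, and the names `ParserPlugin`, `NamedPlugin` and `PluginRegistry` are all assumptions made for illustration; the real framework resolves plug-ins through ECOM.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical plug-in interface; each plug-in handles one mark-up language.
class ParserPlugin {
public:
    virtual ~ParserPlugin() = default;
    virtual std::string name() const = 0;
};

// Trivial plug-in used to populate the registry in this sketch.
class NamedPlugin : public ParserPlugin {
public:
    explicit NamedPlugin(std::string name) : name_(std::move(name)) {}
    std::string name() const override { return name_; }

private:
    std::string name_;
};

// Hypothetical registry that creates the plug-in matching a document type.
class PluginRegistry {
public:
    using Factory = std::function<std::unique_ptr<ParserPlugin>()>;

    void add(const std::string& mimeType, Factory factory) {
        factories_[mimeType] = std::move(factory);
    }

    // Returns a new plug-in for `mimeType`, or an empty pointer if none is registered.
    std::unique_ptr<ParserPlugin> create(const std::string& mimeType) const {
        auto found = factories_.find(mimeType);
        if (found == factories_.end()) {
            return std::unique_ptr<ParserPlugin>();
        }
        return found->second();
    }

private:
    std::map<std::string, Factory> factories_;
};
```

A build that leaves out one plug-in simply never registers its factory, and a request for that type then returns an empty pointer that the caller has to check.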
Extensions to XML
The XML framework provides extensions to XML. At present, only WBXML is implemented. WBXML requires string dictionaries and extension tokens to store the element strings specific to WBXML. These are represented by the RStringDictionaryCollection, MWbxmlExtensionHandler and TExtensionTokens classes.
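The compression that a per-document string table provides can be sketched in standard C++ as follows. The sketch assumes the table is one block of NUL-terminated strings and that a reference is the byte offset of a string within that block, which is how the WBXML specification describes string table references for NUL-terminated encodings. The class `StringTable` and its member functions are invented for this example and are not part of the framework.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Illustrative only: a string table held as one block of NUL-terminated strings.
class StringTable {
public:
    // Appends a string and returns the offset that refers to it.
    std::size_t add(const std::string& text) {
        std::size_t offset = block_.size();
        block_ += text;
        block_.push_back('\0');
        return offset;
    }

    // Returns the string that starts at `offset` and runs to the next NUL.
    std::string at(std::size_t offset) const {
        if (offset >= block_.size()) {
            throw std::out_of_range("offset outside string table");
        }
        std::size_t end = block_.find('\0', offset);
        return block_.substr(offset, end - offset);
    }

    // Total size of the table in bytes, including terminators.
    std::size_t sizeInBytes() const { return block_.size(); }

private:
    std::string block_;
};
```

The body of a document can then carry a short offset each time a string recurs instead of repeating its characters, and the decoder restores the text by looking the offset up in the table.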
Content handlers
A client application that is designed to react to the events output by the XML framework must implement the MContentHandler interface. The functions to be implemented correspond to the SAX events listed in the Parser framework section.
The XML Framework exports the following APIs:
API | Description |
---|---|
CExpat | Encapsulates the Expat XML parser. |
CXmlParser | Implementation of the stream-based Expat parser. |
CXMLEngineSAXPlugin | Encapsulates the SAX parser of the libxml2 component. |
CParser | Represents the entire parser framework. |
CMatchData | Holds the data about the plug-ins. |
RDocumentParameters | Holds the data about the document to be parsed. |
RElementStack | Data structure used to store XML elements and check the tag ordering. |
RStringDictionaryCollection | Holds a collection of dictionaries requested by the user. |