1// Copyright (C) 2020 The Qt Company Ltd.
2// SPDX-License-Identifier: LicenseRef-Qt-Commercial OR GFDL-1.3-no-invariants-only
8 \brief Classes that support XML.
10 These classes are relevant to \l{XML Processing}{XML} users.
12 \generatelist{related}
16 \page xml-processing.html
19 \brief An Overview of the XML processing facilities in Qt.
21 Qt provides two general-purpose sets of APIs to read and write well-formed
22 XML: \l{XML Streaming}{stream based} and
23 \l{Working with the DOM Tree}{DOM based}.
25 Qt also provides specific support for some XML dialects. For instance, the
26 Qt SVG module provides the QSvgRenderer and QSvgGenerator classes to read
27 and write a subset of SVG, an XML-based file
28 format. Qt also provides helper functions that may be useful to
29 those working with XML and XHTML: see QString::toHtmlEscaped() and
30 Qt::convertFromPlainText().
35 \li \l {Classes for XML Processing}
36 \li \l {An Introduction to Namespaces}
37 \li \l {XML Streaming}
38 \li \l {Working with the DOM Tree}
41 \section1 Classes for XML Processing
43 These classes are relevant to XML users.
45 \annotatedlist xml-tools
49 \page xml-namespaces.html
50 \title An Introduction to Namespaces
53 \nextpage XML Streaming
55 Parts of the Qt XML module documentation assume that you are familiar
56 with XML namespaces. Here we present a brief introduction; skip to
57 \l{#namespacesConventions}{Qt XML documentation conventions}
58 if you already know this material.
60 Namespaces are a concept introduced into XML to allow a more modular
61 design. With their help, data processing software can easily resolve
62 naming conflicts in XML documents.
64 Consider the following example:
66 \snippet code/doc_src_qtxml.qdoc 6
68 Here we find three different uses of the name \e title. If you wish to
69 process this document you will encounter problems because each of the
70 \e titles should be displayed in a different manner -- even though
71 they have the same name.
73 The solution is to have some means of identifying the first
74 occurrence of \e title as the title of a book, that is, to use the
75 \e title element of a book namespace to distinguish it from, for
76 example, the chapter title:
77 \snippet code/doc_src_qtxml.qdoc 7
79 \e book in this case is a \e prefix denoting the namespace.
81 Before we can apply a namespace to element or attribute names we must
82 declare it.
84 Namespaces are URIs like \e http://www.example.com/fnord/book/. This
85 does not mean that data must be available at this address; the URI is
86 simply used to provide a unique name.
88 We declare namespaces in the same way as attributes; strictly speaking
89 they \e are attributes. To make, for example, \e
90 http://www.example.com/fnord/ the document's default XML namespace \e
91 xmlns, we write:
93 \snippet code/doc_src_qtxml.qdoc 8
95 To distinguish the \e http://www.example.com/fnord/book/ namespace from
96 the default, we must supply it with a prefix:
98 \snippet code/doc_src_qtxml.qdoc 9
100 A namespace that is declared like this can be applied to element and
101 attribute names by prepending the appropriate prefix and a ":"
102 delimiter. We have already seen this with the \e book:title element.
104 Element names without a prefix belong to the default namespace. This
105 rule does not apply to attributes: an attribute without a prefix does
106 not belong to any of the declared XML namespaces at all. Attributes
107 always belong to the "traditional" namespace of the element in which
108 they appear. A "traditional" namespace is not an XML namespace; it
109 simply means that all attribute names belonging to one element must be
110 different. Later we will see how to assign an XML namespace to an
111 attribute.
113 Because attributes without prefixes are not in any XML
114 namespace, there is no collision between the attribute \e title (that
115 belongs to the \e author element) and, for example, the \e title element
116 within a \e chapter.
118 Let's clarify this with an example:
119 \snippet code/doc_src_qtxml.qdoc 10
121 Within the \e document element we have two namespaces declared. The
122 default namespace \e http://www.example.com/fnord/ applies to the \e
123 book element, the \e chapter element, the appropriate \e title element
124 and of course to \e document itself.
126 The \e book:author and \e book:title elements belong to the namespace
127 with the URI \e http://www.example.com/fnord/book/.
129 The two \e book:author attributes \e title and \e name have no XML
130 namespace assigned. They are only members of the "traditional"
131 namespace of the element \e book:author, meaning that for example two
132 \e title attributes in \e book:author are forbidden.
134 In the above example we circumvent the last rule by adding a \e title
135 attribute from the \e http://www.example.com/fnord/ namespace to \e
136 book:author: the \e fnord:title comes from the namespace with the
137 prefix \e fnord that is declared in the \e book:author element.
139 Clearly the \e fnord namespace has the same namespace URI as the
140 default namespace. So why didn't we simply use the default namespace
141 we'd already declared? The answer is quite complex:
143 \li attributes without a prefix don't belong to any XML namespace at
144 all, not even to the default namespace;
145 \li additionally omitting the prefix would lead to a \e title-title clash;
146 \li writing it as \e xmlns:title would declare a new namespace with the
147 prefix \e title instead of applying the default \e xmlns namespace.
150 With the Qt XML classes, elements and attributes can be accessed in two
151 ways: either by referring to their qualified names consisting of the
152 namespace prefix and the "real" name (or \e local name) or by the
153 combination of local name and namespace URI.
155 More information on XML namespaces can be found at
156 \l http://www.w3.org/TR/REC-xml-names/.
158 \target namespacesConventions
159 \section1 Conventions Used in the Qt XML Documentation
161 The following terms are used to distinguish the parts of names within
162 the context of namespaces:
164 \li The \e {qualified name}
165 is the name as it appears in the document. (In the above example \e
166 book:title is a qualified name.)
167 \li A \e {namespace prefix} in a qualified name
168 is the part to the left of the ":". (\e book is the namespace prefix in
169 \e book:title.)
170 \li The \e {local part} of a name (also referred to as the \e {local
171 name}) appears to the right of the ":". (Thus \e title is the
172 local part of \e book:title.)
173 \li The \e {namespace URI} ("Uniform Resource Identifier") is a unique
174 identifier for a namespace. It looks like a URL
175 (e.g. \e http://www.example.com/fnord/ ) but does not require
176 data to be accessible by the given protocol at the named address.
179 Elements without a ":" (like \e chapter in the example) do not have a
180 namespace prefix. In this case the local part and the qualified name
181 are identical (i.e. \e chapter).
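
The terms above can be illustrated with a small sketch in standard C++.
The function \c splitQualifiedName() is a hypothetical helper written for
this illustration; it is not part of Qt and performs no validation of the
name.

```cpp
#include <string>
#include <string_view>
#include <utility>

// Hypothetical helper (not part of Qt): splits a qualified name such as
// "book:title" into {namespace prefix, local part}. A name without ":"
// has an empty prefix, and its local part equals the qualified name.
std::pair<std::string, std::string> splitQualifiedName(std::string_view qualifiedName)
{
    const auto colon = qualifiedName.find(':');
    if (colon == std::string_view::npos)
        return {std::string(), std::string(qualifiedName)};
    return {std::string(qualifiedName.substr(0, colon)),
            std::string(qualifiedName.substr(colon + 1))};
}
```

For \e book:title this yields the prefix \e book and the local part
\e title; for \e chapter the prefix is empty and the local part is
\e chapter.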
183 \sa {DOM Bookmarks Application}
187 \page xml-streaming.html
190 \previouspage An Introduction to Namespaces
191 \nextpage Working with the DOM Tree
193 Qt provides two classes for reading and writing XML through a simple streaming
194 API: QXmlStreamReader and QXmlStreamWriter. These classes are located in
195 \l{Qt Serialization}{Qt Serialization (part of QtCore)}.
197 A stream reader reports an XML document as a stream
198 of tokens. This differs from SAX: SAX applications provide handlers that
199 receive XML events from the parser, whereas with QXmlStreamReader the
200 application code drives the loop, pulling tokens from the reader when they are needed.
201 This pull approach makes it possible to build recursive descent parsers,
202 allowing XML parsing code to be split into different methods or classes.
204 QXmlStreamReader is a well-formed XML 1.0 parser that excludes external
205 parsed entities. Hence, data provided by the stream reader adheres to the
206 W3C's criteria for well-formed XML, as long as no error occurs. Otherwise,
207 functions such as \l{QXmlStreamReader::atEnd()}{atEnd()},
208 \l{QXmlStreamReader::error()}{error()} and \l{QXmlStreamReader::hasError()}
209 {hasError()} can be used to check and view the errors.
211 An example of an implementation that uses QXmlStreamReader is the
212 \l{QXmlStream Bookmarks Example#xbelreader-class-definition}{XbelReader} in
213 \l{QXmlStream Bookmarks Example}, which wraps a QXmlStreamReader. Read the
214 \l{QXmlStream Bookmarks Example#xbelreader-class-implementation}{implementation}
215 to learn more about how to use the QXmlStreamReader class.
217 Paired with QXmlStreamReader is the QXmlStreamWriter class, which provides
218 an XML writer with a simple streaming API. QXmlStreamWriter operates on a
219 QIODevice and has specialized functions for all XML tokens or events you
220 want to write, such as \l{QXmlStreamWriter::writeDTD()}{writeDTD()},
221 \l{QXmlStreamWriter::writeCharacters()}{writeCharacters()},
222 \l{QXmlStreamWriter::writeComment()}{writeComment()} and so on.
224 To write an XML document with QXmlStreamWriter, you start a document with the
225 \l{QXmlStreamWriter::writeStartDocument()}{writeStartDocument()} function
226 and end it with \l{QXmlStreamWriter::writeEndDocument()}
227 {writeEndDocument()}, which implicitly closes all remaining open tags.
228 Element tags are opened with \l{QXmlStreamWriter::writeStartElement()}
229 {writeStartElement()} and followed by
230 \l{QXmlStreamWriter::writeAttribute()}{writeAttribute()} or
231 \l{QXmlStreamWriter::writeAttributes()}{writeAttributes()},
232 element content, and then \l{QXmlStreamWriter::writeEndElement()}
233 {writeEndElement()}. Also, \l{QXmlStreamWriter::writeEmptyElement()}
234 {writeEmptyElement()} can be used to write empty elements.
236 Element content comprises characters, entity references or nested elements.
237 Content can be written with \l{QXmlStreamWriter::writeCharacters()}
238 {writeCharacters()}, a function that also takes care of escaping all
239 forbidden characters and character sequences,
240 \l{QXmlStreamWriter::writeEntityReference()}{writeEntityReference()},
241 or subsequent calls to \l{QXmlStreamWriter::writeStartElement()}
242 {writeStartElement()}.
244 The \l{QXmlStream Bookmarks Example#xbelwriter-class-definition}{XbelWriter}
245 class from \l{QXmlStream Bookmarks Example} wraps a QXmlStreamWriter. View
246 the \l{QXmlStream Bookmarks Example#xbelwriter-class-implementation}{implementation}
247 to see how to use the QXmlStreamWriter class.
252 \title Working with the DOM Tree
255 \previouspage XML Streaming
257 DOM Level 2 is a W3C Recommendation for XML interfaces that maps the
258 constituents of an XML document to a tree structure. The specification
259 of DOM Level 2 can be found at \l{http://www.w3.org/DOM/}.
262 \section1 Introduction to DOM
264 DOM provides an interface to access and change the content and
265 structure of an XML file. It presents a hierarchical view of the document
266 (a tree view). Thus, in contrast to the streaming API provided
267 by QXmlStreamReader, an object
268 model of the document is resident in memory after parsing, which makes
269 manipulation easy.
271 All DOM nodes in the document tree are subclasses of \l QDomNode. The
272 document itself is represented as a \l QDomDocument object.
274 Here are the available node classes and their potential child classes:
277 \li \l QDomDocument: Possible children are
279 \li \l QDomElement (at most one)
280 \li \l QDomProcessingInstruction
282 \li \l QDomDocumentType
284 \li \l QDomDocumentFragment: Possible children are
287 \li \l QDomProcessingInstruction
290 \li \l QDomCDATASection
291 \li \l QDomEntityReference
293 \li \l QDomDocumentType: No children
294 \li \l QDomEntityReference: Possible children are
297 \li \l QDomProcessingInstruction
300 \li \l QDomCDATASection
301 \li \l QDomEntityReference
303 \li \l QDomElement: Possible children are
308 \li \l QDomProcessingInstruction
309 \li \l QDomCDATASection
310 \li \l QDomEntityReference
312 \li \l QDomAttr: Possible children are
315 \li \l QDomEntityReference
317 \li \l QDomProcessingInstruction: No children
318 \li \l QDomComment: No children
319 \li \l QDomText: No children
320 \li \l QDomCDATASection: No children
321 \li \l QDomEntity: Possible children are
324 \li \l QDomProcessingInstruction
327 \li \l QDomCDATASection
328 \li \l QDomEntityReference
330 \li \l QDomNotation: No children
333 Two collection classes are provided, \l QDomNodeList and
334 \l QDomNamedNodeMap: \l QDomNodeList is a list of nodes,
335 and \l QDomNamedNodeMap handles unordered sets of nodes
336 (often used for attributes).
338 The \l QDomImplementation class allows the user to query features of the
339 DOM implementation.
341 To get started please refer to the \l QDomDocument documentation.
342 You might also want to take a look at the \l{DOM Bookmarks Application},
343 which illustrates how to read and write an XML bookmark file (XBEL)
344 using DOM.