<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
<link rel="SHORTCUT ICON" href="/favicon.ico">
<style type="text/css"><!--
TD {font-family: Verdana,Arial,Helvetica}
BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
H1 {font-family: Verdana,Arial,Helvetica}
H2 {font-family: Verdana,Arial,Helvetica}
H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>A real example</title>
</head>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="gnome2.png" alt="Gnome2 Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a><div align="left"><a href="http://xmlsoft.org/"><img src="Libxml2-Logo-180x168.gif" alt="Made with Libxml2 Logo"></a></div>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>A real example</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul>
<li><a href="index.html">Home</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XMLinfo.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="python.html">Python and bindings</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="threads.html">Thread safety</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="tutorial/index.html">Tutorial</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>API Indexes</b></center></td></tr>
<tr><td bgcolor="#fffacd">
<form action="search.php" enctype="application/x-www-form-urlencoded" method="GET">
<input name="query" type="TEXT" size="20" value=""><input name="submit" type="submit" value="Search ...">
</form>
<ul>
<li><a href="APIchunk0.html">Alphabetic</a></li>
<li><a href="APIconstructors.html">Constructors</a></li>
<li><a href="APIfunctions.html">Functions/Types</a></li>
<li><a href="APIfiles.html">Modules</a></li>
<li><a href="APIsymbols.html">Symbols</a></li>
</ul>
</td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul>
<li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
<li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
<li><a href="http://phd.cs.unibo.it/gdome2/">DOM gdome2</a></li>
<li><a href="http://www.aleksey.com/xmlsec/">XML-DSig xmlsec</a></li>
<li><a href="ftp://xmlsoft.org/">FTP</a></li>
<li><a href="http://www.fh-frankfurt.de/~igor/projects/libxml/">Windows binaries</a></li>
<li><a href="http://garypennington.net/libxml2/">Solaris binaries</a></li>
<li><a href="http://www.zveno.com/open_source/libxml2xslt.html">MacOsX binaries</a></li>
<li><a href="http://sourceforge.net/projects/libxml2-pas/">Pascal bindings</a></li>
<li><a href="http://bugzilla.gnome.org/buglist.cgi?product=libxml&amp;product=libxml2">Bug Tracker</a></li>
</ul></td></tr>
</table>
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Here is a real-size example, in which the application data is not kept
in the DOM tree but copied into internal structures. It is based on a
proposal to keep a database of jobs related to Gnome, with an XML-based
storage structure. Here is an <a href="gjobs.xml">XML-encoded jobs
base</a>:</p>
<pre>&lt;?xml version=&quot;1.0&quot;?&gt;
&lt;gjob:Helping xmlns:gjob=&quot;http://www.gnome.org/some-location&quot;&gt;
  &lt;gjob:Jobs&gt;

    &lt;gjob:Job&gt;
      &lt;gjob:Project ID=&quot;3&quot;/&gt;
      &lt;gjob:Application&gt;GBackup&lt;/gjob:Application&gt;
      &lt;gjob:Category&gt;Development&lt;/gjob:Category&gt;

      &lt;gjob:Update&gt;
        &lt;gjob:Status&gt;Open&lt;/gjob:Status&gt;
        &lt;gjob:Modified&gt;Mon, 07 Jun 1999 20:27:45 -0400 MET DST&lt;/gjob:Modified&gt;
        &lt;gjob:Salary&gt;USD 0.00&lt;/gjob:Salary&gt;
      &lt;/gjob:Update&gt;

      &lt;gjob:Developers&gt;
        &lt;gjob:Developer&gt;
        &lt;/gjob:Developer&gt;
      &lt;/gjob:Developers&gt;

      &lt;gjob:Contact&gt;
        &lt;gjob:Person&gt;Nathan Clemons&lt;/gjob:Person&gt;
        &lt;gjob:Email&gt;nathan@windsofstorm.net&lt;/gjob:Email&gt;
        &lt;gjob:Company&gt;
        &lt;/gjob:Company&gt;
        &lt;gjob:Organisation&gt;
        &lt;/gjob:Organisation&gt;
        &lt;gjob:Webpage&gt;
        &lt;/gjob:Webpage&gt;
        &lt;gjob:Snailmail&gt;
        &lt;/gjob:Snailmail&gt;
        &lt;gjob:Phone&gt;
        &lt;/gjob:Phone&gt;
      &lt;/gjob:Contact&gt;

      &lt;gjob:Requirements&gt;
      The program should be released as free software, under the GPL.
      &lt;/gjob:Requirements&gt;

      &lt;gjob:Skills&gt;
      &lt;/gjob:Skills&gt;

      &lt;gjob:Details&gt;
      A GNOME based system that will allow a superuser to configure
      compressed and uncompressed files and/or file systems to be backed
      up with a supported media in the system.  This should be able to
      perform via find commands generating a list of files that are passed
      to tar, dd, cpio, cp, gzip, etc., to be directed to the tape machine
      or via operations performed on the filesystem itself. Email
      notification and GUI status display very important.
      &lt;/gjob:Details&gt;

    &lt;/gjob:Job&gt;

  &lt;/gjob:Jobs&gt;
&lt;/gjob:Helping&gt;</pre>
<p>While loading the XML file into an internal DOM tree is a matter of
calling only a couple of functions, browsing the tree to gather the data and
generate the internal structures is harder and more error-prone.</p>
<p>The suggested principle is to be tolerant with respect to the input
structure. For example, the ordering of the attributes is not significant;
the XML specification is clear about it. It is also usually a good idea not
to depend on the order of the children of a given node, unless doing so
really makes things harder. Here is some code to parse the information for a
person:</p>
<pre>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;libxml/tree.h&gt;

/* Trace macro used by the parsing functions */
#define DEBUG(x) printf(x)

/*
 * A person record
 */
typedef struct person {
    char *name;
    char *email;
    char *company;
    char *organisation;
    char *smail;
    char *webPage;
    char *phone;
} person, *personPtr;

/*
 * And the code needed to parse it
 */
personPtr parsePerson(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    personPtr ret = NULL;

    DEBUG(&quot;parsePerson\n&quot;);
    /*
     * allocate the struct
     */
    ret = (personPtr) malloc(sizeof(person));
    if (ret == NULL) {
        fprintf(stderr,&quot;out of memory\n&quot;);
        return(NULL);
    }
    memset(ret, 0, sizeof(person));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {
        /* element names are xmlChar strings, so compare with xmlStrcmp */
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Person&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;name = (char *) xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Email&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;email = (char *) xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre>
<p>Here are a couple of things to notice:</p>
<ul>
<li>A recursive parsing style is usually the most convenient one: XML data
    is by nature subject to repetitive constructs and usually exhibits highly
    structured patterns.</li>
  <li>The two arguments of type <em>xmlDocPtr</em> and <em>xmlNsPtr</em> are
    the pointer to the global XML document and the namespace reserved for
    the application. Document-wide information is needed, for example, to
    decode entities, and it is good coding practice to define a namespace for
    your application's set of data and to test that the elements and
    attributes you are analyzing actually pertain to your application space.
    This is done by a simple equality test (cur-&gt;ns == ns).</li>
  <li>To retrieve text and attribute values, you can use the function
    <em>xmlNodeListGetString</em> to gather all the text and entity reference
    nodes generated by the DOM output and produce a single text string.</li>
</ul>
<p>Here is another piece of code used to parse another level of the
structure:</p>
<pre>#include &lt;libxml/tree.h&gt;
/*
 * a Description for a Job
 */
typedef struct job {
    char *projectID;
    char *application;
    char *category;
    personPtr contact;
    int nbDevelopers;
    personPtr developers[100]; /* using dynamic alloc is left as an exercise */
} job, *jobPtr;

/*
 * And the code needed to parse it
 */
jobPtr parseJob(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    jobPtr ret = NULL;

    DEBUG(&quot;parseJob\n&quot;);
    /*
     * allocate the struct
     */
    ret = (jobPtr) malloc(sizeof(job));
    if (ret == NULL) {
        fprintf(stderr,&quot;out of memory\n&quot;);
        return(NULL);
    }
    memset(ret, 0, sizeof(job));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {

        /* the project ID is stored as an attribute, not as text content */
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Project&quot;)) &amp;&amp; (cur-&gt;ns == ns)) {
            ret-&gt;projectID = (char *) xmlGetProp(cur, (const xmlChar *) &quot;ID&quot;);
            if (ret-&gt;projectID == NULL) {
                fprintf(stderr, &quot;Project has no ID\n&quot;);
            }
        }
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Application&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;application = (char *) xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Category&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;category = (char *) xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        /* the contact is a nested structure, parsed recursively */
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Contact&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;contact = parsePerson(doc, ns, cur);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre>
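<p>The fixed-size <em>developers</em> array in the job structure leaves
dynamic allocation as an exercise. One possible approach, sketched here under
assumptions, is a small growable list. The names <em>developerList</em>,
<em>developerListAdd</em> and <em>developerListClear</em> are hypothetical
and belong neither to libxml2 nor to the example, and the <em>person</em>
type is reduced to a stand-in so that the fragment compiles on its own.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the person record of the example; one field is enough here. */
typedef struct person {
    char *name;
} person, *personPtr;

/* Hypothetical growable replacement for the fixed developers[100] array. */
typedef struct developerList {
    personPtr *items;   /* heap-allocated array of person pointers */
    int nb;             /* number of slots in use */
    int max;            /* number of slots allocated */
} developerList;

/*
 * Append one developer, doubling the capacity when the array is full.
 * Returns 0 on success, -1 if memory could not be obtained.
 */
int developerListAdd(developerList *list, personPtr dev) {
    assert(list != NULL);
    if (list->nb >= list->max) {
        int newMax = (list->max == 0) ? 4 : list->max * 2;
        personPtr *tmp = (personPtr *) realloc(list->items,
                                               newMax * sizeof(personPtr));
        if (tmp == NULL)
            return(-1);   /* the old array is still valid and untouched */
        list->items = tmp;
        list->max = newMax;
    }
    list->items[list->nb++] = dev;
    return(0);
}

/* Release the array itself; the person records stay owned by the caller. */
void developerListClear(developerList *list) {
    assert(list != NULL);
    free(list->items);
    list->items = NULL;
    list->nb = 0;
    list->max = 0;
}
```

<p>With such a list in place of the array, each developer record returned by
<em>parsePerson</em> could be appended with <em>developerListAdd</em> rather
than stored at a fixed index, so the limit of 100 entries disappears.</p>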
<p>Once you are used to it, writing this kind of code is quite simple, but
boring. Ultimately, it should be possible to write stub generators that take
C data structure definitions, a set of XML examples or an XML DTD, and
produce the code needed to import and export the content between C data and
XML storage. This is left as an exercise to the reader :-)</p>
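<p>Short of a full generator, some of the repetition can be folded into a
lookup table that maps element names to structure fields. The following is
only a sketch of that idea, not part of the example code: the names
<em>fieldMap</em>, <em>personFields</em> and <em>personSetField</em> are
hypothetical, and the element names are taken from the jobs base shown
earlier.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The person record from the example. */
typedef struct person {
    char *name;
    char *email;
    char *company;
    char *organisation;
    char *smail;
    char *webPage;
    char *phone;
} person, *personPtr;

/* One row per child element: its name and where its text is stored. */
typedef struct fieldMap {
    const char *element;
    size_t offset;
} fieldMap;

static const fieldMap personFields[] = {
    { "Person",       offsetof(person, name) },
    { "Email",        offsetof(person, email) },
    { "Company",      offsetof(person, company) },
    { "Organisation", offsetof(person, organisation) },
    { "Snailmail",    offsetof(person, smail) },
    { "Webpage",      offsetof(person, webPage) },
    { "Phone",        offsetof(person, phone) },
};

/*
 * Store value in the field that matches element.
 * Returns 1 if a field was set, 0 if the element name is unknown.
 * An earlier pointer in the same field is overwritten without being freed.
 */
int personSetField(personPtr p, const char *element, char *value) {
    size_t i;
    size_t n = sizeof(personFields) / sizeof(personFields[0]);

    assert(p != NULL && element != NULL);
    for (i = 0; i < n; i++) {
        if (!strcmp(element, personFields[i].element)) {
            char **slot = (char **) ((char *) p + personFields[i].offset);
            *slot = value;
            return(1);
        }
    }
    return(0);   /* unknown elements are ignored, in any order */
}
```

<p>Because the lookup goes by name, the children may appear in any order and
unknown elements are skipped, which keeps the tolerant behaviour recommended
above. Supporting a new field then means adding one row to the table.</p>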
<p>Feel free to use <a href="example/gjobread.c">the code for the full C
parsing example</a> as a template; it is also available, with a Makefile, in
the Gnome CVS base under gnome-xml/example.</p>
<p><a href="bugs.html">Daniel Veillard</a></p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>