<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
<style type="text/css"><!--
TD {font-size: 10pt; font-family: Verdana,Arial,Helvetica}
BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; margin-left: 0pt; margin-right: 0pt}
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>A real example</title>
</head>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="smallfootonly.gif" alt="Gnome Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>A real example</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="threads.html">Thread safety</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
<li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
<li><a href="http://www.cs.unibo.it/~casarini/gdome2/">DOM gdome2</a></li>
<li><a href="ftp://xmlsoft.org/">FTP</a></li>
<li><a href="http://www.fh-frankfurt.de/~igor/projects/libxml/">Windows binaries</a></li>
<li><a href="http://pages.eidosnet.co.uk/~garypen/libxml/">Solaris binaries</a></li>
<li><a href="http://bugzilla.gnome.org/buglist.cgi?product=libxml">Bug Tracker</a></li>
</ul></td></tr>
</table>
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Here is a realistically sized example, in which the application data is
not kept in the DOM tree but copied into internal structures. It is based on
a proposal to keep a database of jobs related to Gnome, with an XML-based
storage structure. Here is an <a href="gjobs.xml">XML-encoded jobs
base</a>:</p>
<pre>&lt;?xml version=&quot;1.0&quot;?&gt;
&lt;gjob:Helping xmlns:gjob=&quot;http://www.gnome.org/some-location&quot;&gt;
  &lt;gjob:Jobs&gt;

    &lt;gjob:Job&gt;
      &lt;gjob:Project ID=&quot;3&quot;/&gt;
      &lt;gjob:Application&gt;GBackup&lt;/gjob:Application&gt;
      &lt;gjob:Category&gt;Development&lt;/gjob:Category&gt;

      &lt;gjob:Update&gt;
        &lt;gjob:Status&gt;Open&lt;/gjob:Status&gt;
        &lt;gjob:Modified&gt;Mon, 07 Jun 1999 20:27:45 -0400 MET DST&lt;/gjob:Modified&gt;
        &lt;gjob:Salary&gt;USD 0.00&lt;/gjob:Salary&gt;
      &lt;/gjob:Update&gt;

      &lt;gjob:Developers&gt;
        &lt;gjob:Developer&gt;
        &lt;/gjob:Developer&gt;
      &lt;/gjob:Developers&gt;

      &lt;gjob:Contact&gt;
        &lt;gjob:Person&gt;Nathan Clemons&lt;/gjob:Person&gt;
        &lt;gjob:Email&gt;nathan@windsofstorm.net&lt;/gjob:Email&gt;
        &lt;gjob:Company&gt;
        &lt;/gjob:Company&gt;
        &lt;gjob:Organisation&gt;
        &lt;/gjob:Organisation&gt;
        &lt;gjob:Webpage&gt;
        &lt;/gjob:Webpage&gt;
        &lt;gjob:Snailmail&gt;
        &lt;/gjob:Snailmail&gt;
        &lt;gjob:Phone&gt;
        &lt;/gjob:Phone&gt;
      &lt;/gjob:Contact&gt;

      &lt;gjob:Requirements&gt;
      The program should be released as free software, under the GPL.
      &lt;/gjob:Requirements&gt;

      &lt;gjob:Skills&gt;
      &lt;/gjob:Skills&gt;

      &lt;gjob:Details&gt;
      A GNOME based system that will allow a superuser to configure
      compressed and uncompressed files and/or file systems to be backed
      up with a supported media in the system.  This should be able to
      perform via find commands generating a list of files that are passed
      to tar, dd, cpio, cp, gzip, etc., to be directed to the tape machine
      or via operations performed on the filesystem itself. Email
      notification and GUI status display very important.
      &lt;/gjob:Details&gt;

    &lt;/gjob:Job&gt;

  &lt;/gjob:Jobs&gt;
&lt;/gjob:Helping&gt;</pre>
<p>While loading the XML file into an internal DOM tree is a matter of
calling only a couple of functions, browsing the tree to gather the data and
generate the internal structures is harder and more error-prone.</p>
<p>The suggested principle is to be tolerant with respect to the input
structure. For example, the ordering of the attributes is not significant;
the XML specification is clear about it. It is also usually a good idea not
to depend on the order of the children of a given node, unless ignoring the
order would make the code much harder to write. Here is some code to parse
the information for a person:</p>
<pre>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;libxml/tree.h&gt;

/* simple trace macro used by the parsing functions */
#define DEBUG(x) printf(x)

/*
 * A person record; the strings are allocated by libxml
 */
typedef struct person {
    xmlChar *name;
    xmlChar *email;
    xmlChar *company;
    xmlChar *organisation;
    xmlChar *smail;
    xmlChar *webPage;
    xmlChar *phone;
} person, *personPtr;

/*
 * And the code needed to parse it
 */
personPtr parsePerson(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    personPtr ret = NULL;

    DEBUG(&quot;parsePerson\n&quot;);
    /*
     * allocate the struct
     */
    ret = (personPtr) malloc(sizeof(person));
    if (ret == NULL) {
        fprintf(stderr,&quot;out of memory\n&quot;);
        return(NULL);
    }
    memset(ret, 0, sizeof(person));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {
        /* node names are xmlChar strings, so compare them with xmlStrcmp */
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Person&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;name = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Email&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;email = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre>
<p>Here are a couple of things to notice:</p>
<ul>
<li>A recursive parsing style is usually the most convenient one: XML data
    is by nature subject to repetitive constructs and usually exhibits highly
    structured patterns.</li>
<li>The function takes two arguments of type <em>xmlDocPtr</em> and
    <em>xmlNsPtr</em>, i.e. the pointer to the global XML document and the
    namespace reserved for the application. Document-wide information is
    needed, for example, to decode entities, and it is good coding practice
    to define a namespace for your application's data and to test that the
    elements and attributes you are analyzing actually belong to your
    application space. This is done by a simple equality test
    (cur-&gt;ns == ns).</li>
<li>To retrieve text and attribute values, you can use the function
    <em>xmlNodeListGetString</em> to gather all the text and entity reference
    nodes generated by the DOM output and produce a single text string.</li>
</ul>
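<p>The chain of name tests in parsePerson can also be written as a lookup
table, which stays tolerant of the order of the children and of unknown
elements. The following is only a minimal sketch under stated assumptions,
not code from the library or from gjobread.c: the <em>contact</em> structure,
the <em>field_map</em> table and the <em>contact_set</em> helper are
hypothetical names, and plain C strings stand in for the values that
xmlNodeListGetString would return.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record: only two of the person fields. */
typedef struct contact {
    const char *name;
    const char *email;
} contact;

/* Maps an element name to the offset of the field that stores its text. */
typedef struct field_map {
    const char *element;
    size_t offset;
} field_map;

static const field_map contact_fields[] = {
    { "Person", offsetof(contact, name) },
    { "Email",  offsetof(contact, email) },
};

/*
 * Store value in the field associated with element.
 * Returns 1 if the element is known, 0 if it is unknown and was ignored.
 */
static int contact_set(contact *c, const char *element, const char *value) {
    size_t i;

    assert(c != NULL && element != NULL);
    for (i = 0; i < sizeof(contact_fields) / sizeof(contact_fields[0]); i++) {
        if (strcmp(contact_fields[i].element, element) == 0) {
            const char **slot =
                (const char **) ((char *) c + contact_fields[i].offset);
            *slot = value;
            return 1;
        }
    }
    return 0;
}
```

<p>In a real parser, the loop over the children would call such a helper
once per element node, in whatever order the nodes appear; supporting a new
element then only requires a new table entry.</p>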
<p>Here is another piece of code used to parse another level of the
structure:</p>
<pre>#include &lt;libxml/tree.h&gt;
/*
 * a Description for a Job
 */
typedef struct job {
    xmlChar *projectID;
    xmlChar *application;
    xmlChar *category;
    personPtr contact;
    int nbDevelopers;
    personPtr developers[100]; /* using dynamic alloc is left as an exercise */
} job, *jobPtr;

/*
 * And the code needed to parse it
 */
jobPtr parseJob(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    jobPtr ret = NULL;

    DEBUG(&quot;parseJob\n&quot;);
    /*
     * allocate the struct
     */
    ret = (jobPtr) malloc(sizeof(job));
    if (ret == NULL) {
        fprintf(stderr,&quot;out of memory\n&quot;);
        return(NULL);
    }
    memset(ret, 0, sizeof(job));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {

        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Project&quot;)) &amp;&amp; (cur-&gt;ns == ns)) {
            /* the project identifier is stored as an attribute */
            ret-&gt;projectID = xmlGetProp(cur, (const xmlChar *) &quot;ID&quot;);
            if (ret-&gt;projectID == NULL) {
                fprintf(stderr, &quot;Project has no ID\n&quot;);
            }
        }
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Application&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;application = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Category&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;category = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        /* nested structures are handled by recursing into their own parser */
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Contact&quot;)) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;contact = parsePerson(doc, ns, cur);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre>
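<p>The job structure stores its developers in a fixed array of 100 entries
and leaves dynamic allocation as an exercise. One possible approach, sketched
here under stated assumptions, is a growable array that doubles its capacity
when full. The <em>dev_list</em> type and its two helpers are hypothetical
names introduced for illustration; the entries are plain <em>void *</em>
pointers standing in for <em>personPtr</em> values.</p>

```c
#include <stdlib.h>

/* Hypothetical growable list; items would be personPtr in the real code. */
typedef struct dev_list {
    void **items;
    size_t count;
    size_t capacity;
} dev_list;

/*
 * Append one entry, doubling the storage when it is full.
 * Returns 0 on success, -1 if memory could not be allocated
 * (the list is left unchanged in that case).
 */
static int dev_list_append(dev_list *list, void *dev) {
    if (list->count == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2 : 4;
        void **tmp = realloc(list->items, new_cap * sizeof(*tmp));

        if (tmp == NULL)
            return -1;
        list->items = tmp;
        list->capacity = new_cap;
    }
    list->items[list->count++] = dev;
    return 0;
}

/* Release the array itself; the entries are owned by the caller. */
static void dev_list_clear(dev_list *list) {
    free(list->items);
    list->items = NULL;
    list->count = 0;
    list->capacity = 0;
}
```

<p>A list initialized to all zeros is empty and ready for use, so it fits the
memset call that parseJob already performs on the new job structure.</p>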
<p>Once you are used to it, writing this kind of code is quite simple, but
boring. Ultimately, it could be possible to write stub generators that take
C data structure definitions, a set of XML examples, or an XML DTD and produce
the code needed to import and export the content between C data and XML
storage. This is left as an exercise to the reader :-)</p>
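<p>Short of a full generator, the C preprocessor can already derive both a
structure and its export code from a single field list (the "X-macro"
technique). This is a rough sketch under stated assumptions, not an existing
tool: <em>PERSON_FIELDS</em>, <em>person_rec</em> and <em>person_export</em>
are hypothetical names, only two fields are listed, and the values are written
without XML escaping or namespace prefixes.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Single source of truth: field name and the element that stores it. */
#define PERSON_FIELDS(X) \
    X(name,  "Person")   \
    X(email, "Email")

/* Expand the list into structure members. */
typedef struct person_rec {
#define X(field, element) const char *field;
    PERSON_FIELDS(X)
#undef X
} person_rec;

/*
 * Expand the same list into export code. Writes one element per non-NULL
 * field into buf and returns the number of characters written, or -1 if
 * buf is too small. Values are NOT escaped in this sketch.
 */
static int person_export(const person_rec *p, char *buf, size_t size) {
    int used = 0;

    assert(p != NULL && buf != NULL);
    if (size == 0)
        return -1;
    buf[0] = '\0';
#define X(field, element)                                             \
    if (p->field != NULL) {                                           \
        int n = snprintf(buf + used, size - (size_t) used,            \
                         "<" element ">%s</" element ">", p->field);  \
        if (n < 0 || (size_t) n >= size - (size_t) used)              \
            return -1;                                                \
        used += n;                                                    \
    }
    PERSON_FIELDS(X)
#undef X
    return used;
}
```

<p>Adding a field then means adding one line to the list; an import function
could be expanded from the same list in the same way.</p>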
<p>Feel free to use <a href="example/gjobread.c">the code for the full C
parsing example</a> as a template. It is also available, with a Makefile, in
the Gnome CVS base under gnome-xml/example.</p>
<p><a href="mailto:daniel@veillard.com">Daniel Veillard</a></p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>