example.html revision 43d3f61ad5c142c8c17e45c8c954432916ffceab
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
<style type="text/css"><!--
TD {font-size: 10pt; font-family: Verdana,Arial,Helvetica}
BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; margin-left: 0pt; margin-right: 0pt}
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>A real example</title>
</head>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="smallfootonly.gif" alt="Gnome Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>A real example</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul
style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="threads.html">Thread safety</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
<li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
<li><a href="http://www.cs.unibo.it/~casarini/gdome2/">DOM gdome2</a></li>
<li><a href="ftp://xmlsoft.org/">FTP</a></li>
<li><a
href="http://www.fh-frankfurt.de/~igor/projects/libxml/">Windows binaries</a></li>
<li><a href="http://pages.eidosnet.co.uk/~garypen/libxml/">Solaris binaries</a></li>
<li><a href="http://bugzilla.gnome.org/buglist.cgi?product=libxml">Bug Tracker</a></li>
</ul></td></tr>
</table>
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Here is a real-size example, in which the application data is not kept in
the DOM tree but in the application's own internal structures. It is based on
a proposal to keep a database of jobs related to Gnome, with an XML-based
storage structure. Here is an <a href="gjobs.xml">XML encoded jobs
base</a>:</p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;gjob:Helping xmlns:gjob="http://www.gnome.org/some-location"&gt;
  &lt;gjob:Jobs&gt;

    &lt;gjob:Job&gt;
      &lt;gjob:Project ID="3"/&gt;
      &lt;gjob:Application&gt;GBackup&lt;/gjob:Application&gt;
      &lt;gjob:Category&gt;Development&lt;/gjob:Category&gt;

      &lt;gjob:Update&gt;
        &lt;gjob:Status&gt;Open&lt;/gjob:Status&gt;
        &lt;gjob:Modified&gt;Mon, 07 Jun 1999 20:27:45 -0400 MET DST&lt;/gjob:Modified&gt;
        &lt;gjob:Salary&gt;USD 0.00&lt;/gjob:Salary&gt;
      &lt;/gjob:Update&gt;

      &lt;gjob:Developers&gt;
        &lt;gjob:Developer&gt;
        &lt;/gjob:Developer&gt;
      &lt;/gjob:Developers&gt;

      &lt;gjob:Contact&gt;
        &lt;gjob:Person&gt;Nathan Clemons&lt;/gjob:Person&gt;
        &lt;gjob:Email&gt;nathan@windsofstorm.net&lt;/gjob:Email&gt;
        &lt;gjob:Company&gt;
        &lt;/gjob:Company&gt;
        &lt;gjob:Organisation&gt;
        &lt;/gjob:Organisation&gt;
        &lt;gjob:Webpage&gt;
        &lt;/gjob:Webpage&gt;
        &lt;gjob:Snailmail&gt;
        &lt;/gjob:Snailmail&gt;
        &lt;gjob:Phone&gt;
        &lt;/gjob:Phone&gt;
      &lt;/gjob:Contact&gt;

      &lt;gjob:Requirements&gt;
      The program should be released as free software, under the GPL.
      &lt;/gjob:Requirements&gt;

      &lt;gjob:Skills&gt;
      &lt;/gjob:Skills&gt;

      &lt;gjob:Details&gt;
      A GNOME based system that will allow a superuser to configure
      compressed and uncompressed files and/or file systems to be backed
      up with a supported media in the system. This should be able to
      perform via find commands generating a list of files that are passed
      to tar, dd, cpio, cp, gzip, etc., to be directed to the tape machine
      or via operations performed on the filesystem itself. Email
      notification and GUI status display very important.
      &lt;/gjob:Details&gt;

    &lt;/gjob:Job&gt;

  &lt;/gjob:Jobs&gt;
&lt;/gjob:Helping&gt;</pre>
<p>While loading the XML file into an internal DOM tree is a matter of
calling only a couple of functions, browsing the tree to gather the data and
generate the internal structures is harder and more error prone.</p>
<p>The suggested principle is to be tolerant with respect to the input
structure. For example, the ordering of the attributes is not significant;
the XML specification is clear about it. It's also usually a good idea not to
depend on the order of the children of a given node, unless it really makes
things harder.
Here is some code to parse the information for a person:</p>
<pre>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;libxml/parser.h&gt;
#include &lt;libxml/tree.h&gt;

/* DEBUG is not defined in this excerpt; a simple trace macro is assumed */
#define DEBUG(x) fputs(x, stderr)

/*
 * A person record; the strings are allocated by libxml
 * and should be released with xmlFree()
 */
typedef struct person {
    xmlChar *name;
    xmlChar *email;
    xmlChar *company;
    xmlChar *organisation;
    xmlChar *smail;
    xmlChar *webPage;
    xmlChar *phone;
} person, *personPtr;

/*
 * And the code needed to parse it
 */
personPtr parsePerson(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    personPtr ret = NULL;

    DEBUG("parsePerson\n");
    /*
     * allocate the struct
     */
    ret = (personPtr) malloc(sizeof(person));
    if (ret == NULL) {
        fprintf(stderr, "out of memory\n");
        return(NULL);
    }
    memset(ret, 0, sizeof(person));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {
        /* element names are xmlChar strings, so compare with xmlStrcmp */
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Person")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;name = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Email")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;email = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre>
<p>Here are a couple of things to notice:</p>
<ul>
<li>Usually a recursive parsing style is the more convenient one: XML data
  is by nature subject to repetitive constructs and usually exhibits highly
  structured patterns.</li>
<li>The function takes two arguments of type <em>xmlDocPtr</em> and
  <em>xmlNsPtr</em>, i.e. the pointer to the global XML document and the
  namespace reserved to the application. Document-wide information is
  needed, for example, to decode entities, and it's good coding practice to
  define a namespace for your application's set of data and test that the
  elements and attributes you're analyzing actually pertain to your
  application space. This is
This is 197 done by a simple equality test (cur->ns == ns).</li> 198<li>To retrieve text and attributes value, you can use the function 199 <em>xmlNodeListGetString</em> to gather all the text and entity reference 200 nodes generated by the DOM output and produce an single text string.</li> 201</ul> 202<p>Here is another piece of code used to parse another level of the 203structure:</p> 204<pre>#include <libxml/tree.h> 205/* 206 * a Description for a Job 207 */ 208typedef struct job { 209 char *projectID; 210 char *application; 211 char *category; 212 personPtr contact; 213 int nbDevelopers; 214 personPtr developers[100]; /* using dynamic alloc is left as an exercise */ 215} job, *jobPtr; 216 217/* 218 * And the code needed to parse it 219 */ 220jobPtr parseJob(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) { 221 jobPtr ret = NULL; 222 223DEBUG("parseJob\n"); 224 /* 225 * allocate the struct 226 */ 227 ret = (jobPtr) malloc(sizeof(job)); 228 if (ret == NULL) { 229 fprintf(stderr,"out of memory\n"); 230 return(NULL); 231 } 232 memset(ret, 0, sizeof(job)); 233 234 /* We don't care what the top level element name is */ 235 cur = cur->xmlChildrenNode; 236 while (cur != NULL) { 237 238 if ((!strcmp(cur->name, "Project")) && (cur->ns == ns)) { 239 ret->projectID = xmlGetProp(cur, "ID"); 240 if (ret->projectID == NULL) { 241 fprintf(stderr, "Project has no ID\n"); 242 } 243 } 244 if ((!strcmp(cur->name, "Application")) && (cur->ns == ns)) 245 ret->application = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1); 246 if ((!strcmp(cur->name, "Category")) && (cur->ns == ns)) 247 ret->category = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1); 248 if ((!strcmp(cur->name, "Contact")) && (cur->ns == ns)) 249 ret->contact = parsePerson(doc, ns, cur); 250 cur = cur->next; 251 } 252 253 return(ret); 254}</pre> 255<p>Once you are used to it, writing this kind of code is quite simple, but 256boring. 
Ultimately, it could be possible to write stubbers that take either C
data structure definitions, a set of XML examples or an XML DTD and produce
the code needed to import and export the content between C data and XML
storage. This is left as an exercise to the reader :-)</p>
<p>Feel free to use <a href="example/gjobread.c">the code for the full C
parsing example</a> as a template; it is also available with a Makefile in the
Gnome CVS base under gnome-xml/example.</p>
<p><a href="mailto:daniel@veillard.com">Daniel Veillard</a></p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>