example.html revision 61f6fb66add2c6a39e89cdeab466c8518bfa56ff
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
<link rel="SHORTCUT ICON" href="/favicon.ico">
<style type="text/css"><!--
TD {font-family: Verdana,Arial,Helvetica}
BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
H1 {font-family: Verdana,Arial,Helvetica}
H2 {font-family: Verdana,Arial,Helvetica}
H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>A real example</title>
</head>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="gnome2.png" alt="Gnome2 Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a><div align="left"><a href="http://xmlsoft.org/"><img src="Libxml2-Logo-180x168.gif" alt="Made with Libxml2 Logo"></a></div>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>A real example</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul>
<li><a href="index.html">Home</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XMLinfo.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="python.html">Python and bindings</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="threads.html">Thread safety</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="tutorial/index.html">Tutorial</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>API Indexes</b></center></td></tr>
<tr><td bgcolor="#fffacd">
<form action="search.php" enctype="application/x-www-form-urlencoded" method="GET">
<input name="query" type="TEXT" size="20"
value=""><input name="submit" type="submit" value="Search ...">
</form>
<ul>
<li><a href="APIchunk0.html">Alphabetic</a></li>
<li><a href="APIconstructors.html">Constructors</a></li>
<li><a href="APIfunctions.html">Functions/Types</a></li>
<li><a href="APIfiles.html">Modules</a></li>
<li><a href="APIsymbols.html">Symbols</a></li>
</ul>
</td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul>
<li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
<li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
<li><a href="http://phd.cs.unibo.it/gdome2/">DOM gdome2</a></li>
<li><a href="http://www.aleksey.com/xmlsec/">XML-DSig xmlsec</a></li>
<li><a href="ftp://xmlsoft.org/">FTP</a></li>
<li><a href="http://www.fh-frankfurt.de/~igor/projects/libxml/">Windows binaries</a></li>
<li><a href="http://garypennington.net/libxml2/">Solaris binaries</a></li>
<li><a href="http://www.zveno.com/open_source/libxml2xslt.html">MacOsX binaries</a></li>
<li><a href="http://sourceforge.net/projects/libxml2-pas/">Pascal bindings</a></li>
<li><a href="http://bugzilla.gnome.org/buglist.cgi?product=libxml&amp;product=libxml2">Bug Tracker</a></li>
</ul></td></tr>
</table>
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Here is a real-size example, where the application data is not kept in
the DOM tree but is stored in the application's own internal structures. It
is based on a proposal to keep a database of jobs related to Gnome, with an
XML-based storage structure.
Here is an <a href="gjobs.xml">XML-encoded jobs
base</a>:</p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;gjob:Helping xmlns:gjob="http://www.gnome.org/some-location"&gt;
  &lt;gjob:Jobs&gt;

    &lt;gjob:Job&gt;
      &lt;gjob:Project ID="3"/&gt;
      &lt;gjob:Application&gt;GBackup&lt;/gjob:Application&gt;
      &lt;gjob:Category&gt;Development&lt;/gjob:Category&gt;

      &lt;gjob:Update&gt;
        &lt;gjob:Status&gt;Open&lt;/gjob:Status&gt;
        &lt;gjob:Modified&gt;Mon, 07 Jun 1999 20:27:45 -0400 MET DST&lt;/gjob:Modified&gt;
        &lt;gjob:Salary&gt;USD 0.00&lt;/gjob:Salary&gt;
      &lt;/gjob:Update&gt;

      &lt;gjob:Developers&gt;
        &lt;gjob:Developer&gt;
        &lt;/gjob:Developer&gt;
      &lt;/gjob:Developers&gt;

      &lt;gjob:Contact&gt;
        &lt;gjob:Person&gt;Nathan Clemons&lt;/gjob:Person&gt;
        &lt;gjob:Email&gt;nathan@windsofstorm.net&lt;/gjob:Email&gt;
        &lt;gjob:Company&gt;
        &lt;/gjob:Company&gt;
        &lt;gjob:Organisation&gt;
        &lt;/gjob:Organisation&gt;
        &lt;gjob:Webpage&gt;
        &lt;/gjob:Webpage&gt;
        &lt;gjob:Snailmail&gt;
        &lt;/gjob:Snailmail&gt;
        &lt;gjob:Phone&gt;
        &lt;/gjob:Phone&gt;
      &lt;/gjob:Contact&gt;

      &lt;gjob:Requirements&gt;
      The program should be released as free software, under the GPL.
      &lt;/gjob:Requirements&gt;

      &lt;gjob:Skills&gt;
      &lt;/gjob:Skills&gt;

      &lt;gjob:Details&gt;
      A GNOME based system that will allow a superuser to configure
      compressed and uncompressed files and/or file systems to be backed
      up with a supported media in the system.  This should be able to
      perform via find commands generating a list of files that are passed
      to tar, dd, cpio, cp, gzip, etc., to be directed to the tape machine
      or via operations performed on the filesystem itself. Email
      notification and GUI status display very important.
      &lt;/gjob:Details&gt;

    &lt;/gjob:Job&gt;

  &lt;/gjob:Jobs&gt;
&lt;/gjob:Helping&gt;</pre>
<p>While loading the XML file into an internal DOM tree takes only a couple
of function calls, browsing the tree to gather the data and generate the
internal structures is harder and more error-prone.</p>
<p>The suggested principle is to be tolerant with respect to the input
structure. For example, the ordering of attributes is not significant; the
XML specification is clear about it. It is also usually a good idea not to
depend on the order of the children of a given node, unless ignoring the
order really makes things harder. Here is some code to parse the information
for a person:</p>
<pre>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;libxml/tree.h&gt;

/* Trace helper used by the parsing functions */
#define DEBUG(x) printf(x)

/*
 * A person record. The strings are xmlChar * because that is what the
 * libxml2 accessors return; release them with xmlFree().
 */
typedef struct person {
    xmlChar *name;
    xmlChar *email;
    xmlChar *company;
    xmlChar *organisation;
    xmlChar *smail;
    xmlChar *webPage;
    xmlChar *phone;
} person, *personPtr;

/*
 * And the code needed to parse it
 */
personPtr parsePerson(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    personPtr ret = NULL;

DEBUG("parsePerson\n");
    /*
     * allocate the struct
     */
    ret = (personPtr) malloc(sizeof(person));
    if (ret == NULL) {
        fprintf(stderr,"out of memory\n");
        return(NULL);
    }
    memset(ret, 0, sizeof(person));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {
        /* Node names are xmlChar strings, so compare them with xmlStrcmp */
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Person")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;name = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Email")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;email = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre>
<p>Here are a couple of things to notice:</p>
<ul>
<li>A recursive parsing style is usually the most convenient one: XML data
    is by nature subject to repetitive constructs and usually exhibits highly
    structured patterns.</li>
<li>The function takes two arguments of type <em>xmlDocPtr</em> and
    <em>xmlNsPtr</em>: the pointer to the global XML document and the
    namespace reserved for the application. Document-wide information is
    needed, for example, to decode entities. It is also good coding practice
    to define a namespace for your application's data and to test that the
    elements and attributes you are analyzing actually belong to your
    application's namespace. This is done with a simple pointer equality
    test (cur-&gt;ns == ns).</li>
  <li>To retrieve text and attribute values, you can use the function
    <em>xmlNodeListGetString</em>, which gathers all the text and entity
    reference nodes generated by the DOM output and produces a single text
    string.</li>
</ul>
<p>Here is another piece of code used to parse another level of the
structure:</p>
<pre>#include &lt;libxml/tree.h&gt;
/*
 * a Description for a Job
 */
typedef struct job {
    xmlChar *projectID;
    xmlChar *application;
    xmlChar *category;
    personPtr contact;
    int nbDevelopers;
    personPtr developers[100]; /* using dynamic alloc is left as an exercise */
} job, *jobPtr;

/*
 * And the code needed to parse it
 */
jobPtr parseJob(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    jobPtr ret = NULL;

DEBUG("parseJob\n");
    /*
     * allocate the struct
     */
    ret = (jobPtr) malloc(sizeof(job));
    if (ret == NULL) {
        fprintf(stderr,"out of memory\n");
        return(NULL);
    }
    memset(ret, 0, sizeof(job));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {

        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Project")) &amp;&amp; (cur-&gt;ns == ns)) {
            /* The project identifier is stored in the ID attribute */
            ret-&gt;projectID = xmlGetProp(cur, (const xmlChar *) "ID");
            if (ret-&gt;projectID == NULL) {
                fprintf(stderr, "Project has no ID\n");
            }
        }
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Application")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;application = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Category")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;category = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        /* The contact is a nested structure, parsed recursively */
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Contact")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;contact = parsePerson(doc, ns, cur);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre>
<p>Once you are used to it, writing this kind of code is quite simple, but
boring. Ultimately, it should be possible to write stub generators that take
C data structure definitions, a set of XML examples, or an XML DTD and
produce the code needed to import and export the content between C data and
XML storage. This is left as an exercise to the reader :-)</p>
<p>Feel free to use <a href="example/gjobread.c">the code for the full C
parsing example</a> as a template; it is also available, with a Makefile, in
the Gnome CVS base under gnome-xml/example.</p>
<p><a href="bugs.html">Daniel Veillard</a></p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>