example.html revision 598bec37b8be930a87ec453ac064239b24e9b9d8
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" /><link rel="SHORTCUT ICON" href="/favicon.ico" /><style type="text/css">
TD {font-family: Verdana,Arial,Helvetica}
BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
H1 {font-family: Verdana,Arial,Helvetica}
H2 {font-family: Verdana,Arial,Helvetica}
H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
</style><title>A real example</title></head><body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000"><table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr><td width="180"><a href="http://www.gnome.org/"><img src="gnome2.png" alt="Gnome2 Logo" /></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo" /></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo" /></a><div align="left"><a href="http://xmlsoft.org/"><img src="Libxml2-Logo-180x168.gif" alt="Made with Libxml2 Logo" /></a></div></td><td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center"><h1>The XML C parser and toolkit of Gnome</h1><h2>A real example</h2></td></tr></table></td></tr></table></td></tr></table><table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr><td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td colspan="1" 
bgcolor="#eecfa1" align="center"><center><b>Developer Menu</b></center></td></tr><tr><td bgcolor="#fffacd"><form action="search.php" enctype="application/x-www-form-urlencoded" method="get"><input name="query" type="text" size="20" value="" /><input name="submit" type="submit" value="Search ..." /></form><ul><li><a href="index.html">Home</a></li><li><a href="guidelines.html">XML Guidelines</a></li><li><a href="tutorial/index.html">Tutorial</a></li><li><a href="xmlreader.html">The Reader Interface</a></li><li><a href="XSLT.html">XSLT</a></li><li><a href="python.html">Python and bindings</a></li><li><a href="architecture.html">libxml2 architecture</a></li><li><a href="tree.html">The tree output</a></li><li><a href="interface.html">The SAX interface</a></li><li><a href="xmlmem.html">Memory Management</a></li><li><a href="xmlio.html">I/O Interfaces</a></li><li><a href="library.html">The parser interfaces</a></li><li><a href="entities.html">Entities or no entities</a></li><li><a href="namespaces.html">Namespaces</a></li><li><a href="upgrade.html">Upgrading 1.x code</a></li><li><a href="threads.html">Thread safety</a></li><li><a href="DOM.html">DOM Principles</a></li><li><a href="example.html">A real example</a></li><li><a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a></li></ul></td></tr></table><table width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>API Indexes</b></center></td></tr><tr><td bgcolor="#fffacd"><ul><li><a href="APIchunk0.html">Alphabetic</a></li><li><a href="APIconstructors.html">Constructors</a></li><li><a href="APIfunctions.html">Functions/Types</a></li><li><a href="APIfiles.html">Modules</a></li><li><a href="APIsymbols.html">Symbols</a></li></ul></td></tr></table><table width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr><tr><td bgcolor="#fffacd"><ul><li><a 
href="http://mail.gnome.org/archives/xml/">Mail archive</a></li><li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li><li><a href="http://phd.cs.unibo.it/gdome2/">DOM gdome2</a></li><li><a href="http://www.aleksey.com/xmlsec/">XML-DSig xmlsec</a></li><li><a href="ftp://xmlsoft.org/">FTP</a></li><li><a href="http://www.zlatkovic.com/projects/libxml/">Windows binaries</a></li><li><a href="http://garypennington.net/libxml2/">Solaris binaries</a></li><li><a href="http://www.zveno.com/open_source/libxml2xslt.html">MacOsX binaries</a></li><li><a href="http://sourceforge.net/projects/libxml2-pas/">Pascal bindings</a></li><li><a href="http://bugzilla.gnome.org/buglist.cgi?product=libxml&amp;product=libxml2">Bug Tracker</a></li></ul></td></tr></table></td></tr></table></td><td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd"><p>Here is a real-size example, where the actual content of the application
data is not kept in the DOM tree but in the application's own internal
structures. It is based on a proposal to keep a database of jobs related to
Gnome, with an XML-based storage structure.
Here is an <a href="gjobs.xml">XML-encoded jobs
base</a>:</p><pre>&lt;?xml version="1.0"?&gt;
&lt;gjob:Helping xmlns:gjob="http://www.gnome.org/some-location"&gt;
  &lt;gjob:Jobs&gt;

    &lt;gjob:Job&gt;
      &lt;gjob:Project ID="3"/&gt;
      &lt;gjob:Application&gt;GBackup&lt;/gjob:Application&gt;
      &lt;gjob:Category&gt;Development&lt;/gjob:Category&gt;

      &lt;gjob:Update&gt;
        &lt;gjob:Status&gt;Open&lt;/gjob:Status&gt;
        &lt;gjob:Modified&gt;Mon, 07 Jun 1999 20:27:45 -0400 MET DST&lt;/gjob:Modified&gt;
        &lt;gjob:Salary&gt;USD 0.00&lt;/gjob:Salary&gt;
      &lt;/gjob:Update&gt;

      &lt;gjob:Developers&gt;
        &lt;gjob:Developer&gt;
        &lt;/gjob:Developer&gt;
      &lt;/gjob:Developers&gt;

      &lt;gjob:Contact&gt;
        &lt;gjob:Person&gt;Nathan Clemons&lt;/gjob:Person&gt;
        &lt;gjob:Email&gt;nathan@windsofstorm.net&lt;/gjob:Email&gt;
        &lt;gjob:Company&gt;
        &lt;/gjob:Company&gt;
        &lt;gjob:Organisation&gt;
        &lt;/gjob:Organisation&gt;
        &lt;gjob:Webpage&gt;
        &lt;/gjob:Webpage&gt;
        &lt;gjob:Snailmail&gt;
        &lt;/gjob:Snailmail&gt;
        &lt;gjob:Phone&gt;
        &lt;/gjob:Phone&gt;
      &lt;/gjob:Contact&gt;

      &lt;gjob:Requirements&gt;
      The program should be released as free software, under the GPL.
      &lt;/gjob:Requirements&gt;

      &lt;gjob:Skills&gt;
      &lt;/gjob:Skills&gt;

      &lt;gjob:Details&gt;
      A GNOME based system that will allow a superuser to configure
      compressed and uncompressed files and/or file systems to be backed
      up with a supported media in the system. This should be able to
      perform via find commands generating a list of files that are passed
      to tar, dd, cpio, cp, gzip, etc., to be directed to the tape machine
      or via operations performed on the filesystem itself. Email
      notification and GUI status display very important.
      &lt;/gjob:Details&gt;

    &lt;/gjob:Job&gt;

  &lt;/gjob:Jobs&gt;
&lt;/gjob:Helping&gt;</pre><p>While loading the XML file into an internal DOM tree is a matter of
calling only a couple of functions, browsing the tree to gather the data and
generate the internal structures is harder and more error-prone.</p><p>The suggested principle is to be tolerant with respect to the input
structure.
For example, the ordering of the attributes is not significant;
the XML specification is clear about it. It's also usually a good idea not to
depend on the order of the children of a given node, unless ignoring the order
really makes things harder. Here is some code to parse the information for a person:</p><pre>/* Headers needed by the code excerpts on this page */
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;libxml/tree.h&gt;

/* DEBUG is not defined in this excerpt; a simple trace macro is assumed */
#define DEBUG(x) printf(x)

/*
 * A person record
 */
typedef struct person {
    xmlChar *name;
    xmlChar *email;
    xmlChar *company;
    xmlChar *organisation;
    xmlChar *smail;
    xmlChar *webPage;
    xmlChar *phone;
} person, *personPtr;

/*
 * And the code needed to parse it
 */
personPtr parsePerson(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    personPtr ret = NULL;

    DEBUG("parsePerson\n");
    /*
     * allocate the struct
     */
    ret = (personPtr) malloc(sizeof(person));
    if (ret == NULL) {
        fprintf(stderr, "out of memory\n");
        return(NULL);
    }
    memset(ret, 0, sizeof(person));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {
        /* element names are xmlChar strings, so compare them with xmlStrcmp */
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Person")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;name = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Email")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;email = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre><p>Here are a couple of things to notice:</p><ul><li>Usually a recursive parsing style is the most convenient one: XML data
    is by nature subject to repetitive constructs and usually exhibits highly
    structured patterns.</li>
  <li>The function takes two arguments of type <em>xmlDocPtr</em> and <em>xmlNsPtr</em>,
    i.e. the pointer to the global XML document and the namespace reserved for
    the application. Document-wide information is needed, for example, to
    decode entities, and it's good coding practice to define a namespace for
    your application's set of data and test that the elements and attributes
    you're analyzing actually pertain to your application space.
This is
    done by a simple equality test (cur-&gt;ns == ns).</li>
  <li>To retrieve text and attribute values, you can use the function
    <em>xmlNodeListGetString</em>, which gathers all the text and entity reference
    nodes generated by the DOM output and produces a single text string.</li>
</ul><p>Here is another piece of code used to parse another level of the
structure:</p><pre>#include &lt;libxml/tree.h&gt;
/*
 * a Description for a Job
 */
typedef struct job {
    xmlChar *projectID;
    xmlChar *application;
    xmlChar *category;
    personPtr contact;
    int nbDevelopers;
    personPtr developers[100]; /* using dynamic alloc is left as an exercise */
} job, *jobPtr;

/*
 * And the code needed to parse it
 */
jobPtr parseJob(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    jobPtr ret = NULL;

    DEBUG("parseJob\n");
    /*
     * allocate the struct
     */
    ret = (jobPtr) malloc(sizeof(job));
    if (ret == NULL) {
        fprintf(stderr, "out of memory\n");
        return(NULL);
    }
    memset(ret, 0, sizeof(job));

    /* We don't care what the top level element name is */
    cur = cur-&gt;xmlChildrenNode;
    while (cur != NULL) {

        /* element names are xmlChar strings, so compare them with xmlStrcmp */
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Project")) &amp;&amp; (cur-&gt;ns == ns)) {
            /* the project identifier is stored in the ID attribute */
            ret-&gt;projectID = xmlGetProp(cur, (const xmlChar *) "ID");
            if (ret-&gt;projectID == NULL) {
                fprintf(stderr, "Project has no ID\n");
            }
        }
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Application")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;application = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Category")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;category = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
        /* the contact is a nested record, parsed recursively */
        if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) "Contact")) &amp;&amp; (cur-&gt;ns == ns))
            ret-&gt;contact = parsePerson(doc, ns, cur);
        cur = cur-&gt;next;
    }

    return(ret);
}</pre><p>Once you are used to it, writing this kind of code is quite simple, but
boring.
Ultimately, it should be possible to write stub generators that take C
data structure definitions, a set of XML examples, or an XML DTD and produce
the code needed to import and export the content between C data and XML
storage. This is left as an exercise for the reader :-)</p><p>Feel free to use <a href="example/gjobread.c">the code for the full C
parsing example</a> as a template. It is also available with a Makefile in the
Gnome CVS base under gnome-xml/example.</p><p><a href="bugs.html">Daniel Veillard</a></p></td></tr></table></td></tr></table></td></tr></table></td></tr></table></td></tr></table></body></html>