<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" /><link rel="SHORTCUT ICON" href="/favicon.ico" /><style type="text/css">
TD {font-family: Verdana,Arial,Helvetica}
BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
H1 {font-family: Verdana,Arial,Helvetica}
H2 {font-family: Verdana,Arial,Helvetica}
H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
</style><title>Entities or no entities</title></head><body bgcolor="#8b7765" text="#000000" link="#a06060" vlink="#000000"><table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr><td width="120"><a href="http://swpat.ffii.org/"><img src="epatents.png" alt="Action against software patents" /></a></td><td width="180"><a href="http://www.gnome.org/"><img src="gnome2.png" alt="Gnome2 Logo" /></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo" /></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo" /></a><div align="left"><a href="http://xmlsoft.org/"><img src="Libxml2-Logo-180x168.gif" alt="Made with Libxml2 Logo" /></a></div></td><td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center"><h1>The XML C parser and toolkit of Gnome</h1><h2>Entities or no entities</h2></td></tr></table></td></tr></table></td></tr></table><table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr><td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Developer Menu</b></center></td></tr><tr><td bgcolor="#fffacd"><form action="search.php" enctype="application/x-www-form-urlencoded" method="get"><input name="query" type="text" size="20" value="" /><input name="submit" type="submit" value="Search ..." 
/></form><ul><li><a href="index.html" style="font-weight:bold">Main Menu</a></li><li><a href="html/index.html" style="font-weight:bold">Reference Manual</a></li><li><a href="examples/index.html" style="font-weight:bold">Code Examples</a></li><li><a href="guidelines.html">XML Guidelines</a></li><li><a href="tutorial/index.html">Tutorial</a></li><li><a href="xmlreader.html">The Reader Interface</a></li><li><a href="ChangeLog.html">ChangeLog</a></li><li><a href="XSLT.html">XSLT</a></li><li><a href="python.html">Python and bindings</a></li><li><a href="architecture.html">libxml2 architecture</a></li><li><a href="tree.html">The tree output</a></li><li><a href="interface.html">The SAX interface</a></li><li><a href="xmlmem.html">Memory Management</a></li><li><a href="xmlio.html">I/O Interfaces</a></li><li><a href="library.html">The parser interfaces</a></li><li><a href="entities.html">Entities or no entities</a></li><li><a href="namespaces.html">Namespaces</a></li><li><a href="upgrade.html">Upgrading 1.x code</a></li><li><a href="threads.html">Thread safety</a></li><li><a href="DOM.html">DOM Principles</a></li><li><a href="example.html">A real example</a></li><li><a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a></li></ul></td></tr></table><table width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>API Indexes</b></center></td></tr><tr><td bgcolor="#fffacd"><ul><li><a href="APIchunk0.html">Alphabetic</a></li><li><a href="APIconstructors.html">Constructors</a></li><li><a href="APIfunctions.html">Functions/Types</a></li><li><a href="APIfiles.html">Modules</a></li><li><a href="APIsymbols.html">Symbols</a></li></ul></td></tr></table><table width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr><tr><td bgcolor="#fffacd"><ul><li><a href="http://mail.gnome.org/archives/xml/">Mail 
archive</a></li><li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li><li><a href="http://phd.cs.unibo.it/gdome2/">DOM gdome2</a></li><li><a href="http://www.aleksey.com/xmlsec/">XML-DSig xmlsec</a></li><li><a href="ftp://xmlsoft.org/">FTP</a></li><li><a href="http://www.zlatkovic.com/projects/libxml/">Windows binaries</a></li><li><a href="http://www.blastwave.org/packages.php/libxml2">Solaris binaries</a></li><li><a href="http://www.explain.com.au/oss/libxml2xslt.html">MacOsX binaries</a></li><li><a href="http://libxmlplusplus.sourceforge.net/">C++ bindings</a></li><li><a href="http://www.zend.com/php5/articles/php5-xmlphp.php#Heading4">PHP bindings</a></li><li><a href="http://sourceforge.net/projects/libxml2-pas/">Pascal bindings</a></li><li><a href="http://libxml.rubyforge.org/">Ruby bindings</a></li><li><a href="http://tclxml.sourceforge.net/">Tcl bindings</a></li><li><a href="http://bugzilla.gnome.org/buglist.cgi?product=libxml2">Bug Tracker</a></li></ul></td></tr></table></td></tr></table></td><td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd"><p>Entities in principle are similar to simple C macros. An entity defines
an abbreviation for a given string that you can reuse many times throughout
the content of your document. Entities are especially useful when a given
string may occur frequently within a document, or to confine the change needed
to a document to a restricted area in the internal subset of the document (at
the beginning). Example:</p><pre>1 &lt;?xml version="1.0"?&gt;
2 &lt;!DOCTYPE EXAMPLE SYSTEM "example.dtd" [
3 &lt;!ENTITY xml "Extensible Markup Language"&gt;
4 ]&gt;
5 &lt;EXAMPLE&gt;
6    &amp;xml;
7 &lt;/EXAMPLE&gt;</pre><p>Line 3 declares the xml entity. Line 6 uses the xml entity, by
prefixing its name with '&amp;' and following it by ';' without any spaces
added. There are 5 predefined entities in libxml2 allowing you to escape
characters with predefined meaning in some parts of the XML document
content: <strong>&amp;lt;</strong> for the character '&lt;',
<strong>&amp;gt;</strong> for the character '&gt;',
<strong>&amp;apos;</strong> for the character
''', <strong>&amp;quot;</strong> for the character '"',
and <strong>&amp;amp;</strong> for the character '&amp;'.</p><p>One of the problems related to entities is that you may want the parser
to substitute an entity's content so that you can see the replacement text
in your application. Or you may prefer to keep entity references as such in
the content to be able to save the document back without losing this
usually precious information (if the user went through the pain of
explicitly defining entities, they may have a rather negative attitude if you
blindly substitute them at save time). The <a href="html/libxml-parser.html#xmlSubstituteEntitiesDefault">xmlSubstituteEntitiesDefault()</a> function
allows you to check and change the behaviour, which is to not substitute
entities by default.</p><p>Here is the DOM tree built by libxml2 for the previous document in
the default case:</p><pre>/gnome/src/gnome-xml -&gt; /xmllint --debug test/ent1
DOCUMENT
version=1.0
   ELEMENT EXAMPLE
     TEXT
     content=
     ENTITY_REF
       INTERNAL_GENERAL_ENTITY xml
       content=Extensible Markup Language
     TEXT
     content=</pre><p>And here is the result when substituting entities:</p><pre>/gnome/src/gnome-xml -&gt; /tester --debug --noent test/ent1
DOCUMENT
version=1.0
   ELEMENT EXAMPLE
     TEXT
     content=     Extensible Markup Language</pre><p>So, entities or no entities? Basically, it depends on your use case.
I suggest that you keep the non-substituting default behaviour and avoid
using entities in your XML document or data if you are not willing to handle
the entity reference elements in the DOM tree.</p><p>Note that at save time libxml2 enforces the conversion of the
predefined entities where necessary to prevent well-formedness problems, and
will also transparently replace those with the corresponding characters (i.e. it will not generate
entity reference elements in the DOM tree or call the reference() SAX callback
when finding them in the input).</p><p><span style="background-color: #FF0000">WARNING</span>: handling
entities on top of the libxml2 SAX interface is difficult!!! If you plan to
use non-predefined entities in your documents, then the learning curve to
handle them using the SAX API may be long. If you plan to use complex
documents, I strongly suggest you consider using the DOM interface instead and
letting libxml deal with the complexity rather than trying to do it yourself.</p><p><a href="bugs.html">Daniel Veillard</a></p></td></tr></table></td></tr></table></td></tr></table></td></tr></table></td></tr></table></body></html>