
 <html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Parsing the file</title><meta name="generator" content="DocBook XSL Stylesheets V1.61.2"><link rel="home" href="index.html" title="Libxml Tutorial"><link rel="up" href="index.html" title="Libxml Tutorial"><link rel="previous" href="ar01s02.html" title="Data Types"><link rel="next" href="ar01s04.html" title="Retrieving Element Content"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Parsing the file</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="ar01s02.html">Prev</a> </td><th width="60%" align="center"> </th><td width="20%" align="right"> <a accesskey="n" href="ar01s04.html">Next</a></td></tr></table><hr></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="xmltutorialparsing"></a>Parsing the file</h2></div></div><div></div></div><p><a class="indexterm" name="fileparsing"></a>
 Parsing a file requires only the file's name and a single
       function call, plus error checking. The full code is in <a href="apc.html" title="C. Code for Keyword Example">Appendix C, <i>Code for Keyword Example</i></a>.</p><p>
     </p><pre class="programlisting">
         <a name="declaredoc"></a><img src="images/callouts/1.png" alt="1" border="0"> xmlDocPtr doc;
 	<a name="declarenode"></a><img src="images/callouts/2.png" alt="2" border="0"> xmlNodePtr cur;

 	<a name="parsefile"></a><img src="images/callouts/3.png" alt="3" border="0"> doc = xmlParseFile(docname);

 	<a name="checkparseerror"></a><img src="images/callouts/4.png" alt="4" border="0"> if (doc == NULL ) {
 		fprintf(stderr,"Document not parsed successfully.\n");
 		return;
 	}

 	<a name="getrootelement"></a><img src="images/callouts/5.png" alt="5" border="0"> cur = xmlDocGetRootElement(doc);

 	<a name="checkemptyerror"></a><img src="images/callouts/6.png" alt="6" border="0"> if (cur == NULL) {
 		fprintf(stderr,"empty document\n");
 		xmlFreeDoc(doc);
 		return;
 	}

 	<a name="checkroottype"></a><img src="images/callouts/7.png" alt="7" border="0"> if (xmlStrcmp(cur-&gt;name, (const xmlChar *) "story")) {
 		fprintf(stderr,"document of the wrong type, root node != story\n");
 		xmlFreeDoc(doc);
 		return;
 	}

     </pre><p>
       </p><div class="calloutlist"><table border="0" summary="Callout list"><tr><td width="5%" valign="top" align="left"><a href="#declaredoc"><img src="images/callouts/1.png" alt="1" border="0"></a> </td><td valign="top" align="left"><p>Declare the pointer that will point to your parsed document.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#declarenode"><img src="images/callouts/2.png" alt="2" border="0"></a> </td><td valign="top" align="left"><p>Declare a node pointer (you'll need this in order to
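 	  interact with individual nodes).</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#parsefile"><img src="images/callouts/3.png" alt="3" border="0"></a> </td><td valign="top" align="left"><p>Parse the file named by docname and build the document tree.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#checkparseerror"><img src="images/callouts/4.png" alt="4" border="0"></a> </td><td valign="top" align="left"><p>Check that the document was parsed successfully. If it
 	    was not, xmlParseFile returns NULL; by default <span class="application">libxml</span> will
 	    already have printed an error message, so the code simply returns.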
 	    </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><table border="0" summary="Note"><tr><td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="images/note.png"></td><th align="left">Note</th></tr><tr><td colspan="2" align="left" valign="top"><p><a class="indexterm" name="id2525337"></a>
 One common example of an error at this point is improper
 	    handling of encoding. The <span class="acronym">XML</span> standard requires
 	    documents stored with an encoding other than UTF-8 or UTF-16 to
 	    contain an explicit declaration of their encoding. If the
 	    declaration is there, <span class="application">libxml</span> will
 	    automatically perform the necessary conversion to UTF-8 for
 		you. More information on <span class="acronym">XML's</span> encoding
 		requirements is contained in the <a href="http://www.w3.org/TR/REC-xml#charencoding" target="_top">standard</a>.</p></td></tr></table></div><p>
 	  </p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#getrootelement"><img src="images/callouts/5.png" alt="5" border="0"></a> </td><td valign="top" align="left"><p>Retrieve the document's root element.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#checkemptyerror"><img src="images/callouts/6.png" alt="6" border="0"></a> </td><td valign="top" align="left"><p>Check that the document actually has a root element; an empty document has none.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#checkroottype"><img src="images/callouts/7.png" alt="7" border="0"></a> </td><td valign="top" align="left"><p>Make sure the document is of the right
 	  type: "story" is the name of the root element of the documents used in this
 	  tutorial, and xmlStrcmp returns a non-zero value when the names differ.</p></td></tr></table></div><p>
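</p><p>The note on encoding above can be made concrete with a small check. What follows is a minimal sketch in plain C and is not part of <span class="application">libxml</span>: the function name has_encoding_declaration is hypothetical, and the code assumes the buffer holds the NUL-terminated, ASCII-compatible start of the file with no byte-order mark. It only reports whether an <span class="acronym">XML</span> declaration with an encoding pseudo-attribute is present; <span class="application">libxml</span> does its own, more thorough detection while parsing.</p>

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a libxml function): report whether the start
 * of an XML document carries an XML declaration that names an encoding,
 * e.g. <?xml version="1.0" encoding="ISO-8859-1"?>.
 *
 * Assumption: buf is a NUL-terminated, ASCII-compatible prefix of the
 * file and does not begin with a byte-order mark.
 *
 * Returns 1 if an encoding declaration is present, 0 otherwise.
 */
int has_encoding_declaration(const char *buf)
{
    const char *end;
    const char *enc;

    /* The XML declaration, when present, must come first in the file. */
    if (buf == NULL || strncmp(buf, "<?xml", 5) != 0)
        return 0;

    /* Reject other processing instructions such as <?xml-stylesheet ...?>. */
    if (buf[5] != ' ' && buf[5] != '\t' && buf[5] != '\r' && buf[5] != '\n')
        return 0;

    /* Find the end of the declaration so the document body is ignored. */
    end = strstr(buf, "?>");
    if (end == NULL)
        return 0;

    /* Accept "encoding" only if it appears inside the declaration. */
    enc = strstr(buf, "encoding");
    return enc != NULL && enc < end;
}
```

<p>A file stored in an encoding other than UTF-8 or UTF-16 for which this check returns 0 is a likely candidate for the parse failure caught by the NULL test in the listing above.</p><p>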
       <a class="indexterm" name="id2525415"></a>
     </p></div><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="ar01s02.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="index.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="ar01s04.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Data Types </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> Retrieving Element Content</td></tr></table></div></body></html>