Read and Write XML Data

In yesterday’s posts, we examined how to retrieve and process data from a database with the help of ADO.NET. Today, we take a look at the third major area of ADO.NET: XML.

I assume that everybody who got this far has already met XML, and some may even have become quite fond of it (shame on me, but I haven’t), so I won’t waste time introducing it. Instead, let’s see what ADO.NET has for us. In this post, I will review the following:

  1. Processing stream-based XML
  2. Processing in-memory XML
  3. XML data binding
  4. XML and DataSet integration

I’m not going to describe LINQ to XML, since it isn’t covered in the exam, and even when you work with LINQ to XML, you’ll still need to know the basics of ADO.NET to perform, for example, basic XML data binding.

Processing stream-based XML

To process stream-based XML, the simplest approach is to use the XmlTextReader and XmlTextWriter classes. Let’s start with the latter. You can pass a stream or a file path to its constructor. The main methods of XmlTextWriter are:

  • WriteStartDocument: writes the XML declaration.
  • WriteComment: adds a comment to your file.
  • WriteStartElement: creates an element specified in the string parameter. You need to call the WriteEndElement method to close it.
  • WriteAttributeString: adds an attribute to the current element you are working with.
  • WriteElementString: writes a complete element at the current position, in the following format: <dog>Barky</dog>. You don’t need to call WriteEndElement for it.
  • Close: completes the document.
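To make the list above concrete, here is a minimal sketch that calls each of these methods once. It writes into a StringWriter instead of a file so it can run anywhere; the element names (dogs, dog, name) and the comment text are just made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

// Write into an in-memory StringWriter so the sketch needs no file on disk.
StringWriter output = new StringWriter();
XmlTextWriter writer = new XmlTextWriter(output);
writer.Formatting = Formatting.Indented;

writer.WriteStartDocument();                 // XML declaration
writer.WriteComment("Hypothetical kennel data");
writer.WriteStartElement("dogs");            // <dogs>
writer.WriteStartElement("dog");             //   <dog ...>
writer.WriteAttributeString("id", "1");      //   adds id="1" to <dog>
writer.WriteElementString("name", "Barky");  //     <name>Barky</name>
writer.WriteEndElement();                    //   </dog>
writer.WriteEndElement();                    // </dogs>
writer.Close();                              // flushes and closes the writer

string xml = output.ToString();
Console.WriteLine(xml);
```

Note that WriteElementString needs no matching WriteEndElement, while every WriteStartElement does.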

To read an XML document from a stream, you’ll typically use the XmlTextReader class. XmlTextReader is similar to a DataReader: only one node can be processed at a time, and you can only move forward in the document. This means you’ll need to use nested loops to get to the content you are interested in. The most important members are:

  • Read: Read and the related methods (such as ReadContentAsString) read the next node of data and, optionally, convert it to the appropriate type. Read returns true as long as there is another node to read.
  • Value: the value of the current node.
  • Name: the name of the current node.
  • AttributeCount: when you are interested in attributes, you should perform a check whether AttributeCount of the current node is greater than zero (thus, the node has attributes).
  • MoveToNextAttribute: when you are working with attributes, this method moves the cursor to the next one of them.
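Put together, the members above usually end up in a loop like the following sketch. The XML string and the seen list are hypothetical; they are only there so the example runs without a file. Reading from a file works the same way if you pass a path to the constructor.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical sample data, kept inline so the sketch is self-contained.
string xml = "<dogs><dog id=\"1\" breed=\"beagle\">Barky</dog><dog id=\"2\">Rex</dog></dogs>";
List<string> seen = new List<string>();

XmlTextReader reader = new XmlTextReader(new StringReader(xml));
while (reader.Read())                        // true while another node exists
{
    if (reader.NodeType == XmlNodeType.Element)
    {
        seen.Add("element:" + reader.Name);
        if (reader.AttributeCount > 0)       // does this element have attributes?
        {
            while (reader.MoveToNextAttribute())
            {
                seen.Add("attribute:" + reader.Name + "=" + reader.Value);
            }
            reader.MoveToElement();          // step back to the owning element
        }
    }
    else if (reader.NodeType == XmlNodeType.Text)
    {
        seen.Add("text:" + reader.Value);
    }
}
reader.Close();

foreach (string entry in seen) Console.WriteLine(entry);
```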

In-memory XML processing

When you are ready to sacrifice scalability for flexibility, you should choose in-memory XML processing. The main class here is XmlDocument, but let’s see an overview of what we have here:

  • XmlDocument: represents a full XML document in memory. This document is editable and navigable. You can also use it as the source of an XSL transform.
  • XmlDataDocument: you’ll typically use XmlDataDocument when you are working with a DataSet. More on it later.
  • XPathNavigator: a cursor over an XML document held in memory. You can search slightly faster with it, but a navigator created from an XPathDocument cannot make and save changes (one created from an XmlDocument can). You can create XPathNavigators by calling the XmlDocument’s CreateNavigator method.
  • XPathNodeIterator: the XPathNavigator’s Select method returns an XPathNodeIterator. It is a forward-only, read-only cursor for iterating over the selected nodes.
  • XPathDocument: a fast, read-only XML document optimized for XPath queries. You can create XPathNavigator objects from it by calling the CreateNavigator method.
  • XmlNamespaceManager: adds or removes XML namespaces.


Now with a little terminology behind us, let’s see the two main classes we should be most interested in: XmlDocument and XPathNavigator.

XmlDocument stores all information from an XML file as a set of nodes. Anything in an XML document can be a node: namespaces, elements, attributes, etc. To start working with an XmlDocument, we should load some XML data into it. Use the Load method and provide the path of the file to read the XML data from.

As mentioned above, everything in an XmlDocument is treated as an XmlNode object. When you’d like to iterate over these nodes, you’d do something like this:

using System.Xml;

XmlDocument xDoc = new XmlDocument();
xDoc.Load("xml.xml");

// ChildNodes holds the top-level nodes of the document
foreach (XmlNode node in xDoc.ChildNodes)
{
    switch (node.NodeType)
    {
        case XmlNodeType.Element:
            // Processing code
            break;
    }
}

That isn’t so hard to implement. When you’d like to search in an XmlDocument, you have two options: the GetElementsByTagName method, which takes a tag name and returns an XmlNodeList collection, and XPath queries. For the latter, call the SelectNodes method with a valid XPath query.
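Here is a small sketch of both search styles side by side. The document content and the XPath query are hypothetical, and I use LoadXml with an inline string rather than Load with a file path so the example is self-contained.

```csharp
using System;
using System.Xml;

XmlDocument doc = new XmlDocument();
// Hypothetical sample data; Load("file.xml") would read it from disk instead.
doc.LoadXml("<dogs><dog id=\"1\"><name>Barky</name></dog><dog id=\"2\"><name>Rex</name></dog></dogs>");

// Search by tag name: returns every <name> element in the document.
XmlNodeList names = doc.GetElementsByTagName("name");
foreach (XmlNode n in names)
{
    Console.WriteLine(n.InnerText);
}

// Search with XPath: only the name of the dog whose id attribute is 2.
XmlNodeList matches = doc.SelectNodes("/dogs/dog[@id='2']/name");
string second = matches[0].InnerText;
Console.WriteLine(second);
```

GetElementsByTagName is handy when the tag name alone is enough; SelectNodes is the one to reach for when you need to filter on position, attributes or ancestry.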

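The other approach for in-memory XML handling is the XPathNavigator. You acquire an XPathNavigator object by calling, for example, the XmlDocument’s CreateNavigator method. XPathNavigators are cursor-based objects: they point at one node at a time, and you move them with methods such as MoveToFirstChild, MoveToNext and MoveToParent, or query them with Select. Unlike XmlTextReader, they are not forward-only, and whether they are read-only depends on the underlying document: a navigator over an XPathDocument is read-only, while one over an XmlDocument can edit it.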
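As a rough illustration, the sketch below builds a navigator over an XPathDocument and walks the result of a Select call with an XPathNodeIterator. The sample XML, the query and the rows list are assumptions made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

// Hypothetical sample data, kept inline so the sketch is self-contained.
string xml = "<dogs><dog id=\"1\"><name>Barky</name></dog><dog id=\"2\"><name>Rex</name></dog></dogs>";
XPathDocument document = new XPathDocument(new StringReader(xml));
XPathNavigator navigator = document.CreateNavigator();

// Select returns an XPathNodeIterator positioned before the first match.
XPathNodeIterator iterator = navigator.Select("/dogs/dog");
List<string> rows = new List<string>();
while (iterator.MoveNext())
{
    XPathNavigator dog = iterator.Current;            // cursor on the current <dog>
    string id = dog.GetAttribute("id", "");           // attribute in no namespace
    string name = dog.SelectSingleNode("name").Value; // text of the child <name>
    rows.Add(id + ":" + name);
}

foreach (string row in rows) Console.WriteLine(row);
```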

XML data binding

When you’d like to bind your data controls to XML data, there are several ways to do so. We’ll examine the most important ones.

The simplest form of XML data binding is to treat XML as if it were non-hierarchical. Simply add an XmlDataSource to your form, point it to an XML document, and bind a GridView to it. The result won’t be so pleasing: only the top-level nodes will be bound to the GridView. This approach is a little bit crappy, so let’s look at the other ones.

You can use XPath syntax with the XmlDataSource to bind your data. You can even use XPath expressions the way you’d use Eval and Bind, but only within templated controls. A DropDownList, for example, won’t work as you would expect.

The control designed to work with hierarchical data is the TreeView. Its default behavior isn’t the best available, but luckily you can extend it. Just define a DataBindings tag with TreeNodeBinding items within the TreeView declaration, and you are ready to go.

XML and DataSet integration

As mentioned in the previous post, DataSet has some XML integration. The most important methods are as follows:

  • GetXml: generates the XML representation of the DataSet into a string.
  • WriteXml: writes the contents of the DataSet to an XML file; the schema can be written along with the data if needed.
  • ReadXml: reads XML to the DataSet to populate it.
  • GetXmlSchema: retrieves the XML schema of the DataSet into a string.
  • WriteXmlSchema: writes the XML schema of the DataSet into an XML file.
  • ReadXmlSchema: reads an XML schema from an XML file to configure the structure of the DataSet.
  • InferXmlSchema: reads an XML document with DataSet contents from a file, and uses it to infer the DataSet’s structure. However, the result is not guaranteed.
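A short round trip shows how a few of these methods fit together. This is only a sketch: the Kennel DataSet, its Dog table and the column names are hypothetical, and I write to a StringWriter rather than to a file so nothing touches the disk.

```csharp
using System;
using System.Data;
using System.IO;

// Build a small, hypothetical DataSet to serialize.
DataSet source = new DataSet("Kennel");
DataTable dogs = source.Tables.Add("Dog");
dogs.Columns.Add("Id", typeof(int));
dogs.Columns.Add("Name", typeof(string));
dogs.Rows.Add(1, "Barky");
dogs.Rows.Add(2, "Rex");

// GetXml and GetXmlSchema return the data and the schema as strings.
string dataXml = source.GetXml();
string schemaXml = source.GetXmlSchema();
Console.WriteLine(dataXml);

// WriteXml with WriteSchema emits the schema inline with the data...
StringWriter buffer = new StringWriter();
source.WriteXml(buffer, XmlWriteMode.WriteSchema);

// ...so ReadXml can rebuild both the structure and the rows.
DataSet copy = new DataSet();
copy.ReadXml(new StringReader(buffer.ToString()));
Console.WriteLine(copy.Tables["Dog"].Rows.Count);
```

Writing the schema together with the data is what lets the second DataSet recover the column types instead of guessing them.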

You can even access the DataSet as XML with the use of the XmlDataDocument class. You just need to pass a DataSet object to the constructor, and you are ready to go. However, this approach is very slow compared to the speed of XmlDocument or XDocument.
