Traditionally XML documents are processed using DOM (document object model), which is a set of classes standardized by W3C (World Wide Web Consortium) . So process an XML document one has to use DOM. Though using DOM is particularly a difficult task, learning any new API ( a set of classes and methods) is adding extra burden. We wish we could process XML document in the same manner as a table in a relational table. Well Microsoft listened to our wish and delivered ADO.NET.
ADO.NET is the new set of classes used to access database from .NET applications. It provides DataSet, which can be used to access XML documents. It allows programmer to read, update and create XML documents with the same set of classes and methods that we use to manipulate relational databases like Oracle and SQL Server. This allows programmer to master one set of classes and methods (ADO.NET) and use it with both relational databases and XML documents.
This article is about using DataSet and related classes to process XML documents. We will work with a simple XML document called PLAYERS.XML, which contains information about players of a Soccer team. This article is divided into different section. In each section we concentrate on one task.
Adding elements to XML document using Dataset
If you want to store details of multiple players then the same can be extended as follows:
XML Schema is the description of the structure of the XML document. It specifies the hierarchy of the elements and attributes of each element like name, data type etc.
By using XML Schema, it is possible to specify exactly how you want your XML document to be. It also enables data to be validated according to the data type.
ADO.NET provides classes like Connect, DataReader, Command and DataAdapter. But the class that we use in this article is DataSet.
DataSet contains a collection of tables in memory. DataSet also contains relation objects establishing relationship between tables in DataSet.
The following example program is in VB.NET. It shows how you can read and display the content of an XML document called PLAYERS.XML.
Tables collection represents tables in dataset. The expression ds.tables(0) provides access to first table
in the dataset.
Columns collection of a table represents columns of the table. So the expression ds.tables(0).columns.count
returns the number of columns in the first table of the dataset.
Item collection of a row provides access to values of columns in the row. You can access a column either by index
of the column or the name of the column.
When you run the above program the following output will be generated.
The following is schema that can be used to specify the structure of players.xml file. The schema itself
is an XML document and stored with the name playerschema.xml.
If you load schema first and then data ReadXML methods checks whether the data in the document confirms with the structure in the schema.
The following VB.NET program loads schema from playerschema.xml first and then data from player.xml.
If you change age in XML document to a non-numeric then it displays an error and program is terminated, otherwise the data from the document is displayed.
The following example loads data from players.xml into a table of dataset. Then it adds a new
row to the table and write schema of the table into playerschema.xml and data into players.xml.
Manipulating XML documents doesn't need knowledge of DOM and instead ADO.NET itself can be used
for that purpose.
What is XML?
XML stands for eXtensible Markup Language. It is about storing data using tags. For example, if you want to store information about a player then you put data related to player using tags as follows:
<player>
<name>Ronaldo</name>
<position>Forward</position>
<age>25</age>
</player>
Each value is stored with tags. In the above document, we have player tag to store information about player. Each attribute is stored again with a tag - name, position and age.
<players>
<player>
<name>Ronaldo</name>
<position>Forward</position>
<age>25</age>
</player>
<player>
<name>Roberto Carlos</name>
<position>Defender</position>
<age>29</age>
</player>
<player>
<name>Rivaldo</name>
<position>Midfielder</position>
<age>30</age>
</player>
</players>
Each XML document should have a root element, which encompasses all other elements and unique.
In the above example as we have multiple player tags, we used players tag to encompass all player tags.
What is ADO.NET?
ADO.NET is new way to access data from .NET Framework. It is mainly consisting of two sets of classes.
One set is used to access SQL Server using SQL Server .NET Data Provider, which access SQL Server directly.
The other provider is a bridge. It accesses an OLE DB Provider to access databases other than SQL Server.
How to read XML Document?
As I have already mentioned, prior to ADO.NET, one has to access XML document using DOM. But with DataSet of ADO.NET, it is possible to access and manipulate an XML document.
' import required namespaces
imports system
imports system.data
imports system.xml
Module Test
public sub main()
dim ds as new DataSet
dim r as datarow
dim i as integer
' read an XML file into dataset. Data read using READXML will
' be taken as a table
ds.ReadXML("c:\srikanth\vb\players.xml")
' display data from the file
for each r in ds.tables(0).rows
for i = 0 to ds.tables(0).columns.count - 1
console.write( r.item(i) & " " )
next
console.writeline
next
end sub
end module
ReadXML method takes the name of the file that contains XML data and reads the content of the file
into DataSet as a DataTable. Elements of XML document - Player - become rows in the table.
Ronaldo Forward 25
Roberto Carlos Defender 29
Rivaldo Midfielder 30
Using schema
Schema of an XML document specifies the structure to which XML document must adhere. It is used
to specify the data types of elements. For example, in the above document (players.xml), age must be an integer.
But as of now, it can be anything as default data type is string. So we can create and load schema
so that the data is validated at the time of loading.
<?xml version="1.0"?>
<xs:schema id="players" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
<xs:element name="players" msdata:IsDataSet="true">
<xs:complexType>
<xs:choice maxOccurs="unbounded">
<xs:element name="player">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string" minOccurs="0" />
<xs:element name="position" type="xs:string" minOccurs="0" />
<xs:element name="age" type="xs:int" minOccurs="0" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:choice>
</xs:complexType>
</xs:element>
</xs:schema>
The above schema specifies that the document contains an element called player and it contains three elements - name, position and age. The type of name and position is string and type of age is int.
imports system
imports system.data
imports system.xml
Module Test
public sub main()
dim ds as new DataSet
dim r as datarow
dim i as integer
ds.readxmlschema("c:\srikanth\vb\playerschema.xml")
try
ds.ReadXML("c:\srikanth\vb\players.xml")
catch ex as Exception
console.writeline( "Error while loading data : " + ex.Message)
exit sub
end try
' display data from the file
for each r in ds.tables(0).rows
for i = 0 to ds.tables(0).columns.count - 1
console.write( r.item(i) & " " )
next
console.writeline
next
end sub
end module
The above program executes ReadXML method in Try block as it may throw an exception if data in the XML document is not in the required format.
Adding elements to XML document using Dataset
It is possible to modify and XML document using dataset. To change data in XML document following
the steps:
imports system
imports system.data
imports system.xml
Module Test
public sub main()
dim ds as new DataSet
dim dr as datarow
ds.ReadXML("c:\srikanth\vb\players.xml")
'add one more player
dr = ds.tables(0).newrow() ' add a new empty row
dr.item(0) = "Cafu"
dr.item(1) = "Defender"
dr.item(2) = 27
ds.tables(0).rows.add(dr) ' add new row to rows collection
ds.writexml("c:\srikanth\vb\players.xml")
ds.writexmlschema("c:\srikanth\vb\playerschema.xml")
end sub
end module
Method WriteXML writes data of table into an XML document. WriteXMLSchema writes schema into the given file.
Open players.xml in Internet Explorer and see the whether a new row is added to players.xml.
Keep Learning
P.Srikanth