Exploring XML Serialization
Introduction
We all need to persist a variety of information, whether it be configuration settings, cached data, descriptive information or even data that a database might be better suited for but, for one reason or another, isn’t an option. We have several choices available for reading and writing this information to the hard drive. Some examples include the “classic” INI files, comma-delimited or fixed-length data files and even registry entries. However, given the adoption rate of XML, it’s appropriate to explore the options available to you via the .NET Framework Class Library (FCL).
What will this article do for you?
We’ll be looking at two examples of leveraging XML serialization to persist our classes. We will also explore several options available for where to potentially store these files, cover some migration ideas from the previously mentioned methods to XML storage and show several examples of using Attributes to control the XML formatting.
Initial Setup of our Project Environment
- Create a new Windows Form project (XMLSerialization).
First Things First
Saving configuration is something that nearly every application needs to do, so let’s use that as our sample to build upon. First, let’s look at how you might have persisted this information in the past.
One option is to use the registry to read/write this information. It’s not too difficult to do and the methods are even available in the FCL. A couple of problems exist with using the registry, though. First and foremost, using the registry is something that is now considered a “bad thing” by Microsoft. The registry is limited in size, and there are security issues (accessing the registry as a “user” has its own set of problems). It’s really meant (in the age of Windows XP and Windows 2003) for Windows, installations, COM objects and drivers to use. There are more appropriate options for your applications.
The next option is to store the configuration information in a file of some type, either in your application directory or, more appropriately, in one of the user’s special folders. You can choose to use a comma-delimited or fixed-length file to store this information; however, let’s use the “classic” INI file for our sample. This lets us look at a storage file that contains name/value pairs in a categorized structure, which gives us a better frame of reference when moving forward to the XML version of the file.
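If you do go with one of the user’s special folders, a minimal sketch of building the file path might look like the following. The module name ConfigPathSketch and the function GetConfigPath are made up for illustration, and the folder and file names are simply the ones used in this article; treat it as one possible approach rather than the way to do it.

```vbnet
Imports System
Imports System.IO

Module ConfigPathSketch
    ' Hypothetical helper: builds a per-user path for the settings file.
    ' "XMLSerialization" and "system.xml" are just the names used in this article.
    Public Function GetConfigPath() As String
        Dim appData As String = _
            Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData)
        Dim folder As String = Path.Combine(appData, "XMLSerialization")

        ' Create the folder on first use so a later save does not fail.
        If Not Directory.Exists(folder) Then
            Directory.CreateDirectory(folder)
        End If

        Return Path.Combine(folder, "system.xml")
    End Function

    Sub Main()
        Console.WriteLine(GetConfigPath())
    End Sub
End Module
```

The rest of the article keeps things simple and writes system.xml next to the executable, but the same path could be handed to the Load and Save methods we build later.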
Here’s our example INI file (System.ini from C:\Windows):
; for 16-bit app support
[drivers]
wave=mmdrv.dll
timer=timer.drv
[mci]
[driver32]
[386enh]
woafont=dosapp.FON
EGA80WOA.FON=EGA80WOA.FON
EGA40WOA.FON=EGA40WOA.FON
CGA80WOA.FON=CGA80WOA.FON
CGA40WOA.FON=CGA40WOA.FON
We’re not going to get into the specifics of using an INI file in this article (you’d have to either write your own parsing code or use some P/Invoke methods), but you can see there are a couple of sections that stand out. There is a mechanism in place for adding comments, you have sections (enclosed in brackets) and name/value pairs.
At this point, let’s create a class that will hold some of this information in a structured manner.
- Create a new class (Config.vb).
- Modify the Config class as follows:
Public Class Config
Private _drivers As Drivers
Private _enhanced386 As Enhanced386
Public Sub New()
_drivers = New Drivers
_enhanced386 = New Enhanced386
End Sub
Public Property Drivers() As Drivers
Get
Return _drivers
End Get
Set(ByVal Value As Drivers)
_drivers = Value
End Set
End Property
Public Property Enhanced386() As Enhanced386
Get
Return _enhanced386
End Get
Set(ByVal Value As Enhanced386)
_enhanced386 = Value
End Set
End Property
End Class
- Add two more classes (Drivers and Enhanced386).
Public Class Drivers
Private _wave As String
Private _timer As String
Public Property Wave() As String
Get
Return _wave
End Get
Set(ByVal Value As String)
_wave = Value
End Set
End Property
Public Property Timer() As String
Get
Return _timer
End Get
Set(ByVal Value As String)
_timer = Value
End Set
End Property
End Class
Public Class Enhanced386
Private _woafont As String
Private _fonts As ArrayList
Public Sub New()
_fonts = New ArrayList
End Sub
Public Property WOAFont() As String
Get
Return _woafont
End Get
Set(ByVal Value As String)
_woafont = Value
End Set
End Property
Public Property Fonts() As ArrayList
Get
Return _fonts
End Get
Set(ByVal Value As ArrayList)
_fonts = Value
End Set
End Property
End Class
We’re going to leave out the MCI and Driver32 sections since there’s nothing in them. We’ve had to rename 386Enh to Enhanced386 since a class name can’t start with a digit. Notice that we’re also representing the different fonts as a list (an ArrayList) instead of separate name/value pairs.
Now, let’s add the code that we will use to load/save this class to a file.
- First, let’s add a couple of Imports to save us some typing:
Imports System.Xml
Imports System.Xml.Schema
Imports System.Xml.Serialization
Imports System.Runtime.Serialization.Formatters.Soap
Imports System.IO
- Once we add the above Imports, we will have an error on the System.Runtime.Serialization.Formatters.Soap line. Add a reference to the System.Runtime.Serialization.Formatters.Soap assembly.
- Add the following two shared methods to the Config class:
Public Shared Function Load(ByVal path As String) As Config
Dim stream As Stream = File.Open(path, FileMode.Open)
Dim formatter As SoapFormatter = New SoapFormatter
Dim c As Config = CType(formatter.Deserialize(stream), Config)
stream.Close()
Return c
End Function
Public Shared Sub Save(ByVal path As String, ByVal config As Config)
Dim stream As Stream = File.Open(path, FileMode.Create)
Dim formatter As New SoapFormatter
formatter.Serialize(stream, config)
stream.Close()
End Sub
So, is this everything we have to do? Well, not yet, but to verify this, let’s add the following to our Form1 load event.
- Add the following to the Form1_Load event:
Private Sub Form1_Load(ByVal sender As System.Object, _
ByVal e As System.EventArgs) Handles MyBase.Load
Dim c As New Config
c.Drivers.Wave = "mmdrv.dll"
c.Drivers.Timer = "timer.drv"
c.Enhanced386.WOAFont = "dosapp.FON"
c.Enhanced386.Fonts.Add("EGA80WOA.FON")
c.Enhanced386.Fonts.Add("EGA40WOA.FON")
c.Enhanced386.Fonts.Add("CGA80WOA.FON")
c.Enhanced386.Fonts.Add("CGA40WOA.FON")
Config.Save("system.xml", c)
c = Nothing
c = Config.Load("system.xml")
MsgBox("timer=" & c.Drivers.Timer)
End Sub
Now, if you execute the application, you will get the following error:
An unhandled exception of type ‘System.Runtime.Serialization.SerializationException’ occurred in mscorlib.dll Additional information: The type XMLSerialization.Config in Assembly XMLSerialization, Version=1.0.1558.42446, Culture=neutral, PublicKeyToken=null is not marked as serializable.
The key here is that the classes are not marked as Serializable. Let’s correct this.
- Add the following attribute to each of the classes:
<Serializable()>
Note: For a good reference on .NET Attributes, be sure to check out Applied .NET Attributes by Jason Bock and Tom Barnaby.
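In case the placement isn’t obvious, here is a small sketch of the attribute applied to a class. SampleSettings is a hypothetical stand-in for Config, Drivers and Enhanced386, and the Main routine is only there to confirm that the runtime sees the attribute.

```vbnet
Imports System

' The attribute goes immediately before the class declaration.
' SampleSettings is a made-up stand-in for Config, Drivers and Enhanced386.
<Serializable()> _
Public Class SampleSettings
    Public Name As String
End Class

Module SerializableSketch
    Sub Main()
        ' Type.IsSerializable reports whether the type carries the attribute.
        Console.WriteLine(GetType(SampleSettings).IsSerializable)
    End Sub
End Module
```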
Now run the application again. This time you should have a system.xml file created in the project’s bin folder. Open this file using Internet Explorer (or VS.NET). It should look like the following:
<SOAP-ENV:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:clr="http://schemas.microsoft.com/soap/encoding/clr/1.0" SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<SOAP-ENV:Body>
<a1:Config id="ref-1" xmlns:a1="http://schemas.microsoft.com/clr/nsassem/XMLSerialization/XMLSerialization%2C%20Version%3D1.0.1558.42446%2C%20Culture%3Dneutral%2C%20PublicKeyToken%3Dnull">
<_drivers href="#ref-3"/>
<_enhanced386 href="#ref-4"/>
</a1:Config>
<a1:Drivers id="ref-3" xmlns:a1="http://schemas.microsoft.com/clr/nsassem/XMLSerialization/XMLSerialization%2C%20Version%3D1.0.1558.42446%2C%20Culture%3Dneutral%2C%20PublicKeyToken%3Dnull">
<_wave id="ref-5">mmdrv.dll</_wave>
<_timer id="ref-6">timer.drv</_timer>
</a1:Drivers>
<a1:Enhanced386 id="ref-4" xmlns:a1="http://schemas.microsoft.com/clr/nsassem/XMLSerialization/XMLSerialization%2C%20Version%3D1.0.1558.42446%2C%20Culture%3Dneutral%2C%20PublicKeyToken%3Dnull">
<_woafont id="ref-7">dosapp.FON</_woafont>
<_fonts href="#ref-8"/>
</a1:Enhanced386>
<a2:ArrayList id="ref-8" xmlns:a2="http://schemas.microsoft.com/clr/ns/System.Collections">
<_items href="#ref-9"/>
<_size>4</_size>
<_version>4</_version>
</a2:ArrayList>
<SOAP-ENC:Array id="ref-9" SOAP-ENC:arrayType="xsd:anyType[16]">
<item id="ref-10" xsi:type="SOAP-ENC:string">EGA80WOA.FON</item>
<item id="ref-11" xsi:type="SOAP-ENC:string">EGA40WOA.FON</item>
<item id="ref-12" xsi:type="SOAP-ENC:string">CGA80WOA.FON</item>
<item id="ref-13" xsi:type="SOAP-ENC:string">CGA40WOA.FON</item>
</SOAP-ENC:Array>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
I don’t know about you, but that output looks pretty freakin’ messy! The reality is that this is not the way we should be serializing the class to an XML file. Although it produces completely valid XML that is relatively strongly typed, this method of serialization is better suited for transferring information across application boundaries. For our purposes, it’s overkill. What we are looking for is an XML file that looks something like the following:
<?xml version="1.0"?>
<config>
<drivers>
<wave>mmdrv.dll</wave>
<timer>timer.drv</timer>
</drivers>
<enhanced386>
<woafont>dosapp.FON</woafont>
<fonts>
<font>EGA80WOA.FON</font>
<font>EGA40WOA.FON</font>
<font>CGA80WOA.FON</font>
<font>CGA40WOA.FON</font>
</fonts>
</enhanced386>
</config>
As usual with my articles, I try to show you the wrong way first ;-) Well, it’s not actually the wrong way, but it is probably the first direction you would look if you set out to do XML serialization using .NET.
So now let’s look at how to take some control over how the XML is structured. Moving forward, we will no longer require the System.Runtime.Serialization.Formatters.Soap reference and can remove the <Serializable()> attributes from the classes.
- Remove the reference to System.Runtime.Serialization.Formatters.Soap.
- Remove Imports System.Runtime.Serialization.Formatters.Soap.
- Remove the <Serializable()> attributes from the Config, Drivers and Enhanced386 classes.
At this point, we’re kind of back to square one. Now we will replace the code in the Load() and Save() methods of the Config class to use a different method of serialization.
- Replace the code in the Load() and Save() methods so they look like the following:
Public Shared Function Load(ByVal path As String) As Config
Dim xs As XmlSerializer = New XmlSerializer(GetType(Config))
Dim fs As FileStream = New FileStream(path, FileMode.Open)
Dim c as Config = CType(xs.Deserialize(fs), Config)
fs.Close()
Return c
End Function
Public Shared Sub Save(ByVal path As String, ByVal config As Config)
Dim xs As XmlSerializer = New XmlSerializer(GetType(Config))
Dim fs As FileStream = New FileStream(path, FileMode.Create)
xs.Serialize(fs, config)
fs.Close()
End Sub
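One thing the methods above don’t handle is an exception during Serialize() or Deserialize(): the stream would be left open and the file locked. A hedged variation using Try/Finally is sketched below. ConfigStore is a made-up holder class, and the tiny Config class here is only a stand-in so the sketch compiles on its own; in the project you would keep the methods on the real Config class.

```vbnet
Imports System
Imports System.IO
Imports System.Xml.Serialization

' Minimal stand-in for the article's Config class (assumption, for illustration only).
Public Class Config
    Public Name As String
End Class

' ConfigStore is a hypothetical name; these could equally be shared methods on Config.
Public Class ConfigStore
    Public Shared Function Load(ByVal fileName As String) As Config
        Dim xs As New XmlSerializer(GetType(Config))
        Dim fs As New FileStream(fileName, FileMode.Open)
        Try
            Return CType(xs.Deserialize(fs), Config)
        Finally
            ' Runs whether or not Deserialize throws, so the file is not left locked.
            fs.Close()
        End Try
    End Function

    Public Shared Sub Save(ByVal fileName As String, ByVal value As Config)
        Dim xs As New XmlSerializer(GetType(Config))
        Dim fs As New FileStream(fileName, FileMode.Create)
        Try
            xs.Serialize(fs, value)
        Finally
            fs.Close()
        End Try
    End Sub
End Class
```

This matters more once we start throwing exceptions from property setters later in the article, since a failed load would otherwise keep system.xml open.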
Now if you execute the application, you should see the system.xml file change to look something like the following:
<?xml version="1.0"?>
<Config xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Drivers>
<Wave>mmdrv.dll</Wave>
<Timer>timer.drv</Timer>
</Drivers>
<Enhanced386>
<WOAFont>dosapp.FON</WOAFont>
<Fonts>
<anyType xsi:type="xsd:string">EGA80WOA.FON</anyType>
<anyType xsi:type="xsd:string">EGA40WOA.FON</anyType>
<anyType xsi:type="xsd:string">CGA80WOA.FON</anyType>
<anyType xsi:type="xsd:string">CGA40WOA.FON</anyType>
</Fonts>
</Enhanced386>
</Config>
By default, the XmlSerializer will attempt to serialize the class structure automatically using what information it can pull using reflection. This isn’t too bad; however, the XML output is a little ugly since the tags are not all lowercase and the Fonts collection looks awkward. Still, you can see that it’s pretty simple to persist a class as an XML file. Now let’s experiment to see how we can take even more control over how the serializer processes the class.
First things first, let’s control how the tags are named. We want all of them to be lowercase. To do this, we will add some attributes to each class (XmlRoot()) and property (XmlElement()) to explicitly control how the XmlSerializer will work.
- Modify the Config class to have the XmlRoot() attribute:
<XmlRoot("config")> _
Public Class Config
' <code not displayed to save space>
End Class
- Modify the Drivers class to have the XmlRoot() attribute:
<XmlRoot("drivers")> _
Public Class Drivers
' <code not displayed to save space>
End Class
- Modify the Enhanced386 class to have the XmlRoot() attribute:
<XmlRoot("enhanced386")> _
Public Class Enhanced386
' <code not displayed to save space>
End Class
The XmlRoot() attribute controls the name of the root element. We’ve added it to the Drivers and Enhanced386 classes just in case we decide to serialize them separately. When they are contained within the Config class as properties, the attribute on the property controls how the element is created. Let’s now modify the properties of the classes.
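To see XmlRoot() doing its job on its own, here is a rough sketch that serializes a Drivers object by itself into a string. The Drivers class is trimmed down to public fields to keep it short (the XmlSerializer handles public fields as well as properties), and XmlRootSketch and ToXml are made-up names.

```vbnet
Imports System
Imports System.IO
Imports System.Xml.Serialization

' Trimmed-down version of Drivers using public fields for brevity (assumption).
<XmlRoot("drivers")> _
Public Class Drivers
    Public Wave As String
    Public Timer As String
End Class

Module XmlRootSketch
    ' Hypothetical helper: serializes to a string instead of a file.
    Public Function ToXml(ByVal d As Drivers) As String
        Dim xs As New XmlSerializer(GetType(Drivers))
        Dim writer As New StringWriter()
        xs.Serialize(writer, d)
        Return writer.ToString()
    End Function

    Sub Main()
        Dim d As New Drivers()
        d.Wave = "mmdrv.dll"
        d.Timer = "timer.drv"
        ' The root element is named by XmlRoot; the fields still use their own names.
        Console.WriteLine(ToXml(d))
    End Sub
End Module
```

Serializing to a StringWriter like this is also a handy way to check the effect of an attribute without reopening system.xml every time.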
- Add the XmlElement() attribute to the Drivers and Enhanced386 properties in the Config class:
<XmlElement("drivers")> _
Public Property Drivers() As Drivers
' <code not displayed to save space>
End Property
<XmlElement("enhanced386")> _
Public Property Enhanced386() As Enhanced386
' <code not displayed to save space>
End Property
- Add the XmlElement() attribute to the Wave and Timer properties in the Drivers class:
<XmlElement("wave")> _
Public Property Wave() As String
' <code not displayed to save space>
End Property
<XmlElement("timer")> _
Public Property Timer() As String
' <code not displayed to save space>
End Property
- Add the XmlElement() attribute to the WOAFont and Fonts properties in the Enhanced386 class:
<XmlElement("woafont")> _
Public Property WOAFont() As String
' <code not displayed to save space>
End Property
<XmlElement("fonts")> _
Public Property Fonts() As ArrayList
' <code not displayed to save space>
End Property
Now execute the application and check out the results in the system.xml file. You will see that the XML now looks closer to what we want to produce. The only problem is that the <fonts> section is still not the way we want it. At this point, it looks like the following:
<enhanced386>
<woafont>dosapp.FON</woafont>
<fonts xsi:type="xsd:string">EGA80WOA.FON</fonts>
<fonts xsi:type="xsd:string">EGA40WOA.FON</fonts>
<fonts xsi:type="xsd:string">CGA80WOA.FON</fonts>
<fonts xsi:type="xsd:string">CGA40WOA.FON</fonts>
</enhanced386>
We want to get rid of the xsi:type="xsd:string" portion and reformat it so that the fonts look like the following:
<enhanced386>
<woafont>dosapp.FON</woafont>
<fonts>
<font>EGA80WOA.FON</font>
<font>EGA40WOA.FON</font>
<font>CGA80WOA.FON</font>
<font>CGA40WOA.FON</font>
</fonts>
</enhanced386>
To do this, all we have to do is modify the attribute on the Fonts property in the Enhanced386 class so that the XmlSerializer knows how to format an ArrayList the way we want it formatted.
- Remove the XmlElement attribute and replace it with the following:
<XmlArray("fonts"), XmlArrayItem("font", DataType:="string", Type:=GetType(String))> _
Public Property Fonts() As ArrayList
' <code not displayed to save space>
End Property
We are letting the XmlSerializer know that we have an array called fonts with each element called font. By defining the DataType and Type, we tell the XmlSerializer that every font element must be a string, so it no longer puts the xsi:type="xsd:string" attribute on each of the font elements.
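Putting the pieces together, the attributed Enhanced386 class might look like the sketch below. The attributes are the ones from the steps above; the EnhancedSketch module and its ToXml helper are made up so the result can be printed without touching a file.

```vbnet
Imports System
Imports System.Collections
Imports System.IO
Imports System.Xml.Serialization

<XmlRoot("enhanced386")> _
Public Class Enhanced386
    Private _woafont As String
    Private _fonts As New ArrayList()

    <XmlElement("woafont")> _
    Public Property WOAFont() As String
        Get
            Return _woafont
        End Get
        Set(ByVal Value As String)
            _woafont = Value
        End Set
    End Property

    ' XmlArray names the wrapper element; XmlArrayItem names and types each entry.
    <XmlArray("fonts"), XmlArrayItem("font", DataType:="string", Type:=GetType(String))> _
    Public Property Fonts() As ArrayList
        Get
            Return _fonts
        End Get
        Set(ByVal Value As ArrayList)
            _fonts = Value
        End Set
    End Property
End Class

Module EnhancedSketch
    ' Hypothetical helper: serializes to a string so the output is easy to inspect.
    Public Function ToXml(ByVal e As Enhanced386) As String
        Dim xs As New XmlSerializer(GetType(Enhanced386))
        Dim writer As New StringWriter()
        xs.Serialize(writer, e)
        Return writer.ToString()
    End Function

    Sub Main()
        Dim e As New Enhanced386()
        e.WOAFont = "dosapp.FON"
        e.Fonts.Add("EGA80WOA.FON")
        e.Fonts.Add("EGA40WOA.FON")
        Console.WriteLine(ToXml(e))
    End Sub
End Module
```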
Wow, that’s pretty easy… but…
Yes, using the XmlSerializer is really pretty simple and could potentially save you a ton of work. Once you build your initial class structure, adding an option just means adding a new property and the appropriate attribute to control its formatting, and you’re good to go. Even error checking is done for you automatically based on how you defined your properties. Let’s look at some examples of this.
First, let’s create a property called IntValue (I know, really creative, isn’t it?) that is of an integer type and can only accept a value of 0, 1, or 2. We’ll go ahead and add it to the Config class.
<XmlElement("intvalue")> _
Public Property IntValue() As Integer
Get
Return _intValue
End Get
Set(ByVal Value As Integer)
If Value > -1 AndAlso Value < 3 Then
_intValue = Value
Else
Throw New ArgumentException("IntValue must be 0, 1, or 2.")
End If
End Set
End Property
We also need to add the _intValue member variable to the class.
Private _intValue As Integer
Now execute the application so that the system.xml file is updated. After you’ve successfully done this, open the system.xml file and change intvalue to some value other than 0, 1, or 2. You also need to modify the Form1_Load event so that it doesn’t save the XML file; just comment out the Config.Save line. To summarize:
- Execute the application so that a new system.xml file is created.
- Open system.xml and change intvalue to 5.
- Modify the Form1_Load event in Form1 by commenting out the Config.Save() line so that the file isn’t written again.
Once you’ve completed these steps, execute the application again. You should see an error stating that there’s a problem with the XML file at position 4, 4. This is because a value of 5 causes an exception to be thrown when the XmlSerializer tries to set the property value to what is in the XML file. So as you can see, we get some automatic value protection. In addition, if an element doesn’t exist in the XML file, the property keeps whatever default value we set in the Sub New() method.
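Both behaviors can be tried without editing a file by hand. The sketch below is a cut-down Config with only the IntValue property and an arbitrary default of 1 set in Sub New(); the DefaultsSketch module, the Parse helper and the default value are assumptions made for illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Xml.Serialization

' Cut-down Config holding only IntValue (assumption, for illustration).
<XmlRoot("config")> _
Public Class Config
    Private _intValue As Integer

    Public Sub New()
        ' Arbitrary default; it stays in place when the XML has no <intvalue> element.
        _intValue = 1
    End Sub

    <XmlElement("intvalue")> _
    Public Property IntValue() As Integer
        Get
            Return _intValue
        End Get
        Set(ByVal Value As Integer)
            If Value > -1 AndAlso Value < 3 Then
                _intValue = Value
            Else
                Throw New ArgumentException("IntValue must be 0, 1, or 2.")
            End If
        End Set
    End Property
End Class

Module DefaultsSketch
    ' Hypothetical helper: deserializes a Config from a string rather than a file.
    Public Function Parse(ByVal text As String) As Config
        Dim xs As New XmlSerializer(GetType(Config))
        Return CType(xs.Deserialize(New StringReader(text)), Config)
    End Function

    Sub Main()
        ' Element missing: the constructor default is kept.
        Console.WriteLine(Parse("<config />").IntValue)

        ' Out-of-range value: the setter's exception surfaces through the serializer.
        Try
            Parse("<config><intvalue>5</intvalue></config>")
        Catch ex As InvalidOperationException
            Console.WriteLine(ex.Message)
        End Try
    End Sub
End Module
```

The serializer reports the failure as an InvalidOperationException, which is the “problem with the XML file” error mentioned above; the original ArgumentException should be available through its InnerException.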
So the next logical question that comes to mind is what happens when we use an enumeration when defining our properties? Well, as it turns out, they work just fine. You can define a property using an enumeration type and attribute it just like any other property. Let’s add one of these properties to our Config class:
- Add the following enumeration to the Config class:
Public Enum EnumValues
Value1
Value2
Value3
End Enum
- Add the following member variable:
Private _enumValue As EnumValues
- Add the following property (notice the attribute properties):
<XmlElement(ElementName:="enumvalue", Type:=GetType(EnumValues))> _
Public Property EnumValue() As EnumValues
Get
Return _enumValue
End Get
Set(ByVal Value As EnumValues)
_enumValue = Value
End Set
End Property
Again, we are specifically defining what the XML element will be, and notice that we also have to specify the Type property of the XmlElement attribute. Without this, the XmlSerializer will throw an error since it doesn’t know how to deal with the property. Now here’s a potential gotcha. When this property is stored in the XML file, the enumeration member name will be used, not the numeric representation. In addition, this value is case sensitive. So when you save EnumValues.Value1 to the XML file (which will be stored as "Value1") and then change it to "value1", an error will be thrown by the XmlSerializer since it doesn’t know how to deserialize that value into the enumeration. Why is it case sensitive? I have no idea, but my guess would be that it was written by a C# guy not thinking about other developers using languages that aren’t case sensitive. Of course, this is just a complete guess ;-)
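If you’d rather have lowercase values in the file to match the lowercase tags, one option worth trying is the XmlEnum() attribute, which sets the text stored for each enumeration member. The sketch below declares the enumeration outside the Config class and uses a public field instead of a property to stay short; both are simplifications, and EnumSketch, ToXml and FromXml are made-up names. The value is still matched case sensitively when read back, so this changes the spelling, not the rule.

```vbnet
Imports System
Imports System.IO
Imports System.Xml.Serialization

Public Enum EnumValues
    ' XmlEnum sets the text written to (and expected in) the XML file.
    <XmlEnum("value1")> Value1
    <XmlEnum("value2")> Value2
    <XmlEnum("value3")> Value3
End Enum

' Cut-down Config with a public field instead of a property (assumption, for brevity).
<XmlRoot("config")> _
Public Class Config
    <XmlElement(ElementName:="enumvalue", Type:=GetType(EnumValues))> _
    Public EnumValue As EnumValues
End Class

Module EnumSketch
    ' Hypothetical helpers that work on strings instead of files.
    Public Function ToXml(ByVal c As Config) As String
        Dim xs As New XmlSerializer(GetType(Config))
        Dim writer As New StringWriter()
        xs.Serialize(writer, c)
        Return writer.ToString()
    End Function

    Public Function FromXml(ByVal text As String) As Config
        Dim xs As New XmlSerializer(GetType(Config))
        Return CType(xs.Deserialize(New StringReader(text)), Config)
    End Function

    Sub Main()
        Dim c As New Config()
        c.EnumValue = EnumValues.Value2
        Console.WriteLine(ToXml(c))

        Dim loaded As Config = FromXml("<config><enumvalue>value3</enumvalue></config>")
        Console.WriteLine(loaded.EnumValue.ToString())
    End Sub
End Module
```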
Conclusion
I suppose this is enough information to get you started using XML serialization, so let’s bring this article to a conclusion. Keep in mind there are still a ton of other things you can do. The XmlSerializer doesn’t give you 100% control, but it does provide more than enough to get you at least close to the results you are looking for. You could even reverse what we’ve covered in this article: take an existing XML file and create a class that uses XML deserialization to populate itself. I’m leaving this article sort of open ended, and it’s very possible that it will be added to over time based on comments I receive. So if you see a problem, know of anything that should be added, or if you like/dislike it, please let me know by leaving a comment!