DataSet Serialisation

about serialisation of datasets

I discovered that the BinaryFormatter class serialises DataSets using XML. In fact, it is even slightly less efficient than using XmlFormatter, as it adds a binary header and chunks of whitespace that XmlFormatter does not produce.

I made this discovery after stack-profiling an application I was working on for a client. A large amount of reference data was being persisted on the client's computer to reduce network load and decrease application startup time. The profiler revealed that the majority of startup time was spent on string operations for XML deserialisation.

I've written a DataSetFormatter class for (de)serialising DataSet instances.

It persists:

It (currently) does not persist:

Adding such features should be trivial.

using the class

The class is used like any other IFormatter implementation, but it can only serialise DataSet instances. Attempting to serialise any other type throws an ArgumentException.

For example, to serialise a DataSet to a file:

    string fileName = "TestFileName.ser";
    DataSet dataSet = GetDataSet(); // for example
    Stream stream = null;
    try
    {
        stream = new FileStream(fileName, FileMode.Create, FileAccess.Write, FileShare.None);
        new DataSetFormatter().Serialize(stream, dataSet);
    }
    finally
    {
        if (stream != null)
            stream.Close();
    }

how it works

Internally, every ADO.NET DataTable instance is read from the source DataSet. A Serializable inner type (called Table) holds each table's name, column and row counts, column names, column data types, and finally the row values; these are stored as strings, arrays, and a rectangular array (for the data). The array of Table objects is then serialised by the BinaryFormatter. Deserialisation runs the same process in reverse.
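The flattening step can be sketched roughly as follows. This is not the class's actual source: TableSketch and its members are hypothetical names, and keeping column types as assembly-qualified type-name strings is an assumption. It only illustrates copying a DataTable into plain arrays and rebuilding it; the resulting array of objects is what would then be handed to BinaryFormatter. Row state, deleted rows, constraints and relations are ignored here.

```csharp
using System;
using System.Data;

// Hypothetical stand-in for the inner Table type described above.
[Serializable]
public class TableSketch
{
    public string Name;
    public string[] ColumnNames;
    public string[] ColumnTypeNames; // assumption: types kept as type-name strings
    public object[,] Values;         // rectangular array indexed [row, column]

    // Copy one DataTable's schema and data into plain arrays.
    public static TableSketch FromDataTable(DataTable table)
    {
        int rows = table.Rows.Count;
        int cols = table.Columns.Count;

        TableSketch t = new TableSketch();
        t.Name = table.TableName;
        t.ColumnNames = new string[cols];
        t.ColumnTypeNames = new string[cols];
        t.Values = new object[rows, cols];

        for (int c = 0; c < cols; c++)
        {
            t.ColumnNames[c] = table.Columns[c].ColumnName;
            t.ColumnTypeNames[c] = table.Columns[c].DataType.AssemblyQualifiedName;
        }
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                t.Values[r, c] = table.Rows[r][c]; // nulls arrive as DBNull.Value

        return t;
    }

    // Flatten every table in a DataSet.
    public static TableSketch[] FromDataSet(DataSet dataSet)
    {
        TableSketch[] tables = new TableSketch[dataSet.Tables.Count];
        for (int i = 0; i < tables.Length; i++)
            tables[i] = FromDataTable(dataSet.Tables[i]);
        return tables;
    }

    // The reverse step: rebuild an equivalent DataTable from the arrays.
    public DataTable ToDataTable()
    {
        DataTable table = new DataTable(Name);
        for (int c = 0; c < ColumnNames.Length; c++)
            table.Columns.Add(ColumnNames[c], Type.GetType(ColumnTypeNames[c]));

        int rows = Values.GetLength(0);
        int cols = Values.GetLength(1);
        for (int r = 0; r < rows; r++)
        {
            object[] row = new object[cols];
            for (int c = 0; c < cols; c++)
                row[c] = Values[r, c];
            table.Rows.Add(row);
        }
        return table;
    }
}
```

Because the flattened form contains only strings and arrays, the formatter has no XML to build or parse, which is presumably where the time saving comes from.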

performance

Tests have shown considerable improvements to both disk usage and execution time. On average, the serialised output is just 25% of the size (4 times smaller), and round-trip processing takes 7% of the time (14 times faster), when compared against BinaryFormatter.

The class may work with remoting, though I've not tried, nor have I studied the remoting framework in enough detail to understand providing custom IFormatters.

The unit tests quantify the improvement, producing output such as the following:

    Serialised filesizes: 50659 with DataSetFormatter, 201746 with BinaryFormatter
    Compressed to 25.11 % of BinaryFormatter size
    Roundtrip times: 28.001 for DataSetFormatter, 464.4098 for BinaryFormatter
    Reduced to 6.03 % of BinaryFormatter time
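The percentages in this output follow directly from the raw figures. As a small worked check (RatioSketch and its method names are made up for illustration and are not part of the unit tests):

```csharp
using System;

// Hypothetical helper for re-deriving the figures in the test output.
public static class RatioSketch
{
    // What percentage of the baseline the new value represents.
    public static double PercentOf(double value, double baseline)
    {
        return value / baseline * 100.0;
    }

    // How many times smaller (or faster) the new value is than the baseline.
    public static double Factor(double value, double baseline)
    {
        return baseline / value;
    }
}
```

With the figures above, PercentOf(50659, 201746) comes to about 25.11 and Factor(50659, 201746) to about 3.98, so the file is roughly a quarter of the size. PercentOf(28.001, 464.4098) comes to about 6.03 and Factor(28.001, 464.4098) to about 16.6, so this particular run was a little over 16 times faster.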

download

Only the most recent version is available for download.

release notes

v1.1 - 9 Nov 2003

v1.0 - 25 Oct 2003

submitting feedback

Please feel free to provide feedback on this class. Bug reports are most appreciated when accompanied by a failing NUnit test case.