Tuesday, February 10, 2009

In today’s post I’m going to share a problem I solved this week. The solution was to use the GZipStream class from the framework’s System.IO.Compression namespace.

The Problem

In my current project we save Xml data in our database. The field that held the data was of type varchar(6000). Saving the data this way became a problem once large Xml documents (over 8000 kb each) had to be stored, and in the long run it could cause space and performance problems.

The Solution

Use the compression abilities of the .NET framework: compress the Xml data and save it in binary form. We needed to change the field type in the database to a binary type and compress the Xml data before inserting it into the database. After the binary data is retrieved from the database, the reverse process (decompression) returns the original Xml string.
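To get a feel for what the compression buys, here is a minimal sketch that compresses a repetitive Xml string and compares the byte counts before and after. The class name CompressionSketch, the helpers Compress, Decompress and BuildSampleXml, and the sample document itself are assumptions made up for illustration; they are not the project's actual code or data.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class CompressionSketch
{
    // Hypothetical helper: UTF-8 encode the text and gzip the bytes.
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (MemoryStream ms = new MemoryStream())
        {
            using (GZipStream gzip = new GZipStream(ms, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            // Read the result only after the GZipStream is closed,
            // so the final compressed block has been flushed.
            return ms.ToArray();
        }
    }

    // Hypothetical helper: gunzip the bytes and decode them as UTF-8.
    public static string Decompress(byte[] compressed)
    {
        using (MemoryStream ms = new MemoryStream(compressed))
        using (GZipStream gzip = new GZipStream(ms, CompressionMode.Decompress))
        using (StreamReader reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    // Builds a made-up, repetitive Xml document for the comparison.
    public static string BuildSampleXml(int childCount)
    {
        StringBuilder sb = new StringBuilder("<Root>");
        for (int i = 0; i < childCount; i++)
        {
            sb.Append("<Child>data").Append(i).Append("</Child>");
        }
        return sb.Append("</Root>").ToString();
    }

    public static void Main()
    {
        string xml = BuildSampleXml(1000);
        byte[] compressed = Compress(xml);

        Console.WriteLine("Original bytes:   " + Encoding.UTF8.GetByteCount(xml));
        Console.WriteLine("Compressed bytes: " + compressed.Length);
        Console.WriteLine("Round trip ok:    " + (Decompress(compressed) == xml));
    }
}
```

How much you save depends on the data. Xml with many repeated tags tends to compress well, while very short strings can even grow slightly because of the gzip header.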

The Code

I first built a console application to write the code and test it. Then I wired the zip and unzip methods into the part of the project that needed the compression. The following code shows the console application with the zip and unzip methods I used to compress the Xml data.

using System;
using System.IO;
using System.IO.Compression;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        string data = "<Root><Child>data1</Child><Child>data2</Child><Child>data3</Child><Child>data4</Child><Child>data5</Child></Root>";
        Console.WriteLine(data);

        byte[] zipped = ZipDocumentData(data);
        // The compressed bytes are binary, not text, so print them as Base64
        Console.WriteLine(Convert.ToBase64String(zipped));

        data = UnZipDocumentData(zipped);
        Console.WriteLine(data);

        Console.Read();
    }

    private static byte[] ZipDocumentData(string documentData)
    {
        byte[] byteArray = Encoding.UTF8.GetBytes(documentData);

        using (MemoryStream ms = new MemoryStream())
        {
            using (GZipStream stream = new GZipStream(ms, CompressionMode.Compress))
            {
                // Compress
                stream.Write(byteArray, 0, byteArray.Length);
            }
            // The GZipStream is closed at this point, so all compressed data is in ms
            return ms.ToArray();
        }
    }

    private static string UnZipDocumentData(byte[] zippedDocumentData)
    {
        // Prepare for decompress
        using (MemoryStream ms = new MemoryStream(zippedDocumentData))
        using (GZipStream stream = new GZipStream(ms, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            // Buffer for reading the uncompressed data in chunks
            byte[] byteArray = new byte[4096];
            int rByte;

            // Decompress: Read may return fewer bytes than requested,
            // so keep reading until it returns 0
            while ((rByte = stream.Read(byteArray, 0, byteArray.Length)) > 0)
            {
                output.Write(byteArray, 0, rByte);
            }

            return Encoding.UTF8.GetString(output.ToArray());
        }
    }
}

Some things to keep in mind if you are going to use this code:

  • The strings I use are encoded as UTF8. If you use another encoding, change Encoding.UTF8 to the encoding you use.
  • The decompression code uses a fixed array of 4096 bytes. This is only for the test application; in the real method I save the original size of the array.
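The two points above can be sketched together as follows. This is only one possible approach, not the project's real method: it assumes the original size is stored as a 4-byte prefix in front of the compressed bytes, and it takes the encoding as a parameter. The names LengthPrefixedZip, Zip and UnZip are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class LengthPrefixedZip
{
    // Assumed layout: [4-byte uncompressed length][gzip data]
    public static byte[] Zip(string text, Encoding encoding)
    {
        byte[] raw = encoding.GetBytes(text);
        using (MemoryStream ms = new MemoryStream())
        {
            // BitConverter uses the machine's byte order, so the prefix
            // is only portable between machines with the same endianness.
            byte[] header = BitConverter.GetBytes(raw.Length);
            ms.Write(header, 0, header.Length);

            using (GZipStream gzip = new GZipStream(ms, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return ms.ToArray();
        }
    }

    public static string UnZip(byte[] stored, Encoding encoding)
    {
        // Read the prefix and allocate a buffer of exactly the original size
        int originalLength = BitConverter.ToInt32(stored, 0);
        byte[] raw = new byte[originalLength];

        using (MemoryStream ms = new MemoryStream(stored, 4, stored.Length - 4))
        using (GZipStream gzip = new GZipStream(ms, CompressionMode.Decompress))
        {
            // Read may return fewer bytes than requested, so loop until full
            int offset = 0;
            while (offset < raw.Length)
            {
                int read = gzip.Read(raw, offset, raw.Length - offset);
                if (read == 0)
                {
                    throw new InvalidDataException("Compressed data ended early.");
                }
                offset += read;
            }
        }
        return encoding.GetString(raw);
    }

    public static void Main()
    {
        string xml = "<Root><Child>data1</Child></Root>";

        // The same encoding must be used in both directions
        byte[] stored = Zip(xml, Encoding.Unicode);
        Console.WriteLine(UnZip(stored, Encoding.Unicode) == xml);
    }
}
```

Storing the size costs four extra bytes per record, but it lets the decompress side allocate the exact buffer up front instead of guessing.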

Summary

Let’s sum up. I used compression to shrink Xml data before saving it in the database. I showed the code to do that using the GZipStream class, which is part of the System.IO.Compression namespace. I hope the code will help you whenever you need to compress data.