In today’s post I’m going to share a problem I solved this week. The solution was to use the GZipStream class from the framework’s System.IO.Compression namespace.
The Problem
In my current project we save Xml data in our database. The field that holds the data was of type varchar(6000). The Xml documents turned out to be big (over 8000 kb each), and saving them as plain text could cause space and performance problems in the long run.
The Solution
Use the compression abilities of the .NET Framework: compress the Xml data and save it in binary form. We needed to change the field type in the database to a binary type and compress the Xml data before inserting it into the database. After the binary data is retrieved from the database, the reverse process of decompression returns the original Xml string.
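To get a feel for how much space this can save, here is a minimal sketch that compares the raw and compressed sizes of a repetitive Xml string. The class name CompressionSizeSketch, its method names and the generated sample Xml are made up for illustration and are not part of the project; real documents will compress differently depending on their content.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class CompressionSizeSketch
{
    // Builds a repetitive Xml string (made-up sample data, not the project's documents).
    public static string BuildSampleXml(int childCount)
    {
        StringBuilder sb = new StringBuilder("<Root>");
        for (int i = 0; i < childCount; i++)
        {
            sb.Append("<Child>data").Append(i).Append("</Child>");
        }
        return sb.Append("</Root>").ToString();
    }

    // Compresses the UTF8 bytes of the Xml with GZipStream.
    public static byte[] Compress(string xml)
    {
        byte[] raw = Encoding.UTF8.GetBytes(xml);
        using (MemoryStream ms = new MemoryStream())
        {
            using (GZipStream gzip = new GZipStream(ms, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            // Read the result only after the GZipStream is closed, so everything is flushed.
            return ms.ToArray();
        }
    }

    // Returns a one-line report with both sizes in bytes.
    public static string Describe(string xml)
    {
        int rawSize = Encoding.UTF8.GetByteCount(xml);
        int zippedSize = Compress(xml).Length;
        return string.Format("Raw: {0} bytes, compressed: {1} bytes", rawSize, zippedSize);
    }
}
```

Calling Console.WriteLine(CompressionSizeSketch.Describe(CompressionSizeSketch.BuildSampleXml(1000))) prints both sizes, which makes it easy to check whether compression is worthwhile for a given kind of document before changing the field type.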
The Code
I first built a console application to write and test the code. Then I wired the zip and unzip methods into the part of the project that needed the compression. The following code is the console application, with the zip and unzip methods I used to compress the Xml data.
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        string data = "<Root><Child>data1</Child><Child>data2</Child><Child>data3</Child><Child>data4</Child><Child>data5</Child></Root>";
        Console.WriteLine(data);

        byte[] zipped = ZipDocumentData(data);
        //The compressed bytes are not valid text, so print them as Base64
        Console.WriteLine(Convert.ToBase64String(zipped));

        data = UnZipDocumentData(zipped);
        Console.WriteLine(data);

        Console.Read();
    }

    private static byte[] ZipDocumentData(string documentData)
    {
        byte[] byteArray = Encoding.UTF8.GetBytes(documentData);

        using (MemoryStream ms = new MemoryStream())
        {
            using (GZipStream stream = new GZipStream(ms, CompressionMode.Compress))
            {
                //Compress
                stream.Write(byteArray, 0, byteArray.Length);
            }
            //The GZipStream is closed at this point, so all the data was flushed to ms
            return ms.ToArray();
        }
    }

    private static string UnZipDocumentData(byte[] zippedDocumentData)
    {
        string result = string.Empty;

        //Prepare for decompress
        using (MemoryStream ms = new MemoryStream(zippedDocumentData))
        {
            using (GZipStream stream = new GZipStream(ms, CompressionMode.Decompress))
            {
                //Fixed size buffer to collect uncompressed result (test application only)
                byte[] byteArray = new byte[4096];

                //Decompress (a single Read is only good enough for this small test data)
                int rByte = stream.Read(byteArray, 0, byteArray.Length);

                //Decode only the bytes that were read, not the whole buffer
                result = Encoding.UTF8.GetString(byteArray, 0, rByte);
            }
        }
        return result;
    }
}
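The unzip method above reads once into a fixed 4096 byte buffer, which is enough for the short test string. As a hedged sketch of one way to handle Xml of any size without knowing the size in advance, the decompressed stream can be copied into a second MemoryStream with Stream.CopyTo (available from .NET Framework 4). The class name GZipCopyToSketch and its method names are illustrative and not taken from the project.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

static class GZipCopyToSketch
{
    // Same idea as the zip method in the post: UTF8 bytes written through a GZipStream.
    public static byte[] Zip(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (MemoryStream ms = new MemoryStream())
        {
            using (GZipStream gzip = new GZipStream(ms, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return ms.ToArray();
        }
    }

    // Decompresses everything, however large, by copying the stream to a growing buffer.
    public static string UnZipAnySize(byte[] zipped)
    {
        using (MemoryStream input = new MemoryStream(zipped))
        using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            // CopyTo keeps reading until the compressed stream is exhausted.
            gzip.CopyTo(output);
            return Encoding.UTF8.GetString(output.ToArray());
        }
    }
}
```

The trade-off is an extra in-memory copy of the decompressed data, which is usually acceptable for documents of this kind.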
A few things to keep in mind if you are going to use the zip and unzip methods:
- The strings I use are encoded as UTF8. If you use another encoding, replace Encoding.UTF8 in the code with the encoding you use.
- In the decompress process I use a fixed array of 4096 bytes. This is only for the test application. In the real method I save the original size of the array.
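The second note mentions saving the original size of the array. The post does not show how that is done, so the following is only one possible sketch, under the assumption that the size is stored as a 4 byte prefix in front of the compressed data. The class name LengthPrefixedGZipSketch and its method names are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class LengthPrefixedGZipSketch
{
    // Assumed layout: [4 bytes: uncompressed length][gzip data].
    // BitConverter uses the machine's byte order, so writer and reader must agree on it.
    public static byte[] ZipWithLength(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (MemoryStream ms = new MemoryStream())
        {
            ms.Write(BitConverter.GetBytes(raw.Length), 0, 4);
            using (GZipStream gzip = new GZipStream(ms, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return ms.ToArray();
        }
    }

    public static string UnZipWithLength(byte[] stored)
    {
        // Read the saved size and allocate a buffer of exactly that size.
        int length = BitConverter.ToInt32(stored, 0);
        byte[] raw = new byte[length];

        using (MemoryStream ms = new MemoryStream(stored, 4, stored.Length - 4))
        using (GZipStream gzip = new GZipStream(ms, CompressionMode.Decompress))
        {
            // Read may return fewer bytes than requested, so loop until the buffer is full.
            int total = 0;
            while (total < length)
            {
                int read = gzip.Read(raw, total, length - total);
                if (read == 0)
                {
                    throw new InvalidDataException("Compressed data ended early.");
                }
                total += read;
            }
        }
        return Encoding.UTF8.GetString(raw);
    }
}
```

Storing the size in its own database column instead of a prefix would work just as well. The point is only that the reader knows how many bytes to expect.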
Summary
Let’s sum up. I used compression to shrink the Xml data we save in the database, and showed the code that does it with the GZipStream class, which is part of the System.IO.Compression namespace. I hope the code will help you if you ever need to compress data.