HttpWebRequest, its request stream, and sending data in chunks ~ dev

Monday, February 9, 2009

I have recently been spending a great deal of time writing code that communicates with RESTful services (Microsoft's Live Framework, in particular). Toward this end, I'm using the HttpWebRequest class and, for scalability and responsiveness, I want to perform my I/O operations asynchronously. This led me to look into the [Begin]GetRequestStream methods. At first it seemed quite odd to me that there would be a need to get the request stream asynchronously. After all, getting the stream just didn't seem like an I/O operation to me. However, through experimentation, I noticed that acquiring the stream did cause some I/O. This led me to contact people at Microsoft to really get to the bottom of what is happening. The information they shared was certainly news to me, and I have never seen it documented anywhere. So now I'd like to share this information with you too, as I think it is very useful for anyone using the HttpWebRequest class to send request data.

There are basically 3 ways to make an HTTP request that contains a payload:

1. If the payload is small, you can call [Begin]GetRequestStream and it will immediately return a memory-backed stream that buffers the request payload. Then, when you call [Begin]GetResponse, the HttpWebRequest object calculates the ContentLength header based on how much data was written to the stream, and only then initiates the I/O.
2. If the payload is large (2GB, for example), then you shouldn't cache the whole payload in memory before sending it. In this case, you'd set the ContentLength property explicitly to the payload size. Then, when you call [Begin]GetRequestStream, it performs all of the I/O up to the point of sending the payload. You then write your payload to the stream, which sends it directly to the network (it is not held in an in-memory stream). When you call [Begin]GetResponse, it checks that the amount of data sent matches what you specified for the ContentLength property and does no additional I/O in order to start receiving the response.
3. If the payload is potentially large, generated at runtime, and you don't know its size ahead of time, then you set HttpWebRequest's SendChunked property to true (and do not set the ContentLength property). When you call [Begin]GetRequestStream, all of the I/O is performed up to the point of sending the payload data (including a "Transfer-Encoding: chunked" header). You then write to the stream in chunks, and these writes go directly to the network. When you close the stream, a special terminating chunk is sent to notify the server that the payload is done. When you call [Begin]GetResponse, no I/O is performed and you can start receiving the response.
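
To make the three cases above concrete, here is a minimal C# sketch of how each request might be configured. The class name UploadModes, its method names, and the use of POST are hypothetical and chosen only for illustration; only the HttpWebRequest members (Method, ContentLength, SendChunked, GetRequestStream, GetResponse) are real. The synchronous calls are used for brevity; the Begin/End variants follow the same sequence.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper class; the names are illustrative, not part of any library.
static class UploadModes
{
    // Case 1: small payload. Neither ContentLength nor SendChunked is set,
    // so the request stream buffers the payload in memory.
    public static HttpWebRequest Buffered(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST"; // assumption: any method that carries a body
        return request;
    }

    // Case 2: large payload of known size. ContentLength is set up front,
    // so the payload does not need to be measured before sending.
    public static HttpWebRequest KnownLength(string url, long payloadLength)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentLength = payloadLength;
        return request;
    }

    // Case 3: payload of unknown size. SendChunked is set and
    // ContentLength is deliberately left alone.
    public static HttpWebRequest Chunked(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.SendChunked = true;
        return request;
    }

    // The send sequence is the same in all three cases: get the request
    // stream, write the payload, close the stream, then get the response.
    public static int Send(HttpWebRequest request, Stream payload)
    {
        using (Stream body = request.GetRequestStream())
        {
            payload.CopyTo(body);
        } // closing the stream ends the request body

        using (var response = (HttpWebResponse)request.GetResponse())
        {
            return (int)response.StatusCode;
        }
    }
}
```

Creating and configuring the request objects performs no network I/O; what differs between the cases is when the I/O happens once Send runs, as described in the list above. Whether the runtime additionally buffers writes can also depend on the framework version and on the AllowWriteStreamBuffering property, which is outside the scope of this post.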

I hope you find this as useful as I did. Now, I can make intelligent decisions about my payload data, its size, how to buffer it, how to set headers, and when I can perform I/O synchronously as opposed to asynchronously. I really feel empowered with this knowledge.