Discussion:
HOWTO: Decompress gzip output from IIS
d***@pnwsoft.com
2006-09-28 04:16:26 UTC
After losing a day of my life to this problem, I thought I would share
my solution for anyone who is also trying to do this. This is for C++
inside an ActiveX control. It makes independent WinInet requests for
.aspx pages that return huge XML documents over a low-bandwidth but
secure network. I saw gzip give a 10:1 or better compression ratio.

Steps to decompress gzip output from IIS:

1) Set IIS to gzip the desired output
(http://technet2.microsoft.com/WindowsServer/f/?en/library/dde52be3-a66b-4770-80e3-9ff3c2de52f21033.mspx)

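2) Set gzip as an acceptable encoding (utf8 is a charset, not a content
coding, so it does not belong in this header):
if (!HttpAddRequestHeaders(hRequest, _T("Accept-Encoding: gzip\r\n"),
    (DWORD)-1L, HTTP_ADDREQ_FLAG_REPLACE | HTTP_ADDREQ_FLAG_ADD))
{
    // handle the error, e.g. via GetLastError()
}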

3) Verify IIS gzipped the output:
HttpQueryInfo(hRequest, HTTP_QUERY_CONTENT_ENCODING, &HBuff, &size,
&index)

4) Call InternetReadFile repeatedly and collect all the output into a
single memory buffer

5) Allocate a decompression memory buffer sufficient to hold the
decompressed data. I used OutputSize * 25 to be safe.

6) Download zlib and follow the readmes on how to use it inside Visual
Studio

7) Write a function similar to this:


int CTreeControl::InflateGZip(BYTE* a_pCompressed, ULONG a_nCompressedLength,
                              BYTE* a_pUncompressed, ULONG a_nUncompressedLength)
{
    // See zlib.h for all these objects
    int err;
    z_stream d_stream; /* decompression stream */

    // The first 10 bytes of an IIS response are the gzip header. This
    // assumes no optional header fields are present (FLG byte is 0).
    if (a_nCompressedLength <= 10)
        return 0;

    d_stream.zalloc = (alloc_func)0;
    d_stream.zfree = (free_func)0;
    d_stream.opaque = (voidpf)0;

    d_stream.next_in = a_pCompressed + 10; // skip the gzip header
    d_stream.avail_in = a_nCompressedLength - 10;
    d_stream.next_out = a_pUncompressed;
    d_stream.avail_out = a_nUncompressedLength;

    // Negative windowBits selects raw deflate (no zlib header), which is
    // what is left once the gzip header is skipped; gzio.c does the same.
    err = inflateInit2(&d_stream, -15);
    if (err != Z_OK)
        return 0;

    // Z_STREAM_END means the whole stream was inflated. Z_OK here would
    // mean the output buffer was too small or the input was truncated.
    err = inflate(&d_stream, Z_NO_FLUSH);
    ULONG nWritten = d_stream.total_out;
    int endErr = inflateEnd(&d_stream); // always release the stream state

    if (err == Z_STREAM_END && endErr == Z_OK)
        return nWritten;
    return 0;
}

8) Call that function with your compressed and uncompressed buffers and
BAM! you're done.
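
Step 4 can be sketched roughly as below. This is only an illustration,
not the code from the control: ReadAll and its readChunk parameter are
hypothetical names. readChunk is assumed to follow the InternetReadFile
contract, where the end of the response is signalled by a successful
call that reads zero bytes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: drains a reader into one contiguous buffer.
// readChunk(buf, cap, &got) is assumed to behave like InternetReadFile:
// it returns false on error, and signals the end of the data by
// returning true with got == 0.
template <typename ReadFn>
bool ReadAll(ReadFn readChunk, std::vector<unsigned char>& out)
{
    unsigned char chunk[4096];
    for (;;) {
        unsigned long got = 0;
        if (!readChunk(chunk, sizeof(chunk), &got))
            return false;              // read error
        if (got == 0)
            return true;               // end of the response
        out.insert(out.end(), chunk, chunk + got);
    }
}
```

With WinInet, readChunk would be a small wrapper that forwards its three
arguments to InternetReadFile along with hRequest. Growing a std::vector
avoids having to guess the size of the compressed response up front.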

Now I went around and around with using streams and doing all the
things zlib.h told me to do, but NONE of it worked for IIS output.
Whatever decompress / inflate routine I used would return -3
(Z_DATA_ERROR). I dove into all the RFCs and header definitions and it
still didn't make sense. I finally wrote my output to a file, compiled
in gzio.c, and watched what it did with the actual data, stripped away
all the CRC stuff, and I was left with the function above.
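
A likely explanation for the Z_DATA_ERROR results: the plain inflate
routines expect the zlib wrapper (RFC 1950), while the response carries
the gzip wrapper (RFC 1952). The fixed 10-byte skip only holds while the
FLG byte is 0. A rough sketch of computing the real header length from
the RFC 1952 layout follows; GzipHeaderLength is a hypothetical helper
name, not part of zlib.

```cpp
#include <cstddef>

// Returns the number of bytes occupied by the gzip header (RFC 1952),
// or 0 if the buffer does not start with a complete, valid header.
std::size_t GzipHeaderLength(const unsigned char* p, std::size_t n)
{
    const unsigned FHCRC = 0x02, FEXTRA = 0x04, FNAME = 0x08, FCOMMENT = 0x10;

    // Fixed part: ID1 ID2 CM FLG MTIME(4) XFL OS
    if (n < 10 || p[0] != 0x1f || p[1] != 0x8b || p[2] != 8)
        return 0;
    const unsigned flg = p[3];
    if (flg & 0xE0)
        return 0;                      // reserved bits must be zero
    std::size_t pos = 10;

    if (flg & FEXTRA) {                // 2-byte little-endian length + data
        if (n - pos < 2) return 0;
        std::size_t xlen = p[pos] | (static_cast<std::size_t>(p[pos + 1]) << 8);
        pos += 2;
        if (n - pos < xlen) return 0;
        pos += xlen;
    }

    // FNAME and FCOMMENT are zero-terminated strings, in that order.
    const unsigned stringFlags[2] = { FNAME, FCOMMENT };
    for (int i = 0; i < 2; ++i) {
        if (flg & stringFlags[i]) {
            while (pos < n && p[pos] != 0) ++pos;
            if (pos == n) return 0;    // terminator missing
            ++pos;                     // skip the terminator
        }
    }

    if (flg & FHCRC) {                 // 2-byte header CRC
        if (n - pos < 2) return 0;
        pos += 2;
    }
    return pos;
}
```

The return value would replace the literal 10 when setting next_in and
avail_in. zlib 1.2 and later can also parse the gzip wrapper on its own
if inflateInit2 is given 16 + MAX_WBITS instead of -15, in which case no
manual skipping is needed.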

I know this solution is not elegant at all, doesn't do CRC, and uses too
much memory, but it DOES work. My hope is that those of you who require
less memory or more buffer overrun safety can use this as a starting
point.
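
For anyone picking this up as a starting point, both the missing CRC
check and the oversized buffer can be addressed from the 8-byte gzip
trailer (RFC 1952): a little-endian CRC-32 of the uncompressed data,
followed by ISIZE, the uncompressed length modulo 2^32. The sketch below
assumes the buffer holds one complete gzip member; all function names
are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Reads a 32-bit little-endian value.
static std::uint32_t ReadLE32(const unsigned char* p)
{
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Bitwise CRC-32 with the reflected polynomial 0xEDB88320 used by gzip.
std::uint32_t Crc32(const unsigned char* data, std::size_t n)
{
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < n; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

// Reads ISIZE from the last 4 bytes of the response. Returns false if
// the buffer is too short to hold a header and a trailer.
bool GzipExpectedSize(const unsigned char* gz, std::size_t n,
                      std::uint32_t* size)
{
    if (n < 18) return false;          // 10-byte header + 8-byte trailer
    *size = ReadLE32(gz + n - 4);
    return true;
}

// Checks inflated data against the CRC-32 and ISIZE in the trailer.
bool GzipTrailerMatches(const unsigned char* gz, std::size_t n,
                        const unsigned char* out, std::size_t outLen)
{
    if (n < 18) return false;
    return ReadLE32(gz + n - 8) == Crc32(out, outLen)
        && ReadLE32(gz + n - 4) == static_cast<std::uint32_t>(outLen);
}
```

GzipExpectedSize could size the output buffer in step 5 instead of a
fixed multiplier, though the value comes off the network and should be
capped to a sane maximum before allocating. zlib also ships its own
crc32() function, which could stand in for the bitwise loop here.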
Vladimir Scherbina
2006-09-28 11:23:00 UTC
IIS is not my specialty, so I can only offer hints.

The problem may have several causes. The first: you are using the wrong
version of zlib. I ran into the same problem myself, so it may really be
what is costing you days now. Make sure you decompress correctly. Put a
small text file on the web server and decompress it on the client side.
If the results differ, tell us:

- what was on the server
- what you received
- what you received after decompression

Another problem might be streaming. IIS can compress data in chunks; in
that case, concatenating all the chunks and decompressing the resulting
buffer would not work.
--
Vladimir