Stream-Based Compression
PHP’s request-based architecture does not necessarily make it the
best choice for processing very large amounts of data. Stream-based
data processing, however, works around this limitation by not
requiring all data to be loaded into memory before processing it.
Two examples of stream-based data processing that have been
available since PHP 5.1.2 are xmlreader and xmlwriter. Instead of
requiring a whole XML document to
be loaded into memory first, they are capable of consuming (or
producing, respectively) an XML document in a stream-based
fashion.
Compressing and decompressing, however, which potentially also
have to deal with large amounts of data, did not work on streams.
This has changed with PHP 7. To make use of this new feature, you
will need to have the zlib extension installed.
PHP 7 has four new functions, deflate_init(), deflate_add(),
inflate_init(), and inflate_add(). Let us try to compress some data:
$context = deflate_init(ZLIB_ENCODING_GZIP);
$data = '';
$data .= deflate_add($context, 'data ...', ZLIB_NO_FLUSH);
$data .= deflate_add($context, '... more data ...', ZLIB_NO_FLUSH);
$data .= deflate_add($context, '... even more data', ZLIB_FINISH);
var_dump($data);
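To compress (or decompress) data, we need a context, which in PHP 7
is technically just a PHP resource (since PHP 8.0 it is a
DeflateContext or InflateContext object). When creating the context,
we need to specify the encoding of the compressed data. We do this by
passing one of the following constants: ZLIB_ENCODING_GZIP for the
gzip format, ZLIB_ENCODING_DEFLATE for the zlib format, or
ZLIB_ENCODING_RAW for a raw deflate stream without header or
checksum. All three use the same deflate algorithm and differ only in
how the compressed data is wrapped.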
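To make the difference between the three encodings concrete, here is a minimal sketch. The helper compressWith() is a hypothetical name introduced only for this illustration; the one-shot functions gzdecode(), gzuncompress(), and gzinflate() are the long-standing zlib functions that read the gzip, zlib, and raw formats respectively.

```php
<?php
// Hypothetical helper: compress $input in a single step with the given encoding.
function compressWith(int $encoding, string $input): string
{
    $context = deflate_init($encoding);

    return deflate_add($context, $input, ZLIB_FINISH);
}

$input = str_repeat('stream-based compression ', 100);

$gzip = compressWith(ZLIB_ENCODING_GZIP, $input);
$zlib = compressWith(ZLIB_ENCODING_DEFLATE, $input);
$raw  = compressWith(ZLIB_ENCODING_RAW, $input);

// A gzip stream always starts with the magic bytes 0x1f 0x8b.
var_dump(bin2hex(substr($gzip, 0, 2)) === '1f8b');

// Each encoding can be read back by the matching one-shot function.
var_dump(gzdecode($gzip) === $input);
var_dump(gzuncompress($zlib) === $input);
var_dump(gzinflate($raw) === $input);

// The raw stream is the shortest because it carries no header or checksum.
printf("gzip: %d, zlib: %d, raw: %d bytes\n", strlen($gzip), strlen($zlib), strlen($raw));
```

Which encoding to choose mostly depends on who reads the data: gzip for files and HTTP, the zlib format for many binary protocols, and raw deflate when the surrounding format already supplies its own framing.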
Optionally, we can pass an array of options as the second parameter
to deflate_init() if we need more control over how the data is
processed and compressed. The supported keys are level, memory,
window, strategy, and dictionary.
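As a sketch of what such additional parameters can look like, the following example passes an options array to deflate_init(). The helper compressWithOptions() and the sample input are assumptions made for this illustration; the option keys level and memory are the ones the zlib extension documents.

```php
<?php
// Hypothetical helper: compress $input in a single step using the given options.
function compressWithOptions(string $input, array $options): string
{
    $context = deflate_init(ZLIB_ENCODING_DEFLATE, $options);

    return deflate_add($context, $input, ZLIB_FINISH);
}

$input = str_repeat('The quick brown fox jumps over the lazy dog. ', 200);

// Level 1 favours speed, level 9 favours size, level 0 stores without compressing.
$fast   = compressWithOptions($input, ['level' => 1]);
$small  = compressWithOptions($input, ['level' => 9, 'memory' => 9]);
$stored = compressWithOptions($input, ['level' => 0]);

printf(
    "input: %d, level 1: %d, level 9: %d, level 0: %d bytes\n",
    strlen($input),
    strlen($fast),
    strlen($small),
    strlen($stored)
);
```

Whatever options are used for compression, the result stays a valid zlib stream, so the reader does not need to know the level that was chosen.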
The second parameter passed to deflate_add() is the data to
compress. The third parameter is the flush mode, which tells zlib
when to emit compressed output. To achieve the most efficient
compression results, you should use ZLIB_NO_FLUSH in all but the
last call. With the last call, you need to pass ZLIB_FINISH, which
completes the compressed stream.
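The flush modes matter most when the input does not fit into memory. The following is a minimal sketch of compressing one stream into another chunk by chunk. The function compressStream() and its chunk size are hypothetical, and in-memory streams stand in for what would normally be large files.

```php
<?php
// Hypothetical helper: gzip-compress everything readable from $in into $out,
// reading at most $chunkSize bytes at a time.
function compressStream($in, $out, int $chunkSize = 8192)
{
    $context = deflate_init(ZLIB_ENCODING_GZIP);

    while (!feof($in)) {
        $chunk = fread($in, $chunkSize);
        if ($chunk === false) {
            throw new RuntimeException('Could not read from input stream');
        }
        // With ZLIB_NO_FLUSH zlib may buffer input, so the return value can be empty.
        fwrite($out, deflate_add($context, $chunk, ZLIB_NO_FLUSH));
    }

    // An empty final call with ZLIB_FINISH emits the buffered output and the trailer.
    fwrite($out, deflate_add($context, '', ZLIB_FINISH));
}

// In-memory streams stand in for large files here.
$payload = str_repeat("line of log data\n", 10000);

$in = fopen('php://memory', 'w+');
fwrite($in, $payload);
rewind($in);

$out = fopen('php://memory', 'w+');
compressStream($in, $out);

rewind($out);
$compressed = stream_get_contents($out);
printf("%d bytes in, %d bytes out\n", strlen($payload), strlen($compressed));
```

Only one chunk of input and the corresponding output are held in memory at any time, which is the point of the incremental API.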
The output of the above code is binary data that is, unsurprisingly, not human-readable. We can, however, decompress it again:
$compressContext = deflate_init(ZLIB_ENCODING_DEFLATE);
$compressedData = '';
$compressedData .= deflate_add(
    $compressContext, 'data ...', ZLIB_NO_FLUSH
);
$compressedData .= deflate_add(
    $compressContext, '... more data ...', ZLIB_NO_FLUSH
);
$compressedData .= deflate_add(
    $compressContext, '... even more data', ZLIB_FINISH
);
$unCompressContext = inflate_init(ZLIB_ENCODING_DEFLATE);
$unCompressedData = inflate_add(
    $unCompressContext, $compressedData, ZLIB_NO_FLUSH
);
// Pass an empty string, not null: the data parameter expects a string.
$unCompressedData .= inflate_add(
    $unCompressContext, '', ZLIB_FINISH
);
print $unCompressedData;
Luckily – or unsurprisingly – the result is:
data ...... more data ...... even more data
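In the example above the compressed data is passed to inflate_add() in one piece. As a sketch of the incremental case, the following code feeds the compressed data to inflate_add() a few bytes at a time, as if it arrived from a file or a network connection. The helper inflateInChunks() is a hypothetical name, and gzcompress() is used here only as a quick way to produce zlib-encoded input.

```php
<?php
// Hypothetical helper: decompress zlib-encoded data piece by piece.
function inflateInChunks(string $compressed, int $chunkSize = 16): string
{
    $context = inflate_init(ZLIB_ENCODING_DEFLATE);
    $result = '';

    foreach (str_split($compressed, $chunkSize) as $chunk) {
        // ZLIB_SYNC_FLUSH returns whatever output is available so far.
        $piece = inflate_add($context, $chunk, ZLIB_SYNC_FLUSH);
        if ($piece === false) {
            throw new RuntimeException('Compressed data is corrupt');
        }
        $result .= $piece;
    }

    return $result;
}

$original = 'data ...... more data ...... even more data';
$compressed = gzcompress($original);

print inflateInChunks($compressed, 8);
```

In a real application each decompressed piece would typically be written out or processed right away instead of being collected in a string.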