Compress and Create a Byte Array Using GZip

1. Overview

The GZIP format is a file format used for data compression. In Java, the GZIPInputStream and GZIPOutputStream classes of the java.util.zip package implement this format.

In this tutorial, we’ll learn how to compress data using GZIP in Java. Also, we’ll look at how we can write the compressed data into a byte array.

2. The GZIPOutputStream Class

The GZIPOutputStream class compresses data and writes it to an underlying output stream.

2.1. Object Instantiation

We can use the constructor to create an object of the class:

ByteArrayOutputStream os = new ByteArrayOutputStream();
GZIPOutputStream gzipOs = new GZIPOutputStream(os);

Here, we pass a ByteArrayOutputStream object to the constructor. As a result, we can later get the compressed data in a byte array using the toByteArray() method.

Instead of ByteArrayOutputStream, we can provide other OutputStream implementations, such as:

  • FileOutputStream: for storing the data in a file
  • ServletOutputStream: for transmitting the data over the network

In both cases, the compressed data is sent to its destination as it is produced instead of being collected in memory.
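To illustrate the file case, here is a minimal sketch that wraps a FileOutputStream. The GzipToFileSketch class and its compressToFile() method are hypothetical names introduced only for this example:

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Path;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not part of the article's GZip class.
class GzipToFileSketch {

    // Compresses data and stores the result in the target file,
    // replacing any existing content.
    static void compressToFile(byte[] data, Path target) throws IOException {
        try (FileOutputStream fileOs = new FileOutputStream(target.toFile());
             GZIPOutputStream gzipOs = new GZIPOutputStream(fileOs)) {
            gzipOs.write(data, 0, data.length);
        } // gzipOs is closed first (completing the GZIP data), then fileOs
    }
}
```

Unlike the ByteArrayOutputStream variant, nothing is kept in memory here beyond the stream's internal buffers.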

2.2. Compress Data

The write() method performs data compression:

byte[] buffer = "Sample Text".getBytes();
gzipOs.write(buffer, 0, buffer.length);

The write() method compresses the content of the buffer byte array and writes it to the wrapped output stream.

Besides the buffer byte array, write() takes two more parameters: offset and length. Together, they define a range of bytes inside the array, starting at index offset and spanning length bytes. So, we can use them to write only part of the buffer instead of all of it.
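As a small sketch of the offset and length parameters, the following hypothetical helper (GzipRangeSketch and compressRange() are names made up for this example) compresses only part of a buffer:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not part of the article's GZip class.
class GzipRangeSketch {

    // Compresses only the bytes buffer[offset] .. buffer[offset + length - 1].
    static byte[] compressRange(byte[] buffer, int offset, int length) throws IOException {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        try (GZIPOutputStream gzipOs = new GZIPOutputStream(os)) {
            gzipOs.write(buffer, offset, length);
        }
        return os.toByteArray();
    }
}
```

For the "Sample Text" buffer, calling compressRange(buffer, 7, 4) compresses only the four bytes of "Text", while compressRange(buffer, 0, buffer.length) compresses the whole array.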

Finally, to complete the data compression, we call close():

gzipOs.close();

The close() method writes any remaining compressed data, followed by the GZIP trailer, and then closes the stream. So, it's important to call close(); otherwise, the output is incomplete and can't be decompressed correctly.
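To see why close() matters, we can compare how many bytes have reached the underlying stream before and after the call. This is only a sketch, and GzipCloseSketch and sizesBeforeAndAfterClose() are hypothetical names:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not part of the article's GZip class.
class GzipCloseSketch {

    // Returns { bytes in the underlying stream before close(), bytes after close() }.
    static int[] sizesBeforeAndAfterClose(byte[] data) throws IOException {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        GZIPOutputStream gzipOs = new GZIPOutputStream(os);
        gzipOs.write(data, 0, data.length);
        int before = os.size(); // data may still sit in the compressor's buffer
        gzipOs.close();         // flushes the remaining data and the trailer
        int after = os.size();
        return new int[] { before, after };
    }
}
```

For a short input such as "Sample Text", the first value is smaller than the second, because part of the output is only written on close(). If the underlying stream has to stay open, GZIPOutputStream also offers finish(), which completes the compressed data without closing that stream.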

3. Getting the Compressed Data in a Byte Array

We’ll create a utility method for data compression using GZIP. We’ll also see how we can get a byte array with the compressed data.

3.1. Compress Data

Let’s create the gzip() method that compresses data in the GZIP format:

private static final int BUFFER_SIZE = 512;

public static void gzip(InputStream is, OutputStream os) throws IOException {
    // try-with-resources closes gzipOs (and with it os) even if copying fails
    try (GZIPOutputStream gzipOs = new GZIPOutputStream(os)) {
        byte[] buffer = new byte[BUFFER_SIZE];
        int bytesRead;
        // read() returns -1 once the end of the input stream is reached
        while ((bytesRead = is.read(buffer)) > -1) {
            gzipOs.write(buffer, 0, bytesRead);
        }
    }
}

In the above method, first, we create a new GZIPOutputStream instance. Then, we start copying data from the is input stream, using the buffer byte array.

Notably, we keep reading bytes until we get the -1 return value. The read() method returns -1 when we reach the end of the stream.
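As an alternative to the manual read loop, and assuming Java 9 or later, the copy can be delegated to InputStream.transferTo(). The GzipTransferSketch class below is a hypothetical variant, not the article's implementation:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical variant for illustration; requires Java 9+ for transferTo().
class GzipTransferSketch {

    static void gzip(InputStream is, OutputStream os) throws IOException {
        try (GZIPOutputStream gzipOs = new GZIPOutputStream(os)) {
            // Reads is until the end of the stream and writes every byte to gzipOs
            is.transferTo(gzipOs);
        }
    }
}
```

The behavior is meant to match the loop-based method: all bytes of the input stream are compressed, and the GZIP stream is closed at the end.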

3.2. Get a Byte Array With the Compressed Data

Let’s compress a string and write the result into a byte array. We’ll use the gzip() method that we created previously:

String payload = "This is a sample text to test the gzip method. Have a nice day!";
ByteArrayOutputStream os = new ByteArrayOutputStream();
gzip(new ByteArrayInputStream(payload.getBytes()), os);
byte[] compressed = os.toByteArray();

Here, we provide input and output streams to the gzip() method. We wrap the payload value inside a ByteArrayInputStream object. After that, we create an empty ByteArrayOutputStream where gzip() writes the compressed data.

Finally, after we invoke gzip(), we get the compressed data using the toByteArray() method.
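To check that such a byte array really holds our data, we can decompress it again with GZIPInputStream. The following is a minimal sketch; the GunzipSketch class and its gunzip() method are hypothetical and mirror the structure of gzip():

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

// Hypothetical helper for illustration; not part of the article's GZip class.
class GunzipSketch {

    // Decompresses a GZIP byte array back into the original bytes.
    static byte[] gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream gzipIs = new GZIPInputStream(new ByteArrayInputStream(compressed));
             ByteArrayOutputStream os = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[512];
            int bytesRead;
            // read() returns -1 at the end of the compressed stream
            while ((bytesRead = gzipIs.read(buffer)) > -1) {
                os.write(buffer, 0, bytesRead);
            }
            return os.toByteArray();
        }
    }
}
```

Passing the compressed array to gunzip() should give back the bytes of the original payload.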

4. Test

Before testing our code, let's add the gzip() method to a class named GZip. Now, we're ready to test our code with a unit test:

@Test
void whenCompressingUsingGZip_thenGetCompressedByteArray() throws IOException {
    String payload = "This is a sample text to test method gzip. The gzip algorithm will compress this string. "
        + "The result will be smaller than this string.";
    ByteArrayOutputStream os = new ByteArrayOutputStream();
    GZip.gzip(new ByteArrayInputStream(payload.getBytes()), os);
    byte[] compressed = os.toByteArray();
    assertTrue(payload.getBytes().length > compressed.length);
    assertEquals("1f", Integer.toHexString(compressed[0] & 0xFF));
    assertEquals("8b", Integer.toHexString(compressed[1] & 0xFF));
}

In this test, we compress a string value. We convert the string into a ByteArrayInputStream and supply it to the gzip() method. Also, the output data is written to a ByteArrayOutputStream.

Furthermore, the test is successful if two conditions are true:

  1. the compressed data is smaller than the uncompressed data
  2. the compressed byte array starts with the bytes 1f 8b

Regarding the second condition, every GZIP file starts with the fixed two-byte value 1f 8b, as required by the GZIP file format.
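Outside of a test, the same header check can be written against the GZIPInputStream.GZIP_MAGIC constant. This is a sketch only, and GzipMagicSketch and startsWithGzipMagic() are hypothetical names:

```java
import java.util.zip.GZIPInputStream;

// Hypothetical helper for illustration; not part of the article's GZip class.
class GzipMagicSketch {

    // Returns true if the array begins with the GZIP magic bytes 1f 8b.
    static boolean startsWithGzipMagic(byte[] data) {
        if (data == null || data.length < 2) {
            return false;
        }
        // GZIP_MAGIC is 0x8b1f: the two header bytes read in little-endian order
        int header = (data[0] & 0xFF) | ((data[1] & 0xFF) << 8);
        return header == GZIPInputStream.GZIP_MAGIC;
    }
}
```

Such a check only inspects the first two bytes, so it tells us that the data looks like GZIP, not that the rest of the stream is valid.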

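When we run the unit test, it verifies that both conditions are true. Note that the first condition holds because our payload is long and repetitive enough. A very short input can actually grow, since the GZIP format adds a header and a trailer of at least 18 bytes in total.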

5. Conclusion

In this article, we learned how to get the compressed data in a byte array when we use the GZIP file format in the Java language. To do so, we created a utility method for compression. Finally, we tested our code.

As always, the full source code of our examples can be found over on GitHub.

       
