Data compression is a process that reduces the volume of digital data. This lowers the amount of storage space required and shortens data transfer times. In telecommunications, the compression of messages from a source by a sender is called source coding.
Essentially, data compression attempts to remove redundant information. To do this, the data is converted into a representation in which all – or at least most – of the information can be expressed in a shorter form. This conversion is performed by an encoder and is called compression; the reverse process is called decompression.
Lossless compression, lossless encoding, or redundancy reduction means that the original data can be reconstructed exactly from the compressed data. This is necessary, for example, when compressing executable program files.
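As a simple illustration of lossless redundancy reduction, the sketch below uses run-length encoding to store repeated bytes as (count, value) pairs; the scheme and function names are chosen for illustration only and do not correspond to any particular compression program, but the round trip recovers the original data exactly.

```python
# Minimal sketch of lossless compression via run-length encoding (RLE).
# Illustrative only: real formats use more sophisticated schemes.

def rle_encode(data: bytes) -> bytes:
    """Replace runs of identical bytes with (count, byte) pairs."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)

def rle_decode(encoded: bytes) -> bytes:
    """Expand (count, byte) pairs back into the original byte sequence."""
    out = bytearray()
    for i in range(0, len(encoded), 2):
        out += bytes([encoded[i + 1]]) * encoded[i]
    return bytes(out)

original = b"AAAAAABBBCCCCCCCCCD"
packed = rle_encode(original)
assert rle_decode(packed) == original    # lossless: the original is recovered exactly
print(len(original), "->", len(packed))  # 19 -> 8 bytes
```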
---
In the case of lossy compression or irrelevance reduction, the original data can usually no longer be recovered exactly from the compressed data; part of the information is lost. The algorithms try, as far as possible, to omit only “unimportant” information. Such methods are often used for image and video compression as well as audio data compression (see Basic Details of MP3 Audio Format).
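A minimal sketch of irrelevance reduction, assuming 16-bit audio samples and a uniform quantization step of 256: real lossy codecs such as MP3 work on frequency-domain data with psychoacoustic models, but the toy example shows the essential point that discarded precision cannot be recovered.

```python
# Toy example of lossy compression: coarse quantization of sample values.
# The step size of 256 is an arbitrary assumption for the illustration.

def quantize(samples, step=256):
    # Keep only the coarse value of each sample; the fine detail is dropped.
    return [s // step for s in samples]

def dequantize(coarse, step=256):
    # Reconstruct approximate samples; the discarded detail is gone for good.
    return [c * step for c in coarse]

samples = [12345, 12350, -20000, 70]
approx = dequantize(quantize(samples))
print(approx)  # [12288, 12288, -20224, 0] – close to the input, but not identical
```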

How Data Compression Works
Data compression takes place in most long-distance transmissions of digital data today. It helps to save resources when transmitting or storing data by transforming the data into a form that is as compact as possible for the given application. Only data that contains some form of redundancy can be compressed. If there is no redundancy – for example, in the case of completely random data – lossless compression is in principle impossible because of Kolmogorov complexity. Likewise, the pigeonhole principle rules out any method that could losslessly compress every possible file. Lossy compression, on the other hand, is always possible: an algorithm ranks the data according to how important it is and then discards the “unimportant” parts. By shifting the “keep threshold” along this importance ranking, more and more components can be discarded.
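The “keep threshold” idea can be sketched as follows; ranking components simply by magnitude and the concrete threshold values are assumptions made for the example, not the rules of any specific codec.

```python
# Illustrative sketch of lossy compression with a movable "keep threshold":
# components are ranked by importance (here: absolute magnitude) and
# everything below the threshold is discarded.

coefficients = [9.7, -0.1, 3.2, 0.05, -4.8, 0.3, 0.02, 1.1]

def keep_above(coeffs, threshold):
    # Store only (index, value) pairs of the "important" components.
    return [(i, c) for i, c in enumerate(coeffs) if abs(c) >= threshold]

print(keep_above(coefficients, 1.0))   # keeps 4 of 8 values
print(keep_above(coefficients, 0.25))  # lowering the threshold keeps 5 of 8
```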
Data compression requires computational effort on both the sender and the receiver side in order to compress and restore the data. However, this effort differs greatly between compression methods. Deflate and LZO, for example, are very fast at both compression and decompression, whereas LZMA achieves particularly high compression – and thus the smallest possible amount of data – at great computational cost, yet the compressed data can still be converted back to its original form very quickly. This forces a different choice of compression method depending on the area of application. Compression methods are therefore optimized for data throughput, energy consumption, or data reduction, and compression does not always aim for the most compact representation possible.
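The trade-off can be observed with two codecs from Python's standard library, zlib (an implementation of Deflate) and lzma (LZMA); the absolute numbers depend on hardware and input data, so the sketch only illustrates the general pattern of fast-but-larger versus slow-but-smaller.

```python
# Rough comparison of compression speed and output size for Deflate vs. LZMA.
# Results vary with hardware and data; the test input is highly redundant.

import lzma
import time
import zlib

data = b"the quick brown fox jumps over the lazy dog " * 20000

for name, compress in [("Deflate (zlib)", zlib.compress),
                       ("LZMA (lzma)", lzma.compress)]:
    start = time.perf_counter()
    packed = compress(data)
    elapsed = time.perf_counter() - start
    print(f"{name}: {len(data)} -> {len(packed)} bytes in {elapsed:.3f} s")
```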
The type of data is also relevant to the choice of compression method. For example, the two compression programs commonly used on Unix-like operating systems, gzip and bzip2, differ in that gzip compresses data in blocks of only 32,000 bytes, while bzip2 uses a block size of 900,000 bytes. Redundant data is compressed only within these blocks.
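A rough way to observe this block/window limit is to use Python's zlib and bz2 modules as stand-ins for gzip and bzip2: two identical chunks placed 64 KiB apart cannot be matched within Deflate's window of roughly 32,000 bytes, but both copies fall into a single bzip2 block. The exact output sizes are illustrative and will vary.

```python
# Sketch: redundancy outside the window/block cannot be exploited.

import bz2
import os
import zlib

chunk = os.urandom(64 * 1024)   # 64 KiB of random (incompressible) data
data = chunk + chunk            # the second half repeats the first exactly

# Deflate: the repetition lies outside the ~32 KiB window, so the output
# typically stays close to the full 128 KiB input size.
print("Deflate:", len(zlib.compress(data, 9)))

# bzip2: both copies fit into one block (up to 900,000 bytes at level 9),
# so the repetition is visible to the compressor and the output is
# typically noticeably smaller.
print("bzip2:  ", len(bz2.compress(data, 9)))
```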