Difference between revisions of "Compression"

From TRCCompSci - AQA Computer Science

Revision as of 15:43, 3 January 2017

Definition

Data compression is the process of reducing the size of a file so that it takes up less storage space. There are many different compression techniques.

Lossy Compression

Lossy compression is a compression technique that decreases file size by discarding data that is judged unnecessary. This means that the original file cannot be recreated exactly. For example, lossy compression of an image creates a new image which is similar to the original but has reduced quality. Another example is reducing the file size of a sound file by lowering its bitrate.

Lossy methods save storage space, but they are not always the best choice because some of the data is thrown away, so lossy compression is unsuitable if the original file is needed later. Lossy compression cannot be used on text files or on binary files such as programs, because all of the data is needed to convey the correct meaning. People who would need to keep the original file include photographers, audio producers and printing firms. They could still produce lossy compressed copies for sample purposes or as draft prints.

Lossy Methods

Some lossy methods delete sounds which cannot be heard, either because of their frequency or because another sound drowns them out. Images can be compressed more heavily in the background than in the foreground: the viewer focuses on the foreground, so the compression is less likely to be noticed in the background. Images can also have adjacent colours merged together, much as the human eye blends them. For example, a black pixel next to a white pixel is seen as grey from a distance, so the pair can be stored as two grey pixels.
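
The idea of merging adjacent colours can be sketched in a few lines of Python. This is only an illustration, not how JPEG or any real format works: it assumes a row of greyscale pixels where 0 is black and 255 is white, and the function names merge_adjacent and expand are made up for this example.

```python
def merge_adjacent(pixels):
    """Average each pair of neighbouring greyscale pixels (0 = black, 255 = white)."""
    merged = []
    for i in range(0, len(pixels) - 1, 2):
        merged.append((pixels[i] + pixels[i + 1]) // 2)
    if len(pixels) % 2:
        # An odd pixel at the end has no neighbour, so keep it as it is.
        merged.append(pixels[-1])
    return merged


def expand(merged, original_length):
    """Rebuild a row of the original length by repeating each stored value."""
    row = []
    for value in merged:
        row.extend([value, value])
    return row[:original_length]


original = [0, 255, 255, 255]           # black, white, white, white
stored = merge_adjacent(original)       # half as many values to store
restored = expand(stored, len(original))
print(stored)    # [127, 255]
print(restored)  # [127, 127, 255, 255]
```

The restored row is not the same as the original: the black and white pixels have both become grey, and there is no way to get them back. That is what makes the method lossy.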

Lossy compression formats:

  • JPEG
  • MPEG-1
  • MP3

Lossless Compression

Lossless compression is a compression technique that decreases file size while keeping all of the data. This means there is no loss in quality, and the original file can be recreated exactly as it was prior to compression.

This is possible because of repeating patterns in the data.
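
As a rough demonstration, the sketch below uses Python's built-in zlib module, which provides a lossless compression method. The sample data is an assumption chosen for this example: one block of text that repeats the same phrase, and one block of pseudo-random bytes of the same length that has almost no repeating patterns.

```python
import random
import zlib

# Data with lots of repeating patterns.
repetitive = b"Computer Science " * 100

# Data of the same length with almost no repeating patterns.
# A fixed seed is used so the example gives the same bytes every time.
rng = random.Random(0)
varied = bytes(rng.randrange(256) for _ in range(len(repetitive)))

packed_repetitive = zlib.compress(repetitive)
packed_varied = zlib.compress(varied)

print(len(repetitive), "bytes ->", len(packed_repetitive), "bytes")
print(len(varied), "bytes ->", len(packed_varied), "bytes")

# Lossless: decompressing gives back exactly the original data.
print(zlib.decompress(packed_repetitive) == repetitive)
```

The repetitive data should shrink a great deal, while the varied data should hardly shrink at all. In both cases decompressing returns every byte of the original.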

Examples of lossless compression methods include Run Length Encoding and Dictionary Based Methods. Run Length Encoding replaces runs of repeating pixels or codes with a single value and a count. Dictionary Based Methods rely on patterns within a file and are more effective with larger files. Each pattern is given an ID number.

Run Length Encoding

This method of lossless compression counts the items of data, such as pixels, that are repeated consecutively. For example, if a picture contained 3 red pixels one after the other, rather than storing each pixel individually, the file would instead store the pixel colour and the number of times it is repeated.

However, if a file does not contain many repetitions then this method of compression can actually increase file size, as a single pixel would be stored as its colour and then the information that it is repeated only once.
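
A minimal Python sketch of Run Length Encoding is shown here. It assumes each pixel is represented by a colour name and that the encoded form is a list of (colour, count) pairs; real file formats store this differently, and the function names rle_encode and rle_decode are made up for this example.

```python
def rle_encode(data):
    """Return a list of (value, run_length) pairs for consecutive repeats."""
    runs = []
    for item in data:
        if runs and runs[-1][0] == item:
            # Same value as the previous item, so extend the current run.
            runs[-1] = (item, runs[-1][1] + 1)
        else:
            # A different value starts a new run of length 1.
            runs.append((item, 1))
    return runs


def rle_decode(runs):
    """Rebuild the original list from (value, run_length) pairs."""
    data = []
    for value, count in runs:
        data.extend([value] * count)
    return data


row = ["red", "red", "red", "blue"]
encoded = rle_encode(row)
print(encoded)                     # [('red', 3), ('blue', 1)]
print(rle_decode(encoded) == row)  # True

# With no repetition, every pixel still needs a colour and a count.
print(rle_encode(["red", "blue", "green"]))
# [('red', 1), ('blue', 1), ('green', 1)]
```

In the last example three pixels have turned into six stored values, which is why Run Length Encoding can make a file larger when there are few repeats.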

Dictionary Based Methods

This is used when there are lots of repeating patterns of data.

For example, when writing a document about Computer Science, key phrases like "Computer Science" would be repeated throughout the document. Instead of storing the bit pattern for the phrase over and over, the compression method stores the phrase once in a dictionary with a reference number and stores the number in place of the phrase. This means that whenever the phrase is needed, the dictionary is looked up and the number is replaced with the phrase.

A disadvantage of this method is that additional data is needed to store the dictionary as well as the file.
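
The dictionary idea can be sketched in Python as follows. This is a simplified illustration, not a real dictionary method such as those used in common file formats: it assumes the phrases to replace are chosen by hand, that reference numbers are written as tokens like <0>, and that these tokens never appear in the original text. The function names dict_compress and dict_decompress are made up for this example.

```python
def dict_compress(text, phrases):
    """Replace each phrase with a numbered token; return (dictionary, new text)."""
    dictionary = {}
    for number, phrase in enumerate(phrases):
        token = f"<{number}>"          # assumed token format, e.g. <0>
        dictionary[token] = phrase
        text = text.replace(phrase, token)
    return dictionary, text


def dict_decompress(dictionary, text):
    """Look up each token in the dictionary and put the phrase back."""
    for token, phrase in dictionary.items():
        text = text.replace(token, phrase)
    return text


document = ("Computer Science is fun. I study Computer Science. "
            "Computer Science matters.")
dictionary, compressed = dict_compress(document, ["Computer Science"])

print(compressed)   # <0> is fun. I study <0>. <0> matters.
print(dictionary)   # {'<0>': 'Computer Science'}
print(dict_decompress(dictionary, compressed) == document)  # True
```

The compressed text is shorter than the document, but the dictionary has to be stored alongside it, so the saving only becomes worthwhile when phrases are repeated often.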

Difference between lossy and lossless compression

The main difference between lossy and lossless compression is that, when a file is compressed, lossy compression loses some of the original quality, whilst lossless compression retains all of it, hence the names "lossy" and "lossless".

However, lossy compression sometimes removes only information that is not needed. For example, it may remove some of the frequencies that cannot be heard by humans, so the reduced quality may not be noticed.