Difference between revisions of "History of Computers - Shannon-Fano Coding"

From SJS Wiki
Jump to: navigation, search
Line 21: Line 21:
  
 
5. http://www.binaryessence.com/dct/en000046.htm
 
5. http://www.binaryessence.com/dct/en000046.htm
 +
 +
-Sebastian Varma [[User:Svarma|Svarma]] ([[User talk:Svarma|talk]]) 20:36, 14 September 2015 (CDT)

Revision as of 20:36, 14 September 2015

Introduction

Shannon-Fano coding was the first form of lossless compression, compression in which the quality of the data being compressed is not altered. It was first revealed in “A Mathematical Theory of Communication” in 1948 in the Bell System Technical Journal by Claude Shannon.(Link numbers 1 and 2)

Overview

Consider the following page which contains 100 characters that use an alphabet that only uses A,B,C,D, and E in which A’s frequency = 30%, B’s frequency = 26%, C’s frequency = 24%, D’s frequency = 11%, and E’s frequency= 9%. First, the page is divided into two sets based on frequency (the highest fifty percent or more go in one category and the lower ones go in the other). Set 1, composed of A and B, is issued a 0 as the first digit, while set 2, composed of C, D, and E, is issued a 1 as the first digit. This step is then repeated among sets until each letter is in its own set. A is assigned another 0 digit while B is assigned a 1 digit. C is assigned a 0 digit while D and E are assigned another 1 digit. Then, D is issued a 0 while E is issued another 1 digit. This results in each letter having a binary code (A’s code is 00, B’s code is 01, C’s code is 10, D’s code is 110, and E’s code is 111).(Link number 3) These codes can be used instead of the three bits that would usually be needed for each letter.

300px-ShannonCodeAlg.svg.png (Link number 4)

This saves us a lot of space. Using linear coding to store this page would require 300 bits, while using Shannon-Fano coding would only require 218 bits.(Link number 5) This reduces the size of the document by about 28%. While this might not seem like a lot, when there is a lot of data, 28% saved could mean gigabytes or even terabytes worth of data saved.

Significance

While it is not the most optimal form of lossless compression, Shannon-Fano coding was the first form of lossless compression, a field of information theory that is becoming even more vital by the second with so much memory being used for all sorts of data. Lossless compression will play a large part in the future of computing, and Shannon-Fano coding pioneered it.

Links

1. https://en.wikipedia.org/wiki/Shannon%E2%80%93Fano_coding

2. http://www.britannica.com/topic/information-theory#ref710425

3. https://www.youtube.com/watch?v=dJCck1OgsIA

4. https://www.youtube.com/watch?v=dJCck1OgsIA

5. http://www.binaryessence.com/dct/en000046.htm

-Sebastian Varma Svarma (talk) 20:36, 14 September 2015 (CDT)