;; A Very Gentle Introduction to Information Theory : Part I ;; Entropy and Huffman Coding ;; What is the 'information content' of a random process? ;; We might think about a Victorian bookmaker transmitting horse racing results ;; over an expensive telegraph connection. A more modern example would be ;; streaming files over an internet connection. ;; To think about the essence of these things, let us choose very simple models ;; for both the random process and the channel over which the message is to be ;; sent. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; Setting up a model problem ;; Our communications channel will be a device over which we can transmit either ;; 0 or 1, paying a cost of £1 for every symbol. ;; Our random number generators will make arbitrary symbols from a known set, ;; one a second, with simple integer odds. ;; One example might produce the symbols A B and C, with A:B:C in a 2:1:1 ratio. ;; Sample output from this particular random number generator might be ;; BACAAAABCAACBCAACAAABAABCBAAACBC.... ;; And our challenge is to send a message down our channel that will allow our ;; friend at the other end to reconstruct the random stream. ;; We'd be interested in minimizing the cost, but not at the expense of accuracy. ;; If the random number generator is someone repeatedly tossing a fair coin, then we can say: (def fair-coin {:H 1 :T 1}) (defn random-stream [P] (let [pseq (vec (mapcat (fn[[k v]](repeat v k )) P))] (for [i (range)] (rand-nth pseq)))) ;; And if the fair coin were to come up with: (def coin-stream (random-stream fair-coin)) (take 20 coin-stream) ; (:T :H :H :T :H :H :T :H :T :T :T :T :H :T :T :T :T :H :T :T) ;; We might want to transmit the results by sending 1 for tails and 0 for heads (defn coin-coder [sq] (map #(if (= % :T) 0 1) sq)) (take 20 (coin-coder (random-stream fair-coin))) ; (1 0 0 1 0 0 0 0 1 0 0 1 0 0 1 1 1 0 1 0) ;; Our friend on the other end, with whom we have agreed this scheme, might decode like this: (defn coin-decoder [sq] (map #(if (= % 0) :T :H) sq)) (take 20 (coin-decoder (coin-coder coin-stream))) ; (:T :H :H :T :H :H :T :H :T ; :T :T :T :H :T :T :T :T :H ; :T :T) ;; And finally, the world might judge our scheme thus: (defn cost [encoder decoder message] (let [coded (encoder message)] (if (= (decoder coded) message) (count coded) :fail))) (cost coin-coder coin-decoder (take 200000 coin-stream)) ; £200000 ;; Under this scheme, our message gets through accurately, and we have spent ;; £200000 to transmit 200000 symbols. ;; If anyone can think of a better encoding scheme, please let me know. Until I ;; see a counter-example, I'll take the information content of a fair coin under ;; these conditions to be £1/symbol. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; A 'less random' random generator (def unfair-coin {:H 3 :T 1}) (def unfair-stream (random-stream unfair-coin)) (take 20 unfair-stream) ; (:T :H :H :T :H :H :H :H :H :H :H :H :H :H :T :H :H :H ; :H :H) ;; It seems as though there might be less information in the unfair coin tosses. ;; Why do I say this? ;; Well, memory is a kind of channel. You put something in, and later you get ;; something out, and it costs you something to hold the memory. ;; And the 20 tosses of the unfair coin above seem easier to remember than 20 ;; tosses of the fair coin would be. ;; It's quite hard to make this intuition precise, though. After all, all HT ;; sequences are still possible. But it does seem as though we should be able ;; to send these sorts of streams through our channel for less cost than the ;; fair coin streams, ON AVERAGE. ;; To think about how that might work, let's consider what the stream above ;; looks like when split into pairs. (take 20 (partition 2 unfair-stream)) ;;((:T :H) (:H :T) (:H :H) (:H :H) (:H :H) ;;(:H :H) (:H :H) (:T :H) (:H :H) (:H :H) ;;(:T :H) (:H :H) (:H :H) (:T :H) (:H :H) ;;(:T :H) (:H :T) (:T :H) (:T :H) (:T :H)) (frequencies (take 16000 (partition 2 unfair-stream))) ;; {(:T :H) 2981, (:H :T) 3083, (:H :H) 8964, (:T :T) 972} ;; Because the frequencies of the underlying coin are distorted 3:1, the ;; frequencies of the pairs are even more distorted, it looks like 1:3:3:9 ;; A little bird is telling me that the next random number generator we try to ;; code should be: (def unfair-pairs {:HH 9, :HT 3, :TH 3, :TT 1}) (def unfair-pair-stream (random-stream unfair-pairs)) (take 15 unfair-pair-stream ) ; (:HH :TH :HH :HH :HT :HT :HH :HT :HH :TH :TH :HH ; :HH :TH :HT) ;; How might we go about encoding this to send through our channel? ;; If we do it the obvious way: (defn pair-coder [sq] (mapcat #(case % :HH '(0 0) :HT '(0 1) :TH '(1 0) :TT '(1 1)) sq)) (take 20 (pair-coder unfair-pair-stream)) ; (0 0 1 0 0 0 0 0 0 1 0 1 0 0 0 1 0 0 1 0) (defn pair-decoder [sq] (map #(case % '(0 0) :HH '(0 1) :HT '(1 0) :TH '(1 1) :TT) (partition 2 sq))) (take 20 (pair-decoder (pair-coder unfair-pair-stream))) ; (:HH :TH :HH :HH :HT :HT :HH :HT :HH :TH :TH :HH :HH :TH :HT :HH :HH :HT :HH :TH) (cost pair-coder pair-decoder (take 200000 unfair-pair-stream)) ; £400000 ;; Then we can see that it costs £2 per symbol to transmit the data now, which ;; sounds about right, since we made each symbol out of a pair of the previous ;; symbols, so we could use them as a cheaper transmission method if they cost ;; any less. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; Sneaking more data down the telegraph ;; As Samuel Morse knew, when transmitting in binary, we should use shorter ;; sequences for more common letters. ;; Let's try HH -> 1, HT ->01 TH->001, TT-> 000 (defn variable-length-pair-coder [sq] (mapcat #(case % :HH '(1) :HT '(0 1) :TH '(0 0 1) :TT '(0 0 0)) sq)) (take 20 (variable-length-pair-coder unfair-pair-stream)) ; (1 0 0 1 1 1 0 1 0 1 1 0 1 1 0 0 1 0 0 1) ;; decoding this is trickier (defn variable-length-pair-decoder [sq] (lazy-seq (if-let [sq (seq sq)] (if (= (first sq) 1) (cons :HH (variable-length-pair-decoder (rest sq))) (if-let [sq2 (seq (rest sq))] (if (= (first sq2) 1) (cons :HT (variable-length-pair-decoder (rest sq2))) (if-let [sq3 (seq (rest sq2))] (if (= (first sq3) 1) (cons :TH (variable-length-pair-decoder (rest sq3))) (cons :TT (variable-length-pair-decoder (rest sq3))))))))))) ;; but it can be done: (take 20 (variable-length-pair-decoder (variable-length-pair-coder unfair-pair-stream))) ;;(:HH :TH :HH :HH :HT :HT :HH :HT :HH :TH :TH :HH :HH :TH :HT :HH :HH :HT :HH :TH) ;; and it saves on the number of transmitted symbols: (cost variable-length-pair-coder variable-length-pair-decoder (take 200000 unfair-pair-stream)) ; £337179 ;; For this particular stream, with this particular code, we're now only paying ;; £ 1.69 per symbol, as opposed to £2 per symbol for the naive code. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Blog Archive
-
▼
2011
(26)
-
▼
January
(10)
- Turning Exceptions into Return Values
- Finding Something in a Vector, Parsing CSV Files
- £500 if you can find me a job (version 1.0)
- K-means : An Algorithm for Clustering Data
- Cleaning Old Definitions from the REPL : shred-user
- take-while-unstable
- A Very Gentle Introduction to Information Theory: ...
- A Very Gentle Introduction to Information Theory: ...
- A Very Gentle Introduction to Information Theory :...
- A Very Gentle Introduction to Information Theory :...
-
▼
January
(10)
Search This Blog
Thursday, January 13, 2011
A Very Gentle Introduction to Information Theory : Setting the Scene
I've been reading David Mackay's wonderful book 'Information Theory, Inference, and Learning Algorithms', and writing a lot of clojure code as I go.
I thought I might try to explain the concept of Entropy, or Information Content, as it's used in Communications Theory.
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment