;; The Unexpected Appearance of Schlemiel, the Painter ;; I was doing some statistics one day, and I defined: ;; the average of a finite sequence (defn average [sq] (/ (reduce + sq) (count sq))) ;; and the square of a number (defn square [x] (* x x)) ;; and a way of forgetting about all the fiddly little digits at the end (defn twosf [x] (float (/ (Math/round (* x 100.0)) 100))) ;; but for the variance I was a little torn between: (defn variance-one [sq] (let [av (average sq)] (average (map #(square (- % av)) sq))))
;; and
(defn variance-two [sq] (let [sqdiff #(square (- % (average sq)))] (average (map sqdiff sq)))) ;; and (I have a regrettable weakness for the terse...) (defn variance-one-liner [sq] (average (map #(square (- % (average sq))) sq))) ;; but I was surprised when I noticed this: (let [s (repeatedly 1000 #(rand))] (twosf (reduce + s)) ;; just to force the sequence to be generated before timing things [(time (twosf (reduce + s))) (time (twosf (average s))) (time (twosf (variance-one s))) (time (twosf (variance-two s))) (time (twosf (variance-one-liner s)))]) ;; "Elapsed time: 0.535715 msecs" ;; "Elapsed time: 0.834523 msecs" ;; "Elapsed time: 1.417108 msecs" ;; "Elapsed time: 251.650722 msecs" ;; "Elapsed time: 248.196331 msecs" ;; [496.83 0.5 0.09 0.09 0.09] ;; It seems that all these functions are correct, in the sense that they are producing ;; correct-looking answers, and yet one of them is orders of magnitude faster. ;; What is going on here, and why?
You're mapping a function that contains average over sq, so it's recomputing the average multiple times, I suspect. Theoretically, the compiler should be able to figure out that average is a pure function with no side effects and only perform that calculation once, and is thus constant for the same inputs, but apparently it isn't doing that. In variance-one you're forcing that calculation into the av temporary, so it's only computed once. Ideally, the compiler would be smart enough to transform variance-two into variance-one automatically.
ReplyDeleteI think you can try to compile it and see the differences in the .class files
ReplyDeleteDave Roberts is right:
ReplyDelete```
user=> (defn variance-two-2 [sq]
#_=> (let [avg (average sq)
#_=> sqdiff #(square (- % avg))]
#_=> (average (map sqdiff sq))))
user=> (let [s (repeatedly 1000 #(rand))]
#_=> (twosf (reduce + s)) (time (twosf (variance-two-2 s))))
"Elapsed time: 0.8499 msecs"
0.08
user=> (defn variance-one-liner-2 [sq] (let [avg (average sq)] (average (map #(square (- % avg)) sq))))
user=> (let [s (repeatedly 1000 #(rand))]
#_=> (twosf (reduce + s)) (time (twosf (variance-one-liner-2 s))))
"Elapsed time: 1.070023 msecs"
0.08
```