If you were there and blogged it, let me know:

Here’s what I found so far:

If you were there and blogged it, let me know:

Here’s what I found so far:

Last night was our Clojure event with Rich Hickey. BTW, if you were there, regular meeting of the Western Mass Developers Group are every other Thursday at Panera Bread in Hadley (7pm) — most of the time, we just sit around and talk — very informal — hope to see you there.

Back to the presentation — hopefully the slides/video/etc will be up somewhere soon — Rich focussed on the concurrency features of Clojure (Vars, Refs, and Agents). First he showed us the state of concurrency support in Java/C# today (locking) and identified the problem as having direct references to mutable objects. Clojure offers only immutable atoms and datastructures so that’s how he addresses one part of the problem. However, since the mutability is how the world works, Clojure models change as a change to what a reference points to. So, if you view the world through refs, it will appear to change. Then he introduced the controlled ways in which refs can change.

1. Vars are essentially thread local variables. They are a reference whose changes can only be seen in a single thread — each thread gets an independent and isolated view of the variable. Like thread-local variables — they are a very simple way to take code that looks like it’s manipulating global state and give each thread its own isolated copy.

2. Refs are mutable references to immutable objects. They can only be changed inside a transaction and all changes to all refs in a transaction appear to have happened at the time the transaction ends (to other threads). Inside the transaction, the changes are immediate. When two transactions are open and change the same ref, the second one to finish is automatically retried (so you should have no side-effects inside of transactions). All Refs read inside of a transaction will return the value that they had at the start of the transaction (so are consistent with each other).

3. Agents (which I have explained before) are for asynchronous changes to a reference. Each agent hold a value and you send actions to it to change the value. Only one action will happen to an agent at a time and you can read the current state of the agent any time. Rich spent a little time to caution us against any comparison to Erlang’s actors. They are a different concept with different tradeoffs. Agents have the advantage that they can be read from any thread conveniently (just dereference) whereas actors require sending a message to read (which is asynchronous). Clearly, Erlang’s benefit is location transparency (for distributed code) — which is what it was designed for. Rich hinted that Clojure might have an Actor concept, but it would not be unified with Agents.

What was new to me with agents is that there are two types of message sending mechanisms — send and send-off (which appear to be undocumented right now) — Rich used send-off which dedicates a thread to the agent (rather than run the action from a thread-pool). This is how you have to do it if agents block at all (which ants do because they have a small sleep).

Then, he showed us an ant simulation — I think he will make the code available. In short, there is a square world of cells, each one holds a Ref to what is in the cell (food, ant, pheromones, home base). Ants are represented by agents, which are sent a behave message. Behave picks a random action for the ant to do (move, turn, pick up food, drop food) and then mutates the world of cells in a transaction, then it sends itself another behave action. There is an agent responsible for painting the world, and another which evaporates pheromones in the background.

Anyway, the demo was impressive — since painting makes a copy of the world in a transaction, it has a completely consistent view of the world. Refs make sure that all changes to them are in transactions, so you have language support for where you need cooperation (contrasted to locking, which is not enforced). Agents also help you model the real world in a way that a coordinated single-looped simulation (calling behave on ants in a loop, then paint over and over again) could not.

And clojure’s agent mutating mechanism (sending a function to it), means that agents don’t have to have any knowledge of the messages that might be sent to it (again contrasted to Erlang Actors). Finally, various messages can take different times to complete and that would be represented in the simulation — some ants might complete several small actions in the time it took another to complete one (which would not be the case in a behave loop).

I’ll have more on this when the slides, code, etc are available.

Ok, yesterday I decided to try to implement a defclass macro which would hobble multimethods and force you to have some kind of class inheritance for polymorphism. Today, I’ll try to implement that macro.

First, I made this macro, which I think will be useful

(defmacro sym2key [s]

`(keyword (name (quote ~s)))

)

Works like this:

user=> (sym2key rect)

:rect

(I wrote the above this morning, and after several hours, I didn’t get too far)

I had to cheat a lot, but I finally got something — here is my final OO minilanguage

(defclass shape

(defabstractmeth area)

)

(defclass rect

(defctor (fn [w h] (setprop w) (setprop h) ))

(defmeth area (fn [] (* (:w this) (:h this))))

)

(defclass circle

(defctor (fn [r] (setprop r) ))

(defmeth area (fn [] (* (:r this) (:r this) (. Math PI))))

)

I got that to be processed by these macros:

(defmacro sym2key [s]

`(keyword (name (quote ~s)))

)

(defmacro defclass [name & body]

(cons ‘do

(loop [classbody# body parts# nil]

(if classbody#

(recur (rest classbody#) (cons (concat (first classbody#) `(~name)) parts#))

parts#)))

)

(defmacro defabstractmeth [name class] `(defmulti ~name :type) )

(defmacro setprop [x] { (sym2key x) x })

(defmacro defctor [fndef name]

(let [ arglist# (second fndef)

fnbody# (map second (nthrest fndef 2) )

obj# (reduce merge

{:type `(sym2key ~name)}

(map assoc (repeat {})

(map eval

(map sym2key fnbody#)) fnbody#))

]

`(defn ~name ~arglist# ~obj#)

)

)

(defmacro defmeth [meth fndef name]

(let [ arglist# (conj (second fndef) ‘this)

fnbody# (first (nthrest fndef 2))

namesym# (eval `(sym2key ~name))

]

`(defmethod ~meth ~namesym# ~arglist# ~fnbody#)

)

)

This code shows it in action:

(prn (rect 10 20))

(prn (area (rect 10 20)))

(prn (circle 10))

(prn (area (circle 10)))

Outputs:

{:type :rect, :h 20, :w 10}

200

{:type :circle, :r 10}

314.1592653589793

I am way too tired to explain this — suffice to say, this is kind a crazy way to make something, but it sure beats not being able to make it. My ctor implementation is so crazy, that you should just ignore it — defclass and defmeth are worth looking at (ctor is a major hack — assumes only setprop calls and turns them into a map).

I just got around to watching Clojure TV on sequences, which I guess I should have done earlier, but I guess I was too lazy — thanks, I’m here all week.

Ok — since clearly, day 20 is going to be about Clojure Day in Northampton, I only have two days to explore more of clojure. I haven’t even gotten to multimethods, which are very cool, but pretty simple to understand. I want to keep working with macros.

The hardest thing about thinking about macros is that not using a language that has one, I haven’t started to think in macros yet. Kind of reminds me of Clint Eastwood in Firefox who has to keep remembering to think in Russian in order to fly his stolen super-secret Soviet jet.

Ok, since I think in OO, let me try to do some OO in clojure via macros — maybe I get to use multimethods after all (let’s hobble them with conventional OO polymorphism)

(defclass rect

(defctor (fn [w h] (setprop w) (setprop h) ))

(defmeth area (fn [] (* w h)))

)

(defclass circle

(defctor (fn [r] (setprop r)))

(defmeth area (fn [] (* r r (. Math PI))))

)

I want that to turn into

(defn rect [w h] {:type :rect :w w :h h})

(defmulti area :type)

(defmethod area :rect [r]

(* (:w r) (:h r)))

(defn circle [r] {:type :circle :r r})

(defmulti area :type)

(defmethod area :circle [c]

(* (. Math PI) (* (:r c) (:r c))))

I’m not even sure this is possible — I just checked to see if it’s legal to defmulti the same name, and it’s not really because it kills the definition for rect. Luckily, I have a precedent to fall back on — in Java/C# style OO, area is only polymorphic if it came from a superclass. I can change the class structure to this:

(defclass shape

(defabstractmeth area)

)

(defclass rect :extends shape

(defctor (fn [w h] (setprop w) (setprop h) ))

(defmeth area (fn [] (* w h)))

)

(defclass circle :extends shape

(defctor (fn [r] (setprop r)))

(defmeth area (fn [] (* r r (. Math PI))))

)

This is so ugly, I don’t even know if I want to go through with this. I’ll try for tomorrow.

(clojure day in Northampton, MA is Thursday, March 20th)

I know I’ll reveal myself as being tragically unhip for not using emacs or aquamacs or whatever (I’m a VI man after all), but I started this TextMate Bundle for clojure. I’ll grow it as time goes by and probably move it to its own page or some repository for bundles once I figure this bundling stuff out.

I do some rudimentary syntax coloring and automatically expand **defn**[Tab], **def**[Tab], **let**[Tab], and **if**[Tab].

If you use TextMate and have some ideas for things to add, please send me a note.

Ok, time to look at macros. Being new to lisps, I’m especially new to macros. I have been meaning to read On Lisp for a few months, so like the SICP, I’ll show you stuff from the macros chapter and then try to write them in clojure.

In, On Lisp (Chapter 7), Paul Graham writes:

The expression generated by the macro will be evaluated in place of the original macro call. A macro call is a list whose first element is the name of the macro. What happens when we type the macro call into the toplevel? Lisp notices that [it] is the name of a macro, and

- builds the expression specified by the definition, then
- evaluates that expression in place of the original macro call

Macros simply allow you to write code that gets executed at compile time to generate code, which is then run. There really isn’t anything quite like them in popular non-lisp languages, though there are a couple of .NET languages with language altering macros. They are much more powerful than the preprocessor macros in C, and C++ template meta-programming can sometimes get you the effect, but templates were not designed for this, so it is too cumbersome most of the time.

We have been using macros in clojure throughout this series. In fact, since Lisps evaluate all of their arguments before calling a function, whenever you see code where all of the arguments are not evaluated, you are either looking at something built-in to the language (e.g. if) or a macro (e.g. cond). You can look at boot.clj to see the definitions of them. For example, here’s lazy-cons:

(defmacro lazy-cons [x & body]

(list ‘fnseq x (list* `fn [] body)))

Remember, we used lazy-cons to make lazy sequences. lazy-cons acts like cons, except it does not evaluate the rest-expr until rest is called on the resulting list. We can use macroexpand-1 (which returns the unevaluated macro expansion of a macro call) to see how this works:

user=> (defn rfn [] (prn “Called”) ‘(3))

#<Var: user/rfn>

user=> (rfn)

“Called”

(3)

user=> (def lazylist (lazy-cons 1 (rfn)))

#<Var: user/lazylist>

user=> (first lazylist)

1

user=> (rest lazylist)

“Called”

(3)

user=> (rest lazylist)

(3)

user=> (macroexpand-1 ‘(lazy-cons 1 (rfn)))

(fnseq 1 (clojure/fn [] (rfn)))

user=> (first (eval (macroexpand-1 ‘(lazy-cons 1 (rfn)))))

1

user=> (rest (eval (macroexpand-1 ‘(lazy-cons 1 (rfn)))))

“Called”

(3)

So, a lazy-cons is just a short hand to using a fnseq and wrapping the rest in a function that will be called whenever rest is called on the fnseq. That’s a typical use of a macro — codifying common uses to make your programs shorter. And, lazy-cons cannot be implemented as a function, because if it were, its arguments would need to be evaluated, which defeats the purpose.

Ok, let’s write some macros (Documentation on the clojure macro page and the clojure reader page). On Lisp, section 7.3 (Defining Simple Macros) presents this one for while.

(defmacro while (test &body body)

`(do () ((not ,test)) ,@body))

Here it is in clojure:

(defmacro while [test & body]

`(loop [] (if ~test (do ~@body (recur)))))

In clojure, you don’t have many mutable objects — here’s how you use while with a thread-local var binding:

(def x 10)

(prn (macroexpand ‘(while (> x 0) (prn x) (set! x (dec x)))))

(binding [x 10] (while (> x 0) (prn x) (set! x (dec x))))

Here’s the output:

(loop [] (if (> x 0) (do (prn x) (set! x (dec x)) (recur))))

10

9

8

7

6

5

4

3

2

1

Christophe Grand made two observations about my parallel prime code, which improve it quite a bit.

First, since primes-in-list is lazy, you have to (touch) it to get it to be completely created in the agent thread. That leaves you with this:

(defn par-primes-in-list [potential-primes known-primes]

(let [ x 10 x-times-2 (* x 2)]

(! w1 (fn[w pp kp] (touch (primes-in-list pp kp))) (take x potential-primes) known-primes)

(! w2 (fn[w pp kp] (touch (primes-in-list pp kp))) (take x (drop x potential-primes)) known-primes)

(await w1 w2)

(let [new-primes (concat @w1 @w2) ]

(if (= (count new-primes) 0)

(recur (drop x-times-2 potential-primes) known-primes)

(let [real-primes

(if (> (first new-primes) x-times-2) new-primes (primes-in-list new-primes []))]

(lazy-cat real-primes (par-primes-in-list

(drop x-times-2 potential-primes)

(into known-primes real-primes))))))))

But, the bigger difference is picking the number of potential-primes to evaluate intelligently. I had experimented with trying to do this dynamically, but without the fix above, anything you do will probably result in primes being found lazily in the main thread. Here’s Christophe’s code:

(defn par-primes-in-list [potential-primes known-primes]

(let [p (first potential-primes) x (quot p 2)]

(! w1 (fn[w pp kp] (touch (lazy-primes-in-list pp kp)))

(take x potential-primes) known-primes)

(! w2 (fn[w pp kp] (touch (lazy-primes-in-list pp kp)))

(take (- p x) (drop x potential-primes)) known-primes)

(await w1 w2)

(let [new-primes (concat @w1 @w2) ]

(lazy-cat new-primes

(par-primes-in-list

(drop p potential-primes)

(into known-primes new-primes))))))

Christophe’s observation:

I use the property that no number in the interval [n .. 2*n-1] is divisible by n (except n itself) so it’s safe to process the whole interval with the current set of known primes. Hence this interval can be divided into m intervals (where m is the number of cores) for parallel processing.

The one issue with Christophe’s code it part lazy and part eager, so it’s applicability depends on whether you are trying to get to a specific number the fastest or just generate primes as fast as possible. Since the number of primes to try grows as it searches, it lazily generates bigger and bigger lists of primes. I tried to find a good value to ask for where it would perform well, and chose 20390 which is big enough to show a difference and doesn’t make Christophe’s code do too much extra. Here are the results:

Christophe’s: “Elapsed time: 87266.119 msecs”

Mine (with a touch added): “Elapsed time: 108841.354 msecs”

Sequential: “Elapsed time: 151075.292 msecs”

Ok, I’m finally getting somewhere. To start with, Rich pointed out that I could use recur to make my last prime finder stack safe and faster. Here it is:

(defn primes-in-list [potential-primes known-primes]

(let [p (first potential-primes)]

(if (nil? p) ‘()

(if (is-prime? p known-primes)

(lazy-cat

(list p)

(primes-in-list (rest potential-primes) (conj known-primes p)))

(let [rest-primes (rest potential-primes)]

(recur rest-primes known-primes))))))

Here’s how I parallelized it:

(def w1 (agent 0))

(def w2 (agent 0))

(defn par-primes-in-list [potential-primes known-primes]

(let [ x 10 x-times-2 (* x 2)]

(! w1 (fn[w pp kp] (primes-in-list pp kp)) (take x potential-primes) known-primes)

(! w2 (fn[w pp kp] (primes-in-list pp kp)) (take x (drop x potential-primes)) known-primes)

(await w1 w2)

(let [new-primes (concat @w1 @w2) ]

(if (= (count new-primes) 0)

(recur (drop x-times-2 potential-primes) known-primes)

(let [real-primes

(if (> (first new-primes) x-times-2) new-primes (primes-in-list new-primes []))]

(lazy-cat real-primes (par-primes-in-list

(drop x-times-2 potential-primes)

(into known-primes real-primes))))))))

Basically, I send lists to the agents, wait for them, and then combine their answers. The original runs at about 100% of CPU, calculates 8000 primes in 21.6 seconds and this runs at about 125%, and calculates the primes in 17.1 seconds, or in about 80% of the time.

My brain still hurts from thinking about parallelizing prime number generation. I think the key is still to find primes two at a time in parallel, meaning that it’s only useful if you want a list, which is necessary to find the nth prime anyway.

So, I am starting with a rewrite of prime number generation — this one is simpler for me to understand and for some reason runs three times faster.

(defn divisible? [x y]

(= (rem x y) 0)

)

(defn is-prime? [p known-primes]

(every? (fn [kp] (not (divisible? p kp))) known-primes)

)

(defn primes-in-list [potential-primes known-primes]

(let [p (first potential-primes)]

(if (is-prime? p known-primes)

(lazy-cons

p

(primes-in-list (rest potential-primes) (conj known-primes p)))

(primes-in-list (rest potential-primes) known-primes)

)

)

)

(def primes (primes-in-list (iterate inc 2) []))

(time (reduce + (take 1000 primes)))

(time (reduce + (take 1000 primes)))

It’s also lazy. Not sure why it’s so much faster. Now that I can provide a list of potentials, I’ll be able to run two lists in parallel, and then I have to reduce them in the main thread. When I get the lists back, I need to make sure that they aren’t divisible by each other (since they won’t be in each others list of known primes).

So, Rich thinks I’m on a fool’s errand, but I’m going to keep going with this.

I’m sorry to say I can’t think of a less promising candidate for parallelization :( The 3 things you would want in something to parallelize are independent data/work, moderate-to-course-grained work units and/or complex coordination logic that would be simplified with threads. (Note – wanting it to run faster is not enough :) Here you have none of those – the data/work is interdependent (more later), the work is very fine grained (seq walk and a remainder test), and the logic was pretty elegant.

So, in order to have something to parallelize, I need to get one of those things. Since I am trying to make a list of primes, I’ll generate them two at a time. Each check is independent. There’s probably better ways to do it, but to keep my life simple, I’ll just generate pairs of primes lazily, but in parallel (each member of the pair in a separate thread).

Hmmm. I’m too tired to figure this out right now. So, for some content today, I can explain a mistake that Rich pointed out in my original prime sieve.

I defined primes as a function that returns a lazy seq, but a key benefit of lazy seqs is that they cache values — so by creating a new lazy seq for each call to primes, I lose caching. The right way is to define primes as a seq this way:

(def primes (sieve (iterate inc 2)))

Then, when you do this:

(time (reduce + (take 1000 primes)))

(time (reduce + (take 1000 primes)))

You get this:

“Elapsed time: 1080.253 msecs”

“Elapsed time: 5.36 msecs”