[jocaml-list] A successful jocaml program

Mauricio Fernandez mfp at acm.org
Thu Nov 8 17:20:54 CET 2007


On Thu, Nov 08, 2007 at 02:24:27PM +0100, Luc Maranget wrote:
> > It seems,
> > 
> > http://www.tbray.org/ongoing/When/200x/2007/10/30/WF-Results
> > 

Unsurprisingly, the JoCaml solution was much faster than those written in
interpreted (or even JIT-compiled, like Erlang) languages.

I find the comparison with this "fairly well optimized, special-purpose C
(okay, C++) program" [1] much more enlightening. It takes 700 lines of code
vs. 150 + 150 (for the Bigstring module) for the fastest JoCaml one, uses a
specialized pattern matcher with "several assumptions about Tim Bray's example
regex hardcoded in", and employs a flyweight substring representation that
precludes "chunked" mmapping (the whole file is mapped at once, which is
impractical on 32-bit architectures)... In short, it's a fair amount of code
that has gone through considerable micro-optimization.

Yet, when I timed it on an old K7 [2] and compared it to the (J)OCaml
solution, it was only 7% faster!

Now comes the funny part... the C++ code gives incorrect results!

I draw the following conclusions from all this:

* the OCaml substrate of JoCaml can generate code about as fast as hand-tuned
  C++ in many (most?) cases (Xavier's "OCaml delivers at least 50% of the
  performance..." still holds :-)

* JoCaml lets you express and reason about concurrent and distributed
  systems at a much higher level, leaving little room for the sort of bugs
  that infest code written with threads and shared state

* we cannot dismiss constant factors. JoCaml code written by an individual in 
  an afternoon can perform much better than Erlang code refined over the
  course of one month by Erlang experts [3] on a per-core basis, while scaling
  just as well. Erlang's major strength is reliability, not speed; promoting
  Erlang by saying that it will be "faster given enough cores" is
  disingenuous because it conceals the existence of alternatives like JoCaml
  that can scale just as well but perform much better per core.

> I seize  the opportunity to cite a research report on
> our distributed ray tracer in JoCaml, a program  much related
> to Mauricio Fernandez wide finder as regards concurrency.
> 
> <http://moscova.inria.fr/~maranget/papers/jocamlrus/index.html>

Nice! The section on how to handle failures is especially interesting.


[1] http://www.lug.corvallis.or.us/drupal/node/102

[2] a 5-year-old 1.54 GHz K7 processor is considerably faster than one of the
8 cores of the T5120...

[3] the Erlang solutions have been discussed at great length on several
mailing lists; I wrote and benchmarked the OCaml/JoCaml ones in a few hours.
Yet, it takes 4 cores for the best Erlang implementation to be as fast as the
JoCaml one on a single core.

-- 
Mauricio Fernandez  -   http://eigenclass.org


