4.6. Pipelining#
Suppose we wanted to compute the sum of squares of the numbers from 0 up to \(n\). How might we go about it? Of course (math being the best form of optimization), the most efficient way would be a closed-form formula:
But let’s imagine you’ve forgotten that formula. In an imperative language you
might use a for
loop:
# Python
def sum_sq(n):
sum = 0
for i in range(0, n+1):
sum += i * i
return sum
The equivalent (tail) recursive code in OCaml would be:
let sum_sq n =
let rec loop i sum =
if i > n then sum
else loop (i + 1) (sum + i * i)
in loop 0 0
val sum_sq : int -> int = <fun>
Another, clearer way of producing the same result in OCaml uses higher-order functions and the pipeline operator:
let rec ( -- ) i j = if i > j then [] else i :: i + 1 -- j
let square x = x * x
let sum = List.fold_left ( + ) 0
let sum_sq n =
0 -- n (* [0;1;2;...;n] *)
|> List.map square (* [0;1;4;...;n*n] *)
|> sum (* 0+1+4+...+n*n *)
val ( -- ) : int -> int -> int list = <fun>
val square : int -> int = <fun>
val sum : int list -> int = <fun>
val sum_sq : int -> int = <fun>
The function sum_sq
first constructs a list containing all the numbers 0..n
.
Then it uses the pipeline operator |>
to pass that list through
List.map square
, which squares every element. Then the resulting list is
pipelined through sum
, which adds all the elements together.
The other alternatives that you might consider are somewhat uglier:
(* Maybe worse: a lot of extra [let..in] syntax and unnecessary names to
for intermediate values we don't care about. *)
let sum_sq n =
let l = 0 -- n in
let sq_l = List.map square l in
sum sq_l
(* Maybe worse: have to read the function applications from right to left
rather than top to bottom, and extra parentheses. *)
let sum_sq n =
sum (List.map square (0--n))
val sum_sq : int -> int = <fun>
val sum_sq : int -> int = <fun>
The downside of all of these compared to the original tail recursive version is that they are wasteful of space—linear instead of constant—and take a constant factor more time. So as is so often the case in programming, there is a tradeoff between clarity and efficiency of code.
Note that the inefficiency is not from the pipeline operator itself, but from having to construct all those unnecessary intermediate lists. So don’t get the idea that pipelining is intrinsically bad. In fact, it can be quite useful. When we get to the chapter on modules, we’ll use it quite often with some of the data structures we study there.