{
"cells": [
{
"cell_type": "markdown",
"id": "dfc0f019",
"metadata": {},
"source": [
"# Beyond Lists\n",
"\n",
"{{ video_embed | replace(\"%%VID%%\", \"5Yyk-l-cUNI\")}}\n",
"\n",
"Functionals like map and fold are not restricted to lists. They make sense for\n",
"nearly any kind of data collection. For example, recall this tree\n",
"representation:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "ccd7fd03",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"type 'a tree = Leaf | Node of 'a * 'a tree * 'a tree\n"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"type 'a tree =\n",
" | Leaf\n",
" | Node of 'a * 'a tree * 'a tree"
]
},
{
"cell_type": "markdown",
"id": "7be9619b",
"metadata": {},
"source": [
"## Map on Trees\n",
"\n",
"This one is easy. All we have to do is apply the function `f` to the\n",
"value `v` at each node:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "d08bfd02",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"val map_tree : ('a -> 'b) -> 'a tree -> 'b tree = \n"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"let rec map_tree f = function\n",
" | Leaf -> Leaf\n",
" | Node (v, l, r) -> Node (f v, map_tree f l, map_tree f r)"
]
},
{
"cell_type": "markdown",
"id": "aca50e6f",
"metadata": {},
"source": [
"## Fold on Trees\n",
"\n",
"This one is only a little harder. Let's develop a fold functional for `'a tree`\n",
"similar to our `fold_right` over `'a list`. One way to think of\n",
"`List.fold_right` would be that the `[]` value in the list gets replaced by the\n",
"`acc` argument, and each `::` constructor gets replaced by an application of the\n",
"`f` argument. For example, `[a; b; c]` is syntactic sugar for\n",
"`a :: (b :: (c :: []))`. So if we replace `[]` with `0` and `::` with `( + )`,\n",
"we get `a + (b + (c + 0))`. Along those lines, here's a way we could rewrite\n",
"`fold_right` that will help us think a little more clearly:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "d215f353",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"type 'a mylist = Nil | Cons of 'a * 'a mylist\n"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"val fold_mylist : ('a -> 'b -> 'b) -> 'b -> 'a mylist -> 'b = \n"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"type 'a mylist =\n",
" | Nil\n",
" | Cons of 'a * 'a mylist\n",
"\n",
"let rec fold_mylist f acc = function\n",
" | Nil -> acc\n",
" | Cons (h, t) -> f h (fold_mylist f acc t)"
]
},
{
"cell_type": "markdown",
"id": "ee9c9e1a",
"metadata": {},
"source": [
"The algorithm is the same. All we've done is to change the definition of lists\n",
"to use constructors written with alphabetic characters instead of punctuation,\n",
"and to change the argument order of the fold function.\n",
"\n",
"For trees, we'll want the initial value of `acc` to replace each `Leaf`\n",
"constructor, just like it replaced `[]` in lists. And we'll want each `Node`\n",
"constructor to be replaced by the operator. But now the operator will need to be\n",
"*ternary* instead of *binary*—that is, it will need to take three\n",
"arguments instead of two—because a tree node has a value, a left child,\n",
"and a right child, whereas a list cons had only a head and a tail.\n",
"\n",
"Inspired by those observations, here is the fold function on trees:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "25b707de",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"val fold_tree : ('a -> 'b -> 'b -> 'b) -> 'b -> 'a tree -> 'b = \n"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"let rec fold_tree f acc = function\n",
" | Leaf -> acc\n",
" | Node (v, l, r) -> f v (fold_tree f acc l) (fold_tree f acc r)"
]
},
{
"cell_type": "markdown",
"id": "9de5b941",
"metadata": {},
"source": [
"If you compare that function to `fold_mylist`, you'll note it very nearly\n",
"identical. There's just one more recursive call in the second pattern-matching\n",
"branch, corresponding to the one more occurrence of `'a tree` in the definition\n",
"of that type.\n",
"\n",
"We can then use `fold_tree` to implement some of the tree functions we've\n",
"previously seen:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "71aa5dd6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"val size : 'a tree -> int = \n"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"val depth : 'a tree -> int = \n"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"val preorder : 'a tree -> 'a list = \n"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"let size t = fold_tree (fun _ l r -> 1 + l + r) 0 t\n",
"let depth t = fold_tree (fun _ l r -> 1 + max l r) 0 t\n",
"let preorder t = fold_tree (fun x l r -> [x] @ l @ r) [] t"
]
},
{
"cell_type": "markdown",
"id": "9d081225",
"metadata": {},
"source": [
"Why did we pick `fold_right` and not `fold_left` for this development? Because\n",
"`fold_left` is tail recursive, which is something we're never going to achieve\n",
"on binary trees. Suppose we process the left branch first; then we still have to\n",
"process the right branch before we can return. So there will always be work left\n",
"to do after a recursive call on one branch. Thus on trees an equivalent to\n",
"`fold_right` is the best which we can hope for.\n",
"\n",
"The technique we used to derive `fold_tree` works for any OCaml variant type\n",
"`t`:\n",
"\n",
"* Write a recursive `fold` function that takes in one argument for each\n",
" constructor of `t`.\n",
"\n",
"* That `fold` function matches against the constructors, calling itself\n",
" recursively on any value of type `t` that it encounters.\n",
"\n",
"* Use the appropriate argument of `fold` to combine the results of all recursive\n",
" calls as well as all data not of type `t` at each constructor.\n",
"\n",
"This technique constructs something called a *catamorphism*, aka a *generalized\n",
"fold operation*. To learn more about catamorphisms, take a course on category\n",
"theory.\n",
"\n",
"## Filter on Trees\n",
"\n",
"This one is perhaps the hardest to design. The problem is: if we decide\n",
"to filter a node, what should we do with its children?\n",
"\n",
"- We could recurse on the children. If after filtering them only one child\n",
" remains, we could promote it in place of its parent. But what if both children\n",
" remain, or neither? Then we'd somehow have to reshape the tree. Without\n",
" knowing more about how the tree is intended to be used—that is, what\n",
" kind of data it represents—we are stuck.\n",
"\n",
"- Instead, we could just eliminate the children entirely. So the decision\n",
" to filter a node means pruning the entire subtree rooted at that node.\n",
"\n",
"The latter is easy to implement:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "0ebc236c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"val filter_tree : ('a -> bool) -> 'a tree -> 'a tree = \n"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"let rec filter_tree p = function\n",
" | Leaf -> Leaf\n",
" | Node (v, l, r) ->\n",
" if p v then Node (v, filter_tree p l, filter_tree p r) else Leaf"
]
}
],
"metadata": {
"jupytext": {
"cell_metadata_filter": "-all",
"formats": "md:myst",
"text_representation": {
"extension": ".md",
"format_name": "myst",
"format_version": 0.13,
"jupytext_version": "1.10.3"
}
},
"kernelspec": {
"display_name": "OCaml",
"language": "OCaml",
"name": "ocaml-jupyter"
},
"language_info": {
"codemirror_mode": "text/x-ocaml",
"file_extension": ".ml",
"mimetype": "text/x-ocaml",
"name": "OCaml",
"nbconverter_exporter": null,
"pygments_lexer": "OCaml",
"version": "4.14.0"
},
"source_map": [
14,
24,
28,
35,
39,
52,
60,
74,
78,
86,
90,
131
]
},
"nbformat": 4,
"nbformat_minor": 5
}