{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Boolean Scans and Reductions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The content of this notebook was presented in [Dyalog Webinars: Boolean Scans and Reductions](https://dyalog.tv/Webinar/?v=erv_1LxEByk)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Logical values are the integers `0` and `1`, in contrast to distinct `true` and `false` values you might find in other programming languages."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Utility"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Boolean data is widely used in APL."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Results of comparison functions:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"1 1 0 0 0\n",
"1 1 1 0 0\n",
"0 0 1 0 0\n",
"1 1 0 1 1\n",
"0 0 1 1 1\n",
"0 0 0 1 1\n",
""
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"_ ← ,⍥⊆ ⍝ Stranding function\n",
"¯2 ¯1 0 1 2 (↑ < _ ≤ _ = _ ≠ _ ≥ _ >) 0"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Data-driven conditionals. For example, \"increase the salaries in group A by 5%\":"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"21000 25750 22050 32350 32400\n",
""
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"groups ← 'ABACC'\n",
"salaries ← 20000 25750 21000 32350 32400\n",
"salaries × 1.05 * groups = 'A'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Selecting from arrays:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"┌───┬────┬─────┐\n",
"│Ben│Ella│Fiona│\n",
"└───┴────┴─────┘\n",
""
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"values ← 4 10 6 8 16 24\n",
"names←'Anne' 'Ben' 'Charlie' 'Demi' 'Ella' 'Fiona'\n",
"(values≥10)/names"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Simple statistics:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"0.333333\n",
""
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(+⌿÷≢)'the alphabet'∊'aeiou'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Sixteen logical functions\n",
"With modern APL, all sixteen logical functions for 2-bit inputs can be represented by either [primitives](https://aplwiki.com/wiki/Primitive_function), [atops](https://aplwiki.com/wiki/Tacit_programming#Atops) or [constant functions](https://aplwiki.com/wiki/Constant). \n",
"\n",
"The descriptions given below are either common names for [digital logic gates](https://en.wikipedia.org/wiki/Logic_gate) or mnemonic descriptions of the functionality with regards to left and right Boolean arguments.\n",
"\n",
"> The binary column here represets the output of the logic gate for all combinations of two 1-bit inputs — also known as the truth table. For example, the truth table for an **OR** gate:\n",
">\n",
"> ```\n",
"> 0 0 1 1 ∨ 0 1 0 1\n",
"> 0 1 1 1\n",
"> 2⊥0 1 1 1\n",
"> 7\n",
"> ```\n",
"\n",
"|Binary|Decimal `2⊥`|Function `f`|Description|\n",
"|:---:|:---|:---|:---|\n",
"|`0 0 0 0`|`0`|`0⍨`|FALSE| \n",
"|`0 0 0 1`|`1`|`∧`|AND| \n",
"|`0 0 0 1`|`2`|`>`|Left but not right| \n",
"|`0 0 1 1`|`3`|`⊣`|Left| \n",
"|`0 0 0 1`|`4`|`<`|Right but not left| \n",
"|`0 1 0 1`|`5`|`⊢`|Right| \n",
"|`0 0 1 1`|`6`|`≠`|Exlusive OR| \n",
"|`0 1 1 1`|`7`|`∨`|OR| \n",
"|`0 0 0 1`|`8`|`⍱`|NOR|\n",
"|`1 0 0 1`|`9`|`=`|Exclusive NOR|\n",
"|`0 1 0 1`|`10`|`~⍤⊢`|Not right|\n",
"|`1 1 0 1`|`11`|`≥`|Left OR not right|\n",
"|`0 0 1 1`|`12`|`~⍤⊣`|Not left|\n",
"|`1 0 1 1`|`13`|`≤`|Right OR not left|\n",
"|`0 1 1 1`|`14`|`⍲`|NAND|\n",
"|`1 1 1 1`|`15`|`1⍨`|TRUE|"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> **Note** \n",
" [Phil Last's article](http://archive.vector.org.uk/art10501140) gives [scalar](https://aplwiki.com/wiki/Scalar_function) [dfn](https://aplwiki.com/wiki/Dfn) definitions, whereas some definitions in the previous table are not scalar functions."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Scans and reductions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Reductions summarise some property:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"13\n",
""
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"+/4 3 1 5 ⍝ Sum"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Scans indicate a progression of that property along the array:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"4 7 8 13\n",
""
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"+\\4 3 1 5 ⍝ Cumulative sum"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some reductions, and their related scans, are well known to APLers:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"1\n",
""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"1 1 0 0 0 0 0 0\n",
""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"1\n",
""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"0 0 1 1 1 1 1 1\n",
""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"∧/1 1 1 1 1 1 1 1 ⍝ Are all true?\n",
"∧\\1 1 0 1 1 1 0 1 ⍝ Were all true so far?\n",
"∨/0 0 1 0 0 0 0 0 ⍝ Are any true?\n",
"∨\\0 0 1 0 0 0 1 0 ⍝ Were any true so far?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The last value"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The value of a vector reduction `f/⍵` is always the last value in the related scan `f\\⍵`."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"13\n",
""
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"13\n",
""
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"+/3 4 1 5\n",
"⊃⌽+\\3 4 1 5"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In general, properties of boolean vectors for which `1≡f/⍵` for a function `f` reflect properties in the related scans `f\\⍵`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"From here we will take a quick look at a couple of particular scans and reductions which have some interesting applications. At the end of this notebook is a table of reductions for the sixteen logical functions, as well as some references for further reading."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Less-than"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Less-than-scan leaves only the first true bit on, and all others are turned off:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"0 0 1 0 0 0 0 0 0 0 0 0 0 0 0\n",
""
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"<\\0 0 1 0 1 1 1 1 1 1 0 1 0 1 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is because the reduction only returns true when the last bit is the one and only true bit:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"1\n",
""
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"0 0 0 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Too see the relationship, view the scan as a progression of reductions:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"0\n",
""
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"1\n",
""
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"0\n",
""
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"0\n",
""
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"0\n",
""
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"⍝ <\\0 1 0 1 1\n",
"0\n",
"0<1\n",
"0<1<0\n",
"0<1<0<1\n",
"0<1<0<1<1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"An example use is to mark the start of end-of-line comments:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"+ / ⍳ 1 0 ⍝ s u m ⍝ o f ⍝ f i r s t ⍝ 1 0 ⍝ i n t s\n",
"0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0\n",
"0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n",
""
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"code ← '+/⍳10 ⍝ sum⍝of⍝first⍝10⍝ints'\n",
"c ← code = '⍝'\n",
"↑ code c (<\\c)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The pair-wise reduction can be used to mark the start of groups of similar cells:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"⍟ ⎕ ⍟ ⍟ ⍟ ⎕ ⎕ ⍟ ⎕ ⎕ ⍟ ⍟ ⍟ ⍟ ⎕ ⎕ ⎕ ⎕ ⍟ ⍟ ⎕ ⎕ ⎕ ⎕ ⍟ ⍟ ⎕ ⍟\n",
"0 1 0 0 0 1 1 0 1 1 0 0 0 0 1 1 1 1 0 0 1 1 1 1 0 0 1 0\n",
"0 1 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0\n",
""
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"v ← '⍟⎕⍟⍟⍟⎕⎕⍟⎕⎕⍟⍟⍟⍟⎕⎕⎕⎕⍟⍟⎕⎕⎕⎕⍟⍟⎕⍟'\n",
"q ← '⎕'=v\n",
"↑v q (20,q)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Not-equal"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The not-equal reduction `≠/` only returns true if there are an odd number of `1`s"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"0\n",
""
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"0\n",
""
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"1\n",
""
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"1\n",
""
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"≠/1 1 1 1\n",
"2|+/1 1 1 1\n",
"\n",
"≠/1 1 1 0\n",
"2|+/1 1 1 0"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The related scan `≠\\` then has the property that odd and even instances of `1`s are joined by a series of `1`s"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"0 0 1 0 0 0 1 0 0 0 0 1 0 0 1\n",
"0 0 1 1 1 1 0 0 0 0 0 1 1 1 0\n",
""
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"b ← 0 0 1 0 0 0 1 0 0 0 0 1 0 0 1\n",
"↑ b (≠\\b)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Again, looking at the progression can help us to see how this works:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"0\n",
""
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"1\n",
""
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"1\n",
""
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"1\n",
""
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"0\n",
""
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"⍝ ≠\\0 1 0 0 1\n",
"0\n",
"0≠1\n",
"0≠1≠0\n",
"0≠1≠0≠0\n",
"0≠1≠0≠0≠1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"An example use of this is to mark quoted parts of text:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"e x t r a c t t h e \" q u o t e d \" p a r t s f r o m t h i s \" t e x t \"\n",
"0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1\n",
"0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0\n",
"0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1\n",
""
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"quoted ← 'extract the \"quoted\" parts from this \"text\"'\n",
"q ← '\"'=quoted\n",
"inc ← (~∧≠\\)'\"'=quoted ⍝ Including quote marks\n",
"exc ← (⊢∨≠\\)'\"'=quoted ⍝ Exclusing quote marks\n",
"↑ quoted q inc exc"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"These masks can easily be used to isolate the quoted parts:"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"quotedtext\n",
""
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"┌────────┬──────┐\n",
"│\"quoted\"│\"text\"│\n",
"└────────┴──────┘\n",
""
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"inc/quoted\n",
"exc⊆quoted"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With a bit more effort, we can mark C-style (Java, JavaScript, CSS etc.) comments:"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"* { c o l o r : # 0 0 0 ; / * t e x t * / b a c k g r o u n d : # f f f ; / * b g * / }\n",
"0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0\n",
""
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"css ← '* {color: #000; /* text */ background: #fff; /* bg */}'\n",
"cpos ← '/*'∘(≠\\⍷∨¯1⌽∘⌽⍷∘⌽)css\n",
"↑ css cpos"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some of these examples, and more, can be found [on APLcart](https://aplcart.info/?q=%E2%89%A0%5C#)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For one last example, let's use something like a data table with successive fields of different lengths:"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"data ← (?9⍴100),(4 2 3⌿3 3⍴3/'ABC'),⍪?9⍴0"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In `data`, there is a field of length `4` containing the `'A'` markers, followed by a field of length `2` with the `'B'` markers and finally a field of length 3 with the `'C'` markers."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"50 AAA 0.127932\n",
"26 AAA 0.528921\n",
"10 AAA 0.116772\n",
"50 AAA 0.320547\n",
"64 BBB 0.377289\n",
"90 BBB 0.54566 \n",
"68 CCC 0.339044\n",
"98 CCC 0.145094\n",
"66 CCC 0.63628 \n",
""
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"⎕←data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We would like to present the data above in a more human-readable format, so we are going to insert two blank lines between each field."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To start with, let's create some partition vectors. The function `MTake` creates a boolean vector where `1` indicates the start of a field:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`MTake` has a neat, nested-array style definition:"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"1 0 0 0 1 0 1 0 0\n",
""
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"MTake ← ∊↑∘1¨\n",
"MTake 4 2 3"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A more traditional definition has performance benefits for larger data:"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"1 0 0 0 1 0 1 0 0\n",
""
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"MTake ← {¯1⌽(⍳+/⍵)∊+\\⍵}\n",
"MTake 4 2 3"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If `f` indicates a field, and `b` indicates a series of blank lines,\n",
"```\n",
"1 0 0 0 1 0 1 0 1 0 1 0 0 1 0\n",
"←--f--→ ←b→ ←f→ ←b→ ←-f-→ ←b→\n",
"```\n",
"we can use *catenate-rank-zero* to interleave the blank lines:"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"1 0 0 0 1 0 1 0 1 0 1 0 0 1 0\n",
""
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"MTake,4 2 3,⍤0⊢2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And use *not-equal-scan* to switch sections on and off, creating an expansion vector:"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"1 1 1 1 0 0 1 1 0 0 1 1 1 0 0\n",
""
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"⎕←e←≠\\MTake,4 2 3,⍤0⊢2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Which we then use to expand a character matrix format of our data:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"50 AAA 0.127932\n",
"26 AAA 0.528921\n",
"10 AAA 0.116772\n",
"50 AAA 0.320547\n",
" \n",
" \n",
"64 BBB 0.377289\n",
"90 BBB 0.54566 \n",
" \n",
" \n",
"68 CCC 0.339044\n",
"98 CCC 0.145094\n",
"66 CCC 0.63628 \n",
" \n",
" \n",
""
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"e⍀⍕data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A similar technique can be used to also label our fields:"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"Field A 50 AAA 0.127932\n",
" 26 AAA 0.528921\n",
" 10 AAA 0.116772\n",
" 50 AAA 0.320547\n",
" \n",
" \n",
"Field B 64 BBB 0.377289\n",
" 90 BBB 0.54566 \n",
" \n",
" \n",
"Field C 68 CCC 0.339044\n",
" 98 CCC 0.145094\n",
" 66 CCC 0.63628 \n",
" \n",
" \n",
""
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"labels ← (MTake 4 2 3)⍀(↑'Field '∘,¨'ABC')\n",
"e⍀⍕labels,data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Flat partition"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The [STSC](https://en.wikipedia.org/wiki/Scientific_Time_Sharing_Corporation) publication [*Boolean Functions and Techniques*](http://www.sudleyplace.com/APL/boolean.pdf) contains a wealth of information on Boolean functions and techniques (duh).\n",
"\n",
"In particular, there is a description of a general procedure for applying some function to parts of an array independently, where *parts* is defined similarly to the fields in the previous example."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The general procedure involves a loop. With nested arrays, the pattern is quite easy to show:"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"DCBAZYX\n",
""
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"1 0 0 0 1 0 0 {∊⌽¨⍺⊂⍵} 'ABCDXYZ' ⍝ Reverse two parts independently"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"10 18\n",
""
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"1 0 0 0 1 0 0 {∊+/¨⍺⊂⍵} ⍳7 ⍝ Sum two parts independently"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This pattern is easily turned into a [dop]():"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"DCBAZYX\n",
""
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/html": [
"10 18\n",
""
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"_P ← {∊⍺⍺¨⍺⊂⍵} ⍝ A partitioned-function-application operator\n",
"p ← 1 0 0 0 1 0 0\n",
"p ⌽ _P 'ABCDXYZ'\n",
"p +/_P ⍳7"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"However, this looping approach can become inefficient for large arrays with many parts. Page 10 of Boolean Functions and Techniques gives some definitions for more array-oriented solutions for use with specific functions. Here we will look at a couple of these, starting with *partitioned-reverse* `PREVERSE`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The *plus-scan* is useful with these types of Boolean partitioned vectors, as it causes partitions to be marked by groups of successive integers:"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"A B C D X Y Z\n",
"1 0 0 0 1 0 0\n",
"1 1 1 1 2 2 2\n",
""
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a ← 'ABCDXYZ'\n",
"↑ a p (+\\p)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we *downgrade* the plus-scan, the relative positions of indices for each partition **relative to the whole array** are reversed, but the positions of indices **within each partition** stay in ascending order:"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"5 6 7 1 2 3 4\n",
""
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"⍒+\\p"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Reversing the plus-scan then restores the relative positions of partitions within the whole array, but the indices **within each partition** are reversed:"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"4 3 2 1 7 6 5\n",
""
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"⌽⍒+\\p"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Indexing with this vector creates the desired effect:"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"DCBAZYX\n",
""
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a[⌽⍒+\\p]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"These kinds of techniques often give performance improvements compared to their general nested-array counterparts:\n",
"```\n",
" p ← 1,1↓1=?1e4⍴10\n",
" t ← ⎕A[?1e4⍴26]\n",
"\n",
" ]runtime -c \"p ⌽_P t\" \"p {⍵[⌽⍒+\\⍺]} t\"\n",
" \n",
" p ⌽_P t → 1.4E¯4 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ \n",
" p {⍵[⌽⍒+\\⍺]} t → 3.5E¯5 | -75% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ \n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The partitioned sum starts with the plus-scan of the whole vector:"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"1 3 6 10 15 21 28\n",
""
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"+\\⍳7"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We then extract the progressive sums up to the end of each partition:"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"10 28\n",
""
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(1⌽p)/+\\⍳7"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally subtract each partition's sum from the sum of its preceding partition:"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"10 18\n",
""
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"¯2-/0,(1⌽p)/+\\⍳7"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once again, there is a speedup relative to the general case:\n",
"```\n",
" p ← 1,1↓1=?1e4⍴10\n",
" i ← ?1e4⍴1000\n",
"\n",
" ]runtime -c \"p +/_P i\" \"p {¯2-/0,(1⌽⍺)/+\\⍵} i\"\n",
" \n",
" p +/_P i → 6.8E¯5 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ \n",
" p {¯2-/0,(1⌽⍺)/+\\⍵} i → 8.9E¯6 | -87% ⎕⎕⎕⎕⎕ \n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Efficient implementation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some languages implement boolean types as [bytes](https://en.wikipedia.org/wiki/Byte) (8 bits). The decision to use bit representations [was made by Larry Breed for the first implementation](https://aplwiki.com/wiki/Boolean#Boolean_optimization) of APL.\n",
"\n",
"Not only do boolean arrays use the smallest amount of memory possible, they are also subject to [SIMD](https://en.wikipedia.org/wiki/SIMD) parallelisations - even on hardware without special SIMD capabilities.\n",
"\n",
"More on this topic can be found on [the APL Wiki article on Booleans](https://apl.wiki/boolean)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Sixteen boolean reductions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally let's return to the table of Boolean functions from earlier. In the [Vector article](http://archive.vector.org.uk/art10501140), Phil Last gives plain English descriptions for properties of boolean vectors when their reduction using a Boolean function `f` gives `1`.\n",
"\n",
"|Function `f`|`f/⍵` is true if `⍵` satisfies:|\n",
"|:---:|---:|\n",
"|`0⍨`|Never| \n",
"|`∧`|All ones| \n",
"|`>`|Odd leading ones| \n",
"|`⊣`|First is one| \n",
"|`<`|Last is the only one| \n",
"|`⊢`|Last is one| \n",
"|`≠`|Odd ones| \n",
"|`∨`|At least one one| \n",
"|`⍱`|Odd leading zeros else the last is the only one|\n",
"|`=`|Even zeroes|\n",
"|`~⍤⊢`|Last is parity of the length|\n",
"|`≥`|Even leading zeros|\n",
"|`~⍤⊣`|First is zero|\n",
"|`≤`|Last is not the only zero|\n",
"|`⍲`|Even leading ones else last is the only zero|\n",
"|`1⍨`|Always|"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For those not described in this notebook, you might want to experiment and show yourself the relationships between the descriptions given in the table, and the properties of the scans `f\\⍵`. APL expressions offering similar insights are given on page 23 of the Boolean Functions and Techniques publication."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Further reading\n",
"- [apl.wiki/boolean](https://apl.wiki/boolean)\n",
"- **S.B. Jaffe** [Topics for a Second Course in APL](https://dl.acm.org/doi/10.1145/22008.22025)\n",
"- **R. Bernecky** [A Compendium of SIMD Boolean Array Algorithms](https://dyalog.tv/Dyalog16/?v=G2g13YKjWes) (Video)\n",
"- **P. Last** [Boolean Reductions](http://archive.vector.org.uk/art10501140)\n",
"- **STSC** [Boolean Functions and Techniques, 2nd Edition](http://www.sudleyplace.com/APL/boolean.pdf)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Dyalog APL",
"language": "apl",
"name": "dyalog-kernel"
},
"language_info": {
"file_extension": ".apl",
"mimetype": "text/apl",
"name": "APL"
}
},
"nbformat": 4,
"nbformat_minor": 4
}