{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "kernel": "SoS"
   },
   "source": [
    "# Programming concepts cheat sheet\n",
    "\n",
    "When trying to figure out what went into a program, look at \n",
    " 1. documentation,\n",
    " 1. file names and subdirectories under which the source code has been organized,\n",
    " 1. imported libraries and their documentation,\n",
    " 1. function names and parameters, \n",
    " 1. function contents. \n",
    " \n",
    "Try to find the main function, and start delving from there."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "kernel": "SoS"
   },
   "source": [
    "## Libraries\n",
    "\n",
    "Libraries contain functions and data types. They are used to organize code and to package large pieces of functionality into reusable units."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "inputHidden": false,
    "kernel": "python3",
    "outputHidden": false
   },
   "outputs": [],
   "source": [
    "# Python\n",
    "import re # regular expressions\n",
    "import requests # web requests\n",
    "import pandas as pd # data science computation\n",
    "import numpy as np # numerical computation\n",
    "import matplotlib.pyplot as plt # plotting"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "inputHidden": false,
    "kernel": "ir",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Loading required package: NLP\n",
      "\n",
      "Attaching package: ‘NLP’\n",
      "\n",
      "The following object is masked from ‘package:ggplot2’:\n",
      "\n",
      "    annotate\n",
      "\n",
      "Loading required package: RColorBrewer\n"
     ]
    }
   ],
   "source": [
    "# R\n",
    "library(ggplot2) # plotting\n",
    "library(tidyverse) # data wrangling\n",
    "library(cluster) # data clustering\n",
    "library(slam) # numerical computation\n",
    "library(tm) # text mining \n",
    "library(SnowballC) # word stemming\n",
    "library(wordcloud) # word clouds\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "kernel": "scala"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\u001b[32mimport \u001b[39m\u001b[36m$ivy.$                                       // this one is just for the notebook, and not usual Scala\n",
       "\u001b[39m\n",
       "\u001b[32mimport \u001b[39m\u001b[36mcom.github.tototoshi.csv._\n",
       "\u001b[39m\n",
       "\u001b[32mimport \u001b[39m\u001b[36mscala.io.Source\n",
       "\u001b[39m\n",
       "\u001b[32mimport \u001b[39m\u001b[36mscala.collection.JavaConverters._\n",
       "\u001b[39m"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "// Scala\n",
    "import $ivy.`com.github.tototoshi::scala-csv:1.3.5` // this one is just for the notebook, and not usual Scala\n",
    "import com.github.tototoshi.csv._\n",
    "import scala.io.Source\n",
    "import scala.collection.JavaConverters._\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "kernel": "SoS"
   },
   "source": [
    "## Functions\n",
    "\n",
    "Functions allow you to package code into reusable units, and are used to organize a codebase. A function takes as many input parameters as you like and returns zero or one output."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "inputHidden": false,
    "kernel": "python3",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "where are we i dont know \n",
      "6\n"
     ]
    }
   ],
   "source": [
    "# Python\n",
    "import re\n",
    "\n",
    "def standardize(text):\n",
    "  text = text.replace(\".\",\" \").replace(\",\",\" \").replace(\"?\",\" \").replace(\"!\",\" \").replace(\"'\",\"\").lower()\n",
    "  return re.sub(r\"\\s+\",\" \", text)  # raw string, so \\s reaches the regex engine unchanged\n",
    "\n",
    "print(standardize(\"Where are we? I don't know!\"))\n",
    "\n",
    "# named sum_values so that it does not shadow Python's built-in sum()\n",
    "def sum_values(values):\n",
    "  total = 0\n",
    "  for value in values:\n",
    "    total += value\n",
    "  return total\n",
    "\n",
    "print(sum_values([1,2,3]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "inputHidden": false,
    "kernel": "ir",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1] \"where are we i dont know \"\n",
      "[1] 6\n"
     ]
    }
   ],
   "source": [
    "# R\n",
    "standardize <- function(text) {\n",
    "  return(tolower(gsub(\"\\\\s+\",\" \",gsub(\"\\\\.\",\" \",gsub(\",\",\" \",gsub(\"\\\\?\",\" \",gsub(\"!\",\" \",gsub(\"'\",\"\",text))))))))\n",
    "}\n",
    "\n",
    "print(standardize(\"Where are we? I don't know!\"))\n",
    "\n",
    "# named sum_values so that it does not mask R's built-in sum()\n",
    "sum_values <- function(values) {\n",
    "  total <- 0\n",
    "  for (value in values) {\n",
    "    total <- total + value\n",
    "  }\n",
    "  return(total)\n",
    "}\n",
    "\n",
    "print(sum_values(c(1,2,3)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "kernel": "scala"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "where are we i dont know \n",
      "6\n",
      "6\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "defined \u001b[32mfunction\u001b[39m \u001b[36mstandardize\u001b[39m\n",
       "defined \u001b[32mfunction\u001b[39m \u001b[36msum1\u001b[39m\n",
       "defined \u001b[32mfunction\u001b[39m \u001b[36msum2\u001b[39m"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "// Scala\n",
    "def standardize(text: String) =\n",
    "  text.replace(\".\",\" \").replace(\",\",\" \").replace(\"?\",\" \").replace(\"!\",\" \").replace(\"'\",\"\").toLowerCase().replaceAll(\"\\\\s+\",\" \")\n",
    "\n",
    "println(standardize(\"Where are we? I don't know!\"))\n",
    "\n",
    "def sum1(values: Seq[Int]) = {\n",
    "    var sum = 0\n",
    "    for (value <- values) sum += value\n",
    "    sum\n",
    "}\n",
    "// here's a functional variant for sum; foldLeft starts from 0, so unlike reduce it also works on an empty sequence\n",
    "def sum2(values: Seq[Int]) = values.foldLeft(0)(_+_)\n",
    "println(sum1(Seq(1,2,3)))\n",
    "println(sum2(Seq(1,2,3)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "kernel": "SoS"
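   },
   "source": [
    "In many programming languages, methods are functions associated with a data type. They use a different call syntax, in which the value being operated on is written before the method name (`value.method(arguments)`) instead of being passed as an ordinary parameter:"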
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "inputHidden": false,
    "kernel": "python3",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "XXXX\n",
      "bXbX\n"
     ]
    }
   ],
   "source": [
    "# Python\n",
    "print(\"abab\".replace(\"a\",\"b\").replace(\"b\",\"X\"))\n",
    "print(\"abab\".replace(\"b\",\"X\").replace(\"a\",\"b\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "kernel": "scala"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "XXXX\n",
      "bXbX\n"
     ]
    }
   ],
   "source": [
    "// Scala\n",
    "println(\"abab\".replace(\"a\",\"b\").replace(\"b\",\"X\"))\n",
    "println(\"abab\".replace(\"b\",\"X\").replace(\"a\",\"b\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "kernel": "SoS"
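   },
   "source": [
    "R doesn't really use this method call syntax: its common object systems (S3 and S4) attach methods to ordinary function calls such as `print(x)`.\n",
    "\n",
    "Operators are yet another, more convenient syntax for core functions. In Python and Scala, they really are syntactic sugar for methods. In R, they are ordinary functions that can also be called by name, for example `` `+`(5, 3) ``."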
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "inputHidden": false,
    "kernel": "python3",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "15\n",
      "15\n",
      "[1, 2, 3, 4]\n"
     ]
    }
   ],
   "source": [
    "# Python\n",
    "print((5).__add__(3).__add__(7))\n",
    "print(5+3+7)\n",
    "\n",
    "values = [1,2]\n",
    "values.extend([3])\n",
    "values += [4]\n",
    "print(values)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "kernel": "scala"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "15\n",
      "15\n",
      "ArrayBuffer(1, 2, 3, 4)"
     ]
    },
    {
     "data": {
      "text/plain": [
       "\u001b[32mimport \u001b[39m\u001b[36mscala.collection.mutable.ArrayBuffer\n",
       "\n",
       "\u001b[39m\n",
       "\u001b[36mvalues\u001b[39m: \u001b[32mArrayBuffer\u001b[39m[\u001b[32mInt\u001b[39m] = \u001b[33mArrayBuffer\u001b[39m(\u001b[32m1\u001b[39m, \u001b[32m2\u001b[39m, \u001b[32m3\u001b[39m, \u001b[32m4\u001b[39m)\n",
       "\u001b[36mres3_4\u001b[39m: \u001b[32mArrayBuffer\u001b[39m[\u001b[32mInt\u001b[39m] = \u001b[33mArrayBuffer\u001b[39m(\u001b[32m1\u001b[39m, \u001b[32m2\u001b[39m, \u001b[32m3\u001b[39m, \u001b[32m4\u001b[39m)\n",
       "\u001b[36mres3_5\u001b[39m: \u001b[32mArrayBuffer\u001b[39m[\u001b[32mInt\u001b[39m] = \u001b[33mArrayBuffer\u001b[39m(\u001b[32m1\u001b[39m, \u001b[32m2\u001b[39m, \u001b[32m3\u001b[39m, \u001b[32m4\u001b[39m)"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "// Scala\n",
    "import scala.collection.mutable.ArrayBuffer\n",
    "\n",
    "println(5.+(3).+(7))\n",
    "println(5+3+7)\n",
    "\n",
    "val values = ArrayBuffer(1,2)\n",
    "values.+=(3)\n",
    "values += 4\n",
    "print(values)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "kernel": "SoS"
   },
   "source": [
    "## Variables\n",
    "\n",
    "Variables allow you to store data and refer to it in your code using names you define yourself."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "inputHidden": false,
    "kernel": "python3",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Eetu is an adult\n"
     ]
    }
   ],
   "source": [
    "# Python\n",
    "name = \"Eetu\"\n",
    "age = 18\n",
    "\n",
    "if age>=18:\n",
    "  print(name + \" is an adult\")\n",
    "else:\n",
    "  print(name + \" is a child\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "inputHidden": false,
    "kernel": "ir",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1] \"Eetu is an adult\"\n"
     ]
    }
   ],
   "source": [
    "# R\n",
    "name <- \"Eetu\"\n",
    "age <- 18\n",
    "\n",
    "if (age>=18) {\n",
    "  print(paste(name,\" is an adult\",sep=\"\"))\n",
    "} else {\n",
    "  print(paste(name, \" is a child\",sep=\"\"))\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "kernel": "scala"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Eetu is an adult\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "\u001b[36mname\u001b[39m: \u001b[32mString\u001b[39m = \u001b[32m\"Eetu\"\u001b[39m\n",
       "\u001b[36mage\u001b[39m: \u001b[32mInt\u001b[39m = \u001b[32m18\u001b[39m"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "// Scala\n",
    "val name = \"Eetu\"\n",
    "val age = 18\n",
    "\n",
    "if (age>=18)\n",
    "  println(name + \" is an adult\")\n",
    "else\n",
    "  println(name + \" is a child\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "kernel": "SoS"
   },
   "source": [
    "## If/else\n",
    "\n",
    "A flow control statement that lets you choose between alternative courses of action based on data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "inputHidden": false,
    "kernel": "python3",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Eetu is an adult\n"
     ]
    }
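   ],
   "source": [
    "# Python\n",
    "name = \"Eetu\"\n",
    "age = 18\n",
    "\n",
    "# The first matching branch wins, so age>100 must be tested before age>65;\n",
    "# in the reverse order the ancient branch could never be reached.\n",
    "if age<18:\n",
    "  print(name + \" is a child\")\n",
    "elif age>100:\n",
    "  print(name + \" is ancient\")\n",
    "elif age>65:\n",
    "  print(name + \" is old\")\n",
    "else:\n",
    "  print(name + \" is an adult\")\n",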
    "  \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "inputHidden": false,
    "kernel": "ir",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1] \"Eetu is an adult\"\n"
     ]
    }
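   ],
   "source": [
    "# R\n",
    "name <- \"Eetu\"\n",
    "age <- 18\n",
    "\n",
    "# The first matching branch wins, so age>100 must be tested before age>65;\n",
    "# in the reverse order the ancient branch could never be reached.\n",
    "if (age<18) {\n",
    "  print(paste(name, \"is a child\"))\n",
    "} else if (age>100) {\n",
    "  print(paste(name, \"is ancient\"))\n",
    "} else if (age>65) {\n",
    "  print(paste(name, \"is old\"))\n",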
    "} else {\n",
    "  print(paste(name,\"is an adult\"))\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "kernel": "scala"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Eetu is an adult\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "\u001b[36mname\u001b[39m: \u001b[32mString\u001b[39m = \u001b[32m\"Eetu\"\u001b[39m\n",
       "\u001b[36mage\u001b[39m: \u001b[32mInt\u001b[39m = \u001b[32m18\u001b[39m"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
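   ],
   "source": [
    "// Scala\n",
    "val name = \"Eetu\"\n",
    "val age = 18\n",
    "\n",
    "// The first matching branch wins, so age>100 must be tested before age>65;\n",
    "// in the reverse order the ancient branch could never be reached.\n",
    "if (age<18)\n",
    "  println(name + \" is a child\")\n",
    "else if (age>100)\n",
    "  println(name + \" is ancient\")\n",
    "else if (age>65)\n",
    "  println(name + \" is old\")\n",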
    "else \n",
    "  println(name + \" is an adult\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "kernel": "scala"
   },
   "source": [
    "Some languages, such as Scala and R, have constructs that make certain if/else chains a bit easier to write:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "kernel": "scala"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Hello Batman\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "\u001b[36mname\u001b[39m: \u001b[32mString\u001b[39m = \u001b[32m\"Batman\"\u001b[39m"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "// Scala\n",
    "val name = \"Batman\"\n",
    "name match {\n",
    "    case \"John\" => println(\"Hello Johnny\")\n",
    "    case \"Bruce Wayne\" => println(\"Hello Batman\")\n",
    "    case anyname => println(\"Hello \"+anyname)\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "kernel": "ir"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1] \"Hello Batman\"\n"
     ]
    }
   ],
   "source": [
    "# R\n",
    "name <- \"Batman\"\n",
    "switch(name,\n",
    "   \"John\" = print(\"Hello Johnny\"),\n",
    "   \"Bruce Wayne\" = print(\"Hello Batman\"),\n",
    "    print(paste(\"Hello\",name))\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "kernel": "SoS"
   },
   "source": [
    "## While\n",
    "\n",
    "A general flow control structure for repeating something as long as a condition holds."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "inputHidden": false,
    "kernel": "python3",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "First age over 18 (age nr. 3): 19\n",
      "Average age: 36.0\n"
     ]
    }
   ],
   "source": [
    "# Python\n",
    "ages = [ 15, 17, 19, 20, 55, 90 ]\n",
    "\n",
    "i = 0\n",
    "while (ages[i]<18): i+=1\n",
    "\n",
    "print(\"First age over 18 (age nr. \"+str(i+1)+\"): \"+str(ages[i]))\n",
    "\n",
    "i = 0\n",
    "agesum = 0\n",
    "while i<len(ages):\n",
    "  agesum += ages[i]\n",
    "  i+=1\n",
    "print(\"Average age: \"+str(agesum/len(ages)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "inputHidden": false,
    "kernel": "ir",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1] \"First age over 18 (age nr. 3): 19\"\n",
      "[1] \"Average age:  36\"\n"
     ]
    }
   ],
   "source": [
    "# R\n",
    "ages <- c(15, 17, 19, 20, 55, 90)\n",
    "\n",
    "i <- 1\n",
    "while (ages[i]<18) i <- i+1\n",
    "\n",
    "print(paste(\"First age over 18 (age nr. \",i,\"): \",ages[i],sep=\"\"))\n",
    "\n",
    "i <- 1\n",
    "agesum <- 0\n",
    "while (i<=length(ages)) {\n",
    "  agesum <- agesum + ages[i]\n",
    "  i <- i + 1\n",
    "}\n",
    "print(paste(\"Average age: \",agesum/length(ages)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "kernel": "scala"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "First age over 18 (age nr. 3): 19\n",
      "Average age: 36\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "\u001b[36mages\u001b[39m: \u001b[32mSeq\u001b[39m[\u001b[32mInt\u001b[39m] = \u001b[33mList\u001b[39m(\u001b[32m15\u001b[39m, \u001b[32m17\u001b[39m, \u001b[32m19\u001b[39m, \u001b[32m20\u001b[39m, \u001b[32m55\u001b[39m, \u001b[32m90\u001b[39m)\n",
       "\u001b[36mi\u001b[39m: \u001b[32mInt\u001b[39m = \u001b[32m6\u001b[39m\n",
       "\u001b[36magesum\u001b[39m: \u001b[32mInt\u001b[39m = \u001b[32m216\u001b[39m"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "// Scala\n",
    "val ages = Seq(15, 17, 19, 20, 55, 90)\n",
    "\n",
    "var i = 0\n",
    "while (ages(i)<18) i+=1\n",
    "\n",
    "println(\"First age over 18 (age nr. \"+(i+1)+\"): \"+ages(i))\n",
    "\n",
    "\n",
    "i = 0\n",
    "var agesum = 0\n",
    "while (i<ages.length) {\n",
    "  agesum += ages(i)\n",
    "  i += 1\n",
    "}\n",
    "println(\"Average age: \"+agesum/ages.length)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "kernel": "SoS"
   },
   "source": [
    "## For\n",
    "\n",
    "A dedicated structure, available in most languages, for doing something once for each value in a collection."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "inputHidden": false,
    "kernel": "python3",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Average age: 36.0\n",
      "[26, 34, 29]\n"
     ]
    }
   ],
   "source": [
    "# Python\n",
    "ages = [ 15, 17, 19, 20, 55, 90 ]\n",
    "\n",
    "agesum = 0\n",
    "for age in ages: agesum += age\n",
    "\n",
    "print(\"Average age: \"+str(agesum/len(ages)))\n",
    "\n",
    "birth_years = [1918, 1910, 1915]\n",
    "ages = []\n",
    "for birth_year in birth_years: ages.append(1944 - birth_year)\n",
    "print(ages)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "inputHidden": false,
    "kernel": "ir",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1] \"Average age: 36\"\n",
      "[1] 26 34 29\n"
     ]
    }
   ],
   "source": [
    "# R\n",
    "ages <- c(15, 17, 19, 20, 55, 90)\n",
    "\n",
    "agesum <- 0\n",
    "for (age in ages) agesum <- agesum + age\n",
    "\n",
    "print(paste(\"Average age:\",agesum/length(ages)))\n",
    "\n",
    "birth_years <- c(1918, 1910, 1915)\n",
    "ages <- c()\n",
    "for (birth_year in birth_years)\n",
    "  ages <- c(ages, 1944 - birth_year)\n",
    "print(ages)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "inputHidden": false,
    "kernel": "scala",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Average age: 36\n",
      "ArrayBuffer(26, 34, 29)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "\u001b[32mimport \u001b[39m\u001b[36mscala.collection.mutable.ArrayBuffer\n",
       "\n",
       "\u001b[39m\n",
       "\u001b[36mages\u001b[39m: \u001b[32mSeq\u001b[39m[\u001b[32mInt\u001b[39m] = \u001b[33mList\u001b[39m(\u001b[32m15\u001b[39m, \u001b[32m17\u001b[39m, \u001b[32m19\u001b[39m, \u001b[32m20\u001b[39m, \u001b[32m55\u001b[39m, \u001b[32m90\u001b[39m)\n",
       "\u001b[36magesum\u001b[39m: \u001b[32mInt\u001b[39m = \u001b[32m216\u001b[39m\n",
       "\u001b[36mbirth_years\u001b[39m: \u001b[32mSeq\u001b[39m[\u001b[32mInt\u001b[39m] = \u001b[33mList\u001b[39m(\u001b[32m1918\u001b[39m, \u001b[32m1910\u001b[39m, \u001b[32m1915\u001b[39m)\n",
       "\u001b[36mages2\u001b[39m: \u001b[32mArrayBuffer\u001b[39m[\u001b[32mInt\u001b[39m] = \u001b[33mArrayBuffer\u001b[39m(\u001b[32m26\u001b[39m, \u001b[32m34\u001b[39m, \u001b[32m29\u001b[39m)"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "// Scala\n",
    "import scala.collection.mutable.ArrayBuffer\n",
    "\n",
    "val ages = Seq(15, 17, 19, 20, 55, 90)\n",
    "\n",
    "var agesum = 0\n",
    "for (age <- ages) agesum += age\n",
    "\n",
    "println(\"Average age: \"+agesum/ages.length)\n",
    "\n",
    "val birth_years = Seq(1918, 1910, 1915)\n",
    "val ages2 = ArrayBuffer[Int]()\n",
    "for (birth_year <- birth_years) ages2 += 1944 - birth_year\n",
    "println(ages2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "kernel": "SoS"
   },
   "source": [
    "## Lists/sequences/arrays\n",
    "\n",
    "Lists are data structures for holding multiple values in a fixed order."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "inputHidden": false,
    "kernel": "python3",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Where are we? I don't know!\n",
      "This, programming... is... terrifying!\n",
      "Where are we? I don't know!\n"
     ]
    }
   ],
   "source": [
    "# Python\n",
    "sentences = [ \"Where are we? I don't know!\", \"This, programming... is... terrifying!\" ]\n",
    "\n",
    "# Here we're calling print once for each string in the sentences list\n",
    "for sentence in sentences:\n",
    "    print(sentence)\n",
    "    \n",
    "# You can also explicitly refer to a particular slot in a list using square brackets:\n",
    "print(sentences[0])\n",
    "# In the above, note that the first entry in the list is at index 0, not 1. That's a conventional relic that permeates most programming languages, and comes originally from the way computers handle memory.\n",
    "    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "inputHidden": false,
    "kernel": "ir",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1] \"Where are we? I don't know!\"\n",
      "[1] \"This, programming... is... terrifying!\"\n",
      "[1] \"Where are we? I don't know!\"\n"
     ]
    }
   ],
   "source": [
    "# R\n",
    "sentences <- c(\"Where are we? I don't know!\", \"This, programming... is... terrifying!\")\n",
    "\n",
    "# Here we're calling print once for each string in the sentences vector\n",
    "for (sentence in sentences)\n",
    "    print(sentence)\n",
    "    \n",
    "# You can also explicitly refer to a particular slot in a vector using square brackets:\n",
    "print(sentences[1])\n",
    "# In the above, note how R indices start at 1, in contrast to many other languages.\n",
    "    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "kernel": "scala"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Where are we? I don't know!\n",
      "This, programming... is... terrifying!\n",
      "Where are we? I don't know!\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "\u001b[36msentences\u001b[39m: \u001b[32mSeq\u001b[39m[\u001b[32mString\u001b[39m] = \u001b[33mList\u001b[39m(\n",
       "  \u001b[32m\"Where are we? I don't know!\"\u001b[39m,\n",
       "  \u001b[32m\"This, programming... is... terrifying!\"\u001b[39m\n",
       ")"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "// Scala\n",
    "val sentences = Seq(\"Where are we? I don't know!\", \"This, programming... is... terrifying!\")\n",
    "\n",
    "// Here we're calling println once for each string in the sentences sequence\n",
    "for (sentence <- sentences)\n",
    "    println(sentence)\n",
    "    \n",
    "// You can also explicitly refer to a particular slot in a sequence; note that Scala uses parentheses for this, not square brackets:\n",
    "println(sentences(0))\n",
    "// In the above, note that the first entry in the list is at index 0, not 1. That's a conventional relic that permeates most programming languages, and comes originally from the way computers handle memory.\n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "kernel": "SoS"
   },
   "source": [
    "## Dictionaries/maps/hashes\n",
    "\n",
    "Dictionaries are useful data structures for mapping values to other values, or for creating simple structured data. Python and Scala have them built in (as `dict` and `Map`). R has named vectors and lists, which can play a similar role but are a bit more complicated."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "inputHidden": false,
    "kernel": "python3",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Where are we  and I dont know \n",
      "and\n"
     ]
    }
   ],
   "source": [
    "# Python\n",
    "replacements = {\n",
    "    \".\": \" \",\n",
    "    \",\": \" \",\n",
    "    \"!\": \" \",\n",
    "    \"?\": \" \",\n",
    "    \"'\": \"\",\n",
    "    \"&\": \"and\" \n",
    "}\n",
    "\n",
    "# Here we're going over all the keys in the replacement dictionary and acting on them\n",
    "text = \"Where are we? & I don't know!\"\n",
    "for key in replacements:\n",
    "    text = text.replace(key, replacements[key])\n",
    "print(text)\n",
    "\n",
    "# You can also explicitly refer to a particular slot in a list or a key in a dictionary using square brackets:\n",
    "print(replacements[\"&\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "kernel": "scala"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Where are we  and I dont know \n",
      "and\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "\u001b[36mreplacements\u001b[39m: \u001b[32mMap\u001b[39m[\u001b[32mString\u001b[39m, \u001b[32mString\u001b[39m] = \u001b[33mMap\u001b[39m(\n",
       "  \u001b[32m\".\"\u001b[39m -> \u001b[32m\" \"\u001b[39m,\n",
       "  \u001b[32m\"&\"\u001b[39m -> \u001b[32m\"and\"\u001b[39m,\n",
       "  \u001b[32m\"!\"\u001b[39m -> \u001b[32m\" \"\u001b[39m,\n",
       "  \u001b[32m\",\"\u001b[39m -> \u001b[32m\" \"\u001b[39m,\n",
       "  \u001b[32m\"'\"\u001b[39m -> \u001b[32m\"\"\u001b[39m,\n",
       "  \u001b[32m\"?\"\u001b[39m -> \u001b[32m\" \"\u001b[39m\n",
       ")\n",
       "\u001b[36mtext\u001b[39m: \u001b[32mString\u001b[39m = \u001b[32m\"Where are we  and I dont know \"\u001b[39m"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "// Scala\n",
    "val replacements = Map(\n",
    "    \".\" -> \" \",\n",
    "    \",\" -> \" \",\n",
    "    \"!\" -> \" \",\n",
    "    \"?\" -> \" \",\n",
    "    \"'\" -> \"\",\n",
    "    \"&\" -> \"and\" \n",
    ")\n",
    "\n",
    "// Here we're going over all the key-value pairs in the replacements map and acting on them\n",
    "var text = \"Where are we? & I don't know!\"\n",
    "for ((key,replacement) <- replacements)\n",
    "    text = text.replace(key, replacement)\n",
    "println(text)\n",
    "\n",
    "// You can also explicitly look up the value for a particular key in a map, using parentheses:\n",
    "println(replacements(\"&\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "inputHidden": false,
    "kernel": "python3",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "!\n",
      "['?', '!']\n"
     ]
    }
   ],
   "source": [
    "# Python\n",
    "# Note that a dictionary can only contain one value for each key; a repeated key silently overwrites the earlier value\n",
    "replacements = {\n",
    "    \".\" : \"?\",\n",
    "    \".\" : \"!\"\n",
    "}\n",
    "print(replacements[\".\"])\n",
    "\n",
    "# Therefore, if you need multiple values, you have to combine dictionaries with lists:\n",
    "replacements = {\n",
    "    \".\" : [\"?\",\"!\"]\n",
    "}\n",
    "\n",
    "print(replacements[\".\"])\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "kernel": "scala"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "!\n",
      "List(?, !)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "\u001b[36mreplacements\u001b[39m: \u001b[32mMap\u001b[39m[\u001b[32mString\u001b[39m, \u001b[32mString\u001b[39m] = \u001b[33mMap\u001b[39m(\u001b[32m\".\"\u001b[39m -> \u001b[32m\"!\"\u001b[39m)\n",
       "\u001b[36mreplacements2\u001b[39m: \u001b[32mMap\u001b[39m[\u001b[32mString\u001b[39m, \u001b[32mSeq\u001b[39m[\u001b[32mString\u001b[39m]] = \u001b[33mMap\u001b[39m(\u001b[32m\".\"\u001b[39m -> \u001b[33mList\u001b[39m(\u001b[32m\"?\"\u001b[39m, \u001b[32m\"!\"\u001b[39m))"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "// Scala\n",
    "// Note that a map can only contain one value for each key; a repeated key silently overwrites the earlier value\n",
    "val replacements = Map(\n",
    "    \".\" -> \"?\",\n",
    "    \".\" -> \"!\"\n",
    ")\n",
    "println(replacements(\".\"))\n",
    "\n",
    "// Therefore, if you need multiple values, you have to combine dictionaries with lists:\n",
    "val replacements2 = Map(\n",
    "    \".\" -> Seq(\"?\",\"!\")\n",
    ")\n",
    "\n",
    "println(replacements2(\".\"))\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "inputHidden": false,
    "kernel": "python3",
    "outputHidden": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['Batman', 'Philanthropist']\n"
     ]
    }
   ],
   "source": [
    "# Python\n",
    "# Here's some structured data stored in a combination of lists and dictionaries:\n",
    "people = [\n",
    "    { \n",
    "        \"name\": \"Eetu\",\n",
    "        \"age\": 18,\n",
    "        \"jobs\": [ \"Researcher\", \"Lecturer\"]\n",
    "    },\n",
    "    {\n",
    "        \"name\": \"Bruce Wayne\",\n",
    "        \"age\": 65,\n",
    "        \"jobs\": [ \"Batman\", \"Philanthropist\"]\n",
    "    }\n",
    "]\n",
    "\n",
    "for person in people: \n",
    "    if person[\"name\"] == \"Bruce Wayne\": \n",
    "        print(person[\"jobs\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "kernel": "scala"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "List(Batman, Philanthropist)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "\u001b[36mpeople\u001b[39m: \u001b[32mSeq\u001b[39m[\u001b[32mMap\u001b[39m[\u001b[32mString\u001b[39m, \u001b[32mAny\u001b[39m]] = \u001b[33mList\u001b[39m(\n",
       "  \u001b[33mMap\u001b[39m(\u001b[32m\"name\"\u001b[39m -> \u001b[32m\"Eetu\"\u001b[39m, \u001b[32m\"age\"\u001b[39m -> \u001b[32m18\u001b[39m, \u001b[32m\"jobs\"\u001b[39m -> \u001b[33mList\u001b[39m(\u001b[32m\"Researcher\"\u001b[39m, \u001b[32m\"Lecturer\"\u001b[39m)),\n",
       "  \u001b[33mMap\u001b[39m(\n",
       "    \u001b[32m\"name\"\u001b[39m -> \u001b[32m\"Bruce Wayne\"\u001b[39m,\n",
       "    \u001b[32m\"age\"\u001b[39m -> \u001b[32m65\u001b[39m,\n",
       "    \u001b[32m\"jobs\"\u001b[39m -> \u001b[33mList\u001b[39m(\u001b[32m\"Batman\"\u001b[39m, \u001b[32m\"Philanthropist\"\u001b[39m)\n",
       "  )\n",
       ")"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "// Scala\n",
    "// Here's some structured data stored in a combination of sequences and maps:\n",
    "val people = Seq(\n",
    "    Map( \n",
    "        \"name\" -> \"Eetu\",\n",
    "        \"age\" -> 18,\n",
    "        \"jobs\" -> Seq(\"Researcher\", \"Lecturer\")\n",
    "    ),\n",
    "    Map(\n",
    "        \"name\" -> \"Bruce Wayne\",\n",
    "        \"age\" -> 65,\n",
    "        \"jobs\" -> Seq(\"Batman\", \"Philanthropist\")\n",
    "    )\n",
    ")\n",
    "\n",
    "for (person <- people)\n",
    "    if (person(\"name\") == \"Bruce Wayne\")\n",
    "        println(person(\"jobs\"))"
   ]
  }
 ],
 "metadata": {
  "kernel_info": {
   "name": "python3"
  },
  "kernelspec": {
   "display_name": "SoS",
   "language": "sos",
   "name": "sos"
  },
  "language_info": {
   "codemirror_mode": "sos",
   "file_extension": ".sos",
   "mimetype": "text/x-sos",
   "name": "sos",
   "nbconvert_exporter": "sos_notebook.converter.SoS_Exporter",
   "pygments_lexer": "sos"
  },
  "nteract": {
   "version": "0.2.0"
  },
  "sos": {
   "kernels": [
    [
     "python3",
     "python3",
     "python3",
     "#FFD91A"
    ],
    [
     "ir",
     "ir",
     "ir",
     "#DCDCDA"
    ],
    [
     "scala",
     "scala",
     "",
     ""
    ]
   ],
   "panel": {
    "displayed": true,
    "height": "432px",
    "style": "side"
   },
   "version": "0.17.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}