{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# \"Causality modelling in Python for data scientists\"\n", "> \"Data science is increasingly commonplace in industry and the enterprise. Industrial data scientists have a vast toolbox for descriptive and predictive analyses at their disposal. However, data science tools for decision-making in industry and the enterprise are less well established. Here we survey Python packages that can help industrial data scientists facilitate intelligent decision-making through causality modelling.\"\n", "- hidden: true\n", "- toc: true\n", "- branch: master\n", "- badges: true\n", "- comments: true\n", "- categories: [causal inference, causal discovery, causality modelling, python]" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import random" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "np.random.seed(123)\n", "random.seed(123)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "no_samples = 10000\n", "\n", "seasons = np.random.choice(['winter', 'spring', 'summer', 'fall'], size=(no_samples,))\n", "color = np.array(\n", " [\n", " random.choice(['yellow', 'pink'])\n", " if season in ['spring', 'summer']\n", " else random.choice(['navy', 'grey'])\n", " for season in seasons\n", " ]\n", ")\n", "price = np.random.lognormal(size=(no_samples,))\n", "rank = np.array(\n", " [\n", " \n", " ]\n", ")" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(['fall', 'spring', 'spring', ..., 'spring', 'spring', 'spring'],\n", " dtype='