{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "
\n", " \n", " \"QuantEcon\"\n", " \n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Data and Statistics Packages" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Contents\n", "\n", "- [Data and Statistics Packages](#Data-and-Statistics-Packages) \n", " - [Overview](#Overview) \n", " - [DataFrames](#DataFrames) \n", " - [Statistics and Econometrics](#Statistics-and-Econometrics) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n", "\n", "This lecture explores some of the key packages for working with data and doing statistics in Julia.\n", "\n", "In particular, we will examine the `DataFrame` object in detail (i.e., construction, manipulation, querying, visualization, and nuances like missing data).\n", "\n", "While Julia is not an ideal language for pure cookie-cutter statistical analysis, it has many useful packages to provide those tools as part of a more general solution.\n", "\n", "This list is not exhaustive, and others can be found in organizations such as [JuliaStats](https://github.com/JuliaStats), [JuliaData](https://github.com/JuliaData/), and [QueryVerse](https://github.com/queryverse)." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setup" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "hide-output": true }, "outputs": [], "source": [ "using InstantiateFromURL\n", "# optionally add arguments to force installation: instantiate = true, precompile = true\n", "github_project(\"QuantEcon/quantecon-notebooks-julia\", version = \"0.8.0\")" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "hide-output": true }, "outputs": [], "source": [ "using LinearAlgebra, Statistics\n", "using DataFrames, RDatasets, DataFramesMeta, CategoricalArrays, Query, VegaLite\n", "using GLM" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## DataFrames\n", "\n", "A useful package for working with data is [DataFrames.jl](https://github.com/JuliaStats/DataFrames.jl).\n", "\n", "The most important data type provided is a `DataFrame`, a two-dimensional array for storing heterogeneous data.\n", "\n", "Although data can be heterogeneous within a `DataFrame`, the contents of each column must be homogeneous\n", "(of the same type).\n", "\n", "This is analogous to a `data.frame` in R, a `DataFrame` in Pandas (Python) or, more loosely, a spreadsheet in Excel.\n", "\n", "There are a few different ways to create a `DataFrame`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Constructing and Accessing a DataFrame\n", "\n", "The first is to set up columns and construct a `DataFrame` by assigning names" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/html": [ "

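<p>4 rows × 2 columns</p>
<table>
<thead>
<tr><th></th><th>commod</th><th>price</th></tr>
<tr><th></th><th>String</th><th>Float64?</th></tr>
</thead>
<tbody>
<tr><th>1</th><td>crude</td><td>4.2</td></tr>
<tr><th>2</th><td>gas</td><td>11.3</td></tr>
<tr><th>3</th><td>gold</td><td>12.1</td></tr>
<tr><th>4</th><td>silver</td><td><em>missing</em></td></tr>
</tbody>
</table>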
" ], "text/latex": [ "\\begin{tabular}{r|cc}\n", "\t& commod & price\\\\\n", "\t\\hline\n", "\t& String & Float64?\\\\\n", "\t\\hline\n", "\t1 & crude & 4.2 \\\\\n", "\t2 & gas & 11.3 \\\\\n", "\t3 & gold & 12.1 \\\\\n", "\t4 & silver & \\emph{missing} \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "4×2 DataFrame\n", "│ Row │ commod │ price │\n", "│ │ \u001b[90mString\u001b[39m │ \u001b[90mFloat64?\u001b[39m │\n", "├─────┼────────┼──────────┤\n", "│ 1 │ crude │ 4.2 │\n", "│ 2 │ gas │ 11.3 │\n", "│ 3 │ gold │ 12.1 │\n", "│ 4 │ silver │ \u001b[90mmissing\u001b[39m │" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "using DataFrames, RDatasets # RDatasets provides good standard data examples from R\n", "\n", "# note use of missing\n", "commodities = [\"crude\", \"gas\", \"gold\", \"silver\"]\n", "last_price = [4.2, 11.3, 12.1, missing]\n", "df = DataFrame(commod = commodities, price = last_price)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Columns of the `DataFrame` can be accessed by name using `df.col`, as below" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "4-element Array{Union{Missing, Float64},1}:\n", " 4.2\n", " 11.3\n", " 12.1\n", " missing" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.price" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the type of this array has values `Union{Missing, Float64}` since it was created with a `missing` value." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "4-element Array{String,1}:\n", " \"crude\"\n", " \"gas\"\n", " \"gold\"\n", " \"silver\"" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.commod" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `DataFrames.jl` package provides a number of methods for acting on `DataFrame`s, such as `describe`." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/html": [ "

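<p>2 rows × 8 columns</p>
<table>
<thead>
<tr><th></th><th>variable</th><th>mean</th><th>min</th><th>median</th><th>max</th><th>nunique</th><th>nmissing</th><th>eltype</th></tr>
<tr><th></th><th>Symbol</th><th>Union…</th><th>Any</th><th>Union…</th><th>Any</th><th>Union…</th><th>Union…</th><th>Type</th></tr>
</thead>
<tbody>
<tr><th>1</th><td>commod</td><td></td><td>crude</td><td></td><td>silver</td><td>4</td><td></td><td>String</td></tr>
<tr><th>2</th><td>price</td><td>9.2</td><td>4.2</td><td>11.3</td><td>12.1</td><td></td><td>1</td><td>Union{Missing, Float64}</td></tr>
</tbody>
</table>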
" ], "text/latex": [ "\\begin{tabular}{r|cccccccc}\n", "\t& variable & mean & min & median & max & nunique & nmissing & eltype\\\\\n", "\t\\hline\n", "\t& Symbol & Union… & Any & Union… & Any & Union… & Union… & Type\\\\\n", "\t\\hline\n", "\t1 & commod & & crude & & silver & 4 & & String \\\\\n", "\t2 & price & 9.2 & 4.2 & 11.3 & 12.1 & & 1 & Union\\{Missing, Float64\\} \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "2×8 DataFrame. Omitted printing of 1 columns\n", "│ Row │ variable │ mean │ min │ median │ max │ nunique │ nmissing │\n", "│ │ \u001b[90mSymbol\u001b[39m │ \u001b[90mUnion…\u001b[39m │ \u001b[90mAny\u001b[39m │ \u001b[90mUnion…\u001b[39m │ \u001b[90mAny\u001b[39m │ \u001b[90mUnion…\u001b[39m │ \u001b[90mUnion…\u001b[39m │\n", "├─────┼──────────┼────────┼───────┼────────┼────────┼─────────┼──────────┤\n", "│ 1 │ commod │ │ crude │ │ silver │ 4 │ │\n", "│ 2 │ price │ 9.2 │ 4.2 │ 11.3 │ 12.1 │ │ 1 │" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "DataFrames.describe(df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While often data will be generated all at once, or read from a file, you can add to a `DataFrame` by providing the key parameters." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/html": [ "

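<p>5 rows × 2 columns</p>
<table>
<thead>
<tr><th></th><th>commod</th><th>price</th></tr>
<tr><th></th><th>String</th><th>Float64?</th></tr>
</thead>
<tbody>
<tr><th>1</th><td>crude</td><td>4.2</td></tr>
<tr><th>2</th><td>gas</td><td>11.3</td></tr>
<tr><th>3</th><td>gold</td><td>12.1</td></tr>
<tr><th>4</th><td>silver</td><td><em>missing</em></td></tr>
<tr><th>5</th><td>nickel</td><td>5.1</td></tr>
</tbody>
</table>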
" ], "text/latex": [ "\\begin{tabular}{r|cc}\n", "\t& commod & price\\\\\n", "\t\\hline\n", "\t& String & Float64?\\\\\n", "\t\\hline\n", "\t1 & crude & 4.2 \\\\\n", "\t2 & gas & 11.3 \\\\\n", "\t3 & gold & 12.1 \\\\\n", "\t4 & silver & \\emph{missing} \\\\\n", "\t5 & nickel & 5.1 \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "5×2 DataFrame\n", "│ Row │ commod │ price │\n", "│ │ \u001b[90mString\u001b[39m │ \u001b[90mFloat64?\u001b[39m │\n", "├─────┼────────┼──────────┤\n", "│ 1 │ crude │ 4.2 │\n", "│ 2 │ gas │ 11.3 │\n", "│ 3 │ gold │ 12.1 │\n", "│ 4 │ silver │ \u001b[90mmissing\u001b[39m │\n", "│ 5 │ nickel │ 5.1 │" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nt = (commod = \"nickel\", price= 5.1)\n", "push!(df, nt)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Named tuples can also be used to construct a `DataFrame`, and have it properly deduce all types." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/html": [ "

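<p>2 rows × 2 columns</p>
<table>
<thead>
<tr><th></th><th>t</th><th>col1</th></tr>
<tr><th></th><th>Int64</th><th>Float64</th></tr>
</thead>
<tbody>
<tr><th>1</th><td>1</td><td>3.0</td></tr>
<tr><th>2</th><td>2</td><td>4.0</td></tr>
</tbody>
</table>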
" ], "text/latex": [ "\\begin{tabular}{r|cc}\n", "\t& t & col1\\\\\n", "\t\\hline\n", "\t& Int64 & Float64\\\\\n", "\t\\hline\n", "\t1 & 1 & 3.0 \\\\\n", "\t2 & 2 & 4.0 \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "2×2 DataFrame\n", "│ Row │ t │ col1 │\n", "│ │ \u001b[90mInt64\u001b[39m │ \u001b[90mFloat64\u001b[39m │\n", "├─────┼───────┼─────────┤\n", "│ 1 │ 1 │ 3.0 │\n", "│ 2 │ 2 │ 4.0 │" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nt = (t = 1, col1 = 3.0)\n", "df2 = DataFrame([nt])\n", "push!(df2, (t=2, col1 = 4.0))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In order to modify a column, access the mutating version by the symbol `df[!, :col]`." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "5-element Array{Union{Missing, Float64},1}:\n", " 4.2\n", " 11.3\n", " 12.1\n", " missing\n", " 5.1" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[!, :price]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Which allows modifications, like other mutating `!` functions in julia." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "5-element Array{Union{Missing, Float64},1}:\n", " 8.4\n", " 22.6\n", " 24.2\n", " missing\n", " 10.2" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[!, :price] *= 2.0 # double prices" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As discussed in the next section, note that the [fundamental types](../getting_started_julia/fundamental_types.html#missing), is propagated, i.e. `missing * 2 === missing`." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Working with Missing\n", "\n", "As we discussed in [fundamental types](../getting_started_julia/fundamental_types.html#missing), the semantics of `missing` are that mathematical operations will not silently ignore it.\n", "\n", "In order to allow `missing` in a column, you can create/load the `DataFrame`\n", "from a source with `missing` values, or call `allowmissing!` on a column." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/html": [ "

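<p>4 rows × 2 columns</p>
<table>
<thead>
<tr><th></th><th>t</th><th>col1</th></tr>
<tr><th></th><th>Int64</th><th>Float64?</th></tr>
</thead>
<tbody>
<tr><th>1</th><td>1</td><td>3.0</td></tr>
<tr><th>2</th><td>2</td><td>4.0</td></tr>
<tr><th>3</th><td>3</td><td><em>missing</em></td></tr>
<tr><th>4</th><td>4</td><td>5.1</td></tr>
</tbody>
</table>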
" ], "text/latex": [ "\\begin{tabular}{r|cc}\n", "\t& t & col1\\\\\n", "\t\\hline\n", "\t& Int64 & Float64?\\\\\n", "\t\\hline\n", "\t1 & 1 & 3.0 \\\\\n", "\t2 & 2 & 4.0 \\\\\n", "\t3 & 3 & \\emph{missing} \\\\\n", "\t4 & 4 & 5.1 \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "4×2 DataFrame\n", "│ Row │ t │ col1 │\n", "│ │ \u001b[90mInt64\u001b[39m │ \u001b[90mFloat64?\u001b[39m │\n", "├─────┼───────┼──────────┤\n", "│ 1 │ 1 │ 3.0 │\n", "│ 2 │ 2 │ 4.0 │\n", "│ 3 │ 3 │ \u001b[90mmissing\u001b[39m │\n", "│ 4 │ 4 │ 5.1 │" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "allowmissing!(df2, :col1) # necessary to add in a for col1\n", "push!(df2, (t=3, col1 = missing))\n", "push!(df2, (t=4, col1 = 5.1))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see the propagation of `missing` to caller functions, as well as a way to efficiently calculate with non-missing data." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "hide-output": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "mean(df2.col1) = missing\n", "mean(skipmissing(df2.col1)) = 4.033333333333333\n" ] }, { "data": { "text/plain": [ "4.033333333333333" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@show mean(df2.col1)\n", "@show mean(skipmissing(df2.col1))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And to replace the `missing`" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "4-element Array{Union{Missing, Float64},1}:\n", " 3.0\n", " 4.0\n", " 0.0\n", " 5.1" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df2.col1 .= coalesce.(df2.col1, 0.0) # replace all missing with 0.0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Manipulating and Transforming DataFrames\n", "\n", "One way to do an 
additional calculation with a `DataFrame` is to use the `@transform` macro from `DataFramesMeta.jl`." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/html": [ "

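<p>4 rows × 3 columns</p>
<table>
<thead>
<tr><th></th><th>t</th><th>col1</th><th>col2</th></tr>
<tr><th></th><th>Int64</th><th>Float64?</th><th>Float64</th></tr>
</thead>
<tbody>
<tr><th>1</th><td>1</td><td>3.0</td><td>9.0</td></tr>
<tr><th>2</th><td>2</td><td>4.0</td><td>16.0</td></tr>
<tr><th>3</th><td>3</td><td>0.0</td><td>0.0</td></tr>
<tr><th>4</th><td>4</td><td>5.1</td><td>26.01</td></tr>
</tbody>
</table>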
" ], "text/latex": [ "\\begin{tabular}{r|ccc}\n", "\t& t & col1 & col2\\\\\n", "\t\\hline\n", "\t& Int64 & Float64? & Float64\\\\\n", "\t\\hline\n", "\t1 & 1 & 3.0 & 9.0 \\\\\n", "\t2 & 2 & 4.0 & 16.0 \\\\\n", "\t3 & 3 & 0.0 & 0.0 \\\\\n", "\t4 & 4 & 5.1 & 26.01 \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "4×3 DataFrame\n", "│ Row │ t │ col1 │ col2 │\n", "│ │ \u001b[90mInt64\u001b[39m │ \u001b[90mFloat64?\u001b[39m │ \u001b[90mFloat64\u001b[39m │\n", "├─────┼───────┼──────────┼─────────┤\n", "│ 1 │ 1 │ 3.0 │ 9.0 │\n", "│ 2 │ 2 │ 4.0 │ 16.0 │\n", "│ 3 │ 3 │ 0.0 │ 0.0 │\n", "│ 4 │ 4 │ 5.1 │ 26.01 │" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "using DataFramesMeta\n", "f(x) = x^2\n", "df2 = @transform(df2, col2 = f.(:col1))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Categorical Data\n", "\n", "For data that is [categorical](https://juliadata.github.io/DataFrames.jl/stable/man/categorical/)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/html": [ "

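<p>4 rows × 2 columns</p>
<table>
<thead>
<tr><th></th><th>id</th><th>y</th></tr>
<tr><th></th><th>Int64</th><th>Cat…</th></tr>
</thead>
<tbody>
<tr><th>1</th><td>1</td><td>old</td></tr>
<tr><th>2</th><td>2</td><td>young</td></tr>
<tr><th>3</th><td>3</td><td>young</td></tr>
<tr><th>4</th><td>4</td><td>old</td></tr>
</tbody>
</table>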
" ], "text/latex": [ "\\begin{tabular}{r|cc}\n", "\t& id & y\\\\\n", "\t\\hline\n", "\t& Int64 & Cat…\\\\\n", "\t\\hline\n", "\t1 & 1 & old \\\\\n", "\t2 & 2 & young \\\\\n", "\t3 & 3 & young \\\\\n", "\t4 & 4 & old \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "4×2 DataFrame\n", "│ Row │ id │ y │\n", "│ │ \u001b[90mInt64\u001b[39m │ \u001b[90mCat…\u001b[39m │\n", "├─────┼───────┼───────┤\n", "│ 1 │ 1 │ old │\n", "│ 2 │ 2 │ young │\n", "│ 3 │ 3 │ young │\n", "│ 4 │ 4 │ old │" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "using CategoricalArrays\n", "id = [1, 2, 3, 4]\n", "y = [\"old\", \"young\", \"young\", \"old\"]\n", "y = CategoricalArray(y)\n", "df = DataFrame(id=id, y=y)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "2-element Array{String,1}:\n", " \"old\"\n", " \"young\"" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "levels(df.y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Visualization, Querying, and Plots\n", "\n", "The `DataFrame` (and similar types that fulfill a standard generic interface) can fit into a variety of packages.\n", "\n", "One set of them is the [QueryVerse](https://github.com/queryverse).\n", "\n", "**Note:** The QueryVerse, in the same spirit as R’s tidyverse, makes heavy use of the pipeline syntax `|>`." 
] }, { "cell_type": "code", "execution_count": 17, "metadata": { "hide-output": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "g(f(x)) = 2.1972245773362196\n", "(x |> f) |> g = " ] }, { "name": "stdout", "output_type": "stream", "text": [ "2.1972245773362196\n" ] } ], "source": [ "x = 3.0\n", "f(x) = x^2\n", "g(x) = log(x)\n", "\n", "@show g(f(x))\n", "@show x |> f |> g; # pipes nest function calls" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To give an example directly from the source of the LINQ inspired [Query.jl](http://www.queryverse.org/Query.jl/stable/)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/html": [ "

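<p>1 rows × 2 columns</p>
<table>
<thead>
<tr><th></th><th>name</th><th>children</th></tr>
<tr><th></th><th>String</th><th>Int64</th></tr>
</thead>
<tbody>
<tr><th>1</th><td>Kirk</td><td>2</td></tr>
</tbody>
</table>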
" ], "text/latex": [ "\\begin{tabular}{r|cc}\n", "\t& name & children\\\\\n", "\t\\hline\n", "\t& String & Int64\\\\\n", "\t\\hline\n", "\t1 & Kirk & 2 \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "1×2 DataFrame\n", "│ Row │ name │ children │\n", "│ │ \u001b[90mString\u001b[39m │ \u001b[90mInt64\u001b[39m │\n", "├─────┼────────┼──────────┤\n", "│ 1 │ Kirk │ 2 │" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "using Query\n", "\n", "df = DataFrame(name=[\"John\", \"Sally\", \"Kirk\"], age=[23., 42., 59.], children=[3,5,2])\n", "\n", "x = @from i in df begin\n", " @where i.age>50\n", " @select {i.name, i.children}\n", " @collect DataFrame\n", "end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While it is possible to just use the `Plots.jl` library, there may be better options for displaying tabular data – such as [VegaLite.jl](https://github.com/queryverse/VegaLite.jl)." ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "hide-output": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "WARN Missing type for channel \"color\", using \"nominal\" instead.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARN Missing type for channel \"color\", using \"nominal\" instead.\n" ] }, { "data": { "application/vnd.vegalite.v4+json": { "data": { "values": [ { "PetalLength": 1.4, "PetalWidth": 0.2, "SepalLength": 5.1, "SepalWidth": 3.5, "Species": "setosa" }, { "PetalLength": 1.4, "PetalWidth": 0.2, "SepalLength": 4.9, "SepalWidth": 3.0, "Species": "setosa" }, { "PetalLength": 1.3, "PetalWidth": 0.2, "SepalLength": 4.7, "SepalWidth": 3.2, "Species": "setosa" }, { "PetalLength": 1.5, "PetalWidth": 0.2, "SepalLength": 4.6, "SepalWidth": 3.1, "Species": "setosa" }, { "PetalLength": 1.4, "PetalWidth": 0.2, "SepalLength": 5.0, "SepalWidth": 3.6, "Species": "setosa" }, { "PetalLength": 1.7, "PetalWidth": 0.4, "SepalLength": 5.4, "SepalWidth": 3.9, "Species": "setosa" 
}, { "PetalLength": 1.4, "PetalWidth": 0.3, "SepalLength": 4.6, "SepalWidth": 3.4, "Species": "setosa" }, { "PetalLength": 1.5, "PetalWidth": 0.2, "SepalLength": 5.0, "SepalWidth": 3.4, "Species": "setosa" }, { "PetalLength": 1.4, "PetalWidth": 0.2, "SepalLength": 4.4, "SepalWidth": 2.9, "Species": "setosa" }, { "PetalLength": 1.5, "PetalWidth": 0.1, "SepalLength": 4.9, "SepalWidth": 3.1, "Species": "setosa" }, { "PetalLength": 1.5, "PetalWidth": 0.2, "SepalLength": 5.4, "SepalWidth": 3.7, "Species": "setosa" }, { "PetalLength": 1.6, "PetalWidth": 0.2, "SepalLength": 4.8, "SepalWidth": 3.4, "Species": "setosa" }, { "PetalLength": 1.4, "PetalWidth": 0.1, "SepalLength": 4.8, "SepalWidth": 3.0, "Species": "setosa" }, { "PetalLength": 1.1, "PetalWidth": 0.1, "SepalLength": 4.3, "SepalWidth": 3.0, "Species": "setosa" }, { "PetalLength": 1.2, "PetalWidth": 0.2, "SepalLength": 5.8, "SepalWidth": 4.0, "Species": "setosa" }, { "PetalLength": 1.5, "PetalWidth": 0.4, "SepalLength": 5.7, "SepalWidth": 4.4, "Species": "setosa" }, { "PetalLength": 1.3, "PetalWidth": 0.4, "SepalLength": 5.4, "SepalWidth": 3.9, "Species": "setosa" }, { "PetalLength": 1.4, "PetalWidth": 0.3, "SepalLength": 5.1, "SepalWidth": 3.5, "Species": "setosa" }, { "PetalLength": 1.7, "PetalWidth": 0.3, "SepalLength": 5.7, "SepalWidth": 3.8, "Species": "setosa" }, { "PetalLength": 1.5, "PetalWidth": 0.3, "SepalLength": 5.1, "SepalWidth": 3.8, "Species": "setosa" }, { "PetalLength": 1.7, "PetalWidth": 0.2, "SepalLength": 5.4, "SepalWidth": 3.4, "Species": "setosa" }, { "PetalLength": 1.5, "PetalWidth": 0.4, "SepalLength": 5.1, "SepalWidth": 3.7, "Species": "setosa" }, { "PetalLength": 1.0, "PetalWidth": 0.2, "SepalLength": 4.6, "SepalWidth": 3.6, "Species": "setosa" }, { "PetalLength": 1.7, "PetalWidth": 0.5, "SepalLength": 5.1, "SepalWidth": 3.3, "Species": "setosa" }, { "PetalLength": 1.9, "PetalWidth": 0.2, "SepalLength": 4.8, "SepalWidth": 3.4, "Species": "setosa" }, { "PetalLength": 1.6, "PetalWidth": 
0.2, "SepalLength": 5.0, "SepalWidth": 3.0, "Species": "setosa" }, { "PetalLength": 1.6, "PetalWidth": 0.4, "SepalLength": 5.0, "SepalWidth": 3.4, "Species": "setosa" }, { "PetalLength": 1.5, "PetalWidth": 0.2, "SepalLength": 5.2, "SepalWidth": 3.5, "Species": "setosa" }, { "PetalLength": 1.4, "PetalWidth": 0.2, "SepalLength": 5.2, "SepalWidth": 3.4, "Species": "setosa" }, { "PetalLength": 1.6, "PetalWidth": 0.2, "SepalLength": 4.7, "SepalWidth": 3.2, "Species": "setosa" }, { "PetalLength": 1.6, "PetalWidth": 0.2, "SepalLength": 4.8, "SepalWidth": 3.1, "Species": "setosa" }, { "PetalLength": 1.5, "PetalWidth": 0.4, "SepalLength": 5.4, "SepalWidth": 3.4, "Species": "setosa" }, { "PetalLength": 1.5, "PetalWidth": 0.1, "SepalLength": 5.2, "SepalWidth": 4.1, "Species": "setosa" }, { "PetalLength": 1.4, "PetalWidth": 0.2, "SepalLength": 5.5, "SepalWidth": 4.2, "Species": "setosa" }, { "PetalLength": 1.5, "PetalWidth": 0.2, "SepalLength": 4.9, "SepalWidth": 3.1, "Species": "setosa" }, { "PetalLength": 1.2, "PetalWidth": 0.2, "SepalLength": 5.0, "SepalWidth": 3.2, "Species": "setosa" }, { "PetalLength": 1.3, "PetalWidth": 0.2, "SepalLength": 5.5, "SepalWidth": 3.5, "Species": "setosa" }, { "PetalLength": 1.4, "PetalWidth": 0.1, "SepalLength": 4.9, "SepalWidth": 3.6, "Species": "setosa" }, { "PetalLength": 1.3, "PetalWidth": 0.2, "SepalLength": 4.4, "SepalWidth": 3.0, "Species": "setosa" }, { "PetalLength": 1.5, "PetalWidth": 0.2, "SepalLength": 5.1, "SepalWidth": 3.4, "Species": "setosa" }, { "PetalLength": 1.3, "PetalWidth": 0.3, "SepalLength": 5.0, "SepalWidth": 3.5, "Species": "setosa" }, { "PetalLength": 1.3, "PetalWidth": 0.3, "SepalLength": 4.5, "SepalWidth": 2.3, "Species": "setosa" }, { "PetalLength": 1.3, "PetalWidth": 0.2, "SepalLength": 4.4, "SepalWidth": 3.2, "Species": "setosa" }, { "PetalLength": 1.6, "PetalWidth": 0.6, "SepalLength": 5.0, "SepalWidth": 3.5, "Species": "setosa" }, { "PetalLength": 1.9, "PetalWidth": 0.4, "SepalLength": 5.1, "SepalWidth": 
3.8, "Species": "setosa" }, { "PetalLength": 1.4, "PetalWidth": 0.3, "SepalLength": 4.8, "SepalWidth": 3.0, "Species": "setosa" }, { "PetalLength": 1.6, "PetalWidth": 0.2, "SepalLength": 5.1, "SepalWidth": 3.8, "Species": "setosa" }, { "PetalLength": 1.4, "PetalWidth": 0.2, "SepalLength": 4.6, "SepalWidth": 3.2, "Species": "setosa" }, { "PetalLength": 1.5, "PetalWidth": 0.2, "SepalLength": 5.3, "SepalWidth": 3.7, "Species": "setosa" }, { "PetalLength": 1.4, "PetalWidth": 0.2, "SepalLength": 5.0, "SepalWidth": 3.3, "Species": "setosa" }, { "PetalLength": 4.7, "PetalWidth": 1.4, "SepalLength": 7.0, "SepalWidth": 3.2, "Species": "versicolor" }, { "PetalLength": 4.5, "PetalWidth": 1.5, "SepalLength": 6.4, "SepalWidth": 3.2, "Species": "versicolor" }, { "PetalLength": 4.9, "PetalWidth": 1.5, "SepalLength": 6.9, "SepalWidth": 3.1, "Species": "versicolor" }, { "PetalLength": 4.0, "PetalWidth": 1.3, "SepalLength": 5.5, "SepalWidth": 2.3, "Species": "versicolor" }, { "PetalLength": 4.6, "PetalWidth": 1.5, "SepalLength": 6.5, "SepalWidth": 2.8, "Species": "versicolor" }, { "PetalLength": 4.5, "PetalWidth": 1.3, "SepalLength": 5.7, "SepalWidth": 2.8, "Species": "versicolor" }, { "PetalLength": 4.7, "PetalWidth": 1.6, "SepalLength": 6.3, "SepalWidth": 3.3, "Species": "versicolor" }, { "PetalLength": 3.3, "PetalWidth": 1.0, "SepalLength": 4.9, "SepalWidth": 2.4, "Species": "versicolor" }, { "PetalLength": 4.6, "PetalWidth": 1.3, "SepalLength": 6.6, "SepalWidth": 2.9, "Species": "versicolor" }, { "PetalLength": 3.9, "PetalWidth": 1.4, "SepalLength": 5.2, "SepalWidth": 2.7, "Species": "versicolor" }, { "PetalLength": 3.5, "PetalWidth": 1.0, "SepalLength": 5.0, "SepalWidth": 2.0, "Species": "versicolor" }, { "PetalLength": 4.2, "PetalWidth": 1.5, "SepalLength": 5.9, "SepalWidth": 3.0, "Species": "versicolor" }, { "PetalLength": 4.0, "PetalWidth": 1.0, "SepalLength": 6.0, "SepalWidth": 2.2, "Species": "versicolor" }, { "PetalLength": 4.7, "PetalWidth": 1.4, "SepalLength": 6.1, 
"SepalWidth": 2.9, "Species": "versicolor" }, { "PetalLength": 3.6, "PetalWidth": 1.3, "SepalLength": 5.6, "SepalWidth": 2.9, "Species": "versicolor" }, { "PetalLength": 4.4, "PetalWidth": 1.4, "SepalLength": 6.7, "SepalWidth": 3.1, "Species": "versicolor" }, { "PetalLength": 4.5, "PetalWidth": 1.5, "SepalLength": 5.6, "SepalWidth": 3.0, "Species": "versicolor" }, { "PetalLength": 4.1, "PetalWidth": 1.0, "SepalLength": 5.8, "SepalWidth": 2.7, "Species": "versicolor" }, { "PetalLength": 4.5, "PetalWidth": 1.5, "SepalLength": 6.2, "SepalWidth": 2.2, "Species": "versicolor" }, { "PetalLength": 3.9, "PetalWidth": 1.1, "SepalLength": 5.6, "SepalWidth": 2.5, "Species": "versicolor" }, { "PetalLength": 4.8, "PetalWidth": 1.8, "SepalLength": 5.9, "SepalWidth": 3.2, "Species": "versicolor" }, { "PetalLength": 4.0, "PetalWidth": 1.3, "SepalLength": 6.1, "SepalWidth": 2.8, "Species": "versicolor" }, { "PetalLength": 4.9, "PetalWidth": 1.5, "SepalLength": 6.3, "SepalWidth": 2.5, "Species": "versicolor" }, { "PetalLength": 4.7, "PetalWidth": 1.2, "SepalLength": 6.1, "SepalWidth": 2.8, "Species": "versicolor" }, { "PetalLength": 4.3, "PetalWidth": 1.3, "SepalLength": 6.4, "SepalWidth": 2.9, "Species": "versicolor" }, { "PetalLength": 4.4, "PetalWidth": 1.4, "SepalLength": 6.6, "SepalWidth": 3.0, "Species": "versicolor" }, { "PetalLength": 4.8, "PetalWidth": 1.4, "SepalLength": 6.8, "SepalWidth": 2.8, "Species": "versicolor" }, { "PetalLength": 5.0, "PetalWidth": 1.7, "SepalLength": 6.7, "SepalWidth": 3.0, "Species": "versicolor" }, { "PetalLength": 4.5, "PetalWidth": 1.5, "SepalLength": 6.0, "SepalWidth": 2.9, "Species": "versicolor" }, { "PetalLength": 3.5, "PetalWidth": 1.0, "SepalLength": 5.7, "SepalWidth": 2.6, "Species": "versicolor" }, { "PetalLength": 3.8, "PetalWidth": 1.1, "SepalLength": 5.5, "SepalWidth": 2.4, "Species": "versicolor" }, { "PetalLength": 3.7, "PetalWidth": 1.0, "SepalLength": 5.5, "SepalWidth": 2.4, "Species": "versicolor" }, { "PetalLength": 3.9, 
"PetalWidth": 1.2, "SepalLength": 5.8, "SepalWidth": 2.7, "Species": "versicolor" }, { "PetalLength": 5.1, "PetalWidth": 1.6, "SepalLength": 6.0, "SepalWidth": 2.7, "Species": "versicolor" }, { "PetalLength": 4.5, "PetalWidth": 1.5, "SepalLength": 5.4, "SepalWidth": 3.0, "Species": "versicolor" }, { "PetalLength": 4.5, "PetalWidth": 1.6, "SepalLength": 6.0, "SepalWidth": 3.4, "Species": "versicolor" }, { "PetalLength": 4.7, "PetalWidth": 1.5, "SepalLength": 6.7, "SepalWidth": 3.1, "Species": "versicolor" }, { "PetalLength": 4.4, "PetalWidth": 1.3, "SepalLength": 6.3, "SepalWidth": 2.3, "Species": "versicolor" }, { "PetalLength": 4.1, "PetalWidth": 1.3, "SepalLength": 5.6, "SepalWidth": 3.0, "Species": "versicolor" }, { "PetalLength": 4.0, "PetalWidth": 1.3, "SepalLength": 5.5, "SepalWidth": 2.5, "Species": "versicolor" }, { "PetalLength": 4.4, "PetalWidth": 1.2, "SepalLength": 5.5, "SepalWidth": 2.6, "Species": "versicolor" }, { "PetalLength": 4.6, "PetalWidth": 1.4, "SepalLength": 6.1, "SepalWidth": 3.0, "Species": "versicolor" }, { "PetalLength": 4.0, "PetalWidth": 1.2, "SepalLength": 5.8, "SepalWidth": 2.6, "Species": "versicolor" }, { "PetalLength": 3.3, "PetalWidth": 1.0, "SepalLength": 5.0, "SepalWidth": 2.3, "Species": "versicolor" }, { "PetalLength": 4.2, "PetalWidth": 1.3, "SepalLength": 5.6, "SepalWidth": 2.7, "Species": "versicolor" }, { "PetalLength": 4.2, "PetalWidth": 1.2, "SepalLength": 5.7, "SepalWidth": 3.0, "Species": "versicolor" }, { "PetalLength": 4.2, "PetalWidth": 1.3, "SepalLength": 5.7, "SepalWidth": 2.9, "Species": "versicolor" }, { "PetalLength": 4.3, "PetalWidth": 1.3, "SepalLength": 6.2, "SepalWidth": 2.9, "Species": "versicolor" }, { "PetalLength": 3.0, "PetalWidth": 1.1, "SepalLength": 5.1, "SepalWidth": 2.5, "Species": "versicolor" }, { "PetalLength": 4.1, "PetalWidth": 1.3, "SepalLength": 5.7, "SepalWidth": 2.8, "Species": "versicolor" }, { "PetalLength": 6.0, "PetalWidth": 2.5, "SepalLength": 6.3, "SepalWidth": 3.3, "Species": 
"virginica" }, { "PetalLength": 5.1, "PetalWidth": 1.9, "SepalLength": 5.8, "SepalWidth": 2.7, "Species": "virginica" }, { "PetalLength": 5.9, "PetalWidth": 2.1, "SepalLength": 7.1, "SepalWidth": 3.0, "Species": "virginica" }, { "PetalLength": 5.6, "PetalWidth": 1.8, "SepalLength": 6.3, "SepalWidth": 2.9, "Species": "virginica" }, { "PetalLength": 5.8, "PetalWidth": 2.2, "SepalLength": 6.5, "SepalWidth": 3.0, "Species": "virginica" }, { "PetalLength": 6.6, "PetalWidth": 2.1, "SepalLength": 7.6, "SepalWidth": 3.0, "Species": "virginica" }, { "PetalLength": 4.5, "PetalWidth": 1.7, "SepalLength": 4.9, "SepalWidth": 2.5, "Species": "virginica" }, { "PetalLength": 6.3, "PetalWidth": 1.8, "SepalLength": 7.3, "SepalWidth": 2.9, "Species": "virginica" }, { "PetalLength": 5.8, "PetalWidth": 1.8, "SepalLength": 6.7, "SepalWidth": 2.5, "Species": "virginica" }, { "PetalLength": 6.1, "PetalWidth": 2.5, "SepalLength": 7.2, "SepalWidth": 3.6, "Species": "virginica" }, { "PetalLength": 5.1, "PetalWidth": 2.0, "SepalLength": 6.5, "SepalWidth": 3.2, "Species": "virginica" }, { "PetalLength": 5.3, "PetalWidth": 1.9, "SepalLength": 6.4, "SepalWidth": 2.7, "Species": "virginica" }, { "PetalLength": 5.5, "PetalWidth": 2.1, "SepalLength": 6.8, "SepalWidth": 3.0, "Species": "virginica" }, { "PetalLength": 5.0, "PetalWidth": 2.0, "SepalLength": 5.7, "SepalWidth": 2.5, "Species": "virginica" }, { "PetalLength": 5.1, "PetalWidth": 2.4, "SepalLength": 5.8, "SepalWidth": 2.8, "Species": "virginica" }, { "PetalLength": 5.3, "PetalWidth": 2.3, "SepalLength": 6.4, "SepalWidth": 3.2, "Species": "virginica" }, { "PetalLength": 5.5, "PetalWidth": 1.8, "SepalLength": 6.5, "SepalWidth": 3.0, "Species": "virginica" }, { "PetalLength": 6.7, "PetalWidth": 2.2, "SepalLength": 7.7, "SepalWidth": 3.8, "Species": "virginica" }, { "PetalLength": 6.9, "PetalWidth": 2.3, "SepalLength": 7.7, "SepalWidth": 2.6, "Species": "virginica" }, { "PetalLength": 5.0, "PetalWidth": 1.5, "SepalLength": 6.0, "SepalWidth": 
2.2, "Species": "virginica" }, { "PetalLength": 5.7, "PetalWidth": 2.3, "SepalLength": 6.9, "SepalWidth": 3.2, "Species": "virginica" }, { "PetalLength": 4.9, "PetalWidth": 2.0, "SepalLength": 5.6, "SepalWidth": 2.8, "Species": "virginica" }, { "PetalLength": 6.7, "PetalWidth": 2.0, "SepalLength": 7.7, "SepalWidth": 2.8, "Species": "virginica" }, { "PetalLength": 4.9, "PetalWidth": 1.8, "SepalLength": 6.3, "SepalWidth": 2.7, "Species": "virginica" }, { "PetalLength": 5.7, "PetalWidth": 2.1, "SepalLength": 6.7, "SepalWidth": 3.3, "Species": "virginica" }, { "PetalLength": 6.0, "PetalWidth": 1.8, "SepalLength": 7.2, "SepalWidth": 3.2, "Species": "virginica" }, { "PetalLength": 4.8, "PetalWidth": 1.8, "SepalLength": 6.2, "SepalWidth": 2.8, "Species": "virginica" }, { "PetalLength": 4.9, "PetalWidth": 1.8, "SepalLength": 6.1, "SepalWidth": 3.0, "Species": "virginica" }, { "PetalLength": 5.6, "PetalWidth": 2.1, "SepalLength": 6.4, "SepalWidth": 2.8, "Species": "virginica" }, { "PetalLength": 5.8, "PetalWidth": 1.6, "SepalLength": 7.2, "SepalWidth": 3.0, "Species": "virginica" }, { "PetalLength": 6.1, "PetalWidth": 1.9, "SepalLength": 7.4, "SepalWidth": 2.8, "Species": "virginica" }, { "PetalLength": 6.4, "PetalWidth": 2.0, "SepalLength": 7.9, "SepalWidth": 3.8, "Species": "virginica" }, { "PetalLength": 5.6, "PetalWidth": 2.2, "SepalLength": 6.4, "SepalWidth": 2.8, "Species": "virginica" }, { "PetalLength": 5.1, "PetalWidth": 1.5, "SepalLength": 6.3, "SepalWidth": 2.8, "Species": "virginica" }, { "PetalLength": 5.6, "PetalWidth": 1.4, "SepalLength": 6.1, "SepalWidth": 2.6, "Species": "virginica" }, { "PetalLength": 6.1, "PetalWidth": 2.3, "SepalLength": 7.7, "SepalWidth": 3.0, "Species": "virginica" }, { "PetalLength": 5.6, "PetalWidth": 2.4, "SepalLength": 6.3, "SepalWidth": 3.4, "Species": "virginica" }, { "PetalLength": 5.5, "PetalWidth": 1.8, "SepalLength": 6.4, "SepalWidth": 3.1, "Species": "virginica" }, { "PetalLength": 4.8, "PetalWidth": 1.8, "SepalLength": 6.0, 
"SepalWidth": 3.0, "Species": "virginica" }, { "PetalLength": 5.4, "PetalWidth": 2.1, "SepalLength": 6.9, "SepalWidth": 3.1, "Species": "virginica" }, { "PetalLength": 5.6, "PetalWidth": 2.4, "SepalLength": 6.7, "SepalWidth": 3.1, "Species": "virginica" }, { "PetalLength": 5.1, "PetalWidth": 2.3, "SepalLength": 6.9, "SepalWidth": 3.1, "Species": "virginica" }, { "PetalLength": 5.1, "PetalWidth": 1.9, "SepalLength": 5.8, "SepalWidth": 2.7, "Species": "virginica" }, { "PetalLength": 5.9, "PetalWidth": 2.3, "SepalLength": 6.8, "SepalWidth": 3.2, "Species": "virginica" }, { "PetalLength": 5.7, "PetalWidth": 2.5, "SepalLength": 6.7, "SepalWidth": 3.3, "Species": "virginica" }, { "PetalLength": 5.2, "PetalWidth": 2.3, "SepalLength": 6.7, "SepalWidth": 3.0, "Species": "virginica" }, { "PetalLength": 5.0, "PetalWidth": 1.9, "SepalLength": 6.3, "SepalWidth": 2.5, "Species": "virginica" }, { "PetalLength": 5.2, "PetalWidth": 2.0, "SepalLength": 6.5, "SepalWidth": 3.0, "Species": "virginica" }, { "PetalLength": 5.4, "PetalWidth": 2.3, "SepalLength": 6.2, "SepalWidth": 3.4, "Species": "virginica" }, { "PetalLength": 5.1, "PetalWidth": 1.8, "SepalLength": 5.9, "SepalWidth": 3.0, "Species": "virginica" } ] }, "encoding": { "color": { "field": "Species" }, "x": { "field": "PetalLength", "type": "quantitative" }, "y": { "field": "PetalWidth", "type": "quantitative" } }, "mark": "point" }, "image/png": "", "image/svg+xml": [ "\n", "\n", "01234567PetalLength0.00.51.01.52.02.5PetalWidthsetosaversicolorvirginicaSpecies\n" ], "text/plain": [ "@vlplot(\n", " mark=\"point\",\n", " encoding={\n", " x={\n", " field=\"PetalLength\"\n", " },\n", " y={\n", " field=\"PetalWidth\"\n", " },\n", " color={\n", " field=\"Species\"\n", " }\n", " },\n", " data={\n", " values=...\n", " }\n", ")" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "using RDatasets, VegaLite\n", "iris = dataset(\"datasets\", \"iris\")\n", "\n", "iris |> @vlplot(\n", " :point,\n", 
" x=:PetalLength,\n", " y=:PetalWidth,\n", " color=:Species\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Statistics and Econometrics\n", "\n", "While Julia is not intended as a replacement for R, Stata, and similar specialty languages, it has a growing number of packages aimed at statistics and econometrics.\n", "\n", "Many of the packages live in the [JuliaStats organization](https://github.com/JuliaStats/).\n", "\n", "A few to point out:\n", "\n", "- [StatsBase](https://github.com/JuliaStats/StatsBase.jl) has basic statistical functions such as geometric and harmonic means, auto-correlations, robust statistics, etc. \n", "- [StatsFuns](https://github.com/JuliaStats/StatsFuns.jl) has a variety of mathematical functions and constants such as the pdf and cdf of many distributions, softmax, etc. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Generalized Linear Models\n", "\n", "To run linear regressions and related models, use the [GLM](http://juliastats.github.io/GLM.jl/latest/) package." ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}\n", "\n", "y ~ 1 + x\n", "\n", "Coefficients:\n", "──────────────────────────────────────────────────────────────────────────\n", " Estimate Std. 
Error t value Pr(>|t|) Lower 95% Upper 95%\n", "──────────────────────────────────────────────────────────────────────────\n", "(Intercept) 0.239083 0.0149413 16.0015 <1e-28 0.209432 0.268733\n", "x 0.910987 0.0149344 60.9993 <1e-79 0.88135 0.940624\n", "──────────────────────────────────────────────────────────────────────────" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "using GLM\n", "\n", "x = randn(100)\n", "y = 0.9 .* x + 0.5 * rand(100)\n", "df = DataFrame(x=x, y=y)\n", "ols = lm(@formula(y ~ x), df) # R-style notation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To display the results in useful tables for LaTeX and the REPL, use\n", "[RegressionTables](https://github.com/jmboehm/RegressionTables.jl/) for output\n", "similar to the Stata package esttab and the R package stargazer." ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "hide-output": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "----------------------\n", " y \n", " --------\n", " (1)\n", "----------------------\n", "(Intercept) 0.239***\n", " (0.015)\n", "x 0.911***\n", " (0.015)\n", "----------------------\n", "Estimator OLS\n", "----------------------\n", "N 100\n", "R2 0.974\n", "----------------------\n", "\n", "\n" ] } ], "source": [ "using RegressionTables\n", "regtable(ols)\n", "# regtable(ols, renderSettings = latexOutput()) # for LaTeX output" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Fixed Effects\n", "\n", "While Julia may be overkill for estimating a simple linear regression,\n", "fixed-effects estimation with dummies for multiple variables is much more computationally intensive.\n", "\n", "For a two-way fixed-effects model, we take the example directly from the documentation, using [cigarette consumption data](https://github.com/johnmyleswhite/RDatasets.jl/blob/master/doc/plm/rst/Cigar.rst)" ] }, { "cell_type": "code", "execution_count": 22, 
"metadata": { "hide-output": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "----------------------------\n", " Sales \n", " ---------\n", " (1)\n", "----------------------------\n", "NDI -0.005***\n", " (0.001)\n", "----------------------------\n", "StateCategorical Yes\n", "YearCategorical Yes\n", "----------------------------\n", "Estimator OLS\n", "----------------------------\n", "N 1,380\n", "R2 0.803\n", "----------------------------\n", "\n", "\n" ] } ], "source": [ "using FixedEffectModels\n", "cigar = dataset(\"plm\", \"Cigar\")\n", "cigar.StateCategorical = categorical(cigar.State)\n", "cigar.YearCategorical = categorical(cigar.Year)\n", "fixedeffectresults = reg(cigar, @formula(Sales ~ NDI + fe(StateCategorical) + fe(YearCategorical)),\n", " weights = :Pop, Vcov.cluster(:State))\n", "regtable(fixedeffectresults)" ] } ], "metadata": { "date": 1591310622.499557, "download_nb": 1, "download_nb_path": "https://julia.quantecon.org/", "filename": "data_statistical_packages.rst", "filename_with_path": "more_julia/data_statistical_packages", "kernelspec": { "display_name": "Julia 1.4.2", "language": "julia", "name": "julia-1.4" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.4.2" }, "title": "Data and Statistics Packages" }, "nbformat": 4, "nbformat_minor": 2 }