{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Querying hierarchical data with Rabbit\n", "\n", "We reproduce the queries from the section on querying hierarchical data using Rabbit syntax." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The database\n", "We start with loading a sample database. Our sample database is derived from the dataset of all employees of the city of Chicago ([source](https://data.cityofchicago.org/Administration-Finance/Current-Employee-Names-Salaries-and-Position-Title/xzkq-xp2w))." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "Department:\n", " name :: UTF8String # unique\n", " employee (inverse of Employee.department) :: Array{Employee,1} # unique, covering\n", "Employee:\n", " name :: UTF8String\n", " surname :: UTF8String\n", " position :: UTF8String\n", " salary :: Int64\n", " department :: Department\n", " managed_by :: Nullable{Employee}\n", " manages (inverse of Employee.managed_by) :: Array{Employee,1} # unique" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "include(\"../citydb.jl\")\n", "\n", "using RBT\n", "setdb(citydb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can execute a query using `@query()` command:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "42" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(6*(3+4))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Traversing the hierarchy\n", "*Find the names of all departments.*" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "35-element Array{UTF8String,1}:\n", " \"WATER MGMNT\" \n", " \"POLICE\" \n", " \"GENERAL SERVICES\" \n", " \"CITY COUNCIL\" \n", " \"STREETS & SAN\" \n", " ⋮ \n", " \"BOARD OF ETHICS\" \n", " \"POLICE BOARD\" \n", " \"BUDGET & MGMT\" \n", " \"ADMIN HEARNG\" \n", " \"LICENSE APPL COMM\"" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(department.name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Find the names of all employees.*" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "32181-element Array{UTF8String,1}:\n", " \"ELVIA\" \n", " \"VICENTE\" \n", " \"MUHAMMAD\" \n", " \"GIRLEY\" \n", " \"DILAN\" \n", " ⋮ \n", " \"NANCY\" \n", " \"DARCI\" \n", " \"THADDEUS\" \n", " \"RACHENETTE\"\n", " \"MICHELLE\" " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(department.employee.name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We are not longer restricted by the hierarchical structure of the database, so we can query employees directly." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "32181-element Array{UTF8String,1}:\n", " \"ELVIA\" \n", " \"JEFFERY\" \n", " \"KARINA\" \n", " \"KIMBERLEI\"\n", " \"VICENTE\" \n", " ⋮ \n", " \"MICHAEL\" \n", " \"PETER\" \n", " \"MARK\" \n", " \"CARLO\" \n", " \"DARIUSZ\" " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(employee.name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If the traversal ends at an entity class, an array of records is generated." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "32181-element Array{Tuple{UTF8String,UTF8String,UTF8String,UTF8String,Int64},1}:\n", " (\"ELVIA\",\"A\",\"WATER MGMNT\",\"WATER RATE TAKER\",88968) \n", " (\"JEFFERY\",\"A\",\"POLICE\",\"POLICE OFFICER\",80778) \n", " (\"KARINA\",\"A\",\"POLICE\",\"POLICE OFFICER\",80778) \n", " (\"KIMBERLEI\",\"A\",\"GENERAL SERVICES\",\"CHIEF CONTRACT EXPEDITER\",84780) \n", " (\"VICENTE\",\"A\",\"WATER MGMNT\",\"CIVIL ENGINEER IV\",104736) \n", " ⋮ \n", " (\"MICHAEL\",\"Z\",\"GENERAL SERVICES\",\"FRM OF MACHINISTS - AUTOMOTIVE\",97448)\n", " (\"PETER\",\"Z\",\"POLICE\",\"POLICE OFFICER\",86520) \n", " (\"MARK\",\"Z\",\"POLICE\",\"POLICE OFFICER\",83616) \n", " (\"CARLO\",\"Z\",\"POLICE\",\"POLICE OFFICER\",86520) \n", " (\"DARIUSZ\",\"Z\",\"DoIT\",\"CHIEF DATA BASE ANALYST\",110352) " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(employee)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summarizing data\n", "*Find the number of departments.*" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "35" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(count(department))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Find the number of employees for each department.*" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "35-element Array{Int64,1}:\n", " 1848\n", " 13570\n", " 924\n", " 397\n", " 2090\n", " ⋮\n", " 9\n", " 2\n", " 43\n", " 39\n", " 1" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(department.count(employee))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Find the total number of employees.*" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "32181" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(count(department.employee))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again, we can query `employee` directly." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "32181" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(count(employee))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Find the top salary among all employees.*" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "Nullable(260004)" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(max(employee.salary))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Find the maximum number of employees per department.*" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "Nullable(13570)" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(max(department.count(employee)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Tabular output\n", "*For each department, find the number of employees.*" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "35-element Array{Tuple{UTF8String,Int64},1}:\n", " (\"WATER MGMNT\",1848) \n", " (\"POLICE\",13570) \n", " (\"GENERAL SERVICES\",924)\n", " (\"CITY COUNCIL\",397) \n", " (\"STREETS & SAN\",2090) \n", " ⋮ \n", " (\"BOARD OF ETHICS\",9) \n", " (\"POLICE BOARD\",2) \n", " (\"BUDGET & MGMT\",43) \n", " (\"ADMIN HEARNG\",39) \n", " (\"LICENSE APPL COMM\",1) " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(department:select(name,count(employee)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is easy to add new columns to the output. Let us add *the top salary per department.*" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "35-element Array{Tuple{UTF8String,Int64,Nullable{Int64}},1}:\n", " (\"WATER MGMNT\",1848,Nullable(169512)) \n", " (\"POLICE\",13570,Nullable(260004)) \n", " (\"GENERAL SERVICES\",924,Nullable(157092))\n", " (\"CITY COUNCIL\",397,Nullable(160248)) \n", " (\"STREETS & SAN\",2090,Nullable(157092)) \n", " ⋮ \n", " (\"BOARD OF ETHICS\",9,Nullable(131688)) \n", " (\"POLICE BOARD\",2,Nullable(97728)) \n", " (\"BUDGET & MGMT\",43,Nullable(169992)) \n", " (\"ADMIN HEARNG\",39,Nullable(156420)) \n", " (\"LICENSE APPL COMM\",1,Nullable(69888)) " ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(\n", " department\n", " :select(\n", " name,\n", " count(employee),\n", " max(employee.salary)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Filtering data\n", "*Find the employees with salary greater than $200k.*" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "3-element Array{Tuple{UTF8String,UTF8String,UTF8String,UTF8String,Int64},1}:\n", " (\"RAHM\",\"E\",\"MAYOR'S OFFICE\",\"MAYOR\",216210) \n", " (\"GARRY\",\"M\",\"POLICE\",\"SUPERINTENDENT OF POLICE\",260004)\n", " (\"JOSE\",\"S\",\"FIRE\",\"FIRE COMMISSIONER\",202728) " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(\n", " employee\n", " :filter(salary>200000))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Find the departments with more than 1000 employees.*" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "7-element Array{Tuple{UTF8String,Int64},1}:\n", " (\"WATER MGMNT\",1848) \n", " (\"POLICE\",13570) \n", " (\"STREETS & SAN\",2090)\n", " (\"AVIATION\",1344) \n", " (\"FIRE\",4875) \n", " (\"OEMC\",1135) \n", " (\"TRANSPORTN\",1200) " ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(\n", " department\n", " :filter(count(employee)>1000)\n", " :select(name, count(employee)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Find the number of departments with more than 1000 employees.*" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "7" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(\n", " count(\n", " department\n", " :filter(count(employee)>1000)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*For each department, find the number of employees with salary higher than $100k.*" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "35-element Array{Tuple{UTF8String,Int64},1}:\n", " (\"WATER MGMNT\",179) \n", " (\"POLICE\",1493) \n", " (\"GENERAL SERVICES\",79)\n", " (\"CITY COUNCIL\",54) \n", " (\"STREETS & SAN\",39) \n", " ⋮ \n", " (\"BOARD OF ETHICS\",2) \n", " (\"POLICE BOARD\",0) \n", " (\"BUDGET & MGMT\",12) \n", " (\"ADMIN HEARNG\",3) \n", " (\"LICENSE APPL COMM\",0)" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@query(\n", " department\n", " :select(\n", " name,\n", " count(employee:filter(salary>100000))))" ] } ], "metadata": { "kernelspec": { "display_name": "Julia 0.5.0-dev", "language": "julia", "name": "julia-0.5" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "0.5.0" } }, "nbformat": 4, "nbformat_minor": 0 }