{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Late contributions Received and Made"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%load_ext sql"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"u'Connected: ccdc@calaccess_website'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from django.conf import settings\n",
"connection_string = 'postgresql+psycopg2://{USER}:{PASSWORD}@{HOST}:{PORT}/{NAME}'.format(\n",
" **settings.DATABASES['default']\n",
")\n",
"%sql $connection_string"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Unique Composite Key"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The documentation says that the records are unique on the following fields:\n",
"* `FILING_ID`\n",
"* `AMEND_ID`\n",
"* `LINE_ITEM`\n",
"* `REC_TYPE`\n",
"* `FORM_TYPE`\n",
"\n",
"`REC_TYPE` is always the same value: `S497`, so we can ignore this column. \n",
"\n",
"`FORM_TYPE` is either `F497P1` or `F497P2`, indicating in whether itemized transaction is listed under Part 1 (Contributions Received) or Part 2 (Contributions Made). I'll split these up into separate tables."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Are the `S497_CD` records actually unique on `FILING_ID`, `AMEND_ID` and `LINE_ITEM`?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Yes. And this is even true across the Parts 1 and 2 (Contributions Received and Contributions Made)."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 rows affected.\n"
]
},
{
"data": {
"text/html": [
"
\n",
" \n",
" FILING_ID | \n",
" AMEND_ID | \n",
" LINE_ITEM | \n",
" count | \n",
"
\n",
"
"
],
"text/plain": [
"[]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT \"FILING_ID\", \"AMEND_ID\", \"LINE_ITEM\", COUNT(*)\n",
"FROM \"S497_CD\"\n",
"GROUP BY 1, 2, 3\n",
"HAVING COUNT(*) > 1\n",
"ORDER BY COUNT(*) DESC;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## `TRAN_ID`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `S497_CD` table includes a `TRAN_ID` field, which the [documentation](http://calaccess.californiacivicdata.org/documentation/calaccess-files/s497-cd/#fields) describes as a \"Permanent value unique to this item\"."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Is `TRAN_ID` ever `NULL` or blank?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"No."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" count | \n",
"
\n",
" \n",
" 0 | \n",
"
\n",
"
"
],
"text/plain": [
"[(0L,)]"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT COUNT(*)\n",
"FROM \"S497_CD\"\n",
"WHERE \"TRAN_ID\" IS NULL OR \"TRAN_ID\" = '' OR \"TRAN_ID\" = '0';"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Is `TRAN_ID` unique across filings?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Decidedly no."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"100 rows affected.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" TRAN_ID | \n",
" count | \n",
"
\n",
" \n",
" 1 | \n",
" 14271 | \n",
"
\n",
" \n",
" 2 | \n",
" 5893 | \n",
"
\n",
" \n",
" 3 | \n",
" 3490 | \n",
"
\n",
" \n",
" 4 | \n",
" 2336 | \n",
"
\n",
" \n",
" 5 | \n",
" 1661 | \n",
"
\n",
" \n",
" 6 | \n",
" 1139 | \n",
"
\n",
" \n",
" 7 | \n",
" 848 | \n",
"
\n",
" \n",
" 8 | \n",
" 659 | \n",
"
\n",
" \n",
" 01 | \n",
" 627 | \n",
"
\n",
" \n",
" 9 | \n",
" 534 | \n",
"
\n",
" \n",
" 10 | \n",
" 460 | \n",
"
\n",
" \n",
" 11 | \n",
" 383 | \n",
"
\n",
" \n",
" INC1 | \n",
" 347 | \n",
"
\n",
" \n",
" EXP1 | \n",
" 332 | \n",
"
\n",
" \n",
" 12 | \n",
" 318 | \n",
"
\n",
" \n",
" 13 | \n",
" 273 | \n",
"
\n",
" \n",
" 02 | \n",
" 272 | \n",
"
\n",
" \n",
" 14 | \n",
" 239 | \n",
"
\n",
" \n",
" INC2 | \n",
" 225 | \n",
"
\n",
" \n",
" EXP2 | \n",
" 218 | \n",
"
\n",
" \n",
" 15 | \n",
" 208 | \n",
"
\n",
" \n",
" INC3 | \n",
" 187 | \n",
"
\n",
" \n",
" 16 | \n",
" 182 | \n",
"
\n",
" \n",
" INC4 | \n",
" 176 | \n",
"
\n",
" \n",
" EXP3 | \n",
" 167 | \n",
"
\n",
" \n",
" INC5 | \n",
" 163 | \n",
"
\n",
" \n",
" INC7 | \n",
" 158 | \n",
"
\n",
" \n",
" INC6 | \n",
" 157 | \n",
"
\n",
" \n",
" 17 | \n",
" 147 | \n",
"
\n",
" \n",
" 03 | \n",
" 147 | \n",
"
\n",
" \n",
" INC8 | \n",
" 142 | \n",
"
\n",
" \n",
" EXP4 | \n",
" 142 | \n",
"
\n",
" \n",
" 18 | \n",
" 139 | \n",
"
\n",
" \n",
" EXP5 | \n",
" 127 | \n",
"
\n",
" \n",
" INC10 | \n",
" 126 | \n",
"
\n",
" \n",
" EXP7 | \n",
" 125 | \n",
"
\n",
" \n",
" 19 | \n",
" 124 | \n",
"
\n",
" \n",
" INC12 | \n",
" 124 | \n",
"
\n",
" \n",
" INC9 | \n",
" 117 | \n",
"
\n",
" \n",
" EXP9 | \n",
" 115 | \n",
"
\n",
" \n",
" 20 | \n",
" 114 | \n",
"
\n",
" \n",
" INC11 | \n",
" 112 | \n",
"
\n",
" \n",
" INC14 | \n",
" 110 | \n",
"
\n",
" \n",
" EXP6 | \n",
" 108 | \n",
"
\n",
" \n",
" INC15 | \n",
" 108 | \n",
"
\n",
" \n",
" INC13 | \n",
" 108 | \n",
"
\n",
" \n",
" 21 | \n",
" 106 | \n",
"
\n",
" \n",
" EXP8 | \n",
" 103 | \n",
"
\n",
" \n",
" INC18 | \n",
" 102 | \n",
"
\n",
" \n",
" INC25 | \n",
" 101 | \n",
"
\n",
" \n",
" INC563 | \n",
" 100 | \n",
"
\n",
" \n",
" INC573 | \n",
" 100 | \n",
"
\n",
" \n",
" 04 | \n",
" 100 | \n",
"
\n",
" \n",
" INC564 | \n",
" 100 | \n",
"
\n",
" \n",
" EXP11 | \n",
" 98 | \n",
"
\n",
" \n",
" INC26 | \n",
" 98 | \n",
"
\n",
" \n",
" INC670 | \n",
" 97 | \n",
"
\n",
" \n",
" INC714 | \n",
" 97 | \n",
"
\n",
" \n",
" INC583 | \n",
" 97 | \n",
"
\n",
" \n",
" INC24 | \n",
" 96 | \n",
"
\n",
" \n",
" INC809 | \n",
" 96 | \n",
"
\n",
" \n",
" INC625 | \n",
" 96 | \n",
"
\n",
" \n",
" INC16 | \n",
" 96 | \n",
"
\n",
" \n",
" INC609 | \n",
" 96 | \n",
"
\n",
" \n",
" INC20 | \n",
" 95 | \n",
"
\n",
" \n",
" INC778 | \n",
" 95 | \n",
"
\n",
" \n",
" INC575 | \n",
" 95 | \n",
"
\n",
" \n",
" EXP18 | \n",
" 94 | \n",
"
\n",
" \n",
" INC17 | \n",
" 94 | \n",
"
\n",
" \n",
" INC567 | \n",
" 94 | \n",
"
\n",
" \n",
" INC721 | \n",
" 94 | \n",
"
\n",
" \n",
" INC22 | \n",
" 94 | \n",
"
\n",
" \n",
" INC681 | \n",
" 94 | \n",
"
\n",
" \n",
" INC574 | \n",
" 93 | \n",
"
\n",
" \n",
" INC665 | \n",
" 93 | \n",
"
\n",
" \n",
" INC95 | \n",
" 93 | \n",
"
\n",
" \n",
" INC701 | \n",
" 93 | \n",
"
\n",
" \n",
" EXP16 | \n",
" 93 | \n",
"
\n",
" \n",
" INC32 | \n",
" 93 | \n",
"
\n",
" \n",
" INC887 | \n",
" 92 | \n",
"
\n",
" \n",
" INC832 | \n",
" 92 | \n",
"
\n",
" \n",
" INC570 | \n",
" 92 | \n",
"
\n",
" \n",
" INC561 | \n",
" 92 | \n",
"
\n",
" \n",
" INC766 | \n",
" 92 | \n",
"
\n",
" \n",
" INC716 | \n",
" 92 | \n",
"
\n",
" \n",
" 22 | \n",
" 92 | \n",
"
\n",
" \n",
" INC738 | \n",
" 92 | \n",
"
\n",
" \n",
" INC23 | \n",
" 91 | \n",
"
\n",
" \n",
" INC565 | \n",
" 91 | \n",
"
\n",
" \n",
" INC591 | \n",
" 91 | \n",
"
\n",
" \n",
" INC722 | \n",
" 91 | \n",
"
\n",
" \n",
" INC646 | \n",
" 91 | \n",
"
\n",
" \n",
" INC764 | \n",
" 91 | \n",
"
\n",
" \n",
" INC606 | \n",
" 91 | \n",
"
\n",
" \n",
" INC584 | \n",
" 91 | \n",
"
\n",
" \n",
" INC705 | \n",
" 91 | \n",
"
\n",
" \n",
" INC651 | \n",
" 91 | \n",
"
\n",
" \n",
" INC736 | \n",
" 91 | \n",
"
\n",
" \n",
" INC664 | \n",
" 91 | \n",
"
\n",
" \n",
" INC640 | \n",
" 90 | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'1', 14271L),\n",
" (u'2', 5893L),\n",
" (u'3', 3490L),\n",
" (u'4', 2336L),\n",
" (u'5', 1661L),\n",
" (u'6', 1139L),\n",
" (u'7', 848L),\n",
" (u'8', 659L),\n",
" (u'01', 627L),\n",
" (u'9', 534L),\n",
" (u'10', 460L),\n",
" (u'11', 383L),\n",
" (u'INC1', 347L),\n",
" (u'EXP1', 332L),\n",
" (u'12', 318L),\n",
" (u'13', 273L),\n",
" (u'02', 272L),\n",
" (u'14', 239L),\n",
" (u'INC2', 225L),\n",
" (u'EXP2', 218L),\n",
" (u'15', 208L),\n",
" (u'INC3', 187L),\n",
" (u'16', 182L),\n",
" (u'INC4', 176L),\n",
" (u'EXP3', 167L),\n",
" (u'INC5', 163L),\n",
" (u'INC7', 158L),\n",
" (u'INC6', 157L),\n",
" (u'17', 147L),\n",
" (u'03', 147L),\n",
" (u'INC8', 142L),\n",
" (u'EXP4', 142L),\n",
" (u'18', 139L),\n",
" (u'EXP5', 127L),\n",
" (u'INC10', 126L),\n",
" (u'EXP7', 125L),\n",
" (u'19', 124L),\n",
" (u'INC12', 124L),\n",
" (u'INC9', 117L),\n",
" (u'EXP9', 115L),\n",
" (u'20', 114L),\n",
" (u'INC11', 112L),\n",
" (u'INC14', 110L),\n",
" (u'EXP6', 108L),\n",
" (u'INC15', 108L),\n",
" (u'INC13', 108L),\n",
" (u'21', 106L),\n",
" (u'EXP8', 103L),\n",
" (u'INC18', 102L),\n",
" (u'INC25', 101L),\n",
" (u'INC563', 100L),\n",
" (u'INC573', 100L),\n",
" (u'04', 100L),\n",
" (u'INC564', 100L),\n",
" (u'EXP11', 98L),\n",
" (u'INC26', 98L),\n",
" (u'INC670', 97L),\n",
" (u'INC714', 97L),\n",
" (u'INC583', 97L),\n",
" (u'INC24', 96L),\n",
" (u'INC809', 96L),\n",
" (u'INC625', 96L),\n",
" (u'INC16', 96L),\n",
" (u'INC609', 96L),\n",
" (u'INC20', 95L),\n",
" (u'INC778', 95L),\n",
" (u'INC575', 95L),\n",
" (u'EXP18', 94L),\n",
" (u'INC17', 94L),\n",
" (u'INC567', 94L),\n",
" (u'INC721', 94L),\n",
" (u'INC22', 94L),\n",
" (u'INC681', 94L),\n",
" (u'INC574', 93L),\n",
" (u'INC665', 93L),\n",
" (u'INC95', 93L),\n",
" (u'INC701', 93L),\n",
" (u'EXP16', 93L),\n",
" (u'INC32', 93L),\n",
" (u'INC887', 92L),\n",
" (u'INC832', 92L),\n",
" (u'INC570', 92L),\n",
" (u'INC561', 92L),\n",
" (u'INC766', 92L),\n",
" (u'INC716', 92L),\n",
" (u'22', 92L),\n",
" (u'INC738', 92L),\n",
" (u'INC23', 91L),\n",
" (u'INC565', 91L),\n",
" (u'INC591', 91L),\n",
" (u'INC722', 91L),\n",
" (u'INC646', 91L),\n",
" (u'INC764', 91L),\n",
" (u'INC606', 91L),\n",
" (u'INC584', 91L),\n",
" (u'INC705', 91L),\n",
" (u'INC651', 91L),\n",
" (u'INC736', 91L),\n",
" (u'INC664', 91L),\n",
" (u'INC640', 90L)]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT \"TRAN_ID\", COUNT(DISTINCT \"FILING_ID\") \n",
"FROM \"S497_CD\"\n",
"GROUP BY 1\n",
"HAVING COUNT(DISTINCT \"FILING_ID\") > 1\n",
"ORDER BY COUNT(DISTINCT \"FILING_ID\") DESC\n",
"LIMIT 100;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But `TRAN_ID` does appear to be unique within each filing amendment, and appears to be reused for each filing."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"100 rows affected.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" FILING_ID | \n",
" TRAN_ID | \n",
" amend_count | \n",
" row_count | \n",
"
\n",
" \n",
" 1896576 | \n",
" C10522528 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 1361544 | \n",
" NON98 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 1361544 | \n",
" NON92 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 1896576 | \n",
" C10522532 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 1058082 | \n",
" C1742A | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 1361544 | \n",
" NON100 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 1058082 | \n",
" C1743 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 1361544 | \n",
" NON95 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 1361544 | \n",
" NON99 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 977789 | \n",
" F497P10000244485 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 1361544 | \n",
" NON102 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 1361544 | \n",
" NON96 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 1896576 | \n",
" C10522529 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 1361544 | \n",
" NON101 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 1361544 | \n",
" NON97 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 1896576 | \n",
" C10522533 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 1896576 | \n",
" C10522531 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" 1299277 | \n",
" 2007-0458 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1169898 | \n",
" C2674 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1299277 | \n",
" 2007-0459 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1169898 | \n",
" C2671 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0921 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0918 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0922 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0925 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1169898 | \n",
" C2670 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1169898 | \n",
" C2673 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 868371 | \n",
" INC1066 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 868371 | \n",
" NON1071 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1299277 | \n",
" 2007-0445 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1169898 | \n",
" C2672 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1299277 | \n",
" 2007-0460 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0908 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1110656 | \n",
" F497P10000340769 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0909 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1058082 | \n",
" C1749 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97966 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97962 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 977789 | \n",
" F497P10000244544 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1058082 | \n",
" C1750 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1058082 | \n",
" C1745 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1058082 | \n",
" C1748 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1058082 | \n",
" C1753 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1110656 | \n",
" F497P10000340767 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0904 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0907 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0914 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0917 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0919 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0920 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0923 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0924 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0926 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0927 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 868371 | \n",
" INC1058 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 868371 | \n",
" INC1068 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 868371 | \n",
" INC1059 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 868371 | \n",
" INC1060 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 868371 | \n",
" INC1069 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 868371 | \n",
" NON1070 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1169898 | \n",
" C2675 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1169898 | \n",
" C2676 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0910 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97928 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97923 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97930 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97918 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97908 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97907 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97911 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97919 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97931 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97914 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97916 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97921 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97922 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97925 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97926 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97957 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97961 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97902 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 961164 | \n",
" INC8906 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97877 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97903 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1058082 | \n",
" C1746 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1058082 | \n",
" C1747 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1058082 | \n",
" C1751 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1058082 | \n",
" C1752 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1110656 | \n",
" F497P10000340705 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1110656 | \n",
" F497P10000340707 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0902 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0903 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0905 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0906 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0912 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0913 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0915 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1161821 | \n",
" 2006-0916 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97875 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" 960322 | \n",
" C97874 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
"
"
],
"text/plain": [
"[(1896576, u'C10522528', 6L, 6L),\n",
" (1361544, u'NON98', 6L, 6L),\n",
" (1361544, u'NON92', 6L, 6L),\n",
" (1896576, u'C10522532', 6L, 6L),\n",
" (1058082, u'C1742A', 6L, 6L),\n",
" (1361544, u'NON100', 6L, 6L),\n",
" (1058082, u'C1743', 6L, 6L),\n",
" (1361544, u'NON95', 6L, 6L),\n",
" (1361544, u'NON99', 6L, 6L),\n",
" (977789, u'F497P10000244485', 6L, 6L),\n",
" (1361544, u'NON102', 6L, 6L),\n",
" (1361544, u'NON96', 6L, 6L),\n",
" (1896576, u'C10522529', 6L, 6L),\n",
" (1361544, u'NON101', 6L, 6L),\n",
" (1361544, u'NON97', 6L, 6L),\n",
" (1896576, u'C10522533', 6L, 6L),\n",
" (1896576, u'C10522531', 6L, 6L),\n",
" (1299277, u'2007-0458', 5L, 5L),\n",
" (1169898, u'C2674', 5L, 5L),\n",
" (1299277, u'2007-0459', 5L, 5L),\n",
" (1169898, u'C2671', 5L, 5L),\n",
" (1161821, u'2006-0921', 5L, 5L),\n",
" (1161821, u'2006-0918', 5L, 5L),\n",
" (1161821, u'2006-0922', 5L, 5L),\n",
" (1161821, u'2006-0925', 5L, 5L),\n",
" (1169898, u'C2670', 5L, 5L),\n",
" (1169898, u'C2673', 5L, 5L),\n",
" (868371, u'INC1066', 5L, 5L),\n",
" (868371, u'NON1071', 5L, 5L),\n",
" (1299277, u'2007-0445', 5L, 5L),\n",
" (1169898, u'C2672', 5L, 5L),\n",
" (1299277, u'2007-0460', 5L, 5L),\n",
" (1161821, u'2006-0908', 5L, 5L),\n",
" (1110656, u'F497P10000340769', 5L, 5L),\n",
" (1161821, u'2006-0909', 5L, 5L),\n",
" (1058082, u'C1749', 5L, 5L),\n",
" (960322, u'C97966', 5L, 5L),\n",
" (960322, u'C97962', 5L, 5L),\n",
" (977789, u'F497P10000244544', 5L, 5L),\n",
" (1058082, u'C1750', 5L, 5L),\n",
" (1058082, u'C1745', 5L, 5L),\n",
" (1058082, u'C1748', 5L, 5L),\n",
" (1058082, u'C1753', 5L, 5L),\n",
" (1110656, u'F497P10000340767', 5L, 5L),\n",
" (1161821, u'2006-0904', 5L, 5L),\n",
" (1161821, u'2006-0907', 5L, 5L),\n",
" (1161821, u'2006-0914', 5L, 5L),\n",
" (1161821, u'2006-0917', 5L, 5L),\n",
" (1161821, u'2006-0919', 5L, 5L),\n",
" (1161821, u'2006-0920', 5L, 5L),\n",
" (1161821, u'2006-0923', 5L, 5L),\n",
" (1161821, u'2006-0924', 5L, 5L),\n",
" (1161821, u'2006-0926', 5L, 5L),\n",
" (1161821, u'2006-0927', 5L, 5L),\n",
" (868371, u'INC1058', 5L, 5L),\n",
" (868371, u'INC1068', 5L, 5L),\n",
" (868371, u'INC1059', 5L, 5L),\n",
" (868371, u'INC1060', 5L, 5L),\n",
" (868371, u'INC1069', 5L, 5L),\n",
" (868371, u'NON1070', 5L, 5L),\n",
" (1169898, u'C2675', 5L, 5L),\n",
" (1169898, u'C2676', 5L, 5L),\n",
" (1161821, u'2006-0910', 5L, 5L),\n",
" (960322, u'C97928', 5L, 5L),\n",
" (960322, u'C97923', 5L, 5L),\n",
" (960322, u'C97930', 5L, 5L),\n",
" (960322, u'C97918', 5L, 5L),\n",
" (960322, u'C97908', 5L, 5L),\n",
" (960322, u'C97907', 5L, 5L),\n",
" (960322, u'C97911', 5L, 5L),\n",
" (960322, u'C97919', 5L, 5L),\n",
" (960322, u'C97931', 5L, 5L),\n",
" (960322, u'C97914', 5L, 5L),\n",
" (960322, u'C97916', 5L, 5L),\n",
" (960322, u'C97921', 5L, 5L),\n",
" (960322, u'C97922', 5L, 5L),\n",
" (960322, u'C97925', 5L, 5L),\n",
" (960322, u'C97926', 5L, 5L),\n",
" (960322, u'C97957', 5L, 5L),\n",
" (960322, u'C97961', 5L, 5L),\n",
" (960322, u'C97902', 5L, 5L),\n",
" (961164, u'INC8906', 5L, 5L),\n",
" (960322, u'C97877', 5L, 5L),\n",
" (960322, u'C97903', 5L, 5L),\n",
" (1058082, u'C1746', 5L, 5L),\n",
" (1058082, u'C1747', 5L, 5L),\n",
" (1058082, u'C1751', 5L, 5L),\n",
" (1058082, u'C1752', 5L, 5L),\n",
" (1110656, u'F497P10000340705', 5L, 5L),\n",
" (1110656, u'F497P10000340707', 5L, 5L),\n",
" (1161821, u'2006-0902', 5L, 5L),\n",
" (1161821, u'2006-0903', 5L, 5L),\n",
" (1161821, u'2006-0905', 5L, 5L),\n",
" (1161821, u'2006-0906', 5L, 5L),\n",
" (1161821, u'2006-0912', 5L, 5L),\n",
" (1161821, u'2006-0913', 5L, 5L),\n",
" (1161821, u'2006-0915', 5L, 5L),\n",
" (1161821, u'2006-0916', 5L, 5L),\n",
" (960322, u'C97875', 5L, 5L),\n",
" (960322, u'C97874', 5L, 5L)]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT \"FILING_ID\", \"TRAN_ID\", COUNT(DISTINCT \"AMEND_ID\") AS amend_count, COUNT(*) AS row_count\n",
"FROM \"S497_CD\"\n",
"GROUP BY 1, 2\n",
"ORDER BY COUNT(*) DESC\n",
"LIMIT 100;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There's one exception:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" FILING_ID | \n",
" TRAN_ID | \n",
" AMEND_ID | \n",
" count | \n",
"
\n",
" \n",
" 2072379 | \n",
" EXP9671 | \n",
" 1 | \n",
" 2 | \n",
"
\n",
"
"
],
"text/plain": [
"[(2072379, u'EXP9671', 1, 2L)]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT \"FILING_ID\", \"TRAN_ID\", \"AMEND_ID\", COUNT(*)\n",
"FROM \"S497_CD\"\n",
"GROUP BY 1, 2, 3\n",
"HAVING COUNT(*) > 1;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Looks like this `TRAN_ID` is duplicated across the two parts of the filing. So it was both a contribution both made and received?"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2 rows affected.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" id | \n",
" FILING_ID | \n",
" AMEND_ID | \n",
" LINE_ITEM | \n",
" REC_TYPE | \n",
" FORM_TYPE | \n",
" TRAN_ID | \n",
" ENTITY_CD | \n",
" ENTY_NAML | \n",
" ENTY_NAMF | \n",
" ENTY_NAMT | \n",
" ENTY_NAMS | \n",
" ENTY_CITY | \n",
" ENTY_ST | \n",
" ENTY_ZIP4 | \n",
" CTRIB_EMP | \n",
" CTRIB_OCC | \n",
" CTRIB_SELF | \n",
" ELEC_DATE | \n",
" CTRIB_DATE | \n",
" DATE_THRU | \n",
" AMOUNT | \n",
" CMTE_ID | \n",
" CAND_NAML | \n",
" CAND_NAMF | \n",
" CAND_NAMT | \n",
" CAND_NAMS | \n",
" OFFICE_CD | \n",
" OFFIC_DSCR | \n",
" JURIS_CD | \n",
" JURIS_DSCR | \n",
" DIST_NO | \n",
" OFF_S_H_CD | \n",
" BAL_NAME | \n",
" BAL_NUM | \n",
" BAL_JURIS | \n",
" MEMO_CODE | \n",
" MEMO_REFNO | \n",
" BAL_ID | \n",
" CAND_ID | \n",
" SUP_OFF_CD | \n",
" SUP_OPP_CD | \n",
"
\n",
" \n",
" 62013206 | \n",
" 2072379 | \n",
" 1 | \n",
" 2 | \n",
" S497 | \n",
" F497P1 | \n",
" EXP9671 | \n",
" COM | \n",
" McCarty for Assembly 2016 | \n",
" | \n",
" | \n",
" | \n",
" Sacramento | \n",
" CA | \n",
" 95814 | \n",
" | \n",
" | \n",
" | \n",
" None | \n",
" 2016-08-25 | \n",
" None | \n",
" 1000.00 | \n",
" 1373943 | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" EXP:A:9671 | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" 62013205 | \n",
" 2072379 | \n",
" 1 | \n",
" 1 | \n",
" S497 | \n",
" F497P2 | \n",
" EXP9671 | \n",
" COM | \n",
" McCarty for Assembly 2016 | \n",
" | \n",
" | \n",
" | \n",
" Sacramento | \n",
" CA | \n",
" 95814 | \n",
" | \n",
" | \n",
" | \n",
" 2016-11-08 | \n",
" 2016-08-25 | \n",
" None | \n",
" -1000.00 | \n",
" 1373943 | \n",
" Assembly Member Kevin McCarty | \n",
" | \n",
" | \n",
" | \n",
" ASM | \n",
" | \n",
" ASM | \n",
" | \n",
" 07 | \n",
" H | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" EXP:S497:9671 | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
"
"
],
"text/plain": [
"[(62013206, 2072379, 1, 2, u'S497', u'F497P1', u'EXP9671', u'COM', u'McCarty for Assembly 2016', u'', u'', u'', u'Sacramento', u'CA', u'95814', u'', u'', u'', None, datetime.date(2016, 8, 25), None, Decimal('1000.00'), u'1373943', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'EXP:A:9671', u'', u'', u'', u''),\n",
" (62013205, 2072379, 1, 1, u'S497', u'F497P2', u'EXP9671', u'COM', u'McCarty for Assembly 2016', u'', u'', u'', u'Sacramento', u'CA', u'95814', u'', u'', u'', datetime.date(2016, 11, 8), datetime.date(2016, 8, 25), None, Decimal('-1000.00'), u'1373943', u'Assembly Member Kevin McCarty', u'', u'', u'', u'ASM', u'', u'ASM', u'', u'07', u'H', u'', u'', u'', u'', u'EXP:S497:9671', u'', u'', u'', u'')]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT *\n",
"FROM \"S497_CD\"\n",
"WHERE \"FILING_ID\" = 2072379\n",
"AND \"TRAN_ID\" = 'EXP9671';"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Looking at the [PDF for the filing](http://cal-access.ss.ca.gov/PDFGen/pdfgen.prg?filingid=2072379&amendid=1), it appears to be a check from the California Psychological Association PAC to the McCarty for Assembly 2016 committee, which was given and returned on 8/25/2016.\n",
"\n",
"Regardless, because the combinations of `FILING_ID`, `AMEND_ID` and `TRAN_ID` are unique within each part of the Schedule 497, we could substitute `TRAN_ID` for `LINE_ITEM` in the composite key when splitting up the contributions received from the contributions made.\n",
"\n",
"The advantage is that the `TRAN_ID` purportedly points to the same contribution from one amendment to the next, whereas the same `LINE_ITEM` might not because the filers don't necessarily list transactions on the same line from one filing amendment to the next.\n",
"\n",
"Here's an example: On the [original Schedule 497 filing](http://cal-access.ss.ca.gov/PDFGen/pdfgen.prg?filingid=2083478&amendid=0) for Steven Bradford for Senate 2016, a $8,500.00 contribution from an AFL-CIO sub-committee is listed on line 1. But on the [first](http://cal-access.ss.ca.gov/PDFGen/pdfgen.prg?filingid=2083478&amendid=1) and [second](http://cal-access.ss.ca.gov/PDFGen/pdfgen.prg?filingid=2083478&amendid=2) amendments to the filing, it is listed on line 4.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.12"
}
},
"nbformat": 4,
"nbformat_minor": 1
}