{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Late contributions Received and Made" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%load_ext sql" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "u'Connected: ccdc@calaccess_website'" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from django.conf import settings\n", "connection_string = 'postgresql+psycopg2://{USER}:{PASSWORD}@{HOST}:{PORT}/{NAME}'.format(\n", " **settings.DATABASES['default']\n", ")\n", "%sql $connection_string" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Unique Composite Key" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The documentation says that the records are unique on the following fields:\n", "* `FILING_ID`\n", "* `AMEND_ID`\n", "* `LINE_ITEM`\n", "* `REC_TYPE`\n", "* `FORM_TYPE`\n", "\n", "`REC_TYPE` is always the same value: `S497`, so we can ignore this column. \n", "\n", "`FORM_TYPE` is either `F497P1` or `F497P2`, indicating in whether itemized transaction is listed under Part 1 (Contributions Received) or Part 2 (Contributions Made). I'll split these up into separate tables." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Are the `S497_CD` records actually unique on `FILING_ID`, `AMEND_ID` and `LINE_ITEM`?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Yes. And this is even true across the Parts 1 and 2 (Contributions Received and Contributions Made)." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 rows affected.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", "
FILING_IDAMEND_IDLINE_ITEMcount
" ], "text/plain": [ "[]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "SELECT \"FILING_ID\", \"AMEND_ID\", \"LINE_ITEM\", COUNT(*)\n", "FROM \"S497_CD\"\n", "GROUP BY 1, 2, 3\n", "HAVING COUNT(*) > 1\n", "ORDER BY COUNT(*) DESC;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## `TRAN_ID`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `S497_CD` table includes a `TRAN_ID` field, which the [documentation](http://calaccess.californiacivicdata.org/documentation/calaccess-files/s497-cd/#fields) describes as a \"Permanent value unique to this item\"." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Is `TRAN_ID` ever `NULL` or blank?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "No." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 rows affected.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", "
count
0
" ], "text/plain": [ "[(0L,)]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "SELECT COUNT(*)\n", "FROM \"S497_CD\"\n", "WHERE \"TRAN_ID\" IS NULL OR \"TRAN_ID\" = '' OR \"TRAN_ID\" = '0';" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Is `TRAN_ID` unique across filings?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Decidedly no." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "100 rows affected.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TRAN_IDcount
114271
25893
33490
42336
51661
61139
7848
8659
01627
9534
10460
11383
INC1347
EXP1332
12318
13273
02272
14239
INC2225
EXP2218
15208
INC3187
16182
INC4176
EXP3167
INC5163
INC7158
INC6157
17147
03147
INC8142
EXP4142
18139
EXP5127
INC10126
EXP7125
19124
INC12124
INC9117
EXP9115
20114
INC11112
INC14110
EXP6108
INC15108
INC13108
21106
EXP8103
INC18102
INC25101
INC563100
INC573100
04100
INC564100
EXP1198
INC2698
INC67097
INC71497
INC58397
INC2496
INC80996
INC62596
INC1696
INC60996
INC2095
INC77895
INC57595
EXP1894
INC1794
INC56794
INC72194
INC2294
INC68194
INC57493
INC66593
INC9593
INC70193
EXP1693
INC3293
INC88792
INC83292
INC57092
INC56192
INC76692
INC71692
2292
INC73892
INC2391
INC56591
INC59191
INC72291
INC64691
INC76491
INC60691
INC58491
INC70591
INC65191
INC73691
INC66491
INC64090
" ], "text/plain": [ "[(u'1', 14271L),\n", " (u'2', 5893L),\n", " (u'3', 3490L),\n", " (u'4', 2336L),\n", " (u'5', 1661L),\n", " (u'6', 1139L),\n", " (u'7', 848L),\n", " (u'8', 659L),\n", " (u'01', 627L),\n", " (u'9', 534L),\n", " (u'10', 460L),\n", " (u'11', 383L),\n", " (u'INC1', 347L),\n", " (u'EXP1', 332L),\n", " (u'12', 318L),\n", " (u'13', 273L),\n", " (u'02', 272L),\n", " (u'14', 239L),\n", " (u'INC2', 225L),\n", " (u'EXP2', 218L),\n", " (u'15', 208L),\n", " (u'INC3', 187L),\n", " (u'16', 182L),\n", " (u'INC4', 176L),\n", " (u'EXP3', 167L),\n", " (u'INC5', 163L),\n", " (u'INC7', 158L),\n", " (u'INC6', 157L),\n", " (u'17', 147L),\n", " (u'03', 147L),\n", " (u'INC8', 142L),\n", " (u'EXP4', 142L),\n", " (u'18', 139L),\n", " (u'EXP5', 127L),\n", " (u'INC10', 126L),\n", " (u'EXP7', 125L),\n", " (u'19', 124L),\n", " (u'INC12', 124L),\n", " (u'INC9', 117L),\n", " (u'EXP9', 115L),\n", " (u'20', 114L),\n", " (u'INC11', 112L),\n", " (u'INC14', 110L),\n", " (u'EXP6', 108L),\n", " (u'INC15', 108L),\n", " (u'INC13', 108L),\n", " (u'21', 106L),\n", " (u'EXP8', 103L),\n", " (u'INC18', 102L),\n", " (u'INC25', 101L),\n", " (u'INC563', 100L),\n", " (u'INC573', 100L),\n", " (u'04', 100L),\n", " (u'INC564', 100L),\n", " (u'EXP11', 98L),\n", " (u'INC26', 98L),\n", " (u'INC670', 97L),\n", " (u'INC714', 97L),\n", " (u'INC583', 97L),\n", " (u'INC24', 96L),\n", " (u'INC809', 96L),\n", " (u'INC625', 96L),\n", " (u'INC16', 96L),\n", " (u'INC609', 96L),\n", " (u'INC20', 95L),\n", " (u'INC778', 95L),\n", " (u'INC575', 95L),\n", " (u'EXP18', 94L),\n", " (u'INC17', 94L),\n", " (u'INC567', 94L),\n", " (u'INC721', 94L),\n", " (u'INC22', 94L),\n", " (u'INC681', 94L),\n", " (u'INC574', 93L),\n", " (u'INC665', 93L),\n", " (u'INC95', 93L),\n", " (u'INC701', 93L),\n", " (u'EXP16', 93L),\n", " (u'INC32', 93L),\n", " (u'INC887', 92L),\n", " (u'INC832', 92L),\n", " (u'INC570', 92L),\n", " (u'INC561', 92L),\n", " (u'INC766', 92L),\n", " (u'INC716', 92L),\n", " (u'22', 92L),\n", " (u'INC738', 92L),\n", " (u'INC23', 91L),\n", " (u'INC565', 91L),\n", " (u'INC591', 91L),\n", " (u'INC722', 91L),\n", " (u'INC646', 91L),\n", " (u'INC764', 91L),\n", " (u'INC606', 91L),\n", " (u'INC584', 91L),\n", " (u'INC705', 91L),\n", " (u'INC651', 91L),\n", " (u'INC736', 91L),\n", " (u'INC664', 91L),\n", " (u'INC640', 90L)]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "SELECT \"TRAN_ID\", COUNT(DISTINCT \"FILING_ID\") \n", "FROM \"S497_CD\"\n", "GROUP BY 1\n", "HAVING COUNT(DISTINCT \"FILING_ID\") > 1\n", "ORDER BY COUNT(DISTINCT \"FILING_ID\") DESC\n", "LIMIT 100;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But `TRAN_ID` does appear to be unique within each filing amendment, and appears to be reused for each filing." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "100 rows affected.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
FILING_IDTRAN_IDamend_countrow_count
1896576C1052252866
1361544NON9866
1361544NON9266
1896576C1052253266
1058082C1742A66
1361544NON10066
1058082C174366
1361544NON9566
1361544NON9966
977789F497P1000024448566
1361544NON10266
1361544NON9666
1896576C1052252966
1361544NON10166
1361544NON9766
1896576C1052253366
1896576C1052253166
12992772007-045855
1169898C267455
12992772007-045955
1169898C267155
11618212006-092155
11618212006-091855
11618212006-092255
11618212006-092555
1169898C267055
1169898C267355
868371INC106655
868371NON107155
12992772007-044555
1169898C267255
12992772007-046055
11618212006-090855
1110656F497P1000034076955
11618212006-090955
1058082C174955
960322C9796655
960322C9796255
977789F497P1000024454455
1058082C175055
1058082C174555
1058082C174855
1058082C175355
1110656F497P1000034076755
11618212006-090455
11618212006-090755
11618212006-091455
11618212006-091755
11618212006-091955
11618212006-092055
11618212006-092355
11618212006-092455
11618212006-092655
11618212006-092755
868371INC105855
868371INC106855
868371INC105955
868371INC106055
868371INC106955
868371NON107055
1169898C267555
1169898C267655
11618212006-091055
960322C9792855
960322C9792355
960322C9793055
960322C9791855
960322C9790855
960322C9790755
960322C9791155
960322C9791955
960322C9793155
960322C9791455
960322C9791655
960322C9792155
960322C9792255
960322C9792555
960322C9792655
960322C9795755
960322C9796155
960322C9790255
961164INC890655
960322C9787755
960322C9790355
1058082C174655
1058082C174755
1058082C175155
1058082C175255
1110656F497P1000034070555
1110656F497P1000034070755
11618212006-090255
11618212006-090355
11618212006-090555
11618212006-090655
11618212006-091255
11618212006-091355
11618212006-091555
11618212006-091655
960322C9787555
960322C9787455
" ], "text/plain": [ "[(1896576, u'C10522528', 6L, 6L),\n", " (1361544, u'NON98', 6L, 6L),\n", " (1361544, u'NON92', 6L, 6L),\n", " (1896576, u'C10522532', 6L, 6L),\n", " (1058082, u'C1742A', 6L, 6L),\n", " (1361544, u'NON100', 6L, 6L),\n", " (1058082, u'C1743', 6L, 6L),\n", " (1361544, u'NON95', 6L, 6L),\n", " (1361544, u'NON99', 6L, 6L),\n", " (977789, u'F497P10000244485', 6L, 6L),\n", " (1361544, u'NON102', 6L, 6L),\n", " (1361544, u'NON96', 6L, 6L),\n", " (1896576, u'C10522529', 6L, 6L),\n", " (1361544, u'NON101', 6L, 6L),\n", " (1361544, u'NON97', 6L, 6L),\n", " (1896576, u'C10522533', 6L, 6L),\n", " (1896576, u'C10522531', 6L, 6L),\n", " (1299277, u'2007-0458', 5L, 5L),\n", " (1169898, u'C2674', 5L, 5L),\n", " (1299277, u'2007-0459', 5L, 5L),\n", " (1169898, u'C2671', 5L, 5L),\n", " (1161821, u'2006-0921', 5L, 5L),\n", " (1161821, u'2006-0918', 5L, 5L),\n", " (1161821, u'2006-0922', 5L, 5L),\n", " (1161821, u'2006-0925', 5L, 5L),\n", " (1169898, u'C2670', 5L, 5L),\n", " (1169898, u'C2673', 5L, 5L),\n", " (868371, u'INC1066', 5L, 5L),\n", " (868371, u'NON1071', 5L, 5L),\n", " (1299277, u'2007-0445', 5L, 5L),\n", " (1169898, u'C2672', 5L, 5L),\n", " (1299277, u'2007-0460', 5L, 5L),\n", " (1161821, u'2006-0908', 5L, 5L),\n", " (1110656, u'F497P10000340769', 5L, 5L),\n", " (1161821, u'2006-0909', 5L, 5L),\n", " (1058082, u'C1749', 5L, 5L),\n", " (960322, u'C97966', 5L, 5L),\n", " (960322, u'C97962', 5L, 5L),\n", " (977789, u'F497P10000244544', 5L, 5L),\n", " (1058082, u'C1750', 5L, 5L),\n", " (1058082, u'C1745', 5L, 5L),\n", " (1058082, u'C1748', 5L, 5L),\n", " (1058082, u'C1753', 5L, 5L),\n", " (1110656, u'F497P10000340767', 5L, 5L),\n", " (1161821, u'2006-0904', 5L, 5L),\n", " (1161821, u'2006-0907', 5L, 5L),\n", " (1161821, u'2006-0914', 5L, 5L),\n", " (1161821, u'2006-0917', 5L, 5L),\n", " (1161821, u'2006-0919', 5L, 5L),\n", " (1161821, u'2006-0920', 5L, 5L),\n", " (1161821, u'2006-0923', 5L, 5L),\n", " (1161821, u'2006-0924', 5L, 5L),\n", " (1161821, u'2006-0926', 5L, 5L),\n", " (1161821, u'2006-0927', 5L, 5L),\n", " (868371, u'INC1058', 5L, 5L),\n", " (868371, u'INC1068', 5L, 5L),\n", " (868371, u'INC1059', 5L, 5L),\n", " (868371, u'INC1060', 5L, 5L),\n", " (868371, u'INC1069', 5L, 5L),\n", " (868371, u'NON1070', 5L, 5L),\n", " (1169898, u'C2675', 5L, 5L),\n", " (1169898, u'C2676', 5L, 5L),\n", " (1161821, u'2006-0910', 5L, 5L),\n", " (960322, u'C97928', 5L, 5L),\n", " (960322, u'C97923', 5L, 5L),\n", " (960322, u'C97930', 5L, 5L),\n", " (960322, u'C97918', 5L, 5L),\n", " (960322, u'C97908', 5L, 5L),\n", " (960322, u'C97907', 5L, 5L),\n", " (960322, u'C97911', 5L, 5L),\n", " (960322, u'C97919', 5L, 5L),\n", " (960322, u'C97931', 5L, 5L),\n", " (960322, u'C97914', 5L, 5L),\n", " (960322, u'C97916', 5L, 5L),\n", " (960322, u'C97921', 5L, 5L),\n", " (960322, u'C97922', 5L, 5L),\n", " (960322, u'C97925', 5L, 5L),\n", " (960322, u'C97926', 5L, 5L),\n", " (960322, u'C97957', 5L, 5L),\n", " (960322, u'C97961', 5L, 5L),\n", " (960322, u'C97902', 5L, 5L),\n", " (961164, u'INC8906', 5L, 5L),\n", " (960322, u'C97877', 5L, 5L),\n", " (960322, u'C97903', 5L, 5L),\n", " (1058082, u'C1746', 5L, 5L),\n", " (1058082, u'C1747', 5L, 5L),\n", " (1058082, u'C1751', 5L, 5L),\n", " (1058082, u'C1752', 5L, 5L),\n", " (1110656, u'F497P10000340705', 5L, 5L),\n", " (1110656, u'F497P10000340707', 5L, 5L),\n", " (1161821, u'2006-0902', 5L, 5L),\n", " (1161821, u'2006-0903', 5L, 5L),\n", " (1161821, u'2006-0905', 5L, 5L),\n", " (1161821, u'2006-0906', 5L, 5L),\n", " (1161821, u'2006-0912', 5L, 5L),\n", " (1161821, u'2006-0913', 5L, 5L),\n", " (1161821, u'2006-0915', 5L, 5L),\n", " (1161821, u'2006-0916', 5L, 5L),\n", " (960322, u'C97875', 5L, 5L),\n", " (960322, u'C97874', 5L, 5L)]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "SELECT \"FILING_ID\", \"TRAN_ID\", COUNT(DISTINCT \"AMEND_ID\") AS amend_count, COUNT(*) AS row_count\n", "FROM \"S497_CD\"\n", "GROUP BY 1, 2\n", "ORDER BY COUNT(*) DESC\n", "LIMIT 100;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There's one exception:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 rows affected.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
FILING_IDTRAN_IDAMEND_IDcount
2072379EXP967112
" ], "text/plain": [ "[(2072379, u'EXP9671', 1, 2L)]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "SELECT \"FILING_ID\", \"TRAN_ID\", \"AMEND_ID\", COUNT(*)\n", "FROM \"S497_CD\"\n", "GROUP BY 1, 2, 3\n", "HAVING COUNT(*) > 1;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Looks like this `TRAN_ID` is duplicated across the two parts of the filing. So it was both a contribution both made and received?" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2 rows affected.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idFILING_IDAMEND_IDLINE_ITEMREC_TYPEFORM_TYPETRAN_IDENTITY_CDENTY_NAMLENTY_NAMFENTY_NAMTENTY_NAMSENTY_CITYENTY_STENTY_ZIP4CTRIB_EMPCTRIB_OCCCTRIB_SELFELEC_DATECTRIB_DATEDATE_THRUAMOUNTCMTE_IDCAND_NAMLCAND_NAMFCAND_NAMTCAND_NAMSOFFICE_CDOFFIC_DSCRJURIS_CDJURIS_DSCRDIST_NOOFF_S_H_CDBAL_NAMEBAL_NUMBAL_JURISMEMO_CODEMEMO_REFNOBAL_IDCAND_IDSUP_OFF_CDSUP_OPP_CD
62013206207237912S497F497P1EXP9671COMMcCarty for Assembly 2016SacramentoCA95814None2016-08-25None1000.001373943EXP:A:9671
62013205207237911S497F497P2EXP9671COMMcCarty for Assembly 2016SacramentoCA958142016-11-082016-08-25None-1000.001373943Assembly Member Kevin McCartyASMASM07HEXP:S497:9671
" ], "text/plain": [ "[(62013206, 2072379, 1, 2, u'S497', u'F497P1', u'EXP9671', u'COM', u'McCarty for Assembly 2016', u'', u'', u'', u'Sacramento', u'CA', u'95814', u'', u'', u'', None, datetime.date(2016, 8, 25), None, Decimal('1000.00'), u'1373943', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'EXP:A:9671', u'', u'', u'', u''),\n", " (62013205, 2072379, 1, 1, u'S497', u'F497P2', u'EXP9671', u'COM', u'McCarty for Assembly 2016', u'', u'', u'', u'Sacramento', u'CA', u'95814', u'', u'', u'', datetime.date(2016, 11, 8), datetime.date(2016, 8, 25), None, Decimal('-1000.00'), u'1373943', u'Assembly Member Kevin McCarty', u'', u'', u'', u'ASM', u'', u'ASM', u'', u'07', u'H', u'', u'', u'', u'', u'EXP:S497:9671', u'', u'', u'', u'')]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "SELECT *\n", "FROM \"S497_CD\"\n", "WHERE \"FILING_ID\" = 2072379\n", "AND \"TRAN_ID\" = 'EXP9671';" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Looking at the [PDF for the filing](http://cal-access.ss.ca.gov/PDFGen/pdfgen.prg?filingid=2072379&amendid=1), it appears to be a check from the California Psychological Association PAC to the McCarty for Assembly 2016 committee, which was given and returned on 8/25/2016.\n", "\n", "Regardless, because the combinations of `FILING_ID`, `AMEND_ID` and `TRAN_ID` are unique within each part of the Schedule 497, we could substitute `TRAN_ID` for `LINE_ITEM` in the composite key when splitting up the contributions received from the contributions made.\n", "\n", "The advantage is that the `TRAN_ID` purportedly points to the same contribution from one amendment to the next, whereas the same `LINE_ITEM` might not because the filers don't necessarily list transactions on the same line from one filing amendment to the next.\n", "\n", "Here's an example: On the [original Schedule 497 filing](http://cal-access.ss.ca.gov/PDFGen/pdfgen.prg?filingid=2083478&amendid=0) for Steven Bradford for Senate 2016, a $8,500.00 contribution from an AFL-CIO sub-committee is listed on line 1. But on the [first](http://cal-access.ss.ca.gov/PDFGen/pdfgen.prg?filingid=2083478&amendid=1) and [second](http://cal-access.ss.ca.gov/PDFGen/pdfgen.prg?filingid=2083478&amendid=2) amendments to the filing, it is listed on line 4.\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" } }, "nbformat": 4, "nbformat_minor": 1 }