{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 09\n", "\n", "\n", "- Fraud Detection Dataset from Microsoft Azure: [data](http://gallery.cortanaintelligence.com/Experiment/8e9fe4e03b8b4c65b9ca947c72b8e463)\n", "\n", "Fraud detection is one of the earliest industrial applications of data mining and machine learning. It is typically handled as a binary classification problem, but the classes are imbalanced because fraud is usually very rare relative to the overall volume of transactions. Moreover, once fraudulent transactions are discovered, the business typically blocks the affected accounts from transacting to prevent further losses." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
|   | accountAge | digitalItemCount | sumPurchaseCount1Day | sumPurchaseAmount1Day | sumPurchaseAmount30Day | paymentBillingPostalCode - LogOddsForClass_0 | accountPostalCode - LogOddsForClass_0 | paymentBillingState - LogOddsForClass_0 | accountState - LogOddsForClass_0 | paymentInstrumentAgeInAccount | ipState - LogOddsForClass_0 | transactionAmount | transactionAmountUSD | ipPostalCode - LogOddsForClass_0 | localHour - LogOddsForClass_0 | Label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2000 | 0 | 0 | 0.00 | 720.25 | 5.064533 | 0.421214 | 1.312186 | 0.566395 | 3279.574306 | 1.218157 | 599.00 | 626.164650 | 1.259543 | 4.745402 | 0 |
| 1 | 62 | 1 | 1 | 1185.44 | 2530.37 | 0.538996 | 0.481838 | 4.401370 | 4.500157 | 61.970139 | 4.035601 | 1185.44 | 1185.440000 | 3.981118 | 4.921349 | 0 |
| 2 | 2000 | 0 | 0 | 0.00 | 0.00 | 5.064533 | 5.096396 | 3.056357 | 3.155226 | 0.000000 | 3.314186 | 32.09 | 32.090000 | 5.008490 | 4.742303 | 0 |
| 3 | 1 | 1 | 0 | 0.00 | 0.00 | 5.064533 | 5.096396 | 3.331154 | 3.331239 | 0.000000 | 3.529398 | 133.28 | 132.729554 | 1.324925 | 4.745402 | 0 |
| 4 | 1 | 1 | 0 | 0.00 | 132.73 | 5.412885 | 0.342945 | 5.563677 | 4.086965 | 0.001389 | 3.529398 | 543.66 | 543.660000 | 2.693451 | 4.876771 | 0 |