{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Adversarial Robustness Toolbox for Poisoning Attacks on Support Vector Machines (SVM) using scikit-learn's SVC" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this notebook, we will learn how to use the Adversarial Robustness Toolbox (ART) to run a poisoning attack on a Support Vector Machine (SVM). We will train the model on a subset of the Iris dataset." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from IPython.display import HTML\n", "HTML('')" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from __future__ import absolute_import, division, print_function, unicode_literals\n", "\n", "import os, sys\n", "from os.path import abspath\n", "\n", "module_path = os.path.abspath(os.path.join('..'))\n", "if module_path not in sys.path:\n", "    sys.path.append(module_path)\n", "\n", "from sklearn.svm import SVC, LinearSVC\n", "from sklearn.datasets import load_iris\n", "\n", "import numpy as np\n", "from matplotlib import pyplot as plt\n", "\n", "from art.estimators.classification import SklearnClassifier\n", "from art.attacks.poisoning.poisoning_attack_svm import PoisoningAttackSVM\n", "\n", "np.random.seed(301)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Utility Functions" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "def find_duplicates(x_train):\n", "    \"\"\"\n", "    Returns an array of booleans that is true if that element was previously in the array\n", "\n", "    :param x_train: training data\n", "    :type x_train: `np.ndarray`\n", "    :return: duplicates array\n", "    :rtype: `np.ndarray`\n", "    \"\"\"\n", "    dup = np.zeros(x_train.shape[0])\n", "    for idx, x in enumerate(x_train):\n", "        dup[idx] = np.isin(x_train[:idx], x).all(axis=1).any()\n", 
"    return dup" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "def get_data():\n", "    iris = load_iris()\n", "    X = iris.data\n", "    y = iris.target\n", "\n", "    # Keep only classes 1 and 2 and the first two features\n", "    X = X[y != 0, :2]\n", "    y = y[y != 0]\n", "    # One-hot encode the two remaining classes\n", "    labels = np.zeros((y.shape[0], 2))\n", "    labels[y == 2] = np.array([1, 0])\n", "    labels[y == 1] = np.array([0, 1])\n", "    y = labels\n", "\n", "    n_sample = len(X)\n", "\n", "    order = np.random.permutation(n_sample)\n", "    X = X[order]\n", "    # np.float was removed from NumPy; the builtin float is equivalent\n", "    y = y[order].astype(float)\n", "\n", "    # 90/10 train/test split, dropping duplicate points from each part\n", "    X_train = X[:int(.9 * n_sample)]\n", "    y_train = y[:int(.9 * n_sample)]\n", "    train_dups = find_duplicates(X_train)\n", "    X_train = X_train[train_dups == False]\n", "    y_train = y_train[train_dups == False]\n", "    X_test = X[int(.9 * n_sample):]\n", "    y_test = y[int(.9 * n_sample):]\n", "    test_dups = find_duplicates(X_test)\n", "    X_test = X_test[test_dups == False]\n", "    y_test = y_test[test_dups == False]\n", "    return X_train, y_train, X_test, y_test" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "def get_adversarial_examples(x_train, y_train, attack_idx, x_val, y_val, kernel):\n", "    # Create ART classifier for scikit-learn SVC\n", "    art_classifier = SklearnClassifier(model=SVC(kernel=kernel), clip_values=(0, 10))\n", "    art_classifier.fit(x_train, y_train)\n", "    # Start the attack from an existing training point with its label flipped\n", "    init_attack = np.copy(x_train[attack_idx])\n", "    y_attack = np.array([1, 1]) - np.copy(y_train[attack_idx])\n", "    attack = PoisoningAttackSVM(art_classifier, 0.001, 1.0, x_train, y_train, x_val, y_val, max_iter=100)\n", "    final_attack, _ = attack.poison(np.array([init_attack]), y=np.array([y_attack]))\n", "    return final_attack, art_classifier" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "def plot_results(model, x_train, y_train, x_train_adv, title):\n", "    import matplotlib.pyplot as plt\n", "    import warnings\n", "    warnings.filterwarnings(\"ignore\")\n", "\n", "    plt.figure()\n", "    plt.clf()\n", 
"\n", "    get_color = lambda idx: 'orange' if np.argmax(idx) == 1 else 'blue'\n", "    for i_class_2 in [np.array([0, 1]), np.array([1, 0])]:\n", "        mask = np.all(y_train == i_class_2, axis=1)\n", "        plt.scatter(x_train[mask][:, 0], x_train[mask][:, 1], s=20, zorder=2, c=get_color(i_class_2))\n", "    # plt.axes.set_aspect('equal', adjustable='box')\n", "\n", "    for sv in model.support_vectors_:\n", "        plt.scatter(sv[0], sv[1], s=200, linewidth=1, facecolors='none', edgecolors='lightgreen',\n", "                    zorder=2)\n", "    h = .01\n", "    x_min, x_max = 1.5, 8.5\n", "    y_min, y_max = 0, 7\n", "\n", "    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))\n", "\n", "    Z = model.decision_function(np.c_[xx.ravel(), yy.ravel()])\n", "    Z = Z.reshape(xx.shape)\n", "    plt.pcolormesh(xx, yy, Z > 0, cmap=plt.cm.Paired)\n", "    plt.contour(xx, yy, Z, colors=['k', 'k', 'k'],\n", "                linestyles=['--', '-', '--'], levels=[-.5, 0, .5])\n", "\n", "    x_values = []\n", "    y_values = []\n", "    for adv in x_train_adv:\n", "        x_values.append(adv[0, 0])\n", "        y_values.append(adv[0, 1])\n", "    x_values = np.array(x_values)\n", "    y_values = np.array(y_values)\n", "    plt.scatter(x_values, y_values, zorder=2,\n", "                c='red', marker='X')\n", "    # plt.gca() reuses the current axes; plt.axes() may create a new one\n", "    plt.gca().set_xlim((x_min, x_max))\n", "    plt.gca().set_ylim((y_min, y_max))\n", "\n", "    plt.gca().set_title(title)\n", "    plt.gca().set_xlabel('feature 1')\n", "    plt.gca().set_ylabel('feature 2')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load Data\n", "\n", "In this example, we take two features from the Iris dataset and train an SVM." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "train_data, train_labels, test_data, test_labels = get_data()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualize Effect of Attack on SVM\n", "\n", "After a single attack point is added to the training data, a noticeable change occurs in the classifier's decision boundary." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Clean model accuracy on train set (68 samples): 0.7352941176470589\n", "Poison model accuracy on train set (68 samples): 0.7352941176470589\n", "Clean model accuracy on test set (10 samples): 0.6\n", "Poison model accuracy on test set (10 samples): 0.6\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "kernel = 'linear' # one of ['linear', 'poly', 'rbf']\n", "\n", "attack_point, poisoned = get_adversarial_examples(train_data, train_labels, 0, test_data, test_labels, kernel)\n", "clean = SVC(kernel=kernel)\n", "art_clean = SklearnClassifier(clean, clip_values=(0, 10))\n", "art_clean.fit(x=train_data, y=train_labels)\n", "\n", "plot_results(art_clean._model, train_data, train_labels, [], \"SVM Before Attack\")\n", "plot_results(poisoned._model, train_data, train_labels, [attack_point], \"SVM After Poison\")\n", "\n", "clean_acc_train = np.average(np.all(art_clean.predict(train_data) == train_labels, axis=1))\n", "poison_acc_train = np.average(np.all(poisoned.predict(train_data) == train_labels, axis=1))\n", "clean_acc_test = np.average(np.all(art_clean.predict(test_data) == test_labels, axis=1))\n", "poison_acc_test = np.average(np.all(poisoned.predict(test_data) == test_labels, axis=1))\n", "\n", "print(\"Clean model accuracy on train set ({} samples): {}\".format(len(train_labels), clean_acc_train))\n", "print(\"Poison model accuracy on train set ({} samples): {}\".format(len(train_labels), poison_acc_train))\n", "print(\"Clean model accuracy on test set ({} samples): {}\".format(len(test_labels), clean_acc_test))\n", "print(\"Poison model accuracy on test set ({} samples): {}\".format(len(test_labels), poison_acc_test))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A perfect classifier would have all orange points on the orange side of the decision boundary and all blue points on the light blue side. The attack point is shown in red, and support vectors are circled in light green.\n", "\n", "Even when overall accuracy changes little (in the run above, it does not change at all), inserting just a *single* poison point can shift the decision boundary and affect the model's ability to generalize. 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 2 }