{ "cells": [ { "cell_type": "markdown", "id": "ebf214ed", "metadata": {}, "source": [ "# Label-only Membership Inference" ] }, { "cell_type": "markdown", "id": "79c348c5", "metadata": {}, "source": [ "## Intuition\n", "\n", "The basic intuition behind the attack is that samples used in training will be farther away from the decision boundary than non-training data.\n", "\n", "The attacker uses various means of **perturbations** to measure the amount of noise that is needed to \"change the classifier's mind\" about their prediction for a given sample. Since the ML model is more confident on training data, the attacker will need to perturb the input more to force the model to misclassify. Thus, the amount of perturbation needed will be analogue to the sample's distance from the decision boundary. Both of the below listed attacks use an adversarial perturbation technique called [**HopSkipJump**](https://arxiv.org/abs/1904.02144).\n", "\n", "### Learning $\\tau$\n", "\n", "Given some estimate of a sample's distance from the model's decision boundary, the attacker compares it to a threshold $\\tau$. Any distance greater than $\\tau$ will cause the sample to be classified as a training sample. \n", "\n", "There are two ways to learn the distance threshold $\\tau$:\n", "- with data (Choquette et al., https://arxiv.org/abs/2007.14321)\n", "- without data (Li et al., https://arxiv.org/abs/2007.15528)\n", "\n", "#### With data\n", "In this scenario, the attacker needs to know about a subset of the data if it had been used in training or not. It uses this data to calculate their distances to the decision boundary, and sets $\\tau$ such that it maximizes membership inference accuracy. Misclassified samples will be regarded as non-training samples.\n", "\n", "#### Without data\n", "Here the attacker generates _random data_, and uses the same perturbation techniques as before to measure their distance from the decision threshold. In the end, the attacker chooses a suitable top t percentile over these distances to calibrate $\\tau$." ] }, { "cell_type": "markdown", "id": "c8406056", "metadata": {}, "source": [ "## Overview\n", "How to implement the attack using ART.\n", "#### 1. [Preliminaries](#preliminaries)\n", "1. [Load data and attacked model](#load)\n", "2. [Wrap model in ART classifier wrapper](#wrap)\n", "\n", "#### 2. Attack ([w/ data](#attack), [w/o data](#attack_nodata))\n", "1. Instantiate attack ([w/ data](#instantiate), [w/o data](#instantiate_nodata))\n", "2. Calibrate distance threshold $\\tau$ ([w/ data](#calibrate), [w/o data](#calibrate_nodata))\n", "3. Infer membership on evaluation data ([w/ data](#infer), [w/o data](#infer_nodata))" ] }, { "cell_type": "code", "execution_count": 1, "id": "5e5fd511", "metadata": {}, "outputs": [], "source": [ "import torch\n", "from torch import nn\n", "import numpy as np" ] }, { "cell_type": "markdown", "id": "e26c8646", "metadata": {}, "source": [ "<a id='preliminaries'></a>\n", "## Preliminaries" ] }, { "cell_type": "code", "execution_count": 2, "id": "5dd608c3", "metadata": {}, "outputs": [], "source": [ "from art.utils import load_mnist\n", "\n", "# data\n", "(x_train, y_train), (x_test, y_test), _min, _max = load_mnist(raw=True)\n", "\n", "x_train = np.expand_dims(x_train, axis=1).astype(np.float32)\n", "x_test = np.expand_dims(x_test, axis=1).astype(np.float32)" ] }, { "cell_type": "code", "execution_count": 3, "id": "312dfff9", "metadata": {}, "outputs": [], "source": [ "# model\n", "model = nn.Sequential(\n", " nn.Conv2d(1, 16, 4, stride=2, padding=1),\n", " nn.ReLU(),\n", " nn.Conv2d(16, 32, 4, stride=2, padding=1),\n", " nn.ReLU(),\n", " nn.Flatten(),\n", " nn.Linear(32*7*7,100),\n", " nn.ReLU(),\n", " nn.Linear(100, 10)\n", ")" ] }, { "cell_type": "markdown", "id": "97af923a", "metadata": {}, "source": [ "<a id='wrap'></a>\n", "### Wrap model in PyTorchClassifier" ] }, { "cell_type": "code", "execution_count": 4, "id": "5721144b", "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Base model accuracy: 0.9801\n" ] } ], "source": [ "import torch.optim as optim\n", "from art.estimators.classification.pytorch import PyTorchClassifier\n", "\n", "criterion = nn.CrossEntropyLoss()\n", "optimizer = optim.Adam(model.parameters())\n", "\n", "art_model = PyTorchClassifier(model=model, loss=criterion, optimizer=optimizer, channels_first=True, input_shape=(1,28,28,), nb_classes=10, clip_values=(_min,_max))\n", "art_model.fit(x_train, y_train, nb_epochs=10, batch_size=128)\n", "\n", "pred = np.array([np.argmax(arr) for arr in art_model.predict(x_test)])\n", "\n", "print('Base model accuracy: ', np.sum(pred == y_test) / len(y_test))" ] }, { "cell_type": "markdown", "id": "7868197c", "metadata": {}, "source": [ "<a id='attack'></a>\n", "## Attack (supervised, with data)" ] }, { "cell_type": "markdown", "id": "3476ad5b", "metadata": {}, "source": [ "<a id='instantiate'></a>\n", "### Instantiate attack" ] }, { "cell_type": "code", "execution_count": 5, "id": "ae0ca2f1", "metadata": {}, "outputs": [], "source": [ "from art.attacks.inference.membership_inference import LabelOnlyDecisionBoundary\n", "\n", "mia_label_only = LabelOnlyDecisionBoundary(art_model)" ] }, { "cell_type": "markdown", "id": "aed64f6b", "metadata": {}, "source": [ "<a id='calibrate'></a>\n", "### Calibrate distance threshold" ] }, { "cell_type": "code", "execution_count": 6, "id": "10996719", "metadata": {}, "outputs": [], "source": [ "# number of samples used to calibrate distance threshold\n", "attack_train_size = 1500\n", "attack_test_size = 1500\n", "\n", "x = np.concatenate([x_train, x_test])\n", "y = np.concatenate([y_train, y_test])\n", "training_sample = np.array([1] * len(x_train) + [0] * len(x_test))" ] }, { "cell_type": "code", "execution_count": 7, "id": "9757718c", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "455ab2837b09458798aab233eabca21f", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HopSkipJump: 0%| | 0/1500 [00:00<?, ?it/s]" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "8b3b49de9d844cc99482fd62d1c82299", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HopSkipJump: 0%| | 0/1500 [00:00<?, ?it/s]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mia_label_only.calibrate_distance_threshold(x_train[:attack_train_size], y_train[:attack_train_size],\n", " x_test[:attack_test_size], y_test[:attack_test_size])" ] }, { "cell_type": "markdown", "id": "f3d0451f", "metadata": {}, "source": [ "<a id='infer'></a>\n", "### Infer membership on evaluation data" ] }, { "cell_type": "code", "execution_count": 8, "id": "92bde49e", "metadata": {}, "outputs": [], "source": [ "from numpy.random import choice\n", "\n", "# evaluation data\n", "n = 500\n", "eval_data_idx = choice(len(x), n)\n", "x_eval, y_eval = x[eval_data_idx], y[eval_data_idx]\n", "eval_label = training_sample[eval_data_idx]" ] }, { "cell_type": "code", "execution_count": 9, "id": "bd93d08c", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "fcb66a7c79f342e0ad8ce9448a89e75a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HopSkipJump: 0%| | 0/500 [00:00<?, ?it/s]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "pred_label = mia_label_only.infer(x_eval, y_eval)" ] }, { "cell_type": "code", "execution_count": 10, "id": "f511eceb", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 0.656000\n" ] } ], "source": [ "from sklearn.metrics import accuracy_score\n", "\n", "print(\"Accuracy: %f\" % accuracy_score(eval_label, pred_label))" ] }, { "cell_type": "markdown", "id": "b3c46823", "metadata": {}, "source": [ "<a id='attack_nodata'></a>\n", "## Attack (unsupervised, without data)" ] }, { "cell_type": "markdown", "id": "7294bc05", "metadata": {}, "source": [ "<a id='instantiate_nodata'></a>\n", "### Instantiate attack" ] }, { "cell_type": "code", "execution_count": 11, "id": "4e6d1966", "metadata": {}, "outputs": [], "source": [ "mia_label_only_unsupervised = LabelOnlyDecisionBoundary(art_model)" ] }, { "cell_type": "markdown", "id": "76253392", "metadata": {}, "source": [ "<a id='calibrate_nodata'></a>\n", "### Calibrate distance threshold" ] }, { "cell_type": "code", "execution_count": 12, "id": "35972879", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "816ac6a610034a92a5206e1ab2880d5a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HopSkipJump: 0%| | 0/500 [00:00<?, ?it/s]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# calibrate distance threshold in an UNSUPERVISED way, without data\n", "mia_label_only_unsupervised.calibrate_distance_threshold_unsupervised(top_t=50, num_samples=500, max_queries=2, verbose=True, batch_size=256)" ] }, { "cell_type": "markdown", "id": "291a0079", "metadata": {}, "source": [ "<a id='infer_nodata'></a>\n", "### Infer membership on evaluation data" ] }, { "cell_type": "code", "execution_count": 13, "id": "8c29161e", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2cb56e14695c4cc6b8a7f97e3bf73160", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HopSkipJump: 0%| | 0/500 [00:00<?, ?it/s]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "pred_label_unsupervised = mia_label_only_unsupervised.infer(x_eval, y_eval)" ] }, { "cell_type": "code", "execution_count": 14, "id": "f99d945f", "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 0.868000\n" ] } ], "source": [ "print(\"Accuracy: %f\" % accuracy_score(eval_label, pred_label_unsupervised))" ] }, { "cell_type": "markdown", "id": "9022ce57", "metadata": {}, "source": [ "**As we can see, one does not need any data, their correct label or knowledge about their membership to perform a successful membership inference attack.**\n", "\n", "**The attacker needs only the observed output labels of the model.**" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.11" } }, "nbformat": 4, "nbformat_minor": 5 }