{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "##
Testing the statistical significance of Britons' desire to leave the EU
\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**According to data published by the [BBC](http://www.bbc.com/news/politics/eu_referendum/results), 17,410,742 people (51.9%) voted for Britain to leave the EU and 16,141,241 (48.1%) voted against.**" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "N_LEAVE, N_REMAIN = 17410742, 16141241" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "referendum_results = np.array([1] * N_LEAVE + [0] * N_REMAIN)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Point estimate of the proportion who want to leave the EU:**" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.51891841981441156" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.mean(referendum_results)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Normal-approximation confidence interval for the proportion who want to leave the EU:**\n", "\n", "$$\\hat{p}\\pm z_{1-\\frac{\\alpha}{2}} \\sqrt{\\frac{\\hat{p}\\left(1-\\hat{p}\\right)}{n}}$$" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "normal_interval [0.518749, 0.519087] with width 0.000338\n" ] } ], "source": [ "from statsmodels.stats.proportion import proportion_confint\n", "\n", "normal_interval = proportion_confint(np.sum(referendum_results),\n", "                                     referendum_results.shape[0],\n", "                                     method='normal')\n", "# print is a function under the Python 3 kernel this notebook declares\n", "print('normal_interval [%f, %f] with width %f' % (\n", "    normal_interval[0],\n", "    normal_interval[1],\n", "    normal_interval[1] - normal_interval[0]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**The sample is large, so the Wilson confidence interval coincides with the normal one to the six decimal places shown.**\n", "\n", "$$\\frac1{ 1 + \\frac{z^2}{n} } \\left( \\hat{p} + 
\\frac{z^2}{2n} \\pm z \\sqrt{ \\frac{ \\hat{p}\\left(1-\\hat{p}\\right)}{n} + \\frac{z^2}{4n^2} } \\right), \\;\\; z \\equiv z_{1-\\frac{\\alpha}{2}}$$" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "wilson_interval [0.518749, 0.519087] with width 0.000338\n" ] } ], "source": [ "wilson_interval = proportion_confint(np.sum(referendum_results),\n", "                                     referendum_results.shape[0],\n", "                                     method='wilson')\n", "# print is a function under the Python 3 kernel this notebook declares\n", "print('wilson_interval [%f, %f] with width %f' % (\n", "    wilson_interval[0],\n", "    wilson_interval[1],\n", "    wilson_interval[1] - wilson_interval[0]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**The one-sample Student's t-test indicates that the difference between the sample mean and 0.5 is statistically significant.**" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Ttest_1sampResult(statistic=219.32343584383185, pvalue=0.0)" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from scipy.stats import ttest_1samp\n", "\n", "ttest_1samp(referendum_results, 0.5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**The sign test, the Wilcoxon signed-rank test, and the binomial test for a proportion confirm this:**" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from scipy.stats import wilcoxon, binom_test\n", "from statsmodels.stats.descriptivestats import sign_test" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(634750.5, 1.8774494541967369e-322)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sign_test(referendum_results, 0.5)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ 
"WilcoxonResult(statistic=270785329886072.0, pvalue=0.0)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "wilcoxon(referendum_results - 0.5)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.0" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "binom_test(N_LEAVE, N_LEAVE + N_REMAIN, 0.5, alternative = 'greater')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.1" } }, "nbformat": 4, "nbformat_minor": 1 }