{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Configuring Theano on Windows" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: setting up `theano` on `Windows` is not recommended.\n", "\n", "Make sure your graphics card supports `CUDA`.\n", "\n", "My own machine runs `Windows 10 x64` with an `Nvidia GeForce GTX 850M` graphics card." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Installing theano" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, install `theano` with `anaconda`:\n", "\n", "    conda install mingw libpython\n", "    pip install theano" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Installing VS and CUDA" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Install these two packages, in this order:\n", "- Visual Studio 2010/2012/2013\n", "- the matching x64 or x86 build of CUDA\n", "\n", "The CUDA version must be compatible with your graphics card.\n", "\n", "I installed Visual Studio 2012 and CUDA v7.0." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Configuring environment variables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `CUDA` installer automatically adds a `CUDA_PATH` environment variable that points to your `CUDA` install location (environment variables are under Control Panel -> System and Security -> System -> Advanced system settings). On my machine it is:\n", "\n", "- `CUDA_PATH`\n", "    - `C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v7.0`\n", "\n", "We add two related variables:\n", "\n", "- `CUDA_BIN_PATH`\n", "    - `%CUDA_PATH%\\bin`\n", "- `CUDA_LIB_PATH`\n", "    - `%CUDA_PATH%\\lib\\Win32`\n", "\n", "Next, append the following to the `Path` environment variable:\n", "\n", "- the `mingw` entries from `Miniconda`:\n", "    - `C:\\Miniconda\\MinGW\\bin;`\n", "    - `C:\\Miniconda\\MinGW\\x86_64-w64-mingw32\\lib;`\n", "\n", "- the `cl` compiler command from `VS`:\n", "    - `C:\\Program Files (x86)\\Microsoft Visual Studio 11.0\\VC\\bin;`\n", "    - `C:\\Program Files (x86)\\Microsoft Visual Studio 11.0\\Common7\\IDE;`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate a test file:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Writing test_theano.py\n" ] } ], "source": [ "%%file test_theano.py\n", "from theano import config\n", "print 'using device:', 
config.device" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can change the mode `theano` runs in by temporarily setting the `THEANO_FLAGS` environment variable. On Linux, a temporary environment variable is set simply by writing:\n", "\n", "    THEANO_FLAGS=xxx \n", "\n", "A variable set this way does not persist beyond the current command window. You can run your code like this:\n", "\n", "    THEANO_FLAGS=xxx python .py\n", "\n", "On `Windows`, a temporary environment variable has to be set with the `set` command, so the invocation becomes:\n", "\n", "    set THEANO_FLAGS=xxx && python .py " ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "using device: cpu\r\n" ] } ], "source": [ "import sys\n", "\n", "if sys.platform == 'win32':\n", "    !set THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 && python test_theano.py\n", "else:\n", "    !THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python test_theano.py" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Using gpu device 0: Tesla C2075 (CNMeM is disabled)\n", "using device: gpu\n" ] } ], "source": [ "if sys.platform == 'win32':\n", "    !set THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 && python test_theano.py\n", "else:\n", "    !THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python test_theano.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compare the `CPU` and the `GPU`:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting test_theano.py\n" ] } ], "source": [ "%%file test_theano.py\n", "\n", "from theano import function, config, shared, sandbox\n", "import theano.tensor as T\n", "import numpy\n", "import time\n", "\n", "vlen = 10 * 30 * 768 # 10 x #cores x # threads per core\n", "iters = 1000\n", "\n", "rng = numpy.random.RandomState(22)\n", "x = shared(numpy.asarray(rng.rand(vlen), config.floatX))\n", "f = function([], T.exp(x))\n", "\n", "t0 = time.time()\n", "for i in 
xrange(iters):\n", "    r = f()\n", "t1 = time.time()\n", "print(\"Looping %d times took %f seconds\" % (iters, t1 - t0))\n", "print(\"Result is %s\" % (r,))\n", "if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):\n", "    print('Used the cpu')\n", "else:\n", "    print('Used the gpu')" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Looping 1000 times took 3.498123 seconds\r\n", "Result is [ 1.23178029 1.61879337 1.52278066 ..., 2.20771813 2.29967761\r\n", " 1.62323284]\r\n", "Used the cpu\r\n" ] } ], "source": [ "if sys.platform == 'win32':\n", "    !set THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 && python test_theano.py\n", "else:\n", "    !THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python test_theano.py" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Using gpu device 0: Tesla C2075 (CNMeM is disabled)\n", "Looping 1000 times took 0.847006 seconds\n", "Result is [ 1.23178029 1.61879349 1.52278066 ..., 2.20771813 2.29967761\n", " 1.62323296]\n", "Used the gpu\n" ] } ], "source": [ "if sys.platform == 'win32':\n", "    !set THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 && python test_theano.py\n", "else:\n", "    !THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python test_theano.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, the `GPU` is clearly faster than the `CPU`, roughly 4x in this run.\n", "\n", "Keeping the result of `T.exp(x)` on the `GPU` gives an even larger speedup:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting test_theano.py\n" ] } ], "source": [ "%%file test_theano.py\n", "\n", "from theano import function, config, shared, sandbox\n", "import theano.sandbox.cuda.basic_ops\n", "import theano.tensor as T\n", "import numpy\n", "import time\n", 
"\n", "vlen = 10 * 30 * 768 # 10 x #cores x # threads per core\n", "iters = 1000\n", "\n", "rng = numpy.random.RandomState(22)\n", "x = shared(numpy.asarray(rng.rand(vlen), 'float32'))\n", "f = function([], sandbox.cuda.basic_ops.gpu_from_host(T.exp(x)))\n", "\n", "t0 = time.time()\n", "for i in xrange(iters):\n", " r = f()\n", "t1 = time.time()\n", "print(\"Looping %d times took %f seconds\" % (iters, t1 - t0))\n", "print(\"Result is %s\" % (r,))\n", "print(\"Numpy result is %s\" % (numpy.asarray(r),))\n", "if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):\n", " print('Used the cpu')\n", "else:\n", " print('Used the gpu')" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Using gpu device 0: Tesla C2075 (CNMeM is disabled)\n", "Looping 1000 times took 0.318359 seconds\n", "Result is \n", "Numpy result is [ 1.23178029 1.61879349 1.52278066 ..., 2.20771813 2.29967761\n", " 1.62323296]\n", "Used the gpu\n" ] } ], "source": [ "if sys.platform == 'win32':\n", " !set THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 && python test_theano.py\n", "else:\n", " !THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python test_theano.py" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [], "source": [ "!rm test_theano.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 配置 .theanorc.txt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "我们可以在个人文件夹下配置 .theanorc.txt 文件来省去每次都使用环境变量设置的麻烦:\n", "\n", "例如我现在的 .theanorc.txt 配置为:\n", "```\n", "[global]\n", "device = gpu\n", "floatX = float32\n", "\n", "[nvcc]\n", "fastmath = True\n", "flags = -LC:\\Miniconda\\libs\n", "compiler_bindir=C:\\Program Files (x86)\\Microsoft Visual Studio 11.0\\VC\\bin\n", "\n", "[gcc]\n", "cxxflags = -LC:\\Miniconda\\MinGW\n", "```\n", "\n", "具体这些配置有什么作用之后可以查看官网上的教程。" ] } ], "metadata": { 
"kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.6" } }, "nbformat": 4, "nbformat_minor": 0 }