{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Theano 在 Windows 上的配置 "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"注意:不建议在 `windows` 进行 `theano` 的配置。\n",
"\n",
"务必确认你的显卡支持 `CUDA`。\n",
"\n",
"我个人的电脑搭载的是 `Windows 10 x64` 系统,显卡是 `Nvidia GeForce GTX 850M`。"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 安装 theano"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"首先是用 `anaconda` 安装 `theano`:\n",
"\n",
" conda install mingw libpython\n",
" pip install theano"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 安装 VS 和 CUDA"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"按顺序安装这两个软件:\n",
"- 安装 Visual Studio 2010/2012/2013\n",
"- 安装 对应的 x64 或 x86 CUDA\n",
"\n",
"Cuda 的版本与电脑的显卡兼容。\n",
"\n",
"我安装的是 Visual Studio 2012 和 CUDA v7.0v。"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 配置环境变量"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`CUDA` 会自动帮你添加一个 `CUDA_PATH` 环境变量(环境变量在 控制面板->系统与安全->系统->高级系统设置 中),表示你的 `CUDA` 安装位置,我的电脑上为:\n",
"\n",
"- `CUDA_PATH`\n",
" - `C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v7.0`\n",
"\n",
"我们配置两个相关变量:\n",
"\n",
"- `CUDA_BIN_PATH`\n",
" - `%CUDA_PATH%\\bin`\n",
"- `CUDA_LIB_PATH`\n",
" - `%CUDA_PATH%\\lib\\Win32`\n",
"\n",
"接下来在 `Path` 环境变量的后面加上:\n",
"\n",
"- `Minicoda` 中关于 `mingw` 的项:\n",
" - `C:\\Miniconda\\MinGW\\bin;`\n",
" - `C:\\Miniconda\\MinGW\\x86_64-w64-mingw32\\lib;`\n",
"\n",
"- `VS` 中的 `cl` 编译命令: \n",
" - `C:\\Program Files (x86)\\Microsoft Visual Studio 11.0\\VC\\bin;`\n",
" - `C:\\Program Files (x86)\\Microsoft Visual Studio 11.0\\Common7\\IDE;`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"生成测试文件:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Writing test_theano.py\n"
]
}
],
"source": [
"%%file test_theano.py\n",
"from theano import config\n",
"print 'using device:', config.device"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"我们可以通过临时设置环境变量 `THEANO_FLAGS` 来改变 `theano` 的运行模式,在 linux 下,临时环境变量直接用:\n",
"\n",
" THEANO_FLAGS=xxx \n",
" \n",
"就可以完成,设置完成之后,该环境变量只在当前的命令窗口有效,你可以这样运行你的代码:\n",
"\n",
" THEANO_FLAGS=xxx python .py\n",
" \n",
"在 `Windows` 下,需要使用 `set` 命令来临时设置环境变量,所以运行方式为:\n",
" \n",
" set THEANO_FLAGS=xxx && python .py "
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"using device: cpu\r\n"
]
}
],
"source": [
"import sys\n",
"\n",
"if sys.platform == 'win32':\n",
" !set THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 && python test_theano.py\n",
"else:\n",
" !THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python test_theano.py"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Using gpu device 0: Tesla C2075 (CNMeM is disabled)\n",
"using device: gpu\n"
]
}
],
"source": [
"if sys.platform == 'win32':\n",
" !set THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 && python test_theano.py\n",
"else:\n",
" !THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python test_theano.py"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"测试 `CPU` 和 `GPU` 的差异:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Overwriting test_theano.py\n"
]
}
],
"source": [
"%%file test_theano.py\n",
"\n",
"from theano import function, config, shared, sandbox\n",
"import theano.tensor as T\n",
"import numpy\n",
"import time\n",
"\n",
"vlen = 10 * 30 * 768 # 10 x #cores x # threads per core\n",
"iters = 1000\n",
"\n",
"rng = numpy.random.RandomState(22)\n",
"x = shared(numpy.asarray(rng.rand(vlen), config.floatX))\n",
"f = function([], T.exp(x))\n",
"\n",
"t0 = time.time()\n",
"for i in xrange(iters):\n",
" r = f()\n",
"t1 = time.time()\n",
"print(\"Looping %d times took %f seconds\" % (iters, t1 - t0))\n",
"print(\"Result is %s\" % (r,))\n",
"if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):\n",
" print('Used the cpu')\n",
"else:\n",
" print('Used the gpu')"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Looping 1000 times took 3.498123 seconds\r\n",
"Result is [ 1.23178029 1.61879337 1.52278066 ..., 2.20771813 2.29967761\r\n",
" 1.62323284]\r\n",
"Used the cpu\r\n"
]
}
],
"source": [
"if sys.platform == 'win32':\n",
" !set THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 && python test_theano.py\n",
"else:\n",
" !THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python test_theano.py"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Using gpu device 0: Tesla C2075 (CNMeM is disabled)\n",
"Looping 1000 times took 0.847006 seconds\n",
"Result is [ 1.23178029 1.61879349 1.52278066 ..., 2.20771813 2.29967761\n",
" 1.62323296]\n",
"Used the gpu\n"
]
}
],
"source": [
"if sys.platform == 'win32':\n",
" !set THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 && python test_theano.py\n",
"else:\n",
" !THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python test_theano.py"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"可以看到 `GPU` 明显要比 `CPU` 快。\n",
"\n",
"使用 `GPU` 模式的 `T.exp(x)` 可以获得更快的加速效果:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Overwriting test_theano.py\n"
]
}
],
"source": [
"%%file test_theano.py\n",
"\n",
"from theano import function, config, shared, sandbox\n",
"import theano.sandbox.cuda.basic_ops\n",
"import theano.tensor as T\n",
"import numpy\n",
"import time\n",
"\n",
"vlen = 10 * 30 * 768 # 10 x #cores x # threads per core\n",
"iters = 1000\n",
"\n",
"rng = numpy.random.RandomState(22)\n",
"x = shared(numpy.asarray(rng.rand(vlen), 'float32'))\n",
"f = function([], sandbox.cuda.basic_ops.gpu_from_host(T.exp(x)))\n",
"\n",
"t0 = time.time()\n",
"for i in xrange(iters):\n",
" r = f()\n",
"t1 = time.time()\n",
"print(\"Looping %d times took %f seconds\" % (iters, t1 - t0))\n",
"print(\"Result is %s\" % (r,))\n",
"print(\"Numpy result is %s\" % (numpy.asarray(r),))\n",
"if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):\n",
" print('Used the cpu')\n",
"else:\n",
" print('Used the gpu')"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Using gpu device 0: Tesla C2075 (CNMeM is disabled)\n",
"Looping 1000 times took 0.318359 seconds\n",
"Result is \n",
"Numpy result is [ 1.23178029 1.61879349 1.52278066 ..., 2.20771813 2.29967761\n",
" 1.62323296]\n",
"Used the gpu\n"
]
}
],
"source": [
"if sys.platform == 'win32':\n",
" !set THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 && python test_theano.py\n",
"else:\n",
" !THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python test_theano.py"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"!rm test_theano.py"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 配置 .theanorc.txt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"我们可以在个人文件夹下配置 .theanorc.txt 文件来省去每次都使用环境变量设置的麻烦:\n",
"\n",
"例如我现在的 .theanorc.txt 配置为:\n",
"```\n",
"[global]\n",
"device = gpu\n",
"floatX = float32\n",
"\n",
"[nvcc]\n",
"fastmath = True\n",
"flags = -LC:\\Miniconda\\libs\n",
"compiler_bindir=C:\\Program Files (x86)\\Microsoft Visual Studio 11.0\\VC\\bin\n",
"\n",
"[gcc]\n",
"cxxflags = -LC:\\Miniconda\\MinGW\n",
"```\n",
"\n",
"具体这些配置有什么作用之后可以查看官网上的教程。"
]
}
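,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sketch (this is not the file from my own setup), the `THEANO_FLAGS` value used earlier for the CPU test, `mode=FAST_RUN,device=cpu,floatX=float32`, could be written as a `.theanorc.txt` along the following lines. It assumes that these top-level flags belong in the `[global]` section, as `device` and `floatX` do in the file above:\n",
"\n",
"```\n",
"[global]\n",
"mode = FAST_RUN\n",
"device = cpu\n",
"floatX = float32\n",
"```\n",
"\n",
"A CPU-only file like this needs no `[nvcc]` section, since that section only configures the `CUDA` compiler."
]
}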
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 0
}