{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Theano Configuration and Compilation Modes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Configuration" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we saw earlier, Theano's configuration can be inspected through the `config` module:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "floatX (('float64', 'float32', 'float16')) \n", " Doc: Default floating-point precision for python casts.\n", "\n", "Note: float16 support is experimental, use at your own risk.\n", " Value: float32\n", "\n", "warn_float64 (('ignore', 'warn', 'raise', 'pdb')) \n", " Doc: Do an action when a tensor variable with float64 dtype is created. They can't be run on the GPU with the current(old) gpu back-end and are slow with gamer GPUs.\n", " Value: ignore\n", "\n", "cast_policy (('custom', 'numpy+floatX')) \n", " Doc: Rules for implicit type casting\n", " Value: custom\n", "\n", "int_division (('int', 'raise', 'floatX')) \n", " Doc: What to do when one computes x / y, where both x and y are of integer types\n", " Value: int\n", "\n", "device (cpu, gpu*, opencl*, cuda*) \n", " Doc: Default device for computations. If gpu*, change the default to try to move computation to it and to put shared variable of float32 on it. Do not use upper case letters, only lower case even if NVIDIA use capital letters.\n", " Value: gpu1\n", "\n", "init_gpu_device (, gpu*, opencl*, cuda*) \n", " Doc: Initialize the gpu device to use, works only if device=cpu. Unlike 'device', setting this option will NOT move computations, nor shared variables, to the specified GPU. It can be used to run GPU-specific tests on a particular GPU.\n", " Value: \n", "\n", "force_device () \n", " Doc: Raise an error if we can't use the specified device\n", " Value: False\n", "\n", "\n", " Doc: \n", " Context map for multi-gpu operation. 
Format is a\n", " semicolon-separated list of names and device names in the\n", " 'name->dev_name' format. An example that would map name 'test' to\n", " device 'cuda0' and name 'test2' to device 'opencl0:0' follows:\n", " \"test->cuda0;test2->opencl0:0\".\n", "\n", " Invalid context names are 'cpu', 'cuda*' and 'opencl*'\n", " \n", " Value: \n", "\n", "print_active_device () \n", " Doc: Print active device at when the GPU device is initialized.\n", " Value: True\n", "\n", "enable_initial_driver_test () \n", " Doc: Tests the nvidia driver when a GPU device is initialized.\n", " Value: True\n", "\n", "cuda.root () \n", " Doc: directory with bin/, lib/, include/ for cuda utilities.\n", " This directory is included via -L and -rpath when linking\n", " dynamically compiled modules. If AUTO and nvcc is in the\n", " path, it will use one of nvcc parent directory. Otherwise\n", " /usr/local/cuda will be used. Leave empty to prevent extra\n", " linker directives. Default: environment variable \"CUDA_ROOT\"\n", " or else \"AUTO\".\n", " \n", " Value: /usr/local/cuda-7.0\n", "\n", "\n", " Doc: Extra compiler flags for nvcc\n", " Value: \n", "\n", "nvcc.compiler_bindir () \n", " Doc: If defined, nvcc compiler driver will seek g++ and gcc in this directory\n", " Value: \n", "\n", "nvcc.fastmath () \n", " Doc: \n", " Value: False\n", "\n", "gpuarray.sync () \n", " Doc: If True, every op will make sure its work is done before\n", " returning. Setting this to True will slow down execution,\n", " but give much more accurate results in profiling.\n", " Value: False\n", "\n", "gpuarray.preallocate () \n", " Doc: If 0 it doesn't do anything. 
If between 0 and 1 it\n", " will preallocate that fraction of the total GPU memory.\n", " If 1 or greater it will preallocate that amount of memory\n", " (in megabytes).\n", " Value: 0.0\n", "\n", "\n", " Doc: This flag is deprecated; use dnn.conv.algo_fwd.\n", " Value: True\n", "\n", "\n", " Doc: This flag is deprecated; use dnn.conv.algo_bwd.\n", " Value: True\n", "\n", "\n", " Doc: This flag is deprecated; use dnn.conv.algo_bwd_data and dnn.conv.algo_bwd_filter.\n", " Value: True\n", "\n", "dnn.conv.algo_fwd (('small', 'none', 'large', 'fft', 'fft_tiling', 'guess_once', 'guess_on_shape_change', 'time_once', 'time_on_shape_change')) \n", " Doc: Default implementation to use for CuDNN forward convolution.\n", " Value: small\n", "\n", "dnn.conv.algo_bwd_data (('none', 'deterministic', 'fft', 'fft_tiling', 'guess_once', 'guess_on_shape_change', 'time_once', 'time_on_shape_change')) \n", " Doc: Default implementation to use for CuDNN backward convolution to get the gradients of the convolution with regard to the inputs.\n", " Value: none\n", "\n", "dnn.conv.algo_bwd_filter (('none', 'deterministic', 'fft', 'small', 'guess_once', 'guess_on_shape_change', 'time_once', 'time_on_shape_change')) \n", " Doc: Default implementation to use for CuDNN backward convolution to get the gradients of the convolution with regard to the filters.\n", " Value: none\n", "\n", "dnn.conv.precision (('as_input', 'float16', 'float32', 'float64')) \n", " Doc: Default data precision to use for the computation in CuDNN convolutions (defaults to the same dtype as the inputs of the convolutions).\n", " Value: as_input\n", "\n", "dnn.include_path () \n", " Doc: Location of the cudnn header (defaults to the cuda root)\n", " Value: /usr/local/cuda-7.0/include\n", "\n", "dnn.library_path () \n", " Doc: Location of the cudnn header (defaults to the cuda root)\n", " Value: /usr/local/cuda-7.0/lib64\n", "\n", "assert_no_cpu_op (('ignore', 'warn', 'raise', 'pdb')) \n", " Doc: Raise an error/warning if 
there is a CPU op in the computational graph.\n", " Value: ignore\n", "\n", "mode (('Mode', 'ProfileMode', 'DebugMode', 'FAST_RUN', 'NanGuardMode', 'FAST_COMPILE', 'PROFILE_MODE', 'DEBUG_MODE')) \n", " Doc: Default compilation mode\n", " Value: Mode\n", "\n", "cxx () \n", " Doc: The C++ compiler to use. Currently only g++ is supported, but supporting additional compilers should not be too difficult. If it is empty, no C++ code is compiled.\n", " Value: /usr/bin/g++\n", "\n", "linker (('cvm', 'c|py', 'py', 'c', 'c|py_nogc', 'vm', 'vm_nogc', 'cvm_nogc')) \n", " Doc: Default linker used if the theano flags mode is Mode or ProfileMode(deprecated)\n", " Value: cvm\n", "\n", "allow_gc () \n", " Doc: Do we default to delete intermediate results during Theano function calls? Doing so lowers the memory requirement, but asks that we reallocate memory at the next function call. This is implemented for the default linker, but may not work for all linkers.\n", " Value: True\n", "\n", "optimizer (('fast_run', 'merge', 'fast_compile', 'None')) \n", " Doc: Default optimizer. 
If not None, will use this linker with the Mode object (not ProfileMode(deprecated) or DebugMode)\n", " Value: fast_run\n", "\n", "optimizer_verbose () \n", " Doc: If True, we print all optimization being applied\n", " Value: False\n", "\n", "on_opt_error (('warn', 'raise', 'pdb', 'ignore')) \n", " Doc: What to do when an optimization crashes: warn and skip it, raise the exception, or fall into the pdb debugger.\n", " Value: warn\n", "\n", "\n", " Doc: This config option was removed in 0.5: do not use it!\n", " Value: True\n", "\n", "nocleanup () \n", " Doc: Suppress the deletion of code files that did not compile cleanly\n", " Value: False\n", "\n", "on_unused_input (('raise', 'warn', 'ignore')) \n", " Doc: What to do if a variable in the 'inputs' list of theano.function() is not used in the graph.\n", " Value: raise\n", "\n", "tensor.cmp_sloppy () \n", " Doc: Relax tensor._allclose (0) not at all, (1) a bit, (2) more\n", " Value: 0\n", "\n", "tensor.local_elemwise_fusion () \n", " Doc: Enable or not in fast_run mode(fast_run optimization) the elemwise fusion optimization\n", " Value: True\n", "\n", "gpu.local_elemwise_fusion () \n", " Doc: Enable or not in fast_run mode(fast_run optimization) the gpu elemwise fusion optimization\n", " Value: True\n", "\n", "lib.amdlibm () \n", " Doc: Use amd's amdlibm numerical library\n", " Value: False\n", "\n", "gpuelemwise.sync () \n", " Doc: when true, wait that the gpu fct finished and check it error code.\n", " Value: True\n", "\n", "traceback.limit () \n", " Doc: The number of stack to trace. 
-1 mean all.\n", " Value: 8\n", "\n", "experimental.mrg () \n", " Doc: Another random number generator that work on the gpu\n", " Value: False\n", "\n", "experimental.unpickle_gpu_on_cpu () \n", " Doc: Allow unpickling of pickled CudaNdarrays as numpy.ndarrays.This is useful, if you want to open a CudaNdarray without having cuda installed.If you have cuda installed, this will force unpickling tobe done on the cpu to numpy.ndarray.Please be aware that this may get you access to the data,however, trying to unpicke gpu functions will not succeed.This flag is experimental and may be removed any time, whengpu<>cpu transparency is solved.\n", " Value: False\n", "\n", "numpy.seterr_all (('ignore', 'warn', 'raise', 'call', 'print', 'log', 'None')) \n", " Doc: (\"Sets numpy's behaviour for floating-point errors, \", \"see numpy.seterr. 'None' means not to change numpy's default, which can be different for different numpy releases. This flag sets the default behaviour for all kinds of floating-point errors, its effect can be overriden for specific errors by the following flags: seterr_divide, seterr_over, seterr_under and seterr_invalid.\")\n", " Value: ignore\n", "\n", "numpy.seterr_divide (('None', 'ignore', 'warn', 'raise', 'call', 'print', 'log')) \n", " Doc: Sets numpy's behavior for division by zero, see numpy.seterr. 'None' means using the default, defined by numpy.seterr_all.\n", " Value: None\n", "\n", "numpy.seterr_over (('None', 'ignore', 'warn', 'raise', 'call', 'print', 'log')) \n", " Doc: Sets numpy's behavior for floating-point overflow, see numpy.seterr. 'None' means using the default, defined by numpy.seterr_all.\n", " Value: None\n", "\n", "numpy.seterr_under (('None', 'ignore', 'warn', 'raise', 'call', 'print', 'log')) \n", " Doc: Sets numpy's behavior for floating-point underflow, see numpy.seterr. 
'None' means using the default, defined by numpy.seterr_all.\n", " Value: None\n", "\n", "numpy.seterr_invalid (('None', 'ignore', 'warn', 'raise', 'call', 'print', 'log')) \n", " Doc: Sets numpy's behavior for invalid floating-point operation, see numpy.seterr. 'None' means using the default, defined by numpy.seterr_all.\n", " Value: None\n", "\n", "warn.ignore_bug_before (('0.6', 'None', 'all', '0.3', '0.4', '0.4.1', '0.5', '0.7')) \n", " Doc: If 'None', we warn about all Theano bugs found by default. If 'all', we don't warn about Theano bugs found by default. If a version, we print only the warnings relative to Theano bugs found after that version. Warning for specific bugs can be configured with specific [warn] flags.\n", " Value: 0.6\n", "\n", "warn.argmax_pushdown_bug () \n", " Doc: Warn if in past version of Theano we generated a bug with the theano.tensor.nnet.nnet.local_argmax_pushdown optimization. Was fixed 27 may 2010\n", " Value: False\n", "\n", "warn.gpusum_01_011_0111_bug () \n", " Doc: Warn if we are in a case where old version of Theano had a silent bug with GpuSum pattern 01,011 and 0111 when the first dimensions was bigger then 4096. Was fixed 31 may 2010\n", " Value: False\n", "\n", "warn.sum_sum_bug () \n", " Doc: Warn if we are in a case where Theano version between version 9923a40c7b7a and the 2 august 2010 (fixed date), generated an error in that case. This happens when there are 2 consecutive sums in the graph, bad code was generated. Was fixed 2 August 2010\n", " Value: False\n", "\n", "warn.sum_div_dimshuffle_bug () \n", " Doc: Warn if previous versions of Theano (between rev. 3bd9b789f5e8, 2010-06-16, and cfc6322e5ad4, 2010-08-03) would have given incorrect result. 
This bug was triggered by sum of division of dimshuffled tensors.\n", " Value: False\n", "\n", "warn.subtensor_merge_bug () \n", " Doc: Warn if previous versions of Theano (before 0.5rc2) could have given incorrect results when indexing into a subtensor with negative stride (for instance, for instance, x[a:b:-1][c]).\n", " Value: False\n", "\n", "warn.gpu_set_subtensor1 () \n", " Doc: Warn if previous versions of Theano (before 0.6) could have given incorrect results when moving to the gpu set_subtensor(x[int vector], new_value)\n", " Value: False\n", "\n", "warn.vm_gc_bug () \n", " Doc: There was a bug that existed in the default Theano configuration, only in the development version between July 5th 2012 and July 30th 2012. This was not in a released version. If your code was affected by this bug, a warning will be printed during the code execution if you use the `linker=vm,vm.lazy=True,warn.vm_gc_bug=True` Theano flags. This warning is disabled by default as the bug was not released.\n", " Value: False\n", "\n", "warn.signal_conv2d_interface () \n", " Doc: Warn we use the new signal.conv2d() when its interface changed mid June 2014\n", " Value: True\n", "\n", "warn.reduce_join () \n", " Doc: Your current code is fine, but Theano versions prior to 0.7 (or this development version) might have given an incorrect result. To disable this warning, set the Theano flag warn.reduce_join to False. The problem was an optimization, that modified the pattern \"Reduce{scalar.op}(Join(axis=0, a, b), axis=0)\", did not check the reduction axis. 
So if the reduction axis was not 0, you got a wrong answer.\n", " Value: True\n", "\n", "warn.inc_set_subtensor1 () \n", " Doc: Warn if previous versions of Theano (before 0.7) could have given incorrect results for inc_subtensor and set_subtensor when using some patterns of advanced indexing (indexing with one vector or matrix of ints).\n", " Value: True\n", "\n", "compute_test_value (('off', 'ignore', 'warn', 'raise', 'pdb')) \n", " Doc: If 'True', Theano will run each op at graph build time, using Constants, SharedVariables and the tag 'test_value' as inputs to the function. This helps the user track down problems in the graph before it gets optimized.\n", " Value: off\n", "\n", "print_test_value () \n", " Doc: If 'True', the __eval__ of a Theano variable will return its test_value when this is available. This has the practical conseguence that, e.g., in debugging `my_var` will print the same as `my_var.tag.test_value` when a test value is defined.\n", " Value: False\n", "\n", "compute_test_value_opt (('off', 'ignore', 'warn', 'raise', 'pdb')) \n", " Doc: For debugging Theano optimization only. Same as compute_test_value, but is used during Theano optimization\n", " Value: off\n", "\n", "unpickle_function () \n", " Doc: Replace unpickled Theano functions with None. This is useful to unpickle old graphs that pickled them when it shouldn't\n", " Value: True\n", "\n", "reoptimize_unpickled_function () \n", " Doc: Re-optimize the graph when a theano function is unpickled from the disk.\n", " Value: False\n", "\n", "exception_verbosity (('low', 'high')) \n", " Doc: If 'low', the text of exceptions will generally refer to apply nodes with short names such as Elemwise{add_no_inplace}. If 'high', some exceptions will also refer to apply nodes with long descriptions like:\n", " A. Elemwise{add_no_inplace}\n", " B. log_likelihood_v_given_h\n", " C. 
log_likelihood_h\n", " Value: low\n", "\n", "openmp () \n", " Doc: Allow (or not) parallel computation on the CPU with OpenMP. This is the default value used when creating an Op that supports OpenMP parallelization. It is preferable to define it via the Theano configuration file ~/.theanorc or with the environment variable THEANO_FLAGS. Parallelization is only done for some operations that implement it, and even for operations that implement parallelism, each operation is free to respect this flag or not. You can control the number of threads used with the environment variable OMP_NUM_THREADS. If it is set to 1, we disable openmp in Theano by default.\n", " Value: False\n", "\n", "openmp_elemwise_minsize () \n", " Doc: If OpenMP is enabled, this is the minimum size of vectors for which the openmp parallelization is enabled in element wise ops.\n", " Value: 200000\n", "\n", "check_input () \n", " Doc: Specify if types should check their input in their C code. It can be used to speed up compilation, reduce overhead (particularly for scalars) and reduce the number of generated C files.\n", " Value: True\n", "\n", "cache_optimizations () \n", " Doc: WARNING: work in progress, does not work yet. Specify if the optimization cache should be used. This cache will any optimized graph and its optimization. Actually slow downs a lot the first optimization, and could possibly still contains some bugs. Use at your own risks.\n", " Value: False\n", "\n", "unittests.rseed () \n", " Doc: Seed to use for randomized unit tests. Special value 'random' means using a seed of None.\n", " Value: 666\n", "\n", "compile.wait () \n", " Doc: Time to wait before retrying to aquire the compile lock.\n", " Value: 5\n", "\n", "compile.timeout () \n", " Doc: In seconds, time that a process will wait before deciding to\n", "override an existing lock. 
An override only happens when the existing\n", "lock is held by the same owner *and* has not been 'refreshed' by this\n", "owner for more than this period. Refreshes are done every half timeout\n", "period for running processes.\n", " Value: 120\n", "\n", "compiledir_format () \n", " Doc: Format string for platform-dependent compiled module subdirectory\n", "(relative to base_compiledir). Available keys: gxx_version, hostname,\n", "numpy_version, platform, processor, python_bitwidth,\n", "python_int_bitwidth, python_version, short_platform, theano_version.\n", "Defaults to 'compiledir_%(short_platform)s-%(processor)s-%(python_vers\n", "ion)s-%(python_bitwidth)s'.\n", " Value: compiledir_%(short_platform)s-%(processor)s-%(python_version)s-%(python_bitwidth)s\n", "\n", "\n", " Doc: platform-independent root directory for compiled modules\n", " Value: /home/lijin/.theano\n", "\n", "\n", " Doc: platform-dependent cache directory for compiled modules\n", " Value: /home/lijin/.theano/compiledir_Linux-3.13--generic-x86_64-with-Ubuntu-14.04-trusty-x86_64-2.7.6-64\n", "\n", "cmodule.mac_framework_link () \n", " Doc: If set to True, breaks certain MacOS installations with the infamous Bus Error\n", " Value: False\n", "\n", "cmodule.warn_no_version () \n", " Doc: If True, will print a warning when compiling one or more Op with C code that can't be cached because there is no c_code_cache_version() function associated to at least one of those Ops.\n", " Value: False\n", "\n", "cmodule.remove_gxx_opt () \n", " Doc: If True, will remove the -O* parameter passed to g++.This is useful to debug in gdb modules compiled by Theano.The parameter -g is passed by default to g++\n", " Value: False\n", "\n", "cmodule.compilation_warning () \n", " Doc: If True, will print compilation warnings.\n", " Value: False\n", "\n", "cmodule.preload_cache () \n", " Doc: If set to True, will preload the C module cache at import time\n", " Value: False\n", "\n", "gcc.cxxflags () \n", " Doc: Extra 
compiler flags for gcc\n", " Value: \n", "\n", "metaopt.verbose () \n", " Doc: Enable verbose output for meta optimizers\n", " Value: False\n", "\n", "optdb.position_cutoff () \n", " Doc: Where to stop eariler during optimization. It represent the position of the optimizer where to stop.\n", " Value: inf\n", "\n", "optdb.max_use_ratio () \n", " Doc: A ratio that prevent infinite loop in EquilibriumOptimizer.\n", " Value: 5.0\n", "\n", "profile () \n", " Doc: If VM should collect profile information\n", " Value: False\n", "\n", "profile_optimizer () \n", " Doc: If VM should collect optimizer profile information\n", " Value: False\n", "\n", "profile_memory () \n", " Doc: If VM should collect memory profile information and print it\n", " Value: False\n", "\n", "\n", " Doc: Useful only for the vm linkers. When lazy is None, auto detect if lazy evaluation is needed and use the apropriate version. If lazy is True/False, force the version used between Loop/LoopGC and Stack.\n", " Value: None\n", "\n", "optimizer_excluding () \n", " Doc: When using the default mode, we will remove optimizer with these tags. Separate tags with ':'.\n", " Value: \n", "\n", "optimizer_including () \n", " Doc: When using the default mode, we will add optimizer with these tags. Separate tags with ':'.\n", " Value: \n", "\n", "optimizer_requiring () \n", " Doc: When using the default mode, we will require optimizer with these tags. 
Separate tags with ':'.\n", " Value: \n", "\n", "DebugMode.patience () \n", " Doc: Optimize graph this many times to detect inconsistency\n", " Value: 10\n", "\n", "DebugMode.check_c () \n", " Doc: Run C implementations where possible\n", " Value: True\n", "\n", "DebugMode.check_py () \n", " Doc: Run Python implementations where possible\n", " Value: True\n", "\n", "DebugMode.check_finite () \n", " Doc: True -> complain about NaN/Inf results\n", " Value: True\n", "\n", "DebugMode.check_strides () \n", " Doc: Check that Python- and C-produced ndarrays have same strides. On difference: (0) - ignore, (1) warn, or (2) raise error\n", " Value: 0\n", "\n", "DebugMode.warn_input_not_reused () \n", " Doc: Generate a warning when destroy_map or view_map says that an op works inplace, but the op did not reuse the input for its output.\n", " Value: True\n", "\n", "DebugMode.check_preallocated_output () \n", " Doc: Test thunks with pre-allocated memory as output storage. This is a list of strings separated by \":\". Valid values are: \"initial\" (initial storage in storage map, happens with Scan),\"previous\" (previously-returned memory), \"c_contiguous\", \"f_contiguous\", \"strided\" (positive and negative strides), \"wrong_size\" (larger and smaller dimensions), and \"ALL\" (all of the above).\n", " Value: \n", "\n", "DebugMode.check_preallocated_output_ndim () \n", " Doc: When testing with \"strided\" preallocated output memory, test all combinations of strides over that number of (inner-most) dimensions. 
You may want to reduce that number to reduce memory or time usage, but it is advised to keep a minimum of 2.\n", " Value: 4\n", "\n", "profiling.time_thunks () \n", " Doc: Time individual thunks when profiling\n", " Value: True\n", "\n", "profiling.n_apply () \n", " Doc: Number of Apply instances to print by default\n", " Value: 20\n", "\n", "profiling.n_ops () \n", " Doc: Number of Ops to print by default\n", " Value: 20\n", "\n", "profiling.output_line_width () \n", " Doc: Max line width for the profiling output\n", " Value: 512\n", "\n", "profiling.min_memory_size () \n", " Doc: For the memory profile, do not print Apply nodes if the size\n", " of their outputs (in bytes) is lower than this threshold\n", " Value: 1024\n", "\n", "profiling.min_peak_memory () \n", " Doc: The min peak memory usage of the order\n", " Value: False\n", "\n", "profiling.destination () \n", " Doc: \n", " File destination of the profiling output\n", " \n", " Value: stderr\n", "\n", "profiling.debugprint () \n", " Doc: \n", " Do a debugprint of the profiled functions\n", " \n", " Value: False\n", "\n", "ProfileMode.n_apply_to_print () \n", " Doc: Number of apply instances to print by default\n", " Value: 15\n", "\n", "ProfileMode.n_ops_to_print () \n", " Doc: Number of ops to print by default\n", " Value: 20\n", "\n", "ProfileMode.min_memory_size () \n", " Doc: For the memory profile, do not print apply nodes if the size of their outputs (in bytes) is lower then this threshold\n", " Value: 1024\n", "\n", "ProfileMode.profile_memory () \n", " Doc: Enable profiling of memory used by Theano functions\n", " Value: False\n", "\n", "on_shape_error (('warn', 'raise')) \n", " Doc: warn: print a warning and use the default value. 
raise: raise an error\n", " Value: warn\n", "\n", "tensor.insert_inplace_optimizer_validate_nb () \n", " Doc: -1: auto, if graph have less then 500 nodes 1, else 10\n", " Value: -1\n", "\n", "experimental.local_alloc_elemwise () \n", " Doc: DEPRECATED: If True, enable the experimental optimization local_alloc_elemwise. Generates error if not True. Use optimizer_excluding=local_alloc_elemwise to dsiable.\n", " Value: True\n", "\n", "experimental.local_alloc_elemwise_assert () \n", " Doc: When the local_alloc_elemwise is applied, add an assert to highlight shape errors.\n", " Value: True\n", "\n", "blas.ldflags () \n", " Doc: lib[s] to include for [Fortran] level-3 blas implementation\n", " Value: -lblas\n", "\n", "warn.identify_1pexp_bug () \n", " Doc: Warn if Theano versions prior to 7987b51 (2011-12-18) could have yielded a wrong result due to a bug in the is_1pexp function\n", " Value: False\n", "\n", "scan.allow_gc () \n", " Doc: Allow/disallow gc inside of Scan (default: False)\n", " Value: False\n", "\n", "scan.allow_output_prealloc () \n", " Doc: Allow/disallow memory preallocation for outputs inside of scan (default: True)\n", " Value: True\n", "\n", "pycuda.init () \n", " Doc: If True, always initialize PyCUDA when Theano want to\n", " initilize the GPU. Currently, we must always initialize\n", " PyCUDA before Theano do it. Setting this flag to True,\n", " ensure that, but always import PyCUDA. 
It can be done\n", " manually by importing theano.misc.pycuda_init before theano\n", " initialize the GPU device.\n", " \n", " Value: False\n", "\n", "cublas.lib () \n", " Doc: Name of the cuda blas library for the linker.\n", " Value: cublas\n", "\n", "lib.cnmem () \n", " Doc: Do we enable CNMeM or not (a faster CUDA memory allocator).\n", "\n", " The parameter represent the start size (in MB or % of\n", " total GPU memory) of the memory pool.\n", "\n", " 0: not enabled.\n", " 0 < N <= 1: % of the total GPU memory (clipped to .985 for driver memory)\n", " > 0: use that number of MB of memory.\n", "\n", " \n", " Value: 0.0\n", "\n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Using gpu device 1: Tesla K10.G2.8GB (CNMeM is disabled)\n" ] } ], "source": [ "import theano\n", "import theano.tensor as T\n", "\n", "print theano.config" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These settings control how Theano runs. Many of them are read-only, so **we should avoid modifying them directly in program code wherever possible**.\n", "\n", "Most settings have a default value. We can override them in the `.theanorc` file or through the `THEANO_FLAGS` environment variable. Settings are resolved in the following order of precedence, highest first:\n", "\n", "- first, direct assignments to `theano.config.` attributes\n", "- then, the contents of the `THEANO_FLAGS` environment variable\n", "- finally, the contents of the `.theanorc` file, or of the file named by the `THEANORC` environment variable\n", "\n", "The meaning of each setting is documented at:\n", "\n", "http://deeplearning.net/software/theano/library/config.html" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The THEANO_FLAGS environment variable" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To run a program with settings taken from the `THEANO_FLAGS` environment variable:\n", "\n", "    THEANO_FLAGS='floatX=float32,device=gpu0,nvcc.fastmath=True' python .py\n", "\n", "On Windows the command needs a small change. `cmd.exe` does not strip single quotes, so quote the whole assignment instead:\n", "\n", "    set \"THEANO_FLAGS=floatX=float32,device=gpu0,nvcc.fastmath=True\" && python .py\n", "\n", "This example sets the default floating-point precision to 32 bits, runs computations on GPU 0, and compiles CUDA code with `nvcc`'s `fastmath` mode enabled." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The THEANORC configuration file" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `THEANORC` environment variable defaults to `$HOME/.theanorc` (on Windows, `$HOME/.theanorc:$HOME/.theanorc.txt`).\n", "\n", "A configuration file equivalent to the `THEANO_FLAGS` example above is:\n", "\n", "    [global]\n", "    floatX = float32\n", "    device = gpu0\n", "\n", "    [nvcc]\n", "    fastmath = True\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here the `[global]` section holds the top-level settings of `config`, such as `config.device` and `config.mode`. Settings that live in a sub-namespace of `config`, such as `config.nvcc.fastmath` and `config.blas.ldflags`, are set in the matching section, here `[nvcc]` and `[blas]`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Modes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each time `theano.function` is called, the symbolic graph connecting its variables is optimized and compiled. How this is done is controlled by the compilation mode, which defaults to `config.mode`.\n", "\n", "Theano defines the following four modes:\n", "\n", "- `FAST_COMPILE`\n", "    - `compile.mode.Mode(linker='py', optimizer='fast_compile')`\n", "    - Python implementation; fast to compile, slow to run\n", "- `FAST_RUN`\n", "    - `compile.mode.Mode(linker='cvm', optimizer='fast_run')`\n", "    - C implementation; slower to compile, fast to run\n", "- `DebugMode`\n", "    - `compile.debugmode.DebugMode()`\n", "    - debugging mode; can run both the Python and the C implementations\n", "- `ProfileMode`\n", "    - `compile.profilemode.ProfileMode()`\n", "    - C implementation; deprecated, use the `profile` configuration flag (`config.profile`) instead\n", "\n", "For more details, see:\n", "\n", "http://deeplearning.net/software/theano/library/compile/mode.html#libdoc-compile-mode" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Linkers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As the definitions above show, a mode is made up of two parts: an `optimizer` and a `linker`. `ProfileMode` and `DebugMode` use their own built-in `linker`.\n", "\n", "The available linkers are listed in the table at:\n", "\n", "http://deeplearning.net/software/theano/tutorial/modes.html#linkers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using DebugMode" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before running code in `FAST_RUN` or `FAST_COMPILE` mode, it is usually a good idea to debug it in `DebugMode` first, although `DebugMode` is much slower than either of them.\n", "\n", "The following example shows the difference:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 2.]\n", "[ inf]\n", "[ 1.42857143]\n" ] } ], "source": 
[ "x = T.dvector('x')\n", "\n", "f_1 = theano.function([x], 10 / x)\n", "\n", "print f_1([5])\n", "print f_1([0])\n", "print f_1([7])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Outside `DebugMode`, dividing by zero is allowed and silently yields `inf`. Under `DebugMode`, the non-finite result raises an `InvalidValueError`, which helps us track the problem down:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 2.]\n" ] }, { "ename": "InvalidValueError", "evalue": "InvalidValueError\n type(variable) = TensorType(float64, vector)\n variable = Elemwise{true_div,no_inplace}.0\n type(value) = \n dtype(value) = float64\n shape(value) = (1,)\n value = [ inf]\n min(value) = inf\n max(value) = inf\n isfinite = False\n client_node = None\n hint = perform output\n specific_hint = non-finite elements not allowed\n context = ...\n Elemwise{true_div,no_inplace} [id A] '' \n |TensorConstant{(1,) of 10.0} [id B]\n |x [id C]\n\n ", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mInvalidValueError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 2\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 3\u001b[0m \u001b[1;32mprint\u001b[0m \u001b[0mf_2\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m5\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 4\u001b[1;33m \u001b[1;32mprint\u001b[0m \u001b[0mf_2\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 5\u001b[0m \u001b[1;32mprint\u001b[0m \u001b[0mf_2\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m7\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", 
"\u001b[1;32m/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.pyc\u001b[0m in \u001b[0;36m__call__\u001b[1;34m(self, *args, **kwargs)\u001b[0m\n\u001b[0;32m 857\u001b[0m \u001b[0mt0_fn\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mtime\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mtime\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 858\u001b[0m \u001b[1;32mtry\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 859\u001b[1;33m \u001b[0moutputs\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfn\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 860\u001b[0m \u001b[1;32mexcept\u001b[0m \u001b[0mException\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 861\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mhasattr\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfn\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'position_of_error'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m/usr/local/lib/python2.7/dist-packages/theano/compile/debugmode.pyc\u001b[0m in \u001b[0;36mdeco\u001b[1;34m()\u001b[0m\n\u001b[0;32m 2339\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mmaker\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mmode\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mcheck_isfinite\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 2340\u001b[0m \u001b[1;32mtry\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 2341\u001b[1;33m \u001b[1;32mreturn\u001b[0m \u001b[0mf\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 2342\u001b[0m \u001b[1;32mfinally\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 2343\u001b[0m \u001b[1;31m# put back the 
filter_checks_isfinite\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m/usr/local/lib/python2.7/dist-packages/theano/compile/debugmode.pyc\u001b[0m in \u001b[0;36mf\u001b[1;34m()\u001b[0m\n\u001b[0;32m 2079\u001b[0m raise InvalidValueError(r, storage_map[r][0],\n\u001b[0;32m 2080\u001b[0m \u001b[0mhint\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;34m'perform output'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 2081\u001b[1;33m specific_hint=hint2)\n\u001b[0m\u001b[0;32m 2082\u001b[0m \u001b[0mwarn_inp\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mconfig\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mDebugMode\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mwarn_input_not_reused\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 2083\u001b[0m py_inplace_outs = _check_inputs(\n", "\u001b[1;31mInvalidValueError\u001b[0m: InvalidValueError\n type(variable) = TensorType(float64, vector)\n variable = Elemwise{true_div,no_inplace}.0\n type(value) = \n dtype(value) = float64\n shape(value) = (1,)\n value = [ inf]\n min(value) = inf\n max(value) = inf\n isfinite = False\n client_node = None\n hint = perform output\n specific_hint = non-finite elements not allowed\n context = ...\n Elemwise{true_div,no_inplace} [id A] '' \n |TensorConstant{(1,) of 10.0} [id B]\n |x [id C]\n\n " ] } ], "source": [ "f_2 = theano.function([x], 10 / x, mode='DebugMode')\n", "\n", "print f_2([5])\n", "print f_2([0])\n", "print f_2([7])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For more details, see:\n", "\n", "http://deeplearning.net/software/theano/library/compile/debugmode.html#debugmode" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.6" } }, "nbformat": 4, 
"nbformat_minor": 0 }