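Preface
I recently took a few open courses on deep learning but still wanted more, so I decided to try out a framework for real. Which one, though: Caffe, TensorFlow, Torch, or something else? After comparing them I chose TensorFlow. First, it is a more general-purpose framework with the best Python support; second, it is backed by Google and is open source. I expect it to become popular in both academia and industry.
Installation: live log
First I had to set up dual boot on my machine (Windows 10). I did not use a virtual machine because VMs make poor use of resources such as the graphics card. I went with Ubuntu (version 14.10); I won't describe that install, since guides are all over the web. Next comes TensorFlow itself. The official site offers five installation methods: pip, Docker, Anaconda, Virtualenv, and building from source. Anaconda bundles a large number of scientific computing libraries that should be useful later on, so I chose the Anaconda-based install.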
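1. First, download the Anaconda release that matches your system from the download page.
2. Go to the download directory and run bash Anaconda2-4.1.1-Linux-x86_64.sh
Then follow the prompts, which ask for the install directory and so on. You can see that it automatically installs python 2.7.12, beautifulsoup, ipython, and more: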
installing: python-2.7.12-1 ...
installing: _nb_ext_conf-0.2.0-py27_0 ...
installing: alabaster-0.7.8-py27_0 ...
installing: anaconda-client-1.4.0-py27_0 ...
installing: anaconda-navigator-1.2.1-py27_0 ...
installing: argcomplete-1.0.0-py27_1 ...
installing: astropy-1.2.1-np111py27_0 ...
installing: babel-2.3.3-py27_0 ...
installing: backports-1.0-py27_0 ...
installing: backports_abc-0.4-py27_0 ...
installing: beautifulsoup4-4.4.1-py27_0 ...
Note the final prompt:
Do you wish the installer to prepend the Anaconda2 install location to PATH in your /root/.bashrc ? [yes|no]
Answer yes so that Anaconda is added to the PATH environment variable and can be used right away; otherwise you have to configure PATH by hand later. Because the environment variable changed, open a new terminal to test the install. Typing python in the new terminal shows:
Python 2.7.12 |Anaconda 4.1.1 (64-bit)| (default, Jul 2 2016, 17:42:40)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
So the install did succeed.
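As a cross-check of the PATH change, a small sketch like the following reports which python a new shell would pick up and whether it lives under the Anaconda prefix. It is written for Python 3 (shutil.which does not exist in Python 2.7), the helper names first_on_path and is_under are made up for illustration, and /root/anaconda2 is simply the prefix that appears in the tracebacks later in this post.

```python
import os
import shutil
import sys


def first_on_path(program, path=None):
    """Return the full path of the first `program` found on PATH, or None."""
    return shutil.which(program, path=path)


def is_under(prefix, candidate):
    """True if `candidate` lives inside the directory `prefix`."""
    prefix = os.path.realpath(prefix)
    candidate = os.path.realpath(candidate)
    return os.path.commonpath([prefix, candidate]) == prefix


if __name__ == "__main__":
    found = first_on_path("python")
    print("python on PATH:     ", found)
    print("running interpreter:", sys.executable)
    if found is not None:
        # Assumed install prefix; adjust to wherever Anaconda was installed.
        print("inside /root/anaconda2:", is_under("/root/anaconda2", found))
```

If the first line does not point into the Anaconda directory, the PATH line in .bashrc is probably missing or the terminal was opened before the install finished.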
3. Run conda create -n tensorflow python=2.7 to create a conda environment.
4. Run source activate tensorflow to activate the environment.
5. Run pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl to install the GPU-enabled TensorFlow.
Note that GPU support requires installing the CUDA Toolkit and cuDNN first (register on the official site first).
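The wheel filename in step 5 already says which interpreter and platform the binary targets. The sketch below splits it into its standard fields (distribution, version, Python tag, ABI tag, platform tag); parse_wheel_filename is a hypothetical helper, and it deliberately ignores the optional build tag that some wheels carry. Note what the name does not tell you: the CUDA and cuDNN versions the binary was built against.

```python
from collections import namedtuple

WheelName = namedtuple(
    "WheelName", "distribution version python_tag abi_tag platform_tag"
)


def parse_wheel_filename(filename):
    """Split a wheel filename into its five dash-separated fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("unexpected number of fields in %r" % filename)
    return WheelName(*parts)


url = ("https://storage.googleapis.com/tensorflow/linux/gpu/"
       "tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl")
info = parse_wheel_filename(url.rsplit("/", 1)[-1])
print(info)
```

Here cp27 means CPython 2.7, which matches the python=2.7 environment created in step 3.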
6. After the install succeeds, open python and run
import tensorflow as tf
which produced a pile of errors:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/__init__.py", line 23, in <module>
    from tensorflow.python import *
  File "/root/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/__init__.py", line 45, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/root/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 28, in <module>
    _pywrap_tensorflow = swig_import_helper()
  File "/root/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 24, in swig_import_helper
    _mod = imp.load_module("_pywrap_tensorflow", fp, pathname, description)
ImportError: libcudart.so.7.5: cannot open shared object file: No such file or directory
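It looks like the cause is that I had not installed CUDA yet. Downloading the CUDA Toolkit for step 5 was too slow (about 10 hours), so I switched to the network install: download the repository package first, then run
dpkg -i cuda-repo-ubuntu1410_7.0-28_amd64.deb
apt-get update
apt-get install cuda
This takes only about 20 minutes. Afterwards CUDA should be under /usr/local/. Next install cuDNN; in its download directory run:
tar xvzf cudnn-7.0-linux-x64-v3.0-prod.tgz
cp cuda/include/cudnn.h /usr/local/cuda/include
cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
Then set the LD_LIBRARY_PATH and CUDA_HOME environment variables. You can add the lines below to ~/.bashrc so that they take effect on every login:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
export CUDA_HOME=/usr/local/cuda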
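Before importing TensorFlow again, it is possible to ask the dynamic loader directly whether it can find the library named in the ImportError. This is only a sketch: can_load is a made-up helper around ctypes.CDLL, and it has to run in a freshly opened shell, because the loader reads LD_LIBRARY_PATH when the process starts.

```python
import ctypes


def can_load(library_name):
    """Try to load `library_name` with the system loader; return (ok, error_message)."""
    try:
        ctypes.CDLL(library_name)
    except OSError as exc:
        return False, str(exc)
    return True, ""


if __name__ == "__main__":
    # libcudart.so.7.5 is the name from the ImportError above.
    for name in ("libcudart.so.7.5", "libcudart.so.7.0"):
        ok, err = can_load(name)
        print(name, "OK" if ok else "FAILED: " + err)
```

A failure for libcudart.so.7.5 here means the TensorFlow import will fail in the same way, without having to start TensorFlow at all.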
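7. Test
The test still failed with the same error: libcudart.so.7.5 was not found. I searched the disk with locate libcudart.so.7.5 and indeed found nothing, so my CUDA version was probably too old. After cd /usr/local/cuda/lib64 I found libcudart.so.7.0.28 rather than libcudart.so.7.5.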
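The same comparison can be scripted. The sketch below lists the versioned libcudart files in a directory and checks whether any of them matches a wanted (major, minor) pair; cudart_versions and satisfies are illustrative names, and /usr/local/cuda/lib64 is the directory used in this post.

```python
import os
import re

_VERSIONED = re.compile(r"^libcudart\.so\.(\d+(?:\.\d+)*)$")


def cudart_versions(directory):
    """Return sorted version tuples of the libcudart.so.* files in `directory`."""
    versions = set()
    for entry in os.listdir(directory):
        match = _VERSIONED.match(entry)
        if match:
            versions.add(tuple(int(p) for p in match.group(1).split(".")))
    return sorted(versions)


def satisfies(found, wanted):
    """True if any found version starts with the wanted (major, minor) prefix."""
    return any(v[:len(wanted)] == wanted for v in found)


if __name__ == "__main__":
    lib_dir = "/usr/local/cuda/lib64"
    if os.path.isdir(lib_dir):
        found = cudart_versions(lib_dir)
        print("found:", found)
        print("has 7.5:", satisfies(found, (7, 5)))
    else:
        print(lib_dir, "does not exist")
```

With only libcudart.so.7.0.28 present, the 7.5 check comes out false, which is the mismatch described above.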
8. Reinstall the CUDA Toolkit
apt-get remove cuda
apt-get autoremove
# download http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_7.5-18_amd64.deb
apt-get remove cuda-repo-ubuntu1410
dpkg -i cuda-repo-ubuntu1404_7.5-18_amd64.deb
# the previous remove is required; otherwise dpkg complains that it is trying to overwrite /etc/apt/sources.list.d/cuda.list, which is also in package cuda-repo-ubuntu1410 7.0-28
apt-get update
sudo apt-get install cuda
# error: cuda : Depends: cuda-7-5 (= 7.5-18) but it is not going to be installed
# E: Unable to correct problems, you have held broken packages.
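The two repository packages used so far differ in more than the CUDA version: their names also carry different Ubuntu release tags. The sketch below pulls both pieces out of the filename, assuming the usual name_version_arch.deb layout; parse_repo_deb is a hypothetical helper. Whether the release mismatch is what caused the dependency error is not established in this post, so treat the output only as something to look at before installing.

```python
import re


def parse_repo_deb(filename):
    """Split '<name>_<version>_<arch>.deb' and extract the Ubuntu release tag."""
    if not filename.endswith(".deb"):
        raise ValueError("not a .deb: %r" % filename)
    name, version, arch = filename[:-len(".deb")].split("_")
    match = re.search(r"ubuntu(\d{2})(\d{2})", name)
    release = "%s.%s" % match.groups() if match else None
    return {"name": name, "version": version, "arch": arch, "ubuntu": release}


for deb in ("cuda-repo-ubuntu1410_7.0-28_amd64.deb",
            "cuda-repo-ubuntu1404_7.5-18_amd64.deb"):
    print(parse_repo_deb(deb))
```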
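That got too messy, so I started over:
1. Same as above.
2. Same as above.
3. conda create -n tensor python=2.7
4. source activate tensor
5. Install the CUDA Toolkit: download it first, then in the download directory run:
dpkg -i cuda-repo-ubuntu1404_7.5-18_amd64.deb
apt-get update
apt-get install cuda
# error: cuda : Depends: cuda-7-5 (= 7.5-18) but it is not going to be installed
# E: Unable to correct problems, you have held broken packages.
# unbelievable
Installing the wrong version is a real pain. Time to clean up the system:
apt-get --purge remove nvidia-*   # remove nvidia completely
rm -rf anaconda2
# delete the line in .bashrc that adds anaconda to PATH
# still no luck, same error: cuda : Depends: cuda-7-5 (= 7.5-18) but it is not going to be installed
# E: Unable to correct problems, you have held broken packages.
I could not get this to work, so I decided to try the local installer instead and downloaded CUDA and cuDNN. Oddly, the download was very slow on Ubuntu but much faster on Windows, so I downloaded the files on Windows and copied them over to Ubuntu.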
Installation: bug-free version
1. Because the package dependency problem could not be resolved, I reinstalled the system with Ubuntu 14.04.5.
2. Download CUDA and cuDNN, then go to the download directory:
dpkg -i cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb
sudo apt-get update
sudo apt-get install cuda
# wait a while, then set up cuDNN
tar xvzf cudnn-7.5-linux-x64-v5.0-ga.tgz
cp cuda/include/cudnn.h /usr/local/cuda/include
cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
3. Edit .bashrc and add:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda
4. Download Anaconda and go to the download directory:
bash Anaconda2-4.1.1-Linux-x86_64.sh
Adjust the settings as prompted, for example the install directory, to your liking.
5. Open a new terminal:
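conda create -n tfgpu python=2.7
source activate tfgpu
pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.10.0rc0-cp27-none-linux_x86_64.whl
6. After installing I rebooted and got a black screen. This is probably a dual-GPU issue. I set it aside for the moment and switched to a tty to check whether TensorFlow was installed correctly.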
Ctrl+Alt+F2   # switch to tty2 and log in
root@mageek-ThinkPad-T550:~# source activate tfgpu
(tfgpu) root@mageek-ThinkPad-T550:~# python
Python 2.7.12 |Continuum Analytics, Inc.| (default, Jul 2 2016, 17:42:40)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
>>> sess = tf.Session()
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce 940M
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:08:00.0
Total memory: 1023.88MiB
Free memory: 997.54MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:839] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 940M, pci bus id: 0000:08:00.0)
>>>
(tfgpu) root@mageek-ThinkPad-T550:~# source deactivate
So the install succeeded.
7. Fix the black screen
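vim /etc/modprobe.d/blacklist.conf
# add the following lines to blacklist some modules
blacklist amd76x_edac
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
# save and quit
sudo prime-select intel   # prefer the Intel integrated GPU
reboot   # after the reboot the graphical interface comes up
8. IPython
At this point running ipython directly does start a session, but import tensorflow fails inside it. Run conda install ipython first and then start ipython again, and the import works. Only that command installs ipython into the tfgpu environment, and ipython can find tensorflow only when both live in the same environment.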
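To see which environment a given prompt is really running in, a few lines pasted into both python and ipython are enough. This is a sketch for Python 3 (importlib.util.find_spec is not available in Python 2.7), and describe_environment is a made-up name.

```python
import importlib.util
import sys


def describe_environment(module_name):
    """Report the running interpreter and where `module_name` would be imported from."""
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "module": module_name,
        "found_at": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment("tensorflow").items():
        print("%-10s %s" % (key, value))
```

If the executable shown inside ipython is not under the tfgpu environment directory, that ipython comes from a different install, which would explain a failing import.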
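9. IDE
IPython is much nicer than the stock python prompt, but retyping the same commands every time, such as import tensorflow as tf, is still tedious, so an IDE is in order. I recommend Komodo Edit. Download and unpack it, go into the directory, and run ./install.sh, then change the install directory when prompted (make sure you have permission to write there). Mine is /usr/local/Komodo-Edit-10/. Add it to PATH, open a new terminal, and the komodo command launches the IDE. Set a few basic options such as indentation and color scheme, and it is ready to use.
Create a new file tf1.py: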
import tensorflow as tf
import numpy as np

# Create 100 phony x, y data points in NumPy, y = x * 0.1 + 0.3
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data * 0.1 + 0.3

# Try to find values for W and b that compute y_data = W * x_data + b
# (We know that W should be 0.1 and b 0.3, but TensorFlow will
# figure that out for us.)
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

# Minimize the mean squared errors.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

# Before starting, initialize the variables. We will "run" this first.
init = tf.initialize_all_variables()

# Launch the graph.
sess = tf.Session()
sess.run(init)

# Fit the line.
for step in range(201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))

# Learns best fit is W: [0.1], b: [0.3]
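For readers who want to see what the optimizer in tf1.py is doing, here is a dependency-free sketch of the same fit: plain gradient descent on the mean squared error with learning rate 0.5 and 201 steps. It is not TensorFlow's implementation; fit_line and the fixed seeds are my own choices for illustration.

```python
import random


def fit_line(xs, ys, learning_rate=0.5, steps=201, seed=0):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    rng = random.Random(seed)
    w = rng.uniform(-1.0, 1.0)  # same initial range as tf.random_uniform([1], -1.0, 1.0)
    b = 0.0                     # same start as tf.zeros([1])
    n = float(len(xs))
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of mean((w*x + b - y)^2) with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


data_rng = random.Random(42)
x_data = [data_rng.random() for _ in range(100)]
y_data = [0.1 * x + 0.3 for x in x_data]
w, b = fit_line(x_data, y_data)
print(w, b)
```

Like the TensorFlow script, this should end up close to w = 0.1 and b = 0.3.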
Run tf1.py:
# go to the file's directory
source activate tfgpu
python tf1.py
Result:
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce 940M
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:08:00.0
Total memory: 1023.88MiB
Free memory: 997.54MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:839] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 940M, pci bus id: 0000:08:00.0)
(0, array([-0.09839484], dtype=float32), array([ 0.5272761], dtype=float32))
(20, array([ 0.02831561], dtype=float32), array([ 0.33592272], dtype=float32))
(40, array([ 0.07941294], dtype=float32), array([ 0.31031665], dtype=float32))
(60, array([ 0.09408762], dtype=float32), array([ 0.30296284], dtype=float32))
(80, array([ 0.09830203], dtype=float32), array([ 0.3008509], dtype=float32))
(100, array([ 0.09951238], dtype=float32), array([ 0.30024436], dtype=float32))
(120, array([ 0.09985995], dtype=float32), array([ 0.3000702], dtype=float32))
(140, array([ 0.09995978], dtype=float32), array([ 0.30002016], dtype=float32))
(160, array([ 0.09998845], dtype=float32), array([ 0.30000579], dtype=float32))
(180, array([ 0.09999669], dtype=float32), array([ 0.30000168], dtype=float32))
(200, array([ 0.09999905], dtype=float32), array([ 0.30000049], dtype=float32))
10. NN
# locate the tensorflow directory
python -c "import os; import inspect; import tensorflow; print(os.path.dirname(inspect.getfile(tensorflow)))"
# /root/anaconda2/envs/tfgpu/lib/python2.7/site-packages/tensorflow
cd /root/anaconda2/envs/tfgpu/lib/python2.7/site-packages/tensorflow/models/image/mnist/   # go to the directory
python convolutional.py
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
Extracting data/train-images-idx3-ubyte.gz
Traceback (most recent call last):
  File "convolutional.py", line 326, in <module>
    tf.app.run()
  File "/root/anaconda2/envs/tfgpu/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "convolutional.py", line 138, in main
    train_data = extract_data(train_data_filename, 60000)
  File "convolutional.py", line 85, in extract_data
    buf = bytestream.read(IMAGE_SIZE * IMAGE_SIZE * num_images * NUM_CHANNELS)
  File "/root/anaconda2/envs/tfgpu/lib/python2.7/gzip.py", line 268, in read
    self._read(readsize)
  File "/root/anaconda2/envs/tfgpu/lib/python2.7/gzip.py", line 315, in _read
    self._read_eof()
  File "/root/anaconda2/envs/tfgpu/lib/python2.7/gzip.py", line 354, in _read_eof
    hex(self.crc)))
IOError: CRC check failed 0x4b01c89e != 0xd2b9b600L
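So the CRC check failed. It is easier to download the files directly from the official site and copy them into the data directory; reading convolutional.py tells you the download URL. In fact, comparing the files the program had already downloaded into data with the ones on the official site shows that the program's downloads were damaged: they were noticeably smaller, probably because of packet loss.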
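A damaged download can be detected before starting training by decompressing each archive once, since gzip verifies its CRC at the end of the stream. The sketch below does that with the standard gzip module under Python 3; gzip_is_intact is an illustrative name, not something from convolutional.py.

```python
import gzip
import sys
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Decompress `path` fully; return False if it is truncated or fails its CRC."""
    try:
        with gzip.open(path, "rb") as stream:
            while stream.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True


if __name__ == "__main__":
    # Pass the archives to check as arguments, e.g. the four files under data/.
    for path in sys.argv[1:]:
        print(path, "ok" if gzip_is_intact(path) else "CORRUPT")
```

Any file reported as corrupt is a candidate for downloading again by hand.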
Run it again:
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce 940M
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:08:00.0
Total memory: 1023.88MiB
Free memory: 997.54MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:839] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 940M, pci bus id: 0000:08:00.0)
Initialized!
E tensorflow/stream_executor/cuda/cuda_dnn.cc:347] Loaded runtime CuDNN library: 5005 (compatibility version 5000) but source was compiled with 4007 (compatibility version 4000). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
F tensorflow/core/kernels/conv_ops.cc:457] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms)
Aborted (core dumped)
In other words, I installed cuDNN v5, but this TensorFlow binary was compiled against v4. So I downloaded v4 and reconfigured cuDNN v4 following step 2:
# this overwrites cuDNN v5, so back v5 up in case it is needed later; I renamed the previously extracted cuda directory to cudnn5005
cd /usr/local/cuda/lib64
rm -f libcudnn*   # remove cuDNN v5
# go to the cuDNN v4 download directory first
tar xvzf cudnn-7.0-linux-x64-v4.0-prod.tgz
cp cuda/include/cudnn.h /usr/local/cuda/include   # overwrite v5 with v4
cp cuda/lib64/libcudnn* /usr/local/cuda/lib64   # add v4
chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
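The cuDNN error line from the failed run carries both version numbers, so the mismatch can be read off mechanically. The sketch below parses that line; check_cudnn_log and compat_version are hypothetical helpers, and the rule "round down to the nearest thousand" is an assumption that merely reproduces the two pairs printed in the log (5005 to 5000, 4007 to 4000).

```python
import re

_PATTERN = re.compile(
    r"Loaded runtime CuDNN library: (\d+) .*? source was compiled with (\d+)"
)


def compat_version(version):
    """Round down to the nearest thousand, e.g. 5005 -> 5000 (assumed rule)."""
    return version // 1000 * 1000


def check_cudnn_log(line):
    """Return (loaded, compiled, compatible) parsed from the cuDNN mismatch line."""
    match = _PATTERN.search(line)
    if match is None:
        raise ValueError("line does not look like the cuDNN mismatch message")
    loaded, compiled = int(match.group(1)), int(match.group(2))
    return loaded, compiled, compat_version(loaded) == compat_version(compiled)


line = ("E tensorflow/stream_executor/cuda/cuda_dnn.cc:347] Loaded runtime CuDNN "
        "library: 5005 (compatibility version 5000) but source was compiled with "
        "4007 (compatibility version 4000).")
print(check_cudnn_log(line))
```

The first number is the library found at runtime and the second is what the TensorFlow binary expects, which is why cuDNN v4 is the one to install here.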
Run it again:
cd /root/anaconda2/envs/tfgpu/lib/python2.7/site-packages/tensorflow/models/image/mnist/   # go to the directory
python convolutional.py
Result:
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
E tensorflow/stream_executor/cuda/cuda_driver.cc:491] failed call to cuInit: CUDA_ERROR_NO_DEVICE
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:153] retrieving CUDA diagnostic information for host: mageek-ThinkPad-T550
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:160] hostname: mageek-ThinkPad-T550
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:185] libcuda reported version is: 352.63.0
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:356] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 352.63 Sat Nov 7 21:25:42 PST 2015
GCC version: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3)
"""
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] kernel reported version is: 352.63.0
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:293] kernel version seems to match DSO: 352.63.0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:81] No GPU devices available on machine.
Initialized!
Step 0 (epoch 0.00), 5.4 ms Minibatch loss: 12.054, learning rate: 0.010000 Minibatch error: 90.6% Validation error: 84.6%
Step 100 (epoch 0.12), 280.2 ms Minibatch loss: 3.287, learning rate: 0.010000 Minibatch error: 6.2% Validation error: 7.0%
Step 200 (epoch 0.23), 281.0 ms Minibatch loss: 3.491, learning rate: 0.010000 Minibatch error: 12.5% Validation error: 3.6%
Step 300 (epoch 0.35), 281.0 ms Minibatch loss: 3.265, learning rate: 0.010000 Minibatch error: 10.9% Validation error: 3.2%
Step 400 (epoch 0.47), 293.0 ms Minibatch loss: 3.221, learning rate: 0.010000 Minibatch error: 7.8% Validation error: 2.7%
Step 500 (epoch 0.58), 289.0 ms Minibatch loss: 3.292, learning rate: 0.010000 Minibatch error: 7.8% Validation error: 2.7%
Step 600 (epoch 0.70), 287.4 ms Minibatch loss: 3.227, learning rate: 0.010000 Minibatch error: 7.8% Validation error: 2.6%
Step 700 (epoch 0.81), 287.0 ms Minibatch loss: 3.015, learning rate: 0.010000 Minibatch error: 3.1% Validation error: 2.4%
Step 800 (epoch 0.93), 287.0 ms Minibatch loss: 3.152, learning rate: 0.010000 Minibatch error: 6.2% Validation error: 2.0%
Step 900 (epoch 1.05), 287.7 ms Minibatch loss: 2.938, learning rate: 0.009500 Minibatch error: 3.1% Validation error: 1.6%
Step 1000 (epoch 1.16), 287.4 ms Minibatch loss: 2.862, learning rate: 0.009500 Minibatch error: 1.6% Validation error: 1.7%
. . .
The program runs, but it did not find the GPU, so I rebooted and tried again:
reboot
# .....
source activate tfgpu
cd /root/anaconda2/envs/tfgpu/lib/python2.7/site-packages/tensorflow/models/image/mnist/   # go to the directory
python convolutional.py
Result:
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce 940M
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:08:00.0
Total memory: 1023.88MiB
Free memory: 997.54MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:839] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 940M, pci bus id: 0000:08:00.0)
Initialized!
Step 0 (epoch 0.00), 81.3 ms Minibatch loss: 12.054, learning rate: 0.010000 Minibatch error: 90.6% Validation error: 84.6%
Step 100 (epoch 0.12), 44.4 ms Minibatch loss: 3.291, learning rate: 0.010000 Minibatch error: 6.2% Validation error: 7.1%
Step 200 (epoch 0.23), 44.4 ms Minibatch loss: 3.462, learning rate: 0.010000 Minibatch error: 12.5% Validation error: 3.6%
Step 300 (epoch 0.35), 44.0 ms Minibatch loss: 3.188, learning rate: 0.010000 Minibatch error: 4.7% Validation error: 3.2%
Step 400 (epoch 0.47), 44.3 ms Minibatch loss: 3.253, learning rate: 0.010000 Minibatch error: 9.4% Validation error: 2.8%
Step 500 (epoch 0.58), 44.3 ms Minibatch loss: 3.288, learning rate: 0.010000 Minibatch error: 9.4% Validation error: 2.5%
Step 600 (epoch 0.70), 43.9 ms Minibatch loss: 3.180, learning rate: 0.010000 Minibatch error: 6.2% Validation error: 2.8%
Step 700 (epoch 0.81), 44.2 ms Minibatch loss: 3.033, learning rate: 0.010000 Minibatch error: 3.1% Validation error: 2.4%
Step 800 (epoch 0.93), 44.0 ms Minibatch loss: 3.149, learning rate: 0.010000 Minibatch error: 6.2% Validation error: 2.0%
Step 900 (epoch 1.05), 44.0 ms Minibatch loss: 2.919, learning rate: 0.009500 Minibatch error: 3.1% Validation error: 1.6%
Step 1000 (epoch 1.16), 43.8 ms Minibatch loss: 2.849, learning rate: 0.009500 Minibatch error: 0.0% Validation error: 1.7%
Step 1100 (epoch 1.28), 43.6 ms Minibatch loss: 2.822, learning rate: 0.009500 Minibatch error: 0.0% Validation error: 1.6%
Step 1200 (epoch 1.40), 43.6 ms Minibatch loss: 2.979, learning rate: 0.009500 Minibatch error: 7.8% Validation error: 1.5%
Step 1300 (epoch 1.51), 43.6 ms Minibatch loss: 2.763, learning rate: 0.009500 Minibatch error: 0.0% Validation error: 1.9%
Step 1400 (epoch 1.63), 43.6 ms Minibatch loss: 2.781, learning rate: 0.009500 Minibatch error: 3.1% Validation error: 1.5%
Step 1500 (epoch 1.75), 43.6 ms Minibatch loss: 2.861, learning rate: 0.009500 Minibatch error: 6.2% Validation error: 1.4%
Step 1600 (epoch 1.86), 43.8 ms Minibatch loss: 2.698, learning rate: 0.009500 Minibatch error: 1.6% Validation error: 1.3%
Step 1700 (epoch 1.98), 43.9 ms Minibatch loss: 2.650, learning rate: 0.009500 Minibatch error: 0.0% Validation error: 1.3%
Step 1800 (epoch 2.09), 44.1 ms Minibatch loss: 2.652, learning rate: 0.009025 Minibatch error: 1.6% Validation error: 1.3%
Step 1900 (epoch 2.21), 44.1 ms Minibatch loss: 2.655, learning rate: 0.009025 Minibatch error: 1.6% Validation error: 1.3%
Step 2000 (epoch 2.33), 43.9 ms Minibatch loss: 2.640, learning rate: 0.009025 Minibatch error: 3.1% Validation error: 1.2%
Step 2100 (epoch 2.44), 44.0 ms Minibatch loss: 2.568, learning rate: 0.009025 Minibatch error: 0.0% Validation error: 1.1%
Step 2200 (epoch 2.56), 44.0 ms Minibatch loss: 2.564, learning rate: 0.009025 Minibatch error: 0.0% Validation error: 1.1%
Step 2300 (epoch 2.68), 44.2 ms Minibatch loss: 2.561, learning rate: 0.009025 Minibatch error: 1.6% Validation error: 1.2%
Step 2400 (epoch 2.79), 44.2 ms Minibatch loss: 2.500, learning rate: 0.009025 Minibatch error: 0.0% Validation error: 1.3%
Step 2500 (epoch 2.91), 44.0 ms Minibatch loss: 2.471, learning rate: 0.009025 Minibatch error: 0.0% Validation error: 1.2%
Step 2600 (epoch 3.03), 43.8 ms Minibatch loss: 2.451, learning rate: 0.008574 Minibatch error: 0.0% Validation error: 1.2%
Step 2700 (epoch 3.14), 43.6 ms Minibatch loss: 2.483, learning rate: 0.008574 Minibatch error: 1.6% Validation error: 1.1%
Step 2800 (epoch 3.26), 43.7 ms Minibatch loss: 2.426, learning rate: 0.008574 Minibatch error: 1.6% Validation error: 1.1%
Step 2900 (epoch 3.37), 44.3 ms Minibatch loss: 2.449, learning rate: 0.008574 Minibatch error: 3.1% Validation error: 1.1%
Step 3000 (epoch 3.49), 43.9 ms Minibatch loss: 2.395, learning rate: 0.008574 Minibatch error: 1.6% Validation error: 1.0%
Step 3100 (epoch 3.61), 44.1 ms Minibatch loss: 2.390, learning rate: 0.008574 Minibatch error: 3.1% Validation error: 1.0%
Step 3200 (epoch 3.72), 43.6 ms Minibatch loss: 2.330, learning rate: 0.008574 Minibatch error: 0.0% Validation error: 1.1%
Step 3300 (epoch 3.84), 43.8 ms Minibatch loss: 2.319, learning rate: 0.008574 Minibatch error: 1.6% Validation error: 1.1%
Step 3400 (epoch 3.96), 44.4 ms Minibatch loss: 2.296, learning rate: 0.008574 Minibatch error: 0.0% Validation error: 1.0%
Step 3500 (epoch 4.07), 44.4 ms Minibatch loss: 2.273, learning rate: 0.008145 Minibatch error: 0.0% Validation error: 1.0%
Step 3600 (epoch 4.19), 44.2 ms Minibatch loss: 2.253, learning rate: 0.008145 Minibatch error: 0.0% Validation error: 0.9%
Step 3700 (epoch 4.31), 44.4 ms Minibatch loss: 2.237, learning rate: 0.008145 Minibatch error: 0.0% Validation error: 1.0%
Step 3800 (epoch 4.42), 43.8 ms Minibatch loss: 2.234, learning rate: 0.008145 Minibatch error: 1.6% Validation error: 0.9%
Step 3900 (epoch 4.54), 43.9 ms Minibatch loss: 2.325, learning rate: 0.008145 Minibatch error: 3.1% Validation error: 0.9%
Step 4000 (epoch 4.65), 43.6 ms Minibatch loss: 2.215, learning rate: 0.008145 Minibatch error: 0.0% Validation error: 1.1%
Step 4100 (epoch 4.77), 43.6 ms Minibatch loss: 2.209, learning rate: 0.008145 Minibatch error: 1.6% Validation error: 1.0%
Step 4200 (epoch 4.89), 43.6 ms Minibatch loss: 2.242, learning rate: 0.008145 Minibatch error: 1.6% Validation error: 1.0%
Step 4300 (epoch 5.00), 43.5 ms Minibatch loss: 2.188, learning rate: 0.007738 Minibatch error: 1.6% Validation error: 0.9%
Step 4400 (epoch 5.12), 43.5 ms Minibatch loss: 2.155, learning rate: 0.007738 Minibatch error: 3.1% Validation error: 1.0%
Step 4500 (epoch 5.24), 43.5 ms Minibatch loss: 2.164, learning rate: 0.007738 Minibatch error: 4.7% Validation error: 0.9%
Step 4600 (epoch 5.35), 43.5 ms Minibatch loss: 2.095, learning rate: 0.007738 Minibatch error: 0.0% Validation error: 0.9%
Step 4700 (epoch 5.47), 43.6 ms Minibatch loss: 2.062, learning rate: 0.007738 Minibatch error: 0.0% Validation error: 0.9%
Step 4800 (epoch 5.59), 43.6 ms Minibatch loss: 2.068, learning rate: 0.007738 Minibatch error: 1.6% Validation error: 1.0%
Step 4900 (epoch 5.70), 43.6 ms Minibatch loss: 2.062, learning rate: 0.007738 Minibatch error: 1.6% Validation error: 1.0%
Step 5000 (epoch 5.82), 43.5 ms Minibatch loss: 2.148, learning rate: 0.007738 Minibatch error: 3.1% Validation error: 1.0%
Step 5100 (epoch 5.93), 43.5 ms Minibatch loss: 2.017, learning rate: 0.007738 Minibatch error: 1.6% Validation error: 0.9%
Step 5200 (epoch 6.05), 43.5 ms Minibatch loss: 2.074, learning rate: 0.007351 Minibatch error: 3.1% Validation error: 1.0%
Step 5300 (epoch 6.17), 43.6 ms Minibatch loss: 1.983, learning rate: 0.007351 Minibatch error: 0.0% Validation error: 1.1%
Step 5400 (epoch 6.28), 43.6 ms Minibatch loss: 1.957, learning rate: 0.007351 Minibatch error: 0.0% Validation error: 0.8%
Step 5500 (epoch 6.40), 43.5 ms Minibatch loss: 1.955, learning rate: 0.007351 Minibatch error: 0.0% Validation error: 0.9%
Step 5600 (epoch 6.52), 43.5 ms Minibatch loss: 1.926, learning rate: 0.007351 Minibatch error: 0.0% Validation error: 0.8%
Step 5700 (epoch 6.63), 43.5 ms Minibatch loss: 1.914, learning rate: 0.007351 Minibatch error: 0.0% Validation error: 1.0%
Step 5800 (epoch 6.75), 43.6 ms Minibatch loss: 1.897, learning rate: 0.007351 Minibatch error: 0.0% Validation error: 0.9%
Step 5900 (epoch 6.87), 43.5 ms Minibatch loss: 1.887, learning rate: 0.007351 Minibatch error: 0.0% Validation error: 0.8%
Step 6000 (epoch 6.98), 43.6 ms Minibatch loss: 1.878, learning rate: 0.007351 Minibatch error: 0.0% Validation error: 1.0%
Step 6100 (epoch 7.10), 43.5 ms Minibatch loss: 1.859, learning rate: 0.006983 Minibatch error: 0.0% Validation error: 0.8%
Step 6200 (epoch 7.21), 43.6 ms Minibatch loss: 1.844, learning rate: 0.006983 Minibatch error: 0.0% Validation error: 0.8%
Step 6300 (epoch 7.33), 43.6 ms Minibatch loss: 1.850, learning rate: 0.006983 Minibatch error: 1.6% Validation error: 0.9%
Step 6400 (epoch 7.45), 43.6 ms Minibatch loss: 1.916, learning rate: 0.006983 Minibatch error: 3.1% Validation error: 0.8%
Step 6500 (epoch 7.56), 43.6 ms Minibatch loss: 1.808, learning rate: 0.006983 Minibatch error: 0.0% Validation error: 0.8%
Step 6600 (epoch 7.68), 43.5 ms Minibatch loss: 1.839, learning rate: 0.006983 Minibatch error: 1.6% Validation error: 0.9%
Step 6700 (epoch 7.80), 43.6 ms Minibatch loss: 1.781, learning rate: 0.006983 Minibatch error: 0.0% Validation error: 0.8%
Step 6800 (epoch 7.91), 43.6 ms Minibatch loss: 1.773, learning rate: 0.006983 Minibatch error: 0.0% Validation error: 0.8%
Step 6900 (epoch 8.03), 43.5 ms Minibatch loss: 1.762, learning rate: 0.006634 Minibatch error: 0.0% Validation error: 0.9%
Step 7000 (epoch 8.15), 43.5 ms Minibatch loss: 1.797, learning rate: 0.006634 Minibatch error: 1.6% Validation error: 0.9%
Step 7100 (epoch 8.26), 43.5 ms Minibatch loss: 1.741, learning rate: 0.006634 Minibatch error: 0.0% Validation error: 0.8%
Step 7200 (epoch 8.38), 43.5 ms Minibatch loss: 1.744, learning rate: 0.006634 Minibatch error: 0.0% Validation error: 0.9%
Step 7300 (epoch 8.49), 43.6 ms Minibatch loss: 1.726, learning rate: 0.006634 Minibatch error: 1.6% Validation error: 0.8%
Step 7400 (epoch 8.61), 43.5 ms Minibatch loss: 1.704, learning rate: 0.006634 Minibatch error: 0.0% Validation error: 0.8%
Step 7500 (epoch 8.73), 43.6 ms Minibatch loss: 1.695, learning rate: 0.006634 Minibatch error: 0.0% Validation error: 0.8%
Step 7600 (epoch 8.84), 43.5 ms Minibatch loss: 1.808, learning rate: 0.006634 Minibatch error: 3.1% Validation error: 0.8%
Step 7700 (epoch 8.96), 43.6 ms Minibatch loss: 1.667, learning rate: 0.006634 Minibatch error: 0.0% Validation error: 0.9%
Step 7800 (epoch 9.08), 43.5 ms Minibatch loss: 1.660, learning rate: 0.006302 Minibatch error: 0.0% Validation error: 0.9%
Step 7900 (epoch 9.19), 43.6 ms Minibatch loss: 1.649, learning rate: 0.006302 Minibatch error: 0.0% Validation error: 0.9%
Step 8000 (epoch 9.31), 43.5 ms Minibatch loss: 1.666, learning rate: 0.006302 Minibatch error: 0.0% Validation error: 0.8%
Step 8100 (epoch 9.43), 43.6 ms Minibatch loss: 1.626, learning rate: 0.006302 Minibatch error: 0.0% Validation error: 0.8%
Step 8200 (epoch 9.54), 43.5 ms Minibatch loss: 1.633, learning rate: 0.006302 Minibatch error: 1.6% Validation error: 0.8%
Step 8300 (epoch 9.66), 43.6 ms Minibatch loss: 1.616, learning rate: 0.006302 Minibatch error: 0.0% Validation error: 0.8%
Step 8400 (epoch 9.77), 43.6 ms Minibatch loss: 1.597, learning rate: 0.006302 Minibatch error: 0.0% Validation error: 0.8%
Step 8500 (epoch 9.89), 43.5 ms Minibatch loss: 1.612, learning rate: 0.006302 Minibatch error: 1.6% Validation error: 0.8%
Test error: 0.8%
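The two runs report very different per-step times (roughly 280 to 290 ms without the GPU, about 44 ms with it). A small sketch like this turns the Step lines into an average and a ratio; mean_step_ms is an illustrative helper, and the two short strings are excerpts copied from the logs in this post rather than the full output.

```python
import re

_STEP = re.compile(r"Step (\d+) \(epoch [\d.]+\), ([\d.]+) ms")


def mean_step_ms(log_text, skip_first=True):
    """Average the time in ms reported on 'Step N (epoch E), T ms' lines."""
    times = [float(ms) for _, ms in _STEP.findall(log_text)]
    if skip_first and len(times) > 1:
        times = times[1:]  # step 0 is an outlier in both logs (5.4 ms and 81.3 ms)
    return sum(times) / len(times)


cpu_log = """Step 0 (epoch 0.00), 5.4 ms
Step 100 (epoch 0.12), 280.2 ms
Step 200 (epoch 0.23), 281.0 ms"""
gpu_log = """Step 0 (epoch 0.00), 81.3 ms
Step 100 (epoch 0.12), 44.4 ms
Step 200 (epoch 0.23), 44.4 ms"""

cpu_ms = mean_step_ms(cpu_log)
gpu_ms = mean_step_ms(gpu_log)
print("CPU %.1f ms, GPU %.1f ms, ratio %.1fx" % (cpu_ms, gpu_ms, cpu_ms / gpu_ms))
```

Feeding in the complete logs instead of the excerpts gives a steadier average, but the picture is the same: on this GeForce 940M the GPU run is several times faster per step.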
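Finally Done!!!
Summary
This took four days of back and forth. The lesson: follow the official instructions step by step. Compatibility across versions is poor, so do not download other versions at will, and analyze each error carefully before taking the next step.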
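You are welcome to visit my homepage (http://mageek.cn/)
Copyright of this article belongs to the author. Do not repost without permission. If this article violates any rules, you can contact the administrator to have it removed.
When reposting, please cite this article's address: https://www.ucloud.cn/yun/45512.html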