Building TensorFlow with CUDA 10 + cuDNN 7.3 + Python 3.6

Last time I failed to build TensorFlow with CUDA 10, cuDNN 7.3, and Python 3.7, so this time I tried building it with CUDA 10, cuDNN 7.3, and Python 3.6.

Building TensorFlow

!pip3 install -U six wheel mock
Requirement already up-to-date: six in /root/.pyenv/versions/3.6.6/envs/py36/lib/python3.6/site-packages (1.11.0)
Collecting wheel
  Using cached https://files.pythonhosted.org/packages/81/30/e935244ca6165187ae8be876b6316ae201b71485538ffac1d718843025a9/wheel-0.31.1-py2.py3-none-any.whl
Collecting mock
  Using cached https://files.pythonhosted.org/packages/e6/35/f187bdf23be87092bd0f1200d43d23076cee4d0dec109f195173fd3ebc79/mock-2.0.0-py2.py3-none-any.whl
Collecting pbr>=0.11 (from mock)
  Using cached https://files.pythonhosted.org/packages/69/1c/98cba002ed975a91a0294863d9c774cc0ebe38e05bbb65e83314550b1677/pbr-4.2.0-py2.py3-none-any.whl
Installing collected packages: wheel, pbr, mock
Successfully installed mock-2.0.0 pbr-4.2.0 wheel-0.31.1
!pip3 install -U keras_applications==1.0.5 --no-deps
Collecting keras_applications==1.0.5
  Using cached https://files.pythonhosted.org/packages/3f/9c/6e9393ead970fd97be0cfde912697dafec5800d9191f5ba25352fa537d72/Keras_Applications-1.0.5-py2.py3-none-any.whl
Installing collected packages: keras-applications
Successfully installed keras-applications-1.0.5
!pip3 install -U keras_preprocessing==1.0.3 --no-deps
Collecting keras_preprocessing==1.0.3
  Using cached https://files.pythonhosted.org/packages/b3/bd/796f986980da4d6adc77ffd8b2b11074e7b17a7b74b03789aefac5709c4b/Keras_Preprocessing-1.0.3-py2.py3-none-any.whl
Installing collected packages: keras-preprocessing
Successfully installed keras-preprocessing-1.0.3
cd /root/tensorflow
/root/tensorflow
!git pull
remote: Enumerating objects: 39, done.
remote: Counting objects: 100% (39/39), done.
remote: Compressing objects: 100% (9/9), done.
remote: Total 45 (delta 30), reused 39 (delta 30), pack-reused 6
Unpacking objects: 100% (45/45), done.
From https://github.com/tensorflow/tensorflow
   7fd14feb9c..19b2383cc0  master     -> origin/master
You are not currently on a branch.
Please specify which branch you want to merge with.
See git-pull(1) for details.

    git pull <remote> <branch>

!git checkout v1.11.0
HEAD is now at c19e29306c Final version strings for 1.11.0 (#22513)
!./configure
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.google.protobuf.UnsafeUtil (file:/root/.cache/bazel/_bazel_root/install/792a28b07894763eaa2bd870f8776b23/_embedded_binaries/A-server.jar) to field java.lang.String.value
WARNING: Please consider reporting this to the maintainers of com.google.protobuf.UnsafeUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.17.2 installed.
Please specify the location of python. [Default is /root/.pyenv/versions/py36/bin/python]: 
!bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

The build failed with Python 3.7, but it succeeded with Python 3.6.
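
Since the outcome depended on the interpreter version, a small guard run before `./configure` can catch a wrong environment early. This is only a sketch: the names `KNOWN_GOOD`, `KNOWN_BAD`, and `build_status` are made up for illustration, and the two sets record nothing more than what this post observed for v1.11.0 (3.6 built, 3.7 did not).

```python
import sys

# Assumption: these sets only encode the results reported in this post
# for TensorFlow v1.11.0; other Python versions were not tried here.
KNOWN_GOOD = {(3, 6)}
KNOWN_BAD = {(3, 7)}


def build_status(version_info=sys.version_info):
    """Classify a Python version as 'ok', 'known-bad', or 'untested'."""
    key = (version_info[0], version_info[1])
    if key in KNOWN_GOOD:
        return "ok"
    if key in KNOWN_BAD:
        return "known-bad"
    return "untested"


# Report the status of the interpreter running this snippet.
print(sys.version.split()[0], "->", build_status())
```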

cd /root/tensorflow
/root/tensorflow
!./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
Fri Sep 28 19:55:54 JST 2018 : === Preparing sources in dir: /tmp/tmp.lDlQCU6oyh
~/tensorflow ~/tensorflow
~/tensorflow
Fri Sep 28 19:56:10 JST 2018 : === Building wheel
warning: no files found matching '*.pd' under directory '*'
warning: no files found matching '*.dll' under directory '*'
warning: no files found matching '*.lib' under directory '*'
warning: no files found matching '*.h' under directory 'tensorflow/include/tensorflow'
warning: no files found matching '*' under directory 'tensorflow/include/Eigen'
warning: no files found matching '*.h' under directory 'tensorflow/include/google'
warning: no files found matching '*' under directory 'tensorflow/include/third_party'
warning: no files found matching '*' under directory 'tensorflow/include/unsupported'
Fri Sep 28 19:56:35 JST 2018 : === Output wheel file is in: /tmp/tensorflow_pkg

Installing TensorFlow

ls /tmp/tensorflow_pkg/
tensorflow-1.11.0-cp36-cp36m-linux_x86_64.whl
!pip3 install /tmp/tensorflow_pkg/tensorflow-1.11.0-cp36-cp36m-linux_x86_64.whl
Processing /tmp/tensorflow_pkg/tensorflow-1.11.0-cp36-cp36m-linux_x86_64.whl
Collecting tensorboard<1.12.0,>=1.11.0 (from tensorflow==1.11.0)
  Downloading https://files.pythonhosted.org/packages/9b/2f/4d788919b1feef04624d63ed6ea45a49d1d1c834199ec50716edb5d310f4/tensorboard-1.11.0-py3-none-any.whl (3.0MB)
    100% |################################| 3.0MB 7.9MB/s eta 0:00:01
Requirement already satisfied: numpy>=1.13.3 in /root/.pyenv/versions/3.6.6/envs/py36/lib/python3.6/site-packages (from tensorflow==1.11.0) (1.15.2)
Requirement already satisfied: keras-preprocessing>=1.0.3 in /root/.pyenv/versions/3.6.6/envs/py36/lib/python3.6/site-packages (from tensorflow==1.11.0) (1.0.3)
Requirement already satisfied: keras-applications>=1.0.5 in /root/.pyenv/versions/3.6.6/envs/py36/lib/python3.6/site-packages (from tensorflow==1.11.0) (1.0.5)
Requirement already satisfied: six>=1.10.0 in /root/.pyenv/versions/3.6.6/envs/py36/lib/python3.6/site-packages (from tensorflow==1.11.0) (1.11.0)
Requirement already satisfied: setuptools<=39.1.0 in /root/.pyenv/versions/3.6.6/envs/py36/lib/python3.6/site-packages (from tensorflow==1.11.0) (39.0.1)
Collecting protobuf>=3.6.0 (from tensorflow==1.11.0)
  Downloading https://files.pythonhosted.org/packages/c2/f9/28787754923612ca9bfdffc588daa05580ed70698add063a5629d1a4209d/protobuf-3.6.1-cp36-cp36m-manylinux1_x86_64.whl (1.1MB)
    100% |################################| 1.1MB 33.3MB/s eta 0:00:01
Collecting grpcio>=1.8.6 (from tensorflow==1.11.0)
  Downloading https://files.pythonhosted.org/packages/a7/9c/523fec4e50cd4de5effeade9fab6c1da32e7e1d72372e8e514274ffb6509/grpcio-1.15.0-cp36-cp36m-manylinux1_x86_64.whl (9.5MB)
    100% |################################| 9.5MB 7.2MB/s eta 0:00:01
Collecting termcolor>=1.1.0 (from tensorflow==1.11.0)
  Downloading https://files.pythonhosted.org/packages/8a/48/a76be51647d0eb9f10e2a4511bf3ffb8cc1e6b14e9e4fab46173aa79f981/termcolor-1.1.0.tar.gz
Collecting astor>=0.6.0 (from tensorflow==1.11.0)
  Downloading https://files.pythonhosted.org/packages/35/6b/11530768cac581a12952a2aad00e1526b89d242d0b9f59534ef6e6a1752f/astor-0.7.1-py2.py3-none-any.whl
Requirement already satisfied: wheel>=0.26 in /root/.pyenv/versions/3.6.6/envs/py36/lib/python3.6/site-packages (from tensorflow==1.11.0) (0.31.1)
Collecting gast>=0.2.0 (from tensorflow==1.11.0)
  Downloading https://files.pythonhosted.org/packages/5c/78/ff794fcae2ce8aa6323e789d1f8b3b7765f601e7702726f430e814822b96/gast-0.2.0.tar.gz
Collecting absl-py>=0.1.6 (from tensorflow==1.11.0)
  Downloading https://files.pythonhosted.org/packages/16/db/cce5331638138c178dd1d5fb69f3f55eb3787a12efd9177177ae203e847f/absl-py-0.5.0.tar.gz (90kB)
    100% |################################| 92kB 28.8MB/s eta 0:00:01
Collecting werkzeug>=0.11.10 (from tensorboard<1.12.0,>=1.11.0->tensorflow==1.11.0)
  Downloading https://files.pythonhosted.org/packages/20/c4/12e3e56473e52375aa29c4764e70d1b8f3efa6682bef8d0aae04fe335243/Werkzeug-0.14.1-py2.py3-none-any.whl (322kB)
    100% |################################| 327kB 40.3MB/s eta 0:00:01
Collecting markdown>=2.6.8 (from tensorboard<1.12.0,>=1.11.0->tensorflow==1.11.0)
  Downloading https://files.pythonhosted.org/packages/7a/6b/5600647404ba15545ec37d2f7f58844d690baf2f81f3a60b862e48f29287/Markdown-3.0.1-py2.py3-none-any.whl (89kB)
    100% |################################| 92kB 42.5MB/s eta 0:00:01
Collecting scipy>=0.14 (from keras-preprocessing>=1.0.3->tensorflow==1.11.0)
  Downloading https://files.pythonhosted.org/packages/a8/0b/f163da98d3a01b3e0ef1cab8dd2123c34aee2bafbb1c5bffa354cc8a1730/scipy-1.1.0-cp36-cp36m-manylinux1_x86_64.whl (31.2MB)
    100% |################################| 31.2MB 2.6MB/s eta 0:00:01
Collecting keras>=2.1.6 (from keras-preprocessing>=1.0.3->tensorflow==1.11.0)
  Downloading https://files.pythonhosted.org/packages/34/7d/b1dedde8af99bd82f20ed7e9697aac0597de3049b1f786aa2aac3b9bd4da/Keras-2.2.2-py2.py3-none-any.whl (299kB)
    100% |################################| 307kB 54.0MB/s eta 0:00:01
Collecting h5py (from keras-applications>=1.0.5->tensorflow==1.11.0)
  Downloading https://files.pythonhosted.org/packages/8e/cb/726134109e7bd71d98d1fcc717ffe051767aac42ede0e7326fd1787e5d64/h5py-2.8.0-cp36-cp36m-manylinux1_x86_64.whl (2.8MB)
    100% |################################| 2.8MB 30.7MB/s eta 0:00:01
Collecting pyyaml (from keras>=2.1.6->keras-preprocessing>=1.0.3->tensorflow==1.11.0)
  Using cached https://files.pythonhosted.org/packages/9e/a3/1d13970c3f36777c583f136c136f804d70f500168edc1edea6daa7200769/PyYAML-3.13.tar.gz
Building wheels for collected packages: termcolor, gast, absl-py, pyyaml
  Running setup.py bdist_wheel for termcolor ... done
  Stored in directory: /root/.cache/pip/wheels/7c/06/54/bc84598ba1daf8f970247f550b175aaaee85f68b4b0c5ab2c6
  Running setup.py bdist_wheel for gast ... done
  Stored in directory: /root/.cache/pip/wheels/9a/1f/0e/3cde98113222b853e98fc0a8e9924480a3e25f1b4008cedb4f
  Running setup.py bdist_wheel for absl-py ... done
  Stored in directory: /root/.cache/pip/wheels/3c/33/ae/db8cd618e62f87594c13a5483f96e618044f9b01596efd013f
  Running setup.py bdist_wheel for pyyaml ... done
  Stored in directory: /root/.cache/pip/wheels/ad/da/0c/74eb680767247273e2cf2723482cb9c924fe70af57c334513f
Successfully built termcolor gast absl-py pyyaml
keras 2.2.2 has requirement keras-applications==1.0.4, but you'll have keras-applications 1.0.5 which is incompatible.
keras 2.2.2 has requirement keras-preprocessing==1.0.2, but you'll have keras-preprocessing 1.0.3 which is incompatible.
Installing collected packages: protobuf, grpcio, werkzeug, markdown, tensorboard, termcolor, astor, gast, absl-py, tensorflow, scipy, h5py, pyyaml, keras
Successfully installed absl-py-0.5.0 astor-0.7.1 gast-0.2.0 grpcio-1.15.0 h5py-2.8.0 keras-2.2.2 markdown-3.0.1 protobuf-3.6.1 pyyaml-3.13 scipy-1.1.0 tensorboard-1.11.0 tensorflow-1.11.0 termcolor-1.1.0 werkzeug-0.14.1
cd
/root
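
The wheel built above is named `tensorflow-1.11.0-cp36-cp36m-linux_x86_64.whl`, and the `cp36` part is what ties it to CPython 3.6. The sketch below splits such a filename into its tags and compares the Python tag with an interpreter version. The helper names `wheel_tags` and `matches_interpreter` are hypothetical, and the check assumes a plain CPython tag such as `cp36` (compound tags like `py2.py3` are not handled).

```python
import sys


def wheel_tags(filename):
    """Return (name, version, python_tag, abi_tag, platform_tag) for a wheel filename."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    # A wheel name has 5 fields, or 6 when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: {}".format(filename))
    return parts[0], parts[1], parts[-3], parts[-2], parts[-1]


def matches_interpreter(filename, version_info=sys.version_info):
    """True if the wheel's Python tag is the CPython tag for version_info."""
    expected = "cp{}{}".format(version_info[0], version_info[1])
    return wheel_tags(filename)[2] == expected


wheel = "tensorflow-1.11.0-cp36-cp36m-linux_x86_64.whl"
print(wheel_tags(wheel))
print("matches this interpreter:", matches_interpreter(wheel))
```

Running this inside the target environment before `pip3 install` gives a quick hint whether pip will accept the wheel.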

Importing TensorFlow

%load_ext version_information
import tensorflow
%version_information tensorflow, pycuda, scikit-cuda
The version_information extension is already loaded. To reload it, use:
  %reload_ext version_information
Software      Version
Python        3.6.6 64bit [GCC 7.3.0]
IPython       7.0.1
OS            Linux 4.15.0 34 generic x86_64 with debian buster sid
tensorflow    1.11.0
pycuda        2018.1.1
scikit-cuda   0.5.2
Fri Sep 28 20:23:35 2018 JST
!nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
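
To check the CUDA toolkit version from a script rather than by eye, the `release` field of the `nvcc --version` output can be parsed. This is a rough sketch: `nvcc_release` and `read_nvcc_release` are hypothetical helper names, and the regular expression assumes the output keeps the `release X.Y` wording shown above.

```python
import re
import subprocess


def nvcc_release(text):
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def read_nvcc_release():
    """Run nvcc (assumed to be on PATH) and parse its release number."""
    output = subprocess.check_output(["nvcc", "--version"], universal_newlines=True)
    return nvcc_release(output)


# Parse the line printed in this post instead of calling nvcc.
sample = "Cuda compilation tools, release 10.0, V10.0.130"
print(nvcc_release(sample))  # (10, 0)
```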

Testing TensorFlow

import time

import numpy as np
import tensorflow as tf

# Let this process use at most about two thirds of the GPU memory.
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.667)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

dim = 10000
rnd = np.random.RandomState(0)
a = rnd.rand(dim, dim).astype(np.float32)
b = rnd.rand(dim, dim).astype(np.float32)

# CPU baseline: NumPy matrix product.
start = time.time()
rescpu = np.dot(a, b)
c = time.time() - start
print('CPU:', c)

X = tf.placeholder(tf.float32, shape=(dim, dim), name=None)
Y = tf.placeholder(tf.float32, shape=(dim, dim), name=None)
Z = tf.matmul(X, Y)

# TensorFlow: the timing includes feeding a and b via feed_dict and fetching the result.
start = time.time()
resgpu = sess.run(Z, feed_dict={X: a, Y: b})
d = time.time() - start
print('GPU:', d)
print('Speed difference: {:0.1f}x'.format(c / d))
print(np.allclose(rescpu, resgpu))
print(np.allclose(resgpu, rescpu))
CPU: 6.295258283615112
GPU: 1.1881449222564697
Speed difference: 5.3x
True
True

Compared with my earlier measurements, NumPy has gotten faster while TensorFlow has gotten slower.
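
Each figure above comes from a single call, so it also contains one-off costs such as library initialization on the first run. A common way to reduce that noise is to run a warm-up call and then keep the best of several repeats. The helper below is a minimal sketch of that idea, not the method used in this post: `best_time` and its `warmup`, `repeats`, and `sync` parameters are names chosen here for illustration.

```python
import time


def best_time(fn, warmup=1, repeats=3, sync=None):
    """Return the fastest wall-clock time in seconds over `repeats` calls of fn().

    `sync` is an optional callable that blocks until asynchronous work
    (for example GPU kernels) has finished; it is called after every fn().
    """
    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        fn()
        if sync is not None:
            sync()
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for pending work before stopping the clock
        best = min(best, time.perf_counter() - start)
    return best


# Stand-in workload; in the notebook this could be lambda: np.dot(a, b).
print(best_time(lambda: sum(range(100000))))
```

For the TensorFlow case the callable could be `lambda: sess.run(Z, feed_dict={X: a, Y: b})`. With PyCUDA, one candidate for `sync` is `pycuda.driver.Context.synchronize`.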

import time

import numpy as np
import pycuda.autoinit  # creates the CUDA context used by PyCUDA
import pycuda.gpuarray as gpuarray
import skcuda.linalg as culinalg
import tensorflow as tf

culinalg.init()

# Let TensorFlow use at most about two thirds of the GPU memory.
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.667)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

dim = 10000
rnd = np.random.RandomState(0)
a = rnd.rand(dim, dim).astype(np.float32)
b = rnd.rand(dim, dim).astype(np.float32)
# Copy the inputs to the GPU before the timer starts.
a_gpu = gpuarray.to_gpu(a)
b_gpu = gpuarray.to_gpu(b)

# scikit-cuda: times only the dot call (no transfers, no explicit synchronization).
start = time.time()
ressk = culinalg.dot(a_gpu, b_gpu)
c = time.time() - start
print('sk:', c)
ressk = ressk.get()  # copy the result back to the host

X = tf.placeholder(tf.float32, shape=(dim, dim), name=None)
Y = tf.placeholder(tf.float32, shape=(dim, dim), name=None)
Z = tf.matmul(X, Y)

# TensorFlow: the timing includes feeding a and b via feed_dict and fetching the result.
start = time.time()
restf = sess.run(Z, feed_dict={X: a, Y: b})
d = time.time() - start
print('tf:', d)
print('Speed difference: {:0.1f}x'.format(d / c))
print(np.allclose(ressk, restf))
print(np.allclose(restf, ressk))
sk: 0.03292489051818848
tf: 1.037017583847046
Speed difference: 31.5x
True
True
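
One way to sanity-check the timings above is to convert them into throughput. Multiplying two n x n matrices takes roughly 2*n^3 floating-point operations, so for n = 10000 that is about 2e12 operations. The sketch below does this arithmetic for the four timings printed in this post. `matmul_gflops` is a helper name made up here, and the 2*n^3 count is the usual approximation rather than an exact figure.

```python
def matmul_gflops(n, seconds):
    """Approximate GFLOP/s for an (n x n) @ (n x n) product, using about 2*n**3 operations."""
    return 2.0 * n ** 3 / seconds / 1e9


# Timings copied from the outputs in this post (seconds).
timings = [
    ("numpy", 6.295258283615112),
    ("tf, first test", 1.1881449222564697),
    ("skcuda", 0.03292489051818848),
    ("tf, second test", 1.037017583847046),
]

for name, seconds in timings:
    print("{:16s} {:10.1f} GFLOP/s".format(name, matmul_gflops(10000, seconds)))
```

The skcuda timing works out to roughly 60 TFLOP/s, which looks too high for float32 on a single GPU of that era. A likely explanation is that the timer stopped before the GPU had finished, so that number may reflect mostly the kernel launch, while the TensorFlow numbers also include copying both inputs to the device and the result back.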

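Both skcuda and TensorFlow have gotten slower. Either CUDA 10 and cuDNN 7.3 bring no benefit, or my underpowered GPU card is the cause. Either way, it seems that an old GPU card from more than two years ago does not gain much. I may need to buy something like the latest RTX 2070.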