cuda-10 + cudnn-7.3 requires NVIDIA driver 410.xx, so you may need to update the driver first. According to this site, the driver can be updated as follows.

Updating the driver

sudo apt remove nvidia-*
sudo apt update
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo bash -c 'echo "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/cuda.list'
sudo apt update
sudo apt install nvidia-driver-410

Check that the driver was actually updated.

!nvidia-smi

Tue Sep 25 22:56:14 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48                 Driver Version: 410.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   54C    P5    11W /  N/A |    343MiB /  6078MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
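
The same check can be scripted instead of read off the table. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH; the helper names and the threshold constant are illustrative, and the value 410 is taken from the requirement stated at the top of this page.

```python
import subprocess

REQUIRED_MAJOR = 410  # driver branch that cuda-10 + cudnn-7.3 needs, per the text above


def parse_major(version: str) -> int:
    """Return the major component of a driver version string such as '410.48'."""
    return int(version.strip().split(".")[0])


def installed_driver_version() -> str:
    """Ask nvidia-smi for the driver version of the first GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.splitlines()[0].strip()


def driver_is_new_enough(version: str, required_major: int = REQUIRED_MAJOR) -> bool:
    """True when the driver's major version is at least `required_major`."""
    return parse_major(version) >= required_major


if __name__ == "__main__":
    try:
        version = installed_driver_version()
    except (OSError, subprocess.CalledProcessError) as exc:
        print("could not query nvidia-smi:", exc)
    else:
        print(version, "ok" if driver_is_new_enough(version) else "too old")
```

With the driver shown above (410.48), `driver_is_new_enough` returns True.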

Installing PyTorch and Caffe2

First, clone the pytorch git repository.

!git clone https://github.com/pytorch/pytorch.git

Cloning into 'pytorch'...
remote: Enumerating objects: 65, done.
remote: Counting objects: 100% (65/65), done.
remote: Compressing objects: 100% (62/62), done.
remote: Total 135572 (delta 13), reused 13 (delta 3), pack-reused 135507
Receiving objects: 100% (135572/135572), 74.44 MiB | 4.68 MiB/s, done.
Resolving deltas: 100% (102824/102824), done.

Move into the pytorch folder.

cd pytorch

/root/pytorch

!git submodule update --init --recursive

Submodule 'third_party/ComputeLibrary' (https://github.com/ARM-software/ComputeLibrary.git) registered for path 'third_party/ComputeLibrary'
Submodule 'third_party/NNPACK_deps/FP16' (https://github.com/Maratyszcza/FP16.git) registered for path 'third_party/FP16'
Submodule 'third_party/NNPACK_deps/FXdiv' (https://github.com/Maratyszcza/FXdiv.git) registered for path 'third_party/FXdiv'
Submodule 'third_party/NNPACK' (https://github.com/Maratyszcza/NNPACK.git) registered for path 'third_party/NNPACK'
Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/benchmark'
Submodule 'third_party/catch' (https://github.com/catchorg/Catch2.git) registered for path 'third_party/catch'
Submodule 'third_party/cereal' (https://github.com/USCiLab/cereal) registered for path 'third_party/cereal'
Submodule 'third-party/cpuinfo' (https://github.com/Maratyszcza/cpuinfo.git) registered for path 'third_party/cpuinfo'
Submodule 'third_party/cub' (https://github.com/NVlabs/cub.git) registered for path 'third_party/cub'
Submodule 'third_party/eigen' (https://github.com/eigenteam/eigen-git-mirror.git) registered for path 'third_party/eigen'
Submodule 'third_party/gloo' (https://github.com/facebookincubator/gloo) registered for path 'third_party/gloo'
Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/googletest'
Submodule 'third_party/ideep' (https://github.com/intel/ideep) registered for path 'third_party/ideep'
Submodule 'third_party/ios-cmake' (https://github.com/Yangqing/ios-cmake.git) registered for path 'third_party/ios-cmake'
Submodule 'third_party/nervanagpu' (https://github.com/NervanaSystems/nervanagpu.git) registered for path 'third_party/nervanagpu'
Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx'
Submodule 'third_party/onnx-tensorrt' (https://github.com/onnx/onnx-tensorrt) registered for path 'third_party/onnx-tensorrt'
Submodule 'third_party/protobuf' (https://github.com/google/protobuf.git) registered for path 'third_party/protobuf'
Submodule 'third_party/NNPACK_deps/psimd' (https://github.com/Maratyszcza/psimd.git) registered for path 'third_party/psimd'
Submodule 'third_party/NNPACK_deps/pthreadpool' (https://github.com/Maratyszcza/pthreadpool.git) registered for path 'third_party/pthreadpool'
Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/pybind11'
Submodule 'third_party/python-enum' (https://github.com/PeachPy/enum34.git) registered for path 'third_party/python-enum'
Submodule 'third_party/python-peachpy' (https://github.com/Maratyszcza/PeachPy.git) registered for path 'third_party/python-peachpy'
Submodule 'third_party/python-six' (https://github.com/benjaminp/six.git) registered for path 'third_party/python-six'
Submodule 'third_party/sleef' (https://github.com/shibatch/sleef) registered for path 'third_party/sleef'
Submodule 'third_party/zstd' (https://github.com/facebook/zstd.git) registered for path 'third_party/zstd'
Cloning into '/root/pytorch/third_party/ComputeLibrary'...
Cloning into '/root/pytorch/third_party/FP16'...
Cloning into '/root/pytorch/third_party/FXdiv'...
Cloning into '/root/pytorch/third_party/NNPACK'...
Cloning into '/root/pytorch/third_party/benchmark'...
Cloning into '/root/pytorch/third_party/catch'...
Cloning into '/root/pytorch/third_party/cereal'...
Cloning into '/root/pytorch/third_party/cpuinfo'...
Cloning into '/root/pytorch/third_party/cub'...
Cloning into '/root/pytorch/third_party/eigen'...
Cloning into '/root/pytorch/third_party/gloo'...
Cloning into '/root/pytorch/third_party/googletest'...
Cloning into '/root/pytorch/third_party/ideep'...
Cloning into '/root/pytorch/third_party/ios-cmake'...
Cloning into '/root/pytorch/third_party/nervanagpu'...
Cloning into '/root/pytorch/third_party/onnx'...
Cloning into '/root/pytorch/third_party/onnx-tensorrt'...
Cloning into '/root/pytorch/third_party/protobuf'...
Cloning into '/root/pytorch/third_party/psimd'...
Cloning into '/root/pytorch/third_party/pthreadpool'...
Cloning into '/root/pytorch/third_party/pybind11'...
Cloning into '/root/pytorch/third_party/python-enum'...
Cloning into '/root/pytorch/third_party/python-peachpy'...
Cloning into '/root/pytorch/third_party/python-six'...
Cloning into '/root/pytorch/third_party/sleef'...
Cloning into '/root/pytorch/third_party/zstd'...
Submodule path 'third_party/ComputeLibrary': checked out '292227986edb37b01061afcad6df18ba9d6ccbeb'
Submodule path 'third_party/FP16': checked out '4b37bd31c9cc1380ef9f205f7dd031efe0e847ab'
Submodule path 'third_party/FXdiv': checked out '811b482bcd9e8d98ad80c6c78d5302bb830184b0'
Submodule path 'third_party/NNPACK': checked out 'af40ea7d12702f8ae55aeb13701c09cad09334c3'
Submodule path 'third_party/benchmark': checked out '505be96ab23056580a3a2315abba048f4428b04e'
Submodule path 'third_party/catch': checked out '0a34cc201ef28bf25c88b0062f331369596cb7b7'
Submodule path 'third_party/cereal': checked out '51cbda5f30e56c801c07fe3d3aba5d7fb9e6cca4'
Submodule path 'third_party/cpuinfo': checked out '4e8f04355892c5deb64a51731a6afdb544a4294d'
Submodule path 'third_party/cub': checked out '285aeebaa34b0e8a7670867a2e66c1a52d998d6a'
From https://github.com/eigenteam/eigen-git-mirror
 * branch                cafae68f33f7f41270b2e8c2dd181f510aa4d918 -> FETCH_HEAD
Submodule path 'third_party/eigen': checked out 'cafae68f33f7f41270b2e8c2dd181f510aa4d918'
Submodule path 'third_party/gloo': checked out 'aa0d2e3f8aa4f9cee5ffa46070491cf1ed6aae70'
Submodule path 'third_party/googletest': checked out '2fe3bd994b3189899d93f1d5a881e725e046fdc2'
Submodule path 'third_party/ideep': checked out '4bd9a6800bf7db068187619e0582d34dec9651dc'
Submodule 'mkl-dnn' (https://github.com/01org/mkl-dnn.git) registered for path 'third_party/ideep/mkl-dnn'
Submodule 'tests/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/ideep/tests/googletest'
Submodule 'tests/rapidcheck' (https://github.com/emil-e/rapidcheck.git) registered for path 'third_party/ideep/tests/rapidcheck'
Cloning into '/root/pytorch/third_party/ideep/mkl-dnn'...
Cloning into '/root/pytorch/third_party/ideep/tests/googletest'...
Cloning into '/root/pytorch/third_party/ideep/tests/rapidcheck'...
Submodule path 'third_party/ideep/mkl-dnn': checked out 'c0095ec26c215970b9b94915963b59e76398265e'
Submodule path 'third_party/ideep/tests/googletest': checked out '08d5b1f33af8c18785fb8ca02792b5fac81e248f'
Submodule path 'third_party/ideep/tests/rapidcheck': checked out '10fc0cbaa46a765d8134e99995de9c81c048a331'
Submodule 'ext/catch' (https://github.com/philsquared/Catch.git) registered for path 'third_party/ideep/tests/rapidcheck/ext/catch'
Submodule 'ext/googletest' (https://github.com/google/googletest) registered for path 'third_party/ideep/tests/rapidcheck/ext/googletest'
Cloning into '/root/pytorch/third_party/ideep/tests/rapidcheck/ext/catch'...
Cloning into '/root/pytorch/third_party/ideep/tests/rapidcheck/ext/googletest'...
Submodule path 'third_party/ideep/tests/rapidcheck/ext/catch': checked out '2ce6c74f8fcbd28c977e70d8c020939123a9ea3c'
Submodule path 'third_party/ideep/tests/rapidcheck/ext/googletest': checked out 'ecd530865cefdfa7dea58e84f6aa1b548950363d'
Submodule path 'third_party/ios-cmake': checked out '8abaed637d56f1337d6e1d2c4026e25c1eade724'
Submodule path 'third_party/nervanagpu': checked out 'd4eefd50fbd7d34a17dddbc829888835d67b5f4a'
Submodule path 'third_party/onnx': checked out 'c4734c6200cb42c1aa36eb1f0160041d2401644d'
Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/onnx/third_party/benchmark'
Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx/third_party/pybind11'
Cloning into '/root/pytorch/third_party/onnx/third_party/benchmark'...
Cloning into '/root/pytorch/third_party/onnx/third_party/pybind11'...
Submodule path 'third_party/onnx/third_party/benchmark': checked out 'e776aa0275e293707b6a0901e0e8d8a8a3679508'
Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a1041190c8b8ff0cd9e2f0752248ad5e3789ea0c'
Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/onnx/third_party/pybind11/tools/clang'
Cloning into '/root/pytorch/third_party/onnx/third_party/pybind11/tools/clang'...
Submodule path 'third_party/onnx/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5'
Submodule path 'third_party/onnx-tensorrt': checked out 'fa0964e8477fc004ee2f49ee77ffce0bf7f711a9'
Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx'
Cloning into '/root/pytorch/third_party/onnx-tensorrt/third_party/onnx'...
Submodule path 'third_party/onnx-tensorrt/third_party/onnx': checked out 'b4072194c2e6ef90693bcfdea4c6f45cf30bb65e'
Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark'
Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11'
Cloning into '/root/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark'...
Cloning into '/root/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11'...
Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark': checked out 'e776aa0275e293707b6a0901e0e8d8a8a3679508'
Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11': checked out 'a1041190c8b8ff0cd9e2f0752248ad5e3789ea0c'
Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang'
Cloning into '/root/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang'...
Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5'
Submodule path 'third_party/protobuf': checked out '2761122b810fe8861004ae785cc3ab39f384d342'
Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/protobuf/third_party/benchmark'
Cloning into '/root/pytorch/third_party/protobuf/third_party/benchmark'...
Submodule path 'third_party/protobuf/third_party/benchmark': checked out '360e66c1c4777c99402cf8cd535aa510fee16573'
Submodule path 'third_party/psimd': checked out '4ac61b112252778b174575931c641bef661ab3cd'
Submodule path 'third_party/pthreadpool': checked out '2b06b31f6a315162348e1f3c24325eedaf6cc559'
Submodule path 'third_party/pybind11': checked out '5c8746ff135abb390bf95944be593e895a586a50'
Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/pybind11/tools/clang'
Cloning into '/root/pytorch/third_party/pybind11/tools/clang'...
Submodule path 'third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5'
Submodule path 'third_party/python-enum': checked out '4cfedc426c4e2fc52e3f5c2b4297e15ed8d6b8c7'
Submodule path 'third_party/python-peachpy': checked out '07d8fde8ac45d7705129475c0f94ed8925b93473'
Submodule path 'third_party/python-six': checked out '15e31431af97e5e64b80af0a3f598d382bcdd49a'
Submodule path 'third_party/sleef': checked out '6ff7a135a1e31979d1e1844a2e7171dfbd34f54f'
Submodule path 'third_party/zstd': checked out 'aec56a52fbab207fc639a1937d1e708a282edca8'

Install PyTorch and Caffe2.

!FULL_CAFFE2=1 python setup.py install

Building wheel torch-1.0.0a0+2cdf98a
running install
setup.py::run()
running build_deps
setup.py::build_deps::run()
Traceback (most recent call last):
  File "setup.py", line 334, in check_pydep
    importlib.import_module(importname)
  File "/root/.pyenv/versions/3.7.0/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'yaml'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "setup.py", line 1237, in <module>
    rel_site_packages + '/caffe2/**/*.py'
  File "/root/.pyenv/versions/py37/lib/python3.7/site-packages/setuptools/__init__.py", line 129, in setup
    return distutils.core.setup(**attrs)
  File "/root/.pyenv/versions/3.7.0/lib/python3.7/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/root/.pyenv/versions/3.7.0/lib/python3.7/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/root/.pyenv/versions/3.7.0/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "setup.py", line 730, in run
    self.run_command('build_deps')
  File "/root/.pyenv/versions/3.7.0/lib/python3.7/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/root/.pyenv/versions/3.7.0/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "setup.py", line 442, in run
    check_pydep('yaml', 'pyyaml')
  File "setup.py", line 336, in check_pydep
    raise RuntimeError(missing_pydep.format(importname=importname, module=module))
RuntimeError: Missing build dependency: Unable to `import yaml`.
Please install it via `conda install pyyaml` or `pip install pyyaml`

The build stops because the pyyaml dependency is missing, so install it.

!pip install pyyaml

Collecting pyyaml
  Downloading https://files.pythonhosted.org/packages/9e/a3/1d13970c3f36777c583f136c136f804d70f500168edc1edea6daa7200769/PyYAML-3.13.tar.gz (270kB)
    100% |████████████████████████████████| 276kB 8.0MB/s ta 0:00:01
Installing collected packages: pyyaml
  Running setup.py install for pyyaml ... done
Successfully installed pyyaml-3.13
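
Since the source build takes a long time, it can help to test for missing Python dependencies before starting it again. This is a small sketch using only the standard library; the list contains just the one pair named in the error message above, and anything else you add to it is your own assumption about what the build needs.

```python
from importlib.util import find_spec

# (import name, pip package name). Only this pair comes from the error
# message above; extend the list with whatever your build asks for.
BUILD_DEPS = [("yaml", "pyyaml")]


def missing_packages(deps=BUILD_DEPS):
    """Return the pip names of dependencies whose import name cannot be found."""
    return [pip_name for import_name, pip_name in deps if find_spec(import_name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("pip install " + " ".join(missing))
    else:
        print("all listed build dependencies are importable")
```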

!FULL_CAFFE2=1 python setup.py install
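
Once the build finishes, a quick way to confirm the result is to import torch and ask what it was built against. The sketch below assumes nothing beyond a standard torch install; `describe_torch` is an illustrative helper name, not part of PyTorch.

```python
def describe_torch():
    """Return basic facts about the installed torch build, or None if it is absent."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "version": torch.__version__,
        "cuda": torch.version.cuda,               # CUDA version torch was built with
        "cudnn": torch.backends.cudnn.version(),  # e.g. 7300 for cudnn-7.3
        "gpu_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch() or "torch is not importable")
```

It is probably safest to run this from outside the pytorch source folder, so that Python imports the installed package rather than the source tree.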

PyTorch benchmark test

Clone the benchmark repository.

cd

/root

!git clone https://github.com/u39kun/deep-learning-benchmark.git

Cloning into 'deep-learning-benchmark'...
remote: Enumerating objects: 129, done.
remote: Total 129 (delta 0), reused 0 (delta 0), pack-reused 129
Receiving objects: 100% (129/129), 121.63 KiB | 492.00 KiB/s, done.
Resolving deltas: 100% (54/54), done.

cd deep-learning-benchmark

/root/deep-learning-benchmark

ls

README.md  benchmark.py  create_table.py  frameworks/  results/

# %load benchmark.py
import argparse
from collections import OrderedDict
from importlib import import_module
import pickle

import numpy as np

frameworks = [
    'pytorch',
    'tensorflow',
    'caffe2'
]

models = [
    'vgg16',
    'resnet152',
    'densenet161'
]

precisions = [
    'fp32',
    'fp16'
]

class Benchmark():

    def get_framework_model(self, framework, model):
        framework_model = import_module('.'.join(['frameworks', framework, 'models']))
        return getattr(framework_model, model)

    def benchmark_model(self, mode, framework, model, precision, image_shape=(224, 224), batch_size=16, num_iterations=20, num_warmups=20):
        framework_model = self.get_framework_model(framework, model)(precision, image_shape, batch_size)
        durations = framework_model.eval(num_iterations, num_warmups) if mode == 'eval' else framework_model.train(num_iterations, num_warmups)
        durations = np.array(durations)
        return durations.mean() * 1000

    def benchmark_all(self):
        results = OrderedDict()
        for framework in frameworks:
            results[framework] = self.benchmark_framework(framework)
        return results

    def benchmark_framework(self, framework):
        results = OrderedDict()
        for precision in precisions:
            results[precision] = []
            for model in models:
                if model == 'densenet161' and framework != 'pytorch':
                    eval_duration = 0
                    train_duration = 0
                else:
                    eval_duration = self.benchmark_model('eval', framework, model, precision)
                    train_duration = self.benchmark_model('train', framework, model, precision)
                print("{}'s {} eval at {}: {}ms avg".format(framework, model, precision, round(eval_duration, 1)))
                print("{}'s {} train at {}: {}ms avg".format(framework, model, precision, round(train_duration, 1)))
                results[precision].append(eval_duration)
                results[precision].append(train_duration)

        return results

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-f', dest='framework', required=False)
    args = parser.parse_args()

    if args.framework:
        print('running benchmark for framework', args.framework)
        results = Benchmark().benchmark_framework(args.framework)
        pickle.dump(results, open('{}_results.pkl'.format(args.framework), 'wb'))
    else:
        print('running benchmark for frameworks', frameworks)
        results = Benchmark().benchmark_all()
        pickle.dump(results, open('all_results.pkl', 'wb'))

Delete tensorflow from the frameworks list in the code above, then run the benchmark.
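
As a sketch of that edit, the list can also be filtered rather than edited by hand. The names `ALL_FRAMEWORKS` and `SKIP` are hypothetical and do not exist in benchmark.py.

```python
# Hypothetical variant of the `frameworks` list in benchmark.py:
# keep only the frameworks that are installed on this machine.
ALL_FRAMEWORKS = ['pytorch', 'tensorflow', 'caffe2']
SKIP = {'tensorflow'}  # not installed here

frameworks = [name for name in ALL_FRAMEWORKS if name not in SKIP]
print(frameworks)  # ['pytorch', 'caffe2']
```

The script also accepts a `-f` option (see its `argparse` section), so `python benchmark.py -f pytorch` runs a single framework without touching the list.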

!python benchmark.py

running benchmark for frameworks ['pytorch', 'caffe2']
Traceback (most recent call last):
  File "benchmark.py", line 72, in <module>
    results = Benchmark().benchmark_all()
  File "benchmark.py", line 40, in benchmark_all
    results[framework] = self.benchmark_framework(framework)
  File "benchmark.py", line 52, in benchmark_framework
    eval_duration = self.benchmark_model('eval', framework, model, precision)
  File "benchmark.py", line 32, in benchmark_model
    framework_model = self.get_framework_model(framework, model)(precision, image_shape, batch_size)
  File "benchmark.py", line 28, in get_framework_model
    framework_model = import_module('.'.join(['frameworks', framework, 'models']))
  File "/root/.pyenv/versions/3.7.0/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/root/deep-learning-benchmark/frameworks/pytorch/models.py", line 6, in <module>
    import torchvision
ModuleNotFoundError: No module named 'torchvision'

!pip3 install torchvision

Collecting torchvision
  Downloading https://files.pythonhosted.org/packages/ca/0d/f00b2885711e08bd71242ebe7b96561e6f6d01fdb4b9dcf4d37e2e13c5e1/torchvision-0.2.1-py2.py3-none-any.whl (54kB)
    100% |████████████████████████████████| 61kB 3.4MB/s ta 0:00:011
Requirement already satisfied: pillow>=4.1.1 in /root/.pyenv/versions/3.7.0/envs/py37/lib/python3.7/site-packages (from torchvision) (5.2.0)
Requirement already satisfied: numpy in /root/.pyenv/versions/3.7.0/envs/py37/lib/python3.7/site-packages (from torchvision) (1.15.2)
Requirement already satisfied: six in /root/.pyenv/versions/3.7.0/envs/py37/lib/python3.7/site-packages (from torchvision) (1.11.0)
Requirement already satisfied: torch in /root/.pyenv/versions/3.7.0/envs/py37/lib/python3.7/site-packages (from torchvision) (1.0.0a0+2cdf98a)
Installing collected packages: torchvision
Successfully installed torchvision-0.2.1

!python benchmark.py

running benchmark for frameworks ['pytorch', 'caffe2']
cuda version= 10.0.130
cudnn version= 7300
/root/deep-learning-benchmark/frameworks/pytorch/models.py:17: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  self.eval_input = torch.autograd.Variable(x, volatile=True).cuda() if precision == 'fp32' \
pytorch's vgg16 eval at fp32: 109.1ms avg
pytorch's vgg16 train at fp32: 337.8ms avg
pytorch's resnet152 eval at fp32: 166.2ms avg
pytorch's resnet152 train at fp32: 536.6ms avg
/root/.pyenv/versions/py37/lib/python3.7/site-packages/torchvision/models/densenet.py:212: UserWarning: nn.init.kaiming_normal is now deprecated in favor of nn.init.kaiming_normal_.
  nn.init.kaiming_normal(m.weight.data)
pytorch's densenet161 eval at fp32: 164.8ms avg
pytorch's densenet161 train at fp32: 539.7ms avg
/root/deep-learning-benchmark/frameworks/pytorch/models.py:18: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  else torch.autograd.Variable(x, volatile=True).cuda().half()
pytorch's vgg16 eval at fp16: 104.0ms avg
pytorch's vgg16 train at fp16: 314.2ms avg
pytorch's resnet152 eval at fp16: 137.5ms avg
pytorch's resnet152 train at fp16: 478.4ms avg
pytorch's densenet161 eval at fp16: 140.9ms avg
pytorch's densenet161 train at fp16: 481.6ms avg
Traceback (most recent call last):
  File "benchmark.py", line 72, in <module>
    results = Benchmark().benchmark_all()
  File "benchmark.py", line 40, in benchmark_all
    results[framework] = self.benchmark_framework(framework)
  File "benchmark.py", line 52, in benchmark_framework
    eval_duration = self.benchmark_model('eval', framework, model, precision)
  File "benchmark.py", line 32, in benchmark_model
    framework_model = self.get_framework_model(framework, model)(precision, image_shape, batch_size)
  File "benchmark.py", line 28, in get_framework_model
    framework_model = import_module('.'.join(['frameworks', framework, 'models']))
  File "/root/.pyenv/versions/3.7.0/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/root/deep-learning-benchmark/frameworks/caffe2/models.py", line 3, in <module>
    from caffe2.proto import caffe2_pb2
  File "/root/.pyenv/versions/py37/lib/python3.7/site-packages/caffe2/proto/caffe2_pb2.py", line 6, in <module>
    from google.protobuf.internal import enum_type_wrapper
ModuleNotFoundError: No module named 'google'

!pip3 install protobuf future

Collecting protobuf
  Downloading https://files.pythonhosted.org/packages/77/78/a7f1ce761e2c738e209857175cd4f90a8562d1bde32868a8cd5290d58926/protobuf-3.6.1-py2.py3-none-any.whl (390kB)
    100% |████████████████████████████████| 399kB 11.5MB/s ta 0:00:01
Collecting future
  Downloading https://files.pythonhosted.org/packages/00/2b/8d082ddfed935f3608cc61140df6dcbf0edea1bc3ab52fb6c29ae3e81e85/future-0.16.0.tar.gz (824kB)
    100% |████████████████████████████████| 829kB 15.8MB/s ta 0:00:01
Requirement already satisfied: six>=1.9 in /root/.pyenv/versions/3.7.0/envs/py37/lib/python3.7/site-packages (from protobuf) (1.11.0)
Requirement already satisfied: setuptools in /root/.pyenv/versions/3.7.0/envs/py37/lib/python3.7/site-packages (from protobuf) (39.0.1)
Installing collected packages: protobuf, future
  Running setup.py install for future ... done
Successfully installed future-0.16.0 protobuf-3.6.1

!python benchmark.py

running benchmark for frameworks ['pytorch', 'caffe2']
cuda version= 10.0.130
cudnn version= 7300
/root/deep-learning-benchmark/frameworks/pytorch/models.py:17: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  self.eval_input = torch.autograd.Variable(x, volatile=True).cuda() if precision == 'fp32' \
pytorch's vgg16 eval at fp32: 110.7ms avg
pytorch's vgg16 train at fp32: 357.3ms avg
pytorch's resnet152 eval at fp32: 165.7ms avg
pytorch's resnet152 train at fp32: 551.0ms avg
/root/.pyenv/versions/py37/lib/python3.7/site-packages/torchvision/models/densenet.py:212: UserWarning: nn.init.kaiming_normal is now deprecated in favor of nn.init.kaiming_normal_.
  nn.init.kaiming_normal(m.weight.data)
Traceback (most recent call last):
  File "benchmark.py", line 72, in <module>
    results = Benchmark().benchmark_all()
  File "benchmark.py", line 40, in benchmark_all
    results[framework] = self.benchmark_framework(framework)
  File "benchmark.py", line 53, in benchmark_framework
    train_duration = self.benchmark_model('train', framework, model, precision)
  File "benchmark.py", line 33, in benchmark_model
    durations = framework_model.eval(num_iterations, num_warmups) if mode == 'eval' else framework_model.train(num_iterations, num_warmups)
  File "/root/deep-learning-benchmark/frameworks/pytorch/models.py", line 42, in train
    out = self.model(self.train_input)
  File "/root/.pyenv/versions/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/.pyenv/versions/py37/lib/python3.7/site-packages/torchvision/models/densenet.py", line 220, in forward
    features = self.features(x)
  File "/root/.pyenv/versions/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/.pyenv/versions/py37/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/root/.pyenv/versions/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/.pyenv/versions/py37/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/root/.pyenv/versions/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/.pyenv/versions/py37/lib/python3.7/site-packages/torchvision/models/densenet.py", line 144, in forward
    return torch.cat([x, new_features], 1)
RuntimeError: CUDA error: out of memory

The GPU ran out of memory, so I changed batch_size from 16 to 4 and ran the test again.
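
Here batch_size was changed by hand. As an alternative, the search for a batch size that fits could be automated by halving on each out-of-memory error. This is only a sketch: `largest_fitting_batch_size` and `fake_run` are hypothetical names, and `fake_run` stands in for a real benchmark call so that the example runs without a GPU.

```python
def largest_fitting_batch_size(run, start=16, minimum=1):
    """Halve the batch size until `run(batch_size)` stops raising an OOM error.

    `run` is any callable that executes one benchmark at the given batch size.
    """
    batch_size = start
    while batch_size >= minimum:
        try:
            run(batch_size)
            return batch_size
        except RuntimeError as exc:
            if "out of memory" not in str(exc):
                raise  # unrelated failure, do not hide it
            batch_size //= 2
    raise RuntimeError("even batch size {} does not fit".format(minimum))


# Stand-in for a real benchmark call: pretend anything above 4 runs out of memory.
def fake_run(batch_size):
    if batch_size > 4:
        raise RuntimeError("CUDA error: out of memory")


print(largest_fitting_batch_size(fake_run))  # tries 16, then 8, then 4
```

The check on the message text matches the error shown above ("CUDA error: out of memory"); other versions may word it differently.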

!python benchmark.py

running benchmark for frameworks ['pytorch', 'caffe2']
cuda version= 10.0.130
cudnn version= 7300
/root/deep-learning-benchmark/frameworks/pytorch/models.py:17: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  self.eval_input = torch.autograd.Variable(x, volatile=True).cuda() if precision == 'fp32' \
pytorch's vgg16 eval at fp32: 48.5ms avg
pytorch's vgg16 train at fp32: 134.6ms avg
pytorch's resnet152 eval at fp32: 47.2ms avg
pytorch's resnet152 train at fp32: 177.5ms avg
/root/.pyenv/versions/py37/lib/python3.7/site-packages/torchvision/models/densenet.py:212: UserWarning: nn.init.kaiming_normal is now deprecated in favor of nn.init.kaiming_normal_.
  nn.init.kaiming_normal(m.weight.data)
pytorch's densenet161 eval at fp32: 49.2ms avg
pytorch's densenet161 train at fp32: 172.4ms avg
/root/deep-learning-benchmark/frameworks/pytorch/models.py:18: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  else torch.autograd.Variable(x, volatile=True).cuda().half()
pytorch's vgg16 eval at fp16: 37.2ms avg
pytorch's vgg16 train at fp16: 108.2ms avg
pytorch's resnet152 eval at fp16: 42.1ms avg
pytorch's resnet152 train at fp16: 171.8ms avg
pytorch's densenet161 eval at fp16: 43.1ms avg
pytorch's densenet161 train at fp16: 155.0ms avg
Traceback (most recent call last):
  File "benchmark.py", line 71, in <module>
    results = Benchmark().benchmark_all()
  File "benchmark.py", line 39, in benchmark_all
    results[framework] = self.benchmark_framework(framework)
  File "benchmark.py", line 51, in benchmark_framework
    eval_duration = self.benchmark_model('eval', framework, model, precision)
  File "benchmark.py", line 31, in benchmark_model
    framework_model = self.get_framework_model(framework, model)(precision, image_shape, batch_size)
  File "benchmark.py", line 27, in get_framework_model
    framework_model = import_module('.'.join(['frameworks', framework, 'models']))
  File "/root/.pyenv/versions/3.7.0/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/root/deep-learning-benchmark/frameworks/caffe2/models.py", line 4, in <module>
    from caffe2.python.modeling.initializers import Initializer, pFP16Initializer
ImportError: cannot import name 'pFP16Initializer' from 'caffe2.python.modeling.initializers' (/root/.pyenv/versions/py37/lib/python3.7/site-packages/caffe2/python/modeling/initializers.py)
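
Before the caffe2 import fails, this run does print a complete set of PyTorch timings at batch size 4. The sketch below copies those numbers from the output above and computes the fp32/fp16 ratio for each model and mode; the dictionary layout is just one convenient way to hold them.

```python
# Average milliseconds per batch at batch_size=4, copied from the output above.
fp32 = {
    ("vgg16", "eval"): 48.5, ("vgg16", "train"): 134.6,
    ("resnet152", "eval"): 47.2, ("resnet152", "train"): 177.5,
    ("densenet161", "eval"): 49.2, ("densenet161", "train"): 172.4,
}
fp16 = {
    ("vgg16", "eval"): 37.2, ("vgg16", "train"): 108.2,
    ("resnet152", "eval"): 42.1, ("resnet152", "train"): 171.8,
    ("densenet161", "eval"): 43.1, ("densenet161", "train"): 155.0,
}

# A ratio above 1 means fp16 was faster than fp32 for that case.
speedup = {key: fp32[key] / fp16[key] for key in fp32}

for (model, mode), ratio in speedup.items():
    print("{:<12} {:<5} fp32/fp16 = {:.2f}".format(model, mode, ratio))
```

Every ratio comes out above 1, so fp16 was faster than fp32 in all six cases on this GPU.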

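The Caffe2 part of the benchmark failed because of an API change in Caffe2: the benchmark code imports pFP16Initializer, which the installed version no longer provides.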
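
To see what the installed module offers instead, its public names can be listed and filtered. This is a generic sketch using only the standard library; `find_names` is an illustrative helper, and it makes no claim about which name, if any, replaces the missing one.

```python
from importlib import import_module


def find_names(module_name, fragment):
    """Return public names in `module_name` that contain `fragment` (case-insensitive)."""
    module = import_module(module_name)
    return sorted(
        name for name in dir(module)
        if not name.startswith("_") and fragment.lower() in name.lower()
    )


if __name__ == "__main__":
    try:
        print(find_names("caffe2.python.modeling.initializers", "fp16"))
    except ImportError as exc:
        print("caffe2 is not importable here:", exc)
```

Whatever names it prints would then have to be checked against the Caffe2 source before changing frameworks/caffe2/models.py.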

%load_ext version_information

%version_information torch, numpy

Software	Version
Python	3.7.0 64bit [GCC 7.3.0]
IPython	6.5.0
OS	Linux 4.15.0 34 generic x86_64 with debian buster sid
torch	1.0.0a0+2cdf98a
numpy	1.15.2
Tue Sep 25 23:58:04 2018 JST